Hypertrace is a cloud-native distributed tracing and observability platform designed to ingest and analyze large volumes of production trace data. Various open-source and enterprise agents and tracers collect and send observability data directly from applications to Hypertrace for analysis. Data visualizations, reports, and dashboards are available in a web-based console to assist in monitoring cloud-native applications and resolving application and service performance problems.

Let's take a look at the Hypertrace architecture and understand each component in detail.

[Image: Hypertrace Architecture]

Hypertrace starts with developers instrumenting their applications with tracing libraries (e.g., Zipkin, Jaeger). The Hypertrace pipeline itself begins with the Hypertrace OC Collector, an implementation based on the OpenCensus Service. The Hypertrace OC Collector is a component that runs "nearby" (e.g. in the same VPC, AZ, etc.) a user's application components and receives the trace spans and metrics emitted by the tasks instrumented with tracing libraries.
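As a minimal sketch of that first step, instrumenting a service with the Jaeger Python client might look like this (the service name, span name, and tags are hypothetical examples; the agent host/port would point at wherever spans should be reported):

# Minimal instrumentation sketch using the Jaeger Python client.
# 'user-service' and the endpoint details below are hypothetical.
from jaeger_client import Config

config = Config(
    config={
        'sampler': {'type': 'const', 'param': 1},   # sample every request
        'local_agent': {
            'reporting_host': 'localhost',          # assumption: agent reachable here
            'reporting_port': 6831,
        },
    },
    service_name='user-service',
    validate=True,
)
tracer = config.initialize_tracer()

# Each unit of work becomes a span; tags carry protocol details that
# the Hypertrace pipeline later promotes to first-class fields.
with tracer.start_span('GET /api/v1/user') as span:
    span.set_tag('http.method', 'GET')
    span.set_tag('http.url', '/api/v1/user?name=alice')
    span.set_tag('http.status_code', 200)

tracer.close()  # flush buffered spans before the process exits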

The Hypertrace OC Collector writes spans from these processes/services to Kafka, where they are processed further by the Hypertrace ingestion pipeline. Different tracers can have different span formats, and since we support tracers like Zipkin and Jaeger, the spans coming via hypertrace-oc-collector to Kafka can have different fields in them. span-normalizer reads spans from Kafka and adds first-class fields to the span object, like HTTP URL, HTTP method, HTTP status code, gRPC method, gRPC status message, etc., so that the platform downstream can access the values from the span. We call this normalized span a raw-span, which is further processed by raw-spans-grouper. You can find the first-class fields here.
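To make that concrete, here is a rough sketch (not the actual span-normalizer code) of what promoting tracer-specific tags to first-class fields could look like; the tag and field names are illustrative, not the real raw-span schema:

# Sketch: promote well-known tags from differently-shaped tracer
# payloads to first-class fields on a 'raw span'.
FIRST_CLASS_TAGS = {
    'http.url': 'http_url',
    'http.method': 'http_method',
    'http.status_code': 'http_status_code',
    'grpc.method': 'grpc_method',
    'grpc.status_message': 'grpc_status_message',
}

def normalize(span: dict) -> dict:
    """Turn a tracer-specific span into a raw-span with first-class fields."""
    raw_span = {
        'trace_id': span['traceId'],
        'span_id': span['id'],
        'parent_span_id': span.get('parentId'),
        'attributes': dict(span.get('tags', {})),
    }
    for tag, field in FIRST_CLASS_TAGS.items():
        if tag in raw_span['attributes']:
            raw_span[field] = raw_span['attributes'][tag]
    return raw_span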

raw-spans-grouper fetches these raw-spans from Kafka and creates structured traces out of them based on the trace ID and the parent span ID in the spans. So from this point onward we have traces, not spans!

So, for example, if we have two spans with traceID 1234, we will group them into a single trace, as shown in the sketch below.
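A simplified sketch of that grouping step (the real raw-spans-grouper is a streaming job that also deals with windowing and late-arriving spans; this only shows the core idea):

# Sketch: bucket raw spans by trace_id, then use parent_span_id to
# recover the call structure within each trace.
from collections import defaultdict

def group_into_traces(raw_spans):
    traces = defaultdict(lambda: {'spans': [], 'children': defaultdict(list)})
    for span in raw_spans:
        trace = traces[span['trace_id']]
        trace['spans'].append(span)
        trace['children'][span['parent_span_id']].append(span['span_id'])
    return dict(traces)

spans = [
    {'trace_id': '1234', 'span_id': 'a1', 'parent_span_id': None},
    {'trace_id': '1234', 'span_id': 'b2', 'parent_span_id': 'a1'},
]
traces = group_into_traces(spans)
assert len(traces['1234']['spans']) == 2   # both spans land in one trace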

Trace-enrichers are used to enrich traces with entity information. The hypertrace-trace-enricher service talks to entity-service, which fetches entity information from Mongo as required. For example, let's say we get a span with an HTTP-method-related attribute method: /api/v1/user?name. In this case, if we already have an Endpoint entity that refers to /api/v1/user, we fetch the ID of that entity and add it to the span. The span will then have one more attribute, like this: method: /api/v1/user?name, api_id: 1234.
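Roughly, that enrichment step can be pictured like this (the in-memory entity table is a hypothetical stand-in for the entity-service lookup backed by Mongo):

# Sketch: resolve a span's URL path to a known Endpoint entity and
# stamp the entity id onto the span as api_id.
ENDPOINT_ENTITIES = {'/api/v1/user': '1234'}   # hypothetical entity-service data

def enrich_span(span):
    method = span.get('attributes', {}).get('method', '')
    path = method.split('?', 1)[0]          # '/api/v1/user?name' -> '/api/v1/user'
    api_id = ENDPOINT_ENTITIES.get(path)    # an entity-service call in reality
    if api_id is not None:
        span['attributes']['api_id'] = api_id
    return span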

In parallel we have the Hypertrace view creator, a bootstrap job that runs once and creates the required views in Pinot: spanEventView, serviceCallView, backendEntityView, rawServiceView, and rawTraceView.
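To illustrate what "creating a view" amounts to, a bootstrap job along these lines could register a schema with the Pinot controller's REST API (the schema below is a heavily trimmed, hypothetical fragment, not the real spanEventView definition):

# Sketch: register a Pinot schema for a view via the controller REST API.
# The controller address and field names are assumptions for illustration.
import requests

PINOT_CONTROLLER = 'http://localhost:9000'   # default controller port

schema = {
    'schemaName': 'spanEventView',
    'dimensionFieldSpecs': [
        {'name': 'trace_id', 'dataType': 'STRING'},
        {'name': 'span_id', 'dataType': 'STRING'},
        {'name': 'http_method', 'dataType': 'STRING'},
    ],
    'metricFieldSpecs': [{'name': 'duration_millis', 'dataType': 'LONG'}],
}
requests.post(f'{PINOT_CONTROLLER}/schemas', json=schema).raise_for_status()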

Once these views exist in Pinot, hypertrace-view-generator consumes the enriched traces produced by the Hypertrace trace enricher from Kafka, applies the required filters and transformations, and then materializes them into the Pinot views.
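Conceptually, the generator flattens each enriched trace into rows that match a view's schema, e.g. (column names follow the hypothetical schema sketched above):

# Sketch: materialize an enriched trace into spanEventView-style rows.
def to_span_event_rows(trace):
    rows = []
    for span in trace['spans']:
        if not span.get('span_id'):          # example filter: drop malformed spans
            continue
        rows.append({
            'trace_id': span['trace_id'],
            'span_id': span['span_id'],
            'http_method': span.get('http_method'),
            'duration_millis': span.get('duration_millis', 0),
        })
    return rows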

That covers the backend process that gets our data ready; now let's see how the query architecture works to serve users.

hypertrace-UI talks to the hypertrace-GraphQL service, which serves the GraphQL API that queries data from downstream services. The GraphQL service talks to the gateway service.

The attribute service fetches all attributes relevant to the scope of what is being shown on the UI from Mongo, where it also stores them. All of this metadata helps the UI and backend perform things in a generic fashion; for example:

  • All string attributes in the UI can have a different look and feel, and also different operations that can be supported on them (in the explorer).
  • The backend also uses part of the information (like the sources) to figure out where to fetch data from, i.e. query-service vs. entity-service.

While Hypertrace offers a simple service model, it also permits storing certain fields in different backends. Attributes include tags that help locate the service hosting their corresponding data.

Example:

key: apiName,
value_kind: TYPE_STRING,
groupable : true,
display_name: Endpoint Name,
scope: API_TRACE,
sources: [QS],
type: ATTRIBUTE

For example, apiName is generated from raw trace data streams. sources: [QS] communicates this, since the Query Service is the API that queries that data (via Pinot).

The gateway service routes queries to the corresponding downstream service based on the source of the attributes, and then does the appropriate type conversion of the data returned by downstream services. The gateway service talks to the attribute service to find where to forward the request. Here the query service interfaces with the Apache Pinot datastore, whereas the entity service provides CRUD operations for raw or enriched entities, their types, and their relations.
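A simplified picture of that routing decision (the source names follow the sources: [QS] convention from the attribute example above; EDS as the tag for the entity service is an assumption for illustration):

# Sketch: consult attribute metadata to decide which downstream
# service should answer for a given attribute.
ATTRIBUTE_SOURCES = {
    'apiName': ['QS'],               # served by the query service (Pinot)
    'apiDiscoveryState': ['EDS'],    # hypothetical: served by the entity service
}

def route(attribute_key):
    sources = ATTRIBUTE_SOURCES.get(attribute_key, [])
    if 'QS' in sources:
        return 'query-service'
    if 'EDS' in sources:
        return 'entity-service'
    raise ValueError(f'no known source for attribute {attribute_key!r}')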

So suppose we query the UI to search for a trace with a particular traceID using a query like the one below:

curl -s localhost:2020/graphql -H 'Content-Type: application/graphql' -d \
'{
  traces(
    type: API_TRACE
    between: {
      startTime: "2015-01-01T00:00:00.000Z"
      endTime: "2025-01-01T00:00:00.000Z"
    }
    filterBy: [
      {
        operator: EQUALS
        value: "348bae39282251a5"
        type: ID
        idType: API_TRACE
      }
    ]
  ) {
    total
  }
}'

What this does: it sends the query to the gateway service, which forwards it to the query service; the query service then checks whether this traceID is present in rawTraceView and returns a total of 1 if it is there, or 0 if not!

That was a brief look at how the user-facing side of Hypertrace communicates with the backend (the tracing and data collection side), and how the two work in sync to provide great insights into applications for users!

