

OpenTracing: Jaeger as Distributed Tracer

In the previous two parts of this OpenTracing series, we gave an overview of OpenTracing (what it is, how it works and what it aims to achieve) and looked at Zipkin, a popular open-source distributed tracer.

In this blog post, we will look at Jaeger, a newer open-source distributed tracer developed under the CNCF umbrella.

To recap what we covered so far: the complexity of emerging software applications requires new approaches to understanding and then efficiently building, testing and debugging these systems. To that end, OpenTracing enables developers to instrument applications for distributed tracing with minimal effort. To instrument an application via the OpenTracing API, you need an OpenTracing-compatible tracer that is correctly deployed and listening for incoming span requests.

The job of the OpenTracing API is to hide the differences between distributed tracer implementations, so you can easily swap them out at any time without needing to change your instrumentation. Zipkin and Jaeger are both such OpenTracing-compatible distributed tracers.


Jaeger as distributed tracing system

Although not as mature as Zipkin, Jaeger is a distributed tracing system that is being adopted rapidly. Its backend is implemented in Go and supports in-memory, Cassandra and Elasticsearch span stores.

“Jaeger – young, but production-ready distributed tracer written in Go. Low overhead, dynamic sampling, designed with scale in mind. #opentracing”

Jaeger’s architecture is built with scalability and parallelism in mind. The client emits traces to the agent, which listens for inbound spans and routes them to the collector. The collector validates, transforms and stores the spans in persistent storage. To give access to the stored tracing data, the query service exposes REST API endpoints and a React-based UI.

Figure 1. Jaeger architecture

Jaeger can be installed from source using the Go toolchain (the Go compiler plus the glide and yarn package managers are needed to run the build). Run these commands to fetch the latest Jaeger version and produce the binary:

$ export GOPATH=/opt/jaeger
$ go get -v github.com/jaegertracing/jaeger
$ cd $GOPATH/src/github.com/jaegertracing/jaeger
$ make install build-all-in-one-linux

The binary ships with all components embedded. Use the following command to run the agent, collector and query service with in-memory storage enabled:

$ $GOPATH/cmd/standalone/standalone-linux --span-storage.type memory

Figure 2. Jaeger startup log

If that’s too much pain, we can fetch the official Docker image and spawn a container by running the following command:

$ docker run --name jaeger -d -p 5775:5775/udp -p 6831:6831/udp -p 6832:6832/udp \
  -p 5778:5778 -p 16686:16686 -p 14268:14268 jaegertracing/all-in-one:latest

Additionally, you can run each of the components in a separate container by pulling the corresponding image.

To build the images manually and orchestrate the containers, use this docker-compose deployment descriptor. You may notice that the repository contains a dedicated Dockerfile for each Jaeger component. To orchestrate the workload via docker-compose, we had to bake an additional tool (dockerize) into the image so that startup waits until the Elasticsearch API port is available. Otherwise, the collector or query service fails to bootstrap when it cannot connect to an Elasticsearch instance.

To explore the traces, navigate to http://localhost:16686.

Figure 3. Main Jaeger UI
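The REST endpoints behind the UI can also be called directly. The following is a minimal sketch that only builds the URLs; it assumes the all-in-one container from above with the query service on port 16686, and it assumes the `/api/services` and `/api/traces` paths that the UI itself uses. These paths are not a formally versioned API, so check them against your Jaeger version. The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: query service of the all-in-one container, reachable locally.
JAEGER_QUERY_BASE = "http://localhost:16686"


def services_url(base=JAEGER_QUERY_BASE):
    """URL that lists the service names known to the query service."""
    return f"{base}/api/services"


def traces_url(service, limit=20, base=JAEGER_QUERY_BASE):
    """URL that lists recent traces for a single service."""
    query = urlencode({"service": service, "limit": limit})
    return f"{base}/api/traces?{query}"


if __name__ == "__main__":
    # Only prints the URLs; fetch them with curl or urllib once Jaeger is running.
    print(services_url())
    print(traces_url("opentracing-extractor", limit=5))
```

Keeping URL construction separate from the HTTP call makes the sketch usable without a running Jaeger instance.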

Span ingestion with Jaeger

The agent receives span requests from the client over UDP (port 5775) on the local machine.

The spans are batched, encoded as Thrift structures and submitted to the collector. The agent can poll the backend for sampling strategies and propagate the sampling rate to all tracer clients. That’s an important design decision because it avoids hard-coding fixed sampling rates, which is crucial in environments with highly dynamic network behavior. The agent is also responsible for abstracting collector routing and discovery away from the client library. Clients can also talk directly to the collector via HTTP; in that case, each HTTP request carries a Thrift batch payload that groups all reported spans.
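To make the two ideas in this paragraph concrete, here is a toy sketch of a sampler whose rate can be changed at runtime and a batcher that flushes spans in groups. It is an illustration only, not Jaeger's client or agent code: the class names, the 64-bit trace ID range and the flush callback are all assumptions, and no Thrift encoding or UDP transport is involved.

```python
import random

MAX_TRACE_ID = (1 << 64) - 1  # assumption: 64-bit trace IDs


class ProbabilisticSampler:
    """Samples a trace when its ID falls below rate * MAX_TRACE_ID."""

    def __init__(self, rate):
        self.update(rate)

    def update(self, rate):
        # Called whenever a new sampling rate is propagated to the client.
        if not 0.0 <= rate <= 1.0:
            raise ValueError("sampling rate must be within [0, 1]")
        self.rate = rate
        self.boundary = int(rate * MAX_TRACE_ID)

    def is_sampled(self, trace_id):
        return trace_id < self.boundary


class SpanBatcher:
    """Collects spans and hands them to `flush` in groups of `max_size`."""

    def __init__(self, flush, max_size=10):
        self.flush = flush
        self.max_size = max_size
        self.pending = []

    def add(self, span):
        self.pending.append(span)
        if len(self.pending) >= self.max_size:
            self.flush(self.pending)
            self.pending = []


if __name__ == "__main__":
    batches = []
    sampler = ProbabilisticSampler(rate=0.5)
    batcher = SpanBatcher(flush=batches.append, max_size=2)
    for i in range(10):
        trace_id = random.getrandbits(64)
        if sampler.is_sampled(trace_id):
            batcher.add({"traceID": format(trace_id, "x"), "operationName": f"op-{i}"})
    sampler.update(0.0)  # as if a new strategy had been relayed: stop sampling
    print(len(batches), "full batches flushed,", len(batcher.pending), "spans pending")
```

Deciding from the trace ID rather than from a fresh random number means every component that sees the same trace ID reaches the same decision for a given rate.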

Jaeger can accept and handle Zipkin span requests transparently. To enable the Zipkin HTTP inbound adapter, start the collector with the --collector.zipkin.http-port flag.

“Jaeger is compatible with Zipkin spans! Just start the Jaeger Collector with --collector.zipkin.http-port. #opentracing”
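As a rough illustration of what a Zipkin client would send to that adapter, the sketch below builds a Zipkin v1 JSON span and wraps it in an HTTP POST request without sending it. Several things here are assumptions: that the collector was started with --collector.zipkin.http-port=9411 (9411 is Zipkin's conventional port), that it accepts JSON on the Zipkin v1 path /api/v1/spans, and the helper names themselves.

```python
import json
import urllib.request

# Assumptions: port 9411 and the Zipkin v1 JSON path; adjust to your setup.
ZIPKIN_ENDPOINT = "http://localhost:9411/api/v1/spans"


def zipkin_v1_span(trace_id, span_id, name, service, timestamp_us, duration_us):
    """Build one server-side span in Zipkin v1 JSON form (times in microseconds)."""
    endpoint = {"serviceName": service, "ipv4": "127.0.0.1"}
    return {
        "traceId": trace_id,
        "id": span_id,
        "name": name,
        "timestamp": timestamp_us,
        "duration": duration_us,
        "annotations": [
            {"timestamp": timestamp_us, "value": "sr", "endpoint": endpoint},
            {"timestamp": timestamp_us + duration_us, "value": "ss", "endpoint": endpoint},
        ],
        "binaryAnnotations": [],
    }


def build_request(spans, url=ZIPKIN_ENDPOINT):
    """Wrap a list of spans in a POST request; the caller decides when to send it."""
    body = json.dumps(spans).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    span = zipkin_v1_span("fd447889b1ad2e1e", "9772c18c9d589627", "extract",
                          "opentracing-extractor", 1502716789425000, 76)
    request = build_request([span])
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would submit it to a running collector.
```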

Storage with Jaeger

The Jaeger collector supports various storage backends, including Cassandra, Elasticsearch, Kafka (for pushing a stream of traces to a Kafka topic, from which any Kafka client can consume the spans), as well as an in-memory span store. There are also ongoing efforts to incorporate additional data stores such as ScyllaDB.

Using Jaeger with Elasticsearch

Elasticsearch storage is enabled by running the collector with the following flags (assuming an Elasticsearch node is running on the local machine):

$ jaeger-collector --span-storage.type elasticsearch --es.username <username> --es.password <password>

Jaeger creates two indices in Elasticsearch – one for storing the services and the other one for the spans of given services.
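If you want to look at these indices from a script, a small helper for their names can be handy. The sketch below assumes a daily naming scheme of the form jaeger-service-YYYY-MM-DD and jaeger-span-YYYY-MM-DD; that scheme and the helper name are assumptions, so compare the result with what your own cluster lists before relying on it.

```python
from datetime import datetime, timezone


def jaeger_indices(day, prefix="jaeger"):
    """Return the assumed (service index, span index) names for a given day."""
    suffix = day.strftime("%Y-%m-%d")
    return f"{prefix}-service-{suffix}", f"{prefix}-span-{suffix}"


if __name__ == "__main__":
    service_index, span_index = jaeger_indices(datetime.now(timezone.utc))
    print(service_index)
    print(span_index)
```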

The mapping that describes the structure of the document for the span:

{
  "mappings":{
    "span":{
      "_all":{
        "enabled":false
      },
      "properties":{
        "duration":{
          "type":"long"
        },
        "flags":{
          "type":"integer"
        },
        "logs":{
          "properties":{
            "fields":{
              "type":"nested",
              "dynamic":"false",
              "properties":{
                "key":{
                  "type":"keyword",
                  "ignore_above":256
                },
                "tagType":{
                  "type":"keyword",
                  "ignore_above":256
                },
                "value":{
                  "type":"keyword",
                  "ignore_above":256
                }
              }
            },
            "timestamp":{
              "type":"long"
            }
          }
        },
        "operationName":{
          "type":"keyword",
          "ignore_above":256
        },
        "parentSpanID":{
          "type":"keyword",
          "ignore_above":256
        },
        "process":{
          "properties":{
            "serviceName":{
              "type":"keyword",
              "ignore_above":256
            },
            "tags":{
              "type":"nested",
              "dynamic":"false",
              "properties":{
                "key":{
                  "type":"keyword",
                  "ignore_above":256
                },
                "tagType":{
                  "type":"keyword",
                  "ignore_above":256
                },
                "value":{
                  "type":"keyword",
                  "ignore_above":256
                }
              }
            }
          }
        },
        "processID":{
          "type":"text",
          "fields":{
            "keyword":{
              "type":"keyword",
              "ignore_above":256
            }
          }
        },
        "references":{
          "type":"nested",
          "dynamic":"false",
          "properties":{
            "refType":{
              "type":"keyword",
              "ignore_above":256
            },
            "spanID":{
              "type":"keyword",
              "ignore_above":256
            },
            "traceID":{
              "type":"keyword",
              "ignore_above":256
            }
          }
        },
        "spanID":{
          "type":"keyword",
          "ignore_above":256
        },
        "startTime":{
          "type":"long"
        },
        "tags":{
          "type":"nested",
          "dynamic":"false",
          "properties":{
            "key":{
              "type":"keyword",
              "ignore_above":256
            },
            "tagType":{
              "type":"keyword",
              "ignore_above":256
            },
            "value":{
              "type":"keyword",
              "ignore_above":256
            }
          }
        },
        "traceID":{
          "type":"keyword",
          "ignore_above":256
        }
      }
    }
  }
}
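Because tags is mapped as a nested type, matching a tag's key and value on the same tag entry calls for an Elasticsearch nested query rather than two independent term filters. The sketch below only builds the query body as a Python dictionary; the function name is hypothetical, and the field names tags.key and tags.value are taken from the mapping above.

```python
import json


def spans_by_tag(key, value, size=10):
    """Build a search body that finds spans carrying the tag key=value."""
    return {
        "size": size,
        "query": {
            "nested": {
                "path": "tags",
                "query": {
                    "bool": {
                        # Both terms must match within the same nested tag object.
                        "must": [
                            {"term": {"tags.key": key}},
                            {"term": {"tags.value": value}},
                        ]
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # The printed JSON can be sent to the _search endpoint of the span index.
    print(json.dumps(spans_by_tag("http.status_code", "200"), indent=2))
```

Tag values are mapped as keyword, so the value is passed as a string even when it represents a number.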

Fields that comprise the body of the span’s document:

  • traceID – a unique identifier for the trace
  • spanID – span identifier
  • parentSpanID – the identifier of the parent span
  • operationName – human-readable operation name linked to the span
  • startTime – span start time expressed as microseconds since the UNIX epoch
  • duration – operation’s duration in microseconds
  • tags – list of annotations attached to the span
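A short sketch of how these fields could be read back into friendlier Python values. It assumes that startTime and duration are both expressed in microseconds, which is consistent with the 16-digit startTime values Jaeger writes; the function name and the shape of the returned dictionary are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def describe_span(doc):
    """Convert the raw fields of a span document into Python values."""
    start = EPOCH + timedelta(microseconds=doc["startTime"])  # assumed microseconds
    duration = timedelta(microseconds=doc["duration"])        # assumed microseconds
    return {
        "trace": doc["traceID"],
        "span": doc["spanID"],
        "parent": doc.get("parentSpanID") or None,
        "operation": doc["operationName"],
        "start": start,
        "end": start + duration,
        # Flatten the list of {key, type, value} entries into a plain dict.
        "tags": {tag["key"]: tag["value"] for tag in doc.get("tags", [])},
    }


if __name__ == "__main__":
    span = describe_span({
        "traceID": "fd447889b1ad2e1e",
        "spanID": "9772c18c9d589627",
        "parentSpanID": "fd447889b1ad2e1e",
        "operationName": "extract",
        "startTime": 1502716789425000,
        "duration": 76,
        "tags": [{"key": "http.status_code", "type": "int64", "value": "200"}],
    })
    print(span["operation"], span["start"].isoformat(), span["tags"])
```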

Here’s an example of a document indexed by the Jaeger collector:

{
  "traceID" : "fd447889b1ad2e1e",
  "spanID" : "9772c18c9d589627",
  "parentSpanID" : "fd447889b1ad2e1e",
  "flags" : 1,
  "operationName" : "extract",
  "references" : [ ],
  "startTime" : 1502716789425000,
  "duration" : 76,
  "tags" : [
    {
      "key" : "http.status_code",
      "type" : "int64",
      "value" : "200"
    },
    {
      "key" : "http.url",
      "type" : "string",
      "value" : "http://localhost:8081/extract"
    }
  ],
  "logs" : [ ],
  "processID" : "",
  "process" : {
    "serviceName" : "opentracing-extractor",
    "tags" : [
      {
        "key" : "hostname",
        "type" : "string",
        "value" : "archrabbit"
      },
      {
        "key" : "jaeger.version",
        "type" : "string",
        "value" : "Java-0.20.6"
      },
      {
        "key" : "ip",
        "type" : "string",
        "value" : "127.0.0.1"
      }
    ]
  },
  "warnings" : null
}


Conclusion

Jaeger is a modern distributed tracer fully compatible with the OpenTracing API. It has clients for a number of programming languages, including Java, Go, Node.js, Python, PHP, and more.

Furthermore, it can also accept span requests from Zipkin clients, which makes it easy for existing Zipkin users to switch to Jaeger. Given its low resource overhead and other attractive features, such as dynamic sampling strategies, many organizations are considering a move from Zipkin to Jaeger.

In our next post, we will compare Zipkin vs. Jaeger head to head!
