
OpenTracing Overview: Distributed Tracing’s Emerging Industry Standard

March 1, 2019


What was traditionally known as just Monitoring has clearly been going through a renaissance over the last few years. The industry as a whole is finally moving away from Monitoring and Logging silos, something we’ve been doing and “preaching” for years at Sematext, and the term Observability has emerged as the new moniker for any form of infrastructure and application monitoring. Microservices have been around for over a decade under one name or another.

Now that microservices are often deployed in separate containers, it has become obvious that we need a way to trace transactions through the various microservice layers, from the client all the way down to queues, storage, calls to external services, etc. This created new interest in Transaction Tracing which, although not new, has re-emerged as the third pillar of observability.

Traditionally, APM vendors had their own proprietary tracing agents and SDKs that would instrument applications, either automatically (blackbox instrumentation) or by having users modify or annotate their apps’ source code (whitebox instrumentation). Long story short, this causes problems on both sides: vendor lock-in for users and, for vendors, the high cost of adding and maintaining support for an ever-increasing number of technologies and versions that need to be instrumented. Enter OpenTracing: vendor-neutral APIs and instrumentation for distributed tracing.

We’ll start this five-part blog post series by introducing OpenTracing: what it is and does, how it works, and why its adoption is growing. In subsequent posts we will first cover Zipkin, followed by Jaeger, both being popular distributed tracers, and finally, compare Jaeger vs. Zipkin.

If you prefer to read all four posts as a PDF, you can also download it as a free OpenTracing eBook. Alternatively, follow @sematext if you are into observability in general.

What is OpenTracing?

OpenTracing is a set of standards and techniques that allow for distributed tracing in a way that’s free of vendor lock-in. For that purpose, a coherent API specification is provided for numerous programming languages and frameworks.

Traditionally, the cross-cutting code responsible for instrumentation was tightly coupled to the underlying tracing platform, making it hard to switch between tracing systems without significant refactoring. This coupling brought a series of disadvantages that notably hindered the adoption of white-box instrumentation.

The goal of OpenTracing is to put all tracers under the same roof so that they converge on a common mechanism for trace description and propagation. Developers can now build tracing capabilities not only into their applications but into any piece of software, so that everything from web servers to modern service meshes can produce traces in a vendor-neutral manner.
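To make the idea concrete, here is a minimal sketch of instrumentation that depends only on an abstract tracer. It does not use the real OpenTracing library: the `Tracer`, `NoopTracer` and `InMemoryTracer` classes and the `handle_request` function are hypothetical names invented for this illustration, and the in-memory tracer merely stands in for a real backend.

```python
from abc import ABC, abstractmethod
from contextlib import contextmanager
import time


class Tracer(ABC):
    """Hypothetical vendor-neutral interface that application code depends on."""

    @abstractmethod
    def report(self, operation: str, duration_s: float) -> None:
        ...

    @contextmanager
    def start_span(self, operation: str):
        # Time the wrapped block and hand the result to the concrete tracer.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.report(operation, time.perf_counter() - start)


class NoopTracer(Tracer):
    """Discards everything; a reasonable default when tracing is disabled."""

    def report(self, operation, duration_s):
        pass


class InMemoryTracer(Tracer):
    """Stand-in for a real backend; it only keeps finished spans in a list."""

    def __init__(self):
        self.finished = []

    def report(self, operation, duration_s):
        self.finished.append((operation, duration_s))


def handle_request(tracer: Tracer) -> str:
    # Instrumented code only knows the abstract interface.
    with tracer.start_span("handle_request"):
        return "ok"


# Swapping the tracer is a wiring change; handle_request is untouched.
noop = NoopTracer()
recording = InMemoryTracer()
result_noop = handle_request(noop)
result_recording = handle_request(recording)
```

The instrumented function behaves the same with either tracer, which is the property a common API is meant to give: the choice of tracing backend moves out of the application code and into its setup.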

OpenTracing Benefits

OpenTracing aims to offer a consistent, unified and tracer-agnostic instrumentation API for a wide range of frameworks, platforms and programming languages. It abstracts away the differences among numerous tracer implementations, so shifting from an existing tracer to a new one would only require configuration changes specific to the new tracer. It is also worth recalling the benefits of distributed tracing itself:

  • out-of-the-box infrastructure overview: how services interact and which dependencies they have
  • efficient and fast detection of latency issues
  • intelligent error reporting: spans carry error messages and stack traces, which we can use to identify root causes or cascading failures
  • trace data can be forwarded to log processing platforms for query and analysis.
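As an illustration of the error-reporting point, the sketch below shows a span carrying an error message and stack trace. The `Span` class is a simplified stand-in rather than a real library class, and `charge_card` is a hypothetical operation; the tag and log field names (`error`, `event`, `error.kind`, `message`, `stack`) are meant to follow OpenTracing's semantic conventions.

```python
import time
import traceback
from dataclasses import dataclass, field


@dataclass
class Span:
    """Simplified stand-in for a tracer's span (not a real library class)."""
    operation: str
    tags: dict = field(default_factory=dict)
    logs: list = field(default_factory=list)

    def set_tag(self, key, value):
        self.tags[key] = value

    def log_kv(self, fields):
        # Each log entry is a timestamped set of key/value fields.
        self.logs.append({"timestamp": time.time(), **fields})


def charge_card(span: Span, amount: float) -> bool:
    """Hypothetical operation that records failures on its span."""
    try:
        if amount <= 0:
            raise ValueError("amount must be positive")
        return True
    except ValueError as exc:
        # Mark the span as failed and attach the details as a structured log.
        span.set_tag("error", True)
        span.log_kv({
            "event": "error",
            "error.kind": type(exc).__name__,
            "message": str(exc),
            "stack": traceback.format_exc(),
        })
        return False


span = Span("charge_card")
succeeded = charge_card(span, -5.0)
```

Because the failure details travel with the span, whoever inspects the trace can see which operation failed and why without correlating separate log files by hand.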

Find out more about how OpenTracing works in the second part of the OpenTracing series.



OpenTracing and Distributed Context Propagation

One of the most compelling and powerful features of tracing systems is distributed context propagation. Context propagation builds the causal chain of a transaction from inception to completion, illuminating the request’s path to its final destination.

From a technical point of view, context propagation is the ability of a system or application to extract the propagated span context from a variety of carriers, such as HTTP headers, AMQP message headers or Thrift fields, and then join the trace from that point. Context propagation is very efficient since it only involves propagating identifiers and baggage items. All other metadata, such as tags and logs, is not propagated but transmitted asynchronously to the tracer system. It’s the responsibility of the tracer to assemble the full trace from distinct spans that might arrive in-band or out-of-band.

OpenTracing standardizes context propagation across process boundaries through the Inject/Extract pattern.
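A rough sketch of the Inject/Extract idea follows, using a plain dictionary of HTTP headers as the carrier. It is a toy model, not the OpenTracing API: the `SpanContext` class, the `inject`, `extract` and `start_child` functions, and the `x-trace-id`, `x-span-id` and `x-baggage-` header names are all assumptions made for this example, since each real tracer defines its own wire format.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical header names; real tracers each define their own wire format.
TRACE_ID_HEADER = "x-trace-id"
SPAN_ID_HEADER = "x-span-id"
BAGGAGE_PREFIX = "x-baggage-"


@dataclass(frozen=True)
class SpanContext:
    """Only the state that crosses process boundaries: IDs and baggage."""
    trace_id: str
    span_id: str
    baggage: dict = field(default_factory=dict)


def inject(ctx: SpanContext, carrier: dict) -> None:
    """Write the identifiers and baggage into an outgoing carrier."""
    carrier[TRACE_ID_HEADER] = ctx.trace_id
    carrier[SPAN_ID_HEADER] = ctx.span_id
    for key, value in ctx.baggage.items():
        carrier[BAGGAGE_PREFIX + key] = value


def extract(carrier: dict):
    """Rebuild the caller's context from an incoming carrier, if present."""
    headers = {k.lower(): v for k, v in carrier.items()}
    if TRACE_ID_HEADER not in headers or SPAN_ID_HEADER not in headers:
        return None
    baggage = {
        k[len(BAGGAGE_PREFIX):]: v
        for k, v in headers.items()
        if k.startswith(BAGGAGE_PREFIX)
    }
    return SpanContext(headers[TRACE_ID_HEADER], headers[SPAN_ID_HEADER], baggage)


def start_child(parent: SpanContext) -> SpanContext:
    """Join the trace: keep the trace ID and baggage, mint a new span ID."""
    return SpanContext(parent.trace_id, uuid.uuid4().hex[:16], dict(parent.baggage))


# Client side: inject the current context into outgoing HTTP headers.
client_ctx = SpanContext(uuid.uuid4().hex, uuid.uuid4().hex[:16], {"user": "42"})
outgoing_headers = {"content-type": "application/json"}
inject(client_ctx, outgoing_headers)

# Server side: extract the context and continue the same trace.
parent_ctx = extract(outgoing_headers)
server_ctx = start_child(parent_ctx)
```

Note that only the identifiers and baggage cross the wire here. Tags and logs stay with the span in the process that created it, which is why propagation stays cheap.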

Adoption: Who uses OpenTracing

As organizations embrace the cloud-native movement and migrate their applications from monolithic to microservice architectures, general visibility and observability into software behavior becomes an essential requirement. Because the monolithic code base is split into multiple independent services running inside their own processes, each of which can also scale to multiple instances, a seemingly trivial task such as diagnosing the latency of an HTTP request issued from the client can end up being a serious undertaking. To fulfill the request, it has to propagate through load balancers, routers and gateways, cross machine boundaries to communicate with other microservices, send asynchronous messages to message brokers, etc.

Anywhere along this pipeline, there could be a bottleneck, contention or a communication issue in any of the aforementioned components. Debugging such a complex workflow wouldn’t be feasible without some kind of tracing/instrumentation mechanism. That’s why distributed tracers like Zipkin, Jaeger or AppDash were born (most of them inspired by Dapper, Google’s large-scale distributed tracing platform). All of these tracers help engineers and operations teams understand and reason about system behavior as the complexity of the infrastructure grows. Tracers expose the source of truth for the interactions originating within the system. Every properly instrumented transaction can reveal performance anomalies at an early stage, when new services are being introduced by (probably) independent teams with polyglot software stacks and continuous deployments.

Next steps

The cloud-native paradigm is creating a new reality and mindset for how software is built and deployed. Instead of static VM-centric infrastructures, containers are first-class citizens in the world of programmable, automated, immutable infrastructure as code. Deployments are continuous, and huge monolithic code bases are split into multiple independent microservices. Gaining deep visibility into our software stack is of critical importance.

OpenTracing is paving the way to make the lives of developers and DevOps engineers easier by helping us narrow down root causes of problems and expose opportunities for optimization. Despite still being relatively young, OpenTracing is being widely adopted by big vendors as well as small organizations and teams.

We are looking forward to seeing OpenTracing become a universal tracing standard where each software component (be it a web framework, application server or message broker) ships with built-in support for OpenTracing instrumentation to assemble and propagate traces out of the box.

In our next posts, we will cover Zipkin as an OpenTracing-compatible distributed tracer, followed by Jaeger and a comparison of Jaeger vs. Zipkin.
