
How to Implement Distributed Tracing in Microservices with OpenTelemetry Auto-Instrumentation

Updated on: February 3, 2026


This guide shows you how to implement OpenTelemetry’s auto-instrumentation for complete distributed tracing across your microservices, from initial setup through production optimization and troubleshooting.

How Distributed Tracing Works in Microservices

At its core, distributed tracing tracks requests as they flow through a distributed system. Each trace captures a complete journey, from the initial API gateway request to the last database write or message publication. Inside a trace, individual operations are represented as spans, each capturing duration, attributes and status. By visualizing this information, you can pinpoint latency bottlenecks, identify errors and understand dependencies between services.

Figure: A distributed trace showing a request flowing through multiple microservices. Each horizontal bar represents a span (operation), with the x-axis showing time and nested spans showing service dependencies. The trace shows an API Gateway (180 ms total) coordinating the Auth Service (30 ms), Cart Service (50 ms), and Payment Service (70 ms), each with its own database calls.

Imagine an e-commerce platform with an API gateway that calls authentication, cart, payment, and notification services. A simple checkout may involve 10–15 different components. If latency spikes, a trace will reveal whether the root cause is the database query in the payment service, a downstream timeout in the email service or an overloaded cache in the cart service. This type of visibility is impossible with logs or metrics alone.

Distributed tracing provides two essential benefits for SREs and DevOps engineers:

  1. It dramatically reduces mean time to resolution (MTTR) by exposing the exact point of failure and enables continuous performance tuning through detailed latency analysis.
  2. It helps teams understand architectural dependencies that emerge organically over time, such as hidden service-to-service calls.

What is OpenTelemetry and How Does It Enable Distributed Tracing?

OpenTelemetry (OTel) is an open-source, CNCF-graduated project for collecting telemetry data (traces, metrics, and logs) from any application, in any language. It provides the instrumentation libraries, SDKs, and exporters needed to collect and send data to any observability backend.

A span in OpenTelemetry represents a single operation, such as an HTTP request or a database query. A trace is a collection of spans that share the same trace ID, forming a tree that represents the full request path. OpenTelemetry attaches contextual metadata to each span following semantic conventions, such as service.name, deployment.environment, service.version, and host.name, allowing you to group and filter traces later.

Figure: How a single trace contains multiple spans in a tree structure. The root span represents the initial request, with child spans for each service call, database query, and cache operation. Each span carries a shared trace ID, a unique span ID, a parent span ID, start/end timestamps, attributes (such as http.method and db.statement), and a status.

Context propagation is what allows traces to connect across service boundaries. When Service A calls Service B, the trace context (trace ID and parent span ID) is passed along, usually via the W3C traceparent header. Without proper propagation, spans appear isolated and the trace is incomplete.
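
For example, the W3C traceparent header is a single line with four hyphen-separated fields. The value below is the sample from the Trace Context specification:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

Here 00 is the format version, the 32 hex characters are the trace ID, the 16 hex characters are the parent span ID, and the final 01 flags the trace as sampled. Every service that receives this header creates its spans under the same trace ID and passes an updated header downstream.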

Every OpenTelemetry setup involves an SDK, one or more exporters, and a collector or backend. The SDK manages spans, processors, and samplers. Exporters send the data to an endpoint using the OpenTelemetry Protocol (OTLP) over gRPC or HTTP. The OpenTelemetry Collector or agent receives this data, processes it, and forwards it to the observability platform.
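
As a concrete reference point, here is a minimal Collector configuration sketch: it receives OTLP over gRPC and HTTP, batches spans, and forwards them to a backend. The exporter endpoint is a placeholder you would replace with your observability platform's OTLP endpoint.

# otel-collector-config.yaml (minimal sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:

exporters:
  otlphttp:
    endpoint: https://otlp.example.com   # placeholder; use your backend's OTLP endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]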

How Does Auto-Instrumentation Work? Benefits and Implementation

Auto-instrumentation represents the most significant advancement in making distributed tracing accessible to production environments. Instead of manually adding tracing code throughout your application, auto-instrumentation agents detect and wrap common frameworks, libraries, and protocols automatically. This approach delivers immediate visibility with zero code changes, making it the recommended starting point for any OpenTelemetry implementation.

The magic happens through runtime manipulation, but each language uses a different approach to achieve zero-code instrumentation.

How Auto-Instrumentation Works by Language

| Language | Instrumentation Method | How It Works | Agent Attachment |
|----------|------------------------|--------------|------------------|
| Java | Bytecode Instrumentation | Modifies JVM bytecode at runtime | -javaagent:agent.jar |
| Python | Monkey Patching | Replaces functions at import time | opentelemetry-instrument wrapper |
| Node.js | Module Wrapping | Patches require() and wraps exports | --require ./tracing.js |
| .NET | CLR Profiling API | Intercepts method calls via the CLR | Environment variables or NuGet |
| Go | Manual wrapping required | No auto-instrumentation available | Compile-time wrapping |
| Ruby | Monkey Patching | Modifies classes at runtime | require 'opentelemetry' |
| PHP | Extension hooks | Uses the PHP extension API | extension=opentelemetry.so |

 

Bytecode Instrumentation (Java, .NET)

Bytecode instrumentation is the most powerful auto-instrumentation method, working at the virtual machine level. The agent modifies the bytecode of classes as they’re loaded, inserting tracing code without changing source files. This happens transparently when you start your application with the agent:

# Java example
java -javaagent:opentelemetry-javaagent.jar -jar myapp.jar

# The agent intercepts class loading and modifies methods like:
# - HttpServlet.service() → wrapped with span creation
# - PreparedStatement.execute() → wrapped with SQL capture
# - KafkaProducer.send() → wrapped with message tracing

This approach provides the deepest integration, capturing everything from servlet containers to JDBC drivers, with zero application code changes.

Monkey Patching (Python, Ruby)

Monkey patching dynamically modifies classes and modules at runtime by replacing their methods with instrumented versions. The OpenTelemetry SDK wraps your application startup, patching libraries before your code runs:

# Python wraps your app at startup
opentelemetry-instrument python myapp.py

# Behind the scenes, it patches libraries:
# - requests.get → wrapped version with span creation
# - django.views → wrapped with request tracing
# - psycopg2.connect → wrapped with database tracing

This method is simple to implement but requires careful ordering – instrumentation must happen before libraries are imported.
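
If you initialize instrumentation programmatically instead of using the wrapper, the same rule applies. The sketch below assumes the opentelemetry-instrumentation-requests package is installed and is only meant to illustrate ordering; the opentelemetry-instrument wrapper handles this for you:

# Illustrative sketch: patch the requests library before any HTTP calls are made
from opentelemetry.instrumentation.requests import RequestsInstrumentor

RequestsInstrumentor().instrument()  # must run before the app starts making requests

import requests  # calls through this library now produce client spans
requests.get("http://localhost:8080/health")
# (a tracer provider and exporter still need to be configured for spans to be exported)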

Module Wrapping (Node.js)

Node.js auto-instrumentation works by intercepting the require() function and wrapping module exports. When your application loads a library, the instrumentation intercepts it and returns a wrapped version:

// Start with instrumentation
node --require ./tracing.js myapp.js

// The tracing.js file hooks into require():
// - require('express') → returns wrapped Express with tracing
// - require('mysql') → returns wrapped MySQL client
// - require('@aws-sdk/client-s3') → returns wrapped AWS SDK

This approach uses Node.js’s module system, making it reliable across different package managers and module formats.

Libraries and Frameworks Covered by OpenTelemetry Auto-Instrumentation

What makes auto-instrumentation particularly powerful is its depth of coverage. The OpenTelemetry Java agent, for instance, instruments over 100 libraries and frameworks out of the box. It captures servlet containers like Tomcat and Jetty, HTTP clients including OkHttp and Apache HttpClient, JDBC connections to any database, message queues like Kafka and RabbitMQ, caching layers such as Redis and Memcached, and even AWS SDK calls. Each instrumentation module understands the semantics of what it’s tracing, adding appropriate attributes like http.method, db.statement, or messaging.destination that make traces immediately useful for debugging.

Example: What Gets Traced in a Spring Boot Microservice

Consider a typical Spring Boot microservice. With auto-instrumentation, a single HTTP request automatically generates spans for the incoming HTTP server request, any Spring MVC controller invocations, JDBC queries with full SQL statements, outgoing HTTP calls to other services, Redis cache operations, and Kafka message publications. The agent also ensures proper context propagation across all these operations, maintaining trace continuity even through asynchronous boundaries.
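
Rendered as a trace, the result might look roughly like this (span names and timings are illustrative, not measured):

POST /api/checkout                     [SERVER]    210 ms
├─ CheckoutController.checkout         [INTERNAL]    5 ms
├─ SELECT cart_items                   [CLIENT]     18 ms   (JDBC)
├─ GET cart:{userId}                   [CLIENT]      2 ms   (Redis)
├─ POST /payments/charge               [CLIENT]     70 ms   (HTTP to payment-service)
└─ send orders.created                 [PRODUCER]    4 ms   (Kafka)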

How OpenTelemetry Captures Errors and Performance Metrics Automatically

Auto-instrumentation goes beyond basic operation tracking. It captures exceptions and stack traces when errors occur, records response codes and status information, measures queue times and connection pool waiting, and adds resource attributes about the runtime environment. This rich context transforms raw timing data into actionable insights. When a database query shows high latency, you can immediately see the exact SQL statement, the connection pool state, and whether the delay was in acquiring a connection or executing the query itself.

Manual vs Auto-Instrumentation: When to Use Each Approach

Manual instrumentation still has its place, primarily for capturing business-specific operations that auto-instrumentation cannot understand. Examples include domain events like order processing stages, custom caching logic, batch job progress, or proprietary protocol interactions. The key is to use manual instrumentation to supplement auto-instrumentation, not replace it. Most production systems achieve excellent observability with 95% auto-instrumentation and 5% manual additions for critical business logic.

 

| Aspect | Auto-Instrumentation | Manual Instrumentation |
|--------|----------------------|------------------------|
| Setup Time | Minutes; just attach the agent | Hours to days; requires code changes |
| Code Changes | Zero; no application code modified | Extensive; spans added throughout code |
| Coverage | Automatic for all supported libraries | Only what you explicitly instrument |
| Maintenance | Automatically updated with the agent | Requires ongoing code maintenance |
| Business Context | Limited to technical operations | Can capture business-specific metrics |
| Performance Impact | ~2-5% overhead | Variable; depends on implementation |
| Best For | HTTP calls, databases, queues, caches | Business events, custom protocols, domain logic |

Table: Comparison of Auto-Instrumentation and Manual Instrumentation with OpenTelemetry

The optimal approach combines both: use auto-instrumentation for technical coverage, then add manual instrumentation for critical business operations that need specific context.

Manual Instrumentation Example

Here’s how to add manual spans to capture business context that auto-instrumentation misses:

// Manual span to augment auto-instrumentation
Span span = tracer.spanBuilder("order.validation").startSpan();
try (Scope scope = span.makeCurrent()) {
  validateInventory(order);
  validatePayment(order);
  // Attach business attributes that auto-instrumentation cannot know about
  span.setAttribute("order.total", order.getTotal());
  span.setAttribute("order.items", order.getItemCount());
} catch (Exception e) {
  // Record the failure so the span shows up as an error in the trace
  span.recordException(e);
  span.setStatus(StatusCode.ERROR);
  throw e;
} finally {
  span.end();
}

OpenTelemetry Auto-Instrumentation Setup for Microservices

Implementing auto-instrumentation varies by runtime, but the pattern remains consistent: attach an agent or SDK, configure the export destination, and start your application. The following examples demonstrate production-ready configurations for common platforms. For more detailed SDK documentation, see Sematext OpenTelemetry SDKs.

Java Microservices (Spring Boot, Quarkus, Micronaut)

The Java agent works with any JVM application, from Spring Boot to Quarkus to legacy servlet containers. Download the agent JAR and attach it via the -javaagent flag:

# Download the agent

curl -L https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar -o opentelemetry-javaagent.jar

# Run with full instrumentation

java -javaagent:./opentelemetry-javaagent.jar \
-Dotel.service.name=payment-service \
-Dotel.exporter.otlp.endpoint=http://your-collector:4318 \
-Dotel.exporter.otlp.protocol=http/protobuf \
-Dotel.metrics.exporter=none \
-Dotel.logs.exporter=none \
-Dotel.instrumentation.jdbc.statement-sanitizer.enabled=true \
-Dotel.instrumentation.common.db-statement-sanitizer.enabled=true \
-Dotel.resource.attributes=deployment.environment=production,service.version=2.5.1 \
-Dotel.propagators=tracecontext,baggage \
-Dotel.javaagent.debug=false \
-jar your-application.jar

For containerized environments, integrate the agent directly into your Docker image:

FROM eclipse-temurin:17-jre-alpine
RUN apk add --no-cache curl

# Add the OpenTelemetry agent
RUN curl -L https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar \
    -o /opt/opentelemetry-javaagent.jar

# Copy your application
COPY target/payment-service.jar /opt/app.jar

# Configure the agent via environment variables
ENV JAVA_TOOL_OPTIONS="-javaagent:/opt/opentelemetry-javaagent.jar"
ENV OTEL_SERVICE_NAME="payment-service"
ENV OTEL_EXPORTER_OTLP_ENDPOINT="http://sematext-agent:4318"
ENV OTEL_METRICS_EXPORTER="none"
ENV OTEL_LOGS_EXPORTER="none"

ENTRYPOINT ["java", "-jar", "/opt/app.jar"]

 

Node.js Microservices (Express, Fastify, NestJS)

The Node.js instrumentation requires a small initialization file but then automatically instruments all supported packages.

// tracing.js - Initialize before your application code

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');

const traceExporter = new OTLPTraceExporter({
  url:
    process.env.OTEL_EXPORTER_OTLP_ENDPOINT ||
    'http://localhost:4318/v1/traces',
  headers: {},
});

const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]:
      process.env.SERVICE_NAME || 'api-gateway',
    [SemanticResourceAttributes.SERVICE_VERSION]:
      process.env.SERVICE_VERSION || '1.0.0',
    [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]:
      process.env.NODE_ENV || 'development',
  }),

  spanProcessor: new BatchSpanProcessor(traceExporter, {
    maxQueueSize: 2048,
    maxExportBatchSize: 512,
    scheduledDelayMillis: 5000,
  }),

  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-fs': {
        enabled: false, // Too noisy for production
      },

      '@opentelemetry/instrumentation-http': {
        requestHook: (span, request) => {
          span.setAttribute(
            'http.request.body.size',
            request.headers['content-length']
          );
        },

        ignoreIncomingRequestHook: (request) => {
          // Ignore health checks and metrics endpoints
          return request.url?.match(/^\/(health|metrics|ready|live)/);
        },
      },

      '@opentelemetry/instrumentation-aws-sdk': {
        suppressInternalInstrumentation: true,
      },
    }),
  ],
});

sdk.start();

// Graceful shutdown
process.on('SIGTERM', () => {
  sdk
    .shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) =>
      console.log('Error terminating tracing', error)
    )
    .finally(() => process.exit(0));
});

Start your application with the initialization:

node --require ./tracing.js app.js

Python Microservices (FastAPI, Django, Flask)

Python auto-instrumentation uses the opentelemetry-instrument command to wrap your application:

# Install the required packages
pip install opentelemetry-distro[otlp] opentelemetry-instrumentation
# Bootstrap to install all available instrumentations
opentelemetry-bootstrap --action=install
# Run with auto-instrumentation
OTEL_SERVICE_NAME=cart-service \
OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4318 \
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \
OTEL_METRICS_EXPORTER=none \
OTEL_LOGS_EXPORTER=none \
OTEL_RESOURCE_ATTRIBUTES="service.version=1.2.3,deployment.environment=production" \
opentelemetry-instrument python app.py

For production deployments using Gunicorn or uWSGI:

# gunicorn_config.py
import os
from opentelemetry import trace
from opentelemetry.instrumentation.auto_instrumentation import sitecustomize

def post_fork(server, worker):
    # Force re-initialization after fork
    sitecustomize.initialize()

bind = "0.0.0.0:8000"
workers = 4
worker_class = "uvicorn.workers.UvicornWorker"

.NET Microservices (ASP.NET Core)

.NET instrumentation can be done via NuGet packages or using the automatic instrumentation agent:

// Program.cs

using OpenTelemetry.Exporter;
using OpenTelemetry.Instrumentation.AspNetCore;
using OpenTelemetry.Instrumentation.Http;
using OpenTelemetry.Instrumentation.SqlClient;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var builder = WebApplication.CreateBuilder(args);

// Configure OpenTelemetry
builder.Services
    .AddOpenTelemetry()
    .ConfigureResource(resource => resource
        .AddService("inventory-service", serviceVersion: "2.1.0")
        .AddAttributes(new Dictionary<string, object>
        {
            ["deployment.environment"] = builder.Environment.EnvironmentName,
            ["host.name"] = Environment.MachineName
        }))
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation(options =>
        {
            options.Filter = httpContext =>
            {
                // Exclude health checks
                return !httpContext.Request.Path.Value?.Contains("health") ?? true;
            };

            options.RecordException = true;
        })
        .AddHttpClientInstrumentation()
        .AddSqlClientInstrumentation(options =>
        {
            options.SetDbStatementForText = true;
            options.RecordException = true;
            options.SetDbStatementForStoredProcedure = true;
        })
        .AddEntityFrameworkCoreInstrumentation()
        .AddRedisInstrumentation()
        .AddOtlpExporter(otlpOptions =>
        {
            otlpOptions.Endpoint = new Uri("http://your-collector:4318");
            otlpOptions.Protocol = OtlpExportProtocol.HttpProtobuf;
        })
        .SetSampler(new TraceIdRatioBasedSampler(0.1))); // 10% sampling

var app = builder.Build();
app.Run();

From Instrumentation to Insights: What’s Next?

With OpenTelemetry auto-instrumentation now running across your microservices, you’re collecting comprehensive trace data from every request, database query, and service interaction. The agents are capturing timing, errors, and context automatically. But instrumentation is just the foundation.

The real value of distributed tracing comes from using this data to:

Debug Production Issues – Traces reveal performance problems that are invisible in logs or metrics alone. Issues like N+1 database queries, connection pool exhaustion, service dependency bottlenecks, and timeout cascades become immediately apparent in trace visualizations.

Optimize for Production Scale – While auto-instrumentation works out of the box, production deployments require careful tuning. From implementing intelligent sampling strategies to ensuring context propagation across async boundaries, there are proven patterns for running OpenTelemetry at scale. Learn these critical configurations and avoid common pitfalls in OpenTelemetry Instrumentation Best Practices for Microservices Observability.

Extract Operational Intelligence – Raw traces contain rich insights about your system’s behavior. By analyzing span relationships and attributes, you can build service dependency maps, identify critical paths that impact latency, detect performance regressions between deployments, and understand resource utilization patterns.

The following sections provide a foundation for using your newly instrumented traces effectively, with links to our detailed guides for deeper exploration.

How Sematext Uses OpenTelemetry

OpenTelemetry with auto-instrumentation provides extensive data collection, but you need a backend to store and analyze this data. While open-source options like Jaeger and Zipkin work well for development, and commercial APMs like Datadog require proprietary agents, Sematext Tracing offers a fully OpenTelemetry-native platform that handles the scale and cardinality of production microservices without vendor lock-in.

Frequently Asked Questions

Does OpenTelemetry impact microservices performance?

Auto-instrumentation typically adds 2-5% CPU overhead and 30-50 MB of memory per service, according to official benchmarks. With 10% sampling, the impact is negligible for most production workloads. Performance impact can be further minimized by disabling noisy instrumentations and optimizing batch processor settings; see our guide to OpenTelemetry Instrumentation Best Practices for Microservices Observability for detailed performance tuning strategies.
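
For example, the Java agent can be switched to a head-based 10% ratio sampler purely through configuration; the equivalent OTEL_TRACES_SAMPLER environment variables work for the other SDKs:

# Keep roughly 10% of traces while honoring the caller's sampling decision
java -javaagent:./opentelemetry-javaagent.jar \
  -Dotel.traces.sampler=parentbased_traceidratio \
  -Dotel.traces.sampler.arg=0.1 \
  -jar your-application.jar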

OpenTelemetry vs commercial APM tools – what’s the difference?

OpenTelemetry provides vendor-neutral instrumentation that works with any backend. Commercial APMs use proprietary agents that lock you to their platform. OpenTelemetry gives you freedom to switch backends (i.e. observability vendors) without re-instrumenting your entire stack.
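
In practice, switching backends usually comes down to pointing the existing instrumentation at a different OTLP endpoint via the standard environment variables (the endpoint and key below are placeholders):

# Re-point the same instrumentation at another OTLP-compatible backend
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.new-vendor.example.com
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=YOUR_KEY"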

Can OpenTelemetry handle production scale?

Yes. Companies like Uber and Netflix use OpenTelemetry-based tracing at massive scale, processing billions of spans daily. The key is choosing a backend that can handle your data volume and implementing appropriate sampling strategies. Learn how to configure OpenTelemetry for high-volume production deployments in our comprehensive guide: OpenTelemetry Instrumentation Best Practices for Microservices Observability.

Is OpenTelemetry production-ready?

OpenTelemetry tracing reached stability in 2021 and is production-ready for all major languages. Major cloud providers and observability vendors now support OTLP natively.

Conclusion

OpenTelemetry’s auto-instrumentation agents handle the complexity of trace collection, context propagation, and data formatting. They work across languages and frameworks, providing consistent telemetry regardless of your technology stack. The zero-code approach means you can instrument legacy services, third-party applications, and rapidly evolving microservices with equal ease.

By combining OpenTelemetry auto-instrumentation with an appropriate backend, you create a production-ready observability solution that scales from proof-of-concept to enterprise deployment. Auto-instrumentation provides the data, and modern backends provide the intelligence to deliver the visibility you need to operate distributed systems with confidence.

The future of observability isn’t about instrumenting more code; it’s about extracting more value from the instrumentation that happens automatically.