Observability is essential for understanding how modern applications perform and behave in production. OpenTelemetry has emerged as the industry standard for collecting, processing, and exporting telemetry data—traces, metrics, and logs—without vendor lock-in. This guide will walk you through OpenTelemetry’s core components, how it works, and why it’s a game-changer for observability.
What is OpenTelemetry?
OpenTelemetry is an open-source observability framework and toolkit designed to generate, collect, manage, and export telemetry data including – but not limited to – traces, metrics, and logs.
It is vendor- and tool-agnostic, meaning it seamlessly integrates with various components, from open-source tools like Jaeger and Prometheus to commercial solutions like Sematext.
Why Should You Care About OpenTelemetry?
- Vendor-Agnostic & Open-Source: OpenTelemetry is not tied to any specific vendor; in other words, there is no vendor lock-in. It aims to remove the need for proprietary agents that collect vendor-specific observability data in a vendor-specific fashion. You can use, and easily switch between, Jaeger, Prometheus, Datadog, New Relic, or Sematext, to name a few. Of course, each vendor still offers its own features, so you’ll still want to compare vendors and choose the one whose features, costs, and so on suit you best.
- Standardized Instrumentation: Before OpenTelemetry, each monitoring tool had its own instrumentation method, leading to vendor lock-in. OpenTelemetry eliminates this fragmentation by providing a universal standard for instrumentation across different languages and libraries.
- Auto-Instrumentation for Faster Adoption: Manually adding instrumentation is tedious and error-prone. OpenTelemetry supports auto-instrumentation for many popular libraries and frameworks (e.g., Flask, FastAPI, Django, PostgreSQL), reducing the time and effort needed to get started.
- Improved Debugging & Faster Issue Resolution: Metrics, logs, and traces were once known as the Three Pillars of Observability. Correlating them helps you troubleshoot issues and, hopefully, find the root cause faster. However, each of these signals used to be collected separately, with no clear connector between them, so the correlation was never seamless. OpenTelemetry fixes that: by correlating traces, logs, and metrics, it enables teams to pinpoint root causes faster, reducing MTTR (Mean Time to Resolution) and improving system reliability.
Basic Architecture
At a high level, OpenTelemetry collects telemetry data from applications via SDKs, processes it with an optional Collector, and exports it to observability backends.
- Application SDKs – Libraries that instrument your code to collect traces, metrics and logs
- Optional Collector – A standalone service that can receive, process and export telemetry data
- Observability Backends – Systems that store and visualize your telemetry data
This simple pipeline provides flexibility in how you deploy OpenTelemetry. For detailed implementation options, see the “Architecture Approaches” section later in this guide, where we’ll explore different deployment models in depth.
Telemetry Signals
A signal refers to a stream of observability data. OpenTelemetry captures multiple types of telemetry data to give a complete picture of an application’s health and performance.
1. Traces: Traces track the journey of a request as it moves through different services and components in a system. They show how requests flow across services, helping developers identify bottlenecks and latency issues.
A trace consists of multiple spans, where each span represents a single operation or step in the request flow.
Traces help detect slow queries, network delays, and failures, making it easier to optimize performance and improve system reliability.
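To make the trace/span relationship concrete, here is a minimal Python sketch using the OpenTelemetry SDK with a console exporter; the span names are illustrative, not part of any standard.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Print finished spans to stdout so the trace structure is easy to inspect
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
tracer = trace.get_tracer(__name__)

# One trace: a parent span ("handle_request") with two child spans
with tracer.start_as_current_span("handle_request"):
    with tracer.start_as_current_span("query_database"):
        pass  # e.g., run a SQL query
    with tracer.start_as_current_span("render_response"):
        pass  # e.g., serialize the result

All three spans share the same trace ID, which is what lets a backend reassemble them into a single request timeline.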
The image below shows a trace with spans representing different operations and their durations.
2. Metrics: Metrics provide numerical measurements of system and application performance over time. They reveal trends like CPU usage, memory consumption, request latency, and error rates.
Unlike traces, which follow individual requests, metrics aggregate data to show system-wide patterns. OpenTelemetry supports counters, gauges, histograms, and other metric types that can be exported to monitoring platforms like Prometheus, Sematext, and Datadog.
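As a quick illustration, here is a minimal Python sketch of recording metrics with the OpenTelemetry SDK; the instrument names, attributes, and the console exporter are assumptions made for the example, not required values.

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# Export aggregated metrics to stdout every 5 seconds (a backend exporter would go here instead)
reader = PeriodicExportingMetricReader(ConsoleMetricExporter(), export_interval_millis=5000)
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
meter = metrics.get_meter(__name__)

# A counter and a histogram: two of the metric types mentioned above
request_counter = meter.create_counter("http.server.requests", description="Completed requests")
latency_histogram = meter.create_histogram("http.server.duration", unit="ms", description="Request latency")

request_counter.add(1, {"route": "/hello", "status_code": 200})
latency_histogram.record(42.0, {"route": "/hello"})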
The image below shows CPU usage on the Sematext dashboard, displaying trends over time, process-specific usage, and resource consumption insights.
3. Logs: Logs record events happening within an application, either in a structured or unstructured format. They capture important details such as errors, warnings, and system activities, making them essential for debugging.
OpenTelemetry enables logs to be correlated with traces, providing deeper context when troubleshooting issues. This correlation helps developers understand how specific events impact request flows.
Logs are also valuable for forensic analysis and long-term monitoring, allowing teams to track historical data and detect patterns over time.
Example of a structured log event:
{ "timestamp": "2025-02-14T15:30:00Z", "level": "INFO", "message": "User login successful", "service": "auth-service", "user_id": "12345", "ip_address": "192.168.1.10", "request_id": "abc123-def456-ghi789" }
4. Profiling (Experimental Feature): Profiling enhances observability by capturing detailed performance data at the code level. It helps developers analyze CPU usage, memory allocation, and execution time to identify inefficiencies – down to the line of code. OpenTelemetry’s continuous profiling runs with minimal overhead, making it suitable for production environments. By correlating profiles with traces and metrics, teams can connect high-level performance issues to specific code blocks, significantly accelerating troubleshooting and optimization.
OpenTelemetry is still expanding its support for profiling across different programming languages, making this an evolving and exciting space in observability.
Getting Started
SDKs
Role of SDKs in Instrumenting Applications
OpenTelemetry SDKs provide the APIs and tools needed to:
- Generate telemetry data (traces, metrics, logs)
- Auto-instrument applications
- Configure exporters, described shortly, to send data to observability tools
Each supported language has its own SDK, making it easy to integrate OpenTelemetry with different frameworks.
| Language | Auto-Instrumentation | Manual Instrumentation | Supported Libraries & Frameworks |
| --- | --- | --- | --- |
| Java | Traces, Metrics, Logs | Traces, Metrics, Logs | Spring, Quarkus, Micronaut, Jakarta EE, JDBC, Hibernate, gRPC, Kafka, Tomcat, Jetty |
| Node.js | Traces, Metrics | Traces, Metrics | Express, Koa, Fastify, NestJS, GraphQL, MongoDB, Redis, PostgreSQL, MySQL, AWS SDK |
| Python | Traces, Metrics, Logs | Traces, Metrics | Django, Flask, FastAPI, SQLAlchemy, Requests, aiohttp, Celery, PyMongo, Tornado |
| .NET | Traces, Metrics, Logs | Traces, Metrics, Logs | ASP.NET Core, Entity Framework, gRPC, HttpClient, WCF |
| PHP | Traces | Traces, Metrics, Logs | Laravel, Symfony, Guzzle, PDO, Slim, Laminas, Doctrine |
| Ruby | – | Traces | Elasticsearch Client, GraphQL, Koala, LMDB |
| Go | WIP | Traces, Metrics | Gin-gonic, Echo, Fiber, Go-redis, Gorilla mux, Zap |
| C++ | – | Traces, Metrics, Logs | httpd (Apache), Nginx, gRPC |
| Rust | – | Traces (not stable yet) | Actix Web, Axum, Tide, Trillium |
| Erlang | – | Traces | Cowboy, Ecto, Elli, grpcbox, Oban |
| Swift | – | Traces | URLSession, NautilusTelemetry |
OTLP Protocol
The OpenTelemetry Protocol (OTLP) is the default transport mechanism for OpenTelemetry. It standardizes how telemetry data is transmitted between applications, collectors, and observability platforms.
OTLP supports traces, metrics, and logs in a unified format and uses either gRPC or HTTP for data transfer. This ensures low latency and high throughput, making it suitable for large-scale distributed systems.
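As a concrete sketch, the Python SDK lets you choose either transport when constructing an OTLP span exporter; the endpoints below assume a local Collector listening on the default gRPC (4317) and HTTP (4318) ports without TLS.

# gRPC transport (package: opentelemetry-exporter-otlp-proto-grpc)
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter as GrpcSpanExporter
# HTTP/protobuf transport (package: opentelemetry-exporter-otlp-proto-http)
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter as HttpSpanExporter

# insecure=True because the assumed local Collector has no TLS configured
grpc_exporter = GrpcSpanExporter(endpoint="localhost:4317", insecure=True)
http_exporter = HttpSpanExporter(endpoint="http://localhost:4318/v1/traces")

Either exporter can then be attached to a BatchSpanProcessor, as shown in the manual instrumentation example later in this guide.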
Instrumentation
To collect telemetry data, applications need to be instrumented with OpenTelemetry SDKs or agents. Instrumentation can be done in two ways:
- Manual Instrumentation
- Auto-Instrumentation
Manual vs. Auto-Instrumentation
- Auto-Instrumentation: OpenTelemetry provides automatic instrumentation for many frameworks and libraries, requiring minimal or no code changes. Examples include:
- Java
- Python
- Node.js
- .NET
- PHP
- Manual Instrumentation: Developers can use OpenTelemetry SDKs to manually define custom traces, metrics, or logs in their code.
Manual and Auto-Instrumentation Examples
As described above, instrumentation can be done either automatically or manually. We’ll use Python to illustrate both approaches.
Auto-Instrumentation (Python)
Auto-instrumentation collects telemetry data without code changes.
1. Install dependencies
pip install opentelemetry-distro opentelemetry-exporter-otlp flask
opentelemetry-bootstrap -a install
2. Create a simple application (hello.py)
from flask import Flask

app = Flask(__name__)

@app.route("/hello")
def hello():
    return "Hello World"

if __name__ == "__main__":
    app.run(port=8080)
3. Configure OpenTelemetry Collector (config.yaml)
# Configuration for OpenTelemetry Collector with Sematext Exporter
# For more details, see:
# https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/sematextexporter
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:

exporters:
  sematext:
    timeout: 500ms
    region: US
    sending_queue:
      enabled: true
      num_consumers: 5
      queue_size: 100
    retry_on_failure:
      enabled: true
      initial_interval: 1s
      max_interval: 3s
      max_elapsed_time: 10s
    metrics:
      app_token: <METRICS_APP_TOKEN>
      payload_max_lines: 10000
      payload_max_bytes: 100000
    logs:
      app_token: <LOGS_APP_TOKEN>
  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [sematext]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [sematext]
4. Run with instrumentation
# Start collector
./otelcol-contrib --config=config.yaml

# Run instrumented application
opentelemetry-instrument \
  --traces_exporter otlp \
  --metrics_exporter otlp \
  --logs_exporter otlp \
  --service_name my_service \
  python hello.py
Manual Instrumentation (Python)
Manual instrumentation gives you precise control over what's traced.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Set up the tracer
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)

# Configure exporter (insecure=True assumes a local collector without TLS)
otlp_exporter = OTLPSpanExporter(endpoint="localhost:4317", insecure=True)
span_processor = BatchSpanProcessor(otlp_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

# Example function with manual instrumentation
def process_order(order_id):
    with tracer.start_as_current_span("process_order") as span:
        span.set_attribute("order.id", order_id)
        print(f"Processing order {order_id}")
        # Your business logic here

# Usage
process_order(123)
Both approaches send telemetry data to an observability platform such as Sematext Cloud, where you can view traces, metrics, and logs.
Architecture Approaches
Collector-Based Architecture
The Collector-Based Architecture introduces an additional OpenTelemetry Collector component between the application and the observability backend. This collector plays a crucial role in handling telemetry data before it reaches its final destination.
How It Works
- User Application (written in any language): The application is instrumented with the OpenTelemetry SDK. Auto-instrumentation collects telemetry data without requiring code modifications and transmits it via OTLP.
- OpenTelemetry Collector: A centralized service that processes incoming telemetry data. It consists of:
- Receivers – Accept telemetry data from different sources.
- Processors – Handle data enrichment, filtering, batching, and sampling.
- Exporters – Transform and send data to one or more observability backends.
- Vendor Backends: Telemetry data is forwarded to various backends such as Prometheus, Loki, Jaeger, Sematext, or Datadog.
Benefits
- Protocol Translation – The collector can receive data in one format and export it in another, allowing integration with various systems.
- Data Enrichment – It can add additional metadata, such as labels or resource attributes, before sending data to a backend.
- Filtering & Sampling – Helps reduce data volume by discarding unnecessary logs, traces, or metrics.
- Multiple Export Targets – Can send telemetry data to multiple destinations simultaneously.
Considerations
- Additional Component to Manage – Requires deploying and maintaining an extra service.
- More Complex Configuration – Needs proper setup to ensure optimal performance.
- Higher Resource Usage – The collector itself consumes CPU and memory, adding overhead.
Direct Integration Architecture
The Direct Integration Architecture eliminates the OpenTelemetry Collector, allowing the application to send telemetry data directly to an observability backend. This results in a more lightweight setup with fewer moving parts.
How It Works
- User Application (Any Language): The application is instrumented with the OpenTelemetry SDK. Auto-instrumentation collects telemetry data (traces, logs, and metrics) without requiring code modifications and transmits it via OTLP.
- Agent (with built-in OTLP support): Acts as an intermediary, receiving OTLP data directly from the SDK.
- Vendor Backends: Telemetry data is sent directly to a backend like Prometheus (metrics), Loki (logs), Jaeger (traces), Sematext, or Datadog.
Benefits
- Simpler Deployment – No need for an additional collector, reducing setup complexity.
- Lower Resource Footprint – Uses fewer CPU and memory resources.
- Direct Communication – Reduces latency since data is sent straight to the backend.
- Single Component to Manage – The agent is lightweight and easier to maintain.
Considerations
- Backend-Specific Implementation – Requires an observability backend that supports OTLP.
- No Sampling or Further Processing – Without an intermediary like the OpenTelemetry Collector, telemetry cannot be filtered, enriched, or tail-sampled in transit before it reaches the backend; any sampling has to happen in the SDK itself, as shown in the sketch below.
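For completeness, here is a minimal, hedged sketch of what SDK-level (head-based) sampling looks like in Python, which remains available even without a Collector; the 10% ratio is an arbitrary example value, not a recommendation.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Keep roughly 10% of root traces; child spans follow their parent's sampling decision
sampler = ParentBased(root=TraceIdRatioBased(0.1))
trace.set_tracer_provider(TracerProvider(sampler=sampler))

Tail-based sampling and other in-transit processing, by contrast, still require a Collector (or an agent that performs the same role).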