Using AI to Instrument Applications with OpenTelemetry

Updated on: May 21, 2026

OpenTelemetry is one of the best things that’s happened to observability in the last decade. It’s open. It has SDKs for every language that matters. It’s vendor neutral. The OTel community has been doing the hard work of standardizing how applications emit telemetry, so that you, the engineer, don’t have to learn five different agent formats to monitor five different services.

But there’s a part of the OTel pitch that often gets glossed over: somebody still has to instrument the application. And that part isn’t quick or easy. Even now.

The instrumentation tax

Modern applications aren’t a single binary anymore. At Sematext we run 30+ microservices to power the Sematext Cloud platform: alerts, metrics receivers and consumers, log receivers and consumers, user experience services, tracing pipelines, network map, various APIs and more, across Java (Spring Boot), Go, and a few other stacks.

That’s a lot of surface area: several languages, multiple frameworks, and different build systems for different stacks. None of this is unusual for a mature system, but it means instrumentation effort grows with that diversity. Any mechanism that reduces instrumentation mistakes is welcome to the engineers doing the work.

If you want end-to-end tracing through that stack, the kind that actually tells you where a slow request spent its time, you can’t just instrument one service. You have to instrument the whole chain: frontend → API gateway → backend service A → backend service B → database. Skip a hop and the trace breaks. The dependency graph the trace gives you stops being useful exactly at the boundary you didn’t instrument.
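
To make the "skip a hop and the trace breaks" point concrete, here is a minimal sketch of what each hop has to do, assuming the W3C Trace Context `traceparent` header that OpenTelemetry SDKs propagate by default. The function names are hypothetical, and header lookup is simplified to a lowercase dict key. In a real service the OTel SDK does this for you.

```python
import re
import secrets

# W3C Trace Context layout: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_span_id() -> str:
    """Return a random 8-byte span ID as 16 hex characters."""
    return secrets.token_hex(8)


def outgoing_traceparent(incoming_headers: dict) -> str:
    """Build the traceparent header for a call to the next service.

    A valid incoming header keeps its trace ID, so the downstream span
    joins the same trace. Without one, a new trace ID is generated, which
    is what an uninstrumented hop effectively forces on every service
    behind it.
    """
    match = TRACEPARENT_RE.match(incoming_headers.get("traceparent", ""))
    if match:
        _version, trace_id, _parent_id, flags = match.groups()
    else:
        trace_id, flags = secrets.token_hex(16), "01"
    return f"00-{trace_id}-{new_span_id()}-{flags}"
```

Calling `outgoing_traceparent` with the header a caller sent keeps the 32-character trace ID and swaps in a fresh span ID. Calling it with `{}` returns a header with a different trace ID, and that is the point where the dependency graph splits in two.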

So in practice, “let’s adopt OpenTelemetry” turns into a checklist of dozens of services that each need their own instrumentation work. The good news is that it doesn’t have to happen all at once, and AI can help.

How much does OpenTelemetry instrumentation cost?

By cost, we mean the cost to an engineer, the team, and the organization. We can look at it as a non-monetary cost, but if we trace (pun intended!) this cost all the way down then yes, there is also a financial cost associated with this effort.

Three things make this hard, even with OTel:

Prioritization. Instrumenting a service competes with shipping features and fixing bugs. It’s preventive work; its value shows up the next time something breaks at 3am, not this sprint. That’s a hard sell to a product manager.

Unknown territory. When the chain spans services you didn’t write, in languages you don’t use day to day, you’re spending most of your time on context switch overhead. You’re not adding instrumentation; you’re re-learning a framework you saw once two years ago.

Time needed even for auto-instrumentation. “Auto-instrumentation” means no code changes. It doesn’t mean no work. For one service the loop typically goes:

  1. Read the right OTel SDK docs for your language
  2. Pick the right auto-instrumentation package (there are usually three options, only one of which is current)
  3. Install it in the build (pom.xml, package.json, requirements.txt, …)
  4. Configure the OTLP endpoint, the auth header, the service name
  5. Restart, hit the service, watch what happens
  6. Debug the first attempt: wrong port (4317 vs 4318 vs 4338), wrong protocol (http/protobuf vs grpc), wrong auth header (Bearer vs vendor-specific), region mismatch on the endpoint
  7. Verify the data lands in the right place in your observability tool
  8. Multiply by the number of services in your chain
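
Most of the step 6 failures are mechanical, so they can be caught before the first restart. The sketch below checks a dict of the standard `OTEL_*` environment variables for the classic port and protocol swap, using the OTLP default ports (4317 for gRPC, 4318 for HTTP). The `preflight` function and its rules are illustrative assumptions, not part of any OTel SDK, and a non-default port is left alone because a vendor or agent may legitimately use one.

```python
from urllib.parse import urlparse

# Default OTLP ports per protocol value of OTEL_EXPORTER_OTLP_PROTOCOL.
DEFAULT_PORTS = {"grpc": 4317, "http/protobuf": 4318, "http/json": 4318}


def preflight(env: dict) -> list:
    """Return human-readable warnings for a dict of OTEL_* settings."""
    warnings = []
    if not env.get("OTEL_SERVICE_NAME"):
        warnings.append("OTEL_SERVICE_NAME is not set")

    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL", "")
    if protocol not in DEFAULT_PORTS:
        warnings.append(f"unknown OTEL_EXPORTER_OTLP_PROTOCOL: {protocol!r}")

    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    url = urlparse(endpoint)
    if url.scheme not in ("http", "https") or not url.hostname:
        warnings.append(f"endpoint is not an http(s) URL: {endpoint!r}")
        return warnings

    try:
        port = url.port  # raises ValueError for a non-numeric port
    except ValueError:
        warnings.append(f"endpoint has an invalid port: {endpoint!r}")
        return warnings

    if protocol in DEFAULT_PORTS and port is not None:
        # Only flag a known default port that belongs to the other protocol.
        swapped = {4317, 4318} - {DEFAULT_PORTS[protocol]}
        if port in swapped:
            warnings.append(
                f"port {port} is the default for a different OTLP protocol "
                f"than {protocol}"
            )
    return warnings
```

Running `preflight(dict(os.environ))` in a container entrypoint or CI step is one way to use it. An empty list means none of these particular mistakes were found, not that the export will succeed.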

Forty minutes to two hours per service if you’re moving carefully, and that’s just for traces and metrics. The OTel auto-instrumentation packages don’t ship logs in most SDKs. For logs you need to switch to manual instrumentation, which is another SDK init block per service.

And then there’s custom OpenTelemetry instrumentation

The above buys you “spans for every incoming HTTP request” and a generic metrics set. The moment you want anything specific (e.g., a custom span attribute for the user’s account tier, a business metric counting checkouts, a log enriched with the trace ID so you can correlate logs and traces for faster Root Cause Analysis), you’re back in manual-instrumentation land, writing SDK code in every service you care about. The auto path ends; the per-language SDK learning curve begins.
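
As one illustration of that manual work, here is a sketch of the log-correlation piece using only Python's standard `logging` module. It assumes a context variable, `current_trace_id`, as a stand-in for the active trace ID. In an instrumented service that value would come from the OTel SDK's current span. The logger name, log format, and `demo` helper are hypothetical.

```python
import contextvars
import io
import logging

# Stand-in for the active trace ID. A real service would read this from
# the OTel SDK's current span instead of setting it by hand.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only enrich it


def build_logger(stream) -> logging.Logger:
    """Create a logger whose lines carry trace_id=<id> for correlation."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger = logging.getLogger("checkout")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def demo() -> str:
    """Log once inside a trace and once outside, returning the output."""
    buf = io.StringIO()
    log = build_logger(buf)
    token = current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
    log.info("checkout completed")
    current_trace_id.reset(token)
    log.info("health check")
    return buf.getvalue()
```

The first line returned by `demo()` carries the trace ID and the second carries `-`, so a log search on `trace_id=` lands on the lines that belong to a given trace. The same filter has to be repeated, in a different idiom, for every language in the chain.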

For one service that’s an afternoon. For thirty services that’s a quarter. What can we do about this?

Can we use AI to instrument applications with OpenTelemetry?

The instrumentation work (pick the SDK, set the env vars, debug the endpoint, verify it landed) is exactly the kind of structured, repetitive, well-documented task an AI agent does well. The blockers aren’t intellectual; they’re “look up the right thing, paste it in the right place, watch for the obvious gotcha.”

That’s not “let the AI do your engineering.” It’s “let the AI do the parts that already had a right answer, written down somewhere, and just needed someone to fetch it.”

We tried this for instrumenting applications against Sematext Cloud. The result is a small but highly valuable open-source artifact: a Claude Code Agent Skill that walks an engineer through OTel instrumentation conversationally. It’s plain markdown, lives in our public GitHub repository, and works with any AI agent that can read a URL.

What the OTel instrumentation AI skill does

The Sematext OTel skill at sematext-otel-onboarding/blob/main/skills/sematext-otel.md is the AI-readable version of “how to wire your application to Sematext.” When loaded into Claude Code (or any agent that can fetch a markdown URL), it triages the user through six short questions:

  1. Sematext region (US or EU)
  2. Which App types you’re wiring (Tracing, Logs, Monitoring, any combination)
  3. Flow: managed OTLP endpoint or Sematext Agent
  4. Protocol: HTTP (default) or gRPC
  5. Language and deployment environment
  6. Auto or manual instrumentation

Then it produces the exact env-var block, parameterized to your answers, including:

  • The correct OTLP endpoint URL for your region and protocol
  • The Sematext-specific X-API-TOKEN header (different from the standard Authorization: Bearer … most OTel docs show, easy to miss)
  • One header per signal type, so you only configure what you’re using
  • A pointer to a runnable reference example in the same repo, in your language
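
A rough sketch of how those answers could map onto an env-var block, using the standard OTel per-signal header variables (`OTEL_EXPORTER_OTLP_TRACES_HEADERS` and its logs and metrics siblings). This is not the skill's actual output. The `render_env_block` function, the App-type-to-signal mapping, the one-token-per-App-type input, and the endpoint used in the usage note are all assumptions made for illustration. Only the `X-API-TOKEN` header name comes from the text above.

```python
import shlex

# Maps the App types named above to OTel signal names. Pairing
# "Monitoring" with metrics is an assumption made for this sketch.
SIGNAL_FOR_APP = {"Tracing": "TRACES", "Logs": "LOGS", "Monitoring": "METRICS"}


def render_env_block(service: str, endpoint: str, protocol: str, tokens: dict) -> str:
    """Render shell export lines for the chosen signals only.

    tokens maps an App type ("Tracing", "Logs", "Monitoring") to the token
    for that App. Signals without a token get no header line.
    """
    settings = [
        ("OTEL_SERVICE_NAME", service),
        ("OTEL_EXPORTER_OTLP_ENDPOINT", endpoint),
        ("OTEL_EXPORTER_OTLP_PROTOCOL", protocol),
    ]
    for app_type, token in tokens.items():
        signal = SIGNAL_FOR_APP[app_type]
        settings.append(
            (f"OTEL_EXPORTER_OTLP_{signal}_HEADERS", f"X-API-TOKEN={token}")
        )
    # shlex.quote keeps values with spaces or shell metacharacters intact.
    return "\n".join(f"export {name}={shlex.quote(value)}" for name, value in settings)
```

For example, `render_env_block("checkout", "https://otlp.example.invalid:4318", "http/protobuf", {"Tracing": "abc123"})` yields four `export` lines, with a header line for traces and none for logs or metrics. The endpoint there is a placeholder, not a real Sematext URL.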

The skill is not only for adding instrumentation to uninstrumented applications. You can also use it to fix instrumentation that is broken. Auto-instrumentation doesn’t ship logs? The skill flags that and asks if you want to switch to manual. Region-token mismatch? The skill warns explicitly. Custom header convention? Documented. The skill is opinionated, and it covers the setup details where the official OpenTelemetry docs may be silent or hard to follow.

What an instrumentation session looks like

In practice, an engineer with Claude Code in their editor opens their service’s directory and pastes:

Use https://github.com/sematext/sematext-otel-onboarding/blob/main/skills/sematext-otel.md to instrument this app for Sematext.
Region: US. App type: Tracing. Token: <pasted-from-Sematext-UI>.

Claude loads the skill, reads the project’s files to figure out the language and framework, asks the two remaining triage questions, then proposes the diff to the project (adds the OTel SDK to the build file, adds the env vars to docker-compose or systemd or .env or wherever they belong, and shows you what to expect in the Sematext UI within 60 seconds). You review the diff. You apply. You restart. You see traces.

Compared to the manual path (read docs, pick SDK, install, configure, debug, verify), we’d expect the time to first data to drop from the typical 40 to 120 minutes per service to a handful of minutes. For an organization adopting OTel across dozens of services, that compounds quickly. A quarter of part-time effort becomes a couple of focused days. Thousands of dollars in engineering time drop to a much saner number. The effort has a positive ROI.

What the skill doesn’t do

This is where AI posts usually start hand-waving. Here’s the honest list:

  • It doesn’t write custom span attributes or business metrics for you. It writes the boilerplate that gets you to the point where you can write those. The judgment about what to measure is still yours. The benefit is that the skill leaves you with working instrumentation as scaffolding, so adding custom or business metrics takes significantly less effort.
  • It doesn’t psychic-debug your network. If your service can’t reach the OTLP endpoint on first run (corporate proxy, missing TLS cert chain, wrong port), the skill points at the common causes, but you still have to look at the service’s own logs to confirm what happened.
  • It doesn’t change OTel’s reality. Auto-instrumentation still doesn’t ship logs in most SDKs. AI doesn’t fix the SDK. But it does tell you up front, so you don’t spend an hour wondering why your Logs App is empty.
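
For the network case, a first check you can run yourself is whether the OTLP port is reachable at all from the machine the service runs on. The sketch below uses only the Python standard library. The `can_reach` name is hypothetical, and the check is deliberately narrow: a successful TCP connection says nothing about TLS certificate chains, intercepting proxies, or auth headers.

```python
import socket
from urllib.parse import urlparse


def can_reach(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint's host:port succeeds."""
    url = urlparse(endpoint)
    if not url.hostname:
        return False
    # Fall back to the scheme's default port when the URL does not name one.
    port = url.port or (443 if url.scheme == "https" else 80)
    try:
        with socket.create_connection((url.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
```

If `can_reach` returns False for the endpoint in your configuration, the problem is below the OTel layer (DNS, firewall, proxy, or wrong port), and the service's own exporter logs are the next place to look.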

Try it, the skill is vendor-agnostic

The skill is open-source and lives in our OTel onboarding repo. The same repo has runnable reference apps for Node.js, Java, Python, .NET, and PHP across bare metal, Docker, and Kubernetes deployments, so if you want to see what the skill is going to walk you through, the reference is right there.

If you have Claude Code, point it at the URL above. If you use a different AI agent, the skill is just markdown. Load it however your agent loads documentation. There’s no install step; there’s no vendor lock-in. The skill shared is not Sematext-specific. Sematext’s contribution is the knowledge, encoded in a format an AI can act on.

Where this is going

The OTel skill is one example of a broader pattern we think makes sense for observability tools: knowledge as something an AI can use, not just something a human can read.

A few directions we’re exploring:

  • Per-language sub-skills for deeper, opinionated guidance when a language has subtle gotchas (Node async hooks, Java agent attach, Python startup ordering)
  • In-product wiring so the App creation page in Sematext Cloud gives you a one-click “use AI to set this up” alongside the existing manual instructions, with your region and token pre-filled into the prompt
  • Skills for the rest of the observability journey (picking sensible default alerts, creating dashboards, interpreting RCA results), each as a small, auditable markdown file you can use, fork, or ignore

The bet is that the value of an observability platform isn’t just the data it collects; it’s how quickly an engineer can go from “we should monitor this” to “we’re monitoring it and we know what to do when it breaks.” AI doesn’t replace the judgment in that loop as of yet. But it can absolutely replace the busywork around it.

If you give the skill a try and find a gap, open a PR. The fastest way for this pattern to get good is for more people to use it on more apps.
