What Distributed Tracing Solves

When a workflow execution fails or runs slowly, logs and metrics tell you that something went wrong and what the error was. Distributed traces tell you exactly where in the execution chain the problem occurred — including which service call, which external API, and how long each component took.

Without Distributed Tracing

An engineer checks the logs of five services separately, tries to correlate timestamps by hand, and still cannot see the full execution chain or which external call caused the latency.

With Distributed Tracing

An engineer opens one trace in Tempo, sees the complete execution waterfall across all services, and immediately identifies that DataFetchNode (step 3) spent 4.2s on an external API call.
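To make the waterfall idea concrete, here is a minimal sketch of how spans with parent links form a tree and how the slowest leaf span points at the real source of latency. The `Span` class, the span IDs, the timings, and every node name other than DataFetchNode and workflow.execute are invented for illustration; this is not Tempo's data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    start_ms: int             # offset from the start of the trace
    duration_ms: int


# Hypothetical spans for one workflow execution; all values are invented.
SPANS = [
    Span("a1", None, "workflow.execute", 0, 5000),
    Span("b1", "a1", "TriggerNode", 0, 100),
    Span("b2", "a1", "ValidateNode", 100, 200),
    Span("b3", "a1", "DataFetchNode", 300, 4400),
    Span("c1", "b3", "HTTP GET external-api", 400, 4200),
    Span("b4", "a1", "TransformNode", 4700, 300),
]


def depth(span, by_id):
    """Count how many ancestors a span has (the root span has depth 0)."""
    d = 0
    while span.parent_id is not None:
        span = by_id[span.parent_id]
        d += 1
    return d


def waterfall(spans):
    """Render spans as indented text lines, ordered by start time."""
    by_id = {s.span_id: s for s in spans}
    ordered = sorted(spans, key=lambda s: (s.start_ms, depth(s, by_id)))
    return [
        f"{'  ' * depth(s, by_id)}{s.name} start={s.start_ms}ms duration={s.duration_ms}ms"
        for s in ordered
    ]


def slowest_leaf(spans):
    """Return the longest span that has no children."""
    parent_ids = {s.parent_id for s in spans}
    leaves = [s for s in spans if s.span_id not in parent_ids]
    return max(leaves, key=lambda s: s.duration_ms)


if __name__ == "__main__":
    print("\n".join(waterfall(SPANS)))
    print("slowest leaf:", slowest_leaf(SPANS).name)
```

With per-service logs, the parent links would have to be reconstructed from timestamps. In a trace they are recorded explicitly, which is why the external call under DataFetchNode stands out at once.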

Key Concepts

| Concept | Definition | BizFirstGO Example |
| --- | --- | --- |
| Trace | Complete record of one end-to-end request | One workflow execution from start to end |
| Span | One unit of work within a trace, with start time and duration | Execution of one workflow node |
| Root Span | The outermost span, parent of all others in the trace | workflow.execute span |
| TraceId | 128-bit unique identifier for the entire trace | Appears in logs, metrics exemplars, and Tempo |
| SpanId | 64-bit unique identifier for one span | Links a log line to the specific span that emitted it |
| Trace context | TraceId + SpanId propagated via HTTP headers | traceparent header on all outbound HTTP calls |
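The traceparent header follows the W3C Trace Context format: a version, the 128-bit TraceId as 32 hex characters, the 64-bit SpanId of the calling span as 16 hex characters, and a flags byte. The sketch below builds and parses such a header using only the standard library. The function names are hypothetical, only version `00` is handled, and in practice an OpenTelemetry SDK does this for you.

```python
import re
import secrets

# Version "00": 32 hex chars of trace id, 16 hex chars of span id, 2 hex chars of flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_trace_id():
    """Random 128-bit TraceId as 32 lowercase hex characters."""
    return secrets.token_hex(16)


def new_span_id():
    """Random 64-bit SpanId as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def build_traceparent(trace_id, span_id, sampled=True):
    """Format the header value sent on an outbound HTTP call."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero identifiers are defined as invalid by the specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


if __name__ == "__main__":
    header = build_traceparent(new_trace_id(), new_span_id())
    print(header)
    print(parse_traceparent(header))
```

A downstream service keeps the incoming `trace_id`, generates its own SpanId, and records the incoming `span_id` as the parent. That is how spans from separate services end up in the same trace.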

Why Tempo Instead of Jaeger or Zipkin?

| Feature | Tempo | Jaeger | Zipkin |
| --- | --- | --- | --- |
| Storage backend | Object storage (S3, GCS, Azure Blob); very cheap | Cassandra or Elasticsearch; expensive | Cassandra or Elasticsearch; expensive |
| OTel native | Yes; OTLP is a first-class citizen | Yes in recent releases (native OTLP ingestion since v1.35); older versions need an adapter | Partial; needs translation |
| Index | Bloom filters (no separate index DB) | Elasticsearch index | Cassandra index |
| Grafana integration | Native; built by the same team at Grafana Labs | Built-in data source | Built-in data source |
| Trace search | Attribute search via TraceQL (Tempo 2.0+) | Yes (via Elasticsearch) | Limited |

Tempo Requires Object Storage for Production

Tempo's local filesystem storage is suitable only for development. For production, configure an S3-compatible bucket. Tempo writes traces directly to object storage; there is no intermediate database. This makes Tempo very cost-effective: S3 Standard costs about $0.023/GB/month, whereas Cassandra and Elasticsearch require dedicated compute and storage clusters, which can cost 10–100x more for the same trace volume.
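The cost comparison can be worked through as simple arithmetic. The sketch below uses the S3 Standard price quoted above; the ingest rate and retention period are hypothetical inputs. It ignores request charges, compression, and the compute needed to run Tempo itself, so treat the result as a rough lower bound rather than a quote.

```python
# Price quoted in the text; actual S3 pricing varies by region and storage tier.
S3_STANDARD_USD_PER_GB_MONTH = 0.023


def retained_gb(ingest_gb_per_day, retention_days):
    """Steady-state volume held in the bucket, ignoring compression."""
    return ingest_gb_per_day * retention_days


def monthly_storage_cost(stored_gb, usd_per_gb_month=S3_STANDARD_USD_PER_GB_MONTH):
    """Storage-only monthly cost in USD for the given volume."""
    return stored_gb * usd_per_gb_month


if __name__ == "__main__":
    # Hypothetical workload: 50 GB of traces per day, kept for 14 days.
    stored = retained_gb(ingest_gb_per_day=50, retention_days=14)
    s3_cost = monthly_storage_cost(stored)
    print(f"{stored} GB retained, about ${s3_cost:.2f}/month on S3 Standard")
    # The 10-100x multiplier is the range stated in the text, not a measurement.
    print(f"Same volume at 10-100x: ${s3_cost * 10:.2f} to ${s3_cost * 100:.2f}/month")
```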