Tempo Overview
Grafana Tempo is the distributed tracing backend for BizFirst Observe. It ingests traces from all BizFirstGO services via OTLP, stores them in object storage at minimal cost, and enables trace search and correlation with logs and metrics through Grafana.
What Distributed Tracing Solves
When a workflow execution fails or runs slowly, logs and metrics tell you that something went wrong and what the error was. Distributed traces tell you exactly where in the execution chain the problem occurred — including which service call, which external API, and how long each component took.
Without Distributed Tracing
An engineer checks the logs of five services separately, tries to correlate timestamps by hand, and still cannot see the full execution chain or which external call caused the latency.
With Distributed Tracing
An engineer opens one trace in Tempo, sees the complete execution waterfall across all services, and immediately identifies that DataFetchNode spent 4.2 s on an external API call at step 3.
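To make the waterfall idea concrete, here is a minimal sketch of how a trace's spans can be arranged into a tree and searched for the span with the most self time. The `Span` class, the helper functions, and the sample data are all hypothetical and only mirror the DataFetchNode scenario above. Tempo and Grafana do this for you, and real spans carry many more fields.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    start_ms: int             # offset from the start of the trace
    duration_ms: int


def self_time_ms(span: Span, spans: List[Span]) -> int:
    """Time spent in the span itself, excluding its direct children.

    Simplifying assumption: children run sequentially and do not overlap.
    """
    child_total = sum(s.duration_ms for s in spans if s.parent_id == span.span_id)
    return span.duration_ms - child_total


def slowest_span(spans: List[Span]) -> Span:
    """Return the span with the largest self time."""
    return max(spans, key=lambda s: self_time_ms(s, spans))


def render_waterfall(spans: List[Span]) -> List[str]:
    """Return one indented text line per span, parents before children."""
    children: Dict[Optional[str], List[Span]] = {}
    for s in spans:
        children.setdefault(s.parent_id, []).append(s)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for s in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            lines.append(f"{'  ' * depth}{s.name} (+{s.start_ms} ms, {s.duration_ms} ms)")
            walk(s.span_id, depth + 1)

    walk(None, 0)
    return lines


# Hypothetical trace: one workflow execution with three nodes,
# where the third node waits on an external API.
TRACE = [
    Span("a1", None, "workflow.execute", 0, 5000),
    Span("b1", "a1", "ValidateNode", 10, 150),
    Span("b2", "a1", "TransformNode", 170, 300),
    Span("b3", "a1", "DataFetchNode", 480, 4300),
    Span("c1", "b3", "HTTP GET external-api", 520, 4200),
]

for line in render_waterfall(TRACE):
    print(line)
print("slowest:", slowest_span(TRACE).name)
```

Ranking by self time rather than total duration keeps the root span from always winning, since the root's duration includes every child.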
Key Concepts
| Concept | Definition | BizFirstGO Example |
|---|---|---|
| Trace | Complete record of one end-to-end request | One workflow execution from start to end |
| Span | One unit of work within a trace, with start time and duration | Execution of one workflow node |
| Root Span | The outermost span — parent of all others in the trace | workflow.execute span |
| TraceId | 128-bit unique identifier for the entire trace | Appears in logs, metrics exemplars, and Tempo |
| SpanId | 64-bit unique identifier for one span | Links a log line to the specific span that emitted it |
| Trace context | TraceId + SpanId propagated via HTTP headers | traceparent header on all outbound HTTP calls |
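The `traceparent` header follows the W3C Trace Context format: `version-traceid-spanid-flags`, all lowercase hex. The sketch below shows how the 128-bit TraceId and 64-bit SpanId fit into that header and how a child span keeps the TraceId while taking a new SpanId. The function names are illustrative assumptions. In practice the OpenTelemetry SDK generates and propagates this header for you, and this sketch only handles version `00`.

```python
import re
import secrets
from typing import Optional

# version (2 hex) - trace id (32 hex) - span id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled: bool = True) -> str:
    """Build a traceparent header for a new root span."""
    trace_id = secrets.token_hex(16)  # 16 bytes = 128 bits
    span_id = secrets.token_hex(8)    # 8 bytes = 64 bits
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> Optional[dict]:
    """Parse a traceparent header; return None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or span ids.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(parent_header: str) -> Optional[str]:
    """Header for an outbound call: same trace id and flags, new span id."""
    parent = parse_traceparent(parent_header)
    if parent is None:
        return None
    flags = "01" if parent["sampled"] else "00"
    return f"00-{parent['trace_id']}-{secrets.token_hex(8)}-{flags}"


# Example header taken from the W3C Trace Context specification.
incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(incoming))
print(child_traceparent(incoming))
```

Because every service in the chain reuses the same `trace_id`, log lines and spans from different services can be joined on that one value.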
Why Tempo Instead of Jaeger or Zipkin?
| Feature | Tempo | Jaeger | Zipkin |
|---|---|---|---|
| Storage backend | Object storage (S3/GCS/Blob) — very cheap | Cassandra or Elasticsearch — expensive | Cassandra or Elasticsearch — expensive |
| OTel native | Yes (OTLP is a first-class citizen) | Yes (native OTLP receiver since v1.35) | Partial (needs translation) |
| Index | Bloom filters (no separate index DB) | Elasticsearch index | Cassandra index |
| Grafana integration | Native (both developed by Grafana Labs) | Built-in data source | Built-in data source |
| Trace search | Attribute search via TraceQL (Tempo 2.0+) | Yes (via Elasticsearch) | Limited |
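As an illustration of attribute search, the sketch below builds a request URL for Tempo's `/api/search` endpoint with a TraceQL query in the `q` parameter. The host name, the port, and the helper function are assumptions made for this example, and the span name `workflow.execute` is taken from the Key Concepts table. Most users will run the same query from Grafana's Explore view instead.

```python
from urllib.parse import urlencode


def tempo_search_url(base_url: str, traceql: str, limit: int = 20) -> str:
    """Build a Tempo search URL; urlencode escapes braces, quotes, and spaces."""
    query = urlencode({"q": traceql, "limit": limit})
    return f"{base_url.rstrip('/')}/api/search?{query}"


# TraceQL: root-level workflow spans that took longer than 2 seconds.
SLOW_WORKFLOWS = '{ name = "workflow.execute" && duration > 2s }'

# Hypothetical Tempo address; replace with your own deployment's URL.
url = tempo_search_url("http://tempo.example.internal:3200", SLOW_WORKFLOWS)
print(url)
```

The returned URL can be passed to any HTTP client. The response lists matching traces, whose IDs can then be opened in Grafana.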
Use local filesystem storage for development only; for production, configure an S3-compatible bucket. Tempo flushes completed trace blocks to object storage and needs no separate database for storage or indexing. This keeps storage costs low: S3 Standard costs about $0.023/GB/month, whereas Cassandra and Elasticsearch need dedicated compute and storage clusters, which can cost 10–100x more for the same trace volume.
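For a rough sense of scale, the arithmetic behind the $0.023/GB/month figure can be sketched as follows. The ingest volume and retention period are made-up inputs, and the estimate ignores request charges, data transfer, and compression, so treat it as an order-of-magnitude guide rather than a bill.

```python
S3_STANDARD_USD_PER_GB_MONTH = 0.023  # price quoted in the text above


def monthly_storage_cost_usd(
    ingest_gb_per_day: float,
    retention_days: int,
    usd_per_gb_month: float = S3_STANDARD_USD_PER_GB_MONTH,
) -> float:
    """Steady-state monthly storage cost for a fixed retention window."""
    # Once retention is reached, the bucket holds roughly this much data.
    stored_gb = ingest_gb_per_day * retention_days
    return stored_gb * usd_per_gb_month


# Hypothetical workload: 50 GB of traces per day, kept for 14 days.
cost = monthly_storage_cost_usd(ingest_gb_per_day=50, retention_days=14)
print(f"~${cost:.2f} per month for 700 GB stored")
```

With these inputs the bucket holds about 700 GB at steady state, so the storage line item stays in the tens of dollars per month.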