Trace Ingestion — Distributed Traces

Ingestion Path

# Ingestion flow:
BizFirst Service
  → OTel SDK (ActivitySource, auto-instrumentation)
    → OTLP/gRPC export to OTel Collector (port 4317)
      → OTel Collector processors (sampling, PII redaction)
        → OTLP/gRPC export to Tempo (port 4317)
          → Tempo WAL → Tempo object storage (S3)

OTel Collector — Tempo Exporter

# otel-collector-config.yaml — traces pipeline
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  # Tail sampling: a trace is kept if any policy below matches.
  # Errors and slow traces (>2s) are always kept; 10% of all other traces are sampled.
  tail_sampling:
    decision_wait: 10s
    num_traces: 100000
    policies:
      - name: "always-sample-errors"
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: "always-sample-slow"
        type: latency
        latency: {threshold_ms: 2000}
      - name: "sample-10pct-success"
        type: probabilistic
        probabilistic: {sampling_percentage: 10}

  # Drop sensitive span attributes
  transform/trace-redact:
    trace_statements:
      - context: span
        statements:
          - delete_key(attributes, "user.password")
          - delete_key(attributes, "http.request.body")

  # Batch spans before export. The traces pipeline references `batch`,
  # so it must be declared here or the Collector will fail to start.
  batch:

exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true               # Use TLS in production

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, transform/trace-redact, batch]
      exporters: [otlp/tempo]

Trace Context Propagation

The OTel SDK automatically propagates trace context when BizFirst services make HTTP calls to each other. The W3C TraceContext format is used via the traceparent header:

# W3C traceparent header format:
# traceparent: {version}-{traceId}-{parentSpanId}-{flags}
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

# Breakdown:
# version: 00
# traceId: 4bf92f3577b34da6a3ce929d0e0e4736  (same across all services in a request)
# parentSpanId: 00f067aa0ba902b7               (the calling span's ID)
# flags: 01 (sampled)
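
The breakdown above can be checked mechanically. The following is a minimal Python sketch of parsing a version-00 traceparent value; it is illustrative only, not the code the OTel SDK runs. The names `TraceParent` and `parse_traceparent` are hypothetical and do not come from any library.

```python
import re
from typing import NamedTuple

# Version-00 layout: 2-32-16-2 lowercase hex digits, separated by dashes.
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the four traceparent fields."""
    version: str
    trace_id: str
    parent_span_id: str
    sampled: bool


def parse_traceparent(value: str) -> TraceParent:
    """Split a traceparent header value into its fields, or raise ValueError."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_span_id, flags = match.groups()
    # The W3C spec reserves version ff and forbids all-zero IDs.
    if version == "ff":
        raise ValueError("version ff is not allowed")
    if trace_id == "0" * 32 or parent_span_id == "0" * 16:
        raise ValueError("all-zero trace or span ID is not allowed")
    # Bit 0 of the flags byte is the "sampled" flag.
    sampled = bool(int(flags, 16) & 0x01)
    return TraceParent(version, trace_id, parent_span_id, sampled)


if __name__ == "__main__":
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    print(parse_traceparent(header))
```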

The OTel SDK in each BizFirst service automatically:

Reads the traceparent header on inbound requests and continues the trace
Writes the traceparent header on outbound HttpClient requests (auto-instrumented)
Creates child spans that are properly linked to the parent
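
To make the three steps above concrete, here is a rough Python sketch of what "continuing a trace" means at the header level. The SDK does this for you, so this is not code a BizFirst service needs to write; `continue_trace` is a hypothetical name used only for illustration.

```python
import secrets
from typing import Tuple


def continue_trace(inbound_traceparent: str) -> Tuple[str, str]:
    """Return (child_span_id, outbound_traceparent) for an inbound header.

    The trace ID and flags are carried over unchanged; only the span ID
    changes, so the downstream service sees this span as its parent.
    """
    parts = inbound_traceparent.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"malformed traceparent: {inbound_traceparent!r}")
    version, trace_id, parent_span_id, flags = parts

    # A span ID is 8 random bytes, rendered as 16 hex characters.
    child_span_id = secrets.token_hex(8)
    while child_span_id in ("0" * 16, parent_span_id):
        child_span_id = secrets.token_hex(8)

    outbound = f"{version}-{trace_id}-{child_span_id}-{flags}"
    return child_span_id, outbound


if __name__ == "__main__":
    inbound = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    span_id, outbound = continue_trace(inbound)
    print("child span:", span_id)
    print("outbound  :", outbound)
```

The child span records `00f067aa0ba902b7` as its parent, and the outbound header carries the child's own ID in the third field. This is how spans from different services end up linked within one trace.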

Sampling Strategy

Scenario	Sample Rate	Rationale
Development	100%	All traces needed for debugging
Production — error traces	100%	Never lose an error trace
Production — slow traces (>2s)	100%	Always capture performance issues
Production — success traces	5–10%	Sufficient for representative analysis
Production — health checks	0%	No value; filter out completely

Use Tail Sampling, Not Head Sampling

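Head sampling decides at the start of a request whether to sample it, before the outcome is known. At a low sample rate, most requests that later fail or turn out to be slow are therefore discarded. Tail sampling (via the OTel Collector) buffers the spans of each trace, waits for the decision window to elapse (decision_wait: 10s, counted from the first span the Collector receives for that trace), then decides whether to keep the trace based on what it has seen. With a status-code policy for errors and a latency policy for slow requests, error and slow traces are retained no matter how low the success sample rate is. Spans that arrive after the window closes are not part of the evaluation, so the window should be longer than the longest request you expect to trace.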
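
The decision logic from the sampling table can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Collector's implementation: the `Span` type, the `keep_trace` function, the `GET /health` span name, and the use of the trace ID's last eight hex digits as a sampling bucket are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

SLOW_THRESHOLD_MS = 2000        # "slow traces (>2s)" from the table
SUCCESS_SAMPLE_PERCENT = 10     # upper end of the 5-10% success rate


@dataclass
class Span:
    """Hypothetical, minimal span record; the first span is assumed to be the root."""
    name: str
    duration_ms: float
    is_error: bool = False


def keep_trace(
    trace_id: str,
    spans: List[Span],
    health_check_names: Sequence[str] = ("GET /health",),  # assumed route name
) -> bool:
    """Decide whether to keep a completed trace, evaluated after all spans arrived."""
    if not spans:
        return False
    # Health checks: 0%, filtered out completely.
    if spans[0].name in health_check_names:
        return False
    # Error traces: 100%.
    if any(span.is_error for span in spans):
        return True
    # Slow traces: 100%. Trace duration is approximated by the longest span.
    if max(span.duration_ms for span in spans) > SLOW_THRESHOLD_MS:
        return True
    # Remaining traces: deterministic percentage bucket derived from the trace ID.
    bucket = int(trace_id[-8:], 16) % 100
    return bucket < SUCCESS_SAMPLE_PERCENT


if __name__ == "__main__":
    tid = "4bf92f3577b34da6a3ce929d0e0e4736"
    print(keep_trace(tid, [Span("POST /orders", 120.0, is_error=True)]))
    print(keep_trace(tid, [Span("GET /reports", 3500.0)]))
    print(keep_trace(tid, [Span("GET /health", 2.0)]))
```

A head sampler cannot apply the error or latency checks, because neither fact is known when the first span starts. Those two checks are the whole point of deferring the decision.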
