BizFirst Observe
Thanos for Metrics HA
Thanos extends Prometheus with long-term storage in object storage (such as S3), high availability via multiple Prometheus replicas with query-time deduplication, and a global query view across multiple Prometheus instances. It is a widely used approach for production Prometheus deployments that need more retention or redundancy than a single server provides.
Thanos Architecture
| Component | Role | Required For |
|---|---|---|
| Thanos Sidecar | Runs alongside Prometheus; uploads TSDB blocks to S3; exposes gRPC StoreAPI | Long-term storage, HA |
| Thanos Store Gateway | Serves historical data from S3 — makes old blocks queryable | Long-term storage |
| Thanos Querier | Federates queries across Sidecar + Store Gateway + multiple Prometheus replicas; deduplicates | HA, global view |
| Thanos Compact | Compacts and downsamples historical S3 blocks | Cost optimization |
| Thanos Ruler | Evaluates recording and alerting rules against Thanos Querier results | Cross-cluster rules |
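The Store Gateway and Compact rows above have no deployment example elsewhere in this section, so here is a minimal sketch of their container arguments. It assumes the same image and object storage config file as the sidecar. The container names, data directories, and retention values are illustrative assumptions, not recommendations from this guide.

```yaml
# Sketch only: container fragments for the two bucket-facing components.
# Names, paths, and retention periods are hypothetical; adjust to your cluster.
---
# Store Gateway: serves blocks already uploaded to the bucket over gRPC.
- name: thanos-store-gateway
  image: thanosio/thanos:v0.35.0
  args:
    - store
    - --data-dir=/var/thanos/store                 # local cache for index data
    - --objstore.config-file=/etc/thanos/s3.yaml
    - --http-address=0.0.0.0:10902
    - --grpc-address=0.0.0.0:10901
---
# Compactor: run exactly one instance per bucket.
- name: thanos-compact
  image: thanosio/thanos:v0.35.0
  args:
    - compact
    - --wait                                       # keep running instead of exiting after one pass
    - --data-dir=/var/thanos/compact
    - --objstore.config-file=/etc/thanos/s3.yaml
    - --retention.resolution-raw=30d               # assumed retention for raw samples
    - --retention.resolution-5m=90d                # assumed retention for 5m downsampled data
    - --retention.resolution-1h=1y                 # assumed retention for 1h downsampled data
```

Running more than one Compactor against the same bucket can corrupt blocks, so it is usually deployed as a single-replica workload.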
Thanos Sidecar Deployment
# kubernetes/prometheus-deployment.yaml: add the Thanos sidecar to the Prometheus pod
# (fragment: selector, serviceName, and volume definitions are omitted)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
spec:
  template:
    spec:
      containers:
        # Existing Prometheus container:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--storage.tsdb.path=/prometheus'
            # Short local retention: Thanos handles long-term storage.
            # Thanos recommends at least 3x the block duration (6h) so that
            # blocks are not deleted before the sidecar has uploaded them.
            - '--storage.tsdb.retention.time=6h'
            # Equal min/max block durations disable local compaction,
            # which the sidecar needs in order to upload blocks safely.
            - '--storage.tsdb.min-block-duration=2h'
            - '--storage.tsdb.max-block-duration=2h'
            - '--web.enable-lifecycle'
        # Thanos sidecar container (add alongside Prometheus):
        - name: thanos-sidecar
          image: thanosio/thanos:v0.35.0
          args:
            - sidecar
            - --tsdb.path=/prometheus
            - --prometheus.url=http://localhost:9090
            - --objstore.config-file=/etc/thanos/s3.yaml
            - --http-address=0.0.0.0:10902
            - --grpc-address=0.0.0.0:10901
          volumeMounts:
            # Must be the same data volume that the Prometheus container mounts.
            - name: prometheus-data
              mountPath: /prometheus
            - name: thanos-objstore-config
              mountPath: /etc/thanos
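The sidecar reads its bucket settings from the file passed to --objstore.config-file. A minimal sketch of what /etc/thanos/s3.yaml might contain is shown here. The bucket name, endpoint, and region are placeholder assumptions, and the file would typically be mounted from a Kubernetes Secret.

```yaml
# /etc/thanos/s3.yaml (sketch; every value below is a placeholder)
type: S3
config:
  bucket: thanos-metrics                  # hypothetical bucket name
  endpoint: s3.us-east-1.amazonaws.com    # assumed AWS regional endpoint
  region: us-east-1                       # assumed region
  # access_key / secret_key are left out on purpose. When both are absent,
  # Thanos should fall back to the standard AWS credential sources
  # (environment variables, shared credentials file, instance or pod IAM role).
```

The Store Gateway and Compactor need the same file so that all components read and write the same bucket.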
Thanos Querier — Global Query Endpoint
# Point Grafana at Thanos Querier instead of Prometheus directly.
# This gives access to both recent data (from the Prometheus sidecars) and
# historical data (from Store Gateway / S3).
# grafana-provisioning/datasources/prometheus.yaml:
apiVersion: 1
datasources:
  - name: Prometheus (Thanos)
    type: prometheus
    access: proxy
    url: http://thanos-querier:10902  # Thanos Querier, not Prometheus directly
    jsonData:
      httpMethod: GET
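# Thanos Querier deployment.
# --query.replica-label names the external label used for deduplication.
# Endpoints, in order: sidecar on replica 0, sidecar on replica 1,
# Store Gateway (historical blocks from S3). Adjust hostnames to your service DNS.
# --endpoint replaces the deprecated --store flag.
# Comments must not follow a trailing backslash, or the line continuation breaks.
thanos query \
  --http-address=0.0.0.0:10902 \
  --grpc-address=0.0.0.0:10901 \
  --query.replica-label=prometheus_replica \
  --endpoint=prometheus-0:10901 \
  --endpoint=prometheus-1:10901 \
  --endpoint=thanos-store-gateway:10901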
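Deduplication on prometheus_replica only works if each Prometheus replica actually carries an external label with that name and a distinct value. A minimal sketch of the relevant part of prometheus.yml for replica 0 follows. The cluster label and its value are hypothetical, added only to show a label that stays identical across replicas.

```yaml
# prometheus.yml on replica 0 (sketch).
# Replica 1 uses the same file except for the prometheus_replica value.
global:
  external_labels:
    cluster: prod-example              # hypothetical; identical on both replicas
    prometheus_replica: prometheus-0   # differs per replica; matches --query.replica-label
```

All other external labels should match between replicas. The Querier treats series that differ only in the replica label as duplicates and merges them.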
Grafana Queries Thanos Querier, Not Prometheus Directly
After adding Thanos, update the Grafana Prometheus data source URL from the Prometheus endpoint to the Thanos Querier endpoint (http://thanos-querier:10902). The Thanos Querier transparently deduplicates data from multiple Prometheus replicas and serves historical data from S3 — no dashboard changes required.