Prometheus Metric Retention — Data Retention & Archive

Prometheus Local Retention

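# Set via command-line flags when starting Prometheus.
# Both retention limits apply; whichever is reached first triggers deletion of the oldest blocks.
#   --storage.tsdb.retention.time   keep 90 days of data locally
#   --storage.tsdb.retention.size   cap block storage at 100GB
#   --storage.tsdb.wal-compression  compress the write-ahead log (already the default since Prometheus 2.20)
#   --web.enable-lifecycle          enable the HTTP /-/reload and /-/quit endpoints
prometheus \
  --storage.tsdb.path=/var/lib/prometheus \
  --storage.tsdb.retention.time=90d \
  --storage.tsdb.retention.size=100GB \
  --storage.tsdb.wal-compression \
  --web.enable-lifecycle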

# In Docker Compose:
services:
  prometheus:
    image: prom/prometheus:latest
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=90d'
      - '--storage.tsdb.retention.size=100GB'
      - '--web.enable-lifecycle'
    volumes:
      - ./data/prometheus:/prometheus
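
With --web.enable-lifecycle set, Prometheus accepts an HTTP POST to /-/reload and re-reads its configuration file. The sketch below shows one way to trigger that; the helper name reload_prometheus and the default address http://localhost:9090 are assumptions for illustration. Command-line flags, including the retention settings, are read only at startup, so changing them requires a restart rather than a reload.

```shell
# Hypothetical helper: ask a running Prometheus to reload prometheus.yml.
# Requires --web.enable-lifecycle; the default URL below is an assumption.
reload_prometheus() {
  curl -fsS -X POST "${1:-http://localhost:9090}/-/reload"
}

# Usage (not executed here):
#   reload_prometheus                          # local instance
#   reload_prometheus http://prometheus:9090   # e.g. the Compose service name
```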

Prometheus TSDB Storage Sizing

Quantity	Formula	Example (15 s scrape interval)
Bytes per sample	~1.5 bytes (compressed; typically 1-2)	1.5 bytes
Samples per series per day	86400 / scrape_interval_seconds	86400 / 15 = 5760 samples
Bytes per series per day	samples_per_day × bytes_per_sample	5760 × 1.5 = ~8.6 KB
Total storage (90 days, 10,000 series)	series × bytes_per_series_per_day × days	10,000 × 8.6 KB × 90 = ~7.7 GB (very manageable on local disk)
Total storage (90 days, 500,000 series)	series × bytes_per_series_per_day × days	500,000 × 8.6 KB × 90 = ~387 GB (high cardinality; requires disk planning and exceeds the 100GB size cap configured above)
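
The sizing arithmetic above can be scripted for other series counts or scrape intervals. This is a rough sketch only: the function name estimate_tsdb_gb is hypothetical, the 1.5 bytes per sample default is the table's estimate rather than a measured value, and WAL and index overhead are ignored.

```shell
# Hypothetical helper: estimate Prometheus TSDB block storage in GB (10^9 bytes).
#   $1 = active series
#   $2 = scrape interval in seconds
#   $3 = retention in days
#   $4 = bytes per sample (optional; defaults to the ~1.5 byte estimate)
estimate_tsdb_gb() {
  LC_ALL=C awk -v series="$1" -v interval="$2" -v days="$3" -v bps="${4:-1.5}" 'BEGIN {
    samples_per_day = 86400 / interval
    printf "%.1f\n", series * samples_per_day * bps * days / 1000000000
  }'
}

estimate_tsdb_gb 10000 15 90    # 10,000 series, 15 s scrape, 90 days
estimate_tsdb_gb 500000 15 90   # 500,000 series, 15 s scrape, 90 days
```

The two calls print 7.8 and 388.8, slightly above the table's figures because the table rounds 8.64 KB per series per day down to 8.6 KB.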

Thanos for Long-Term Metric Storage

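Prometheus's local TSDB is limited to a single node's disk and is not replicated, so Prometheus alone is not well suited to retention beyond 1-2 years at high cardinality. Thanos extends it with object storage for historical blocks and a query layer that deduplicates data from highly available Prometheus replicas: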

# Thanos architecture for BizFirst:
# 1. Thanos Sidecar runs alongside Prometheus — uploads TSDB blocks to S3
# 2. Thanos Store Gateway reads historical blocks from S3
# 3. Thanos Querier federates queries across Prometheus + Store Gateway

# prometheus.yml — remote write to Thanos Receive (alternative approach):
remote_write:
  - url: http://thanos-receive:19291/api/v1/receive
    queue_config:
      max_samples_per_send: 5000
      max_shards: 10
      capacity: 10000

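# Thanos Sidecar (runs alongside Prometheus). For block uploads, Thanos expects Prometheus to run with
# --storage.tsdb.min-block-duration=2h and --storage.tsdb.max-block-duration=2h (local compaction disabled):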
thanos sidecar \
  --tsdb.path=/prometheus \
  --prometheus.url=http://localhost:9090 \
  --objstore.config-file=/etc/thanos/s3.yaml \
  --http-address=0.0.0.0:10902 \
  --grpc-address=0.0.0.0:10901

# S3 object store config for Thanos:
# /etc/thanos/s3.yaml
type: S3
config:
  bucket: bizfirst-prometheus-thanos
  endpoint: s3.amazonaws.com
  region: us-east-1
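  # Assumption: these ${...} placeholders are rendered to real values before Thanos starts (e.g. with envsubst).
  # If both keys are omitted, Thanos falls back to the standard AWS credential sources (env vars, ~/.aws/credentials, instance profile).
  access_key: ${AWS_ACCESS_KEY}
  secret_key: ${AWS_SECRET_KEY}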

Downsampling for Long-Term Storage

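Thanos Compact can downsample metrics over time, reducing resolution for older data so that long-range queries stay fast while trends are preserved: raw (15s) → 5-minute resolution (for blocks older than 40 hours) → 1-hour resolution (for blocks older than 10 days). Downsampled blocks store aggregates (count, sum, min, max, and counter values) rather than plain averages, and they are written in addition to the raw blocks, so downsampling alone does not shrink the bucket. Long-term storage costs drop only when the Compactor's per-resolution retention settings expire raw data earlier than the downsampled data.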
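
Retention per resolution is set with the Compactor's --retention.resolution-* flags. The sketch below shows one possible invocation; the wrapper name start_thanos_compact, the data directory, and the retention values are illustrative assumptions, and the object store file is the /etc/thanos/s3.yaml shown earlier.

```shell
# Hypothetical wrapper around the Thanos Compactor; paths and retention
# values are illustrative assumptions, not recommendations.
start_thanos_compact() {
  # raw: keep full-resolution data for 90 days
  # 5m:  keep 5-minute data for 1 year
  # 1h:  0d means keep 1-hour data indefinitely
  # --wait keeps the process running and compacting as new blocks arrive
  thanos compact \
    --data-dir=/var/thanos/compact \
    --objstore.config-file=/etc/thanos/s3.yaml \
    --retention.resolution-raw=90d \
    --retention.resolution-5m=1y \
    --retention.resolution-1h=0d \
    --wait
}

# Usage (not executed here):
#   start_thanos_compact
```

Run only one Compactor against a given bucket, and keep raw retention long enough for downsampling to have completed before the raw blocks are deleted.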
