BizFirst Observe
Prometheus Metric Retention
Prometheus stores metrics in a local time-series database (TSDB) with configurable retention. To keep metrics longer than 90 days, use Thanos, which ships TSDB blocks to object storage and deduplicates samples from HA Prometheus replicas at query time.
Prometheus Local Retention
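# Set via command-line flags when starting Prometheus.
# A comment cannot follow a trailing backslash, so the flags are described here:
#   --storage.tsdb.retention.time   keep 90 days of data locally
#   --storage.tsdb.retention.size   also delete the oldest blocks once the TSDB exceeds 100GB
#                                   (whichever limit is reached first applies)
#   --storage.tsdb.wal-compression  enable WAL compression
#   --web.enable-lifecycle          allow hot reload over HTTP
prometheus \
  --storage.tsdb.path=/var/lib/prometheus \
  --storage.tsdb.retention.time=90d \
  --storage.tsdb.retention.size=100GB \
  --storage.tsdb.wal-compression \
  --web.enable-lifecycle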
# In Docker Compose:
services:
  prometheus:
    image: prom/prometheus:latest
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=90d'
      - '--storage.tsdb.retention.size=100GB'
      - '--web.enable-lifecycle'
    volumes:
      - ./data/prometheus:/prometheus
Prometheus TSDB Storage Sizing
| Quantity | Formula | Example (15s scrape interval) |
|---|---|---|
| Bytes per sample | ~1.5 bytes (compressed) | - |
| Samples per series per day | 86400 / scrape_interval | 86400 / 15 = 5760 samples/day |
| Bytes per series per day | samples/day × bytes per sample | 5760 × 1.5 = ~8.6 KB/series/day |
| Total storage (90 days, 10,000 series) | series × bytes per series per day × days | 10,000 × 8.6 KB × 90 = ~7.7 GB (very manageable on local disk) |
| Total storage (90 days, 500,000 series) | series × bytes per series per day × days | 500,000 × 8.6 KB × 90 = ~387 GB (high cardinality; requires disk planning and a retention.size limit well above the 100GB used in the examples above) |
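The sizing arithmetic in the table can be scripted so it is easy to rerun with your own series count and scrape interval. The sketch below is only an estimate under the table's assumptions: the function names `estimate_tsdb_bytes` and `estimate_tsdb_gb` are hypothetical helpers, the 1.5 bytes per sample default is the table's approximation, and real usage also depends on churn, WAL size, and label cardinality. Results come out slightly higher than the table because the table rounds 8.64 KB down to 8.6 KB.

```shell
#!/usr/bin/env bash
# Rough Prometheus TSDB disk estimate (hypothetical helper functions).
# Formula: series × (86400 / scrape_interval) × bytes_per_sample × retention_days

# Usage: estimate_tsdb_bytes SERIES SCRAPE_INTERVAL_SECONDS RETENTION_DAYS [BYTES_PER_SAMPLE]
estimate_tsdb_bytes() {
  local series="$1" scrape_interval_s="$2" retention_days="$3"
  local bytes_per_sample="${4:-1.5}"   # assumption: ~1.5 bytes per compressed sample
  awk -v s="$series" -v i="$scrape_interval_s" \
      -v d="$retention_days" -v b="$bytes_per_sample" \
      'BEGIN { printf "%.0f\n", s * (86400 / i) * b * d }'
}

# Same arguments as above; prints decimal gigabytes with one decimal place.
estimate_tsdb_gb() {
  awk -v bytes="$(estimate_tsdb_bytes "$@")" \
      'BEGIN { printf "%.1f\n", bytes / 1e9 }'
}

estimate_tsdb_gb 10000 15 90     # prints 7.8
estimate_tsdb_gb 500000 15 90    # prints 388.8
```

Pass a fourth argument, for example `estimate_tsdb_gb 500000 15 90 2`, to see how sensitive the result is to a less favorable compression ratio.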
Thanos for Long-Term Metric Storage
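Prometheus alone is not suitable for retention beyond 1-2 years at high cardinality, because all data sits on a single local disk with no replication. Thanos extends Prometheus with object storage for historical blocks and a single query view across HA replicas: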
# Thanos architecture for BizFirstGO:
# 1. Thanos Sidecar runs alongside Prometheus and uploads TSDB blocks to S3
# 2. Thanos Store Gateway reads historical blocks from S3
# 3. Thanos Querier fans queries out to the Sidecar (recent data) and the Store Gateway (historical data)
# prometheus.yml: remote write to Thanos Receive (alternative approach):
remote_write:
  - url: http://thanos-receive:19291/api/v1/receive
    queue_config:
      max_samples_per_send: 5000
      max_shards: 10
      capacity: 10000
# Thanos Sidecar (runs alongside Prometheus):
thanos sidecar \
  --tsdb.path=/prometheus \
  --prometheus.url=http://localhost:9090 \
  --objstore.config-file=/etc/thanos/s3.yaml \
  --http-address=0.0.0.0:10902 \
  --grpc-address=0.0.0.0:10901
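# S3 object store config for Thanos:
# /etc/thanos/s3.yaml
# The ${...} values are placeholders: render them to real credentials before
# Thanos reads this file, or omit both keys to fall back to the standard AWS
# credential sources (environment variables, ~/.aws/credentials, instance profile).
type: S3
config:
  bucket: bizfirst-prometheus-thanos
  endpoint: s3.amazonaws.com
  region: us-east-1
  access_key: ${AWS_ACCESS_KEY}
  secret_key: ${AWS_SECRET_KEY}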
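The architecture outline lists a Store Gateway and a Querier, but only the Sidecar has an example above. A minimal sketch of the other two components might look like the following. The hostnames (`prometheus-host`, `thanos-store`), ports, data directory, and the `replica` label name are all assumptions for illustration, and the replica label only deduplicates if each Prometheus instance sets a matching external label. Older Thanos releases use `--store` where this sketch uses `--endpoint`, so check the flags against the version you run.

```shell
# Store Gateway: serves historical blocks from the same bucket the Sidecar uploads to.
# /var/thanos/store is a hypothetical local cache directory.
thanos store \
  --data-dir=/var/thanos/store \
  --objstore.config-file=/etc/thanos/s3.yaml \
  --http-address=0.0.0.0:10906 \
  --grpc-address=0.0.0.0:10905

# Querier: fans queries out to the Sidecar (recent data) and the Store Gateway
# (historical data), merging series that differ only by the replica label.
thanos query \
  --http-address=0.0.0.0:10904 \
  --grpc-address=0.0.0.0:10903 \
  --endpoint=prometheus-host:10901 \
  --endpoint=thanos-store:10905 \
  --query.replica-label=replica
```

Point Grafana or other PromQL clients at the Querier's HTTP address rather than at Prometheus directly, so that queries can reach data older than the local retention window.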
Downsampling for Long-Term Storage
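Thanos Compact can downsample metrics over time, adding lower-resolution copies of older data so that long-range queries stay fast and trends are preserved: raw (15s) → 5-minute resolution (for blocks older than 40 hours) → 1-hour resolution (for blocks older than 10 days). Each downsampled series stores aggregates (count, sum, min, max, counter) rather than a single average. Downsampling on its own adds blocks rather than removing them; the long-term storage savings come from keeping raw data for a shorter period than downsampled data, using the compactor's --retention.resolution-raw, --retention.resolution-5m, and --retention.resolution-1h flags.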
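As a sketch of how this could be configured, the compactor invocation below keeps raw samples for 90 days, 5-minute data for one year, and 1-hour data indefinitely. The retention values, the data directory, and the HTTP port are assumptions chosen for illustration, not recommendations; raw retention should in any case be long enough for the downsampling passes to run before the raw blocks are deleted.

```shell
# Thanos Compact: compacts, downsamples, and applies retention to blocks in the bucket.
# Run only one compactor per bucket; concurrent compactors can corrupt block data.
# A retention value of 0d means "keep forever".
thanos compact \
  --data-dir=/var/thanos/compact \
  --objstore.config-file=/etc/thanos/s3.yaml \
  --retention.resolution-raw=90d \
  --retention.resolution-5m=365d \
  --retention.resolution-1h=0d \
  --http-address=0.0.0.0:10912 \
  --wait
```

The `--wait` flag keeps the process running and repeating the compaction cycle instead of exiting after one pass, which suits a long-lived service or container.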