
Prometheus Command-Line Configuration

# Docker Compose — Prometheus service definition
prometheus:
  image: prom/prometheus:v2.51.0
  command:
    - "--config.file=/etc/prometheus/prometheus.yml"
    - "--storage.tsdb.path=/prometheus"
    - "--storage.tsdb.retention.time=90d"         # Keep 90 days of metrics
    - "--storage.tsdb.retention.size=100GB"        # Cap storage at 100GB
    - "--web.enable-remote-write-receiver"          # Accept remote write (for OTel Collector)
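    - "--web.enable-lifecycle"                      # Enable /-/reload and /-/quit endpoints
    - "--web.enable-admin-api"                      # Enable admin API (snapshots, series deletion)
    - "--rules.alert.resend-delay=1m"               # Minimum wait before re-sending a firing alert to Alertmanager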
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
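    - ./alert-rules.yml:/etc/prometheus/alert-rules.yml
    - ./recording-rules.yml:/etc/prometheus/recording-rules.yml   # Referenced by rule_files in prometheus.yml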
    - prometheus-data:/prometheus
  ports:
    - "9090:9090"
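
The service definition mounts a named volume, prometheus-data, and Compose expects named volumes to be declared under a top-level volumes key. A minimal sketch of that declaration, assuming the service sits under services: in the same Compose file and that the default local volume driver is acceptable:

```yaml
# docker-compose.yml (top level, alongside the services: key)
volumes:
  prometheus-data: {}   # Named volume holding the TSDB; default local driver assumed
```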

prometheus.yml — Main Configuration

# prometheus.yml
global:
  scrape_interval: 15s        # Default scrape frequency
  evaluation_interval: 15s    # Rule evaluation frequency (recording and alerting rules)
  scrape_timeout: 10s

  # Labels attached to series and alerts sent to external systems
  # (remote write, federation, Alertmanager); not added to locally stored series
  external_labels:
    cluster: 'bizfirst-prod-us-east-1'
    environment: 'production'

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

# Alert and recording rules files
rule_files:
  - "/etc/prometheus/alert-rules.yml"
  - "/etc/prometheus/recording-rules.yml"

# Scrape configurations — see 02-scrape-config.html for full details
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8888']

  - job_name: 'bizfirst-processengine'
    static_configs:
      - targets: ['processengine:8080']
    metrics_path: '/metrics'
    scrape_interval: 15s
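
The rule_files section above points at two files whose contents are not shown. The following is only a sketch of what such a file might contain: the group names, rule names, threshold, and severity label are hypothetical. Both rule types are shown in one file for brevity, whereas the configuration above splits them into alert-rules.yml and recording-rules.yml.

```yaml
# Hypothetical rule file; all names and thresholds are illustrative assumptions
groups:
  - name: availability-alerts
    rules:
      # Fires when the scrape of the process engine target has failed for 5 minutes
      - alert: ProcessEngineDown
        expr: up{job="bizfirst-processengine"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Target {{ $labels.instance }} has been unreachable for 5 minutes"

  - name: scrape-health-recording
    rules:
      # Precomputes the fraction of healthy targets per job
      - record: job:up:avg
        expr: avg by (job) (up)
```

Rules in each group are evaluated at the global evaluation_interval unless the group sets its own interval.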

Storage Sizing

Run the Prometheus TSDB on SSD storage: it performs many small random reads and writes, and rotational disks (HDD) cause severe performance degradation.

Scenario           Active Series  Retention  Disk Required                RAM Required
Development        ~5,000         15 days    10 GB SSD                    512 MB
Small production   ~50,000        90 days    50 GB SSD                    4 GB
Medium production  ~500,000       90 days    500 GB SSD                   16 GB
Large production   5M+            90 days    Use Thanos + object storage  64 GB+
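
The disk figures above can be sanity-checked with the rule of thumb from the Prometheus storage documentation: disk space is roughly retention time multiplied by ingestion rate and bytes per sample. The worked example below uses the small production row, assuming the 15s scrape interval from prometheus.yml and 1.5 bytes per sample (an assumed midpoint of the 1 to 2 bytes Prometheus typically needs after compression):

```latex
\begin{align*}
\text{disk} &\approx \text{retention (s)} \times \text{samples/s} \times \text{bytes/sample} \\
\text{samples/s} &= \frac{50{,}000 \text{ series}}{15 \text{ s}} \approx 3{,}333 \\
\text{retention} &= 90 \times 86{,}400 \text{ s} = 7{,}776{,}000 \text{ s} \\
\text{disk} &\approx 7{,}776{,}000 \times 3{,}333 \times 1.5 \text{ B} \approx 3.9 \times 10^{10} \text{ B} \approx 39 \text{ GB}
\end{align*}
```

The estimate lands below the 50 GB in the table, which leaves headroom for the write-ahead log and compaction. By the same arithmetic the medium production row needs roughly 390 GB, so the --storage.tsdb.retention.size=100GB cap shown earlier would start deleting data well before 90 days in that scenario.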

Remote Write — Sending Metrics to Prometheus

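Besides scraping, Prometheus can accept metrics pushed to its remote write endpoint when it is started with --web.enable-remote-write-receiver (set in the Compose command above). The OTel Collector uses this path to forward the metrics it receives via OTLP: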

# otel-collector-config.yaml — remote write to Prometheus
exporters:
  prometheusremotewrite:
    endpoint: "http://prometheus:9090/api/v1/write"
    tls:
      insecure: true
    headers:
      X-Prometheus-Remote-Write-Version: "0.1.0"
    resource_to_telemetry_conversion:
      enabled: true  # Convert OTel resource attributes to Prometheus labels
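
An exporter only takes effect once it is referenced in a pipeline. The sketch below shows one way the exporter above might be wired to an OTLP receiver. It assumes the Collector contrib distribution (which ships prometheusremotewrite) and the default OTLP ports 4317 and 4318; the exporter settings are abbreviated to the endpoint.

```yaml
# otel-collector-config.yaml: minimal metrics pipeline (sketch, ports are assumed defaults)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"

exporters:
  prometheusremotewrite:
    endpoint: "http://prometheus:9090/api/v1/write"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheusremotewrite]
```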

Prometheus is Single-Node by Default

The default Prometheus deployment is a single instance with no built-in replication. If that instance goes down, metric collection and rule evaluation stop until it recovers, although the monitored services keep serving traffic. For HA deployments, run two Prometheus instances that scrape the same targets, and use Thanos Querier to deduplicate the results. See Guide11: Enterprise Options.
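
Deduplication in Thanos Querier relies on each replica carrying an external label that differs only between replicas. A minimal sketch of the first replica's configuration, assuming a label named replica (the label name and file names are illustrative, not taken from this guide):

```yaml
# prometheus-a.yml (hypothetical file name for the first replica)
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: 'bizfirst-prod-us-east-1'
    environment: 'production'
    replica: 'a'   # The second replica's file would set replica: 'b'; everything else stays identical
```

Thanos Querier would then be started with --query.replica-label=replica so that series differing only in that label are merged at query time.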