The Pull Model

Prometheus differs fundamentally from push-based metrics systems (such as StatsD, or InfluxDB with its line protocol). Instead of services sending metrics to Prometheus, Prometheus periodically sends an HTTP GET to each service's /metrics endpoint and scrapes the current values.

Pull Model Benefits

  • Prometheus controls the scrape rate, so a misbehaving service cannot flood it with data
  • Down services are detected at the next scrape (scrape fails → up{job="..."} = 0)
  • Simple for services: just expose /metrics, with no connection to maintain
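
To make the second benefit concrete, here is a minimal sketch of what a single scrape amounts to from the Prometheus side. ScrapeSketch and ScrapeOnceAsync are hypothetical names used only for illustration; Prometheus itself is not written in C#, and real scrapes also apply timeouts and parse the response body.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of Prometheus: mimics one scrape of one target.
public static class ScrapeSketch
{
    // Returns Up = 1 with the response body when GET /metrics answers 200 OK,
    // and Up = 0 with an empty body in every other case.
    public static async Task<(int Up, string Body)> ScrapeOnceAsync(HttpClient client, Uri target)
    {
        try
        {
            using var response = await client.GetAsync(new Uri(target, "/metrics"));
            if (response.StatusCode != HttpStatusCode.OK)
                return (0, string.Empty);
            return (1, await response.Content.ReadAsStringAsync());
        }
        catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
        {
            // Connection refused, DNS failure, or timeout: the target counts as down.
            return (0, string.Empty);
        }
    }
}
```

The service does nothing special to report that it is down. The failed request itself is the signal, which is why the up series needs no code on the service side.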

Pull Model Trade-offs

  • Services behind a firewall need a push gateway or a reverse proxy to be scraped
  • Short-lived jobs (batch workers) need the Pushgateway to expose metrics, because they may exit before the next scrape
  • The network must allow Prometheus to reach all target services
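
For the short-lived job case, the sketch below builds the HTTP request a batch worker could send to a Pushgateway before it exits. It assumes the Pushgateway's standard /metrics/job/<job> path and a job name without "/" characters. PushgatewaySketch, BuildPush, and any metric name passed in are hypothetical and do not come from the BizFirstGO codebase.

```csharp
using System;
using System.Globalization;
using System.Net.Http;
using System.Text;

// Hypothetical helper: prepares a push of one gauge value to a Pushgateway.
public static class PushgatewaySketch
{
    // PUT replaces every metric previously pushed for this job's group.
    public static HttpRequestMessage BuildPush(Uri gateway, string job, string metricName, double value)
    {
        // The body uses the same text format as /metrics and must end with a newline.
        string body =
            $"# TYPE {metricName} gauge\n" +
            $"{metricName} {value.ToString(CultureInfo.InvariantCulture)}\n";

        var url = new Uri(gateway, $"/metrics/job/{Uri.EscapeDataString(job)}");
        return new HttpRequestMessage(HttpMethod.Put, url)
        {
            Content = new StringContent(body, Encoding.UTF8, "text/plain")
        };
    }
}
```

A worker would send the result with HttpClient.SendAsync just before exiting. Prometheus then scrapes the Pushgateway like any other target, so the pull model is preserved from its point of view.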

Metric Types

Type      | Always Increases?       | BizFirstGO Example                       | Primary Operation
----------|-------------------------|------------------------------------------|--------------------------------------------
Counter   | Yes (resets on restart) | bizfirst_workflow_executions_total       | rate(...[5m]): compute per-second rate
Gauge     | No (can go up or down)  | bizfirst_hil_pending_count               | Direct value: current backlog
Histogram | Yes (bucket counts)     | bizfirst_node_execution_duration_seconds | histogram_quantile(0.99, ...): P99 latency
Summary   | Yes (count and sum)     | Rarely used                              | Pre-computed quantiles (less flexible)
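
The "resets on restart" note is the reason counters are read through rate() rather than directly. The sketch below shows the core idea under simplifying assumptions: it sums the increases between consecutive samples, treats any drop as a restart from zero, and divides by the elapsed time. The real rate() function also extrapolates to the edges of the window, which is omitted here. RateSketch and PerSecondRate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Simplified illustration of counter rate calculation with reset handling.
public static class RateSketch
{
    // samples: (timestamp in seconds, counter value), ordered by time.
    public static double PerSecondRate(IReadOnlyList<(double TimeSec, double Value)> samples)
    {
        if (samples.Count < 2) return double.NaN; // a rate needs at least two points

        double increase = 0;
        for (int i = 1; i < samples.Count; i++)
        {
            double delta = samples[i].Value - samples[i - 1].Value;
            // A drop means the process restarted and the counter began again from
            // zero, so the whole current value counts as new increase.
            increase += delta >= 0 ? delta : samples[i].Value;
        }

        double seconds = samples[samples.Count - 1].TimeSec - samples[0].TimeSec;
        return increase / seconds;
    }
}
```

For samples 100, 160, 10, 70 taken 60 seconds apart, the increases are 60, 10 (after the reset), and 60, giving 130 / 180 ≈ 0.72 per second. A gauge needs no such treatment because its current value is already the quantity of interest.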

Prometheus /metrics Format

BizFirstGO services expose metrics in Prometheus text format at GET /metrics:

# HELP bizfirst_workflow_executions_total Total workflow executions
# TYPE bizfirst_workflow_executions_total counter
bizfirst_workflow_executions_total{tenant_id="t123",status="success"} 4821
bizfirst_workflow_executions_total{tenant_id="t123",status="failed"} 47
bizfirst_workflow_executions_total{tenant_id="t456",status="success"} 1205

# HELP bizfirst_node_execution_duration_seconds Node execution duration
# TYPE bizfirst_node_execution_duration_seconds histogram
bizfirst_node_execution_duration_seconds_bucket{node_type="DataFetchNode",le="0.1"} 812
bizfirst_node_execution_duration_seconds_bucket{node_type="DataFetchNode",le="0.5"} 1843
bizfirst_node_execution_duration_seconds_bucket{node_type="DataFetchNode",le="1.0"} 2104
bizfirst_node_execution_duration_seconds_bucket{node_type="DataFetchNode",le="+Inf"} 2211
bizfirst_node_execution_duration_seconds_sum{node_type="DataFetchNode"} 847.23
bizfirst_node_execution_duration_seconds_count{node_type="DataFetchNode"} 2211

# HELP bizfirst_hil_pending_count Current HIL tasks awaiting action
# TYPE bizfirst_hil_pending_count gauge
bizfirst_hil_pending_count{tenant_id="t123"} 12
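
The histogram lines above are cumulative: each le bucket counts every observation at or below that bound, and the +Inf bucket equals _count. The sketch below estimates a quantile from such buckets by linear interpolation, which is the approach histogram_quantile() takes. It is a simplified illustration rather than the Prometheus implementation, and HistogramSketch is a hypothetical name. The bucket values are copied from the DataFetchNode sample.

```csharp
using System;
using System.Collections.Generic;

// Simplified quantile estimation over cumulative histogram buckets.
public static class HistogramSketch
{
    // Buckets from the sample output: (upper bound "le", cumulative count).
    public static readonly (double Le, double Count)[] DataFetchNode =
    {
        (0.1, 812), (0.5, 1843), (1.0, 2104), (double.PositiveInfinity, 2211)
    };

    // buckets must be sorted by Le, hold at least two entries, and end with +Inf.
    public static double Quantile(double q, IReadOnlyList<(double Le, double Count)> buckets)
    {
        double total = buckets[buckets.Count - 1].Count;
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int i = 0;
        while (i < buckets.Count - 1 && buckets[i].Count < rank) i++;

        // The rank lies beyond the last finite bound: report that bound.
        if (double.IsPositiveInfinity(buckets[i].Le))
            return buckets[i - 1].Le;

        // Interpolate linearly between the bucket's lower and upper bounds.
        double lower = i == 0 ? 0.0 : buckets[i - 1].Le;
        double below = i == 0 ? 0.0 : buckets[i - 1].Count;
        double inBucket = buckets[i].Count - below;
        return lower + (buckets[i].Le - lower) * (rank - below) / inBucket;
    }
}
```

With these buckets, Quantile(0.5, DataFetchNode) is about 0.214 seconds, while Quantile(0.99, DataFetchNode) returns 1.0 because the 99th percentile falls into the +Inf bucket. That second result is a hint that the bucket layout is too coarse for P99. The mean is simply _sum / _count, here 847.23 / 2211 ≈ 0.383 seconds.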

How BizFirstGO Services Register Metrics

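All BizFirstGO metrics are registered in MetricsRegistry.cs via the OTel Metrics API, which in .NET is the System.Diagnostics.Metrics namespace. The OTel SDK translates these to Prometheus format (dots become underscores, the unit s becomes a _seconds suffix, and counters gain _total) and exposes them via the /metrics endpoint: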

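// MetricsRegistry.cs: BizFirstGO metric definitions
using System.Diagnostics.Metrics;

public static class MetricsRegistry
{
    // One Meter per component; the OTel SDK subscribes to it by name.
    private static readonly Meter Meter = new Meter("BizFirst.ProcessEngine", "1.0");

    // Counter: only ever increases. Exported as bizfirst_workflow_executions_total.
    public static readonly Counter<long> WorkflowExecutions =
        Meter.CreateCounter<long>(
            "bizfirst.workflow.executions",
            unit: "{execution}",
            description: "Total workflow executions");

    // Histogram: records individual durations, exported as _bucket, _sum, and _count series.
    public static readonly Histogram<double> NodeExecutionDuration =
        Meter.CreateHistogram<double>(
            "bizfirst.node.execution.duration",
            unit: "s",
            description: "Node execution duration in seconds");

    // Observable gauge: the callback runs at collection time instead of being
    // updated by application code. HilService is not shown here. This callback
    // reports a single untagged value; one series per tenant_id would need the
    // overload that returns IEnumerable<Measurement<int>> with a tenant_id tag.
    public static readonly ObservableGauge<int> HilPendingCount =
        Meter.CreateObservableGauge<int>(
            "bizfirst.hil.pending.count",
            () => HilService.GetPendingCount(),
            unit: "{task}",
            description: "HIL tasks awaiting action");
}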
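
Registering an instrument does not produce any samples by itself; application code still has to record measurements with the tags that become Prometheus labels. The sketch below shows one way this could look. To stay self-contained it declares its own stand-in meter ("BizFirst.Sketch", a hypothetical name) instead of referencing MetricsRegistry, and it uses a MeterListener in place of the OTel SDK to observe what was recorded. RecordingSketch and its methods are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class RecordingSketch
{
    // Stand-in meter so the sketch compiles on its own (assumption, not BizFirstGO code).
    private static readonly Meter Meter = new Meter("BizFirst.Sketch", "1.0");

    private static readonly Counter<long> WorkflowExecutions =
        Meter.CreateCounter<long>("bizfirst.workflow.executions", unit: "{execution}");

    // Each distinct tag combination becomes its own series, for example
    // {tenant_id="t123",status="success"} in the /metrics output.
    public static void RecordExecution(string tenantId, bool succeeded)
    {
        WorkflowExecutions.Add(1,
            new KeyValuePair<string, object?>("tenant_id", tenantId),
            new KeyValuePair<string, object?>("status", succeeded ? "success" : "failed"));
    }

    // Runs `work` while listening to the stand-in meter and returns the sum of
    // all counter increments seen. The OTel SDK plays this role in production.
    public static long CountRecorded(Action work)
    {
        long total = 0;
        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == "BizFirst.Sketch")
                l.EnableMeasurementEvents(instrument);
        };
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => total += value);
        listener.Start();

        work();
        return total;
    }
}
```

Tags with many possible values, such as tenant_id, multiply the number of series Prometheus has to store, so they are worth choosing deliberately.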