Label Strategy
The label strategy is the most important design decision for Loki. Correct labels enable fast queries and low storage overhead. Incorrect labels (too many, too high cardinality) cause serious performance problems that are difficult to fix without re-ingesting all data.
The Golden Rule: Low-Cardinality Labels Only
A label's cardinality is the number of unique values it can take. Labels should have bounded, low cardinality:
- Good: environment (3 values: production, staging, development)
- Good: level (6 values: trace, debug, info, warn, error, fatal)
- Acceptable: tenant_id (hundreds of values, not thousands)
- Bad: execution_id (millions of unique values)
- Bad: user_id (millions of unique values)
- Bad: request_id (unique per request)
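One way to check candidate labels against this rule is to count the distinct values each field takes in a sample of log records. The sketch below is illustrative only: the function names and the limit of 1000 are assumptions chosen to match the "hundreds, not thousands" guidance above, not part of Loki or any BizFirstGO tooling.

```python
from collections import defaultdict


def label_cardinality(records):
    """Count the distinct values seen for each field across sample records."""
    seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(value)
    return {key: len(values) for key, values in seen.items()}


def high_cardinality_fields(records, limit=1000):
    """Return fields whose distinct-value count exceeds `limit` (an assumed threshold)."""
    counts = label_cardinality(records)
    return sorted(key for key, count in counts.items() if count > limit)


# Synthetic sample: 3 environments, but a unique request_id per record.
sample = [
    {"environment": ("production", "staging", "development")[i % 3],
     "request_id": f"req-{i}"}
    for i in range(5000)
]
print(label_cardinality(sample))
print(high_cardinality_fields(sample))
```

Fields reported by `high_cardinality_fields` are candidates for the log body rather than the label set.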
BizFirstGO Standard Label Set
| Label | Source | Example Values | Cardinality | Purpose |
|---|---|---|---|---|
| job | OTel resource service.name | processengine, edgestream, octopus, api, worker | <10 | Broad service category filtering |
| service | OTel resource service.name | flow-studio-api, processengine-worker | <50 | Specific service name |
| environment | OTel resource deployment.environment | production, staging, development | <5 | Environment separation |
| level | OTel LogRecord severity | trace, debug, info, warn, error, fatal | 6 | Log severity filtering |
| tenant_id | OTel log attribute tenant.id | tenant-abc, tenant-xyz | Hundreds | Multi-tenant isolation |
Stream Count Estimation
The upper bound on unique streams is the product of the number of unique values of each label; the real count is lower when some combinations never occur. For BizFirstGO:
# Stream count estimate:
jobs: 5 (processengine, edgestream, octopus, api, worker)
environments: 3 (prod, staging, dev)
levels: 6 (trace, debug, info, warn, error, fatal)
tenants: 500 (example: 500 tenant deployment)
Total streams = 5 × 3 × 6 × 500 = 45,000 streams
# This is within Loki's comfortable range (<100,000 streams).
# Adding execution_id as a label (say, 1M executions/day) would raise the upper bound to:
# 5 × 3 × 6 × 500 × 1,000,000 = 45 BILLION streams → catastrophic
# (even in practice, at least 1M new streams would be created every day)
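The same arithmetic can be scripted so the estimate is easy to rerun when a label is added or a cardinality changes. This is a minimal sketch: `estimate_streams` is a hypothetical helper, and the numbers are the example values from the estimate above.

```python
from math import prod


def estimate_streams(label_cardinalities):
    """Upper bound on stream count: product of per-label cardinalities."""
    return prod(label_cardinalities.values())


# Example cardinalities taken from the estimate above.
standard = {"job": 5, "environment": 3, "level": 6, "tenant_id": 500}
print(estimate_streams(standard))

# Same label set with execution_id added (1M executions, as in the example).
with_execution_id = {**standard, "execution_id": 1_000_000}
print(estimate_streams(with_execution_id))
```

The first call yields 45,000 and the second 45,000,000,000, matching the figures in the estimate.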
What Goes in the Log Line (Not Labels)
High-cardinality context belongs in the structured log body. Loki can filter on log line content using |=, |~, | json — just more slowly than label filtering. This trade-off is acceptable because you almost always start a query with a label selector that narrows the result set before content filtering.
# High-cardinality values in the log body (JSON format)
{
"timestamp": "...",
"level": "error", ← label ✓
"service": "processengine", ← label ✓
"tenant_id": "t123", ← label ✓
"message": "Node failed",
"executionId": "exec-abc123", ← log body, not label ✓
"nodeKey": "approval-01", ← log body, not label ✓
"traceId": "4bf92f...", ← log body, not label ✓
"workflowId": "wf-xyz" ← log body, not label ✓
}
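The split shown in the annotated example can be sketched as a small function that picks the label fields out of an event and serializes the whole event as the log line. The names `LABEL_KEYS` and `split_record` are hypothetical; the label keys are those marked as labels in the example, and how the labels reach Loki (OTel collector, agent, or direct push) is outside this sketch.

```python
import json

# Fields promoted to labels, per the annotated example (assumed set).
LABEL_KEYS = ("level", "service", "tenant_id")


def split_record(record):
    """Return (labels, log_line): low-cardinality labels plus the full JSON body."""
    labels = {key: str(record[key]) for key in LABEL_KEYS if key in record}
    # The full record stays in the body so high-cardinality fields remain searchable.
    log_line = json.dumps(record, separators=(",", ":"))
    return labels, log_line


event = {
    "level": "error",
    "service": "processengine",
    "tenant_id": "t123",
    "message": "Node failed",
    "executionId": "exec-abc123",
    "nodeKey": "approval-01",
}
labels, log_line = split_record(event)
print(labels)
print(log_line)
```

Only the three low-cardinality fields end up in `labels`; `executionId` and `nodeKey` travel in the JSON body.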
Start with label selector (fast, indexed) → then filter on log body content (slower, but applied to small result set):
{job="processengine", tenant_id="t123", level="error"} | json | executionId="exec-abc123"
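Queries of this shape (label selector first, then a filter on a parsed body field) can be assembled programmatically. The sketch below is an assumption-laden illustration: `build_query` and `logql_string` are hypothetical helpers, only equality matchers are supported, and the escaping covers backslashes and double quotes only.

```python
def logql_string(value):
    """Quote a value as a double-quoted LogQL string (basic escaping only)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_query(labels, body_fields=None):
    """Build a LogQL query: stream selector, then JSON parsing and field filters."""
    if not labels:
        # A selector needs at least one matcher to narrow the streams.
        raise ValueError("at least one label matcher is required")
    selector = ", ".join(f"{key}={logql_string(val)}" for key, val in labels.items())
    query = "{" + selector + "}"
    if body_fields:
        query += " | json"
        for key, val in body_fields.items():
            query += f" | {key}={logql_string(val)}"
    return query


print(build_query(
    {"job": "processengine", "tenant_id": "t123", "level": "error"},
    {"executionId": "exec-abc123"},
))
```

Requiring at least one label matcher keeps every generated query on the fast, indexed path before any body filtering happens.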
Label Naming Conventions
- Use snake_case for label names (e.g., tenant_id, not tenantId or TenantId)
- Never use dots in label names; they cause issues in some Grafana versions
- Keep label names short and meaningful — they appear in every LogQL query
- Be consistent across all BizFirstGO services — the same label name for the same concept
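These conventions can be enforced with a small normalizer applied wherever label names are defined. The following is a sketch under stated assumptions: `to_label_name` is a hypothetical helper, it handles simple camelCase, PascalCase, and dotted names, and it does not try to split acronyms or fix names that start with a digit.

```python
import re


def to_label_name(name):
    """Convert camelCase, PascalCase, or dotted names to snake_case."""
    # Insert an underscore at each lowercase-or-digit to uppercase boundary.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Replace dots, dashes, and any other non-word characters with underscores.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", with_breaks)
    return cleaned.strip("_").lower()


for raw in ("tenantId", "TenantId", "tenant.id", "tenant_id"):
    print(raw, "->", to_label_name(raw))
```

All four inputs in the loop normalize to tenant_id, so the same concept ends up with the same label name regardless of how a service spelled it.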