Grafana Loki
Grafana Loki is the log aggregation component of the default BizFirst Observe stack. It ingests structured JSON logs from all BizFirstGO services and makes them queryable via LogQL. Its label-based indexing model keeps storage costs low while maintaining fast query performance.
How Loki Differs from Elasticsearch
The most important thing to understand about Loki is what it does not do: it does not full-text index log content. This is a deliberate design decision that has significant implications:
| Capability | Loki | Elasticsearch |
|---|---|---|
| Index log content | No — log lines are not indexed | Yes — every word is indexed |
| Index labels/fields | Yes — labels only | Yes — all fields |
| Full-text search | Regex/substring filter (grep-style, slower) | Full inverted index (fast) |
| Storage cost | Low (compressed chunks plus a small label index, no content index) | High (index is 1–3x raw data size) |
| Ingest performance | Excellent (no indexing overhead) | High CPU cost at ingest |
| Query pattern | Best for labeled stream queries | Best for arbitrary field search |
For BizFirstGO's structured log format — where all important context is in labels (service, tenant, environment) and log content is filtered by executionId or error message — Loki's model is a natural fit.
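The two-stage query pattern described above (select streams by label, then scan their lines) can be sketched in a few lines of Python. This is a toy illustration only, not Loki's actual implementation: the `streams` dictionary, the `select_streams` and `grep` helpers, and the sample log lines are all hypothetical.

```python
# Toy model of a label-indexed log store (illustrative only, not Loki's code).
# Streams are keyed by their label set; log content is stored but never indexed.
streams = {
    frozenset({("job", "processengine"), ("level", "error")}): [
        '{"msg": "step failed", "execution_id": "exec-1"}',
        '{"msg": "timeout calling api", "execution_id": "exec-2"}',
    ],
    frozenset({("job", "api"), ("level", "info")}): [
        '{"msg": "request served", "execution_id": "exec-3"}',
    ],
}


def select_streams(matchers):
    """Stage 1: pick streams whose labels include every matcher (cheap lookup)."""
    wanted = set(matchers.items())
    return [lines for labels, lines in streams.items() if wanted <= labels]


def grep(matchers, needle):
    """Stage 2: substring-scan only the lines of the selected streams."""
    return [
        line
        for lines in select_streams(matchers)
        for line in lines
        if needle in line
    ]


matches = grep({"job": "processengine"}, "exec-2")
print(matches)
```

The narrower the label selector, the fewer lines the second stage has to scan, which is why the stream selector matters more for query speed than the text filter.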
Loki's Data Model
Loki organizes logs into streams. A stream is a sequence of log entries that all share the same set of label key-value pairs. Every log line belongs to exactly one stream.
# A Loki stream is identified by its label set:
{
  "job": "processengine",
  "service": "flow-studio",
  "tenant_id": "tenant-abc",
  "environment": "production",
  "level": "error"
}
# All log lines with these labels form one stream.
# Adding a new label value creates a new stream.
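A minimal sketch of stream identity, assuming a stream is determined solely by its set of label key-value pairs regardless of ordering. The `stream_key` helper is hypothetical and exists only for illustration.

```python
def stream_key(labels):
    """Return an order-independent identity for a label set."""
    return tuple(sorted(labels.items()))


base = {
    "job": "processengine",
    "service": "flow-studio",
    "tenant_id": "tenant-abc",
    "environment": "production",
    "level": "error",
}

# Same labels in a different order: still the same stream.
reordered = dict(reversed(list(base.items())))

# One label value changed: a different stream.
warn = {**base, "level": "warn"}

same_stream = stream_key(base) == stream_key(reordered)
distinct_streams = {stream_key(base), stream_key(reordered), stream_key(warn)}
print(same_stream, len(distinct_streams))
```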
BizFirstGO Label Strategy for Loki
Labels in Loki must be low-cardinality. The following label strategy is used across all BizFirstGO services:
| Label | Values | Cardinality | Rationale |
|---|---|---|---|
| job | processengine, edgestream, octopus, api | Very low (<10) | Service category for broad filtering |
| service | Specific service name | Low (<50) | Narrows to exact service |
| tenant_id | Tenant identifiers | Medium (hundreds) | Multi-tenant isolation |
| environment | production, staging, development | Very low (<5) | Environment separation |
| level | trace, debug, info, warn, error, fatal | Very low (6) | Log severity filtering |
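One way a service might apply the table above when shipping a log entry is to split each record into stream labels and a JSON line body. The sketch below builds a payload in the shape accepted by Loki's JSON push endpoint (`/loki/api/v1/push`). The `LABEL_KEYS` tuple, the `to_push_payload` helper, and the sample record are assumptions for illustration, not the actual BizFirstGO shipping code.

```python
import json
import time

# Keys that become stream labels, taken from the label table above.
LABEL_KEYS = ("job", "service", "tenant_id", "environment", "level")


def to_push_payload(record, ts_ns=None):
    """Split a log record into low-cardinality labels and a JSON line body."""
    labels = {k: str(record[k]) for k in LABEL_KEYS if k in record}
    body = {k: v for k, v in record.items() if k not in LABEL_KEYS}
    # Loki expects the timestamp as a string of Unix epoch nanoseconds.
    ts = str(ts_ns if ts_ns is not None else time.time_ns())
    return {"streams": [{"stream": labels, "values": [[ts, json.dumps(body)]]}]}


record = {
    "job": "processengine",
    "service": "flow-studio",
    "tenant_id": "tenant-abc",
    "environment": "production",
    "level": "error",
    "execution_id": "exec-d1e2f3a4",  # high-cardinality: stays in the body
    "msg": "step failed",
}

payload = to_push_payload(record, ts_ns=1700000000000000000)
print(json.dumps(payload, indent=2))
```

Everything not named in `LABEL_KEYS` ends up in the log line, so it remains searchable with line filters without creating additional streams.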
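Do not use execution_id, user_id, trace_id, or request_id as Loki labels. Each unique value creates a new stream, so an unbounded identifier produces an unbounded number of tiny streams, which degrades Loki performance significantly. Put these values in the log line body instead, and find them with a line filter on the identifier value, for example |= "exec-d1e2f3a4". Filtering on the value alone works regardless of how the JSON key is written.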
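The effect of a per-execution label on stream count can be shown with a small simulation. The event data and the `count_streams` helper are hypothetical; the numbers only illustrate the growth pattern, not real BizFirstGO volumes.

```python
import itertools


def count_streams(events, label_keys):
    """Count distinct label combinations, i.e. the number of streams created."""
    return len({tuple(event[k] for k in label_keys) for event in events})


# 1000 simulated log events, each from a different execution.
events = [
    {"job": "processengine", "level": level, "execution_id": f"exec-{i}"}
    for i, level in zip(range(1000), itertools.cycle(["info", "error"]))
]

bounded = count_streams(events, ["job", "level"])
unbounded = count_streams(events, ["job", "level", "execution_id"])
print(bounded, unbounded)
```

With only bounded labels the stream count stays fixed as traffic grows; with `execution_id` as a label it grows with every new execution.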
LogQL — Quick Reference
# Stream selector — required first step
{job="processengine", environment="production"}
# Add a filter — substring match
{job="processengine"} |= "error"
# Case-insensitive filter
{job="processengine"} |~ "(?i)failed"
# Exclude a pattern
{job="processengine"} != "health check"
# JSON parsing — extract fields from JSON log lines
{job="processengine"} | json | level="error"
# Find logs for a specific execution
{job="processengine"} |= "exec-d1e2f3a4"
# Count error lines per 1-minute window (metric query)
count_over_time({job="processengine"} |= "error" [1m])
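Any of the queries above can also be run outside Grafana through Loki's HTTP API. The sketch below only builds the request URL for the `query_range` endpoint and does not send it. The base URL `http://loki.example.internal:3100` and the `query_range_url` helper are assumptions for illustration; substitute the address of your own Loki instance.

```python
from urllib.parse import urlencode


def query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint."""
    params = {
        "query": logql,            # the LogQL expression, URL-encoded below
        "start": start_ns,         # range start, Unix epoch nanoseconds
        "end": end_ns,             # range end, Unix epoch nanoseconds
        "limit": limit,            # maximum number of log lines returned
        "direction": "backward",   # newest entries first
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


url = query_range_url(
    "http://loki.example.internal:3100",  # hypothetical host
    '{job="processengine"} |= "exec-d1e2f3a4"',
    start_ns=1700000000000000000,
    end_ns=1700000600000000000,
)
print(url)
```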
This page covers Loki as part of the default stack. For the complete Loki reference — deployment options, advanced LogQL, alert rules, and retention configuration — see Guide3: Loki.