Visualization Layer
Grafana is the single pane of glass for BizFirst Observe. It connects to Loki, Prometheus, and Tempo as data sources and provides dashboards, ad-hoc exploration, alerting, and access control in one UI.
Grafana as the Unified UI
A key architectural decision in BizFirst Observe is that users never interact directly with Loki, Prometheus, or Tempo. All observability workflows go through Grafana. This simplifies access control (one place to manage permissions), provides a consistent UX, and enables cross-signal correlation in a single view.
Data Source Connections
Grafana reaches each storage backend through a configured data source. BizFirst Observe creates these data sources automatically at startup using Grafana's file-based provisioning:
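# grafana-provisioning/datasources/bizfirst-observe.yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    uid: loki-uid                 # referenced by Tempo's tracesToLogs below
    url: http://loki:3100
    access: proxy
    jsonData:
      maxLines: 1000
      derivedFields:
        # Turn a traceId found in a log line into a link to that trace in Tempo
        - datasourceName: Tempo
          matcherRegex: '"traceId":"(\w+)"'
          name: TraceID
          # $$ escapes $ so provisioning does not expand it as an environment variable
          url: '$${__value.raw}'
          datasourceUid: tempo-uid
  - name: Prometheus
    type: prometheus
    uid: prometheus-uid           # referenced by Tempo's serviceMap below
    url: http://prometheus:9090
    access: proxy
    jsonData:
      timeInterval: 15s
      # Link exemplars on metric panels to the matching trace in Tempo
      exemplarTraceIdDestinations:
        - name: traceID
          datasourceUid: tempo-uid
  - name: Tempo
    type: tempo
    url: http://tempo:3200
    access: proxy
    uid: tempo-uid
    jsonData:
      # Jump from a span to the Loki logs around it (5 minutes either side)
      tracesToLogs:
        datasourceUid: loki-uid
        tags: ['service.name', 'tenant_id']
        spanStartTimeShift: '-5m'
        spanEndTimeShift: '5m'
      serviceMap:
        datasourceUid: prometheus-uid
      search:
        hide: false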
Pre-Built BizFirstGO Dashboards
BizFirst Observe ships with 10 pre-built Grafana dashboards covering the most common observability scenarios for BizFirstGO:
| Dashboard | Primary Metrics | Data Sources Used |
|---|---|---|
| Flow Studio Overview | Execution rate, error rate, p50/p95/p99 latency | Prometheus + Loki |
| Node Performance | Per-node-type execution time heatmap, error rates by node type | Prometheus |
| HIL Analytics | Suspension duration, approval rates, timeout rates, backlog gauge | Prometheus + Loki |
| EdgeStream Throughput | Messages/sec per topic, subscriber counts, delivery latency | Prometheus |
| Octopus Agent Performance | LLM call duration, token usage, memory access rates | Prometheus |
| Tenant Health | Per-tenant workflow success rate, resource consumption | Prometheus |
| API Latency | p99 latency per API endpoint, error rate | Prometheus |
| Error Analysis | Error log volume, top error patterns | Loki |
| Trace Explorer | Trace search by service/duration/status | Tempo |
| Infrastructure | CPU, memory, disk usage for all nodes | Prometheus (Node Exporter) |
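Dashboards like these are typically loaded through a file-based dashboard provider that sits alongside the data source provisioning file. The following is a minimal sketch rather than the configuration BizFirst Observe actually ships: the file name, provider name, folder name, and path are all assumptions chosen for illustration.

```yaml
# grafana-provisioning/dashboards/bizfirst-observe.yaml (hypothetical file name)
apiVersion: 1
providers:
  - name: bizfirst-observe        # provider name; assumed
    orgId: 1
    folder: BizFirst Observe      # Grafana folder the dashboards appear in; assumed
    type: file
    allowUiUpdates: false         # keep the JSON files as the source of truth
    updateIntervalSeconds: 30     # how often Grafana rescans the path for changes
    options:
      # Directory holding one JSON file per dashboard; assumed location
      path: /var/lib/grafana/dashboards/bizfirst
```

With a provider like this, each dashboard in the table would be a JSON file in the configured directory, versioned in the same repository as the data source definitions.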
Alert Management
BizFirst Observe supports two alerting paths, which can be used simultaneously:
Grafana Unified Alerting
Define alert rules directly in the Grafana UI. Alert conditions can use Loki, Prometheus, and Tempo queries. Contact points (Slack, email, PagerDuty) are managed natively.
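Contact points can also be declared as code through Grafana's alerting provisioning instead of being clicked together in the UI. The sketch below shows one Slack contact point; the file name, contact point name, `uid`, and the `SLACK_WEBHOOK_URL` environment variable are assumptions, not part of BizFirst Observe.

```yaml
# grafana-provisioning/alerting/contact-points.yaml (hypothetical file name)
apiVersion: 1
contactPoints:
  - orgId: 1
    name: observe-slack           # contact point name; assumed
    receivers:
      - uid: observe-slack-1      # any stable unique id; assumed
        type: slack
        settings:
          # Expanded from the environment when Grafana loads the file
          url: $SLACK_WEBHOOK_URL
```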
Prometheus Alertmanager
Classic Prometheus-style alert rules are defined in alert-rules.yml. Prometheus evaluates them and sends firing alerts to Alertmanager, which handles deduplication, grouping, and routing. Preferred for complex routing logic.
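As an illustration of what a rule in alert-rules.yml might look like, here is a minimal sketch of a per-tenant error-rate alert. The group name, alert name, threshold, and both metric names are hypothetical placeholders; substitute the metrics your BizFirstGO services actually export.

```yaml
# alert-rules.yml (illustrative rule; metric names are hypothetical)
groups:
  - name: bizfirst-flow-studio
    rules:
      - alert: FlowErrorRateHigh
        # Share of failed executions over the last 5 minutes, per tenant
        expr: |
          sum by (tenant_id) (rate(flow_executions_failed_total[5m]))
            /
          sum by (tenant_id) (rate(flow_executions_total[5m]))
            > 0.05
        for: 10m                  # condition must hold for 10 minutes before firing
        labels:
          severity: page          # label that Alertmanager routes can match on
        annotations:
          summary: 'Flow error rate above 5% for tenant {{ $labels.tenant_id }}'
```

Labels such as `severity` are what make the Alertmanager path suited to complex routing: a route can match on `severity="page"` and send those alerts to a different receiver than everything else.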
Grafana Access Control
Grafana's built-in role system controls who can see which dashboards and data sources:
| Role | Dashboard Access | Can Create Dashboards | Can Manage Alerts |
|---|---|---|---|
| Viewer | Read-only: assigned dashboards | No | No |
| Editor | Read/write: assigned dashboards | Yes (in assigned folders) | Yes |
| Admin | All dashboards + data sources | Yes | Yes |
Grafana stores its configuration (data sources, dashboards, users) in its own SQL database (SQLite by default, PostgreSQL in production). Dashboards and data sources declared through provisioning files are loaded automatically at startup, so you can manage that part of the Grafana configuration as code in your repository. For provisioned resources, Grafana is then effectively stateless from a deployment perspective: a fresh instance rebuilds them from the files. Users and anything created only through the UI still live in the database.
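One way to wire this together is shown in the Docker Compose excerpt below. It is a sketch under stated assumptions: the deployment uses Docker Compose, a PostgreSQL service is reachable as `postgres:5432` (not shown), and the database name, user, and secret file path are placeholders.

```yaml
# docker-compose.yaml (excerpt; names, paths, and credentials are assumptions)
services:
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      # Use PostgreSQL instead of the default SQLite database
      GF_DATABASE_TYPE: postgres
      GF_DATABASE_HOST: postgres:5432
      GF_DATABASE_NAME: grafana
      GF_DATABASE_USER: grafana
      # The __FILE suffix makes the Grafana image read the value from a file
      GF_DATABASE_PASSWORD__FILE: /run/secrets/grafana_db_password
    volumes:
      # Provisioning files from the repository, mounted read-only
      - ./grafana-provisioning:/etc/grafana/provisioning:ro
    secrets:
      - grafana_db_password

secrets:
  grafana_db_password:
    file: ./secrets/grafana_db_password.txt
```

Because the provisioning directory is mounted from the repository, replacing the container recreates the provisioned data sources and dashboards without touching the database.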