Pre-Built BizFirst Dashboards — Grafana Dashboards

Dashboard 1: Flow Studio Overview

The primary dashboard for operations teams. Shows the overall health of workflow execution in real time.

Panel	Metric / Query	Panel Type
Execution Rate	`sum(rate(bizfirst_workflow_executions_total[5m]))`	Time Series
Error Rate %	`sum(rate(...{status="failed"}[5m])) / sum(rate(...[5m])) * 100`	Time Series + threshold coloring
P50 Latency	`histogram_quantile(0.50, ...workflow_execution_duration...)`	Time Series
P95 Latency	`histogram_quantile(0.95, ...)`	Time Series
P99 Latency	`histogram_quantile(0.99, ...)`	Time Series
Active Executions	`sum(bizfirst_active_executions)`	Stat
Recent Errors Log	`{job="processengine", level="error"} | json`	Logs
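The three latency panels rely on `histogram_quantile`, which estimates a percentile from cumulative histogram buckets rather than from raw samples. The following is a minimal Python sketch of that estimation, assuming the usual linear interpolation inside the bucket that contains the target rank. The bucket boundaries and counts are made-up illustration values, not BizFirst defaults, and the function is a simplified model rather than Prometheus's actual implementation.

```python
import math

# Hypothetical cumulative buckets: (upper bound in seconds, cumulative count).
# The last bucket is +Inf and holds the total observation count.
EXAMPLE_BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate

    rank = q * total  # how many observations lie at or below the quantile
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket, so the best available
                # answer is the highest finite upper bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            span = count - prev_count
            if span == 0:
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / span
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


if __name__ == "__main__":
    for q in (0.50, 0.95, 0.99):
        print(f"P{int(q * 100)}:", histogram_quantile(q, EXAMPLE_BUCKETS))
```

With these sample buckets the P99 rank falls into the +Inf bucket, so the estimate is capped at the highest finite boundary. This is why a P99 panel can flatten out at a fixed value when the histogram's bucket boundaries are too low for the real latencies.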

Dashboard 2: Node Performance

Deep-dive into execution node performance. Identifies slow or error-prone node types.

Panel	Description
P99 Latency by Node Type	Heatmap showing latency distribution for each node type over time
Error Rate by Node Type	Bar chart comparing error rates across all node types
Execution Volume by Node Type	Time series of execution counts per node type
Slowest Nodes (Top 10)	Table showing node types ranked by P99 latency

Dashboard 3: HIL Analytics

Human-in-the-Loop workflow management. Used by process owners and operations to monitor approval backlogs and SLA compliance.

Panel	Description
Current Backlog	Gauge showing total pending HIL tasks (red if > threshold)
Overdue Tasks	Stat panel — tasks past SLA deadline (red alert coloring)
Suspension Duration Distribution	Histogram of how long HIL tasks take to complete
Approval Rate	Pie chart: approved vs. rejected vs. timeout outcomes
Backlog by Tenant	Bar chart — identifies tenants with high HIL backlogs

Dashboard 4: EdgeStream Throughput

Panel	Description
Messages/sec by Topic	Time series per topic — shows throughput trends
Delivery Latency P99	P99 message delivery time per topic
Subscriber Count	Active subscriber connections per topic
Queue Depth	Messages awaiting delivery (delivery lag indicator)
Failed Deliveries	Error rate for message delivery failures

Dashboard 5: Octopus Agent Performance

Panel	Description
LLM Call Rate	API calls per minute per model (GPT-4, Claude, etc.)
LLM Response Latency P99	P99 LLM call duration — critical for agent responsiveness
Token Usage	Input and output tokens per minute (cost tracking)
Active Agent Sessions	Current active Octopus agent sessions
Memory Access Operations	Read/write operations per memory type (episodic, semantic, etc.)

All Dashboards Use $tenant and $environment Variables

Every pre-built dashboard includes $tenant and $environment template variables in the top bar. Select a specific tenant to scope all panels to that tenant's data, or leave $tenant on "All" to see aggregate metrics across all tenants. Tenant scoping is essential in multi-tenant deployments, where you often need to investigate a single tenant's behavior.
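To illustrate how a template variable scopes a panel, here is a small Python sketch that substitutes selected values into a query string. It assumes the panels filter on labels named `tenant` and `environment` with regex matchers, and that "All" expands to a match-everything pattern. Those label names and the expansion rule are assumptions for illustration. Grafana's real interpolation handles multi-value selections and escaping and is more involved than this.

```python
import re

# Hypothetical panel query. The metric name comes from the Execution Rate
# panel, while the tenant/environment label names are assumed.
QUERY_TEMPLATE = (
    "sum(rate(bizfirst_workflow_executions_total"
    '{tenant=~"$tenant", environment=~"$environment"}[5m]))'
)


def interpolate(query, variables):
    """Replace each $name placeholder with the selected variable value."""

    def expand(match):
        value = variables[match.group(1)]
        # "All" is modeled as a regex that matches every label value.
        return ".*" if value == "All" else value

    return re.sub(r"\$(\w+)", expand, query)


if __name__ == "__main__":
    # Scoped to one tenant in one environment (example values).
    print(interpolate(QUERY_TEMPLATE, {"tenant": "acme", "environment": "prod"}))
    # Aggregate across all tenants in the same environment.
    print(interpolate(QUERY_TEMPLATE, {"tenant": "All", "environment": "prod"}))
```

Because the variable is applied inside every panel's query, changing the selection in the top bar rescopes the whole dashboard at once.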
