Pre-Built BizFirstGO Dashboards
BizFirst Observe ships with 10 pre-built Grafana dashboards that cover the complete observability surface of BizFirstGO. Each dashboard targets a specific audience and use case, ranging from high-level SLA monitoring to detailed per-node performance analysis.
Dashboard 1: Flow Studio Overview
The primary dashboard for operations teams. Shows the overall health of workflow execution in real time.
| Panel | Metric / Query | Panel Type |
|---|---|---|
| Execution Rate | sum(rate(bizfirst_workflow_executions_total[5m])) | Time Series |
| Error Rate % | sum(rate(...{status="failed"}[5m])) / sum(rate(...[5m])) * 100 | Time Series + threshold coloring |
| P50 Latency | histogram_quantile(0.50, ...workflow_execution_duration...) | Time Series |
| P95 Latency | histogram_quantile(0.95, ...) | Time Series |
| P99 Latency | histogram_quantile(0.99, ...) | Time Series |
| Active Executions | sum(bizfirst_active_executions) | Stat |
| Recent Errors Log | {job="processengine", level="error"} \| json | Logs |
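The P50, P95, and P99 panels rely on `histogram_quantile`, which estimates a percentile from cumulative histogram buckets by linear interpolation within the bucket that contains the target rank. The Python sketch below mirrors that estimation for classic (non-native) histograms as a way to reason about what the panels display. The bucket boundaries and counts are made-up illustration values, not BizFirstGO data, and the function is a simplified approximation rather than the Prometheus implementation itself.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with the +Inf bucket,
    the layout used by Prometheus classic histograms.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if len(buckets) < 2:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break

    if math.isinf(upper):
        # The rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]

    lower = buckets[i - 1][0] if i > 0 else 0.0
    prev_count = buckets[i - 1][1] if i > 0 else 0
    if count == prev_count:
        return upper
    # Interpolate linearly between the bucket's lower and upper bounds.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


# Hypothetical cumulative counts for a duration histogram (seconds).
example_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]

for q in (0.50, 0.95, 0.99):
    print(f"P{int(q * 100)}: {histogram_quantile(q, example_buckets)}")
```

Because the estimate is interpolated, its accuracy depends on how finely the buckets are spaced around the latencies of interest. A quantile that lands in the `+Inf` bucket is capped at the highest finite bound, so a flat P99 line at a bucket boundary can mean the real value is higher.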
Dashboard 2: Node Performance
Deep-dive into execution node performance. Identifies slow or error-prone node types.
| Panel | Description |
|---|---|
| P99 Latency by Node Type | Heatmap showing latency distribution for each node type over time |
| Error Rate by Node Type | Bar chart comparing error rates across all node types |
| Execution Volume by Node Type | Time series of execution counts per node type |
| Slowest Nodes (Top 10) | Table showing node types ranked by P99 latency |
Dashboard 3: HIL Analytics
Tracks Human-in-the-Loop (HIL) workflows. Used by process owners and operations teams to monitor approval backlogs and SLA compliance.
| Panel | Description |
|---|---|
| Current Backlog | Gauge showing total pending HIL tasks (red when above the configured threshold) |
| Overdue Tasks | Stat panel showing tasks past their SLA deadline (red alert coloring) |
| Suspension Duration Distribution | Histogram of how long HIL tasks take to complete |
| Approval Rate | Pie chart of approved vs. rejected vs. timed-out outcomes |
| Backlog by Tenant | Bar chart identifying tenants with high HIL backlogs |
Dashboard 4: EdgeStream Throughput
| Panel | Description |
|---|---|
| Messages/sec by Topic | Time series per topic — shows throughput trends |
| Delivery Latency P99 | P99 message delivery time per topic |
| Subscriber Count | Active subscriber connections per topic |
| Queue Depth | Messages awaiting delivery (delivery lag indicator) |
| Failed Deliveries | Rate of failed message deliveries |
Dashboard 5: Octopus Agent Performance
| Panel | Description |
|---|---|
| LLM Call Rate | API calls per minute per model (GPT-4, Claude, etc.) |
| LLM Response Latency P99 | P99 LLM call duration — critical for agent responsiveness |
| Token Usage | Input and output tokens per minute (cost tracking) |
| Active Agent Sessions | Current active Octopus agent sessions |
| Memory Access Operations | Read/write operations per memory type (episodic, semantic, etc.) |
Every pre-built dashboard includes $tenant and $environment template variables in the top bar. Select a specific tenant to scope all panels to that tenant's data, or leave $tenant on "All" to see aggregate metrics across all tenants. This is essential in multi-tenant deployments when you need to investigate a single tenant's behavior.
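To illustrate how this scoping can work, the sketch below substitutes `$tenant` and `$environment` into a PromQL selector that uses regex matchers. Grafana performs this interpolation itself, so the code only models the idea. The label names `tenant` and `environment`, the `.*` expansion for "All", and the helper `scope_query` are assumptions for illustration; the shipped dashboards may use different label names or a different "All" value.

```python
import re

ALL_VALUE = ".*"  # assumption: "All" expands to a match-everything regex
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+$")

# The metric name comes from the Flow Studio Overview table; the label
# names `tenant` and `environment` are hypothetical.
TEMPLATE = (
    "sum(rate(bizfirst_workflow_executions_total"
    '{tenant=~"$tenant", environment=~"$environment"}[5m]))'
)

def scope_query(template, tenant=None, environment=None):
    """Replace $tenant and $environment; None stands for the "All" option."""
    values = {}
    for name, value in (("tenant", tenant), ("environment", environment)):
        if value is None:
            values[name] = ALL_VALUE
        elif SAFE_NAME.match(value):
            values[name] = value
        else:
            # Reject quotes and regex metacharacters instead of escaping them.
            raise ValueError(f"unsafe {name} value: {value!r}")
    return re.sub(
        r"\$(tenant|environment)\b",
        lambda match: values[match.group(1)],
        template,
    )

# One tenant selected, environment left on "All".
print(scope_query(TEMPLATE, tenant="acme"))
```

With a regex matcher, a single selected tenant and the "All" option share the same query text, which is why one dashboard definition can serve both the per-tenant and the aggregate view.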