Performance Patterns
Indexes optimized for workflow query patterns, connection pool configuration, pagination strategies for large result sets, and query tuning guidance for Data Ocean SQL Server.
Index Strategy for Workflow Queries
Workflow queries in Flow Studio follow predictable patterns — they always filter by TenantId first, then by a secondary filter (status, date range, or a specific ID). Indexes should be designed around this pattern:
The Composite Index Rule
Every table should have composite indexes that start with TenantId and cover the most common secondary filters. The leading column of every non-clustered index should be TenantId:
-- Standard index pattern for Data Ocean tables
-- Index 1: Active record lookup by tenant + status
CREATE INDEX IX_Customer_TenantId_Status
ON Customer (TenantId, Status, UpdatedAt DESC)
INCLUDE (Name, Email, ClassificationLabel)
WHERE IsDeleted = 0;
-- Index 2: Single record lookup by tenant + PK (needed when PK is GUID)
CREATE INDEX IX_Customer_TenantId_CustomerId
ON Customer (TenantId, CustomerId)
WHERE IsDeleted = 0;
-- Index 3: AI enrichment targeting — find un-processed records
CREATE INDEX IX_Customer_TenantId_AiProcessedAt
ON Customer (TenantId, AiProcessedAt)
WHERE AiProcessedAt IS NULL AND IsDeleted = 0;
-- Index 4: Date-range queries (common in reporting workflows)
CREATE INDEX IX_Customer_TenantId_CreatedAt
ON Customer (TenantId, CreatedAt DESC)
WHERE IsDeleted = 0;
Connection Pool Configuration
The datasource connection factory uses ADO.NET connection pooling. Configure the pool size in the connection string parameters:
-- Connection string with pool settings
Server=your-server;Database=DataOcean_CustomerData;
User Id=BizFirstApp;Password=...;
Min Pool Size=5;
Max Pool Size=100;
Connection Timeout=15;
MultipleActiveResultSets=True;
| Parameter | Recommended Value | Notes |
|---|---|---|
Min Pool Size | 5 | Keeps 5 connections warm — reduces cold-start latency on workflow burst |
Max Pool Size | 100 | Cap at 100 to protect the SQL Server from connection exhaustion. Increase if your workflow concurrency regularly saturates this. |
Connection Timeout | 15 | Fail fast if the pool is exhausted — 15 seconds is aggressive but prevents cascading timeouts in the workflow engine |
MultipleActiveResultSets | True | Required for EF Core and async pattern compatibility |
Pagination Pattern
Workflow nodes that return lists should always paginate. Never return unbounded result sets from SQL nodes — this can exhaust workflow memory for large tables. Use OFFSET-FETCH pagination:
-- Paginated list query pattern
SELECT CustomerId, Name, Email, Status, CreatedAt
FROM Customer
WHERE TenantId = @tenantId
AND Status = @status
AND IsDeleted = 0
ORDER BY CreatedAt DESC
OFFSET @offset ROWS
FETCH NEXT @pageSize ROWS ONLY;
-- Accompanying count query (for UI pagination controls)
SELECT COUNT(*) AS TotalCount
FROM Customer
WHERE TenantId = @tenantId
AND Status = @status
AND IsDeleted = 0;
Use a page size of 25-50 for interactive UI workflows and 100-500 for batch processing workflows. Never exceed 1000 rows per page in interactive workflows — larger result sets are better handled by batch processing patterns that stream results through the workflow loop node.
READ_COMMITTED_SNAPSHOT Isolation
Enable READ_COMMITTED_SNAPSHOT (RCSI) on your Data Ocean database. This uses row versioning to eliminate read-write lock contention — critical for workflows that mix reads and writes on the same tables:
-- Enable RCSI (one-time setup — no downtime needed)
ALTER DATABASE DataOcean_CustomerData
SET READ_COMMITTED_SNAPSHOT ON;
-- Verify
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'DataOcean_CustomerData';
Avoiding Common Performance Mistakes
| Mistake | Impact | Fix |
|---|---|---|
Missing TenantId in index leading column | Table scan on every tenant's data | All indexes start with TenantId |
No IsDeleted = 0 filtered index | Queries scan soft-deleted rows | Add WHERE IsDeleted = 0 to indexes |
Returning SELECT * in workflow queries | Transfers unnecessary data; index coverage fails | Always SELECT only the columns needed |
| No pagination on list queries | Memory exhaustion for large tenants | Mandatory OFFSET-FETCH on all list queries |
| Implicit string conversion in WHERE clause | Index cannot be used; full scan | Match parameter types exactly to column types |
| Long-running queries in synchronous workflow steps | Workflow timeouts; user-facing latency | Use async workflow steps; increase commandTimeoutSeconds; optimize query |
Query Performance Monitoring
-- Find top 20 slowest queries in the plan cache (run in SSMS)
SELECT TOP 20
SUBSTRING(qt.TEXT, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(qt.TEXT)
ELSE qs.statement_end_offset END
- qs.statement_start_offset)/2)+1) AS QueryText,
qs.execution_count,
qs.total_logical_reads / qs.execution_count AS AvgLogicalReads,
qs.total_elapsed_time / qs.execution_count / 1000 AS AvgElapsedMs
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
WHERE qt.dbid = DB_ID('DataOcean_CustomerData')
ORDER BY AvgElapsedMs DESC;