Portal Community

Index Strategy for Workflow Queries

Workflow queries in Flow Studio follow predictable patterns — they always filter by TenantId first, then by a secondary filter (status, date range, or a specific ID). Indexes should be designed around this pattern:

The Composite Index Rule

Every table should have composite indexes that start with TenantId and cover the most common secondary filters. The leading column of every non-clustered index should be TenantId:

-- Standard index pattern for Data Ocean tables
-- Index 1: Active record lookup by tenant + status
CREATE INDEX IX_Customer_TenantId_Status
    ON Customer (TenantId, Status, UpdatedAt DESC)
    INCLUDE (Name, Email, ClassificationLabel)
    WHERE IsDeleted = 0;

-- Index 2: Single record lookup by tenant + PK (needed when PK is GUID)
CREATE INDEX IX_Customer_TenantId_CustomerId
    ON Customer (TenantId, CustomerId)
    WHERE IsDeleted = 0;

-- Index 3: AI enrichment targeting — find un-processed records
CREATE INDEX IX_Customer_TenantId_AiProcessedAt
    ON Customer (TenantId, AiProcessedAt)
    WHERE AiProcessedAt IS NULL AND IsDeleted = 0;

-- Index 4: Date-range queries (common in reporting workflows)
CREATE INDEX IX_Customer_TenantId_CreatedAt
    ON Customer (TenantId, CreatedAt DESC)
    WHERE IsDeleted = 0;

Connection Pool Configuration

The datasource connection factory uses ADO.NET connection pooling. Configure the pool size in the connection string parameters:

-- Connection string with pool settings
Server=your-server;Database=DataOcean_CustomerData;
User Id=BizFirstApp;Password=...;
Min Pool Size=5;
Max Pool Size=100;
Connection Timeout=15;
MultipleActiveResultSets=True;
ParameterRecommended ValueNotes
Min Pool Size5Keeps 5 connections warm — reduces cold-start latency on workflow burst
Max Pool Size100Cap at 100 to protect the SQL Server from connection exhaustion. Increase if your workflow concurrency regularly saturates this.
Connection Timeout15Fail fast if the pool is exhausted — 15 seconds is aggressive but prevents cascading timeouts in the workflow engine
MultipleActiveResultSetsTrueRequired for EF Core and async pattern compatibility

Pagination Pattern

Workflow nodes that return lists should always paginate. Never return unbounded result sets from SQL nodes — this can exhaust workflow memory for large tables. Use OFFSET-FETCH pagination:

-- Paginated list query pattern
SELECT CustomerId, Name, Email, Status, CreatedAt
FROM Customer
WHERE TenantId = @tenantId
  AND Status = @status
  AND IsDeleted = 0
ORDER BY CreatedAt DESC
OFFSET @offset ROWS
FETCH NEXT @pageSize ROWS ONLY;

-- Accompanying count query (for UI pagination controls)
SELECT COUNT(*) AS TotalCount
FROM Customer
WHERE TenantId = @tenantId
  AND Status = @status
  AND IsDeleted = 0;
Recommended Page Size

Use a page size of 25-50 for interactive UI workflows and 100-500 for batch processing workflows. Never exceed 1000 rows per page in interactive workflows — larger result sets are better handled by batch processing patterns that stream results through the workflow loop node.

READ_COMMITTED_SNAPSHOT Isolation

Enable READ_COMMITTED_SNAPSHOT (RCSI) on your Data Ocean database. This uses row versioning to eliminate read-write lock contention — critical for workflows that mix reads and writes on the same tables:

-- Enable RCSI (one-time setup — no downtime needed)
ALTER DATABASE DataOcean_CustomerData
    SET READ_COMMITTED_SNAPSHOT ON;

-- Verify
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'DataOcean_CustomerData';

Avoiding Common Performance Mistakes

MistakeImpactFix
Missing TenantId in index leading columnTable scan on every tenant's dataAll indexes start with TenantId
No IsDeleted = 0 filtered indexQueries scan soft-deleted rowsAdd WHERE IsDeleted = 0 to indexes
Returning SELECT * in workflow queriesTransfers unnecessary data; index coverage failsAlways SELECT only the columns needed
No pagination on list queriesMemory exhaustion for large tenantsMandatory OFFSET-FETCH on all list queries
Implicit string conversion in WHERE clauseIndex cannot be used; full scanMatch parameter types exactly to column types
Long-running queries in synchronous workflow stepsWorkflow timeouts; user-facing latencyUse async workflow steps; increase commandTimeoutSeconds; optimize query

Query Performance Monitoring

-- Find top 20 slowest queries in the plan cache (run in SSMS)
SELECT TOP 20
    SUBSTRING(qt.TEXT, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.TEXT)
            ELSE qs.statement_end_offset END
            - qs.statement_start_offset)/2)+1) AS QueryText,
    qs.execution_count,
    qs.total_logical_reads / qs.execution_count AS AvgLogicalReads,
    qs.total_elapsed_time / qs.execution_count / 1000 AS AvgElapsedMs
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
WHERE qt.dbid = DB_ID('DataOcean_CustomerData')
ORDER BY AvgElapsedMs DESC;