Rate Limiting — GuardRails

The Problem

A SaaS platform serves 500 tenants. Tenant A configures an automation that triggers on every new database row. A data migration at Tenant A inserts 10,000 rows in a minute, triggering 10,000 workflow executions that each call an external payment API.

Without tenant-level throttling, three things go wrong:

The external payment API is overwhelmed and starts returning 429 Too Many Requests to every tenant, not only Tenant A.
The platform is billed for all 10,000 API calls.
Other tenants experience slowdowns because the shared infrastructure is saturated.

The Solution: Tenant-Scoped Rate Limiting

RateLimitingGuard with scope="tenant" creates per-tenant rate limit keys. Tenant A's 10,000 calls are throttled at the configured threshold. Other tenants are completely unaffected.

Configuration

{
  "guardRails": {
    "individual": [
      {
        "name": "RateLimitingGuard",
        "enabled": true,
        "order": 1,
        "config": {
          "rps": 50,         // 50 requests per second per tenant
          "window": 60,      // measured over 60-second sliding window (3000 max per window)
          "scope": "tenant"  // key = "tenant:{TenantId}"
        }
      }
    ]
  }
}
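
To make the configuration concrete, here is a minimal in-memory sketch of a sliding-window limiter keyed per tenant. It is not the RateLimitingGuard implementation; the class name `SlidingWindowLimiter`, its methods, and the injectable `clock` are hypothetical. It assumes, following the config comment above, that the effective limit is rps × window (50 × 60 = 3000 requests per 60-second window).

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most rps * window requests per key per window."""

    def __init__(self, rps, window, clock=time.monotonic):
        self.limit = rps * window        # assumption: 50 * 60 = 3000
        self.window = window             # window length in seconds
        self.clock = clock               # injectable so tests can control time
        self.hits = defaultdict(deque)   # rate key -> timestamps of allowed requests

    def allow(self, key):
        now = self.clock()
        q = self.hits[key]
        # Drop timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False                 # over the limit: block this request
        q.append(now)
        return True


# Tenant A's burst is capped; Tenant B has its own counter.
fake_now = [0.0]
limiter = SlidingWindowLimiter(rps=50, window=60, clock=lambda: fake_now[0])
allowed_a = sum(limiter.allow("tenant:A") for _ in range(10_000))
allowed_b = limiter.allow("tenant:B")
```

Because each rate key owns a separate queue of timestamps, exhausting `tenant:A` has no effect on `tenant:B`.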

Three Scopes Explained

Scope	Rate Key	When to use	Isolation
`"global"`	`"global"`	Platform-wide cap on a shared external API key	None — all tenants share the limit
`"tenant"`	`"tenant:42"`	Per-tenant quota enforcement (most common)	Full — each tenant has its own counter
`"user"`	`"user:1099"`	Per-user limits (e.g., API gateway, interactive workflows)	Full — each user has their own counter
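
The key formats in the table can be illustrated with a small helper. This is a sketch only: the function name `rate_key` and its parameters are hypothetical, and the guard may build its keys differently internally.

```python
def rate_key(scope, tenant_id=None, user_id=None):
    """Build the rate-limit key for a scope, mirroring the table above."""
    if scope == "global":
        return "global"                  # one shared counter for everyone
    if scope == "tenant":
        if tenant_id is None:
            raise ValueError("tenant scope requires tenant_id")
        return f"tenant:{tenant_id}"     # e.g. "tenant:42"
    if scope == "user":
        if user_id is None:
            raise ValueError("user scope requires user_id")
        return f"user:{user_id}"         # e.g. "user:1099"
    raise ValueError(f"unknown scope: {scope!r}")
```

Requests that map to the same key share a counter, which is why the global scope offers no isolation while the tenant and user scopes do.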

Layered Rate Limiting

You can layer multiple rate limits to protect at different granularities:

{
  "guardRails": {
    "individual": [
      // Global cap: platform-wide external API limit
      {
        "name": "RateLimitingGuard",
        "enabled": true,
        "order": 1,
        "config": { "rps": 500, "window": 60, "scope": "global" }
      },
      // Tenant cap: per-tenant fairness
      {
        "name": "RateLimitingGuard",
        "enabled": true,
        "order": 2,
        "config": { "rps": 50, "window": 60, "scope": "tenant" }
      },
      // Circuit breaker: protect when external API goes down
      {
        "name": "CircuitBreakerGuard",
        "enabled": true,
        "order": 3,
        "config": { "threshold": 5, "timeout": 60000 }
      }
    ]
  }
}
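
One way to picture the layered configuration is as a chain that runs enabled guards in ascending `order` and stops at the first one that blocks. The sketch below assumes that evaluation model, which the configuration suggests but does not confirm; the `Guard` dataclass, the `evaluate` function, and the labels used here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Guard:
    name: str                       # label for illustration only
    order: int                      # lower values run first (assumption)
    enabled: bool
    check: Callable[[dict], bool]   # returns True when the request may proceed


def evaluate(guards, request) -> Optional[str]:
    """Return the name of the first guard that blocks, or None if all pass."""
    active = sorted((g for g in guards if g.enabled), key=lambda g: g.order)
    for guard in active:
        if not guard.check(request):
            return guard.name       # short-circuit: later guards never run
    return None


guards = [
    Guard("tenant-limit", 2, True, lambda r: not r["tenant_over_limit"]),
    Guard("global-limit", 1, True, lambda r: not r["global_over_limit"]),
    Guard("circuit-breaker", 3, True, lambda r: not r["circuit_open"]),
]
```

Under this model the global cap is checked before the tenant cap, and the circuit breaker is consulted only for requests that passed both rate limits.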

What the Block Response Looks Like

// When Tenant A exceeds 50 rps:
{
  "IsAllowed": false,
  "RetryAfterSeconds": 3,    // hint: try again in 3 seconds
  "ErrorMessage": "Rate limit exceeded",
  "Metadata": {
    "scope": "tenant",
    "requests_in_window": 52,
    "limit": 50
  }
}

The caller receives a structured error. The workflow engine can surface this to the user as: "This automation is running too frequently. Retry after 3 seconds."
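
A caller can use the `RetryAfterSeconds` hint to back off before retrying. The following is a minimal sketch that treats the response as a plain dictionary with the fields shown above; `call_with_retry`, its `send` callback, and the one-second fallback delay are hypothetical and not part of the platform API.

```python
import time


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it is allowed or max_attempts is reached."""
    result = None
    for attempt in range(1, max_attempts + 1):
        result = send()
        if result.get("IsAllowed", False):
            return result
        if attempt < max_attempts:
            # Honor the server's hint; fall back to 1 second if it is absent.
            sleep(result.get("RetryAfterSeconds", 1))
    return result                    # still blocked after the final attempt
```

Returning the last blocked response, rather than raising, lets the workflow engine surface `ErrorMessage` to the user unchanged.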

Circuit Breaking Complement

When the external payment API starts returning 503 errors, CircuitBreakerGuard opens the circuit after 5 consecutive failures. Subsequent requests are blocked immediately without hitting the failing API — saving credits and reducing log noise.

Scenario	Guard that activates	Outcome
Tenant A sends 100 requests in 2 seconds	RateLimitingGuard (scope=tenant)	Requests 51–100 blocked; Tenant B unaffected
External API returns 503 × 5 consecutive	CircuitBreakerGuard	Circuit opens; all requests blocked for 60s; no more API calls
After 60s, circuit transitions to HalfOpen	CircuitBreakerGuard	One test request allowed; if success → closed; if fail → reopen
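
The Closed, Open, and HalfOpen transitions in the table can be sketched as a small state machine. This is an illustration under stated assumptions, not the CircuitBreakerGuard source: the class and method names are hypothetical, and `timeout` is assumed to be in milliseconds (60000 ms = 60 s).

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: opens after `threshold` consecutive failures."""

    def __init__(self, threshold=5, timeout_ms=60000, clock=time.monotonic):
        self.threshold = threshold
        self.timeout = timeout_ms / 1000.0   # seconds the circuit stays open
        self.clock = clock                   # injectable so tests can control time
        self.state = "Closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self):
        if self.state == "Open":
            if self.clock() - self.opened_at >= self.timeout:
                self.state = "HalfOpen"      # let exactly one test request through
                return True
            return False
        if self.state == "HalfOpen":
            return False                     # test request already in flight
        return True

    def record_success(self):
        self.state = "Closed"
        self.failures = 0                    # success resets the consecutive count

    def record_failure(self):
        self.failures += 1
        if self.state == "HalfOpen" or self.failures >= self.threshold:
            self.state = "Open"
            self.opened_at = self.clock()
```

The caller invokes `allow()` before each external call and reports the outcome with `record_success()` or `record_failure()` afterwards.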

Production Rate Limiting

The current in-memory implementation is suitable for single-instance deployments. For production multi-instance deployments, inject IRateLimitingOrchestrator from BizFirst.Platform.Operations.Guard, which provides Redis-backed distributed rate limiting with the same configuration interface.
