Server Node as a Powerful Service
A server node is not just a scaled-out executor. It is a full-featured service (ASP.NET Core in the examples here) with its own DI container, database connections, caches, background workers, and API surface. That distinction is what enables the design patterns described below.
What a Server Node Actually Is
A server node is an ordinary ASP.NET Core (or other runtime) web service that:
- Implements the Server Group health contract (GET /health returns 200 when ready)
- Exposes one or more HTTP endpoints that callers invoke
- Registers itself in the central Server Group registry at startup
- De-registers itself during graceful shutdown
It carries no other obligation. It can have its own database, its own cache, its own background threads, its own authentication model, its own GPU allocation — anything a normal microservice can have.
Self-Registration at Startup
// In your server node's Program.cs
var builder = WebApplication.CreateBuilder(args);
// ... register your services ...
var app = builder.Build();

// Standard health endpoint
app.MapGet("/health", () => Results.Ok(new { status = "ok" }));

// Your business endpoints
app.MapPost("/infer", InferenceEndpoint.Handle);

// Register with the central Server Group on startup
var lifetime = app.Lifetime;
var registry = app.Services.GetRequiredService<IServerGroupRegistrar>();
var config = app.Services.GetRequiredService<IOptions<ServerNodeConfig>>().Value;

// Register(...) takes a synchronous Action, so this async lambda is "async void".
// Catch failures here: an unhandled exception would otherwise crash the process.
lifetime.ApplicationStarted.Register(async () =>
{
    try
    {
        await registry.RegisterAsync(new ServerNodeRegistration
        {
            GroupName = config.GroupName,
            Name = config.NodeName,
            BaseUrl = config.PublicBaseUrl,
            HealthUrl = $"{config.PublicBaseUrl}/health",
            Weight = config.Weight,
            Metadata = config.Metadata
        });
    }
    catch (Exception ex)
    {
        app.Logger.LogError(ex, "Server Group registration failed");
    }
});

// Block until de-registration completes. A fire-and-forget async lambda
// could still be in flight when the host exits.
lifetime.ApplicationStopping.Register(() =>
{
    registry.DeregisterAsync(config.GroupName, config.NodeName)
        .GetAwaiter().GetResult();
});

app.Run();
Common Server Node Patterns
Warm-Cache Node
Server node pre-loads expensive data at startup (product catalog, ML embeddings) into an in-memory cache. Callers get sub-millisecond responses instead of hitting the database on each request.
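The warm-cache idea can be sketched as a small holder that is filled once at startup and then only read. This is a minimal illustration, not part of the Server Group API: `WarmCache`, `LoadAsync`, and the `loader` delegate are hypothetical names, and the loader stands in for whatever query produces the catalog or embeddings.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical warm cache: loaded once at startup, read-only afterwards.
public sealed class WarmCache<TKey, TValue> where TKey : notnull
{
    // Swapped in atomically when loading finishes; null means "not ready yet".
    private volatile IReadOnlyDictionary<TKey, TValue>? _snapshot;

    public bool IsReady => _snapshot is not null;

    // loader is supplied by the caller, for example a database query.
    public async Task LoadAsync(
        Func<CancellationToken, Task<IReadOnlyDictionary<TKey, TValue>>> loader,
        CancellationToken ct = default)
    {
        _snapshot = await loader(ct);
    }

    // Returns false both for unknown keys and while the cache is still loading.
    public bool TryGet(TKey key, out TValue? value)
    {
        var snapshot = _snapshot;
        if (snapshot is not null && snapshot.TryGetValue(key, out var found))
        {
            value = found;
            return true;
        }
        value = default;
        return false;
    }
}
```

One way to use it: have the /health handler return 200 only when `IsReady` is true, so the node is not treated as ready until the data is in memory.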
Stateful Session Node
Server node maintains per-session state (e.g. active browser sessions via Playwright). Sticky routing directs all calls for a given session to the same node instance.
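The source does not say how sticky routing chooses a node. One common way to get a stable session-to-node mapping is rendezvous (highest-random-weight) hashing, sketched below under that assumption. `StickyRouter` and `PickNode` are hypothetical names, not the Server Group implementation.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sticky router based on rendezvous hashing.
public static class StickyRouter
{
    // Returns the node with the highest score for this session.
    // The result does not depend on the order of nodeNames.
    public static string PickNode(string sessionId, IReadOnlyList<string> nodeNames)
    {
        if (nodeNames.Count == 0)
            throw new InvalidOperationException("No nodes available.");

        string best = nodeNames[0];
        ulong bestScore = Score(sessionId, best);
        for (int i = 1; i < nodeNames.Count; i++)
        {
            ulong score = Score(sessionId, nodeNames[i]);
            if (score > bestScore)
            {
                best = nodeNames[i];
                bestScore = score;
            }
        }
        return best;
    }

    // Deterministic 64-bit score from SHA-256 of "session|node".
    private static ulong Score(string sessionId, string nodeName)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes($"{sessionId}|{nodeName}"));
        return BinaryPrimitives.ReadUInt64BigEndian(hash);
    }
}
```

With this scheme, removing a node only remaps the sessions that were assigned to that node. Every other session keeps its current node, which matters when the node holds live state such as a browser session.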
GPU-Resident Inference Node
Server node loads an AI model onto GPU at startup and holds it there, removing cold-start latency from every inference call.
Queue-Draining Worker Node
Server node polls a queue (Service Bus, SQS) in the background and exposes a /status endpoint. The workflow checks status rather than blocking.
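A rough sketch of the queue-draining shape follows. An in-memory `Channel<string>` stands in for the external queue (Service Bus or SQS would need their own client libraries). `QueueDrainingWorker` and `WorkerStatus` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Snapshot that a /status endpoint could return as JSON.
public sealed record WorkerStatus(long Received, long Processed, long Failed);

// Hypothetical worker: drains messages in the background and tracks counters.
public sealed class QueueDrainingWorker
{
    // In-memory stand-in for an external queue.
    private readonly Channel<string> _queue = Channel.CreateUnbounded<string>();
    private long _received;
    private long _processed;
    private long _failed;

    public bool TryEnqueue(string message)
    {
        if (!_queue.Writer.TryWrite(message)) return false;
        Interlocked.Increment(ref _received);
        return true;
    }

    // Signals that no more messages will arrive, so RunAsync can finish.
    public void Complete() => _queue.Writer.Complete();

    public WorkerStatus GetStatus() => new(
        Interlocked.Read(ref _received),
        Interlocked.Read(ref _processed),
        Interlocked.Read(ref _failed));

    // Processes messages until the queue is completed or ct is cancelled.
    public async Task RunAsync(Func<string, Task> handler, CancellationToken ct = default)
    {
        await foreach (var message in _queue.Reader.ReadAllAsync(ct))
        {
            try
            {
                await handler(message);
                Interlocked.Increment(ref _processed);
            }
            catch (Exception)
            {
                // A failed message is counted and skipped; the loop keeps draining.
                Interlocked.Increment(ref _failed);
            }
        }
    }
}
```

In a server node, `RunAsync` would typically run inside a background service, and the /status handler would return `GetStatus()`, so the workflow can poll progress instead of holding a request open.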
Server Node vs Microservice
| Aspect | Traditional Microservice | BizFirstGO Server Node |
|---|---|---|
| Discovery | Service mesh / DNS | Server Group registry + process engine routing |
| Health monitoring | K8s liveness/readiness probes | Server Group controller polls /health |
| Load balancing | K8s Service / Ingress | Server Group load balancing strategy |
| Workflow integration | Custom — caller must know the URL | Built-in — workflow nodes call by group name |
| Agent tool integration | Custom MCP adapter | Register as MCP server pointing at group endpoint |
| Telemetry | Custom instrumentation | Server Group adds correlation headers automatically |
High-Throughput Design Patterns
Response Caching
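// Cache expensive computation results keyed by a hash of the input text
// Requires: using System.Security.Cryptography; using System.Text;
app.MapPost("/classify", async (ClassifyRequest req, IMemoryCache cache) =>
{
    // Hash the content itself. req.GetHashCode() is not a stable content hash:
    // for a class without an override it differs per instance, so every
    // deserialized request would get a new key and never hit the cache.
    var textHash = Convert.ToHexString(
        SHA256.HashData(Encoding.UTF8.GetBytes(req.Text)));
    var cacheKey = $"classify:{textHash}";
    if (cache.TryGetValue(cacheKey, out ClassifyResponse? cached))
        return Results.Ok(cached);
    var result = await classifier.ClassifyAsync(req.Text);
    cache.Set(cacheKey, result, TimeSpan.FromMinutes(10));
    return Results.Ok(result);
});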
Request Batching
// Buffer individual requests and process them in batches
// (_model is assumed to be an injected, batch-capable model; its declaration is omitted)
public class BatchingInferenceService : BackgroundService
{
    private readonly Channel<InferenceRequest> _queue =
        Channel.CreateBounded<InferenceRequest>(
            new BoundedChannelOptions(1000)
            { FullMode = BoundedChannelFullMode.Wait });

    public async Task<InferenceResult> EnqueueAsync(
        InferenceRequest request, CancellationToken ct)
    {
        // Run continuations asynchronously so completing a request
        // does not execute caller code on the batching loop.
        var tcs = new TaskCompletionSource<InferenceResult>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        request.Completion = tcs;
        await _queue.Writer.WriteAsync(request, ct);
        return await tcs.Task;
    }

    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        var batch = new List<InferenceRequest>(32);
        // Sleep until at least one request is queued instead of polling while idle
        while (await _queue.Reader.WaitToReadAsync(ct))
        {
            batch.Clear();
            // Drain up to 32 requests or wait 50ms
            var deadline = DateTime.UtcNow.AddMilliseconds(50);
            while (batch.Count < 32 && DateTime.UtcNow < deadline)
            {
                if (_queue.Reader.TryRead(out var req))
                    batch.Add(req);
                else
                    await Task.Delay(5, ct);
            }
            if (batch.Count == 0) continue;
            try
            {
                var results = await _model.InferBatchAsync(
                    batch.Select(r => r.Prompt).ToArray(), ct);
                for (int i = 0; i < batch.Count; i++)
                    batch[i].Completion!.TrySetResult(results[i]);
            }
            catch (Exception ex)
            {
                // Fail every request in the batch so no caller waits forever
                foreach (var r in batch)
                    r.Completion!.TrySetException(ex);
            }
        }
    }
}
Set initialDelaySeconds so that no requests are routed to the node before it is ready.