What a Server Node Actually Is

A server node is an ordinary ASP.NET Core (or other runtime) web service that:

It carries no other obligation. It can have its own database, its own cache, its own background threads, its own authentication model, its own GPU allocation — anything a normal microservice can have.

Self-Registration at Startup

// In your server node's Program.cs
using Microsoft.Extensions.Options; // for IOptions<T>

var builder = WebApplication.CreateBuilder(args);
// ... register your services ...

var app = builder.Build();

// Standard health endpoint
app.MapGet("/health", () => Results.Ok(new { status = "ok" }));

// Your business endpoints
app.MapPost("/infer", InferenceEndpoint.Handle);

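// Register with the central Server Group on startup
var lifetime = app.Lifetime;
var registry = app.Services.GetRequiredService<IServerGroupRegistrar>();
var config   = app.Services.GetRequiredService<IOptions<ServerNodeConfig>>().Value;

// CancellationToken.Register takes a synchronous Action. An async lambda passed
// directly would become async void, so a failed registration would crash the process.
lifetime.ApplicationStarted.Register(() =>
{
    _ = Task.Run(async () =>
    {
        try
        {
            await registry.RegisterAsync(new ServerNodeRegistration
            {
                GroupName  = config.GroupName,
                Name       = config.NodeName,
                BaseUrl    = config.PublicBaseUrl,
                HealthUrl  = $"{config.PublicBaseUrl}/health",
                Weight     = config.Weight,
                Metadata   = config.Metadata
            });
        }
        catch (Exception ex)
        {
            app.Logger.LogError(ex, "Server Group registration failed");
        }
    });
});

// Block until deregistration completes so shutdown does not outrun the call
lifetime.ApplicationStopping.Register(() =>
{
    registry.DeregisterAsync(config.GroupName, config.NodeName)
        .GetAwaiter().GetResult();
});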

app.Run();

Common Server Node Patterns

Warm-Cache Node

Server node pre-loads expensive data at startup (product catalog, ML embeddings) into an in-memory cache. Callers get sub-millisecond responses instead of hitting the database on each request.
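
The warm-cache idea can be sketched with a small in-memory snapshot that is filled once at startup and then read without touching the database. This is a minimal sketch under stated assumptions: WarmCache, WarmAsync, TryGet and the loader delegate are hypothetical names chosen for illustration, not part of any BizFirstGO API.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical warm cache: loaded once at startup, then served from memory.
public sealed class WarmCache<TKey, TValue> where TKey : notnull
{
    private readonly Func<CancellationToken, Task<IReadOnlyDictionary<TKey, TValue>>> _loader;

    // The snapshot is swapped in one assignment, so readers never see a half-filled cache.
    private volatile IReadOnlyDictionary<TKey, TValue> _snapshot = new Dictionary<TKey, TValue>();
    private volatile bool _isWarm;

    public WarmCache(Func<CancellationToken, Task<IReadOnlyDictionary<TKey, TValue>>> loader)
        => _loader = loader;

    // True once the first load has completed.
    public bool IsWarm => _isWarm;

    // Runs the expensive load (database query, embedding file read, ...) and publishes the result.
    public async Task WarmAsync(CancellationToken ct = default)
    {
        _snapshot = await _loader(ct);
        _isWarm = true;
    }

    // Pure in-memory lookup; no I/O on the request path.
    public bool TryGet(TKey key, [MaybeNullWhen(false)] out TValue value)
        => _snapshot.TryGetValue(key, out value);
}
```

In a server node, WarmAsync would typically be awaited before the node registers itself, so that the first routed request already hits a populated cache.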

Stateful Session Node

Server node maintains per-session state (e.g. active browser sessions via Playwright). Sticky routing directs all calls for a given session to the same node instance.
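
One way a router could keep a session pinned to a node is rendezvous (highest-random-weight) hashing over the registered node names. The source does not say how the Server Group implements sticky routing, so the sketch below is only an illustration of the idea; StickyRouter and PickNode are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sticky-routing helper based on rendezvous hashing.
public static class StickyRouter
{
    // Returns the node with the highest hash score for this session.
    // The same session always maps to the same node while that node stays registered.
    public static string PickNode(string sessionId, IReadOnlyList<string> nodeNames)
    {
        if (nodeNames.Count == 0)
            throw new ArgumentException("At least one node is required.", nameof(nodeNames));

        string best = nodeNames[0];
        ulong bestScore = Score(sessionId, best);
        for (int i = 1; i < nodeNames.Count; i++)
        {
            ulong score = Score(sessionId, nodeNames[i]);
            if (score > bestScore)
            {
                best = nodeNames[i];
                bestScore = score;
            }
        }
        return best;
    }

    // SHA-256 is stable across processes and restarts, unlike string.GetHashCode().
    private static ulong Score(string sessionId, string nodeName)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(sessionId + "\n" + nodeName));
        return BitConverter.ToUInt64(hash, 0);
    }
}
```

With this scheme, removing a node only remaps the sessions that were pinned to that node; sessions on the remaining nodes stay where they are.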

GPU-Resident Inference Node

Server node loads an AI model onto GPU at startup and holds it there, removing cold-start latency from every inference call.

Queue-Draining Worker Node

Server node polls a queue (Service Bus, SQS) in the background and exposes a /status endpoint. The workflow checks status rather than blocking.
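
The queue-draining pattern can be sketched as a loop that updates thread-safe counters, plus a status snapshot that a /status endpoint could serialize. Everything here is an assumption made for illustration: QueueDrainer, DrainStatus and the two delegates stand in for a real queue client (Service Bus, SQS) and your message handler.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Snapshot a /status endpoint could return (field names are illustrative).
public sealed record DrainStatus(long Processed, long Failed, DateTimeOffset? LastMessageAt);

// Hypothetical worker: pulls messages in the background and tracks progress.
public sealed class QueueDrainer
{
    private readonly Func<CancellationToken, Task<string?>> _receive; // returns null when the queue is empty
    private readonly Func<string, CancellationToken, Task> _handle;
    private long _processed, _failed, _lastTicks;

    public QueueDrainer(
        Func<CancellationToken, Task<string?>> receive,
        Func<string, CancellationToken, Task> handle)
    {
        _receive = receive;
        _handle = handle;
    }

    // Safe to read from request threads while the loop is running.
    public DrainStatus Status
    {
        get
        {
            long ticks = Interlocked.Read(ref _lastTicks);
            return new DrainStatus(
                Interlocked.Read(ref _processed),
                Interlocked.Read(ref _failed),
                ticks == 0 ? (DateTimeOffset?)null : new DateTimeOffset(ticks, TimeSpan.Zero));
        }
    }

    // Drains until cancelled; sleeps for idleDelay whenever the queue is empty.
    public async Task RunAsync(TimeSpan idleDelay, CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            string? message = await _receive(ct);
            if (message is null)
            {
                await Task.Delay(idleDelay, ct);
                continue;
            }
            try
            {
                await _handle(message, ct);
                Interlocked.Increment(ref _processed);
            }
            catch (Exception) when (!ct.IsCancellationRequested)
            {
                // A failed message is counted, not rethrown, so the loop keeps draining.
                Interlocked.Increment(ref _failed);
            }
            Interlocked.Exchange(ref _lastTicks, DateTimeOffset.UtcNow.UtcTicks);
        }
    }
}
```

A minimal API node could then expose the snapshot with something like app.MapGet("/status", (QueueDrainer d) => Results.Ok(d.Status)), assuming the drainer is registered as a singleton and started from a hosted service.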

Server Node vs Microservice

| Aspect | Traditional Microservice | BizFirstGO Server Node |
|---|---|---|
| Discovery | Service mesh / DNS | Server Group registry + process engine routing |
| Health monitoring | K8s liveness/readiness probes | Server Group controller polls /health |
| Load balancing | K8s Service / Ingress | Server Group load balancing strategy |
| Workflow integration | Custom: caller must know the URL | Built-in: workflow nodes call by group name |
| Agent tool integration | Custom MCP adapter | Register as MCP server pointing at group endpoint |
| Telemetry | Custom instrumentation | Server Group adds correlation headers automatically |

High-Throughput Design Patterns

Response Caching

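// Cache expensive computation results keyed by a hash of the input text.
// Needs: using System.Security.Cryptography; using System.Text;
//        using Microsoft.Extensions.Caching.Memory; and builder.Services.AddMemoryCache()
// IClassifier is an assumed name for your registered classifier service.
app.MapPost("/classify", async (ClassifyRequest req, IMemoryCache cache, IClassifier classifier) =>
{
    // GetHashCode() is not a safe key: it is reference-based for classes and can collide.
    // A SHA-256 digest of the text is stable and content-based.
    var digest = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(req.Text)));
    var cacheKey = $"classify:{digest}";

    if (cache.TryGetValue(cacheKey, out ClassifyResponse? cached))
        return Results.Ok(cached);

    var result = await classifier.ClassifyAsync(req.Text);
    cache.Set(cacheKey, result, TimeSpan.FromMinutes(10));
    return Results.Ok(result);
});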

Request Batching

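// Buffer individual requests and process them in batches
using System.Threading.Channels;
using Microsoft.Extensions.Hosting;

public class BatchingInferenceService : BackgroundService
{
    private const int MaxBatchSize = 32;
    private static readonly TimeSpan MaxBatchWait = TimeSpan.FromMilliseconds(50);

    // IInferenceModel is an assumed name for your model wrapper exposing InferBatchAsync
    private readonly IInferenceModel _model;
    private readonly Channel<InferenceRequest> _queue =
        Channel.CreateBounded<InferenceRequest>(
            new BoundedChannelOptions(1000)
            { FullMode = BoundedChannelFullMode.Wait });

    public BatchingInferenceService(IInferenceModel model) => _model = model;

    public async Task<InferenceResult> EnqueueAsync(
        InferenceRequest request, CancellationToken ct)
    {
        // Run continuations asynchronously so callers never resume on the batch loop
        var tcs = new TaskCompletionSource<InferenceResult>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        request.Completion = tcs;
        await _queue.Writer.WriteAsync(request, ct);
        // WaitAsync lets the caller's token abandon the wait (.NET 6+)
        return await tcs.Task.WaitAsync(ct);
    }

    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        var batch = new List<InferenceRequest>(MaxBatchSize);
        while (!ct.IsCancellationRequested)
        {
            batch.Clear();
            // Wait for the first request, then collect up to 32 requests or 50 ms
            batch.Add(await _queue.Reader.ReadAsync(ct));
            using var window = CancellationTokenSource.CreateLinkedTokenSource(ct);
            window.CancelAfter(MaxBatchWait);
            try
            {
                while (batch.Count < MaxBatchSize &&
                       await _queue.Reader.WaitToReadAsync(window.Token))
                {
                    while (batch.Count < MaxBatchSize &&
                           _queue.Reader.TryRead(out var req))
                        batch.Add(req);
                }
            }
            catch (OperationCanceledException) when (!ct.IsCancellationRequested)
            {
                // Batch window elapsed; process what has been collected
            }

            try
            {
                var results = await _model.InferBatchAsync(
                    batch.Select(r => r.Prompt).ToArray(), ct);

                for (int i = 0; i < batch.Count; i++)
                    batch[i].Completion!.TrySetResult(results[i]);
            }
            catch (Exception ex)
            {
                // Fail the whole batch so callers do not wait forever
                foreach (var r in batch)
                    r.Completion!.TrySetException(ex);
            }
        }
    }
}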

Server node startup time: The Server Group controller does not route requests to a node until its health check passes. For nodes with long startup times (for example, GPU model loading), configure the readiness probe with a generous initialDelaySeconds so that no traffic reaches the node before it is ready.
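
In addition to a probe delay, a node with slow startup could make its own /health endpoint report "not ready" until loading has finished. The sketch below assumes the controller treats a non-2xx health response as unhealthy, which the source does not state; ReadinessGate and its members are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gate that tracks whether slow startup work (e.g. model loading) has finished.
public sealed class ReadinessGate
{
    private int _ready; // 0 = still starting, 1 = ready

    public bool IsReady => Volatile.Read(ref _ready) == 1;

    public void MarkReady() => Volatile.Write(ref _ready, 1);

    // Status code a /health handler could return: 200 when ready, 503 while starting.
    public int HealthStatusCode => IsReady ? 200 : 503;

    // Runs the slow startup work, then opens the gate. If loading throws, the gate stays closed.
    public async Task RunStartupAsync(
        Func<CancellationToken, Task> loadAsync, CancellationToken ct = default)
    {
        await loadAsync(ct);
        MarkReady();
    }
}
```

In Program.cs the gate could be registered as a singleton and used in the health handler, for example app.MapGet("/health", (ReadinessGate gate) => Results.StatusCode(gate.HealthStatusCode)).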