Octopus
LLM Routing via SK
The SemanticKernelPlugin wraps the Semantic Kernel (SK) kernel as an ILLMProvider, so Octopus agents can route different reasoning steps to different models. Routing is configured per agent and can be overridden per SK function call.
SKLLMProvider — The Bridge
The SKLLMProvider class implements Octopus's ILLMProvider interface, delegating to the SK kernel for completions. This means any Octopus agent can be configured to use SK as its LLM provider:
// SKLLMProvider: wraps the SK kernel as an Octopus ILLMProvider
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

public class SKLLMProvider : ILLMProvider
{
    private readonly Kernel _kernel;
    private readonly string _serviceId; // Which SK service to use (e.g., "gpt-4o")

    public SKLLMProvider(ISKKernelRegistry registry, string serviceId = "default")
    {
        _kernel = registry.Get(serviceId);
        _serviceId = serviceId;
    }

    public async Task<LLMResponse> CompleteAsync(
        IReadOnlyList<LLMMessage> messages,
        IReadOnlyList<ToolDefinition>? tools,
        LLMOptions options,
        CancellationToken ct)
    {
        var chatHistory = BuildSKChatHistory(messages);
        var settings = new OpenAIPromptExecutionSettings
        {
            MaxTokens = options.MaxTokens,
            Temperature = options.Temperature,
            // `tools` only toggles auto-invocation of functions registered on the kernel.
            ToolCallBehavior = tools?.Any() == true
                ? ToolCallBehavior.AutoInvokeKernelFunctions
                : null
        };

        // Pass the structured ChatHistory to the chat completion service so message
        // roles are preserved. Assumption: "default" means the kernel's unkeyed service.
        var chat = _kernel.GetRequiredService<IChatCompletionService>(
            _serviceId == "default" ? null : _serviceId);
        var result = await chat.GetChatMessageContentAsync(chatHistory, settings, _kernel, ct);

        return new LLMResponse
        {
            Content = result.Content ?? string.Empty,
            ToolCalls = ExtractToolCalls(result)
        };
    }

    public IAsyncEnumerable<string> StreamAsync(
        IReadOnlyList<LLMMessage> messages,
        LLMOptions options,
        CancellationToken ct)
    {
        return StreamSKTokensAsync(messages, options, ct);
    }
}
Multi-Model Routing
The SK kernel supports registering multiple AI services with distinct service IDs, enabling per-function model selection:
// OnStartAsync: register multiple models in the same kernel
kernelBuilder
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o",
        endpoint: azureEndpoint,
        apiKey: apiKey,
        serviceId: "gpt-4o")        // Expensive, high-quality
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o-mini",
        endpoint: azureEndpoint,
        apiKey: apiKey,
        serviceId: "gpt-4o-mini");  // Cheaper, faster

// In an agent's tool handler: choose which model to use for a specific step
var result = await kernel.InvokePromptAsync(
    prompt,
    new KernelArguments(new OpenAIPromptExecutionSettings
    {
        ServiceId = "gpt-4o-mini" // Override for this specific call
    }));
Routing Strategy Table
| Step Type | Recommended Model | Reason |
|---|---|---|
| Complex multi-step planning | gpt-4o | Requires strong reasoning and instruction following |
| Simple classification or intent detection | gpt-4o-mini | Low cost, fast response for structured tasks |
| RAG retrieval summarisation | gpt-4o-mini | Summarisation is cheaper with a smaller model |
| Code generation | gpt-4o | Code quality is significantly better with the full model |
| Embeddings | text-embedding-3-large | Separate service — routing does not apply |
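The table above can be encoded as a small lookup so that tool handlers do not hard-code service IDs. The following is a minimal sketch, not part of Octopus or SK: the `StepType` enum, the `ModelRouter` class, and the choice of `gpt-4o-mini` as the default for unlisted steps are all assumptions made for illustration. Embeddings are left out because they use a separate service.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical step categories mirroring the chat-model rows of the routing table.
public enum StepType
{
    Planning,
    Classification,
    RagSummarisation,
    CodeGeneration
}

// Hypothetical helper that maps a step type to an SK service ID.
public static class ModelRouter
{
    // The values match the service IDs registered on the kernel builder.
    private static readonly IReadOnlyDictionary<StepType, string> Routes =
        new Dictionary<StepType, string>
        {
            [StepType.Planning] = "gpt-4o",
            [StepType.Classification] = "gpt-4o-mini",
            [StepType.RagSummarisation] = "gpt-4o-mini",
            [StepType.CodeGeneration] = "gpt-4o",
        };

    // Steps without an entry fall back to the cheaper model (an assumed default).
    public static string ResolveServiceId(StepType step) =>
        Routes.TryGetValue(step, out var serviceId) ? serviceId : "gpt-4o-mini";
}
```

The string returned by `ModelRouter.ResolveServiceId(StepType.Classification)` could then be assigned to `ServiceId` on `OpenAIPromptExecutionSettings` for that call.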
Configuring an Agent to Use SKLLMProvider
// Configure an agent via API to use the SK LLM provider
PATCH /api/agents/{agentId}
{
  "llmProvider": "SemanticKernel",
  "llmModel": "gpt-4o",
  "llmSettings": {
    "serviceId": "gpt-4o",
    "plannerEnabled": true
  }
}
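For callers that issue this request from .NET, the payload might be assembled as sketched below. The `AgentConfigRequest` class and the `baseUrl` parameter are hypothetical; only the route and the JSON field names come from the request above. Authentication is omitted, and the method builds the request without sending it.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Hypothetical helper; not part of the Octopus API surface.
public static class AgentConfigRequest
{
    // Builds (but does not send) the PATCH request that switches an agent to the SK provider.
    // baseUrl is a placeholder for the Octopus host.
    public static HttpRequestMessage Build(
        string baseUrl, string agentId, string model, string serviceId)
    {
        // Anonymous-type property names are serialized as written, matching the JSON fields.
        var body = new
        {
            llmProvider = "SemanticKernel",
            llmModel = model,
            llmSettings = new { serviceId, plannerEnabled = true }
        };

        var url = $"{baseUrl.TrimEnd('/')}/api/agents/{Uri.EscapeDataString(agentId)}";
        return new HttpRequestMessage(HttpMethod.Patch, url)
        {
            Content = new StringContent(
                JsonSerializer.Serialize(body), Encoding.UTF8, "application/json")
        };
    }
}
```

The resulting message can be passed to `HttpClient.SendAsync` once the required authentication headers have been added.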
Fallback to Octopus native. If an SK kernel call fails (for example, because of a rate limit or an unavailable service), SKLLMProvider does not automatically fall back to the Octopus native LLM provider. Configure retry policies at the SK kernel level for resilience.
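To illustrate what a retry policy does, here is a minimal exponential-backoff sketch. It is a generic wrapper around a single call rather than SK's own kernel-level configuration, and the `Retry` class, its parameter names, and its default values are assumptions. It retries on any exception other than cancellation; a production policy would normally retry only on transient failures such as rate limits.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not part of Octopus or SK.
public static class Retry
{
    // Runs `operation` up to `maxAttempts` times, doubling the delay after each failure.
    public static async Task<T> WithBackoffAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? initialDelay = null,
        CancellationToken ct = default)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        var delay = initialDelay ?? TimeSpan.FromMilliseconds(200);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(ct);
            }
            // On the final attempt the filter is false, so the exception propagates to the caller.
            catch (Exception ex) when (attempt < maxAttempts && !(ex is OperationCanceledException))
            {
                await Task.Delay(delay, ct);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call such as `kernel.InvokePromptAsync(prompt, cancellationToken: token)` could be wrapped as `Retry.WithBackoffAsync(token => kernel.InvokePromptAsync(prompt, cancellationToken: token))`.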