Tool Call History in Context
When an agent calls a tool, both the tool call request and its result must appear in the message history, in the order that LLM APIs require. This page explains how Octopus manages tool call messages in working memory.
Tool Call Message Sequence
When the LLM produces a tool call, the message history must contain the tool call and its result as a consecutive pair before the LLM can continue:
// Required message sequence for tool calls, shown in a simplified layout.
// Exact field names vary by provider: Anthropic's Messages API, for example,
// carries calls and results as tool_use and tool_result content blocks
// rather than a separate "tool" role.
[
  { "role": "user", "content": "Onboard TechCorp as a vendor" },
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "tc_01",
        "name": "vendor_lookup",
        "input": { "name": "TechCorp" }
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "tc_01",
    "content": "{\"result\": null, \"message\": \"Vendor not found\"}"
  },
  // LLM continues here with the tool result in context:
  {
    "role": "assistant",
    "content": "TechCorp is not yet in the system. I'll create it now..."
  }
]
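The ordering rule above can be checked mechanically before a request is sent. The following is a minimal sketch, not Octopus code: `ChatMessage`, `ToolCall`, and `ToolCallSequenceValidator` are hypothetical names introduced for illustration, and the roles mirror the simplified layout shown above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical message shapes for illustration; not the actual Octopus types.
public record ToolCall(string Id, string Name);
public record ChatMessage(
    string Role,
    string? Content = null,
    IReadOnlyList<ToolCall>? ToolCalls = null,
    string? ToolCallId = null);

public static class ToolCallSequenceValidator
{
    // Returns true when every assistant tool call is answered by a "tool" message
    // directly after it, and no "tool" message appears without a matching call.
    public static bool IsValid(IReadOnlyList<ChatMessage> messages)
    {
        int i = 0;
        while (i < messages.Count)
        {
            var msg = messages[i];
            if (msg.Role == "tool") return false; // result with no preceding call

            i++;
            if (msg.Role != "assistant" || msg.ToolCalls is not { Count: > 0 }) continue;

            // These ids must all be answered before any other message appears.
            var pending = new HashSet<string>(msg.ToolCalls.Select(c => c.Id));
            while (pending.Count > 0)
            {
                if (i >= messages.Count) return false;    // history ends before the result
                var next = messages[i];
                if (next.Role != "tool") return false;    // another message is interleaved
                if (next.ToolCallId is null || !pending.Remove(next.ToolCallId)) return false;
                i++;
            }
        }
        return true;
    }
}
```

A check like this is cheap to run after every pruning or truncation step, where a broken pair is most likely to be introduced.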
Pruning Tool Call Pairs
Tool call and result messages are always pruned together, never individually. Removing only one half would break the required message sequence and cause LLM API errors:
// FIFOPruner handles tool call pairs
private IEnumerable<MessageGroup> GroupIntoRemovableUnits(IReadOnlyList<LLMMessage> messages)
{
    // Each group is a removable unit:
    // - Single user message (no tool calls)
    // - Assistant message with tool calls + all their tool result messages
    // - Single assistant message (no tool calls)
    var groups = new List<MessageGroup>();
    // ... grouping logic ...
    // Pruner removes entire groups, never individual messages within a group
    return groups;
}
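The grouping logic is elided above. One way it could work is sketched below, under stated assumptions: the `LLMMessage` and `MessageGroup` records here are hypothetical stand-ins with only the fields the sketch needs, and the oldest-first `Prune` method is an illustration of FIFO removal, not the actual FIFOPruner implementation.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the LLMMessage and MessageGroup types named above.
public record LLMMessage(string Role, string Content, int Tokens, bool HasToolCalls = false);
public record MessageGroup(IReadOnlyList<LLMMessage> Messages)
{
    public int Tokens => Messages.Sum(m => m.Tokens);
}

public static class FifoPruningSketch
{
    // Splits the history into removable units, keeping each assistant tool call
    // message together with the tool results that directly follow it.
    public static List<MessageGroup> Group(IReadOnlyList<LLMMessage> messages)
    {
        var groups = new List<MessageGroup>();
        int i = 0;
        while (i < messages.Count)
        {
            var unit = new List<LLMMessage> { messages[i] };
            bool opensToolGroup = messages[i].Role == "assistant" && messages[i].HasToolCalls;
            i++;
            while (opensToolGroup && i < messages.Count && messages[i].Role == "tool")
            {
                unit.Add(messages[i]);
                i++;
            }
            groups.Add(new MessageGroup(unit));
        }
        return groups;
    }

    // Drops the oldest groups until the remaining history fits the token budget.
    public static List<LLMMessage> Prune(IReadOnlyList<LLMMessage> messages, int maxTokens)
    {
        var groups = Group(messages);
        int total = groups.Sum(g => g.Tokens);
        int skip = 0;
        while (skip < groups.Count && total > maxTokens)
        {
            total -= groups[skip].Tokens;
            skip++;
        }
        return groups.Skip(skip).SelectMany(g => g.Messages).ToList();
    }
}
```

Because removal happens at group granularity, the pruned history may land well under the budget when a large tool result sits in the oldest group. The sketch also makes no attempt to pin system messages, which a real pruner would likely need to handle.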
Tool Call Results Token Impact
Tool results can be very large: a single database query might return thousands of tokens of JSON. The ToolResultTruncator limits the size of each tool result before it enters the context:
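// Truncate tool results to fit budget
using System;

public class ToolResultTruncator
{
    // Assumption: ITokenCounter is a hypothetical interface exposing int Count(string);
    // the snippet uses _counter without showing its declaration.
    private readonly ITokenCounter _counter;

    public ToolResultTruncator(ITokenCounter counter) => _counter = counter;

    public string Truncate(string toolResult, int maxTokens = 2000)
    {
        int tokens = _counter.Count(toolResult);
        if (tokens <= maxTokens) return toolResult;
        // Truncate and add notice. About 4 characters per token is an approximation,
        // so clamp to the string length to avoid an out-of-range slice.
        int keepChars = Math.Min(toolResult.Length, maxTokens * 4);
        return toolResult[..keepChars] +
            $"\n[... truncated {tokens - maxTokens} tokens ...]";
    }
}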
Large tool results are a common source of unexpected token costs. Always set size limits on MCP tool outputs: a tool that returns a 10,000-token JSON blob on every call will rapidly exhaust the context budget. Design tools to return only the data the LLM needs, not full database records.
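As an illustration of returning only what the LLM needs, a tool can project a full record down to a few named fields before handing it back. This is a rough sketch using System.Text.Json (.NET 6 or later); `ToolOutputProjection` and `KeepFields` are hypothetical names, and it assumes the tool output is a single JSON object.

```csharp
using System.Text.Json.Nodes;

public static class ToolOutputProjection
{
    // Keeps only the named top-level fields of a JSON object.
    // Throws if the input is valid JSON but not an object (for example, an array).
    public static string KeepFields(string json, params string[] fields)
    {
        var source = JsonNode.Parse(json)?.AsObject();
        var slim = new JsonObject();
        if (source is null) return slim.ToJsonString();

        foreach (var field in fields)
        {
            if (source.TryGetPropertyValue(field, out JsonNode? value))
            {
                // A node can have only one parent, so copy it by re-parsing its JSON text.
                slim[field] = value is null ? null : JsonNode.Parse(value.ToJsonString());
            }
        }
        return slim.ToJsonString();
    }
}
```

Projecting at the tool boundary keeps the useful fields intact, whereas truncation cuts wherever the budget runs out and can leave the LLM with invalid or misleading JSON.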