Octopus
Memory Configuration
Memory types can be individually enabled or disabled per agent via AgentMemoryConfig. This page covers all configuration properties, recommended starting values for common agent types, and how to tune memory settings for cost and performance.
AgentMemoryConfig
Every AgentComposite carries an AgentMemoryConfig that controls all memory behaviour:
public class AgentMemoryConfig
{
    // ── Working Memory ──────────────────────────────────────────
    public int MaxWorkingMemoryTokens { get; set; } = 100_000;
    public string PruningStrategy { get; set; } = "FIFO"; // FIFO | Summarize | SlidingWindow
    public int SlidingWindowTurns { get; set; } = 10;
    public int MinHistoryMessages { get; set; } = 2; // Never prune below this count

    // ── Episodic Memory ─────────────────────────────────────────
    public bool EpisodicEnabled { get; set; } = true;
    public string EpisodicRecallMode { get; set; } = "Semantic"; // Recency | Semantic
    public int EpisodicTopK { get; set; } = 3;
    public int EpisodicRetentionDays { get; set; } = 90;
    public bool EpisodicPIIDetection { get; set; } = false;

    // ── Semantic Memory ─────────────────────────────────────────
    public bool SemanticEnabled { get; set; } = true;
    public int SemanticTopK { get; set; } = 5;
    public float SemanticMinScore { get; set; } = 0.75f;
    public bool HybridSearchEnabled { get; set; } = false;
    public bool RerankerEnabled { get; set; } = false;

    // ── Procedural Memory ───────────────────────────────────────
    public bool ProceduralEnabled { get; set; } = true;
    public bool ProceduralLearning { get; set; } = false; // Agent captures new procedures
    public float ProceduralMinScore { get; set; } = 0.80f; // Min confidence for embedding match
}
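To make the working-memory settings concrete, here is a minimal sketch of how a FIFO strategy might interact with MaxWorkingMemoryTokens and MinHistoryMessages: the oldest messages are dropped while the history is over budget, but never below the minimum count. The Message record and FifoPruner class are hypothetical names for illustration. This page does not show Octopus's actual pruning implementation, so treat this as one plausible reading of the properties, not the real algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical message shape; token counts are assumed to be precomputed.
public record Message(string Role, string Text, int Tokens);

public static class FifoPruner
{
    // Drops the oldest messages while the history exceeds maxTokens,
    // but never prunes below minHistoryMessages.
    public static List<Message> Prune(
        IReadOnlyList<Message> history, int maxTokens, int minHistoryMessages)
    {
        var kept = new List<Message>(history);
        while (kept.Count > minHistoryMessages && kept.Sum(m => m.Tokens) > maxTokens)
        {
            kept.RemoveAt(0); // index 0 is the oldest message
        }
        return kept;
    }
}
```

Note that under this reading the floor takes priority over the budget: if the last MinHistoryMessages messages alone exceed the token limit, they are still kept.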
Configuration Properties Reference
| Property | Default | Effect |
|---|---|---|
| MaxWorkingMemoryTokens | 100,000 | Maximum tokens in the assembled context. Set to ~80% of the model's context window. |
| PruningStrategy | FIFO | How message history is trimmed when over budget. |
| EpisodicEnabled | true | Whether past conversation sessions are recalled. Disable for stateless bots. |
| EpisodicTopK | 3 | Number of past episode snippets injected per turn. Higher = more context, more tokens. |
| EpisodicRetentionDays | 90 | How long episodes are kept before soft-delete and purge. |
| SemanticEnabled | true | Whether the knowledge base is queried. Disable if the agent has no indexed documents. |
| SemanticTopK | 5 | Number of knowledge chunks retrieved per turn. Tune for answer quality vs. token cost. |
| SemanticMinScore | 0.75 | Minimum cosine similarity to include a chunk. Lower = more results, more noise. |
| HybridSearchEnabled | false | Enable RRF hybrid (vector + keyword) retrieval. Better for precise term matching. |
| RerankerEnabled | false | Cross-encoder reranking of retrieved chunks. Improves precision at cost of latency. |
| ProceduralEnabled | true | Whether procedures are matched before each LLM call. |
| ProceduralLearning | false | Whether the agent captures new procedures from successful task completions. |
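As a rough illustration of two rules from the table, the sketch below derives a MaxWorkingMemoryTokens value as 80% of a model's context window, then applies SemanticMinScore and SemanticTopK to a list of candidate chunks. The MemoryTuning helper and the Chunk record are hypothetical. How Octopus applies these settings internally is not documented on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical retrieval result; Score is assumed to be cosine similarity.
public record Chunk(string Text, float Score);

public static class MemoryTuning
{
    // ~80% of the model's context window, using integer arithmetic.
    public static int BudgetFor(int contextWindowTokens) =>
        (int)(contextWindowTokens * 8L / 10);

    // Keeps chunks scoring at or above minScore, best first, capped at topK.
    public static IReadOnlyList<Chunk> SelectChunks(
        IEnumerable<Chunk> candidates, float minScore, int topK) =>
        candidates.Where(c => c.Score >= minScore)
                  .OrderByDescending(c => c.Score)
                  .Take(topK)
                  .ToList();
}
```

For a model with a 128,000-token window, BudgetFor returns 102,400, which is close to the 100,000 default.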
Recommended Configurations by Agent Type
| Agent Type | Episodic | Semantic | Procedural | Pruning |
|---|---|---|---|---|
| Knowledge Q&A bot (no cross-session memory) | Disabled | Enabled, TopK 5 | Disabled | FIFO |
| Customer service (returning users) | Enabled, TopK 3 | Enabled, TopK 5 | Optional | FIFO |
| Task automation agent | Disabled | Optional | Enabled | SlidingWindow (10) |
| Full enterprise assistant | Enabled, TopK 3 | Enabled, TopK 5, Hybrid | Enabled | Summarize |
| High-throughput batch agent | Disabled | Disabled | Disabled | FIFO (minimal history) |
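The rows above can be expressed as AgentMemoryConfig instances. The sketch below does this for three of them using a trimmed copy of the class so that it compiles on its own. The MemoryPresets factory is hypothetical and not part of Octopus. Where the table says "Optional", the value chosen here is only one reasonable option.

```csharp
// Trimmed copy of AgentMemoryConfig with only the properties used below.
public class AgentMemoryConfig
{
    public string PruningStrategy { get; set; } = "FIFO";
    public int SlidingWindowTurns { get; set; } = 10;
    public bool EpisodicEnabled { get; set; } = true;
    public int EpisodicTopK { get; set; } = 3;
    public bool SemanticEnabled { get; set; } = true;
    public int SemanticTopK { get; set; } = 5;
    public bool HybridSearchEnabled { get; set; } = false;
    public bool ProceduralEnabled { get; set; } = true;
}

// Hypothetical preset factory mirroring the recommendation table.
public static class MemoryPresets
{
    // Knowledge Q&A bot: semantic retrieval only.
    public static AgentMemoryConfig KnowledgeQa() => new AgentMemoryConfig
    {
        EpisodicEnabled = false,
        SemanticEnabled = true,
        SemanticTopK = 5,
        ProceduralEnabled = false,
        PruningStrategy = "FIFO",
    };

    // Task automation agent: procedures plus a 10-turn sliding window.
    // Semantic memory is "Optional" in the table; it is disabled here by choice.
    public static AgentMemoryConfig TaskAutomation() => new AgentMemoryConfig
    {
        EpisodicEnabled = false,
        SemanticEnabled = false,
        ProceduralEnabled = true,
        PruningStrategy = "SlidingWindow",
        SlidingWindowTurns = 10,
    };

    // Full enterprise assistant: all memory types, hybrid search, summarizing pruner.
    public static AgentMemoryConfig EnterpriseAssistant() => new AgentMemoryConfig
    {
        EpisodicEnabled = true,
        EpisodicTopK = 3,
        SemanticEnabled = true,
        SemanticTopK = 5,
        HybridSearchEnabled = true,
        ProceduralEnabled = true,
        PruningStrategy = "Summarize",
    };
}
```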
Configuring via API
// PATCH /api/octopus/agents/{agentId}/memory-config
PATCH /api/octopus/agents/agent_hr_01/memory-config
Authorization: Bearer {adminToken}
Content-Type: application/json
{
  "episodicEnabled": true,
  "episodicTopK": 3,
  "episodicRetentionDays": 90,
  "semanticEnabled": true,
  "semanticTopK": 5,
  "semanticMinScore": 0.75,
  "hybridSearchEnabled": false,
  "proceduralEnabled": true,
  "proceduralLearning": false,
  "maxWorkingMemoryTokens": 100000,
  "pruningStrategy": "FIFO"
}
// Response: 200 OK — agent uses new config on next turn
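The same request can be issued from C#. The following is a minimal sketch using HttpClient and System.Text.Json. The MemoryConfigClient class and the base URL passed to it are assumptions for illustration, as is the idea that the endpoint accepts a partial body containing only the properties being changed. Check your deployment before relying on either.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper; not an official Octopus client.
public static class MemoryConfigClient
{
    private static readonly JsonSerializerOptions JsonOptions = new JsonSerializerOptions
    {
        // Produces camelCase keys such as "semanticTopK", matching the request body above.
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    };

    // Builds the PATCH request without sending it.
    public static HttpRequestMessage BuildPatch(
        Uri baseUri, string agentId, string adminToken, object changes)
    {
        var path = $"/api/octopus/agents/{Uri.EscapeDataString(agentId)}/memory-config";
        var json = JsonSerializer.Serialize(changes, changes.GetType(), JsonOptions);
        var request = new HttpRequestMessage(HttpMethod.Patch, new Uri(baseUri, path))
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json"),
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", adminToken);
        return request;
    }

    // Sends the request and throws if the server does not return a success status.
    public static async Task ApplyAsync(
        HttpClient http, Uri baseUri, string agentId, string adminToken, object changes)
    {
        using var request = BuildPatch(baseUri, agentId, adminToken, changes);
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
    }
}
```

A call such as `await MemoryConfigClient.ApplyAsync(http, baseUri, "agent_hr_01", adminToken, new { SemanticTopK = 3 })` would then send `{"semanticTopK":3}` to the memory-config endpoint.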
Tuning for Cost Reduction
Memory retrieval contributes tokens to every LLM call. To reduce token costs:
- Reduce SemanticTopK from 5 to 3: cuts 30–40% of injected knowledge tokens
- Reduce EpisodicTopK from 3 to 1: one past episode snippet is usually sufficient
- Disable EpisodicEnabled if the agent does not need cross-session memory
- Set PruningStrategy = SlidingWindow with a small window (6–8 turns) for task-focused sessions
- Raise SemanticMinScore to 0.80–0.85 to inject only high-confidence knowledge chunks
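The effect of lowering a TopK value can be estimated with simple arithmetic. The sketch below assumes every injected item has roughly the same token count, which is a simplification. The RetrievalCost helper and the 300-token average in the usage note are hypothetical figures, not measurements from Octopus.

```csharp
// Hypothetical estimator for tokens injected by retrieval on each turn.
public static class RetrievalCost
{
    // Upper bound on injected tokens: every one of the topK slots is filled.
    public static int InjectedTokens(int topK, int avgTokensPerItem) =>
        topK * avgTokensPerItem;

    // Percentage reduction in injected items when TopK is lowered.
    public static double SavingsPercent(int beforeTopK, int afterTopK) =>
        100.0 * (beforeTopK - afterTopK) / beforeTopK;
}
```

With an assumed 300 tokens per chunk, dropping SemanticTopK from 5 to 3 lowers the upper bound from 1,500 to 900 tokens per turn, a 40% reduction. Real savings can be smaller when fewer than TopK chunks pass SemanticMinScore in the first place.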
Changes Take Effect Immediately
Memory configuration changes via the API or admin UI take effect on the next conversation turn — no server restart or session reset is required. Running conversations automatically pick up the new configuration.