Octopus
Retrieval
Retrieval is the per-turn phase that finds the most semantically relevant document chunks for the user's current message. It embeds the query, searches the vector store, applies a minimum score filter, and returns the top-K results for context injection.
Retrieval Flow
1. Embed Query: The user message is embedded via IEmbeddingProvider.EmbedAsync. This must use the same embedding model that was used for indexing.
2. Vector Search: ISemanticMemoryStore.SearchAsync queries the agent's collection for the top-K nearest neighbours by cosine similarity.
3. Score Filter: Results scoring below SemanticMinScore (default 0.75) are discarded. This prevents low-relevance chunks from polluting the context.
4. Return Top-K: Up to SemanticTopK chunks (default 5) are returned as a SemanticRetrievalResult.
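To make steps 2 to 4 concrete, here is a minimal in-memory sketch of cosine scoring, the minimum-score filter, and the top-K cut. It is not the vector store's actual implementation: `RetrievalSketch`, `Cosine`, and `Search` are hypothetical names chosen for illustration, and a real store would use an index rather than scoring every chunk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the documented API.
public static class RetrievalSketch
{
    // Cosine similarity between two vectors of equal dimension.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // A zero vector has no direction; treat it as having no similarity.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Score every chunk, drop those below minScore, keep the best topK.
    public static List<(string Id, double Score)> Search(
        float[] query,
        IReadOnlyDictionary<string, float[]> chunks,
        int topK = 5,
        double minScore = 0.75)
    {
        return chunks
            .Select(kv => (Id: kv.Key, Score: Cosine(query, kv.Value)))
            .Where(r => r.Score >= minScore)
            .OrderByDescending(r => r.Score)
            .Take(topK)
            .ToList();
    }
}
```

Because the filter runs before the cut, fewer than `topK` results come back whenever few chunks clear the threshold, which matches the "up to SemanticTopK" wording above.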
SemanticMemoryService.RetrieveAsync
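using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class SemanticMemoryService
{
    // Dependencies are assumed to be injected via the constructor (omitted here).
    private readonly IEmbeddingProvider _embedder;
    private readonly ISemanticMemoryStore _store;
    private readonly ITokenCounter _counter; // type name assumed; not shown in this section

    public async Task<SemanticRetrievalResult> RetrieveAsync(
        AgentComposite agent,
        string query,
        CancellationToken ct)
    {
        // Skip retrieval entirely when semantic memory is disabled for this agent.
        if (!agent.MemoryConfig.SemanticEnabled)
            return SemanticRetrievalResult.Empty;

        // 1. Embed the query (same model as used for indexing)
        float[] queryEmbedding = await _embedder.EmbedAsync(query, ct);

        // 2. Search the vector store; the store applies the top-K limit
        //    and the minimum score filter.
        string collection = $"agent_{agent.Id:N}";
        var results = await _store.SearchAsync(
            collection: collection,
            embedding: queryEmbedding,
            topK: agent.MemoryConfig.SemanticTopK,
            minScore: agent.MemoryConfig.SemanticMinScore,
            ct: ct);

        // 3. Package the records together with token counts for budgeting.
        return new SemanticRetrievalResult
        {
            Records = results,
            QueryTokens = _counter.Count(query),
            TotalTokens = results.Sum(r => _counter.Count(r.Content))
        };
    }
}

public class SemanticRetrievalResult
{
    // Shared empty result returned when semantic memory is disabled.
    public static readonly SemanticRetrievalResult Empty = new();

    public IReadOnlyList<MemoryRecord> Records { get; init; } = Array.Empty<MemoryRecord>();
    public int QueryTokens { get; init; }
    public int TotalTokens { get; init; }
}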
Tuning Retrieval Parameters
| Parameter | Default | Effect of Increasing | Effect of Decreasing |
|---|---|---|---|
| SemanticTopK | 5 | More context, more tokens, possible noise | Less context, fewer tokens, may miss relevant chunks |
| SemanticMinScore | 0.75 | Stricter filter: only high-confidence matches | More results, including lower-relevance chunks |
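One way to reason about these two parameters together is to take the similarity scores from a representative query and count how many chunks would be injected under different settings. The sketch below does only that arithmetic; `TuningSketch` and `Survivors` are hypothetical names, and the scores in the usage lines are made-up illustrative values, not measurements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the documented API.
public static class TuningSketch
{
    // Number of chunks that would be injected for a given threshold and top-K:
    // everything at or above minScore, capped at topK.
    public static int Survivors(IEnumerable<double> scores, double minScore, int topK) =>
        Math.Min(topK, scores.Count(s => s >= minScore));
}

public static class TuningSketchDemo
{
    public static void Run()
    {
        // Illustrative scores only (assumed, not real data).
        var scores = new[] { 0.91, 0.82, 0.78, 0.74, 0.60 };

        foreach (var minScore in new[] { 0.70, 0.75, 0.80 })
        {
            int kept = TuningSketch.Survivors(scores, minScore, topK: 5);
            Console.WriteLine($"minScore={minScore:F2} -> {kept} chunk(s) injected");
        }
    }
}
```

Running a sweep like this over a handful of real queries shows whether the threshold or the top-K cap is the binding constraint before either default is changed.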
Category Filtering
Retrieval can be scoped to a specific document category when the query context is known:
// Filter to only retrieve from the "Leave" category
var results = await _store.SearchAsync(
    collection: collection,
    embedding: queryEmbedding,
    topK: 5,
    minScore: 0.75f,
    filter: new MemoryFilter { Category = "Leave" },
    ct: ct);

// Qdrant payload filter equivalent:
// { "must": [{ "key": "category", "match": { "value": "Leave" } }] }
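As an illustration of the payload filter shown in the comment above, the following sketch builds the same JSON shape with `System.Text.Json`. How `MemoryFilter` is actually translated for Qdrant is not shown in this section, so `QdrantFilterSketch` and `BuildCategoryFilter` are hypothetical names, and this is only one plausible mapping.

```csharp
using System.Text.Json;

// Hypothetical helper for illustration only; the real MemoryFilter-to-Qdrant
// translation is not shown in the documentation.
public static class QdrantFilterSketch
{
    // Builds: { "must": [ { "key": "category", "match": { "value": <category> } } ] }
    public static string BuildCategoryFilter(string category)
    {
        var filter = new
        {
            must = new[]
            {
                new { key = "category", match = new { value = category } }
            }
        };

        // Anonymous-type properties are serialized under their declared names.
        return JsonSerializer.Serialize(filter);
    }
}
```

Calling `QdrantFilterSketch.BuildCategoryFilter("Leave")` yields a compact JSON string with the same structure as the comment, which can help when comparing what the application sends against what the vector store receives.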
Debugging Retrieval Quality
| Symptom | Likely Cause | Fix |
|---|---|---|
| Agent can't answer a known question | Chunk not retrieved: its score is below SemanticMinScore | Lower SemanticMinScore; check chunk quality; re-index with better chunking |
| Agent gives irrelevant answers | Wrong chunks retrieved: SemanticMinScore too low | Raise SemanticMinScore to 0.80–0.85 |
| Agent only uses part of a long document | SemanticTopK too low: the relevant chunk is not in the top 5 | Increase SemanticTopK to 8–10 |
| High token cost per turn | SemanticTopK too high or chunks too large | Reduce SemanticTopK; use a smaller chunk size (256 tokens) |
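The first and third rows can be told apart mechanically if the unfiltered scores for a query are available, for example by running a search with the minimum score set to 0 and a large top-K. The sketch below classifies why an expected chunk was or was not returned. `RetrievalDiagnostics`, `RetrievalOutcome`, and `Explain` are hypothetical names, and obtaining the unfiltered scores this way is an assumption about how the store can be queried.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; not part of the documented API.
public enum RetrievalOutcome { Retrieved, BelowMinScore, OutsideTopK, NotInCollection }

public static class RetrievalDiagnostics
{
    // Given unfiltered (id, score) pairs for one query, explain what happens
    // to the chunk the agent was expected to use.
    public static RetrievalOutcome Explain(
        IEnumerable<(string Id, double Score)> scored,
        string expectedId,
        int topK,
        double minScore)
    {
        var ranked = scored.OrderByDescending(s => s.Score).ToList();
        int index = ranked.FindIndex(s => s.Id == expectedId);

        // The chunk was never indexed, or was indexed under another id.
        if (index < 0) return RetrievalOutcome.NotInCollection;

        // The score filter removes it regardless of rank.
        if (ranked[index].Score < minScore) return RetrievalOutcome.BelowMinScore;

        // It passes the filter, but topK or more higher-scoring chunks precede it.
        if (index >= topK) return RetrievalOutcome.OutsideTopK;

        return RetrievalOutcome.Retrieved;
    }
}
```

A `BelowMinScore` outcome points at the threshold or chunk quality, while `OutsideTopK` points at SemanticTopK, which mirrors the fixes listed in the table.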