Retrieval Flow

1. Embed Query: The user message is embedded via IEmbeddingProvider.EmbedAsync. The query must be embedded with the same model that was used for indexing, because vectors produced by different models are not comparable.
2. Vector Search: ISemanticMemoryStore.SearchAsync queries the agent's collection for the top-K nearest neighbours by cosine similarity.
3. Score Filter: Results scoring below SemanticMinScore (default 0.75) are discarded, which keeps low-relevance chunks out of the context.
4. Return Top-K: Up to SemanticTopK chunks (default 5) are returned as a SemanticRetrievalResult.
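
The steps above can be sketched in memory as follows. This is a minimal illustration, not the ISemanticMemoryStore implementation: ScoredChunk, RetrievalSketch, CosineSimilarity, and Search are hypothetical names introduced here, and a real vector store would use an index rather than the linear scan shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for illustration; not the MemoryRecord type used by the service.
public record ScoredChunk(string Content, float Score);

public static class RetrievalSketch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero norm.
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Embeddings must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot   += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0f;
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }

    // Steps 2-4: score every chunk, drop those below minScore, keep the best topK.
    public static IReadOnlyList<ScoredChunk> Search(
        float[] queryEmbedding,
        IEnumerable<(string Content, float[] Embedding)> chunks,
        int topK = 5,
        float minScore = 0.75f)
    {
        return chunks
            .Select(c => new ScoredChunk(c.Content, CosineSimilarity(queryEmbedding, c.Embedding)))
            .Where(c => c.Score >= minScore)
            .OrderByDescending(c => c.Score)
            .Take(topK)
            .ToList();
    }
}
```

Because the threshold is applied before the top-K cut, a query can return fewer than topK chunks, or none at all, when few candidates clear minScore.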

SemanticMemoryService.RetrieveAsync

public class SemanticMemoryService
{
    public async Task<SemanticRetrievalResult> RetrieveAsync(
        AgentComposite agent,
        string query,
        CancellationToken ct)
    {
        if (!agent.MemoryConfig.SemanticEnabled)
            return SemanticRetrievalResult.Empty;

        // 1. Embed the query
        float[] queryEmbedding = await _embedder.EmbedAsync(query, ct);

        // 2. Search the vector store
        string collection = $"agent_{agent.Id:N}";
        var results = await _store.SearchAsync(
            collection:  collection,
            embedding:   queryEmbedding,
            topK:        agent.MemoryConfig.SemanticTopK,
            minScore:    agent.MemoryConfig.SemanticMinScore,
            ct:          ct);

        // 3-4. The store has already applied minScore and topK; attach token counts
        return new SemanticRetrievalResult
        {
            Records      = results,
            QueryTokens  = _counter.Count(query),
            TotalTokens  = results.Sum(r => _counter.Count(r.Content))
        };
    }
}

public class SemanticRetrievalResult
{
    public static readonly SemanticRetrievalResult Empty = new();

    public IReadOnlyList<MemoryRecord> Records     { get; init; } = Array.Empty<MemoryRecord>();
    public int                         QueryTokens  { get; init; }
    public int                         TotalTokens  { get; init; }
}

Tuning Retrieval Parameters

| Parameter | Default | Effect of Increasing | Effect of Decreasing |
|---|---|---|---|
| SemanticTopK | 5 | More context, more tokens, possible noise | Less context, fewer tokens, may miss relevant chunks |
| SemanticMinScore | 0.75 | Stricter filter: only high-confidence matches | More results, including lower-relevance chunks |
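
One way to reason about these two parameters is to collect raw similarity scores for a representative query and count how many chunks would survive each candidate setting. The helper below is a hypothetical sketch: TuningSketch, CountSurvivors, and MaxContextTokens are names introduced here and are not part of the documented API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TuningSketch
{
    // Number of chunks returned for one query, given its raw similarity scores,
    // a score threshold, and a top-K cap.
    public static int CountSurvivors(IEnumerable<float> scores, float minScore, int topK)
    {
        int passing = scores.Count(s => s >= minScore);
        return Math.Min(passing, topK);
    }

    // Upper bound on retrieved tokens per turn if every returned chunk is full size.
    public static int MaxContextTokens(int topK, int chunkSizeTokens) => topK * chunkSizeTokens;
}
```

For example, with raw scores of 0.91, 0.83, 0.78, 0.74, and 0.60, the default settings (threshold 0.75, top-K 5) return three chunks, and raising the threshold to 0.80 returns two. With a top-K of 10 and 256-token chunks, retrieved context is bounded at 2,560 tokens per turn.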

Category Filtering

Retrieval can be scoped to a specific document category when the query context is known:

// Filter to only retrieve from the "Leave" category
var results = await _store.SearchAsync(
    collection: collection,
    embedding:  queryEmbedding,
    topK:       5,
    minScore:   0.75f,
    filter:     new MemoryFilter { Category = "Leave" },
    ct:         ct);

// Qdrant payload filter equivalent:
// { "must": [{ "key": "category", "match": { "value": "Leave" } }] }
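
As a rough illustration of how a category value could be turned into the Qdrant payload filter shown in the comment above, the sketch below builds that JSON with System.Text.Json.Nodes. QdrantFilterSketch and CategoryFilter are hypothetical names; how MemoryFilter is actually translated by the store is not described here, so treat this as an assumption about the shape only.

```csharp
using System.Text.Json.Nodes;

public static class QdrantFilterSketch
{
    // Builds { "must": [{ "key": "category", "match": { "value": <category> } }] }.
    // The "category" payload key is taken from the comment in the example above.
    public static JsonObject CategoryFilter(string category)
    {
        return new JsonObject
        {
            ["must"] = new JsonArray(
                new JsonObject
                {
                    ["key"]   = "category",
                    ["match"] = new JsonObject { ["value"] = category }
                })
        };
    }
}
```

Calling QdrantFilterSketch.CategoryFilter("Leave").ToJsonString() yields the same structure as the commented filter, in compact form.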

Debugging Retrieval Quality

| Symptom | Likely Cause | Fix |
|---|---|---|
| Agent can't answer a known question | Chunk not retrieved: score below MinScore | Lower MinScore; check chunk quality; re-index with better chunking |
| Agent gives irrelevant answers | Wrong chunks retrieved: MinScore too low | Raise MinScore to 0.80–0.85 |
| Agent only uses part of a long document | TopK too low: relevant chunk not in top-5 | Increase SemanticTopK to 8–10 |
| High token cost per turn | TopK too high or chunks too large | Reduce TopK; use smaller chunk size (256 tokens) |
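
To tell the first and third rows apart in practice, it helps to look at the raw, unfiltered scores for a failing query and see where the expected chunk ranks. The helper below is a hypothetical diagnostic sketch: RetrievalDiagnostics, Diagnose, and the chunk identifiers are assumptions made for illustration, and the raw scores are assumed to come from a search run with a very low threshold and a large top-K, if the store permits that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RetrievalDiagnostics
{
    // Explains why an expected chunk was or was not retrieved, given the raw
    // (unfiltered) similarity scores for a single query.
    public static string Diagnose(
        IReadOnlyList<(string ChunkId, float Score)> rawScores,
        string expectedChunkId,
        int topK = 5,
        float minScore = 0.75f)
    {
        var ranked = rawScores.OrderByDescending(r => r.Score).ToList();
        int index = ranked.FindIndex(r => r.ChunkId == expectedChunkId);

        if (index < 0)
            return "not indexed: expected chunk is missing from the collection";

        float score = ranked[index].Score;
        int rank = index + 1;

        if (score < minScore)
            return $"below MinScore: rank {rank}, score {score:F2} < {minScore:F2}";
        if (rank > topK)
            return $"outside TopK: rank {rank} > {topK}";
        return $"retrieved: rank {rank}, score {score:F2}";
    }
}
```

A "below MinScore" result points to lowering the threshold or improving chunk quality, while "outside TopK" points to raising SemanticTopK.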