Semantic Memory Overview

What is Semantic Memory?

Semantic memory stores what the agent knows — general facts, domain knowledge, company policies, product documentation. Unlike episodic memory (past conversations) or working memory (current context), semantic memory is persistent and separate from any particular conversation.

Technically, semantic memory is a vector database where each record contains:

Content: A chunk of text (typically 200–500 tokens)
Embedding: A high-dimensional float vector representing the semantic meaning of the content
Metadata: Source document, chunk index, category, agent ID, tenant ID, and the time the chunk was indexed

How Retrieval Works

User Message Arrives

The user's message is the query for semantic retrieval.

Embed the Query

IEmbeddingProvider.EmbedAsync(userMessage) returns a float[] vector representing the query's meaning.

Vector Similarity Search

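The query vector is compared against the indexed document chunks in the agent's collection using cosine similarity. The topK most similar chunks that score at least minScore are returned (SearchAsync defaults to topK = 5 and minScore = 0.7).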
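The search step can be sketched as a brute-force scan. This is only an illustration of cosine similarity with a top-K cutoff and a minimum score, not how any particular backend implements it; production vector databases typically use an approximate index rather than scoring every chunk. The CosineSearch class and its tuple-based chunk shape are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented API.
public static class CosineSearch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
    public static float Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0f;
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }

    // Scores every chunk, drops those below minScore, and returns
    // at most topK results ordered from most to least similar.
    public static List<(string Id, float Score)> TopK(
        IReadOnlyList<(string Id, float[] Embedding)> chunks,
        float[] query,
        int topK = 5,
        float minScore = 0.7f)
    {
        return chunks
            .Select(c => (c.Id, Score: Cosine(query, c.Embedding)))
            .Where(r => r.Score >= minScore)
            .OrderByDescending(r => r.Score)
            .Take(topK)
            .ToList();
    }
}
```

Because the minScore filter runs before the top-K cut, a query with no sufficiently similar chunks returns an empty list rather than five weak matches.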

Inject into Context

Retrieved chunks are formatted as a [Retrieved Knowledge] block and placed in the LLM context before the user's message.
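As a rough sketch of this step, the helper below turns retrieved chunks into a text block headed by [Retrieved Knowledge]. Only that header comes from the description above; the ContextFormatter name, the numbering, and the "Source:" line are assumptions made for illustration, and the real layout may differ.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; the block layout is an assumption.
public static class ContextFormatter
{
    // Builds the text block that is placed in the LLM context before the user's message.
    // Returns an empty string when nothing was retrieved, so no empty header is injected.
    public static string FormatRetrievedKnowledge(
        IReadOnlyList<(string Source, string Content)> chunks)
    {
        if (chunks.Count == 0) return string.Empty;

        var sb = new StringBuilder();
        sb.Append("[Retrieved Knowledge]\n");
        for (int i = 0; i < chunks.Count; i++)
        {
            // Number each chunk and name its source so the model can refer back to it.
            sb.Append($"({i + 1}) Source: {chunks[i].Source}\n");
            sb.Append(chunks[i].Content).Append("\n\n");
        }
        return sb.ToString();
    }
}
```

Including the source name with each chunk gives the model something concrete to cite in its answer.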

Grounded Response

The LLM answers from the retrieved knowledge, so it can cite actual content from the knowledge base. This reduces the risk of hallucination, though it does not remove it.

ISemanticMemoryStore Interface

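using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Note: MemoryFilter is referenced below but is not defined in this section.
public interface ISemanticMemoryStore
{
    // Stores a record in the collection named by record.Collection.
    Task StoreAsync(
        MemoryRecord record,
        CancellationToken ct = default);

    // Returns up to topK records whose similarity to queryEmbedding is at least minScore.
    Task<IReadOnlyList<MemoryRecord>> SearchAsync(
        string collection,         // agent-scoped collection name
        float[] queryEmbedding,
        int topK = 5,
        float minScore = 0.7f,
        MemoryFilter? filter = null,
        CancellationToken ct = default);

    // Deletes a single record by ID.
    Task DeleteAsync(
        string collection,
        string recordId,
        CancellationToken ct = default);

    Task<bool> CollectionExistsAsync(string collection, CancellationToken ct = default);

    // vectorSize must match the dimension of the embeddings stored in the collection.
    Task CreateCollectionAsync(string collection, int vectorSize, CancellationToken ct = default);
}

// Properties are initialized so the classes compile cleanly with nullable reference types enabled.
public class MemoryRecord
{
    public string Id { get; set; } = string.Empty;
    public string Collection { get; set; } = string.Empty;   // "agent_{agentId}_{tenantId}"
    public string Content { get; set; } = string.Empty;
    public float[] Embedding { get; set; } = Array.Empty<float>();
    public MemoryMetadata Metadata { get; set; } = new MemoryMetadata();
}

public class MemoryMetadata
{
    public string Source { get; set; } = string.Empty;       // source document name
    public int ChunkIndex { get; set; }                      // chunk position in source
    public string Category { get; set; } = string.Empty;     // "policy", "manual", "faq"
    public DateTimeOffset IndexedAt { get; set; }
    public string TenantId { get; set; } = string.Empty;
    public string AgentId { get; set; } = string.Empty;
}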

Vector Database Backends

Backend	Best For	Deployment	Scaling
Qdrant	High performance, large corpora	Docker / Qdrant Cloud	Horizontal sharding
PGVector	Simpler ops, existing PostgreSQL stack	PostgreSQL extension	Limited — vertical only
In-Memory	Development / testing only	In-process	Not for production

Collection per Agent

Each agent has its own vector collection, named agent_{agentId}_{tenantId}. This hard-isolates knowledge — an HR agent cannot accidentally retrieve documents from the Finance agent's knowledge base, even within the same tenant.
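A small helper can keep the naming convention in one place. The sketch below builds the agent_{agentId}_{tenantId} name described above. The CollectionNames class and its validation rule are assumptions added for illustration: because the name uses underscores as separators, IDs that themselves contain underscores could make two different agent/tenant pairs map to the same collection, so this example rejects them.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented API.
public static class CollectionNames
{
    // Builds the per-agent collection name: agent_{agentId}_{tenantId}.
    public static string ForAgent(string agentId, string tenantId)
    {
        Validate(agentId, nameof(agentId));
        Validate(tenantId, nameof(tenantId));
        return $"agent_{agentId}_{tenantId}";
    }

    // Assumed rule: IDs may contain only letters, digits, and hyphens.
    // Rejecting '_' keeps the separator unambiguous, e.g. ("a_b", "c")
    // and ("a", "b_c") would otherwise both yield "agent_a_b_c".
    private static void Validate(string value, string paramName)
    {
        if (string.IsNullOrWhiteSpace(value))
            throw new ArgumentException("ID must not be empty.", paramName);

        if (!value.All(c => char.IsLetterOrDigit(c) || c == '-'))
            throw new ArgumentException(
                "ID may contain only letters, digits, and hyphens.", paramName);
    }
}
```

For example, CollectionNames.ForAgent("hr", "acme") yields agent_hr_acme, which can then be passed as the collection argument to SearchAsync.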
