Memory System Overview

The Four Memory Types

Working Memory

The active LLM context window — what the agent is "thinking about" right now. Transient, token-constrained, rebuilt fresh on every turn.

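
To make "token-constrained" concrete, the sketch below drops the oldest messages until the remainder fits a budget. It is illustrative only: `WorkingMemoryPruner`, `EstimateTokens`, and the four-characters-per-token estimate are assumptions made for this example, and the actual system may prune differently.

```csharp
using System;
using System.Collections.Generic;

public static class WorkingMemoryPruner
{
    // Rough estimate of about 4 characters per token (an assumption, not a real tokenizer).
    public static int EstimateTokens(string text) => (text.Length + 3) / 4;

    // Keeps the most recent messages whose combined estimate fits the budget,
    // returned in their original chronological order.
    public static List<string> Prune(IReadOnlyList<string> messages, int tokenBudget)
    {
        var kept = new List<string>();
        var used = 0;

        // Walk backwards so the newest messages are kept first.
        for (var i = messages.Count - 1; i >= 0; i--)
        {
            var cost = EstimateTokens(messages[i]);
            if (used + cost > tokenBudget) break;
            used += cost;
            kept.Add(messages[i]);
        }

        kept.Reverse();
        return kept;
    }
}
```

Keeping the newest messages first matches the idea that working memory holds what the agent is processing right now; older context is left to episodic recall.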

Episodic Memory

Past conversation sessions stored in SQL Server. The agent's "autobiographical" memory: what happened in which conversation, with which user.


Semantic Memory

Embedded knowledge in a vector database. Documents, policies, FAQs — retrieved by meaning similarity at query time.

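
In a deployment the vector database performs the similarity search, but the underlying ranking can be sketched in memory. The example below is a minimal illustration, assuming embeddings are plain `float[]` vectors; `SemanticSearchSketch`, `CosineSimilarity`, and `TopK` are hypothetical names, not part of the documented API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SemanticSearchSketch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Ranks documents by similarity to the query embedding and returns the k best ids.
    public static List<string> TopK(
        float[] query,
        IReadOnlyDictionary<string, float[]> documents,
        int k) =>
        documents
            .OrderByDescending(d => CosineSimilarity(query, d.Value))
            .Take(k)
            .Select(d => d.Key)
            .ToList();
}
```

A brute-force scan like this is only practical for small collections; vector stores use approximate indexes to get comparable rankings at scale.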

Procedural Memory

Stored step sequences for repeatable tasks. The agent's "skill library" — how to accomplish specific types of tasks.

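
As a rough picture of a "skill library" lookup, the sketch below matches a task description against trigger phrases. Everything in it is hypothetical: `ProcedureSketch`, `ProcedureMatcher`, the trigger phrases, and the example steps are assumptions for illustration, and the actual store may match on embeddings instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a stored procedure: a name, phrases that trigger it, and ordered steps.
public sealed record ProcedureSketch(string Name, string[] TriggerPhrases, string[] Steps);

public static class ProcedureMatcher
{
    // Returns the first procedure with a trigger phrase contained in the task
    // description (case-insensitive), or null when nothing matches.
    public static ProcedureSketch? FindMatch(
        string taskDescription,
        IEnumerable<ProcedureSketch> procedures) =>
        procedures.FirstOrDefault(p =>
            p.TriggerPhrases.Any(t =>
                taskDescription.Contains(t, StringComparison.OrdinalIgnoreCase)));
}
```

When no procedure matches, the agent would simply proceed without a recalled skill, which is why the result is nullable.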

Master Comparison Table

Memory Type	Storage	Retrieval Method	Lifetime	Primary Use Case
Working	In-process (RAM)	Always included (pruned if too large)	One session turn	Current reasoning — what the LLM is processing now
Episodic	SQL Server	Recency or semantic search across past sessions	Configurable TTL (default 90 days)	Cross-session user continuity
Semantic	Qdrant / PGVector	Cosine similarity search on query embedding	Persistent (until deleted)	Knowledge base — answer factual questions
Procedural	SQL Server	Pattern match / embedding match on task description	Persistent (until deactivated)	Repeatable multi-step tasks
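
The episodic row's configurable TTL can be pictured as a simple age check. This is a minimal sketch under stated assumptions: `EpisodicRetention` and `IsExpired` are hypothetical names, and measuring age from the session's last activity is a guess at how expiry is computed. Only the 90-day default comes from the table above.

```csharp
using System;

public static class EpisodicRetention
{
    // 90 days mirrors the default TTL listed in the comparison table.
    public static readonly TimeSpan DefaultTtl = TimeSpan.FromDays(90);

    // True when the session's last activity is older than the TTL.
    public static bool IsExpired(
        DateTimeOffset lastActivity,
        DateTimeOffset now,
        TimeSpan? ttl = null)
        => now - lastActivity > (ttl ?? DefaultTtl);
}
```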

Memory Decision Guide

Which memory type should you configure for a given scenario?

Scenario	Memory Type
"Remember that I prefer formal language" (user preference)	Episodic — stored in past sessions, recalled next time
"What is the company's parental leave policy?"	Semantic — embedded knowledge base retrieval
"Onboard a new vendor" (multi-step task)	Procedural — recalled skill with step-by-step execution
"What did I say just two messages ago?"	Working — it's in the current context window
"Last month you helped me with an expense report"	Episodic — past session recall
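
The guide above maps onto per-agent configuration flags. The sketch below shows one plausible shape for them. The three `...Enabled` property names appear in the MemoryOrchestrator listing; the class layout, `EpisodicTtlDays`, and the two example agents are assumptions for illustration.

```csharp
public sealed class MemoryConfig
{
    public bool EpisodicEnabled { get; init; }
    public bool SemanticEnabled { get; init; }
    public bool ProceduralEnabled { get; init; }

    // Hypothetical property name; the 90-day default comes from the comparison table.
    public int EpisodicTtlDays { get; init; } = 90;
}

public static class MemoryConfigExamples
{
    // Policy Q&A agent: knowledge-base lookups plus recall of user preferences.
    public static MemoryConfig PolicyAssistant() =>
        new MemoryConfig { SemanticEnabled = true, EpisodicEnabled = true };

    // Vendor onboarding agent: recalled step sequences plus past-session recall.
    public static MemoryConfig VendorOnboarding() =>
        new MemoryConfig { ProceduralEnabled = true, EpisodicEnabled = true };
}
```

Working memory has no flag in this sketch because the current context window is always present.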

MemoryOrchestrator

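The MemoryOrchestrator is the central coordinator. Before each LLM call it queries every memory type enabled in the agent's MemoryConfig (procedural, semantic, and episodic), runs those lookups in parallel, and returns the results as a MemoryAssembly to be folded into working memory: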

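using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class MemoryOrchestrator
{
    // Injected stores. The field names come from the method body below;
    // the interface names are assumed for illustration.
    private readonly IProceduralStore _proceduralStore;
    private readonly ISemanticMemoryService _semanticService;
    private readonly IEpisodicStore _episodicStore;

    public MemoryOrchestrator(
        IProceduralStore proceduralStore,
        ISemanticMemoryService semanticService,
        IEpisodicStore episodicStore)
    {
        _proceduralStore = proceduralStore;
        _semanticService = semanticService;
        _episodicStore = episodicStore;
    }
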
    public async Task<MemoryAssembly> AssembleAsync(
        AgentComposite agent,
        ConversationComposite conversation,
        string userMessage,
        CancellationToken ct = default)
    {
        // Run memory retrievals in parallel (they are independent)
        var proceduralTask = agent.MemoryConfig.ProceduralEnabled
            ? _proceduralStore.FindMatchAsync(userMessage, agent.Id, agent.TenantId, ct)
            : Task.FromResult<Procedure?>(null);

        var semanticTask = agent.MemoryConfig.SemanticEnabled
            ? _semanticService.RetrieveAsync(agent, userMessage, ct)
            : Task.FromResult(SemanticRetrievalResult.Empty);

        var episodicTask = agent.MemoryConfig.EpisodicEnabled
            ? _episodicStore.SearchAsync(agent.Id, conversation.UserId, userMessage, topK: 3, ct)
            : Task.FromResult<IReadOnlyList<EpisodeSnippet>>(Array.Empty<EpisodeSnippet>());

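        // Wait for all three lookups; a disabled memory type completes immediately.
        await Task.WhenAll(proceduralTask, semanticTask, episodicTask);

        // Every task has completed, so these awaits return without blocking.
        return new MemoryAssembly
        {
            MatchedProcedure = await proceduralTask,
            SemanticResults  = await semanticTask,
            EpisodicSnippets = await episodicTask
        };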
    }
}
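
How the assembled memory ends up in the context window is not shown above. The following is a hedged sketch of one way to do it: `AssemblySketch` is a simplified stand-in for MemoryAssembly (plain strings instead of the real result types), and `PromptBuilderSketch`, the section headings, and their order are all assumptions.

```csharp
using System.Collections.Generic;
using System.Text;

// Simplified stand-in for MemoryAssembly, using plain strings for illustration.
public sealed record AssemblySketch(
    string? ProcedureSteps,
    IReadOnlyList<string> KnowledgeChunks,
    IReadOnlyList<string> PastSessionNotes);

public static class PromptBuilderSketch
{
    // Renders each non-empty memory source as its own section, then the user message.
    public static string BuildContext(AssemblySketch memory, string userMessage)
    {
        var sb = new StringBuilder();

        if (!string.IsNullOrEmpty(memory.ProcedureSteps))
            sb.AppendLine("## Procedure").AppendLine(memory.ProcedureSteps);

        if (memory.KnowledgeChunks.Count > 0)
        {
            sb.AppendLine("## Knowledge");
            foreach (var chunk in memory.KnowledgeChunks) sb.AppendLine("- " + chunk);
        }

        if (memory.PastSessionNotes.Count > 0)
        {
            sb.AppendLine("## Past sessions");
            foreach (var note in memory.PastSessionNotes) sb.AppendLine("- " + note);
        }

        sb.AppendLine("## User").Append(userMessage);
        return sb.ToString();
    }
}
```

Skipping empty sections keeps disabled memory types from spending tokens in the context window.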
