Semantic Memory
Semantic memory in Octopus is the agent's long-term knowledge store: documents, policies, manuals, and facts, stored as vector embeddings in a vector database. Retrieval is based on similarity of meaning rather than keyword matching.
What is Semantic Memory?
Semantic memory stores what the agent knows — general facts, domain knowledge, company policies, product documentation. Unlike episodic memory (past conversations) or working memory (current context), semantic memory is persistent and separate from any particular conversation.
Technically, semantic memory is a vector database where each record contains:
- Content: A chunk of text (typically 200–500 tokens)
- Embedding: A high-dimensional float vector representing the semantic meaning of the content
- Metadata: Source document, chunk index, category, agent ID, tenant ID, date indexed
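To make the chunk size concrete, here is a minimal sketch of splitting a document into chunks before embedding. It is not Octopus's actual chunker: the Chunker class and ChunkByWords method are hypothetical names, whitespace-separated words are used as a rough stand-in for model tokens, and there is no overlap between chunks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Chunker
{
    // Hypothetical helper: splits text into chunks of at most maxTokens words.
    // Assumption: one whitespace-separated word approximates one token.
    public static List<string> ChunkByWords(string text, int maxTokens = 300)
    {
        if (text == null)
            throw new ArgumentNullException(nameof(text));
        if (maxTokens <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxTokens));

        string[] words = text.Split(
            new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

        var chunks = new List<string>();
        for (int i = 0; i < words.Length; i += maxTokens)
        {
            // Take the next window of words and join it back into one chunk.
            chunks.Add(string.Join(" ", words.Skip(i).Take(maxTokens)));
        }
        return chunks;
    }
}
```

Each returned string would become the Content of one record, with its position in the list as the chunk index. A real pipeline would likely count tokens with the embedding model's tokenizer and split on sentence or paragraph boundaries.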
How Retrieval Works
1. User Message Arrives: The user's message serves as the query for semantic retrieval.
2. Embed the Query: IEmbeddingProvider.EmbedAsync(userMessage) returns a float[] vector representing the query's meaning.
3. Vector Similarity Search: The query vector is compared against the indexed document chunks in the agent's collection using cosine similarity. The top-K most similar chunks are returned; SearchAsync defaults to topK = 5 and accepts a minScore threshold (0.7 by default).
4. Inject into Context: Retrieved chunks are formatted as a [Retrieved Knowledge] block and placed in the LLM context before the user's message.
5. Grounded Response: The LLM answers from the retrieved knowledge, which reduces hallucination because the answer can cite actual content from the knowledge base.
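The similarity search and context injection steps above can be sketched as follows. This is an illustration under stated assumptions, not the Octopus implementation: RetrievalSketch and its methods are hypothetical names, the search is a brute-force linear scan (production backends typically use an approximate nearest-neighbor index instead), and the exact layout of the [Retrieved Knowledge] block is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RetrievalSketch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0f;
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }

    // Brute-force top-K: score every chunk, drop those below minScore,
    // and keep the K highest-scoring ones in descending order.
    public static List<(string Content, float Score)> TopK(
        float[] query,
        IEnumerable<(string Content, float[] Embedding)> chunks,
        int topK = 5,
        float minScore = 0.7f)
    {
        return chunks
            .Select(c => (c.Content, Score: CosineSimilarity(query, c.Embedding)))
            .Where(r => r.Score >= minScore)
            .OrderByDescending(r => r.Score)
            .Take(topK)
            .ToList();
    }

    // Assumed layout: a [Retrieved Knowledge] header, one bullet per chunk,
    // a blank line, then the user's message.
    public static string BuildContext(IEnumerable<string> retrieved, string userMessage)
    {
        var lines = new List<string> { "[Retrieved Knowledge]" };
        lines.AddRange(retrieved.Select(chunk => "- " + chunk));
        lines.Add("");
        lines.Add(userMessage);
        return string.Join("\n", lines);
    }
}
```

The topK and minScore defaults mirror those of SearchAsync. Raising minScore trades recall for precision: fewer chunks reach the LLM, but those that do are closer in meaning to the query.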
ISemanticMemoryStore Interface
```csharp
public interface ISemanticMemoryStore
{
    Task StoreAsync(
        MemoryRecord record,
        CancellationToken ct = default);

    Task<IReadOnlyList<MemoryRecord>> SearchAsync(
        string collection, // agent-scoped collection name
        float[] queryEmbedding,
        int topK = 5,
        float minScore = 0.7f,
        MemoryFilter? filter = null,
        CancellationToken ct = default);

    Task DeleteAsync(
        string collection,
        string recordId,
        CancellationToken ct = default);

    Task<bool> CollectionExistsAsync(string collection, CancellationToken ct = default);

    Task CreateCollectionAsync(string collection, int vectorSize, CancellationToken ct = default);
}

public class MemoryRecord
{
    public string Id { get; set; }
    public string Collection { get; set; } // "agent_{agentId}_{tenantId}"
    public string Content { get; set; }
    public float[] Embedding { get; set; }
    public MemoryMetadata Metadata { get; set; }
}

public class MemoryMetadata
{
    public string Source { get; set; } // source document name
    public int ChunkIndex { get; set; } // chunk position in source
    public string Category { get; set; } // "policy", "manual", "faq"
    public DateTimeOffset IndexedAt { get; set; }
    public string TenantId { get; set; }
    public string AgentId { get; set; }
}
```
Vector Database Backends
| Backend | Best For | Deployment | Scaling |
|---|---|---|---|
| Qdrant | High performance, large corpora | Docker / Qdrant Cloud | Horizontal sharding |
| PGVector | Simpler ops, existing PostgreSQL stack | PostgreSQL extension | Limited — vertical only |
| In-Memory | Development / testing only | In-process | Not for production |
Each agent has its own vector collection, named agent_{agentId}_{tenantId}. This hard-isolates knowledge — an HR agent cannot accidentally retrieve documents from the Finance agent's knowledge base, even within the same tenant.
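As a sketch of how the collection name could be built, the helper below formats agent_{agentId}_{tenantId} and rejects IDs containing underscores. The CollectionNames class and the underscore restriction are assumptions for illustration, not documented Octopus behavior; the restriction is there so that two different (agentId, tenantId) pairs can never produce the same name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CollectionNames
{
    // Assumption: IDs are limited to letters, digits, and hyphens.
    // Allowing "_" would make ("a_b", "c") and ("a", "b_c") collide.
    private static readonly Regex SafeId = new Regex(@"\A[A-Za-z0-9-]+\z");

    // Hypothetical helper: builds the agent-scoped collection name.
    public static string For(string agentId, string tenantId)
    {
        if (agentId == null || !SafeId.IsMatch(agentId))
            throw new ArgumentException("Invalid agent ID.", nameof(agentId));
        if (tenantId == null || !SafeId.IsMatch(tenantId))
            throw new ArgumentException("Invalid tenant ID.", nameof(tenantId));

        return $"agent_{agentId}_{tenantId}";
    }
}
```

For example, CollectionNames.For("hr", "acme") yields agent_hr_acme, which would be passed as the collection argument to SearchAsync so that a query only ever touches that agent's records.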