Octopus
Embedding
Embedding converts text into a dense vector of floating-point numbers that encodes its semantic meaning. The embedding model used at indexing time must be the same model used at query time. Vectors from different models occupy unrelated vector spaces (and often have different dimensions), so comparing them yields meaningless similarity scores.
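To make the mismatch risk concrete, here is a minimal sketch of cosine similarity, a common way to compare embedding vectors. `VectorMath` and `CosineSimilarity` are hypothetical names used only for illustration; they are not part of the Octopus API, and the vector store may compute similarity differently.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Octopus API.
public static class VectorMath
{
    // Cosine similarity between two embeddings of the same dimensionality.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException(
                $"Dimension mismatch: {a.Length} vs {b.Length}. " +
                "Were the vectors produced by different models?");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        // A zero vector has no direction; treat its similarity as 0.
        if (normA == 0 || normB == 0) return 0;

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

The length check only catches models with different output sizes. Two different models that happen to share a dimension (for example, two 1536-dimensional models) pass the check and still produce scores that mean nothing, which is why the model itself has to be tracked.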
IEmbeddingProvider Interface
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public interface IEmbeddingProvider
{
    // Embed a single text string
    Task<float[]> EmbedAsync(string text, CancellationToken ct = default);

    // Embed a batch of strings (more efficient: fewer API calls)
    Task<IReadOnlyList<float[]>> EmbedBatchAsync(
        IReadOnlyList<string> texts,
        CancellationToken ct = default);

    // Dimensions of the output vector (e.g. 1536, 3072, 768)
    int Dimensions { get; }
}
// Usage in the indexing pipeline:
float[] chunkEmbedding = await _embedder.EmbedAsync(chunk.Content, ct);
// Usage in the retrieval pipeline:
float[] queryEmbedding = await _embedder.EmbedAsync(userMessage, ct);
Supported Embedding Models
| Provider | Model | Dimensions | Max Input Tokens | Notes |
|---|---|---|---|---|
| OpenAI | text-embedding-3-small | 1536 | 8191 | Recommended default — fast, cost-effective |
| OpenAI | text-embedding-3-large | 3072 | 8191 | Higher quality; about 6.5x the per-token list price of text-embedding-3-small at launch |
| Azure OpenAI | text-embedding-ada-002 | 1536 | 8191 | Azure-hosted; same quality as OpenAI ada |
| Local ONNX | all-MiniLM-L6-v2 | 384 | 512 | On-premise; no API cost; lower quality |
| Local ONNX | bge-large-en-v1.5 | 1024 | 512 | High quality local model; CPU-intensive |
Configuration
// appsettings.json: embedding configuration
{
  "Octopus": {
    "Embedding": {
      "Provider": "OpenAI",              // OpenAI | AzureOpenAI | ONNX
      "Model": "text-embedding-3-small",
      "Dimensions": 1536,
      "CredentialId": 42,                // ICredentialResolver lookup for API key
      "BatchSize": 100                   // Max chunks per batch API call
    }
  }
}
// Azure OpenAI variant
{
  "Octopus": {
    "Embedding": {
      "Provider": "AzureOpenAI",
      "Endpoint": "https://my-instance.openai.azure.com",
      "Deployment": "text-embedding-ada-002",
      "Dimensions": 1536,
      "CredentialId": 43
    }
  }
}
// Local ONNX variant (no API key required)
{
  "Octopus": {
    "Embedding": {
      "Provider": "ONNX",
      "ModelPath": "/models/all-MiniLM-L6-v2.onnx",
      "Dimensions": 384
    }
  }
}
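Each provider variant above needs a different set of keys, so it can help to validate the section at startup. The following is a minimal sketch under stated assumptions: `EmbeddingOptions` and `Validate` are hypothetical names, the property names simply mirror the JSON keys shown above, and the required-key rules are inferred from the three examples rather than taken from Octopus itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical settings class mirroring the "Octopus:Embedding" keys shown above.
public sealed class EmbeddingOptions
{
    public string Provider { get; set; } = "OpenAI";
    public string? Model { get; set; }
    public string? Endpoint { get; set; }
    public string? Deployment { get; set; }
    public string? ModelPath { get; set; }
    public int Dimensions { get; set; }
    public int? CredentialId { get; set; }
    public int BatchSize { get; set; } = 100;

    // Returns a list of problems; an empty list means the settings look consistent.
    public IReadOnlyList<string> Validate()
    {
        var errors = new List<string>();

        if (Dimensions <= 0) errors.Add("Dimensions must be positive.");
        if (BatchSize <= 0) errors.Add("BatchSize must be positive.");

        switch (Provider)
        {
            case "OpenAI":
                if (string.IsNullOrWhiteSpace(Model)) errors.Add("OpenAI requires Model.");
                if (CredentialId is null) errors.Add("OpenAI requires CredentialId.");
                break;
            case "AzureOpenAI":
                if (string.IsNullOrWhiteSpace(Endpoint)) errors.Add("AzureOpenAI requires Endpoint.");
                if (string.IsNullOrWhiteSpace(Deployment)) errors.Add("AzureOpenAI requires Deployment.");
                if (CredentialId is null) errors.Add("AzureOpenAI requires CredentialId.");
                break;
            case "ONNX":
                if (string.IsNullOrWhiteSpace(ModelPath)) errors.Add("ONNX requires ModelPath.");
                break;
            default:
                errors.Add($"Unknown provider '{Provider}'.");
                break;
        }

        return errors;
    }
}
```

A class shaped like this could be populated by the standard .NET configuration binder from the `Octopus:Embedding` section and checked once at startup, so a missing key fails fast instead of surfacing as a runtime error on the first embedding call.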
Batch Embedding for Efficiency
// Batch embed all chunks in a document before writing to vector store
public async Task IndexDocumentAsync(DocumentContent doc, Guid agentId, CancellationToken ct)
{
    var chunks = _chunker.Chunk(doc.RawText, _chunkingConfig);

    // Embed every chunk through the batch API; the configured BatchSize (100)
    // caps how many chunks go into a single API call
    var allEmbeddings = await _embedder.EmbedBatchAsync(
        chunks.Select(c => c.Content).ToList(), ct);

    // Write chunks + embeddings to vector store
    var records = chunks.Select((chunk, i) => new MemoryRecord
    {
        Id = Guid.NewGuid().ToString(),
        Content = chunk.Content,
        Embedding = allEmbeddings[i],
        Metadata = new MemoryMetadata
        {
            Source = doc.Metadata.Source,
            AgentId = agentId.ToString(),
            TenantId = doc.Metadata.TenantId.ToString()
        }
    }).ToList();

    await _vectorStore.UpsertBatchAsync($"agent_{agentId:N}", records, ct);
}
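`IndexDocumentAsync` hands every chunk to `EmbedBatchAsync` in a single call, so the `BatchSize` limit has to be enforced somewhere. One possible approach, sketched below, is to split the input into slices and concatenate the results in order. `BatchingHelper`, `EmbedInBatchesAsync`, and the `embedOneBatch` delegate are hypothetical stand-ins for whatever the real provider does, and `Enumerable.Chunk` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the Octopus API.
public static class BatchingHelper
{
    // Splits texts into slices of at most batchSize, embeds each slice with
    // embedOneBatch, and returns the embeddings in the original input order.
    public static async Task<IReadOnlyList<float[]>> EmbedInBatchesAsync(
        IReadOnlyList<string> texts,
        int batchSize,
        Func<IReadOnlyList<string>, CancellationToken, Task<IReadOnlyList<float[]>>> embedOneBatch,
        CancellationToken ct = default)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var results = new List<float[]>(texts.Count);

        foreach (string[] batch in texts.Chunk(batchSize))
        {
            ct.ThrowIfCancellationRequested();

            var embeddings = await embedOneBatch(batch, ct);

            // Guard against a backend returning fewer vectors than inputs,
            // which would silently misalign chunks and embeddings.
            if (embeddings.Count != batch.Length)
                throw new InvalidOperationException(
                    $"Expected {batch.Length} embeddings, got {embeddings.Count}.");

            results.AddRange(embeddings);
        }

        return results;
    }
}
```

Because the results are appended slice by slice, the index-based pairing `allEmbeddings[i]` used in `IndexDocumentAsync` stays valid regardless of how many API calls were made.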
Model Lock-In Warning
Once documents are indexed with a specific embedding model, you cannot switch models without re-indexing all documents. The stored vectors are incompatible with a different model's query vectors. Establish and document your embedding model choice before indexing any production data.
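One way to enforce this rule in code rather than by convention is to record which model produced a collection's vectors and refuse to query with a different one. The sketch below assumes a hypothetical `EmbeddingFingerprint` record; where the fingerprint is persisted (collection metadata, a side table, configuration) is left open, since that depends on the vector store in use.

```csharp
using System;

// Hypothetical record identifying the model that produced a collection's vectors.
public sealed record EmbeddingFingerprint(string Model, int Dimensions)
{
    // Throws when the query-time model differs from the one recorded at indexing time.
    public void EnsureCompatible(EmbeddingFingerprint queryTime)
    {
        // Records compare by value, so this checks both Model and Dimensions.
        if (this != queryTime)
            throw new InvalidOperationException(
                $"Collection was indexed with {Model} ({Dimensions} dimensions) " +
                $"but the query uses {queryTime.Model} ({queryTime.Dimensions} dimensions). " +
                "Re-index the collection before switching models.");
    }
}
```

For example, a collection indexed with `new EmbeddingFingerprint("text-embedding-3-small", 1536)` would reject a query built with `new EmbeddingFingerprint("text-embedding-ada-002", 1536)`, even though both models produce 1536-dimensional vectors.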