Privacy and PII in Episodic Memory
Conversation messages may contain personally identifiable information (PII). Octopus provides PII detection, redaction at storage time, per-user data deletion, and tenant-scoped isolation to meet privacy obligations.
PII Detection and Redaction
When PII detection is enabled, each message is scanned before storage. Detected PII entities are replaced with typed placeholders:
    using System.Linq;
    using System.Threading.Tasks;

    // PII detector: scans each message before it is stored
    public class PIIDetector
    {
        // NER model used for entity extraction (injected; the interface name is assumed)
        private readonly INerModel _nerModel;

        public PIIDetector(INerModel nerModel) => _nerModel = nerModel;

        public async Task<PIIDetectionResult> DetectAndRedactAsync(string content)
        {
            // Named entity recognition for PII types
            var entities = await _nerModel.ExtractAsync(content);
            var redacted = content;

            // Replace from the end of the string backwards so that earlier
            // offsets stay valid. Assumes entity spans do not overlap.
            foreach (var entity in entities.OrderByDescending(e => e.StartIndex))
            {
                redacted = redacted
                    .Remove(entity.StartIndex, entity.Length)
                    .Insert(entity.StartIndex, $"[{entity.Type}]");
                // e.g., "John Smith" → "[PERSON_NAME]"
                //       "078-123-4567" → "[PHONE_NUMBER]"
                //       "SSN 123-45-6789" → "SSN [US_SSN]"
            }

            return new PIIDetectionResult
            {
                OriginalContent = content,
                RedactedContent = redacted,
                DetectedEntities = entities,
                ContainsPII = entities.Any()
            };
        }
    }
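The right-to-left replacement used by the detector can be tried in isolation. The following is a minimal sketch, not the Octopus implementation: it swaps the NER model for two illustrative regular expressions, and the names `PiiEntity` and `RedactionSketch` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for an NER result: a typed span within the input string.
public record PiiEntity(string Type, int StartIndex, int Length);

public static class RedactionSketch
{
    // Illustrative patterns only; a real detector would rely on the NER model.
    private static readonly (string Type, Regex Pattern)[] Patterns =
    {
        ("EMAIL", new Regex(@"[\w.+-]+@[\w-]+\.[\w.]+")),
        ("US_SSN", new Regex(@"\b\d{3}-\d{2}-\d{4}\b")),
    };

    // Collects every pattern match as a typed span.
    public static List<PiiEntity> Detect(string content) =>
        Patterns
            .SelectMany(p => p.Pattern.Matches(content)
                .Cast<Match>()
                .Select(m => new PiiEntity(p.Type, m.Index, m.Length)))
            .ToList();

    // Replaces each span with "[TYPE]", working from the last span to the first
    // so that the offsets of spans not yet processed remain valid.
    public static string Redact(string content, IEnumerable<PiiEntity> entities)
    {
        var redacted = content;
        foreach (var e in entities.OrderByDescending(e => e.StartIndex))
        {
            redacted = redacted
                .Remove(e.StartIndex, e.Length)
                .Insert(e.StartIndex, $"[{e.Type}]");
        }
        return redacted;
    }
}
```

With these patterns, `RedactionSketch.Redact(input, RedactionSketch.Detect(input))` turns `Mail john@example.com, SSN 123-45-6789` into `Mail [EMAIL], SSN [US_SSN]`. Processing spans in ascending order instead would shift every later offset as soon as a placeholder of a different length is inserted.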
PII Entity Types Detected
| Entity Type | Example | Placeholder |
|---|---|---|
| Person Name | John Smith | [PERSON_NAME] |
| Email Address | john@example.com | [EMAIL] |
| Phone Number | +1-800-555-0100 | [PHONE_NUMBER] |
| US SSN | 123-45-6789 | [US_SSN] |
| Credit Card | 4111 1111 1111 1111 | [CREDIT_CARD] |
| Date of Birth | born 1985-03-15 | [DATE_OF_BIRTH] |
| Address | 123 Main Street, NY | [ADDRESS] |
User Right to Erasure (GDPR/CCPA)
    // Data erasure: hard-deletes all episodes for a user within one tenant
    public async Task EraseUserDataAsync(Guid userId, string tenantId)
    {
        await _db.Episodes
            .Where(e => e.UserId == userId && e.TenantId == tenantId)
            .ExecuteDeleteAsync();

        // Also clear from vector store if summary embeddings were stored there
        await _vectorStore.DeleteByFilterAsync(
            collection: $"episodic_{tenantId}",
            filter: $"userId == '{userId}'");

        _logger.LogAudit("User data erased", userId, tenantId);
    }
Tenant Isolation Enforcement
All episodic memory queries include a mandatory TenantId filter, enforced by an EF Core global query filter. No API endpoint allows cross-tenant episode access. An episode ID that belongs to one tenant cannot be used to read that episode from another tenant's context, even if the ID is guessed.
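A tenant filter of this kind can be declared with EF Core's `HasQueryFilter`. The sketch below is an assumption about how such a filter might look, not the Octopus schema: the `Episode` shape, the `EpisodeDbContext` name, and passing the tenant ID through the constructor are all hypothetical.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity shape; only UserId and TenantId appear in the source.
public class Episode
{
    public Guid Id { get; set; }
    public Guid UserId { get; set; }
    public string TenantId { get; set; } = "";
}

public class EpisodeDbContext : DbContext
{
    // Tenant of the current request; assumed to be resolved by the caller.
    private readonly string _tenantId;

    public EpisodeDbContext(DbContextOptions<EpisodeDbContext> options, string tenantId)
        : base(options) => _tenantId = tenantId;

    public DbSet<Episode> Episodes => Set<Episode>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Appended to every LINQ query over Episodes. The field is read from
        // the context instance that runs the query, not captured once.
        modelBuilder.Entity<Episode>()
            .HasQueryFilter(e => e.TenantId == _tenantId);
    }
}
```

With this in place, a lookup such as `Episodes.FirstOrDefaultAsync(e => e.Id == id)` returns null when the row belongs to another tenant. EF Core's `IgnoreQueryFilters()` bypasses the filter, so any use of it in tenant-facing code paths deserves review.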
If PII detection is disabled, auto-generated episode summaries may contain PII from the conversation. If you enable LLM-generated summaries, ensure that your agreement with the LLM provider covers the data privacy requirements of your jurisdiction. Consider redacting PII from messages before sending them to the summary LLM.
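One way to apply that last recommendation is to run redaction as a separate step in front of the summarizer. This is a minimal sketch under stated assumptions: `redactAsync` and `summarizeAsync` are hypothetical delegates standing in for the PII detector and the LLM summary client, and `SummaryPipelineSketch` is not an Octopus type.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SummaryPipelineSketch
{
    // redactAsync: returns the redacted form of one message (hypothetical).
    // summarizeAsync: sends the given messages to the summary LLM (hypothetical).
    public static async Task<string> SummarizeRedactedAsync(
        IEnumerable<string> messages,
        Func<string, Task<string>> redactAsync,
        Func<IReadOnlyList<string>, Task<string>> summarizeAsync)
    {
        var redacted = new List<string>();
        foreach (var message in messages)
        {
            // Redact first, so the summarizer only ever receives placeholders.
            redacted.Add(await redactAsync(message));
        }

        return await summarizeAsync(redacted);
    }
}
```

Keeping the summarizer behind a delegate that only receives the redacted list makes it harder to pass raw messages to the provider by accident.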