PII Detection and Redaction

When PII detection is enabled, each message is scanned before storage. Detected PII entities are replaced with typed placeholders:

using System.Linq;
using System.Threading.Tasks;

// PII detector processes messages at episode closure
public class PIIDetector
{
    public async Task<PIIDetectionResult> DetectAndRedactAsync(string content)
    {
        // Named entity recognition for PII types
        var entities = await _nerModel.ExtractAsync(content);

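        var redacted = content;
        // Replace from the end of the string backwards so that the StartIndex
        // of every entity still to be processed stays valid after each substitution.
        foreach (var entity in entities.OrderByDescending(e => e.StartIndex))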
        {
            redacted = redacted
                .Remove(entity.StartIndex, entity.Length)
                .Insert(entity.StartIndex, $"[{entity.Type}]");
            // e.g., "John Smith" → "[PERSON_NAME]"
            // "078-123-4567" → "[PHONE_NUMBER]"
            // "SSN 123-45-6789" → "SSN [US_SSN]"
        }

        return new PIIDetectionResult
        {
            OriginalContent = content,
            RedactedContent = redacted,
            DetectedEntities = entities,
            ContainsPII = entities.Any()
        };
    }
}
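
The detector above assumes the entity spans returned by the NER model never overlap. If they can (the model's behaviour is not specified here), replacing one span at a time may leave fragments of PII behind or index past the end of the string. The following is a minimal sketch of one way to guard against that, by merging overlapping spans before replacing from right to left. `PiiEntity`, `RedactionSketch`, and the regex patterns are hypothetical stand-ins for illustration, not the product's actual types or detection logic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical entity shape; the real NER output type is not shown in this document.
public record PiiEntity(string Type, int StartIndex, int Length)
{
    public int End => StartIndex + Length;
}

public static class RedactionSketch
{
    // Regex stand-ins for the NER model, for illustration only.
    private static readonly (string Type, Regex Pattern)[] Patterns =
    {
        ("EMAIL", new Regex(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
        ("US_SSN", new Regex(@"\b\d{3}-\d{2}-\d{4}\b")),
    };

    public static List<PiiEntity> Extract(string content) =>
        Patterns
            .SelectMany(p => p.Pattern.Matches(content)
                .Cast<Match>()
                .Select(m => new PiiEntity(p.Type, m.Index, m.Length)))
            .ToList();

    public static string Redact(string content, IEnumerable<PiiEntity> entities)
    {
        // Drop spans that fall outside the text, then merge overlapping spans
        // so that no fragment of a detected entity survives redaction.
        var merged = new List<PiiEntity>();
        foreach (var e in entities
            .Where(x => x.StartIndex >= 0 && x.Length > 0 && x.End <= content.Length)
            .OrderBy(x => x.StartIndex))
        {
            if (merged.Count > 0 && e.StartIndex < merged[^1].End)
            {
                // Overlap: extend the previous span and keep its type.
                var last = merged[^1];
                merged[^1] = last with { Length = Math.Max(last.End, e.End) - last.StartIndex };
            }
            else
            {
                merged.Add(e);
            }
        }

        // Replace from right to left so earlier offsets remain valid.
        var redacted = content;
        for (var i = merged.Count - 1; i >= 0; i--)
        {
            redacted = redacted
                .Remove(merged[i].StartIndex, merged[i].Length)
                .Insert(merged[i].StartIndex, $"[{merged[i].Type}]");
        }
        return redacted;
    }
}
```

With these stand-in patterns, `RedactionSketch.Redact(text, RedactionSketch.Extract(text))` turns `"Mail john@example.com, SSN 123-45-6789"` into `"Mail [EMAIL], SSN [US_SSN]"`. Keeping the type of the earliest span on a merge is an arbitrary choice made for this sketch; a real system might prefer the longest or highest-confidence entity.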

PII Entity Types Detected

| Entity Type | Example | Placeholder |
|---|---|---|
| Person Name | John Smith | [PERSON_NAME] |
| Email Address | john@example.com | [EMAIL] |
| Phone Number | +1-800-555-0100 | [PHONE_NUMBER] |
| US SSN | 123-45-6789 | [US_SSN] |
| Credit Card | 4111 1111 1111 1111 | [CREDIT_CARD] |
| Date of Birth | born 1985-03-15 | [DATE_OF_BIRTH] |
| Address | 123 Main Street, NY | [ADDRESS] |

User Right to Erasure (GDPR/CCPA)

// Data erasure endpoint — hard-deletes all episodes for a user
public async Task EraseUserDataAsync(Guid userId, string tenantId)
{
    await _db.Episodes
        .Where(e => e.UserId == userId && e.TenantId == tenantId)
        .ExecuteDeleteAsync();

    // Also clear from vector store if summary embeddings were stored there
    await _vectorStore.DeleteByFilterAsync(
        collection: $"episodic_{tenantId}",
        filter: $"userId == '{userId}'");

    _logger.LogAudit("User data erased", userId, tenantId);
}

Tenant Isolation Enforcement

All episodic memory queries include a mandatory TenantId filter, enforced by an EF Core global query filter. No API endpoint allows cross-tenant episode access: an episode ID from one tenant cannot be used to read data in another tenant, even if the ID is guessed.
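
For readers unfamiliar with global query filters, the following is a minimal sketch of how such a filter is typically registered in EF Core. The `Episode` shape, `ITenantProvider`, and `EpisodicMemoryDbContext` are assumptions made for illustration; the product's actual context and tenant resolution are not shown in this document.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity; only the properties used elsewhere on this page are included.
public class Episode
{
    public Guid Id { get; set; }
    public Guid UserId { get; set; }
    public string TenantId { get; set; } = "";
}

// Hypothetical abstraction that resolves the tenant for the current request.
public interface ITenantProvider
{
    string TenantId { get; }
}

public class EpisodicMemoryDbContext : DbContext
{
    private readonly string _tenantId;

    public EpisodicMemoryDbContext(
        DbContextOptions<EpisodicMemoryDbContext> options,
        ITenantProvider tenantProvider) : base(options)
    {
        _tenantId = tenantProvider.TenantId;
    }

    public DbSet<Episode> Episodes => Set<Episode>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Applied to every LINQ query over Episodes issued through this context,
        // so a lookup by a guessed episode ID still only matches the caller's tenant.
        modelBuilder.Entity<Episode>()
            .HasQueryFilter(e => e.TenantId == _tenantId);
    }
}
```

Under this setup, a query such as `db.Episodes.Where(e => e.Id == id)` returns nothing for an episode that belongs to another tenant. Note that EF Core's `IgnoreQueryFilters()` bypasses global filters, so any use of it in tenant-scoped code paths deserves review.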

PII in LLM Summaries

If PII detection is disabled, the auto-generated episode summary may contain PII from the conversation. If you enable LLM-generated summaries, ensure that your agreement with the LLM provider covers the data privacy requirements of your jurisdiction. Consider applying PII redaction to messages before sending them to the summary LLM.
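
One way to apply redaction ahead of summarization is sketched below. It only illustrates the ordering (redact first, summarize second); `SummaryPipelineSketch` and both delegates are hypothetical stand-ins for the PII detector and the summary LLM client, whose real interfaces are not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SummaryPipelineSketch
{
    // redactAsync and summarizeAsync are hypothetical delegates standing in for
    // the PII detector and the summary LLM client respectively.
    public static async Task<string> SummarizeRedactedAsync(
        IEnumerable<string> messages,
        Func<string, Task<string>> redactAsync,
        Func<IReadOnlyList<string>, Task<string>> summarizeAsync)
    {
        // Redact every message first, so the summarizer never sees raw content.
        var redacted = new List<string>();
        foreach (var message in messages)
        {
            redacted.Add(await redactAsync(message));
        }

        return await summarizeAsync(redacted);
    }
}
```

With the detector shown earlier on this page, the first delegate could be something like `async m => (await detector.DetectAndRedactAsync(m)).RedactedContent`, assuming a `detector` instance is available in scope.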