FlowRag — Examples

Example 1: Insert HR Policy into Knowledge Base

Scenario: A workflow processes a new HR policy document. Each paragraph is inserted into the hr-policies collection as an individual knowledge item.

{
  "credential_id": "pgvector-prod",
  "operation": "insert",
  "collection_name": "hr-policies",
  "item_id": "policy-remote-work-v4",
  "knowledge_item": "Employees are eligible for remote work up to 3 days per week after completing 90 days of employment. A minimum of 2 days per week must be spent in the office. Remote work requires manager pre-approval via the HR portal and a compliant home office setup meeting BizFirst ergonomic standards.",
  "metadata": {
    "category": "hr",
    "policy_type": "remote-work",
    "department": "all",
    "version": "4",
    "effective_date": "2025-01-01",
    "status": "active"
  }
}

Expected outcome: The policy text is embedded and stored under item_id policy-remote-work-v4. Later queries about remote work, working from home, or office attendance should retrieve this item with a high similarity score.
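The scenario above inserts each paragraph as its own knowledge item. A minimal Python sketch of preparing those payloads follows. The helper name build_insert_payloads, the "-pN" item ID suffix, and the paragraph_number metadata key are illustrative assumptions, not part of FlowRag; only the payload field names come from the example above.

```python
import json

def build_insert_payloads(document_text, base_item_id, metadata,
                          collection_name="hr-policies",
                          credential_id="pgvector-prod"):
    """Return one insert payload per non-empty paragraph of document_text."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in document_text.split("\n\n") if p.strip()]
    payloads = []
    for index, paragraph in enumerate(paragraphs, start=1):
        payloads.append({
            "credential_id": credential_id,
            "operation": "insert",
            "collection_name": collection_name,
            # Hypothetical ID scheme: base ID plus paragraph position.
            "item_id": f"{base_item_id}-p{index}",
            "knowledge_item": paragraph,
            # Copy the shared metadata and record the paragraph position.
            "metadata": {**metadata, "paragraph_number": str(index)},
        })
    return payloads

policy_text = (
    "Employees are eligible for remote work up to 3 days per week.\n\n"
    "Remote work requires manager pre-approval via the HR portal."
)
payloads = build_insert_payloads(
    policy_text,
    base_item_id="policy-remote-work-v4",
    metadata={"category": "hr", "status": "active"},
)
print(json.dumps(payloads[0], indent=2))
```

Giving every paragraph a stable, predictable item_id makes later update and delete operations straightforward, because the ID can be recomputed from the document ID and paragraph position.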

Example 2: Employee Policy Q&A Query

Scenario: An employee asks "How many days can I work from home?" in a chatbot workflow. The workflow queries the hr-policies collection to retrieve relevant context.

{
  "credential_id": "pgvector-prod",
  "operation": "query",
  "collection_name": "hr-policies",
  "query": "{{ vars.employee_question }}",
  "top_k": 4,
  "min_score": 0.75,
  "metadata_filter": {
    "status": "active",
    "department": "{{ vars.employee_department }}"
  }
}

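Expected outcome: Up to 4 items are returned, each with status active, a department value matching the employee's department, and a similarity score of at least 0.75. The context_text output is injected into an HTTP Request node that calls Claude with the employee's question to produce a policy-grounded answer. If metadata_filter matches values exactly, items stored with department: all (as in Example 1) would not match a specific department and would need separate handling.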
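To make the interaction of top_k, min_score, and metadata_filter concrete, here is a small in-memory Python sketch. It is not FlowRag's implementation: it assumes cosine similarity as the score and exact-match metadata filtering, and it uses toy 2-dimensional vectors in place of real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def query(items, query_vector, top_k, min_score, metadata_filter):
    """Return up to top_k (score, item_id) pairs, best score first."""
    matches = []
    for item in items:
        # Assumed semantics: every filter key must match exactly.
        if any(item["metadata"].get(k) != v for k, v in metadata_filter.items()):
            continue
        score = cosine_similarity(query_vector, item["vector"])
        if score >= min_score:
            matches.append((score, item["item_id"]))
    matches.sort(reverse=True)
    return matches[:top_k]

items = [
    {"item_id": "a", "vector": [1.0, 0.0],
     "metadata": {"status": "active", "department": "engineering"}},
    {"item_id": "b", "vector": [0.8, 0.6],
     "metadata": {"status": "active", "department": "engineering"}},
    {"item_id": "c", "vector": [0.0, 1.0],
     "metadata": {"status": "active", "department": "engineering"}},
    {"item_id": "d", "vector": [1.0, 0.0],
     "metadata": {"status": "archived", "department": "engineering"}},
    {"item_id": "e", "vector": [1.0, 0.1],
     "metadata": {"status": "active", "department": "all"}},
]

results = query(items, [1.0, 0.0], top_k=4, min_score=0.75,
                metadata_filter={"status": "active", "department": "engineering"})
result_ids = [item_id for _, item_id in results]
print(result_ids)
```

In this toy data, item c falls below min_score, item d fails the status filter, and item e is excluded because "all" is not equal to "engineering" under exact matching. Fewer than top_k items come back whenever the filter and threshold remove the rest.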

Example 3: Support FAQ Retrieval with Score Filtering

Scenario: A support ticket arrives. Before generating an AI response, check if the issue is covered by existing FAQs to provide a direct, known-good answer.

{
  "credential_id": "pgvector-prod",
  "operation": "query",
  "collection_name": "support-faqs",
  "query": "{{ vars.ticket_description }}",
  "top_k": 3,
  "min_score": 0.80,
  "metadata_filter": {
    "product": "{{ vars.product_name }}",
    "status": "published"
  }
}

Expected outcome: If result_count > 0, the top FAQ is used to generate a direct response (skipping the full AI generation step). If no FAQs meet the threshold, the workflow routes to the full AI generation path.
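The branching described above can be sketched in Python as follows. The result_count field comes from the text; the results list and its fields are assumed output shapes used only for illustration, and the path names are hypothetical.

```python
def route_ticket(rag_output):
    """Choose the workflow path based on the FlowRag query output."""
    # 'results' is an assumed field holding the matched items, best first.
    results = rag_output.get("results", [])
    if rag_output.get("result_count", 0) > 0 and results:
        # At least one FAQ met min_score: answer directly from the top match.
        return {"path": "faq_direct", "faq": results[0]}
    # Nothing met the threshold: fall back to full AI generation.
    return {"path": "full_ai_generation", "faq": None}

hit = route_ticket({
    "result_count": 1,
    "results": [{"item_id": "faq-password-reset", "score": 0.91}],
})
miss = route_ticket({"result_count": 0, "results": []})
print(hit["path"], miss["path"])
```

Because min_score already removed weak matches, the branch only needs to test whether anything was returned; no second score comparison is required.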

Example 4: Update Outdated Knowledge Item

Scenario: The remote work policy has been updated. Replace the existing knowledge item with the new policy text and update metadata to version 5.

{
  "credential_id": "pgvector-prod",
  "operation": "update",
  "collection_name": "hr-policies",
  "item_id": "policy-remote-work-v4",
  "knowledge_item": "Effective July 2025, employees may work remotely up to 4 days per week after completing 30 days of employment. One mandatory in-office day per week is required for all employees. Remote work days must be recorded in the HR portal by 8am on the day of remote work.",
  "metadata": {
    "category": "hr",
    "policy_type": "remote-work",
    "department": "all",
    "version": "5",
    "effective_date": "2025-07-01",
    "status": "active"
  }
}

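Expected outcome: The knowledge item is re-embedded with the new text, and the embedding of the old text is replaced immediately. Subsequent queries about remote work retrieve the updated policy; no model retraining is required. The item_id is unchanged, so it still carries the v4 suffix even though metadata.version is now 5; a version-neutral item_id avoids this mismatch.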

Example 5: Complete RAG Pipeline — Legal Contract Clause Lookup

Scenario: A lawyer asks "Find clauses about limitation of liability in technology contracts." This example shows the FlowRag query as part of a two-step RAG pipeline with an AI node.

Step 1 — FlowRag Query:

{
  "credential_id": "qdrant-legal",
  "operation": "query",
  "collection_name": "contract-library",
  "query": "{{ vars.lawyer_query }}",
  "top_k": 5,
  "min_score": 0.72,
  "metadata_filter": {
    "contract_type": "technology",
    "status": "executed"
  }
}

Step 2 — HTTP Request to Claude (downstream node):

{
  "url": "https://api.anthropic.com/v1/messages",
  "method": "POST",
  "auth_type": "bearer",
  "auth_credentials": { "credential_id": "anthropic-api" },
  "body": {
    "model": "claude-opus-4-5",
    "max_tokens": 1024,
    "messages": [{
      "role": "user",
      "content": "You are a legal research assistant. Using the contract clauses below, answer the lawyer's query.\n\nRelevant contract clauses:\n{{ nodes.ragLegal.output.context_text }}\n\nQuery: {{ vars.lawyer_query }}\n\nCite the specific clause IDs in your response."
    }]
  }
}

Expected outcome: Claude receives up to 5 executed technology-contract clauses that score at least 0.72 as context and is instructed to cite clause IDs. Its answer is grounded in the retrieved clauses and can be checked against them, unlike a general LLM response produced without RAG context.
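The request body in Step 2 can be assembled programmatically. The Python sketch below mirrors the body shown above; the function name build_request_body and the empty-context guard are illustrative assumptions rather than FlowRag behavior.

```python
import json

PROMPT_TEMPLATE = (
    "You are a legal research assistant. Using the contract clauses below, "
    "answer the lawyer's query.\n\n"
    "Relevant contract clauses:\n{context_text}\n\n"
    "Query: {lawyer_query}\n\n"
    "Cite the specific clause IDs in your response."
)

def build_request_body(context_text, lawyer_query,
                       model="claude-opus-4-5", max_tokens=1024):
    """Build the JSON body for the downstream HTTP Request node."""
    # Assumed guard: refuse to call the model when retrieval found nothing,
    # so the answer is never generated without supporting clauses.
    if not context_text.strip():
        raise ValueError("no retrieved clauses to ground the answer")
    content = PROMPT_TEMPLATE.format(
        context_text=context_text, lawyer_query=lawyer_query
    )
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": content}],
    }

body = build_request_body(
    "Clause 7.2: Liability is capped at fees paid in the prior 12 months.",
    "Find clauses about limitation of liability in technology contracts.",
)
print(json.dumps(body, indent=2))
```

Failing early on an empty context is one possible design choice; a workflow could instead route to a "no matching clauses" reply.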

Example 6: Delete Superseded Knowledge Item

Scenario: A product has been discontinued. Remove its knowledge items from the product catalog collection so they no longer appear in customer-facing AI responses.

{
  "credential_id": "pgvector-prod",
  "operation": "delete",
  "collection_name": "product-catalog",
  "item_id": "product-legacyXR-specs"
}

In practice, use a Loop node to delete multiple items when retiring a set of related knowledge items (e.g. all entries for a discontinued product line). Alternatively, update each item's metadata to status: deprecated and have queries filter on the live status value through metadata_filter (for example status: active), so deprecated items are never returned. This preserves the historical data while hiding it from active queries.
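Both approaches above can be sketched in Python. The helper names are hypothetical, and the second helper assumes metadata_filter performs exact matching on a status key, as the query examples in this document suggest.

```python
def build_delete_payloads(item_ids, collection_name, credential_id):
    """Hard delete: one delete payload per item ID, as a Loop node would issue."""
    return [
        {
            "credential_id": credential_id,
            "operation": "delete",
            "collection_name": collection_name,
            "item_id": item_id,
        }
        for item_id in item_ids
    ]

def with_status_filter(query_payload, live_status="active"):
    """Soft delete: return a copy of a query payload restricted to live items."""
    filtered = dict(query_payload)
    # Merge with any existing filter instead of overwriting it.
    filtered["metadata_filter"] = {
        **query_payload.get("metadata_filter", {}),
        "status": live_status,
    }
    return filtered

deletes = build_delete_payloads(
    ["product-legacyXR-specs", "product-legacyXR-faq"],
    collection_name="product-catalog",
    credential_id="pgvector-prod",
)
query_payload = with_status_filter({
    "credential_id": "pgvector-prod",
    "operation": "query",
    "collection_name": "product-catalog",
    "query": "battery life",
    "metadata_filter": {"product": "legacyXR"},
})
print(len(deletes), query_payload["metadata_filter"])
```

The item ID product-legacyXR-faq is invented for the demonstration. Hard deletes are irreversible, whereas the status filter can be undone by resetting the metadata value.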

Example 7: Bulk Insert from a Workflow Variable Array

Scenario: A document processing workflow parses a PDF into individual sections and inserts each section as a separate knowledge item for finer-grained retrieval.

// Use a Loop node with items: $node.pdfParser.sections
// Each iteration runs FlowRag insert with:
{
  "credential_id": "pgvector-prod",
  "operation": "insert",
  "collection_name": "policy-documents",
  "item_id": "{{ vars.document_id }}-section-{{ loop.index }}",
  "knowledge_item": "{{ loop.item.text }}",
  "metadata": {
    "document_id": "{{ vars.document_id }}",
    "document_title": "{{ vars.document_title }}",
    "section_number": "{{ loop.index }}",
    "section_heading": "{{ loop.item.heading }}",
    "page_number": "{{ loop.item.page }}",
    "ingestion_date": "{{ vars.today }}"
  }
}

Chunk size guidance: Split documents into sections of 200–500 tokens for best retrieval quality. Very long sections dilute the embedding: the vector captures a blended average of all the section's topics, which reduces precision. Very short sections (a single sentence) often lack enough context to be retrieved reliably. Paragraph-level chunks with a descriptive heading stored as metadata tend to work best.
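A rough Python sketch of a chunker that follows this guidance is shown below. It approximates tokens by whitespace-separated words, which is an assumption; real tokenizers usually produce more tokens than words, so the limits may need lowering. The function name chunk_paragraphs is illustrative.

```python
def chunk_paragraphs(text, min_tokens=200, max_tokens=500):
    """Group paragraphs into chunks of roughly min_tokens to max_tokens words."""
    chunks, current, current_len = [], [], 0

    def flush():
        nonlocal current, current_len
        if current:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0

    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        if not words:
            continue
        # Split any paragraph longer than max_tokens into max-sized pieces.
        for start in range(0, len(words), max_tokens):
            piece = words[start:start + max_tokens]
            # Close the current chunk first if this piece would overflow it.
            if current_len + len(piece) > max_tokens:
                flush()
            current.append(" ".join(piece))
            current_len += len(piece)
            # Close the chunk as soon as it reaches the minimum size.
            if current_len >= min_tokens:
                flush()
    # Emit whatever is left; this final chunk may be shorter than min_tokens.
    flush()
    return chunks

sample = "\n\n".join(" ".join(["word"] * 100) for _ in range(5))
chunks = chunk_paragraphs(sample)
sizes = [len(chunk.split()) for chunk in chunks]
print(sizes)
```

Chunks shorter than min_tokens can still occur, either at the end of the document or just before a paragraph that would overflow the current chunk. Each resulting chunk can then be passed to the Loop node as one knowledge item.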

Summary: Operation Selection Guide

When you want to…	Use operation
Add a new document, policy, or FAQ entry	`insert`
Update a changed policy or product description	`update`
Remove a discontinued or superseded entry	`delete`
Find relevant content for an AI prompt	`query`
Check if a document already exists before inserting	`query` with `min_score: 0.95` and the document title as query (a near-duplicate check based on similarity, not an exact match)
Refresh all items in a collection	Loop over items → `update` each