FlowRag
Retrieval-Augmented Generation — store, update, and semantically query a domain knowledge base to provide contextual AI responses.
What is Retrieval-Augmented Generation (RAG)?
RAG is an AI pattern that enhances language model responses by first retrieving semantically relevant documents from a knowledge store, then including those documents as context in the AI prompt. This allows the AI to answer questions based on your specific, up-to-date business knowledge — rather than relying solely on its training data.
FlowRag handles the knowledge storage and retrieval half of this pattern. It converts text into numerical vector representations (embeddings), stores them in a vector database, and performs fast semantic similarity searches to find the most relevant content for any query.
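The similarity search described above can be illustrated in plain Python. This is a minimal sketch, not FlowRag's implementation: the three-dimensional vectors are hand-written stand-ins for real embeddings, and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, stored):
    """Return (item_id, score) pairs, most similar first."""
    scored = [(item_id, cosine_similarity(query_vec, vec))
              for item_id, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy "embeddings" (assumed values, real models use hundreds of dimensions)
stored = {
    "policy-pto": [0.9, 0.1, 0.0],
    "policy-expenses": [0.1, 0.9, 0.0],
    "faq-shipping": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

ranking = rank_by_similarity(query_vec, stored)
for item_id, score in ranking:
    print(f"{item_id}: {score:.3f}")
```

The item whose vector points in nearly the same direction as the query vector ranks first, regardless of which exact words the original texts used.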
Supported Operations
| Operation | Description |
|---|---|
| insert | Add a new knowledge item to a collection. The text is embedded and stored with optional metadata tags. |
| update | Update an existing knowledge item by ID: replace its text, re-embed it, and update metadata. |
| delete | Remove a knowledge item from a collection by its ID. |
| query | Perform a semantic similarity search across a collection using a natural language query. Returns the top-K most relevant items above a similarity threshold. |
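
To make the semantics of the four operations concrete, here is a toy in-memory collection. Everything in it is an assumption for illustration: the `ToyCollection` class, its method signatures, and the `embed` stub are hypothetical, and the stub only matches shared words, whereas a real embedding model captures meaning. The actual node is driven by the properties described in the Configuration guide.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class ToyCollection:
    """Hypothetical in-memory collection mirroring the four operations."""

    def __init__(self):
        self.items = {}  # item_id -> {"text", "vector", "metadata"}

    def insert(self, item_id, text, metadata=None):
        if item_id in self.items:
            raise KeyError(f"{item_id} already exists; use update")
        self.items[item_id] = {"text": text, "vector": embed(text),
                               "metadata": metadata or {}}

    def update(self, item_id, text, metadata=None):
        existing = self.items[item_id]  # raises KeyError if the ID is unknown
        new_metadata = existing["metadata"] if metadata is None else metadata
        # Replacing the text means re-embedding it.
        self.items[item_id] = {"text": text, "vector": embed(text),
                               "metadata": new_metadata}

    def delete(self, item_id):
        del self.items[item_id]

    def query(self, question, top_k=3, min_score=0.0):
        q = embed(question)
        scored = [{"item_id": i, "text": it["text"],
                   "score": cosine(q, it["vector"]), "metadata": it["metadata"]}
                  for i, it in self.items.items()]
        hits = [r for r in scored if r["score"] >= min_score]
        return sorted(hits, key=lambda r: r["score"], reverse=True)[:top_k]

kb = ToyCollection()
kb.insert("policy-pto", "employees accrue paid time off monthly", {"category": "hr"})
kb.insert("policy-vpn", "connect to the vpn before accessing internal tools", {"category": "it"})
kb.update("policy-pto", "employees accrue paid time off every month")
kb.delete("policy-vpn")

results = kb.query("how do employees accrue time off", top_k=1, min_score=0.2)
print(results[0]["item_id"], round(results[0]["score"], 3))
```

Note that `query` applies the threshold before truncating to `top_k`, so a query with no sufficiently similar item returns an empty list rather than weak matches.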
Key Capabilities
- Automatic embedding generation — text is converted to high-dimensional vectors using a configured embedding model
- Semantic similarity search — find contextually related content even when exact keywords differ
- Top-K retrieval with configurable similarity threshold (min_score)
- Metadata filtering by tags, categories, and custom key-value pairs
- Structured output — each result includes text, similarity score, and all stored metadata
- Supports PostgreSQL with pgvector extension and Qdrant vector database
- Collection-based organisation — maintain separate knowledge bases per domain or application
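
The interaction between the similarity threshold, top-K, and metadata filtering can be sketched as follows. The result shape (`text`, `score`, `metadata`) follows the structured-output bullet above, but the exact field names, the `filter_results` helper, and the choice to filter after scoring are assumptions made for this example.

```python
def matches(metadata, required):
    """True when every required key is present with the required value."""
    return all(metadata.get(key) == value for key, value in required.items())

def filter_results(results, required, min_score=0.0, top_k=5):
    """Keep results above min_score that match the metadata, best first."""
    kept = [r for r in results
            if r["score"] >= min_score and matches(r["metadata"], required)]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:top_k]

# Made-up candidate results for illustration
candidates = [
    {"text": "PTO accrues monthly.", "score": 0.91,
     "metadata": {"category": "hr", "status": "active"}},
    {"text": "Old PTO policy.", "score": 0.88,
     "metadata": {"category": "hr", "status": "deprecated"}},
    {"text": "VPN setup steps.", "score": 0.42,
     "metadata": {"category": "it", "status": "active"}},
]

active_hr = filter_results(candidates, {"category": "hr", "status": "active"}, min_score=0.5)
print([r["text"] for r in active_hr])
```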
Business Benefits
Domain-Specific AI Responses
Pre-load your business's policies, procedures, and product knowledge into FlowRag. When a customer asks a question, the query operation retrieves the relevant context and passes it to an LLM node — delivering accurate, company-specific answers rather than generic AI responses.
Knowledge Base Maintenance
Keep your AI knowledge current. When policies change or new products are added, use the update operation to refresh the corresponding knowledge items. Your AI-powered workflows will immediately reflect the latest information without re-training a model.
Scalable Document Retrieval
Replace manual document search with semantic similarity queries. A support agent workflow can retrieve the most relevant contract clauses, troubleshooting articles, or compliance requirements in milliseconds — dramatically reducing resolution time.
Multi-Collection Organisation
Organise knowledge into separate collections: one for HR policies, one for product documentation, one for legal contracts. Query only the relevant collection in each workflow context, improving retrieval precision and response quality.
Use Cases
Company Policy Q&A Bot
Build an employee self-service bot. Insert all HR, IT, and operations policies into FlowRag. When an employee asks a question, query the relevant collection and pass the results to a Claude/GPT node to generate a clear, policy-accurate answer.
Customer Support FAQ
Pre-load product FAQs and troubleshooting guides. When a support ticket arrives, use FlowRag to retrieve the three most relevant articles, then generate a personalised response using an AI node — with citations to the source articles.
Legal Document Retrieval
Index contract library contents. Lawyers and contract managers query FlowRag to find clauses, precedents, and similar contract language across thousands of documents in seconds.
Technical Documentation Search
Index API documentation, runbooks, and architecture decisions. Development teams query FlowRag to get contextually relevant technical guidance as part of automated ticket resolution workflows.
Product Recommendation
Store product descriptions, features, and customer use cases. When a customer describes their needs, semantically match to the most relevant products and surface personalised recommendations.
In This Guide
Configuration
Properties for all four operations: insert, update, delete, and query with scoring parameters.
Input & Output
Output ports, query result schema, and example retrieval output objects.
Examples
Five examples: policy insert, semantic query, support FAQ, metadata filter, and RAG pipeline.
Choosing a Vector Database Backend
| Backend | Best For | Strengths | Considerations |
|---|---|---|---|
| PostgreSQL + pgvector | Teams already running PostgreSQL | Single database, SQL + vector in one system, familiar operational model, ACID transactions | Requires pgvector extension; performance degrades at very large scale (>10M vectors) |
| Qdrant | High-scale, high-performance RAG | Purpose-built, excellent performance at scale, rich payload filtering, REST + gRPC APIs, cloud and self-hosted options | Additional infrastructure to manage; separate from application database |
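
For a sense of what "SQL + vector in one system" looks like on the PostgreSQL backend, the sketch below builds a cosine-similarity query. pgvector's `<=>` operator returns cosine distance, so similarity is one minus that value. The table name `knowledge_items` and its columns are hypothetical (FlowRag's actual schema is not documented here), and the `%s` placeholders follow the psycopg convention.

```python
def build_pgvector_query(collection, query_vector, top_k=5, min_score=0.7):
    """Return (sql, params) for a top-K cosine-similarity search.

    Table and column names are hypothetical placeholders.
    """
    # pgvector accepts a vector as a bracketed text literal, e.g. '[0.1,0.2]'
    vector_literal = "[" + ",".join(str(float(x)) for x in query_vector) + "]"
    sql = (
        "SELECT item_id, text, metadata, "
        "1 - (embedding <=> %s::vector) AS score "
        "FROM knowledge_items "
        "WHERE collection = %s "
        "AND 1 - (embedding <=> %s::vector) >= %s "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    params = (vector_literal, collection, vector_literal,
              min_score, vector_literal, top_k)
    return sql, params

sql, params = build_pgvector_query("hr-policies", [0.1, 0.2, 0.3], top_k=3)
print(sql)
print(params)
```

Ordering by the raw distance expression, rather than by the computed `score` alias, is what allows PostgreSQL to use a pgvector index for the search.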
Embedding Model Selection
| Model | Provider | Dimensions | Best For |
|---|---|---|---|
| text-embedding-3-small | OpenAI | 1536 | Default; excellent balance of quality and cost for general business content |
| text-embedding-3-large | OpenAI | 3072 | Highest accuracy for technical, legal, or scientific content where precision matters |
| text-embedding-ada-002 | OpenAI (legacy) | 1536 | Legacy compatibility only; prefer text-embedding-3-small for new collections |
| voyage-large-2 | Voyage AI | 1536 | Legal and financial document retrieval with dense text; benchmark against the OpenAI models on your own documents before committing |
The embedding model is configured per collection at the BizFirst workspace level. All items in a collection must use the same model. Each model maps text into its own vector space, so comparing vectors from different models produces invalid similarity scores, even when the dimensions match. To change the model, delete all items and re-insert them with the new model.
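The one-model-per-collection rule can be expressed as a small pre-insert check. This is a sketch under assumptions: `find_problems` is a hypothetical helper, not a FlowRag function, and the dimensions are taken from the table above.

```python
# Dimensions as listed in the model table above
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
    "voyage-large-2": 1536,
}

def find_problems(collection_model, item_model, vector):
    """Return a list of reasons the vector should not join the collection."""
    problems = []
    if item_model != collection_model:
        # Checked by name: several models share 1536 dimensions,
        # so a length check alone would not catch the mix-up.
        problems.append(f"model mismatch: {item_model} vs {collection_model}")
    expected = MODEL_DIMENSIONS.get(collection_model)
    if expected is None:
        problems.append(f"unknown model: {collection_model}")
    elif len(vector) != expected:
        problems.append(f"dimension mismatch: got {len(vector)}, expected {expected}")
    return problems

ok = find_problems("text-embedding-3-small", "text-embedding-3-small", [0.0] * 1536)
mixed = find_problems("text-embedding-3-small", "text-embedding-ada-002", [0.0] * 1536)
print(ok, mixed)
```

The second call shows why the model name matters: the vector has the right length, yet it still belongs to a different vector space.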
Knowledge Base Design Tips
- Split long documents into focused, single-topic paragraphs of 200–500 tokens before inserting
- Use descriptive item_id values (e.g. policy-pto-2025-v3) to enable targeted updates without a full re-index
- Store rich metadata tags (category, source, date, version, status) to enable precise metadata filtering in queries
- Use a status metadata field (active/deprecated) to hide outdated content without deleting it, preserving history
- Test query quality with real user questions before going live, and adjust min_score and top_k based on observed results
- Maintain separate collections per domain (HR, products, legal, support) for better precision and cost control
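
The first three tips can be combined into a small pre-processing step that runs before insert. This is a rough sketch: `chunk_document` is a hypothetical helper, tokens are approximated by whitespace-separated words (real tokenizers count differently), and the `-part-NN` suffix is just one possible naming scheme.

```python
def chunk_document(doc_id, text, max_tokens=500):
    """Split a document into paragraph-based items ready for insertion."""
    pieces = []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        # Oversized paragraphs are cut at word boundaries; short ones stay whole.
        for start in range(0, len(words), max_tokens):
            pieces.append(" ".join(words[start:start + max_tokens]))
    return [
        {
            "item_id": f"{doc_id}-part-{n:02d}",
            "text": piece,
            "metadata": {"source": doc_id, "status": "active"},
        }
        for n, piece in enumerate(pieces, start=1)
    ]

document = "alpha beta gamma\n\n" + " ".join(["word"] * 7)
items = chunk_document("policy-pto-2025-v3", document, max_tokens=3)
for item in items:
    print(item["item_id"], "->", item["text"])
```

The tiny `max_tokens=3` is only there to make the splitting visible; a production value would sit in the 200–500 range suggested above.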
Common RAG Pipeline Pattern
The most common FlowRag workflow pattern follows these steps:
- Trigger: user submits a question (WebhookTrigger, FormTrigger, or ChatTrigger)
- FlowRag query: embed the question and retrieve the top-K most relevant knowledge items from the appropriate collection
- IfCondition: check result_count > 0 to decide whether to use retrieved context or fall back to a generic answer
- DataMapping: format the retrieved context into a prompt template combining question + context + instructions
- AI Chat or HTTP Request: send the formatted prompt to Claude or GPT and generate a grounded response
- Response: return the AI-generated answer to the user, optionally with citations from sources
This pattern grounds AI responses in your actual business knowledge rather than the model's general training data, which reduces hallucination on domain-specific questions.
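The control flow of this pattern can be sketched in a few lines of Python. The sketch is illustrative only: in a real workflow each step is a node rather than a function, the query output is assumed to expose `result_count`, `context_text`, and `sources` as a flat object, and `fake_query` and `fake_generate` are stubs with made-up content standing in for the FlowRag query and the LLM call.

```python
def build_prompt(question, context_text):
    """DataMapping step: combine instructions, context, and question."""
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context_text}\n\nQuestion: {question}"
    )

def run_pipeline(question, query_fn, generate_fn):
    retrieval = query_fn(question)                # FlowRag query step
    if retrieval["result_count"] > 0:             # IfCondition step
        prompt = build_prompt(question, retrieval["context_text"])
        sources = retrieval["sources"]
    else:
        prompt = question                         # fallback: no retrieved context
        sources = []
    return {"answer": generate_fn(prompt), "sources": sources}

def fake_query(question):
    """Stub retrieval with made-up policy text."""
    if "pto" in question.lower():
        return {"result_count": 1,
                "context_text": "Employees accrue PTO every month.",
                "sources": ["policy-pto-2025-v3"]}
    return {"result_count": 0, "context_text": "", "sources": []}

def fake_generate(prompt):
    """Stub LLM call that echoes the prompt it received."""
    return "ECHO: " + prompt

grounded = run_pipeline("How much PTO do I get?", fake_query, fake_generate)
fallback = run_pipeline("What is the weather?", fake_query, fake_generate)
print(grounded["sources"], fallback["sources"])
```

Returning `sources` alongside the answer is what makes citations possible in the final Response step.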
Combine with FlowAiAgent for Autonomous RAG
Register a FlowRag query as a tool in the BizFirst Tool Registry, then give it to a FlowAiAgent node. The agent will autonomously decide when to query the knowledge base — calling it multiple times with different queries if the first retrieval doesn't fully answer the goal. This pattern enables knowledge-augmented agents that can both reason and retrieve, without you needing to hard-code which knowledge base to query at each step.
Related Nodes
| Node | Relationship to FlowRag |
|---|---|
| FlowAiAgent | Use FlowRag as a registered tool inside the agent's reasoning loop for autonomous knowledge retrieval |
| Chat | Pass FlowRag context_text as the system or user context to ground chat responses in domain knowledge |
| Loop | Loop over a document array and call FlowRag insert on each item for bulk knowledge base ingestion |
| IfCondition | Check result_count > 0 after a query to branch between "answer found" and "fallback" paths |
| DataMapping | Format retrieved results into a structured AI prompt before passing to a Chat or HTTP Request node |