Retrieval-Augmented Generation (RAG)¶
Strutex provides a built-in RAG system that allows you to index documents and perform structured queries against a knowledge base. This is particularly useful for extracting data from large sets of documents or when the answer requires cross-referencing information that doesn't fit in a single LLM context window.
Architecture¶
The RAG system in Strutex is built with three main components:
- Embedding Service: Uses
FastEmbedfor fast, local embedding generation. No API keys are required for embeddings by default. - Vector Store: Uses
Qdrantfor efficient similarity search. It supports in-memory, local disk, and remote Qdrant instances. - RAG Engine: Orchestrates the retrieval and generation flow using
LangGraph.
Installation¶
To use RAG features, install the rag extra:
Basic Usage¶
Ingesting Documents¶
You can ingest any document supported by Strutex into the vector store.
from strutex import DocumentProcessor
processor = DocumentProcessor(provider="gemini")
# Indexing a document
processor.rag_ingest("company_policy.pdf", collection_name="knowledge_base")
Querying with Structured Extraction¶
Once documents are indexed, you can perform queries that return structured data.
from strutex import DocumentProcessor, Object, String
processor = DocumentProcessor(provider="gemini")
schema = Object(properties={
"policy_name": String(),
"max_reimbursement": String()
})
result = processor.rag_query(
query="What is the travel reimbursement policy?",
schema=schema,
collection_name="knowledge_base"
)
print(result)
CLI Usage¶
The CLI provides commands for managing the RAG vector store.
Ingest¶
Query¶
API Usage¶
When running the Strutex server (strutex serve), the following endpoints are available:
POST /rag/ingest: Upload a file to ingest into a collection.POST /rag/query: Perform a RAG query and get structured JSON output.
Configuration¶
The default RAG configuration uses:
- Embeddings:
BAAI/bge-small-en-v1.5(via FastEmbed) - Vector Store: In-memory Qdrant instance
You can customize these by configuring the DocumentProcessor or using explicit service instances.