Production-ready Retrieval-Augmented Generation (RAG) pipelines powered by InterSystems IRIS Vector Search
Build intelligent applications that combine large language models with your enterprise data using battle-tested RAG patterns and native vector search capabilities.
- 🚀 **Production-Ready** - Six proven RAG architectures ready to deploy, not research prototypes
- ⚡ **Blazing Fast** - Native IRIS vector search with HNSW indexing, no external vector databases needed
- 🔧 **Unified API** - Swap between RAG strategies with a single line of code
- 🏢 **Enterprise-Grade** - ACID transactions, connection pooling, and horizontal scaling built-in
- 🎯 **100% Compatible** - Works seamlessly with LangChain, RAGAS, and your existing ML stack
- 🧪 **Fully Validated** - Comprehensive test suite with automated contract validation
| Pipeline Type | Use Case | Retrieval Method | When to Use |
|---|---|---|---|
| `basic` | Standard retrieval | Vector similarity | General Q&A, getting started, baseline comparisons |
| `basic_rerank` | Improved precision | Vector + cross-encoder reranking | Higher accuracy requirements, legal/medical domains |
| `crag` | Self-correcting | Vector + evaluation + web search fallback | Dynamic knowledge, fact-checking, current events |
| `graphrag` | Knowledge graphs | Vector + text + graph + RRF fusion | Complex entity relationships, research, medical knowledge |
| `multi_query_rrf` | Multi-perspective | Query expansion + reciprocal rank fusion | Complex queries, comprehensive coverage needed |
| `pylate_colbert` | Fine-grained matching | ColBERT late interaction embeddings | Nuanced semantic understanding, high precision |
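The "When to Use" column can be distilled into a small helper. This is an illustrative sketch only: the pipeline type strings are the real ones from the table, but `choose_pipeline` and its parameters are hypothetical, not part of the library.

```python
# Hypothetical helper condensing the decision table above into code.
def choose_pipeline(needs_current_events: bool = False,
                    needs_entity_reasoning: bool = False,
                    high_precision: bool = False) -> str:
    """Pick a pipeline type string based on coarse requirements."""
    if needs_current_events:
        return 'crag'           # web-search fallback for dynamic knowledge
    if needs_entity_reasoning:
        return 'graphrag'       # vector + text + graph + RRF fusion
    if high_precision:
        return 'basic_rerank'   # cross-encoder reranking for accuracy
    return 'basic'              # general Q&A baseline

print(choose_pipeline())                           # basic
print(choose_pipeline(needs_current_events=True))  # crag
```

The returned string is exactly what `create_pipeline` accepts, so the helper composes directly with the quick-start code below.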
```bash
# Clone repository
git clone https://github.com/intersystems-community/iris-rag-templates.git
cd iris-rag-templates

# Setup environment (requires uv package manager)
make setup-env
make install
source .venv/bin/activate

# Start IRIS with Docker Compose
docker-compose up -d

# Initialize database schema
make setup-db

# Optional: Load sample medical data
make load-data
```

Configure credentials in a `.env` file:

```bash
cat > .env << 'EOF'
OPENAI_API_KEY=your-key-here
ANTHROPIC_API_KEY=your-key-here  # Optional, for Claude models
IRIS_HOST=localhost
IRIS_PORT=1972
IRIS_NAMESPACE=USER
IRIS_USERNAME=_SYSTEM
IRIS_PASSWORD=SYS
EOF
```

```python
from iris_vector_rag import create_pipeline
from iris_rag.core.models import Document

# Create pipeline with automatic validation
pipeline = create_pipeline('basic', validate_requirements=True)

# Load your documents
docs = [
    Document(
        page_content="RAG combines retrieval with generation for accurate AI responses.",
        metadata={"source": "rag_basics.pdf", "page": 1}
    ),
    Document(
        page_content="Vector search finds semantically similar content using embeddings.",
        metadata={"source": "vector_search.pdf", "page": 5}
    )
]
pipeline.load_documents(documents=docs)

# Query with LLM-generated answer
result = pipeline.query(
    query="What is RAG?",
    top_k=5,
    generate_answer=True
)

print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")
```

Switch RAG strategies with one line - all pipelines share the same interface:
```python
from iris_vector_rag import create_pipeline

# Start with basic
pipeline = create_pipeline('basic')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# Upgrade to basic_rerank for better accuracy
pipeline = create_pipeline('basic_rerank')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# Try graphrag for entity reasoning
pipeline = create_pipeline('graphrag')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# All pipelines return the same response format
print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")
```

100% LangChain & RAGAS compatible responses:
```python
{
    "query": "What is diabetes?",
    "answer": "Diabetes is a chronic metabolic condition...",  # LLM answer
    "retrieved_documents": [Document(...)],                    # LangChain Documents
    "contexts": ["context 1", "context 2"],                    # RAGAS contexts
    "sources": ["medical.pdf p.12", "diabetes.pdf p.3"],       # Source citations
    "execution_time": 0.523,
    "metadata": {
        "num_retrieved": 5,
        "pipeline_type": "basic",
        "retrieval_method": "vector",
        "generated_answer": True,
        "processing_time": 0.523
    }
}
```

Each pipeline uses the same API - just change the pipeline type:
- `basic` - Fast vector similarity search, great for getting started
- `basic_rerank` - Vector + cross-encoder reranking for higher accuracy
- `crag` - Self-correcting with web search fallback for current events
- `graphrag` - Multi-modal: vector + text + knowledge graph fusion
- `multi_query_rrf` - Query expansion with reciprocal rank fusion
- `pylate_colbert` - ColBERT late interaction for fine-grained matching
📖 **Complete Pipeline Guide** → Decision tree, performance comparison, configuration examples
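For readers curious what the fusion step in `multi_query_rrf` and `graphrag` does, here is a minimal, self-contained sketch of reciprocal rank fusion, the standard formula score(d) = Σ 1/(k + rank). This is an illustration of the technique, not the library's internal implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs: score(d) = sum over lists of 1/(k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Two query variants retrieve overlapping results; fusion promotes
# documents that rank well across both lists.
fused = reciprocal_rank_fusion([
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_d", "doc_a"],
])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents appearing near the top of multiple lists (like `doc_b`) win even without topping any single list, which is why RRF helps with multi-perspective queries.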
IRIS provides everything you need in one database:
- ✅ Native vector search (no external vector DB needed)
- ✅ ACID transactions (your data is safe)
- ✅ SQL + NoSQL + Vector in one platform
- ✅ Horizontal scaling and clustering
- ✅ Enterprise-grade security and compliance
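Conceptually, a vector similarity query ranks documents by cosine similarity between embeddings. The brute-force sketch below is purely illustrative: native IRIS vector search with an HNSW index avoids this exhaustive scan, and the toy three-dimensional vectors stand in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vectors, k=2):
    """Exhaustive scan: rank all documents by similarity to the query."""
    ranked = sorted(doc_vectors.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

doc_vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], doc_vectors))  # ['doc_a', 'doc_b']
```

An HNSW index returns (approximately) the same top-k result in logarithmic rather than linear time, which is what makes in-database search viable at scale.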
Automatic concurrency management:
```python
from iris_rag.storage import IRISVectorStore

# Connection pool handles concurrency automatically
store = IRISVectorStore()

# Safe for multi-threaded applications
# Pool manages connections, no manual management needed
```

Database schema created and migrated automatically:
```python
from iris_vector_rag import create_pipeline

pipeline = create_pipeline('basic', validate_requirements=True)
# ✅ Checks database connection
# ✅ Validates schema exists
# ✅ Migrates to latest version if needed
# ✅ Reports validation results
```

Measure your RAG pipeline performance:
```bash
# Evaluate all pipelines on your data
make test-ragas-sample

# Generates detailed metrics:
# - Answer Correctness
# - Faithfulness
# - Context Precision
# - Context Recall
# - Answer Relevance
```

Automatic embedding generation with model caching - eliminates repeated model loading overhead for faster document vectorization.
Key Features:
- ⚡ **Intelligent model caching** - models stay in memory across operations
- 🎯 **Multi-field vectorization** - combine title, abstract, and content fields
- 💾 **Automatic device selection** - GPU, Apple Silicon (MPS), or CPU fallback
```python
from iris_vector_rag import create_pipeline

# Enable IRIS EMBEDDING support
pipeline = create_pipeline(
    'basic',
    embedding_config='medical_embeddings_v1'
)

# Documents auto-vectorize on INSERT
pipeline.load_documents(documents=docs)
```

📖 **Complete IRIS EMBEDDING Guide** → Configuration, performance tuning, multi-field vectorization, troubleshooting
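The caching idea itself is simple to sketch: load an embedding model once, then reuse it across operations. Everything below (`load_model`, `get_model`, the counter) is a toy stand-in for illustration, not the library's API.

```python
# Minimal model-caching sketch: the expensive load happens once,
# later lookups hit the in-memory cache.
_model_cache = {}
load_count = 0

def load_model(name):
    """Stand-in for an expensive loader (disk read + GPU transfer)."""
    global load_count
    load_count += 1
    return f"model:{name}"

def get_model(name):
    """Return a cached model, loading it only on first request."""
    if name not in _model_cache:
        _model_cache[name] = load_model(name)
    return _model_cache[name]

get_model("all-MiniLM-L6-v2")
get_model("all-MiniLM-L6-v2")  # cache hit, no second load
print(load_count)  # 1
```

In a real pipeline this is what "models stay in memory across operations" buys you: vectorizing a second batch of documents skips the model-loading cost entirely.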
Expose RAG pipelines as MCP tools for Claude Desktop and other MCP clients - enables conversational RAG workflows where Claude queries your documents during conversations.
```bash
# Start MCP server
python -m iris_vector_rag.mcp
```

All pipelines available as MCP tools: `rag_basic`, `rag_basic_rerank`, `rag_crag`, `rag_graphrag`, `rag_multi_query_rrf`, `rag_pylate_colbert`.
📖 **Complete MCP Integration Guide** → Claude Desktop setup, configuration, testing, production deployment
Framework-first design with abstract base classes (`RAGPipeline`, `VectorStore`) and concrete implementations for six production-ready pipelines.
Key Components: Core abstractions, pipeline implementations, IRIS vector store, MCP server, REST API, validation framework.
📖 **Comprehensive Architecture Guide** → System design, component interactions, extension points
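As an illustration of the framework-first design, a new pipeline plugs in by subclassing the base abstraction. The sketch below shows the general shape only; the real `RAGPipeline` interface may differ in method names and signatures, and `EchoPipeline` is a hypothetical toy.

```python
from abc import ABC, abstractmethod

class RAGPipeline(ABC):
    """Illustrative base abstraction: every pipeline must implement
    document loading and querying with a shared response shape."""

    @abstractmethod
    def load_documents(self, documents):
        ...

    @abstractmethod
    def query(self, query, top_k=5, generate_answer=True):
        ...

class EchoPipeline(RAGPipeline):
    """Toy implementation that just returns stored documents."""

    def __init__(self):
        self.docs = []

    def load_documents(self, documents):
        self.docs.extend(documents)

    def query(self, query, top_k=5, generate_answer=True):
        return {"query": query, "retrieved_documents": self.docs[:top_k]}

p = EchoPipeline()
p.load_documents(["d1", "d2"])
print(len(p.query("q")["retrieved_documents"]))  # 2
```

Because every concrete pipeline honors the same abstract contract, callers (including the MCP server and REST API) can treat all six strategies interchangeably.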
📚 Comprehensive documentation for every use case:
- User Guide - Complete installation and usage
- API Reference - Detailed API documentation
- Pipeline Guide - When to use each pipeline
- MCP Integration - Model Context Protocol setup
- Production Readiness - Deployment checklist
```bash
make test                 # Run comprehensive test suite
pytest tests/unit/        # Unit tests
pytest tests/integration/ # Integration tests
```

This implementation is based on peer-reviewed research:
- **Basic RAG**: Lewis et al., *Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks*, NeurIPS 2020
- **CRAG**: Yan et al., *Corrective Retrieval Augmented Generation*, arXiv 2024
- **GraphRAG**: Edge et al., *From Local to Global: A Graph RAG Approach to Query-Focused Summarization*, arXiv 2024
- **ColBERT**: Khattab & Zaharia, *ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT*, SIGIR 2020
We welcome contributions! See CONTRIBUTING.md for development setup, testing guidelines, and pull request process.
- 🐛 Issues: GitHub Issues
- 📖 Documentation: Full Documentation
- 🏢 Enterprise Support: InterSystems Support
MIT License - see LICENSE for details.