RAG
A Rust library and CLI for Retrieval-Augmented Generation (RAG) that combines vector similarity, graph structure, and search-style retrieval rather than embeddings alone. Dense vectors cover semantic match, a knowledge graph encodes entities and relations, and configurable top-k plus metadata filtering make retrieval behave like a search layer over your corpus.
Project docs: SPEC.md (scope and requirements), ARCHITECTURE.md (modules and data flow), TODO.md (backlog).
Features
- Pure Rust implementation with async/await support
- Vector RAG: multiple embedding backends (OpenAI, Ollama), pluggable indexes and distance metrics (cosine, Euclidean, dot product, Manhattan)
- Graph RAG: graph store for nodes and edges, entity extraction hooks, and a GraphRagEngine that ties documents, vectors, and the graph together
- In-memory vector stores with parallel batch search (InMemoryVectorStore, MinimalVectorDB)
- Search-oriented retrieval: configurable top-k, score-ranked results, and metadata filtering over stored chunks
- Ingestion helpers: Source implementations for PDF, codebase trees, and wiki-style URLs (ingestion module)
- Multiple text chunking strategies (fixed-size, paragraph, sentence)
- CLI for ingest and query with persistent state (RAG_STATE_DIR, default .rag): vector, hybrid-query (BM25 + embeddings), and graph subcommands
- MCP server (rag-mcp) with vector tools (rag_*) and graph or hybrid tools (graph_*)
- Library API suitable for custom pipelines
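To make the four distance metrics named in the feature list concrete, here is a minimal standalone sketch of each. It is not the crate's implementation: the function names are hypothetical, and the code assumes both vectors have the same length (extra elements in the longer one are ignored).

```rust
/// Dot product of two vectors (assumed to be the same length).
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity in [-1, 1]; returns 0.0 if either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}

/// Euclidean (L2) distance: square root of the summed squared differences.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Manhattan (L1) distance: sum of absolute differences.
pub fn manhattan_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}
```

Cosine and dot product are similarities (higher means closer), while Euclidean and Manhattan are distances (lower means closer), so a ranking step has to sort in the matching direction.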
Installation
From source
As a library
Add to your Cargo.toml:
[]
= { = "https://github.com/yingkitw/rag" }
Quick Start
State for the CLI lives under RAG_STATE_DIR (default .rag): vectors.json, optional graph.json and graph_rag.json.
CLI Usage
# Set your API key (OpenAI) or use Ollama
# Optional when using Ollama for CLI or rag-mcp-server:
# Add a document (persists chunks to $RAG_STATE_DIR/vectors.json)
# Vector-only query
# Vector + BM25 hybrid (alpha = vector weight in [0,1])
# Graph stats from a saved graph file
# Build GraphRAG snapshot from a file (writes graph_rag.json + graph.json)
# Query using saved GraphRAG snapshot
# List documents
# Count documents
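The hybrid query mode weights the vector score by alpha in [0,1]. One common way to combine two signals like this is a linear blend of normalized scores; the sketch below assumes that form and min-max normalization, neither of which this README confirms for the CLI. The function names are hypothetical.

```rust
/// Blend a vector-similarity score with a BM25 score.
/// `alpha` is the vector weight; values outside [0, 1] are clamped.
pub fn hybrid_score(vector_score: f32, bm25_score: f32, alpha: f32) -> f32 {
    let alpha = alpha.clamp(0.0, 1.0);
    alpha * vector_score + (1.0 - alpha) * bm25_score
}

/// Min-max normalize scores into [0, 1] so the two signals are comparable.
/// Returns all zeros when every score is identical.
pub fn min_max_normalize(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    if scores.is_empty() || max - min == 0.0 {
        return vec![0.0; scores.len()];
    }
    scores.iter().map(|s| (s - min) / (max - min)).collect()
}
```

Under this assumption, alpha = 1.0 reduces to a vector-only ranking and alpha = 0.0 to a BM25-only ranking.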
Library Usage
use ;
async
Examples
See the examples/ directory, for example:
Configuration
Environment Variables
- OPENAI_API_KEY: Your OpenAI API key (optional; if unset, embeddings use Ollama)
- OLLAMA_URL: Ollama server URL (default: http://localhost:11434)
- OLLAMA_MODEL: Embedding model when using Ollama (CLI, rag-mcp-server, and examples; default: nomic-embed-text)
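The fallback described by these variables can be sketched as a small selection function. This is an illustration of the documented behavior, not the crate's code: the Backend enum and select_backend are hypothetical names, and treating an empty OPENAI_API_KEY as unset is an assumption.

```rust
/// Embedding backend chosen from the environment (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Backend {
    OpenAi { api_key: String },
    Ollama { url: String, model: String },
}

/// Pick a backend from a variable lookup, e.g. `|k| std::env::var(k).ok()`.
/// Taking the lookup as a parameter keeps the function testable.
pub fn select_backend(get: impl Fn(&str) -> Option<String>) -> Backend {
    // Assumption: an empty key is treated the same as an unset one.
    match get("OPENAI_API_KEY").filter(|k| !k.is_empty()) {
        Some(api_key) => Backend::OpenAi { api_key },
        None => Backend::Ollama {
            // Defaults taken from the variable list above.
            url: get("OLLAMA_URL").unwrap_or_else(|| "http://localhost:11434".to_string()),
            model: get("OLLAMA_MODEL").unwrap_or_else(|| "nomic-embed-text".to_string()),
        },
    }
}
```

In a real binary the call would be `select_backend(|k| std::env::var(k).ok())`.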
MCP server
Run the stdio MCP server (for clients that spawn the process):
# or rely on Ollama + OLLAMA_URL / OLLAMA_MODEL
Vector tools: rag_add_document, rag_query, rag_list_documents, rag_count. Graph and hybrid tools: graph_build, graph_query, graph_get_entity, graph_get_neighbors, graph_info, graph_communities.
Chunking Strategies
- FixedSizeChunker: Splits text into chunks of fixed size with overlap
- ParagraphChunker: Splits text by paragraphs (double newlines)
- SentenceChunker: Splits text by sentences
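As an illustration of the first two strategies, the sketch below implements fixed-size chunking with overlap and paragraph splitting as free functions. It assumes sizes are measured in characters (the crate may count bytes or tokens instead), and chunk_fixed and chunk_paragraphs are hypothetical names, not the crate's API.

```rust
/// Split `text` into chunks of at most `size` characters, where each chunk
/// starts with the last `overlap` characters of the previous one.
pub fn chunk_fixed(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "need 0 <= overlap < size");
    // Collect chars so slicing never splits a multi-byte UTF-8 sequence.
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}

/// Split on blank lines (double newlines), dropping empty paragraphs.
pub fn chunk_paragraphs(text: &str) -> Vec<String> {
    text.split("\n\n")
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(String::from)
        .collect()
}
```

With size 4 and overlap 1, each chunk advances by 3 characters, so the boundary character is shared between neighboring chunks and context is not cut cleanly at the seam.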
Embedding Models
OpenAI
let model = new;
let model = with_model;
Ollama
let model = new;
let model = new
.with_base_url;
API Reference
Core Types
- EmbeddingModel: Trait for embedding models
- VectorStore: Trait for vector storage backends
- Retriever: Main interface for vector-centric RAG operations
- GraphStore, GraphNode, GraphEdge: Graph storage and structure for graph-augmented retrieval
- GraphRagEngine, EntityExtractor: Orchestration and entity linking for graph RAG
- Source, ExtractedDocument: Ingestion from PDF, codebase, wiki, and other sources
- Document: Represents a stored document with content, metadata, and optional embedding
- TextChunker: Trait for text chunking strategies
- RagMcpServer: MCP tool router combining vector store and graph (see mcp module)
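To show the kind of structure a graph store holds, here is a minimal adjacency-list sketch with labeled edges and a one-hop neighbor lookup. TinyGraph and its methods are hypothetical and much simpler than GraphStore, GraphNode, and GraphEdge; the sketch only illustrates the idea of nodes connected by relations.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical in-memory graph: entity ids mapped to labeled outgoing edges.
#[derive(Default)]
pub struct TinyGraph {
    // from -> [(relation, to)]
    edges: HashMap<String, Vec<(String, String)>>,
}

impl TinyGraph {
    /// Add a directed, labeled edge; both endpoints are created implicitly.
    pub fn add_edge(&mut self, from: &str, relation: &str, to: &str) {
        self.edges
            .entry(from.to_string())
            .or_default()
            .push((relation.to_string(), to.to_string()));
        self.edges.entry(to.to_string()).or_default();
    }

    /// Ids reachable over one outgoing edge, sorted and deduplicated.
    /// Unknown ids yield an empty list.
    pub fn neighbors(&self, id: &str) -> Vec<String> {
        let set: BTreeSet<&String> = self
            .edges
            .get(id)
            .into_iter()
            .flatten()
            .map(|(_, to)| to)
            .collect();
        set.into_iter().cloned().collect()
    }
}
```

A graph-augmented retriever could use a lookup like this to pull in chunks linked to entities found in the query, in addition to the nearest vectors.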
Retriever Methods
- add_document(content): Add a single document
- add_document_with_metadata(content, metadata): Add a document with metadata
- retrieve(query): Retrieve relevant chunks
- retrieve_with_scores(query): Retrieve chunks with similarity scores
- retrieve_filtered(query, metadata_filter): Retrieve with metadata filtering
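The combination of metadata filtering, score ranking, and top-k can be sketched independently of the Retriever itself. The example below assumes chunks already carry a relevance score and that a filter matches when every key/value pair is present in the chunk's metadata; ScoredChunk and filter_top_k are hypothetical names, and the real method signatures may differ.

```rust
use std::collections::HashMap;

/// A stored chunk with a precomputed relevance score (hypothetical type).
#[derive(Debug, Clone)]
pub struct ScoredChunk {
    pub content: String,
    pub metadata: HashMap<String, String>,
    pub score: f32,
}

/// Keep chunks whose metadata contains every (key, value) pair in `filter`,
/// then return the `top_k` highest-scoring ones in descending order.
pub fn filter_top_k(
    chunks: &[ScoredChunk],
    filter: &HashMap<String, String>,
    top_k: usize,
) -> Vec<ScoredChunk> {
    let mut hits: Vec<ScoredChunk> = chunks
        .iter()
        .filter(|c| filter.iter().all(|(k, v)| c.metadata.get(k) == Some(v)))
        .cloned()
        .collect();
    // total_cmp gives a total order over f32, so NaN scores cannot panic the sort.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits.truncate(top_k);
    hits
}
```

An empty filter matches every chunk, so the same function also covers plain top-k retrieval.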
Development
Run tests:
Run examples:
License
Apache-2.0
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.