# adk-rag

Give your AI agents a knowledge base. adk-rag adds Retrieval-Augmented Generation (RAG) to ADK-Rust so your agents can search documents and answer questions using your own data.

The adk-rag crate provides Retrieval-Augmented Generation capabilities for the ADK-Rust workspace. It offers a modular, trait-based architecture for document chunking, embedding generation, vector storage, similarity search, reranking, and agentic retrieval. The crate follows the ADK-Rust conventions of feature-gated backends, async-trait interfaces, and builder-pattern configuration. It integrates with existing ADK crates (adk-gemini for embeddings, adk-core for the Tool trait) and supports multiple vector store backends (in-memory, Qdrant, LanceDB, pgvector, SurrealDB).
## What is RAG?
RAG stands for Retrieval-Augmented Generation. Instead of relying only on what an LLM was trained on, RAG lets your agent look up relevant information from your documents before answering:
- Ingest — Documents are split into chunks, converted to vector embeddings, and stored
- Query — A user question is embedded and matched against stored chunks
- Generate — The most relevant chunks are passed to the LLM as context
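The query step can be illustrated without any crate at all: embeddings are plain `Vec<f32>`, and a "match" is just high cosine similarity between the query vector and each stored chunk vector. A toy sketch with hand-made vectors (not real embeddings):

```rust
// Cosine similarity: dot product of the vectors divided by the
// product of their lengths. Ranges from -1 (opposite) to 1 (identical direction).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    // Pretend these 3-dimensional vectors came from an embedding model
    let query = vec![1.0, 0.0, 1.0];
    let chunks = vec![
        ("refund policy chunk", vec![0.9, 0.1, 0.8]),
        ("office hours chunk", vec![0.0, 1.0, 0.1]),
    ];

    // Score every stored chunk against the query, highest first
    let mut scored: Vec<_> = chunks
        .iter()
        .map(|(name, v)| (*name, cosine(&query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());

    println!("best match: {} ({:.3})", scored[0].0, scored[0].1);
}
```

Real pipelines do the same thing at thousands of dimensions, with the vector store handling the scoring and sorting.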
## Quick Start

The fastest way to get a working RAG pipeline. Uses Gemini for embeddings (free API key from Google AI Studio).

```toml
[dependencies]
adk-rag = { version = "0.4", features = ["gemini"] }
tokio = { version = "1", features = ["full"] }
```

A minimal end-to-end sketch (constructor and method signatures shown here are illustrative; see the API Reference below):

```rust
use std::sync::Arc;
use adk_rag::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Gemini embeddings; reads the key from GOOGLE_API_KEY
    let provider = Arc::new(GeminiEmbeddingProvider::new(std::env::var("GOOGLE_API_KEY")?)?);

    // In-memory store and recursive chunker: no external services needed
    let pipeline = RagPipeline::builder()
        .embedding_provider(provider)
        .vector_store(Arc::new(InMemoryVectorStore::new()))
        .chunker(Arc::new(RecursiveChunker::new(512, 100)))
        .build()?;

    // Ingest one document, then ask a question against it
    pipeline.create_collection("docs").await?;
    pipeline
        .ingest("docs", Document::new("doc-1", "Refunds are accepted within 30 days."))
        .await?;

    let results = pipeline.query("docs", "What is the refund window?").await?;
    for r in results {
        println!("{:.3}  {}", r.score, r.chunk.content);
    }
    Ok(())
}
```

Set your API key before running:

```bash
GOOGLE_API_KEY=your-key-here
```
## Agent with RAG Tool

The practical use case: an agent that searches your knowledge base to answer questions. The agent decides when to call rag_search and uses the retrieved context to generate answers.

```toml
[dependencies]
adk-rag = { version = "0.4", features = ["gemini"] }
adk-core = "0.4"
adk-gemini = "0.4"
adk-agent = "0.4"
tokio = { version = "1", features = ["full"] }
```

A sketch of the wiring (crate paths, model names, and builder methods are illustrative):

```rust
use std::sync::Arc;
use adk_agent::LlmAgentBuilder; // crate path illustrative
use adk_gemini::GeminiModel;
use adk_rag::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build the RAG pipeline as in the Quick Start
    let provider = Arc::new(GeminiEmbeddingProvider::new(std::env::var("GOOGLE_API_KEY")?)?);
    let pipeline = RagPipeline::builder()
        .embedding_provider(provider)
        .vector_store(Arc::new(InMemoryVectorStore::new()))
        .chunker(Arc::new(RecursiveChunker::new(512, 100)))
        .build()?;

    // Wrap the pipeline as a tool and hand it to an LLM agent
    let agent = LlmAgentBuilder::new("support-agent")
        .model(GeminiModel::default())
        .tool(Arc::new(RagTool::new(Arc::new(pipeline))))
        .build()?;

    // ...run the agent with your runner of choice
    Ok(())
}
```

The agent will automatically call rag_search when it needs information from your knowledge base.
## How It Works

adk-rag is built from four pluggable components:

```text
Documents → [Chunker] → [EmbeddingProvider] → [VectorStore]
                                                    ↓
Query → [EmbeddingProvider] → [VectorStore search] → [Reranker] → Results
```
| Component | What it does | Built-in options |
|---|---|---|
| Chunker | Splits documents into smaller pieces | FixedSizeChunker, RecursiveChunker, MarkdownChunker |
| EmbeddingProvider | Converts text to vector embeddings | GeminiEmbeddingProvider¹, OpenAIEmbeddingProvider² |
| VectorStore | Stores and searches embeddings | InMemoryVectorStore, QdrantVectorStore³, LanceDBVectorStore⁴, PgVectorStore⁵, SurrealVectorStore⁶ |
| Reranker | Re-scores results after search | NoOpReranker (default), or write your own |
¹ `gemini` feature · ² `openai` feature · ³ `qdrant` feature · ⁴ `lancedb` feature · ⁵ `pgvector` feature · ⁶ `surrealdb` feature

The `RagPipeline` wires these together. The `RagTool` wraps the pipeline as an `adk_core::Tool` so any ADK agent can call it.
## Embedding Providers

### Gemini (recommended)

Uses Google's gemini-embedding-001 model (3072 dimensions). Free tier available.

```toml
adk-rag = { version = "0.4", features = ["gemini"] }
```

```rust
// Constructor shown as a sketch; the exact signature may differ
let provider = GeminiEmbeddingProvider::new(std::env::var("GOOGLE_API_KEY")?)?;
```
### OpenAI

Uses text-embedding-3-small (1536 dimensions) by default. Supports dimension truncation via Matryoshka.

```toml
adk-rag = { version = "0.4", features = ["openai"] }
```

```rust
// Constructors and builder methods shown as a sketch; exact signatures may differ

// Default model
let provider = OpenAIEmbeddingProvider::new(api_key)?;

// Or read from OPENAI_API_KEY env var
let provider = OpenAIEmbeddingProvider::from_env()?;

// With a different model and custom dimensions (example values)
let provider = OpenAIEmbeddingProvider::new(api_key)?
    .with_model("text-embedding-3-large")
    .with_dimensions(1024);
```
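Matryoshka-trained embeddings can be truncated to fewer dimensions with little quality loss: conceptually, keep the first N components and re-normalize so cosine similarity still behaves. A std-only sketch of that idea (the real truncation happens in the embedding API when you request custom dimensions; this is not part of the crate):

```rust
// Keep the first `dims` components of an embedding, then re-normalize
// to unit length. Illustration of Matryoshka-style truncation only.
fn truncate_embedding(v: &[f32], dims: usize) -> Vec<f32> {
    let mut t: Vec<f32> = v[..dims.min(v.len())].to_vec();
    let norm = t.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut t {
            *x /= norm;
        }
    }
    t
}

fn main() {
    let full = vec![0.6, 0.8, 0.1, 0.05];
    let short = truncate_embedding(&full, 2);
    println!("{short:?}"); // first two components, re-normalized
}
```

Smaller vectors mean cheaper storage and faster search, at the cost of some retrieval quality.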
### Custom Embedding Provider

Implement the EmbeddingProvider trait to use any embedding model: a local model, a different API, or a mock for testing. A sketch (the trait's method names and signatures are illustrative; check the crate docs):

```rust
use async_trait::async_trait;
use adk_rag::EmbeddingProvider; // import path illustrative

struct MyEmbedder {
    dims: usize,
}

#[async_trait]
impl EmbeddingProvider for MyEmbedder {
    // Method names/signatures are illustrative
    async fn embed(&self, text: &str) -> adk_rag::Result<Vec<f32>> {
        // Call your local model or remote API here
        let _ = text;
        Ok(vec![0.0; self.dims])
    }

    fn dimensions(&self) -> usize {
        self.dims
    }
}
```

Add async-trait = "0.1" to your Cargo.toml when implementing traits.
## Vector Stores

Constructor arguments below (URLs, paths, connection strings) are illustrative sketches; check the crate docs for exact signatures.

### InMemoryVectorStore (default)

No external dependencies. Good for development, testing, and small datasets. Data is lost when the process exits.

```rust
let store = InMemoryVectorStore::new();
```

### Qdrant

Production-ready vector database with filtering, snapshots, and clustering.

```toml
adk-rag = { version = "0.4", features = ["qdrant"] }
```

```rust
let store = QdrantVectorStore::new("http://localhost:6334").await?; // URL illustrative
```

### LanceDB

Embedded vector database with no server required. Data persists to disk.

```toml
adk-rag = { version = "0.4", features = ["lancedb"] }
```

Requires `protoc` installed: `brew install protobuf` (macOS), `apt install protobuf-compiler` (Ubuntu).

```rust
let store = LanceDBVectorStore::new("./data/lancedb").await?; // path illustrative
```

### pgvector (PostgreSQL)

Use your existing PostgreSQL database for vector search.

```toml
adk-rag = { version = "0.4", features = ["pgvector"] }
```

```rust
let store = PgVectorStore::new("postgres://user:pass@localhost/mydb").await?; // connection string illustrative
```

### SurrealDB

Embedded or remote multi-model database with built-in vector search.

```toml
adk-rag = { version = "0.4", features = ["surrealdb"] }
```

```rust
let store = SurrealVectorStore::new_memory().await?;
// or
let store = SurrealVectorStore::new_rocksdb("./data/surreal").await?; // path illustrative
```
## Choosing a Chunker

| Chunker | Best for | How it splits |
|---|---|---|
| `FixedSizeChunker` | General text, logs | Every N characters with overlap |
| `RecursiveChunker` | Articles, docs, code | Paragraphs → sentences → words (natural boundaries) |
| `MarkdownChunker` | Markdown files, READMEs | By headers, preserving section hierarchy in metadata |
Constructor arguments shown are illustrative:

```rust
// Fixed: 512 chars per chunk, 100 char overlap
let chunker = FixedSizeChunker::new(512, 100);

// Recursive: tries paragraph breaks first, then sentences
let chunker = RecursiveChunker::new(512, 100);

// Markdown: splits by headers, stores header path in metadata
let chunker = MarkdownChunker::new();
```
## Configuration

The config type name and argument values below are illustrative; the defaults come from the crate:

```rust
let config = RagConfig::builder()
    .chunk_size(512)            // max characters per chunk (default: 512)
    .chunk_overlap(100)         // overlap between chunks (default: 100)
    .top_k(10)                  // number of results to return (default: 10)
    .similarity_threshold(0.3)  // minimum score to include (default: 0.0)
    .build()?;
```
- chunk_size — Smaller chunks are more precise but may lose context. 200–500 is a good range.
- chunk_overlap — Prevents information loss at chunk boundaries. 10–20% of chunk_size works well.
- top_k — More results give the LLM more context but increase token usage.
- similarity_threshold — Filter out low-quality matches. 0.0 returns everything, 0.3–0.7 keeps strong matches only.
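To make chunk_size and chunk_overlap concrete, here is a std-only sketch of fixed-size chunking (not the crate's FixedSizeChunker): each chunk starts chunk_size − chunk_overlap characters after the previous one, so neighboring chunks share chunk_overlap characters and no sentence is cut off from its context entirely.

```rust
// Split `text` into chunks of up to `chunk_size` chars, with consecutive
// chunks overlapping by `overlap` chars. Std-only illustration.
fn chunk_text(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_size, "overlap must be smaller than chunk size");
    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break;
        }
        start += step;
    }
    chunks
}

fn main() {
    // With size 4 and overlap 2, each chunk repeats the last 2 chars
    // of the previous one: abcd, cdef, efgh, ghij
    for c in chunk_text("abcdefghij", 4, 2) {
        println!("{c}");
    }
}
```

The same trade-off applies at realistic sizes: more overlap means fewer boundary losses but more stored (and embedded) text.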
## Writing a Custom Reranker

The default NoOpReranker passes results through unchanged. Write your own to improve precision:

```toml
[dependencies]
adk-rag = { version = "0.4", features = ["gemini"] }
async-trait = "0.1"
tokio = { version = "1", features = ["full"] }
```

A sketch of a keyword-boosting reranker (the Reranker trait's method name and signature are illustrative):

```rust
use async_trait::async_trait;
use adk_rag::{Reranker, SearchResult}; // import paths illustrative

struct KeywordReranker {
    keyword: String,
}

#[async_trait]
impl Reranker for KeywordReranker {
    async fn rerank(&self, _query: &str, mut results: Vec<SearchResult>) -> Vec<SearchResult> {
        // Boost chunks containing the keyword, then re-sort by score
        for r in &mut results {
            if r.chunk.content.contains(&self.keyword) {
                r.score += 0.1;
            }
        }
        results.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
        results
    }
}
```

Use it in the pipeline:

```rust
let pipeline = RagPipeline::builder()
    .config(config)
    .embedding_provider(provider)
    .vector_store(store)
    .chunker(chunker)
    .reranker(Arc::new(KeywordReranker { keyword: "pricing".into() }))
    .build()?;
```
## Feature Flags

```toml
# Core only (in-memory store, all chunkers, no external deps)
adk-rag = "0.4"

# With Gemini embeddings (recommended)
adk-rag = { version = "0.4", features = ["gemini"] }

# With OpenAI embeddings
adk-rag = { version = "0.4", features = ["openai"] }

# With a persistent vector store
adk-rag = { version = "0.4", features = ["gemini", "qdrant"] }

# Everything
adk-rag = { version = "0.4", features = ["full"] }
```
| Feature | Enables | Extra dependency |
|---|---|---|
| (default) | Core traits, InMemoryVectorStore, all chunkers | none |
| `gemini` | GeminiEmbeddingProvider | adk-gemini |
| `openai` | OpenAIEmbeddingProvider | reqwest |
| `qdrant` | QdrantVectorStore | qdrant-client |
| `lancedb` | LanceDBVectorStore | lancedb, arrow |
| `pgvector` | PgVectorStore | sqlx |
| `surrealdb` | SurrealVectorStore | surrealdb |
| `full` | All of the above | all |
## Testing Without API Keys

For unit tests or CI where you don't have API keys, implement a deterministic mock embedder. A sketch (trait method names are illustrative, as above):

```rust
use async_trait::async_trait;
use adk_rag::EmbeddingProvider; // import path illustrative

struct MockEmbedder {
    dims: usize,
}

#[async_trait]
impl EmbeddingProvider for MockEmbedder {
    async fn embed(&self, text: &str) -> adk_rag::Result<Vec<f32>> {
        // Derive every component from a simple hash of the text so the
        // same input always yields the same vector
        let mut h = text
            .bytes()
            .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(b as u64));
        let mut v = Vec::with_capacity(self.dims);
        for _ in 0..self.dims {
            h = h.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
            v.push((h >> 40) as f32 / (1u64 << 24) as f32 - 0.5);
        }
        Ok(v)
    }

    fn dimensions(&self) -> usize {
        self.dims
    }
}
```

This produces stable vectors so your tests are reproducible, but the similarity scores won't be meaningful. Use it for testing pipeline wiring, not search quality.
## Examples

Run from the ADK-Rust workspace root:

| Example | What it shows | API key? | Command |
|---|---|---|---|
| `rag_basic` | Pipeline with mock embeddings | No | `cargo run --example rag_basic --features rag` |
| `rag_markdown` | Markdown chunking with header metadata | No | `cargo run --example rag_markdown --features rag` |
| `rag_agent` | LlmAgent with RagTool | Yes | `cargo run --example rag_agent --features rag-gemini` |
| `rag_recursive` | Codebase Q&A with RecursiveChunker | Yes | `cargo run --example rag_recursive --features rag-gemini` |
| `rag_reranker` | Custom keyword reranker | Yes | `cargo run --example rag_reranker --features rag-gemini` |
| `rag_multi_collection` | Multi-collection search | Yes | `cargo run --example rag_multi_collection --features rag-gemini` |
| `rag_surrealdb` | SurrealDB vector store | Yes | `cargo run --example rag_surrealdb --features rag-surrealdb` |

For examples that need an API key, set `GOOGLE_API_KEY` in your environment or `.env` file.
## API Reference

### Core Types

Field layouts below are a sketch for orientation; consult the generated crate docs for the definitive definitions.

```rust
// A document to ingest
pub struct Document {
    pub id: String,
    pub content: String,
    pub metadata: HashMap<String, String>,
}

// A chunk produced by a Chunker (with embedding attached after processing)
pub struct Chunk {
    pub id: String,
    pub content: String,
    pub embedding: Option<Vec<f32>>,
    pub metadata: HashMap<String, String>,
}

// A search result with relevance score
pub struct SearchResult {
    pub chunk: Chunk,
    pub score: f32,
}
```
### Pipeline Methods

Arguments shown are illustrative:

```rust
let pipeline = RagPipeline::builder()
    .config(config)
    .embedding_provider(provider)
    .vector_store(store)
    .chunker(chunker)
    .reranker(reranker) // optional
    .build()?;

// Collection management
pipeline.create_collection("docs").await?;
pipeline.delete_collection("docs").await?;

// Ingestion (chunk → embed → store)
let chunks = pipeline.ingest("docs", document).await?;
let chunks = pipeline.ingest_batch("docs", documents).await?;

// Query (embed → search → rerank → filter)
let results = pipeline.query("docs", "user question").await?;
```
### RagTool

Wraps a pipeline as an adk_core::Tool for agent use:

```rust
let tool = RagTool::new(pipeline);

// The agent calls it with JSON:
// { "query": "How do I reset my password?" }
// { "query": "pricing info", "collection": "faq", "top_k": 5 }
```
## License

Apache-2.0

## Part of ADK-Rust

This crate is part of the ADK-Rust framework for building AI agents in Rust.