docs.rs failed to build trueno-rag-0.2.3
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build:
trueno-rag-0.2.2
Pure-Rust Retrieval-Augmented Generation Pipeline
SIMD-accelerated RAG pipeline built on Trueno compute primitives. Part of the Sovereign AI Stack.
Features
- Pure Rust - Zero Python/C++ dependencies
- Chunking - Recursive, Fixed, Sentence, Paragraph, Semantic, Structural
- Hybrid Retrieval - Dense (vector) + Sparse (BM25) search
- Fusion - RRF, Linear, DBSF, Convex, Union, Intersection
- Reranking - Lexical, cross-encoder, and composite rerankers
- Metrics - Recall, Precision, MRR, NDCG, MAP
- Semantic Embeddings - Production ONNX models via FastEmbed (optional)
- Nemotron Embeddings - NVIDIA Embed Nemotron 8B via GGUF (optional)
- Index Compression - LZ4/ZSTD compressed persistence (optional)
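Of the fusion strategies listed above, Reciprocal Rank Fusion (RRF) is the simplest to illustrate. The following is a minimal, standard-library-only sketch of the general technique, not the crate's implementation: the function name `rrf_fuse`, the string document ids, and the constant `k = 60` used in the test data are assumptions chosen for illustration.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked lists of document ids.
/// Each document scores the sum of 1 / (k + rank) over the lists that
/// contain it, where rank is 1-based. (Hypothetical helper, not crate API.)
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc_id) in ranking.iter().enumerate() {
            let rank = (i as f64) + 1.0;
            *scores.entry(doc_id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort by descending score; break ties by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because RRF only looks at ranks, it can combine dense and BM25 result lists without first putting their raw scores on a common scale.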
Installation
[dependencies]
trueno-rag = "0.1.8"
Usage
use trueno_rag::{…};

let mut pipeline = RagPipelineBuilder::new(…)
    .chunker(…)
    .embedder(…)
    .reranker(…)
    .fusion(…)
    .build()?;

let doc = …::new(…).with_title(…);
pipeline.index_document(…)?;

let … = pipeline.query_with_context(…)?;
Examples
# Basic examples
# With semantic embeddings (downloads ~90MB ONNX model on first run)
# With compressed index persistence
# With NVIDIA Nemotron embeddings (requires GGUF model file)
NEMOTRON_MODEL_PATH=/path/to/model.gguf
API Reference
Semantic Embeddings (FastEmbed)
Production-quality vector embeddings via FastEmbed (ONNX Runtime):
trueno-rag = { version = "0.1.8", features = ["embeddings"] }
use trueno_rag::{…};

let embedder = …::new(…)?;
let embedding = embedder.embed(…)?;
// 384-dimensional embeddings
Available models:
- AllMiniLmL6V2 - Fast, 384 dims (default)
- AllMiniLmL12V2 - Better quality, 384 dims
- BgeSmallEnV15 - Balanced, 384 dims
- BgeBaseEnV15 - Higher quality, 768 dims
- NomicEmbedTextV1 - Retrieval optimized, 768 dims
NVIDIA Embed Nemotron 8B
High-quality 4096-dimensional embeddings via GGUF model inference:
trueno-rag = { version = "0.1.8", features = ["nemotron"] }
use trueno_rag::{…};

let config = …::new(…)
    .with_gpu(…)
    .with_normalize(…);
let embedder = …::new(…)?;

// Asymmetric retrieval - different prefixes for queries vs documents
let query_emb = embedder.embed_query(…)?;
let doc_emb = embedder.embed_document(…)?;
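The two ideas in the snippet above, asymmetric prefixes and normalized embeddings, can be sketched without any model. This is an illustration only: the prefix strings `"query: "` and `"passage: "` and all function names are assumptions, not the prefixes or API that the Nemotron embedder actually uses.

```rust
/// Scale a vector to unit L2 norm; an all-zero vector is returned unchanged.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Dot product; for unit-length vectors this equals cosine similarity.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical query-side prefix applied before embedding.
fn format_query(text: &str) -> String {
    format!("query: {text}")
}

/// Hypothetical document-side prefix applied before embedding.
fn format_document(text: &str) -> String {
    format!("passage: {text}")
}
```

Normalizing at embedding time means retrieval can rank by a plain dot product, and the separate prefixes let one model treat short queries and long documents differently.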
Index Compression
LZ4/ZSTD compressed index persistence:
trueno-rag = { version = "0.1.8", features = ["compression"] }
use trueno_rag::{…};

let bytes = index.to_compressed_bytes(…)?;
// 4-6x compression ratio
Architecture
┌─────────────────────────────────────────────┐
│              RAG Pipeline API               │
│         (RagPipelineBuilder, query)         │
├──────────┬──────────┬───────────────────────┤
│ Chunking │ Embedding│ Retrieval             │
│ (6 modes)│ (ONNX/   │ (Dense + BM25)        │
│          │  GGUF)   │                       │
├──────────┴──────────┴───────────────────────┤
│             Fusion & Reranking              │
│   (RRF, Linear, DBSF, Lexical, Cross-Enc)   │
├─────────────────────────────────────────────┤
│             Storage & Indexing              │
│ (BM25 inverted index, vector store, SQLite) │
├─────────────────────────────────────────────┤
│       Trueno SIMD Compute Primitives        │
└─────────────────────────────────────────────┘
- Chunking Layer: Recursive, Fixed, Sentence, Paragraph, Semantic, and Structural chunkers
- Embedding Layer: Mock (testing), FastEmbed (ONNX), Nemotron (GGUF) embedders
- Retrieval Layer: Dense vector similarity + BM25 sparse retrieval with hybrid fusion
- Fusion/Reranking: RRF, Linear, DBSF, Convex combination; lexical and cross-encoder rerankers
- Storage: In-memory BM25 index with optional LZ4/ZSTD persistence and SQLite backend
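To make the sparse side of the retrieval layer concrete, here is a small sketch of textbook Okapi BM25 scoring over whitespace-tokenized documents. It is not the crate's index: the function name `bm25_scores`, the lowercase whitespace tokenizer, and the `+ 1` smoothing inside the IDF logarithm are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Okapi BM25 score of `query` against each document in `docs`.
/// `k1` controls term-frequency saturation, `b` length normalization.
fn bm25_scores(docs: &[&str], query: &str, k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs
        .iter()
        .map(|d| d.split_whitespace().map(|t| t.to_lowercase()).collect())
        .collect();
    let n = tokenized.len() as f64;
    let total_len: usize = tokenized.iter().map(|d| d.len()).sum();
    // Guard against an empty corpus so the length ratio stays finite.
    let avg_len = if total_len > 0 { total_len as f64 / n } else { 1.0 };

    // Document frequency: number of documents containing each term.
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut unique: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        unique.sort_unstable();
        unique.dedup();
        for term in unique {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms: Vec<String> =
        query.split_whitespace().map(|t| t.to_lowercase()).collect();
    tokenized
        .iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query_terms
                .iter()
                .map(|q| {
                    let tf = doc.iter().filter(|t| *t == q).count() as f64;
                    let n_q = *df.get(q.as_str()).unwrap_or(&0) as f64;
                    // Smoothed IDF that never goes negative.
                    let idf = ((n - n_q + 0.5) / (n_q + 0.5) + 1.0).ln();
                    idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}
```

Commonly cited defaults are `k1` around 1.2 and `b` around 0.75. A hybrid retriever would rank documents by these scores and then fuse that ranking with the dense one.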
Testing
Property-based tests cover chunking boundary conditions, BM25 scoring invariants, and fusion correctness.
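As an illustration of what a chunking boundary property can look like, the sketch below pairs a fixed-size chunker with an invariant check. Both functions are hypothetical and independent of the crate's chunkers; a property-testing framework such as proptest could drive `check_invariants` with generated inputs.

```rust
/// Split `text` into chunks of at most `size` characters, where consecutive
/// chunks share `overlap` characters. (Hypothetical helper, not crate API.)
fn fixed_chunks(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "need overlap < size and size > 0");
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // the last chunk reached the end of the input
        }
        start += step;
    }
    chunks
}

/// Invariants that should hold for any input:
/// no chunk is empty, none exceeds `size`, and dropping the overlapping
/// prefix of every chunk after the first rebuilds the original text.
fn check_invariants(text: &str, size: usize, overlap: usize) -> bool {
    let chunks = fixed_chunks(text, size, overlap);
    let mut rebuilt = String::new();
    for (i, chunk) in chunks.iter().enumerate() {
        let skip = if i == 0 { 0 } else { overlap };
        rebuilt.extend(chunk.chars().skip(skip));
    }
    let sizes_ok = chunks
        .iter()
        .all(|c| !c.is_empty() && c.chars().count() <= size);
    sizes_ok && rebuilt == text
}
```

Chunking on `char` boundaries rather than bytes keeps the sketch safe for non-ASCII text, at the cost of collecting the input into a `Vec<char>` first.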
Stack Dependencies
trueno-rag is part of the Sovereign AI Stack:
| Crate | Version | Purpose |
|---|---|---|
| trueno | 0.11 | SIMD/GPU compute primitives |
| trueno-db | 0.3.10 | GPU-first analytics database |
| realizar | 0.5.1 | GGUF/APR model inference |
| fastembed | 5.x | ONNX embeddings |
Development
Documentation
Contributing
Contributions are welcome! Please see the CONTRIBUTING.md guide for details.
MSRV
Minimum Supported Rust Version: 1.75
License
MIT