oxibonsai-rag
Pure Rust Retrieval-Augmented Generation (RAG) pipeline for OxiBonsai.
Self-contained RAG stack: document chunking (character, sentence, paragraph, semantic, hierarchical, sliding window, markdown), pure Rust embedders (identity, TF-IDF), in-memory vector store with cosine similarity, top-k retrieval, and end-to-end prompt-building pipeline.
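To make one of the chunking strategies concrete, here is a minimal sketch of a character-based sliding window with overlap. It is an illustration under stated assumptions, not the crate's implementation: the function name `sliding_window_chunks` and its `size` and `overlap` parameters are hypothetical.

```rust
/// Split `text` into windows of at most `size` characters, where consecutive
/// windows share `overlap` characters.
/// Hypothetical helper for illustration; not part of the oxibonsai-rag API.
fn sliding_window_chunks(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0, "window size must be positive");
    assert!(overlap < size, "overlap must be smaller than the window size");

    // Index by chars rather than bytes so multi-byte UTF-8 text is never
    // split in the middle of a character.
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // this window already reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With `size = 4` and `overlap = 2`, the input `"abcdefghij"` yields the windows `abcd`, `cdef`, `efgh`, and `ghij`. The overlap keeps text that straddles a window boundary visible in at least one chunk, at the cost of indexing some characters twice.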
Part of the OxiBonsai project.
Status
Stable — version 0.1.4, 871 tests passing (cargo nextest run -p oxibonsai-rag). Uplifted from Alpha in 0.1.2.
Features
- RagPipeline: end-to-end index + query pipeline
- VectorStore: in-memory L2-normalized cosine similarity search
- Retriever: document indexing and top-k chunk retrieval
- Embedder trait: pluggable embedding backends
- IdentityEmbedder: hash-based embedder for testing
- TfIdfEmbedder: bag-of-words TF-IDF embedding
- Chunking strategies: character window, sentence, paragraph, recursive, sliding window, markdown, semantic (cosine boundary), hierarchical
- ChunkerRegistry: dynamic dispatch for pluggable chunking backends
- Zero external API calls: fully self-contained
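The vector store and retrieval features listed above can be sketched in a few lines of standard-library Rust. This is a toy illustration, not the crate's VectorStore: the names `ToyVectorStore`, `l2_normalize`, `insert`, and `top_k` are assumptions made for this example. It shows why L2-normalizing embeddings on insert lets cosine similarity be computed as a plain dot product at query time.

```rust
/// Scale `v` to unit L2 length; a zero vector is returned unchanged.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Hypothetical in-memory store (not the oxibonsai-rag type): every vector is
/// normalized on insert, so cosine similarity reduces to a dot product.
struct ToyVectorStore {
    entries: Vec<(String, Vec<f32>)>,
}

impl ToyVectorStore {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    fn insert(&mut self, chunk: &str, embedding: &[f32]) {
        self.entries.push((chunk.to_string(), l2_normalize(embedding)));
    }

    /// Return up to `k` chunks most similar to `query`, best match first.
    fn top_k(&self, query: &[f32], k: usize) -> Vec<(&str, f32)> {
        let q = l2_normalize(query);
        let mut scored: Vec<(&str, f32)> = self
            .entries
            .iter()
            .map(|(chunk, v)| {
                // Both vectors have unit length, so the dot product is the cosine.
                let dot: f32 = v.iter().zip(&q).map(|(a, b)| a * b).sum();
                (chunk.as_str(), dot)
            })
            .collect();
        // Sort by descending similarity; total_cmp gives a total order on f32.
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }
}
```

This sketch scans every stored vector on each query, which is simple and exact but linear in the number of chunks. That is usually acceptable for a small in-memory index.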
Usage
[dependencies]
oxibonsai-rag = "0.1.4"
use oxibonsai_rag::RagPipeline;

let mut pipeline = RagPipeline::default();
pipeline.index_document(...)?;
let prompt = pipeline.build_prompt(...)?;
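For readers wondering what the prompt-building step might produce, the sketch below assembles retrieved chunks and a question into a single prompt string. The function name `assemble_prompt` and the template wording are assumptions for illustration only; the template that oxibonsai-rag uses is not shown in this README.

```rust
/// Hypothetical prompt template: numbered context chunks followed by the
/// question. Not the template used by oxibonsai-rag.
fn assemble_prompt(question: &str, chunks: &[&str]) -> String {
    let mut prompt =
        String::from("Answer the question using only the context below.\n\nContext:\n");
    // Number each chunk so the answer can refer back to its source.
    for (i, chunk) in chunks.iter().enumerate() {
        prompt.push_str(&format!("[{}] {}\n", i + 1, chunk));
    }
    prompt.push_str(&format!("\nQuestion: {}\nAnswer:", question));
    prompt
}
```

In a full pipeline the `chunks` argument would be the top-k results returned by the retriever for the same question.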
License
Apache-2.0 — COOLJAPAN OU