cognis-rag
v2-beta RAG primitives: embeddings, vector stores, document loaders, text splitters, retrievers, and an indexing pipeline.
Top-level modules:
- `document`: the universal `Document` type.
- `embeddings`: `Embeddings` trait + Fake/OpenAI/Ollama impls.
- `vectorstore`: `VectorStore` trait + `InMemoryVectorStore`.
- `loaders`: text/markdown/json/directory/csv/html loaders.
- `splitters`: recursive-char + markdown-aware splitters.
- `retrievers`: vector / BM25 / ensemble retrievers (each is a `Runnable`).
- `indexing`: wire load → split → embed → store with one call.
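The load → split → embed → store flow named under `indexing` can be illustrated without the crate. The following is a minimal sketch in plain Rust, not the cognis-rag API: `Doc`, `split_fixed`, `embed_fake`, `ToyStore`, and `index` are all hypothetical names, and the real `IndexingPipeline` signature is not shown on this page.

```rust
// Hypothetical stand-ins; none of these are cognis-rag types.
#[derive(Debug, Clone, PartialEq)]
struct Doc {
    text: String,
}

/// Split: cut a document into chunks of at most `max_chars` characters.
fn split_fixed(doc: &Doc, max_chars: usize) -> Vec<Doc> {
    let chars: Vec<char> = doc.text.chars().collect();
    chars
        .chunks(max_chars.max(1)) // guard against a zero chunk size
        .map(|c| Doc { text: c.iter().collect() })
        .collect()
}

/// Embed: a deterministic toy embedding (character count, vowel count).
fn embed_fake(text: &str) -> Vec<f32> {
    let vowels = text.chars().filter(|c| "aeiouAEIOU".contains(*c)).count();
    vec![text.chars().count() as f32, vowels as f32]
}

/// Store: keeps (vector, chunk) pairs in memory.
#[derive(Default)]
struct ToyStore {
    rows: Vec<(Vec<f32>, Doc)>,
}

/// Index: split, embed, and store already-loaded documents.
/// Returns the number of chunks written to the store.
fn index(docs: &[Doc], max_chars: usize, store: &mut ToyStore) -> usize {
    let mut added = 0;
    for doc in docs {
        for chunk in split_fixed(doc, max_chars) {
            let vector = embed_fake(&chunk.text);
            store.rows.push((vector, chunk));
            added += 1;
        }
    }
    added
}
```

Calling `index` on a single document containing `"hello world"` with `max_chars = 5` stores three chunks: `"hello"`, `" worl"`, and `"d"`. A real pipeline would replace each stage with a loader, a splitter, an embeddings backend, and a vector store.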
Re-exports
pub use cross_encoder::CrossEncoder;
pub use cross_encoder::CrossEncoderReranker;
pub use cross_encoder::FnCrossEncoder;
pub use distance::Distance;
pub use docstore::Docstore;
pub use docstore::InMemoryDocstore;
pub use document::Document;
pub use embeddings::OllamaEmbeddings;
pub use embeddings::OpenAIEmbeddings;
pub use embeddings::BatchedEmbeddings;
pub use embeddings::CachedEmbeddings;
pub use embeddings::EmbeddingRouter;
pub use embeddings::Embeddings;
pub use embeddings::EmbeddingsRouter;
pub use embeddings::FakeEmbeddings;
pub use embeddings::FnRouter;
pub use embeddings::LengthRouter;
pub use example_selectors::AsyncExampleSelector;
pub use example_selectors::EmbedMode;
pub use example_selectors::MmrExampleSelector;
pub use example_selectors::SemanticSimilarityExampleSelector;
pub use indexing::IncrementalReport;
pub use indexing::IndexingPipeline;
pub use loaders::DirectoryLoader;
pub use loaders::DocumentLoader;
pub use loaders::DocumentStream;
pub use loaders::JsonLoader;
pub use loaders::MarkdownLoader;
pub use loaders::TextLoader;
pub use multi_vector::MultiVectorIndexer;
pub use record_manager::fingerprint;
pub use record_manager::InMemoryRecordManager;
pub use record_manager::RecordManager;
pub use retrievers::BM25Retriever;
pub use retrievers::CachingRetriever;
pub use retrievers::CompressorPipeline;
pub use retrievers::EnsembleRetriever;
pub use retrievers::MultiVectorRetriever;
pub use retrievers::ParentDocumentRetriever;
pub use retrievers::QueryTranslatorRetriever;
pub use retrievers::VectorRetriever;
pub use splitters::CharacterSplitter;
pub use splitters::CodeLanguage;
pub use splitters::CodeSplitter;
pub use splitters::HtmlSplitter;
pub use splitters::JsonSplitter;
pub use splitters::MarkdownSplitter;
pub use splitters::RecursiveCharSplitter;
pub use splitters::SentenceSplitter;
pub use splitters::TextSplitter;
pub use splitters::TokenAwareSplitter;
pub use transformers::Dedup;
pub use transformers::Enrichment;
pub use transformers::LongContextReorder;
pub use transformers::MetadataTransformer;
pub use vectorstore::Filter;
pub use vectorstore::InMemoryVectorStore;
pub use vectorstore::SearchResult;
pub use vectorstore::VectorStore;
Modules
- `cross_encoder`: Cross-encoder scoring trait + cross-encoder-based reranker.
- `distance`: Distance metrics for vector similarity.
- `docstore`: `Docstore`, keyed `Document` storage by stable id.
- `document`: `Document`, the unit of RAG: a piece of text plus typed metadata.
- `embeddings`: Embeddings trait + implementations.
- `example_selectors`: Embedding-driven example selectors for few-shot prompts.
- `indexing`: Indexing pipeline: load → split → embed → store.
- `loaders`: Document loaders: read sources into `Document`s.
- `multi_vector`: `MultiVectorIndexer`, which indexes many representations of one document under a shared parent id.
- `prelude`: Common imports for v2 RAG user code.
- `record_manager`: Incremental indexing: track per-document fingerprints so re-indexing only re-embeds new or changed documents and removes deleted ones.
- `retrievers`: Retrievers: `Runnable<String, Vec<Document>>`.
- `splitters`: Text splitters: chunk a `Document` into smaller `Document`s suitable for embedding.
- `transformers`: Document-list transformers: `Runnable<Vec<Document>, Vec<Document>>`.
- `vectorstore`: Vector store trait + `SearchResult` + `Filter`.
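The incremental-indexing idea described for `record_manager` (fingerprint each document, re-embed only what changed, drop what disappeared) can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: `toy_fingerprint`, `Report`, `ToyRecordManager`, and `sync` are hypothetical names, and the fields of the real `IncrementalReport` are not shown on this page.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical content fingerprint; the crate's `fingerprint` may differ.
fn toy_fingerprint(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical summary of one sync run.
#[derive(Debug, Default, PartialEq)]
struct Report {
    added: usize,
    updated: usize,
    skipped: usize,
    deleted: usize,
}

/// Remembers the fingerprint seen for each document id on the last run.
#[derive(Default)]
struct ToyRecordManager {
    seen: HashMap<String, u64>,
}

impl ToyRecordManager {
    /// Compare the current `(id, text)` set against the previous run.
    /// Only `added` and `updated` documents would need re-embedding.
    fn sync(&mut self, docs: &[(&str, &str)]) -> Report {
        let mut report = Report::default();
        let mut next = HashMap::new();
        for (id, text) in docs {
            let fp = toy_fingerprint(text);
            match self.seen.get(*id) {
                None => report.added += 1,
                Some(old) if *old != fp => report.updated += 1,
                Some(_) => report.skipped += 1,
            }
            next.insert(id.to_string(), fp);
        }
        // Ids known from the last run but absent now count as deleted.
        report.deleted = self.seen.keys().filter(|k| !next.contains_key(*k)).count();
        self.seen = next;
        report
    }
}
```

`DefaultHasher` keeps the sketch dependency-free, but its algorithm is not guaranteed to stay the same across Rust releases, so fingerprints that are persisted between runs would need a stable hash instead.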
Structs
- `CharTokenizer`: Trivial char-as-token implementation. Conservative upper bound on real tokenizer counts; useful as a default for budgeting.
- `FnTokenizer`: Closure-backed tokenizer.
Traits
- `Tokenizer`: Counts tokens in a piece of text.
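The relationship between a token-counting trait, a char-as-token implementation, and a closure-backed implementation can be sketched as below. The method name and signatures of the real `Tokenizer` trait are not shown on this page, so `TokenCounter`, `CharCounter`, `FnCounter`, `count`, and `fits_budget` are hypothetical names chosen for illustration.

```rust
/// Hypothetical trait mirroring "counts tokens in a piece of text".
trait TokenCounter {
    fn count(&self, text: &str) -> usize;
}

/// Char-as-token: one token per Unicode scalar value.
struct CharCounter;

impl TokenCounter for CharCounter {
    fn count(&self, text: &str) -> usize {
        text.chars().count()
    }
}

/// Closure-backed counter: delegates to any `Fn(&str) -> usize`.
struct FnCounter<F: Fn(&str) -> usize>(F);

impl<F: Fn(&str) -> usize> TokenCounter for FnCounter<F> {
    fn count(&self, text: &str) -> usize {
        (self.0)(text)
    }
}

/// Budget check that works with any counter via dynamic dispatch.
fn fits_budget(counter: &dyn TokenCounter, text: &str, budget: usize) -> bool {
    counter.count(text) <= budget
}
```

With a whitespace-based closure such as `FnCounter(|s: &str| s.split_whitespace().count())`, the text `"one two three"` fits a budget of 3, while `CharCounter` counts 13 characters for the same text and rejects it. The character count is the larger of the two here, which is why it suits conservative budgeting.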