§entelix-rag
Algorithmic primitives for retrieval-augmented generation
pipelines — Document with provenance + lineage, plus the
DocumentLoader / TextSplitter / Chunker trait surface
every RAG path composes around.
§Position
2026-era agentic RAG (Contextual Retrieval, Self-RAG, CRAG, Adaptive-RAG) is no longer a side pipeline — it’s the agent’s baseline working memory. This crate ships the algorithmic primitives (splitters, chunkers, ingestion composition) that every consumer reaches for. Concrete source connectors (S3, Notion, Confluence, GDrive, …) live in companion crates so the core surface stays small and dependency-light.
§Surface
- Document — RAG-shaped document with Source (where it came from), Lineage (split / chunk ancestry), and entelix_memory::Namespace (multi-tenant boundary). The retrieval-side entelix_memory::Document (with similarity score) is a result shape; this is the ingestion shape.
- DocumentLoader — async source-side trait. Streams to keep ingestion memory-bounded over arbitrarily large corpora.
- TextSplitter — sync algorithmic primitive. Slices a Document into smaller Documents preserving Lineage.
- Chunker — async transform over a chunk sequence. LLM-call capable (Anthropic Contextual Retrieval, HyDE, query decomposition).
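A compressed sketch of how these four shapes fit together. The signatures below are simplified stand-ins, not the crate's real API: the real DocumentLoader and Chunker are async and stream-based, and the real Document carries full Source / Lineage / Namespace types. The toy line-based splitter shows the lineage-preserving child-id pattern.

```rust
#![allow(dead_code)]

// Simplified stand-in for the crate's Document (the real one carries
// Source, Lineage, and Namespace; here lineage is reduced to parent_id).
#[derive(Clone, Debug)]
pub struct Document {
    pub id: String,                // stable id within a namespace
    pub content: String,           // the text the pipeline moves around
    pub parent_id: Option<String>, // which document this was split from
}

/// Source side: pulls documents out of a backing store, one Result per item.
pub trait DocumentLoader {
    fn load(&self) -> Vec<Result<Document, String>>;
}

/// Pure algorithm: slice one document into smaller ones, preserving lineage.
pub trait TextSplitter {
    fn split(&self, doc: &Document) -> Vec<Document>;
}

/// Post-split transform over the whole chunk sequence (LLM-call capable in
/// the crate; sync here for brevity).
pub trait Chunker {
    fn chunk(&self, chunks: Vec<Document>) -> Vec<Document>;
}

/// Toy splitter: one chunk per line, child ids suffixed with the chunk index.
pub struct LineSplitter;

impl TextSplitter for LineSplitter {
    fn split(&self, doc: &Document) -> Vec<Document> {
        doc.content
            .lines()
            .enumerate()
            .map(|(i, line)| Document {
                id: format!("{}:{}", doc.id, i),
                content: line.to_string(),
                parent_id: Some(doc.id.clone()),
            })
            .collect()
    }
}

fn main() {
    let doc = Document {
        id: "s3://bucket/notes.md".into(),
        content: "first line\nsecond line".into(),
        parent_id: None,
    };
    let chunks = LineSplitter.split(&doc);
    // Child ids derive from the parent id plus ":<chunk_index>".
    assert_eq!(chunks[1].id, "s3://bucket/notes.md:1");
    println!("{} chunks, lineage intact", chunks.len());
}
```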
§What lives in companion crates
- Source connectors — entelix-rag-s3, entelix-rag-notion, entelix-rag-confluence, entelix-rag-fs (filesystem-backed, invariant 9 exemption).
- Vendor-accurate tokenizers — entelix-tokenizer-tiktoken, entelix-tokenizer-hf, locale-aware companions (Korean / Japanese morphology). The entelix_core::TokenCounter trait is the integration surface; this crate’s TokenCountSplitter is generic over any C: TokenCounter + ?Sized + 'static (default dyn TokenCounter), so concrete Arc<TiktokenCounter> and type-erased Arc<dyn TokenCounter> plug in interchangeably. Vendor accuracy is a counter swap, not a splitter rewrite.
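The counter-swap seam can be sketched with stand-in counters. The trait shape and both counters below are assumptions for illustration, not entelix_core's real definitions; the point is that the generic call site accepts concrete and type-erased counters interchangeably.

```rust
use std::sync::Arc;

// Assumed shape of the TokenCounter seam: the splitter only ever asks
// "how many tokens is this?".
trait TokenCounter {
    fn count(&self, text: &str) -> usize;
}

// Cheap stand-in: ~4 chars per token heuristic. A real deployment would
// plug in a tiktoken- or HF-backed counter from a companion crate.
struct CharCounter;
impl TokenCounter for CharCounter {
    fn count(&self, text: &str) -> usize {
        (text.chars().count() + 3) / 4
    }
}

// Another stand-in: whitespace-delimited word count.
struct WordCounter;
impl TokenCounter for WordCounter {
    fn count(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

// Generic over the counter, sized or not, mirroring the documented
// `C: TokenCounter + ?Sized` pattern.
fn fits_budget<C: TokenCounter + ?Sized>(counter: &C, text: &str, budget: usize) -> bool {
    counter.count(text) <= budget
}

fn main() {
    let concrete = Arc::new(WordCounter);
    let erased: Arc<dyn TokenCounter> = Arc::new(CharCounter);
    // Both plug into the same generic call site.
    println!("{}", fits_budget(concrete.as_ref(), "one two three", 3));
    println!("{}", fits_budget(erased.as_ref(), "one two three", 3));
}
```

Swapping in a vendor-accurate counter changes the measurements, never the splitting logic, which is the "counter swap, not a splitter rewrite" claim above.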
§Why algorithmic primitives only
The LangChain ecosystem’s mistake was bundling 100+ source
connectors into the core surface — version churn became
unmanageable. entelix-rag’s core is explicitly small (4 traits +
Document + provenance types) so vendor-specific loaders ship
independently and never gate the core’s release cadence. The
algorithmic primitives (splitters, chunkers, ingestion
composition) are universal, so they live here.
Structs§
- ContextualChunker — Anthropic Contextual Retrieval chunker. Each chunk’s content is rewritten as <contextual prefix>\n\n<original chunk>, where <contextual prefix> is a model-generated 50-100 token summary of how the chunk relates to its parent document.
- ContextualChunkerBuilder — Builder for ContextualChunker. Construct via ContextualChunker::builder; chain config setters; finalise with Self::build.
- CorrectiveRagState — State the corrective-RAG graph drives across nodes. Carries the original + current query, the rewrite history, the last retrieval batch + verdicts, the surviving correct subset, and the terminal answer.
- CragConfig — Operator-tunable knobs for the corrective-RAG recipe. Construct via Self::new or Self::default; chain with_* setters.
- Document — The unit a RAG pipeline moves around — content plus everything downstream needs to know about where it came from.
- DocumentId — Stable identifier for a Document within its Namespace. Loaders mint these from the source’s natural id (S3 object key, Notion page id, file path); splitters derive child ids by suffixing the parent id with :<chunk_index>.
- IngestError — One per-document failure recorded during ingestion. Carries the originating document id (when known) and a stage label identifying which pipeline phase failed.
- IngestReport — Outcome counters and per-document failure list a single IngestionPipeline::run produces.
- IngestionPipeline — End-to-end RAG ingestion pipeline. Construct via Self::builder; finalise with IngestionPipelineBuilder::build; drive with Self::run.
- IngestionPipelineBuilder — Builder for IngestionPipeline. Required components (loader / splitter / embedder / store) come in via IngestionPipeline::builder; optional Chunker entries accumulate via Self::add_chunker.
- Lineage — Split history — survives every transformation. A leaf chunk’s Lineage describes which parent it came from, which split produced it, and which chunkers ran over it. Audit / debug flows reconstruct the path from a retrieval hit back to the ingestion source by walking the lineage chain (parent_id → loader’s source URI).
- LlmQueryRewriter — Reference LLM-driven QueryRewriter. Asks the supplied Runnable<Vec<Message>, Message> model for a corrected query, then trims surrounding whitespace and quote marks.
- LlmQueryRewriterBuilder — Builder for LlmQueryRewriter.
- LlmRetrievalGrader — Reference LLM-driven RetrievalGrader. Asks the supplied Runnable<Vec<Message>, Message> model to classify relevance, then parses the reply into a GradeVerdict. Operators building on this default tune the prompt via LlmRetrievalGraderBuilder::with_instruction, or write their own grader from scratch.
- LlmRetrievalGraderBuilder — Builder for LlmRetrievalGrader.
- MarkdownStructureSplitter — Heading-aware markdown splitter.
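The heading-open rule can be sketched as follows — a minimal stand-in that only detects ATX headings (the real splitter carries more configuration and, like everything here, preserves Lineage):

```rust
// Opens a new chunk at ATX headings whose level is in `levels`; deeper
// headings stay inline. With levels = [1, 2, 3] this matches the documented
// DEFAULT_MARKDOWN_HEADING_LEVELS behaviour.
fn split_at_headings(markdown: &str, levels: &[usize]) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    for line in markdown.lines() {
        let hashes = line.chars().take_while(|&c| c == '#').count();
        // ATX heading: 1+ leading '#' followed by a space, level selected.
        let is_heading = hashes > 0
            && levels.contains(&hashes)
            && line.chars().nth(hashes) == Some(' ');
        if is_heading || chunks.is_empty() {
            chunks.push(String::new());
        }
        let chunk = chunks.last_mut().unwrap();
        if !chunk.is_empty() {
            chunk.push('\n');
        }
        chunk.push_str(line);
    }
    chunks
}

fn main() {
    let md = "# Title\nintro\n## Section\nbody\n#### deep\nstill inline";
    let chunks = split_at_headings(md, &[1, 2, 3]);
    assert_eq!(chunks.len(), 2); // "#### deep" did not open a chunk
    println!("{} chunks", chunks.len());
}
```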
- RecursiveCharacterSplitter — Recursive character-budget splitter.
- Source — Where a Document originated. Survives every split and chunker pass — the leaf chunk knows the source URI of the parent document and which loader produced it.
- TokenCountSplitter — Recursive token-budget splitter.
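The shared recursion behind both budget splitters can be sketched like this: character-count budget for brevity (TokenCountSplitter runs the same recursion with a TokenCounter as the measure), no chunk overlap or separator reattachment (which the real splitters handle), and the empty-string fallback guaranteeing termination. Assumes budget ≥ 1.

```rust
// Recursive budget splitting over a separator priority list
// (paragraph break → line break → word boundary → character).
fn recursive_split(text: &str, budget: usize, separators: &[&str]) -> Vec<String> {
    if text.chars().count() <= budget {
        return vec![text.to_string()];
    }
    let (sep, rest) = separators.split_first().expect("separator list exhausted");
    if sep.is_empty() {
        // Character-level fallback: hard-cut into budget-sized windows.
        // Terminates even on one giant unbroken token.
        let chars: Vec<char> = text.chars().collect();
        return chars.chunks(budget).map(|c| c.iter().collect()).collect();
    }
    let mut out = Vec::new();
    for part in text.split(sep) {
        if part.is_empty() {
            continue;
        }
        if part.chars().count() <= budget {
            out.push(part.to_string());
        } else {
            // Too big at this level: recurse with the next-finer separator.
            out.extend(recursive_split(part, budget, rest));
        }
    }
    out
}

fn main() {
    let seps = ["\n\n", "\n", " ", ""]; // the documented priority shape
    let pieces = recursive_split("alpha beta\n\ngamma delta epsilon", 11, &seps);
    assert!(pieces.iter().all(|p| p.chars().count() <= 11));
    println!("{pieces:?}");
}
```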
Enums§
- FailurePolicy — Per-chunk failure policy — picks how the chunker reacts when the underlying model call fails on one chunk. See module docs for the trade-off matrix.
- GradeVerdict — Three-way verdict the grader emits per (query, document) pair, matching the CRAG paper’s relevance classes.
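A minimal sketch of the grade-then-threshold flow an LLM-driven grader has to support. The variant names (Correct / Incorrect / Ambiguous, per the CRAG paper) and the lenient reply parsing are assumptions, not the crate's exact shapes; the threshold check mirrors the DEFAULT_MIN_CORRECT_FRACTION behaviour described below.

```rust
// Assumed three-way verdict matching the CRAG relevance classes.
#[derive(Debug, PartialEq, Clone, Copy)]
enum GradeVerdict {
    Correct,
    Incorrect,
    Ambiguous,
}

// Lenient normalisation of a free-form model reply into a verdict.
fn parse_verdict(reply: &str) -> Option<GradeVerdict> {
    let normalised = reply.trim().to_ascii_lowercase();
    if normalised.contains("incorrect") {
        Some(GradeVerdict::Incorrect)
    } else if normalised.contains("correct") {
        Some(GradeVerdict::Correct)
    } else if normalised.contains("ambiguous") {
        Some(GradeVerdict::Ambiguous)
    } else {
        None
    }
}

// The recipe's skip-rewrite test: proceed straight to generation when the
// Correct fraction clears the configured minimum (0.5 by default).
fn enough_correct(verdicts: &[GradeVerdict], min_fraction: f64) -> bool {
    if verdicts.is_empty() {
        return false;
    }
    let correct = verdicts.iter().filter(|v| **v == GradeVerdict::Correct).count();
    correct as f64 / verdicts.len() as f64 >= min_fraction
}

fn main() {
    let verdicts: Vec<GradeVerdict> = ["Correct", "ambiguous", "Correct.", "INCORRECT"]
        .iter()
        .filter_map(|r| parse_verdict(r))
        .collect();
    assert_eq!(verdicts.len(), 4);
    // 2 of 4 grade Correct: exactly at the 0.5 default, so generation proceeds.
    println!("proceed to generation: {}", enough_correct(&verdicts, 0.5));
}
```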
Constants§
- CONTEXTUAL_CHUNKER_DEFAULT_INSTRUCTION — Default operator-facing instruction prepended to every model call. Verbatim from Anthropic’s published Contextual Retrieval recipe — lifts the model into the right framing without requiring per-corpus tuning.
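The rewrite shape this instruction drives can be sketched as a string transform; the prefix text below is an invented example of what the model might return, not output from the real chunker.

```rust
// Contextual Retrieval rewrite: a model-generated prefix is glued onto the
// original chunk as "<contextual prefix>\n\n<original chunk>".
fn contextualise(model_prefix: &str, original_chunk: &str) -> String {
    format!("{model_prefix}\n\n{original_chunk}")
}

fn main() {
    let rewritten = contextualise(
        "From ACME Corp's Q3 earnings report, in the section on revenue trends.",
        "Revenue grew 12% quarter-over-quarter.",
    );
    // The original chunk text survives verbatim after the blank line.
    assert!(rewritten.ends_with("Revenue grew 12% quarter-over-quarter."));
    println!("{rewritten}");
}
```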
- CORRECTIVE_RAG_AGENT_NAME — Stable agent name surfaced on every emitted entelix_agents::AgentEvent and OTel entelix.agent.run span.
- DEFAULT_CHUNK_OVERLAP_CHARS — Default overlap between consecutive chunks. ~10% of DEFAULT_CHUNK_SIZE_CHARS preserves enough trailing context for retrieval grounding without bloating the index.
- DEFAULT_CHUNK_OVERLAP_TOKENS — Default overlap between consecutive chunks in tokens. ~12.5% of DEFAULT_CHUNK_SIZE_TOKENS preserves enough trailing context for retrieval grounding without bloating the index.
- DEFAULT_CHUNK_SIZE_CHARS — Default chunk size in characters. ~1000 chars maps to roughly 200-300 tokens for English under cl100k_base, comfortably under every shipping vendor’s per-message ceiling.
- DEFAULT_CHUNK_SIZE_TOKENS — Default chunk size in tokens. 512 matches the typical embedding context window (text-embedding-3-small and -large both cap at 8191 tokens; chunking under 512 leaves headroom for query + instruction tokens at retrieval time).
- DEFAULT_GENERATOR_SYSTEM_PROMPT — Default system prompt the generator node prepends to every answer-generation call. Vendor-neutral, focused on grounded answer style.
- DEFAULT_GRADER_INSTRUCTION — Default instruction prepended to every model call. Frames the task verbatim in the CRAG-paper terms so the model emits one of the three canonical labels.
- DEFAULT_MARKDOWN_HEADING_LEVELS — Default ATX heading levels that open a new chunk. [1, 2, 3] splits at #, ##, ###; deeper sub-headings (####+) stay inline.
- DEFAULT_MAX_REWRITE_ATTEMPTS — Default cap on rewrite-loop attempts before the recipe surrenders and generates over whatever was retrieved last. 3 is the CRAG paper’s reported sweet spot (retrieval rarely improves beyond the third rewrite).
- DEFAULT_MIN_CORRECT_FRACTION — Default minimum fraction of retrieved documents that must grade GradeVerdict::Correct for the recipe to skip rewriting and proceed directly to generation. 0.5 matches the CRAG paper’s mid-confidence threshold — operators tuning for higher retrieval precision raise it; those tuning for lower model spend (fewer rewrites at the cost of weaker grounding) lower it.
- DEFAULT_RECURSIVE_SEPARATORS — Default separator priority list. Paragraph break → line break → word boundary → character. The empty-string fallback guarantees termination even on pathological input (one giant unbroken token).
- DEFAULT_RETRIEVAL_TOP_K — Default top-k passed into the retriever on every retrieval pass. Operator-overridable via CragConfig::with_retrieval_top_k.
- DEFAULT_REWRITER_INSTRUCTION — Default instruction prepended to every model call. Verbatim matches the CRAG-paper rewriter framing — the model produces one corrected query string, no surrounding explanation.
- PROVENANCE_METADATA_KEY — Reserved key on the persisted metadata map under which the pipeline stamps Source + Lineage + namespace. Carries the entelix prefix so an operator’s own metadata fields never collide. Retrieval-side consumers reach back to provenance through this nested object.
Traits§
- Chunker — Async transform applied to a sequence of chunks after a TextSplitter ran. Implementations may issue LLM calls, embedding lookups, or external metadata enrichment; the ExecutionContext supplies cancellation, deadline, and any entelix_core::RunBudget caps the parent pipeline configured.
- DocumentLoader — Source-side trait the ingestion pipeline pulls documents from.
- QueryRewriter — Async trait the corrective-RAG recipe calls when retrieval quality requires another attempt with a different query. Implementations may be LLM-driven, heuristic (query-expansion / synonym-bag), classifier-routed, or any hybrid — the recipe takes whatever string comes back and re-runs retrieval with it.
- RetrievalGrader — Async trait the corrective-RAG recipe consults for every retrieved document. Implementations may be LLM-driven (the canonical case, see LlmRetrievalGrader) or keyword / heuristic / classifier-model based — the recipe doesn’t care as long as the verdict is one of the three GradeVerdict variants.
- TextSplitter — Pure-algorithm slice of a Document into smaller documents.
Functions§
- build_corrective_rag_graph — Compile the corrective-RAG graph from operator-supplied primitives. Use this when you need to embed the graph as a node in a larger StateGraph; for a ready-to-execute agent, prefer create_corrective_rag_agent.
- create_corrective_rag_agent — Build a ready-to-execute corrective-RAG Agent. Wraps build_corrective_rag_graph in the standard Agent<S> shape so the full lifecycle (AgentEvent stream, sink fan-out, observer hooks, supervisor handoff) integrates uniformly with every other recipe (create_react_agent, create_supervisor_agent, create_chat_agent).
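The loop the compiled graph encodes can be sketched with stubbed primitives. This is a control-flow illustration only, not the crate's StateGraph wiring: retrieval, grading, rewriting, and generation are plain closures standing in for the Retriever / RetrievalGrader / QueryRewriter / generator components.

```rust
// retrieve → grade → either generate over the Correct subset, or rewrite
// the query and retry, capped at max_rewrite_attempts (3 by default).
fn corrective_rag_loop(
    mut query: String,
    retrieve: impl Fn(&str) -> Vec<String>,
    grade: impl Fn(&str, &str) -> bool, // true ⇒ Correct
    rewrite: impl Fn(&str) -> String,
    generate: impl Fn(&str, &[String]) -> String,
    min_correct_fraction: f64,
    max_rewrite_attempts: usize,
) -> String {
    let mut docs = retrieve(&query);
    for _ in 0..max_rewrite_attempts {
        let correct: Vec<String> = docs
            .iter()
            .filter(|d| grade(&query, d.as_str()))
            .cloned()
            .collect();
        let fraction = if docs.is_empty() {
            0.0
        } else {
            correct.len() as f64 / docs.len() as f64
        };
        if fraction >= min_correct_fraction {
            // Retrieval is good enough: generate over the surviving subset.
            return generate(&query, &correct);
        }
        // Retrieval quality too low: rewrite and retry.
        query = rewrite(&query);
        docs = retrieve(&query);
    }
    // Attempts exhausted: generate over whatever was retrieved last.
    generate(&query, &docs)
}

fn main() {
    let answer = corrective_rag_loop(
        "wrong question".into(),
        |q| if q == "right question" { vec!["useful doc".into()] } else { vec!["noise".into()] },
        |_q, d| d == "useful doc",
        |_q| "right question".into(),
        |_q, docs| format!("answer grounded in {} doc(s)", docs.len()),
        0.5,
        3,
    );
    assert_eq!(answer, "answer grounded in 1 doc(s)");
    println!("{answer}");
}
```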
Type Aliases§
- DocumentStream — Boxed stream type alias for documents produced by a DocumentLoader. Items are Results so a partial-success stream can yield successful documents while reporting per-item errors — a single mid-walk failure does not abort the whole ingestion run.