
Crate entelix_rag


§entelix-rag

Algorithmic primitives for retrieval-augmented generation pipelines — Document with provenance + lineage, plus the DocumentLoader / TextSplitter / Chunker trait surface every RAG path composes around.

§Position

2026-era agentic RAG (Contextual Retrieval, Self-RAG, CRAG, Adaptive-RAG) is no longer a side pipeline — it’s the agent’s baseline working memory. This crate ships the algorithmic primitives (splitters, chunkers, ingestion composition) that every consumer reaches for. Concrete source connectors (S3, Notion, Confluence, GDrive, …) live in companion crates so the core surface stays small and dependency-light.

§Surface

  • Document — RAG-shaped document with Source (where it came from), Lineage (split / chunk ancestry), and entelix_memory::Namespace (multi-tenant boundary). The retrieval-side entelix_memory::Document (with similarity score) is a result shape; this is the ingestion shape.
  • DocumentLoader — async source-side trait. Streams to keep ingestion memory-bounded over arbitrarily large corpora.
  • TextSplitter — sync algorithmic primitive. Slices a Document into smaller Documents preserving Lineage.
  • Chunker — async transform over a chunk sequence. LLM-call capable (Anthropic Contextual Retrieval, HyDE, query decomposition).
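A minimal sketch of how the split side of this surface composes, using simplified stand-ins (the field names, the synchronous signature, and `FixedSizeSplitter` are illustrative only; the real `Document`, `Lineage`, and `TextSplitter` items are documented below, and the real splitter trait preserves a richer `Lineage` than a bare parent id):

```rust
// Illustrative shapes only: simplified stand-ins for the crate's
// Document / TextSplitter surface, not the real definitions.
#[derive(Clone, Debug)]
struct Document {
    id: String,
    content: String,
    parent_id: Option<String>, // stand-in for the full Lineage type
}

trait TextSplitter {
    fn split(&self, doc: &Document) -> Vec<Document>;
}

/// Fixed character-budget splitter: each child derives its id by
/// suffixing the parent id with `:<chunk_index>`, as the DocumentId
/// docs below describe.
struct FixedSizeSplitter {
    chunk_size: usize,
}

impl TextSplitter for FixedSizeSplitter {
    fn split(&self, doc: &Document) -> Vec<Document> {
        let chars: Vec<char> = doc.content.chars().collect();
        chars
            .chunks(self.chunk_size)
            .enumerate()
            .map(|(i, chunk)| Document {
                id: format!("{}:{}", doc.id, i),
                content: chunk.iter().collect(),
                parent_id: Some(doc.id.clone()),
            })
            .collect()
    }
}

fn main() {
    let doc = Document {
        id: "s3://bucket/report.md".into(),
        content: "abcdefghij".into(),
        parent_id: None,
    };
    let chunks = FixedSizeSplitter { chunk_size: 4 }.split(&doc);
    assert_eq!(chunks.len(), 3);
    assert_eq!(chunks[0].id, "s3://bucket/report.md:0");
    assert_eq!(chunks[2].content, "ij");
}
```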

§What lives in companion crates

  • Source connectors: entelix-rag-s3, entelix-rag-notion, entelix-rag-confluence, entelix-rag-fs (filesystem-backed, invariant 9 exemption).
  • Vendor-accurate tokenizers: entelix-tokenizer-tiktoken, entelix-tokenizer-hf, locale-aware companions (Korean / Japanese morphology). The entelix_core::TokenCounter trait is the integration surface; this crate’s TokenCountSplitter is generic over any C: TokenCounter + ?Sized + 'static (default dyn TokenCounter) so concrete Arc<TiktokenCounter> and type-erased Arc<dyn TokenCounter> plug in interchangeably. Vendor accuracy is a counter swap, not a splitter rewrite.
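The `?Sized`-generic pattern described above can be sketched in plain Rust; here a hypothetical `WordCounter` stands in for a vendor tokenizer, and `TokenBudget` for the splitter's bound (both are illustrative names, not crate items):

```rust
use std::sync::Arc;

// Hypothetical stand-in for the entelix_core::TokenCounter trait.
trait TokenCounter {
    fn count(&self, text: &str) -> usize;
}

/// Toy counter: whitespace-delimited words. A vendor-accurate counter
/// (tiktoken, HF) would replace this without touching the splitter.
struct WordCounter;

impl TokenCounter for WordCounter {
    fn count(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

/// Generic over `C: TokenCounter + ?Sized`, mirroring the bound above:
/// both a concrete `Arc<WordCounter>` and a type-erased
/// `Arc<dyn TokenCounter>` plug in interchangeably.
struct TokenBudget<C: TokenCounter + ?Sized> {
    counter: Arc<C>,
    max_tokens: usize,
}

impl<C: TokenCounter + ?Sized> TokenBudget<C> {
    fn fits(&self, text: &str) -> bool {
        self.counter.count(text) <= self.max_tokens
    }
}

fn main() {
    // Concrete counter parameter.
    let concrete = TokenBudget { counter: Arc::new(WordCounter), max_tokens: 3 };
    // Type-erased counter parameter: same struct, different C.
    let erased: TokenBudget<dyn TokenCounter> = TokenBudget {
        counter: Arc::new(WordCounter) as Arc<dyn TokenCounter>,
        max_tokens: 3,
    };
    assert!(concrete.fits("one two three"));
    assert!(!erased.fits("one two three four"));
}
```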

§Why algorithmic primitives only

The LangChain ecosystem’s mistake was bundling 100+ source connectors into the core surface — version churn became unmanageable. entelix-rag’s core is explicitly small (4 traits + Document + provenance types) so vendor-specific loaders ship independently and never gate the core’s release cadence. The algorithmic primitives (splitters, chunkers, ingestion composition) are universal, so they live here.

Structs§

ContextualChunker
Anthropic Contextual Retrieval chunker. Each chunk’s content is rewritten as <contextual prefix>\n\n<original chunk> where <contextual prefix> is a model-generated 50-100 token summary of how the chunk relates to its parent document.
ContextualChunkerBuilder
Builder for ContextualChunker. Construct via ContextualChunker::builder; chain config setters; finalise with Self::build.
CorrectiveRagState
State the corrective-RAG graph drives across nodes. Carries the original + current query, the rewrite history, the last retrieval batch + verdicts, the surviving correct subset, and the terminal answer.
CragConfig
Operator-tunable knobs for the corrective-RAG recipe. Construct via Self::new or Self::default; chain with_* setters.
Document
The unit a RAG pipeline moves around — content plus everything downstream needs to know about where it came from.
DocumentId
Stable identifier for a Document within its Namespace. Loaders mint these from the source’s natural id (S3 object key, Notion page id, file path); splitters derive child ids by suffixing the parent id with :<chunk_index>.
IngestError
One per-document failure recorded during ingestion. Carries the originating document id (when known) and a stage label identifying which pipeline phase failed.
IngestReport
Outcome counters and per-document failure list a single IngestionPipeline::run produces.
IngestionPipeline
End-to-end RAG ingestion pipeline. Construct via Self::builder; finalise with IngestionPipelineBuilder::build; drive with Self::run.
IngestionPipelineBuilder
Builder for IngestionPipeline. Required components (loader / splitter / embedder / store) come in via IngestionPipeline::builder; optional Chunker entries accumulate via Self::add_chunker.
Lineage
Split-history — survives every transformation. A leaf chunk’s Lineage describes which parent it came from, which split produced it, and which chunkers ran over it. Audit / debug flows reconstruct the path from a retrieval hit back to the ingestion source by walking the lineage chain (parent_id → loader’s source URI).
LlmQueryRewriter
Reference LLM-driven QueryRewriter. Asks the supplied Runnable<Vec<Message>, Message> model for a corrected query, then trims surrounding whitespace and quote marks.
LlmQueryRewriterBuilder
Builder for LlmQueryRewriter.
LlmRetrievalGrader
Reference LLM-driven RetrievalGrader. Asks the supplied Runnable<Vec<Message>, Message> model to classify relevance, then parses the reply into a GradeVerdict. Operators building on this default tune the prompt via LlmRetrievalGraderBuilder::with_instruction, or write their own grader from scratch.
LlmRetrievalGraderBuilder
Builder for LlmRetrievalGrader.
MarkdownStructureSplitter
Heading-aware markdown splitter.
RecursiveCharacterSplitter
Recursive character-budget splitter.
Source
Where a Document originated. Survives every split and chunker pass — the leaf chunk knows the source URI of the parent document and which loader produced it.
TokenCountSplitter
Recursive token-budget splitter.
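As a sketch of the audit flow the Lineage entry above describes, here is a hypothetical parent-id walk from a leaf chunk back to its loader-minted source URI (`walk_to_source` and the flattened map are illustrative; the real Lineage type records more than a parent id):

```rust
use std::collections::HashMap;

// Hypothetical flattening of a lineage chain: each chunk id maps to the
// id it was split from; the loader-minted root maps to None.
fn walk_to_source(
    lineage: &HashMap<String, Option<String>>,
    leaf: &str,
) -> Vec<String> {
    let mut path = vec![leaf.to_string()];
    let mut current = leaf.to_string();
    while let Some(Some(parent)) = lineage.get(&current) {
        path.push(parent.clone());
        current = parent.clone();
    }
    path
}

fn main() {
    let mut lineage = HashMap::new();
    // Root id is the source URI; children suffix `:<chunk_index>`.
    lineage.insert("s3://bucket/doc".to_string(), None);
    lineage.insert(
        "s3://bucket/doc:2".to_string(),
        Some("s3://bucket/doc".to_string()),
    );
    lineage.insert(
        "s3://bucket/doc:2:0".to_string(),
        Some("s3://bucket/doc:2".to_string()),
    );

    let path = walk_to_source(&lineage, "s3://bucket/doc:2:0");
    assert_eq!(
        path,
        vec!["s3://bucket/doc:2:0", "s3://bucket/doc:2", "s3://bucket/doc"]
    );
}
```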

Enums§

FailurePolicy
Per-chunk failure policy — picks how the chunker reacts when the underlying model call fails on one chunk. See module docs for the trade-off matrix.
GradeVerdict
Three-way verdict the grader emits per (query, document) pair, matching the CRAG paper’s relevance classes.

Constants§

CONTEXTUAL_CHUNKER_DEFAULT_INSTRUCTION
Default operator-facing instruction prepended to every model call. Verbatim from Anthropic’s published Contextual Retrieval recipe — lifts the model into the right framing without requiring per-corpus tuning.
CORRECTIVE_RAG_AGENT_NAME
Stable agent name surfaced on every emitted entelix_agents::AgentEvent and OTel entelix.agent.run span.
DEFAULT_CHUNK_OVERLAP_CHARS
Default overlap between consecutive chunks. ~10% of DEFAULT_CHUNK_SIZE_CHARS preserves enough trailing context for retrieval grounding without bloating the index.
DEFAULT_CHUNK_OVERLAP_TOKENS
Default overlap between consecutive chunks in tokens. ~12.5% of DEFAULT_CHUNK_SIZE_TOKENS preserves enough trailing context for retrieval grounding without bloating the index.
DEFAULT_CHUNK_SIZE_CHARS
Default chunk size in characters. ~1000 chars maps to roughly 200-300 tokens for English under cl100k_base, comfortably under every shipping vendor’s per-message ceiling.
DEFAULT_CHUNK_SIZE_TOKENS
Default chunk size in tokens. 512 matches the typical embedding context window (text-embedding-3-small and -large both cap at 8191 tokens; chunking under 512 leaves headroom for query + instruction tokens at retrieval time).
DEFAULT_GENERATOR_SYSTEM_PROMPT
Default system prompt the generator node prepends to every answer-generation call. Vendor-neutral, focused on grounded answer style.
DEFAULT_GRADER_INSTRUCTION
Default instruction prepended to every model call. Frames the task verbatim in the CRAG-paper terms so the model emits one of the three canonical labels.
DEFAULT_MARKDOWN_HEADING_LEVELS
Default ATX heading levels that open a new chunk. [1, 2, 3] splits at #, ##, ###; deeper sub-headings (####+) stay inline.
DEFAULT_MAX_REWRITE_ATTEMPTS
Default cap on rewrite-loop attempts before the recipe surrenders and generates over whatever was retrieved last. 3 is the CRAG paper’s reported sweet spot (retrieval rarely improves beyond the third rewrite).
DEFAULT_MIN_CORRECT_FRACTION
Default minimum fraction of retrieved documents that must grade GradeVerdict::Correct for the recipe to skip rewriting and proceed directly to generation. 0.5 matches the CRAG paper’s mid-confidence threshold — operators tuning for higher retrieval precision raise it; those tuning for lower model spend (fewer rewrites at the cost of weaker grounding) lower it.
DEFAULT_RECURSIVE_SEPARATORS
Default separator priority list. Paragraph break → line break → word boundary → character. The empty-string fallback guarantees termination even on pathological input (one giant unbroken token).
DEFAULT_RETRIEVAL_TOP_K
Default top-k passed into the retriever on every retrieval pass. Operator-overridable via CragConfig::with_retrieval_top_k.
DEFAULT_REWRITER_INSTRUCTION
Default instruction prepended to every model call. Verbatim matches the CRAG-paper rewriter framing — the model produces one corrected query string, no surrounding explanation.
PROVENANCE_METADATA_KEY
Reserved key on the persisted metadata map under which the pipeline stamps Source + Lineage + namespace. Carries the entelix prefix so an operator’s own metadata fields never collide. Retrieval-side consumers reach back to provenance through this nested object.
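The recursion that the separator and size defaults above drive can be sketched as follows. This simplified version omits the merge-toward-budget and overlap steps the real splitters perform, but shows the priority-order recursion and the empty-string termination guarantee (`recursive_split` is illustrative, not a crate item):

```rust
/// Simplified recursive character split: try separators in priority
/// order; if a piece still exceeds the budget, recurse with the next
/// separator; the empty-string fallback hard-cuts by character, so the
/// recursion terminates even on one giant unbroken token.
fn recursive_split(text: &str, seps: &[&str], max_chars: usize) -> Vec<String> {
    if text.chars().count() <= max_chars {
        return vec![text.to_string()];
    }
    let (sep, rest) = seps
        .split_first()
        .map(|(s, r)| (*s, r))
        .unwrap_or(("", &[]));
    if sep.is_empty() {
        // Pathological input: hard-cut into max_chars-sized pieces.
        return text
            .chars()
            .collect::<Vec<_>>()
            .chunks(max_chars)
            .map(|c| c.iter().collect())
            .collect();
    }
    text.split(sep)
        .flat_map(|piece| recursive_split(piece, rest, max_chars))
        .collect()
}

fn main() {
    // Mirrors the documented priority list: paragraph break, line break,
    // word boundary, then the empty-string character fallback.
    let seps = ["\n\n", "\n", " ", ""];
    let out = recursive_split("alpha beta\n\ngamma deltadeltadelta", &seps, 10);
    // Every chunk respects the character budget, including the unbroken
    // 15-char token, which the fallback hard-cut.
    assert!(out.iter().all(|c| c.chars().count() <= 10));
    assert_eq!(out, vec!["alpha beta", "gamma", "deltadelta", "delta"]);
}
```

Note the sketch drops separators when splitting; production splitters typically keep or re-attach them so that reassembled chunks round-trip.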

Traits§

Chunker
Async transform applied to a sequence of chunks after a TextSplitter ran. Implementations may issue LLM calls, embedding lookups, or external metadata enrichment; the ExecutionContext supplies cancellation, deadline, and any entelix_core::RunBudget caps the parent pipeline configured.
DocumentLoader
Source-side trait the ingestion pipeline pulls documents from.
QueryRewriter
Async trait the corrective-RAG recipe calls when retrieval quality requires another attempt with a different query. Implementations may be LLM-driven, heuristic (query-expansion / synonym-bag), classifier-routed, or any hybrid — the recipe takes whatever string comes back and re-runs retrieval with it.
RetrievalGrader
Async trait the corrective-RAG recipe consults for every retrieved document. Implementations may be LLM-driven (the canonical case, see LlmRetrievalGrader) or keyword / heuristic / classifier-model based — the recipe doesn’t care as long as the verdict is one of the three GradeVerdict variants.
TextSplitter
Pure-algorithm slice of a Document into smaller documents.
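To illustrate that the rewriter contract really is "any hybrid that returns a corrected query string", here is a heuristic synonym-bag implementation against a simplified, synchronous stand-in for the async QueryRewriter trait (the trait shape, `SynonymBagRewriter`, and the synonym table are all illustrative):

```rust
// Simplified, synchronous stand-in for the crate's async QueryRewriter:
// the recipe only needs a corrected query string back, so a heuristic
// query-expansion rewriter with no LLM call is a legitimate impl.
trait QueryRewriter {
    fn rewrite(&self, query: &str) -> String;
}

/// Toy query-expansion rewriter: appends a known synonym for each term
/// that appears in the query but whose synonym does not.
struct SynonymBagRewriter {
    synonyms: Vec<(&'static str, &'static str)>,
}

impl QueryRewriter for SynonymBagRewriter {
    fn rewrite(&self, query: &str) -> String {
        let mut expanded = query.to_string();
        for (term, synonym) in &self.synonyms {
            if query.contains(term) && !query.contains(synonym) {
                expanded.push(' ');
                expanded.push_str(synonym);
            }
        }
        expanded
    }
}

fn main() {
    let rewriter = SynonymBagRewriter {
        synonyms: vec![("car", "automobile"), ("fast", "quick")],
    };
    let out = rewriter.rewrite("fast car rental");
    assert_eq!(out, "fast car rental automobile quick");
}
```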

Functions§

build_corrective_rag_graph
Compile the corrective-RAG graph from operator-supplied primitives. Use this when you need to embed the graph as a node in a larger StateGraph; for a ready-to-execute agent, prefer create_corrective_rag_agent.
create_corrective_rag_agent
Build a ready-to-execute corrective-RAG Agent. Wraps build_corrective_rag_graph in the standard Agent<S> shape so the full lifecycle (AgentEvent stream, sink fan-out, observer hooks, supervisor handoff) integrates uniformly with every other recipe (create_react_agent, create_supervisor_agent, create_chat_agent).

Type Aliases§

DocumentStream
Boxed stream type alias for documents produced by a DocumentLoader. Items are Result so a partial-success stream can yield successful documents while reporting per-item errors — a single mid-walk failure does not abort the whole ingestion run.