§text-retrieval
Library-first semantic and hybrid retrieval for moritzbrantner-video-analysis.
Default builds are deterministic and local-first. Transcript integration is feature-gated, and native model execution stays outside the default dependency closure.
For the high-level text workflow, see
docs/TEXT_WORKSPACE_GUIDE.md. For a
lower-level walkthrough of how RetrievalIndex relates to TextCorpus, lexical
scoring, hashed semantic search, and corpus analysis reports, see
docs/TEXT_CORPUS_GUIDE.md.
§Highlights
- Deterministic text chunking with token overlap
- Exact semantic retrieval over moritzbrantner-vector-analysis-index
- BM25 lexical retrieval over moritzbrantner-text-lexical
- Hybrid weighted ranking with metadata filters
- Related-content lookup and persistence-friendly export helpers
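The first highlight, deterministic chunking with token overlap, can be illustrated with a minimal sketch. This is not the crate's implementation: `chunk_with_overlap` and its parameters are hypothetical names, and whitespace splitting stands in for whatever tokenizer the crate actually uses.

```rust
/// Hypothetical helper (not part of the text-retrieval API): splits
/// whitespace-separated tokens into windows of at most `size` tokens,
/// where consecutive windows share `overlap` tokens.
fn chunk_with_overlap(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0, "chunk size must be positive");
    assert!(overlap < size, "overlap must be smaller than chunk size");

    let tokens: Vec<&str> = text.split_whitespace().collect();
    // Each new window starts `size - overlap` tokens after the previous one.
    let step = size - overlap;

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = usize::min(start + size, tokens.len());
        chunks.push(tokens[start..end].join(" "));
        if end == tokens.len() {
            // The last window reached the end of the text; stop here so
            // no trailing chunk made only of overlap tokens is emitted.
            break;
        }
        start += step;
    }
    chunks
}
```

Because the window boundaries depend only on the token sequence and the two parameters, the same input always produces the same chunks, which is the property the "deterministic" wording refers to.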
§Stable contract
The stable surface is chunk construction, SearchDocument,
TextDocumentContract/TextSegmentContract ingestion, retrieval request/result
types, metadata filters, snapshot planning, and persistence DTOs.
§Quality and limits
Hybrid score calibration and ranking quality are best-effort. Persistence helper types are stable DTOs, but default package-surface operations plan or build in-memory indexes and do not write files.
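To see why hybrid score calibration can only be best-effort, consider a minimal sketch of one common blending approach: min-max normalize each score list, then take a weighted sum. The crate's actual blending rule is not described here, so `min_max`, `hybrid_rank`, and the `alpha` weight are assumptions made purely for illustration.

```rust
/// Min-max normalizes scores into [0, 1]. If all scores are equal
/// (or the slice is empty), every normalized score is 0.0.
fn min_max(scores: &[f32]) -> Vec<f32> {
    let lo = scores.iter().cloned().fold(f32::INFINITY, f32::min);
    let hi = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let range = hi - lo;
    scores
        .iter()
        .map(|s| if range > 0.0 { (s - lo) / range } else { 0.0 })
        .collect()
}

/// Hypothetical blend (not the crate's API): `alpha` weights the semantic
/// score and `1.0 - alpha` weights the lexical score. Returns
/// `(document index, blended score)` pairs, best first; ties keep the
/// lower document index first so the ordering is deterministic.
fn hybrid_rank(semantic: &[f32], lexical: &[f32], alpha: f32) -> Vec<(usize, f32)> {
    assert_eq!(semantic.len(), lexical.len(), "score lists must align");
    let s = min_max(semantic);
    let l = min_max(lexical);
    let mut ranked: Vec<(usize, f32)> = s
        .iter()
        .zip(l.iter())
        .map(|(a, b)| alpha * a + (1.0 - alpha) * b)
        .enumerate()
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
}
```

Normalization of this kind depends on the candidate set: adding or removing one document changes every normalized score, so blended values are comparable within a single query but not across queries.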
§Package surface
- Primary workflow: retrieval.search builds a transient in-memory retrieval index and searches it.
- Workflow operations: retrieval.chunk, retrieval.search, retrieval.rerank, and retrieval.snapshotPlan.
- Debug operations: describe inspects package metadata and operation support.
- Runtime support: pure Rust, available through library, CLI, server, and WASM wrappers.
- Sample output includes title, message, summary, result, and operation-specific fields such as chunks, report, mode, results, or snapshot planning details.
- Package-surface operations do not write persistence artifacts or run native model inference; retrieval.snapshotPlan plans in-memory persistence work but does not write files.
§Related crates
- moritzbrantner-text-embeddings
- moritzbrantner-text-lexical
- moritzbrantner-vector-analysis-index
Modules§
- surface - Library-owned runtime surface for text-retrieval.
Structs§
- ChunkingOptions - Options for explicit chunk construction.
- DocumentChunk - Data type for document chunk.
- HybridConfig - Data type for hybrid config.
- IngestReport - Data type for ingest report.
- IngestionOptions - Data type for ingestion options.
- PersistedChunkRecord - Data type for persisted chunk record.
- PersistedCorpusMetadata - Data type for persisted corpus metadata.
- PersistedSearchIndex - Data type for persisted search index.
- RerankExecutionContext - Caller-supplied runtime context for reranking.
- RerankRequest - Request for query/document reranking.
- RerankResponse - Response for query/document reranking.
- RerankResult - One reranked document.
- RetrievalFile - Data type for retrieval file.
- RetrievalIndex - Data type for retrieval index.
- RetrievalManifest - Data type for retrieval manifest.
- SearchDocument - Data type for search document.
- SearchFilter - Data type for search filter.
- SearchQuery - Data type for search query.
- SearchResult - Data type for search result.
Enums§
- ChunkingStrategy - Strategy for constructing retrieval chunks.
- RetrievalMode - Variants describing retrieval mode.
- StorageError - Variants describing storage error.
Traits§
Functions§
- chunk_search_document - Chunks one search document with explicit strategy options.
- rerank_documents - Reranks documents from imported scores or deterministic lexical overlap.
- rerank_documents_with_context - Reranks documents using a runtime backend when supplied, otherwise falls back.
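The "deterministic lexical overlap" fallback mentioned for the rerank functions can be sketched as follows. This is an illustration under stated assumptions, not the signature or scoring rule of rerank_documents: the helper names, the tokenization (lowercased alphanumeric runs), and the scoring (fraction of distinct query tokens found in the document) are all hypothetical, and imported scores are not modeled.

```rust
use std::collections::HashSet;

/// Lowercased alphanumeric tokens of `text`, deduplicated.
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Fraction of distinct query tokens that also occur in the document.
/// An empty query scores 0.0 against every document.
fn lexical_overlap(query: &str, document: &str) -> f32 {
    let q = tokens(query);
    if q.is_empty() {
        return 0.0;
    }
    let d = tokens(document);
    q.intersection(&d).count() as f32 / q.len() as f32
}

/// Hypothetical rerank sketch: returns `(original index, score)` pairs,
/// best first. Ties keep the lower original index first, so the same
/// input always yields the same order.
fn rerank_by_overlap(query: &str, docs: &[&str]) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, lexical_overlap(query, d)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    scored
}
```

A fallback like this needs no model or network access, which is consistent with the default build staying deterministic and local-first.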
Type Aliases§
- Result - Type alias for result.