text-retrieval
Library-first semantic and hybrid retrieval for moritzbrantner-video-analysis.
Default builds are deterministic and local-first. Transcript integration is feature-gated, and native model execution stays outside the default dependency closure.
For the high-level text workflow, see
docs/TEXT_WORKSPACE_GUIDE.md. For a
lower-level walkthrough of how RetrievalIndex relates to TextCorpus, lexical
scoring, hashed semantic search, and corpus analysis reports, see
docs/TEXT_CORPUS_GUIDE.md.
Highlights
- Deterministic text chunking with token overlap
- Exact semantic retrieval over moritzbrantner-vector-analysis-index
- BM25 lexical retrieval over moritzbrantner-text-lexical
- Hybrid weighted ranking with metadata filters
- Related-content lookup and persistence-friendly export helpers
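As a rough illustration of deterministic chunking with token overlap, the sketch below splits text into fixed-size windows that share a few tokens with their neighbours. It is not this crate's API: the function name chunk_tokens, the size and overlap parameters, and the use of whitespace tokenization are all assumptions made for the example.

```rust
/// Hypothetical helper, not part of the crate: split `text` into chunks of at
/// most `size` whitespace-separated tokens, where each chunk repeats the last
/// `overlap` tokens of the previous one.
pub fn chunk_tokens(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0, "chunk size must be positive");
    assert!(overlap < size, "overlap must be smaller than chunk size");

    // Assumption: tokens are whitespace-separated words.
    let tokens: Vec<&str> = text.split_whitespace().collect();
    // Each new chunk starts `size - overlap` tokens after the previous one.
    let step = size - overlap;

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + size).min(tokens.len());
        chunks.push(tokens[start..end].join(" "));
        // Stop once the final token has been emitted, so no trailing chunk
        // consists only of overlap.
        if end == tokens.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

The same input and parameters always yield the same chunk boundaries, because the window positions depend only on the token count, size, and overlap.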
Stable contract
The stable surface is chunk construction, SearchDocument,
TextDocumentContract/TextSegmentContract ingestion, retrieval request/result
types, metadata filters, snapshot planning, and persistence DTOs.
Quality and limits
Hybrid score calibration and ranking quality are best-effort. Persistence helper types are stable DTOs, but default package-surface operations plan or build in-memory indexes and do not write files.
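To see why hybrid score calibration is only best-effort, consider one common way of blending a semantic score with a lexical score: normalize each list to a shared range, then take a weighted sum. The sketch below uses min-max normalization, which is an assumption chosen for illustration and not necessarily what this crate does; the names min_max, hybrid_scores, and semantic_weight are hypothetical.

```rust
/// Hypothetical helper: min-max normalize scores into [0, 1].
/// A list with no spread (all values equal) maps to all zeros.
fn min_max(scores: &[f64]) -> Vec<f64> {
    let min = scores.iter().copied().fold(f64::INFINITY, f64::min);
    let max = scores.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    scores
        .iter()
        .map(|&s| if range > 0.0 { (s - min) / range } else { 0.0 })
        .collect()
}

/// Hypothetical helper: blend per-document semantic and lexical scores.
/// `semantic_weight` is in [0, 1]; the lexical side gets the remainder.
pub fn hybrid_scores(semantic: &[f64], lexical: &[f64], semantic_weight: f64) -> Vec<f64> {
    assert_eq!(semantic.len(), lexical.len(), "score lists must align");
    assert!((0.0..=1.0).contains(&semantic_weight), "weight must be in [0, 1]");

    let sem = min_max(semantic);
    let lex = min_max(lexical);
    sem.iter()
        .zip(lex.iter())
        .map(|(&s, &l)| semantic_weight * s + (1.0 - semantic_weight) * l)
        .collect()
}
```

Because the normalization depends on the candidate set, the same document can receive a different blended score when other candidates change, which is one reason such scores are useful for ranking within a query but not as absolute measures.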
Package surface
- Primary workflow: retrieval.search builds a transient in-memory retrieval index and searches it.
- Workflow operations: retrieval.chunk, retrieval.search, retrieval.rerank, and retrieval.snapshotPlan.
- Debug operations: describe inspects package metadata and operation support.
- Runtime support: pure Rust, available through library, CLI, server, and WASM wrappers.
- Sample output includes title, message, summary, result, and operation-specific fields such as chunks, report, mode, results, or snapshot planning details.
- Package-surface operations do not write persistence artifacts or run native model inference; retrieval.snapshotPlan plans in-memory persistence work but does not write files.
Related crates
- moritzbrantner-text-embeddings
- moritzbrantner-text-lexical
- moritzbrantner-vector-analysis-index