§rig-retrieval-evals
Retrieval and knowledge-base evaluation harness for Rig agents.
The crate gives you:
- A BEIR-compatible dataset::Qrels loader (JSONL and BEIR/BRIGHT queries.jsonl + qrels/<split>.tsv via dataset::Qrels::from_beir).
- A pure-Rust catalogue of standard IR metrics (Recall, Precision, MRR, MAP, nDCG, HitRate) in retrieval.
- An async harness::RetrievalHarness that drives any store implementing rig::vector_store::VectorStoreIndexDyn, plus a retriever::Retriever seam for scoring non-vector backends (lexical / BM25 / hybrid) with the same metrics.
- A deterministic, seeded synthetic corpus + qrels generator for reproducible benchmarks and fixture-free tests.
- JSON / Markdown report::MultiReport reports with baseline diffing.
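For readers unfamiliar with the BEIR layout named in the first bullet: a qrels/<split>.tsv file is a tab-separated table with a `query-id`, `corpus-id`, `score` header row followed by one relevance judgment per line. The sketch below parses that shape using only the standard library. It is an illustration of the file format, not this crate's loader; `parse_qrels_tsv` and `QrelsMap` are hypothetical names, and the real entry point is dataset::Qrels::from_beir, whose signature is not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory shape: query id -> (document id -> graded relevance).
type QrelsMap = HashMap<String, HashMap<String, i32>>;

/// Parse BEIR-style qrels text: three tab-separated columns per line,
/// with an optional header row (`query-id`, `corpus-id`, `score`).
fn parse_qrels_tsv(input: &str) -> Result<QrelsMap, String> {
    let mut out = QrelsMap::new();
    for (idx, line) in input.lines().enumerate() {
        if line.trim().is_empty() {
            continue; // tolerate blank lines
        }
        let cols: Vec<&str> = line.split('\t').collect();
        if cols.len() != 3 {
            return Err(format!(
                "line {}: expected 3 tab-separated columns, found {}",
                idx + 1,
                cols.len()
            ));
        }
        // Treat the first line as a header if its score column is not an integer.
        if idx == 0 && cols[2].trim().parse::<i32>().is_err() {
            continue;
        }
        let score: i32 = cols[2]
            .trim()
            .parse()
            .map_err(|e| format!("line {}: bad score: {e}", idx + 1))?;
        out.entry(cols[0].to_string())
            .or_default()
            .insert(cols[1].to_string(), score);
    }
    Ok(out)
}

fn main() {
    let tsv = "query-id\tcorpus-id\tscore\nq1\td1\t2\nq1\td2\t0\nq2\td3\t1\n";
    let qrels = parse_qrels_tsv(tsv).expect("well-formed qrels");
    println!("{} queries loaded", qrels.len());
}
```

Keeping the grade as an integer rather than a boolean preserves graded judgments, which nDCG-style metrics need.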
See the crate README for an end-to-end quickstart.
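As a reminder of what the listed metrics compute, here is a minimal standalone sketch of Recall@k, reciprocal rank (the per-query term of MRR), and nDCG@k using their textbook definitions. It does not use or reproduce the types in retrieval; the function names are hypothetical, the nDCG variant shown uses linear gain (other formulations use 2^grade - 1), and the ranked list is assumed to contain no duplicate document ids.

```rust
use std::collections::HashMap;

/// Graded relevance of a document for one query; unjudged documents count as 0.
fn grade(qrels: &HashMap<&str, u32>, doc_id: &str) -> u32 {
    qrels.get(doc_id).copied().unwrap_or(0)
}

/// Recall@k: fraction of all relevant documents found in the top k results.
fn recall_at_k(ranked: &[&str], qrels: &HashMap<&str, u32>, k: usize) -> f64 {
    let total_relevant = qrels.values().filter(|&&g| g > 0).count();
    if total_relevant == 0 {
        return 0.0;
    }
    let hits = ranked
        .iter()
        .take(k)
        .filter(|&&id| grade(qrels, id) > 0)
        .count();
    hits as f64 / total_relevant as f64
}

/// Reciprocal rank: 1 / (1-based rank of the first relevant result), or 0 if none.
fn reciprocal_rank(ranked: &[&str], qrels: &HashMap<&str, u32>) -> f64 {
    ranked
        .iter()
        .position(|&id| grade(qrels, id) > 0)
        .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
}

/// Discounted cumulative gain with linear gain: sum of grade / log2(rank + 1).
fn dcg(gains: impl Iterator<Item = u32>) -> f64 {
    gains
        .enumerate()
        .map(|(i, g)| g as f64 / (i as f64 + 2.0).log2())
        .sum()
}

/// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
fn ndcg_at_k(ranked: &[&str], qrels: &HashMap<&str, u32>, k: usize) -> f64 {
    let actual = dcg(ranked.iter().take(k).map(|&id| grade(qrels, id)));
    let mut ideal_gains: Vec<u32> = qrels.values().copied().collect();
    ideal_gains.sort_unstable_by(|a, b| b.cmp(a)); // best grades first
    let ideal = dcg(ideal_gains.into_iter().take(k));
    if ideal == 0.0 { 0.0 } else { actual / ideal }
}

fn main() {
    let qrels = HashMap::from([("d1", 2), ("d2", 1)]);
    let ranked = ["d9", "d1", "d2", "d7"];
    println!("recall@2 = {}", recall_at_k(&ranked, &qrels, 2));
    println!("rr       = {}", reciprocal_rank(&ranked, &qrels));
    println!("ndcg@3   = {}", ndcg_at_k(&ranked, &qrels, 3));
}
```

MRR is then the mean of `reciprocal_rank` over all queries, and the other aggregates follow the same per-query-then-average pattern.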
§Stability
The default build ships retrieval-quality evaluation only. Optional features add RAGAS-style judges, zero-waste ingestion checks, knowledge-gain scoring, and embedding novelty adapters.
Re-exports§
pub use dataset::GoldQuery;
pub use dataset::Qrels;
pub use dataset::RetrievedDoc;
pub use dataset::RetrievedSet;
pub use error::Error;
pub use error::Result;
pub use harness::RetrievalHarness;
pub use report::FreshnessQueryRollup;
pub use report::FreshnessReport;
pub use report::MetricCi;
pub use report::MetricDelta;
pub use report::MetricReport;
pub use report::MultiReport;
pub use report::QueryDelta;
pub use report::QueryReliability;
pub use report::RegressionGate;
pub use report::ReliabilityReport;
pub use report::ReportDiff;
pub use retrieval::HitRateAtK;
pub use retrieval::MapAtK;
pub use retrieval::Mrr;
pub use retrieval::NdcgAtK;
pub use retrieval::PrecisionAtK;
pub use retrieval::RecallAtK;
pub use retrieval::RetrievalMetric;
pub use retriever::Retriever;
pub use retriever::VectorStoreRetriever;
pub use retriever::retrieve_all;
pub use retriever::score_retriever;
pub use staleness::ConflictGroup;
pub use staleness::ConflictReport;
pub use staleness::CorpusVersions;
pub use staleness::StaleHit;
pub use staleness::StalenessAnnotation;
pub use staleness::StalenessReport;
pub use staleness::detect_conflicts;
pub use staleness::detect_stale_hits;
pub use synthetic::SyntheticConfig;
pub use synthetic::SyntheticCorpus;
pub use synthetic::SyntheticDoc;
pub use synthetic::generate as generate_synthetic;
Modules§
- dataset
  Labeled retrieval datasets (qrels) and accompanying corpus / answer files.
- error
  Error types for rig-retrieval-evals.
- harness
  Async driver that executes a Qrels against a VectorStoreIndexDyn and aggregates per-query metric scores.
- report
  Aggregation, serialization, and baseline diffing of per-query metric scores produced by crate::harness::RetrievalHarness.
- retrieval
  Retrieval-quality metrics.
- retriever
  A retriever abstraction over anything that maps a query to a ranked list of documents, not just vector stores.
- staleness
  Stale-content and conflict detection for retrieval results.
- synthetic
  Deterministic synthetic “needle in a haystack” dataset generation.
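To make the retriever idea concrete, the sketch below shows how a non-vector backend can sit behind a "query in, ranked documents out" interface so that one scoring function covers every backend. Everything here is hypothetical: `RankedRetriever`, `TermOverlapRetriever`, and `hit_at_k` are illustrative names, the trait is synchronous for brevity, and it is not the signature of retriever::Retriever, which this page does not show. The toy scorer counts shared terms and is a stand-in for a real lexical ranker such as BM25.

```rust
use std::collections::HashSet;

/// Hypothetical seam: anything that turns a query into a ranked (id, score) list.
trait RankedRetriever {
    fn retrieve(&self, query: &str, k: usize) -> Vec<(String, f64)>;
}

/// Toy lexical backend holding (document id, text) pairs.
struct TermOverlapRetriever {
    docs: Vec<(String, String)>,
}

/// Lowercased alphanumeric tokens of a text, deduplicated.
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

impl RankedRetriever for TermOverlapRetriever {
    fn retrieve(&self, query: &str, k: usize) -> Vec<(String, f64)> {
        let query_terms = tokens(query);
        let mut scored: Vec<(String, f64)> = self
            .docs
            .iter()
            .map(|(id, text)| {
                let overlap = tokens(text).intersection(&query_terms).count();
                (id.clone(), overlap as f64)
            })
            .filter(|(_, score)| *score > 0.0)
            .collect();
        // Highest score first; break ties by id so the ranking is deterministic.
        scored.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        scored.truncate(k);
        scored
    }
}

/// Backend-agnostic check: is the relevant document in the top k?
fn hit_at_k(retriever: &dyn RankedRetriever, query: &str, relevant_id: &str, k: usize) -> bool {
    retriever
        .retrieve(query, k)
        .iter()
        .any(|(id, _)| id == relevant_id)
}

fn main() {
    let retriever = TermOverlapRetriever {
        docs: vec![
            ("d1".into(), "Rust async runtime internals".into()),
            ("d2".into(), "BM25 lexical ranking for search".into()),
            ("d3".into(), "Hybrid search combines lexical and vector ranking".into()),
        ],
    };
    for (id, score) in retriever.retrieve("lexical ranking", 10) {
        println!("{id}: {score}");
    }
    println!("hit@1 for d3: {}", hit_at_k(&retriever, "lexical ranking", "d3", 1));
}
```

Because `hit_at_k` depends only on the trait, a vector-store wrapper implementing the same trait could be scored by the identical code path.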