§mnem-extract
Statistical, embedding-based entity + relation extraction for mnem.
This crate is the default path for experiment E3 of the GraphRAG research track: it replaces LLM-driven NER with a KeyBERT-style candidate-scoring pass over the chunk embedding that mnem-ingest already computes. That keeps extraction deterministic, fully offline, and cost-free at ingest time.
§Scope
- traits::Extractor - pluggable extractor surface. One default implementation (keybert::KeyBertExtractor) ships with the crate; callers can swap in authored or LLM-backed extractors by implementing the trait themselves.
- keybert::KeyBertExtractor - KeyBERT-style n-gram ranking against a supplied chunk embedding, with MMR (Maximal Marginal Relevance) diversification and deterministic tiebreaks.
- cooccurrence::mine_relations - PMI-weighted co-occurrence relation miner that emits one traits::Relation per sentence-local entity pair whose pointwise mutual information exceeds a configurable threshold.
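As an illustration of the PMI-weighted mining idea in the last item, here is a minimal sketch. It is not the crate's implementation: `PmiEdge` and `mine_pmi_edges` are hypothetical names, and the input shape (one list of entity surface forms per sentence) is an assumption made for the example. Probabilities are estimated as sentence-level frequencies, and PMI is computed as ln(p(a,b) / (p(a) p(b))).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical edge record for this sketch; the crate's real
/// `traits::Relation` type may look different.
#[derive(Debug, Clone, PartialEq)]
pub struct PmiEdge {
    pub left: String,
    pub right: String,
    pub pmi: f64,
}

/// `sentences[i]` lists the entity surface forms found in sentence `i`.
/// Returns one edge per sentence-local pair whose PMI exceeds `threshold`.
pub fn mine_pmi_edges(sentences: &[Vec<&str>], threshold: f64) -> Vec<PmiEdge> {
    let n = sentences.len() as f64;
    // BTreeMap keeps iteration order stable, so the output order is
    // reproducible across runs.
    let mut single: BTreeMap<&str, u32> = BTreeMap::new();
    let mut pair: BTreeMap<(&str, &str), u32> = BTreeMap::new();

    for sentence in sentences {
        // Deduplicate and sort, so an entity counts once per sentence and
        // every pair is stored with its members in lexicographic order.
        let unique: BTreeSet<&str> = sentence.iter().copied().collect();
        let unique: Vec<&str> = unique.into_iter().collect();
        for (i, a) in unique.iter().enumerate() {
            *single.entry(*a).or_insert(0) += 1;
            for b in &unique[i + 1..] {
                *pair.entry((*a, *b)).or_insert(0) += 1;
            }
        }
    }

    let mut edges = Vec::new();
    for ((a, b), joint) in &pair {
        let p_ab = f64::from(*joint) / n;
        let p_a = f64::from(single[a]) / n;
        let p_b = f64::from(single[b]) / n;
        // Pointwise mutual information with the natural logarithm.
        let pmi = (p_ab / (p_a * p_b)).ln();
        if pmi > threshold {
            edges.push(PmiEdge {
                left: (*a).to_string(),
                right: (*b).to_string(),
                pmi,
            });
        }
    }
    edges
}
```

With four sentences in which "Ada" and "Babbage" co-occur twice and "Ada" and "Turing" co-occur once, a threshold of 0.0 keeps only the Ada/Babbage pair, because the Ada/Turing pair co-occurs less often than independence would predict and so has negative PMI.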
§Determinism
Every public extractor in this crate is deterministic: same input
text + same embedder → byte-identical traits::Entity and
traits::Relation streams across runs. The proptest suite under
tests/proptest_determinism.rs enforces this as a first-class
property.
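To make the determinism claim concrete, the following sketch shows one way an MMR selection loop can break score ties deterministically. It is an illustration under stated assumptions, not the crate's code: `mmr_select`, `cosine`, and the `lambda` parameter are hypothetical names, candidates are assumed to arrive with precomputed embeddings, and ties are assumed to be broken by lexicographic order of the candidate text.

```rust
use std::cmp::Ordering;

/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Picks up to `k` candidate phrases by Maximal Marginal Relevance:
/// score = lambda * sim(candidate, chunk)
///       - (1 - lambda) * max sim(candidate, already chosen).
/// Equal scores are resolved by the lexicographically smaller phrase, so
/// the result does not depend on the order of `candidates`.
pub fn mmr_select(
    chunk: &[f32],
    candidates: &[(String, Vec<f32>)],
    k: usize,
    lambda: f32,
) -> Vec<String> {
    let mut remaining: Vec<usize> = (0..candidates.len()).collect();
    let mut chosen: Vec<usize> = Vec::new();

    while chosen.len() < k && !remaining.is_empty() {
        // (position in `remaining`, score) of the best candidate so far.
        let mut best: Option<(usize, f32)> = None;
        for (pos, &idx) in remaining.iter().enumerate() {
            let relevance = cosine(&candidates[idx].1, chunk);
            let redundancy = if chosen.is_empty() {
                0.0
            } else {
                chosen
                    .iter()
                    .map(|&c| cosine(&candidates[idx].1, &candidates[c].1))
                    .fold(f32::NEG_INFINITY, f32::max)
            };
            let score = lambda * relevance - (1.0 - lambda) * redundancy;
            let better = match best {
                None => true,
                // total_cmp gives a total order on f32, NaN included.
                Some((bpos, bscore)) => match score.total_cmp(&bscore) {
                    Ordering::Greater => true,
                    Ordering::Less => false,
                    Ordering::Equal => {
                        candidates[idx].0 < candidates[remaining[bpos]].0
                    }
                },
            };
            if better {
                best = Some((pos, score));
            }
        }
        let (pos, _) = best.expect("remaining is non-empty");
        chosen.push(remaining.remove(pos));
    }

    chosen.iter().map(|&i| candidates[i].0.clone()).collect()
}
```

Using `total_cmp` plus a text-based tiebreak avoids the two usual sources of run-to-run drift in ranking code: partial float comparisons and dependence on input or hash-map iteration order.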
§Non-goals
- No LLM calls. No network. No tokio.
- No training, no fine-tuning: the extractor consumes whatever
mnem_embed_providers::Embedder the caller already configured.
- No HTTP / MCP / CLI wiring lives in this crate;
mnem-ingest exposes the integration and mnem-cli surfaces the flag.
Re-exports§
pub use cooccurrence::CoOccurrenceMiner;
pub use cooccurrence::mine_relations;
pub use keybert::KeyBertExtractor;
pub use traits::Entity;
pub use traits::ExtractionSource;
pub use traits::Extractor;
pub use traits::Relation;
pub use inference::InferenceBudget;
pub use inference::InferenceMethod;
pub use inference::TypedRelation;
pub use trust::AuthorFingerprint;
pub use trust::AuthorRateLimiter;
pub use trust::Candidate;
pub use trust::PPR_AMPLIFICATION_FLOOR;
pub use trust::TrustBoundary;
Modules§
- cooccurrence
- Co-occurrence relation miner - PMI-weighted edges between entities that share a sentence.
- inference
- Optional typed-relation inference (gap 03). Gated behind the
typed-relations Cargo feature. Default OFF per solution.md R3.
- KeyBERT-style statistical keyword / entity extractor.
- traits
- Public traits and value types for mnem-extract.
- trust
- Adversarial trust-boundary gate for opt-in typed-relation
inference (gap 03). Gated behind the
typed-relations Cargo feature. Default OFF.