gramdex
gramdex provides small, dependency-light primitives for approximate string matching:
- Unicode-scalar (Rust char) k-gram generation
- A minimal grams → document-ids index (GramDex) for candidate generation
- An exact (verification) trigram Jaccard helper (trigram_jaccard)
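To make the first and third primitives concrete, here is a minimal std-only sketch of char-level k-gram generation and trigram Jaccard similarity. The names kgrams_sketch and trigram_jaccard_sketch are hypothetical and are not the crate's API; the crate's own char_kgrams and trigram_jaccard may differ in signature and in edge-case behavior (for example, how strings shorter than three chars are scored).

```rust
use std::collections::HashSet;

/// All overlapping k-grams of `text`, counted in Unicode scalar values
/// (`char`), not bytes. Hypothetical helper, not the crate's `char_kgrams`.
fn kgrams_sketch(text: &str, k: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    if k == 0 || chars.len() < k {
        return Vec::new();
    }
    chars
        .windows(k)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B| over the sets of distinct trigrams.
/// Assumption of this sketch: if neither string has any trigram, return 0.0.
fn trigram_jaccard_sketch(a: &str, b: &str) -> f64 {
    let sa: HashSet<String> = kgrams_sketch(a, 3).into_iter().collect();
    let sb: HashSet<String> = kgrams_sketch(b, 3).into_iter().collect();
    let union = sa.union(&sb).count();
    if union == 0 {
        return 0.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}
```

For "hello" and "hallo" the trigram sets are {hel, ell, llo} and {hal, all, llo}: one shared trigram out of five distinct ones, so the sketch returns 0.2.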
Quickstart
    [dependencies]
    gramdex = "0.1.0"
    use gramdex::{trigram_jaccard, GramDex};

    let mut ix = GramDex::new();
    ix.add_document_trigrams(/* … */);
    ix.add_document_trigrams(/* … */);

    let candidates = ix.candidates_union_trigrams(/* … */);
    let mut verified: Vec<_> = candidates
        .into_iter()
        .filter(/* … */)
        .collect();
    verified.sort_unstable();
    assert_eq!(/* … */);
Best starting points
- Gram generation: char_kgrams / char_trigrams
- Candidate index: GramDex (union candidates, scored candidates, bailout planning)
- Verification: trigram_jaccard
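As a rough illustration of what a grams → document-ids index does, the sketch below builds a toy inverted index and derives union candidates and scored candidates from it. SketchIndex and its methods are hypothetical names chosen for this example; they are not GramDex, and the u32 document ids and the shared-trigram-count score are assumptions rather than the crate's actual types or scoring. Bailout planning is not modeled here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical toy index (not the crate's `GramDex`): maps each trigram to
/// the ids of the documents that contain it.
#[derive(Default)]
struct SketchIndex {
    postings: HashMap<String, HashSet<u32>>,
}

/// Distinct char-level trigrams of `text`.
fn trigrams(text: &str) -> HashSet<String> {
    let chars: Vec<char> = text.chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

impl SketchIndex {
    fn add_document(&mut self, id: u32, text: &str) {
        for gram in trigrams(text) {
            self.postings.entry(gram).or_default().insert(id);
        }
    }

    /// Scored candidates: (doc id, number of distinct query trigrams shared),
    /// highest count first, ties broken by ascending id.
    fn candidates_scored(&self, query: &str) -> Vec<(u32, usize)> {
        let mut counts: HashMap<u32, usize> = HashMap::new();
        for gram in trigrams(query) {
            if let Some(ids) = self.postings.get(&gram) {
                for &id in ids {
                    *counts.entry(id).or_insert(0) += 1;
                }
            }
        }
        let mut scored: Vec<(u32, usize)> = counts.into_iter().collect();
        scored.sort_unstable_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
        scored
    }

    /// Union candidates: every document sharing at least one trigram with
    /// the query, in ascending id order.
    fn candidates_union(&self, query: &str) -> Vec<u32> {
        let mut ids: Vec<u32> = self
            .candidates_scored(query)
            .into_iter()
            .map(|(id, _)| id)
            .collect();
        ids.sort_unstable();
        ids
    }
}
```

Union candidates favor recall: a single shared trigram is enough to surface a document, so a separate verification pass is still needed to discard weak matches. The scored variant gives that pass a cheap ordering to work through.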
Design notes
- This crate focuses on candidate generation; you bring your own verification policy.
- Offsets/spans are naturally expressed in Unicode scalar values (char count), not bytes.
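Because Rust string slicing works on byte offsets, a span expressed in chars has to be converted before it can be used to slice a &str. The helper below is a hypothetical std-only sketch of that conversion; it is not part of the crate.

```rust
use std::ops::Range;

/// Convert a span given in `char` offsets into the matching byte range of
/// `text`. Hypothetical helper for illustration. Returns `None` if the span
/// is reversed or extends past the end of the string.
fn char_span_to_byte_range(text: &str, start: usize, end: usize) -> Option<Range<usize>> {
    if start > end {
        return None;
    }
    // Byte offset of every char boundary, including the end of the string.
    let boundaries: Vec<usize> = text
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(text.len()))
        .collect();
    Some(*boundaries.get(start)?..*boundaries.get(end)?)
}
```

In "héllo" the é occupies two bytes, so the string has 5 chars but 6 bytes, and the char span 1..4 ("éll") corresponds to the byte range 1..5.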
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.