memory-indexer 0.3.1

An in-memory full-text fuzzy search indexer.
# memory-indexer

In-memory multilingual full-text indexer with pinyin-first search plus prefix and fuzzy recall, built for chat memory, note-taking, and local knowledge bases.

## Highlights

- [x] Out-of-the-box CJK support

    - [x] Chinese and pinyin fuzzy search

    - [x] Japanese/Korean n-grams with custom dictionaries

    - [x] mixed-script text supported

- [x] Ranking and routing

    - [x] BM25 with minimum-should-match

    - [x] ASCII queries auto-route exact → pinyin → fuzzy

    - [x] non-ASCII queries use 2/3-gram + Levenshtein fuzzy matching

- [x] Highlight-friendly offsets: UTF-8/UTF-16 positions supported
- [x] Index snapshots: compressed binary format for persistence and fast loading
- [x] Pluggable dictionaries: inject or train Japanese/Hangul dictionaries for better tokenization
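
To make the routing and fuzzy-matching bullets above more concrete, here is a minimal standalone sketch of character n-grams, Levenshtein distance, and an ASCII/non-ASCII routing check. It is an illustration only and does not reflect the crate's internals: `char_ngrams`, `ngram_overlap`, `levenshtein`, `Route`, and `route` are hypothetical names introduced for this example.

```rust
use std::collections::HashSet;

/// Split text into overlapping n-grams of Unicode scalar values (not bytes),
/// so CJK text such as "你好世界" yields "你好", "好世", "世界" for n = 2.
fn char_ngrams(text: &str, n: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    if n == 0 || chars.len() < n {
        return Vec::new();
    }
    chars.windows(n).map(|w| w.iter().collect::<String>()).collect()
}

/// Count distinct n-grams shared by two strings; a cheap candidate filter
/// that can run before the more expensive edit-distance check.
fn ngram_overlap(a: &str, b: &str, n: usize) -> usize {
    let left: HashSet<String> = char_ngrams(a, n).into_iter().collect();
    let right: HashSet<String> = char_ngrams(b, n).into_iter().collect();
    left.intersection(&right).count()
}

/// Classic dynamic-programming Levenshtein distance, computed over chars.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur.push(substitution.min(deletion).min(insertion));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Hypothetical routing decision mirroring the bullets above: ASCII queries
/// go through the exact → pinyin → fuzzy chain, everything else goes to
/// n-gram + Levenshtein matching.
#[derive(Debug, PartialEq)]
enum Route {
    AsciiChain,
    NgramFuzzy,
}

fn route(query: &str) -> Route {
    if query.is_ascii() {
        Route::AsciiChain
    } else {
        Route::NgramFuzzy
    }
}
```

With these helpers, `levenshtein("memry", "memory")` evaluates to 1 (a single inserted character), which is the kind of typo the fuzzy mode is meant to absorb.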

## Quick start

```rust
use memory_indexer::{InMemoryIndex, SearchMode};

let mut index = InMemoryIndex::default();
index.add_doc("kb", "doc-cn", "你好世界 memory-indexer", true);
index.add_doc("kb", "doc-en", "fuzzy search handles typos", true);

// Auto mode picks between exact, pinyin, and fuzzy matching
let hits = index.search_hits("kb", "nihao");

// Explicit modes
let fuzzy = index.search_with_mode("kb", "memry-indexer", SearchMode::Fuzzy);
let pinyin_prefix = index.search_with_mode_hits("kb", "nhs", SearchMode::Pinyin);

// Highlight spans (UTF-16 positions by default)
let spans = index.get_matches("kb", "doc-cn", "nihao");

// Snapshot persistence
let snapshot = index.get_snapshot_data("kb").unwrap();
// index.load_snapshot("kb", snapshot);
```
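
The quick start notes that highlight spans use UTF-16 positions by default, while Rust strings are indexed by UTF-8 byte. The following sketch shows one way to convert between the two when slicing a `&str` with returned offsets. It uses only the standard library, and `utf8_to_utf16_offset` and `utf16_to_utf8_offset` are hypothetical helper names, not part of the crate's API.

```rust
/// Convert a UTF-8 byte offset into `text` to a UTF-16 code-unit offset.
/// Returns `None` if the offset is out of bounds or splits a character.
fn utf8_to_utf16_offset(text: &str, byte_offset: usize) -> Option<usize> {
    if !text.is_char_boundary(byte_offset) {
        return None;
    }
    Some(text[..byte_offset].chars().map(char::len_utf16).sum())
}

/// Convert a UTF-16 code-unit offset to a UTF-8 byte offset into `text`.
/// Returns `None` if the offset is out of bounds or lands inside a
/// surrogate pair.
fn utf16_to_utf8_offset(text: &str, utf16_offset: usize) -> Option<usize> {
    let mut units = 0;
    for (byte_idx, ch) in text.char_indices() {
        if units == utf16_offset {
            return Some(byte_idx);
        }
        if units > utf16_offset {
            // The requested offset fell in the middle of the previous char.
            return None;
        }
        units += ch.len_utf16();
    }
    // The offset may also point at the very end of the string.
    if units == utf16_offset {
        Some(text.len())
    } else {
        None
    }
}
```

For the quick start document `"你好世界 memory-indexer"`, the four Chinese characters occupy 12 UTF-8 bytes but only 4 UTF-16 code units, so byte offset 12 corresponds to UTF-16 offset 4. Characters outside the Basic Multilingual Plane, such as most emoji, take two UTF-16 code units each.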

## Development

-   Tests: `cargo test`
-   Benchmarks: `cargo bench`

## License

> AGPL-3.0-or-later