docs.rs failed to build jmdict-fast-0.1.1
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
jmdict-fast
A blazing-fast Japanese dictionary engine
Note: This crate uses bunpo for Japanese conjugation handling. Both crates are part of the same monorepo but are published separately to crates.io.
✨ Features
- 💾 Compile-time indexed data: FST + binary blob for maximum efficiency
- ⚡ Instant lookups: O(log n) exact matching across all writing systems
- Multimodal search: Kanji, kana, and romaji support
- 🦀 Ergonomic Rust API: Usable as a library or binary
- 🪶 Tiny binary: Zero runtime parsing, no allocations during lookup
- 🎯 Memory-mapped: Zero-copy access to all dictionary data
Performance at a Glance
| Metric | Value |
|---|---|
| Index Size | ~888KB (FSTs) |
| Data Size | 16MB binary blob |
| Entries | 22,569 |
| Unique Keys | 24,342 |
| Lookup Speed | O(log n), instant |
| Memory Usage | Memory-mapped, zero allocations |
Quick Start
Building the Dictionary
This creates:
- `OUT_DIR/kanji.fst`: Kanji lookup index
- `OUT_DIR/kana.fst`: Kana lookup index
- `OUT_DIR/romaji.fst`: Romaji lookup index
- `OUT_DIR/entries.bin`: Binary blob with all entries
Using the Library
Search - Prefix
use Dict;
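A minimal sketch of how prefix search over sorted keys can work. The crate's own prefix API is not shown here, so `prefix_search` and the `BTreeMap` index are hypothetical stand-ins rather than the crate's FST-backed implementation. A `BTreeMap` is used only because it, like an FST, keeps keys in sorted order, so all keys sharing a prefix sit next to each other.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a prefix index: sorted key -> entry ID.
/// Assumption: the real crate stores its keys in an FST instead.
fn prefix_search(index: &BTreeMap<String, u32>, prefix: &str) -> Vec<(String, u32)> {
    index
        // Jump to the first key that is >= the prefix.
        .range(prefix.to_string()..)
        // Keys sharing the prefix are contiguous, so stop at the first miss.
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(key, id)| (key.clone(), *id))
        .collect()
}
```

Calling `prefix_search(&index, "neko")` on an index holding `neko`, `nekoze`, and `inu` would return the first two keys with their entry IDs, in sorted order.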
Search - Exact
use Dict;
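As a rough illustration of exact lookup "across all writing systems", the sketch below checks a term against three separate indexes and merges the matching entry IDs. The `Indexes` struct, its `HashMap` fields, and `lookup_exact_ids` are assumptions made for this example; the crate itself uses FST files and returns full entries rather than IDs.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the three per-script indexes.
/// Assumption: each key maps to one or more entry IDs.
struct Indexes {
    kanji: HashMap<String, Vec<u32>>,
    kana: HashMap<String, Vec<u32>>,
    romaji: HashMap<String, Vec<u32>>,
}

impl Indexes {
    /// Look the term up in every index and merge the entry IDs,
    /// removing duplicates and returning them in ascending order.
    fn lookup_exact_ids(&self, term: &str) -> Vec<u32> {
        let mut ids = BTreeSet::new();
        for index in [&self.kanji, &self.kana, &self.romaji] {
            if let Some(found) = index.get(term) {
                ids.extend(found.iter().copied());
            }
        }
        ids.into_iter().collect()
    }
}
```

The caller does not need to know which script the term is written in: a kanji, kana, or romaji string goes through the same function.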
Data Structure
┌─────────────────────┐  ┌─────────────────────┐  ┌─────────────────────┐
│ kanji.fst           │  │ kana.fst            │  │ romaji.fst          │
│ (243KB)             │  │ (257KB)             │  │ (388KB)             │
│                     │  │                     │  │                     │
│ 漢字 → Entry ID     │  │ かな → Entry ID     │  │ romaji → Entry ID   │
└─────────────────────┘  └─────────────────────┘  └─────────────────────┘
           │                        │                        │
           └────────────────────────┼────────────────────────┘
                                    │
                         ┌─────────────────────┐
                         │ entries.bin         │
                         │ (16MB)              │
                         │                     │
                         │ Offset Table        │
                         │ + JSON Entries      │
                         └─────────────────────┘
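To make the "Offset Table + JSON Entries" idea concrete, here is a small sketch of one possible blob layout. The exact on-disk format of `entries.bin` is not documented here, so the layout below (a `u32` count, then one `(offset, len)` pair per entry, then the payload) and the names `build_blob` and `entry_at` are assumptions for illustration only.

```rust
/// Build a blob with a hypothetical layout:
/// [count: u32 LE][(offset: u32 LE, len: u32 LE) * count][payload bytes]
fn build_blob(entries: &[&str]) -> Vec<u8> {
    let mut table: Vec<u8> = Vec::new();
    let mut payload: Vec<u8> = Vec::new();
    for entry in entries {
        table.extend_from_slice(&(payload.len() as u32).to_le_bytes());
        table.extend_from_slice(&(entry.len() as u32).to_le_bytes());
        payload.extend_from_slice(entry.as_bytes());
    }
    let mut blob = (entries.len() as u32).to_le_bytes().to_vec();
    blob.extend_from_slice(&table);
    blob.extend_from_slice(&payload);
    blob
}

/// Read a little-endian u32 at byte position `at`, if it is in bounds.
fn read_u32(blob: &[u8], at: usize) -> Option<u32> {
    let end = at.checked_add(4)?;
    let b = blob.get(at..end)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Return the JSON text of entry `id` by borrowing from the blob (no copy).
fn entry_at(blob: &[u8], id: usize) -> Option<&str> {
    let count = read_u32(blob, 0)? as usize;
    if id >= count {
        return None;
    }
    let slot = 4 + id * 8; // skip the count, then 8 bytes per table slot
    let offset = read_u32(blob, slot)? as usize;
    let len = read_u32(blob, slot + 4)? as usize;
    let start = 4 + count * 8 + offset; // payload begins after the table
    let end = start.checked_add(len)?;
    std::str::from_utf8(blob.get(start..end)?).ok()
}
```

Because `entry_at` only slices into the byte buffer, the same code would work unchanged on a memory-mapped file exposed as `&[u8]`.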
API Reference
- `Dict::load<P: AsRef<Path>>(base_dir: P) -> Result<Self>`: Loads the dictionary from the specified directory.
- `dict.lookup_exact(term: &str) -> Vec<Entry>`: Performs exact lookup across all writing systems.
Entry Structure:
🛠️ Development
Caching System
The build script implements a robust caching system to avoid re-downloading the large JMdict dataset. See CACHING.md and CACHE_QUICK_REFERENCE.md for details.
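The core idea of such a cache can be sketched in a few lines: reuse the file if it is already on disk, otherwise fetch it once and store it. This is not the build script's actual logic (see the linked documents for that); `load_or_fetch` and the `fetch` closure are hypothetical names, and the closure stands in for whatever download step the real script performs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Return the cached bytes if `cache_path` exists; otherwise call `fetch`,
/// store its result at `cache_path`, and return it.
/// Assumption: `fetch` wraps the expensive download step.
fn load_or_fetch<F>(cache_path: &Path, fetch: F) -> io::Result<Vec<u8>>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    if cache_path.exists() {
        // Cache hit: skip the download entirely.
        return fs::read(cache_path);
    }
    let data = fetch()?;
    if let Some(parent) = cache_path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(cache_path, &data)?;
    Ok(data)
}
```

A production cache would usually also validate the file (for example by size or checksum) before trusting it; this sketch only checks for existence.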
How It Works
- Build Phase: The `build` tool processes the JMdict JSON and creates FST indexes and a binary blob for instant retrieval.
- Runtime Phase: The library provides memory-mapped loading, FST-based lookups, and efficient entry retrieval.
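The split between the two phases can be illustrated with a deliberately simplified model. Instead of an FST, this sketch uses a sorted array with binary search, which is a different technique but shows the same division of labor: do the ordering work once at build time, then answer each lookup in O(log n). The names `build_index` and `lookup` are hypothetical, and the sketch assumes each key maps to a single entry ID.

```rust
/// Build phase: sort (key, entry ID) pairs once so lookups can binary-search.
/// Exact duplicate pairs are dropped.
fn build_index(mut pairs: Vec<(String, u32)>) -> Vec<(String, u32)> {
    pairs.sort();
    pairs.dedup();
    pairs
}

/// Runtime phase: O(log n) exact lookup by binary search on the sorted keys.
/// Assumption: keys are unique; with repeated keys any match may be returned.
fn lookup(index: &[(String, u32)], term: &str) -> Option<u32> {
    index
        .binary_search_by(|(key, _)| key.as_str().cmp(term))
        .ok()
        .map(|pos| index[pos].1)
}
```

An FST additionally shares common prefixes and suffixes between keys, which is how the real indexes stay small; the sorted array here makes no attempt at compression.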
Real Benchmark Results
Criterion (lookup_word.rs), MacBook, Rust 1.70+
lookup_exact 猫 (jmdict-fast)
time: [4.06 µs]
lookup_word 猫 (jmdict)
time: [511.96 µs]
- jmdict-fast is roughly 126x faster (511.96 µs vs. 4.06 µs) than a traditional filter-based approach for exact lookups.
- Both methods are stable, but jmdict-fast is highly optimized for speed and memory.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
MIT License. See LICENSE for details.
Acknowledgments
- JMdict: The source dictionary data; see the [EDRDG Dictionary Licence Statement](https://www.edrdg.org/edrdg/licence.html)
- FST crate: Fast finite state transducer implementation
- 10ten Japanese Reader: For their deinflector implementation
- Rust ecosystem: For making this possible
Built with ❤️ and Rust 🦀