opendict-rs
A Rust library for reading StarDict and MDict dictionary files through a unified API.
Quick start
use ;
// Auto-detects format from directory contents
let dict = open?;
println!; // dictionary title
println!; // number of entries
if let Some = dict.lookup?
let matches = dict.search_prefix; // prefix search
let words = dict.word_list; // all headwords
Format-specific APIs
use StarDictDictionary;
use MdictDictionary;
// StarDict — synonyms
let sd = open_dir?;
if let Some = sd.lookup_synonym?
// MDict — resource files (.mdd)
let md = open?;
let css = md.lookup_resource; // CSS, images, fonts
Format support
StarDict
| Feature | Status |
|---|---|
| .ifo metadata (v2.4.2, v3.0.0) | Supported |
| .idx binary index (32-bit and 64-bit offsets) | Supported |
| .dict data (all type identifiers, sametypesequence) | Supported |
| .syn synonym files | Supported |
| .idx.gz and .dict.dz compressed files | Supported |
| stardict_strcmp sort order | Supported |
| Tree dictionaries (.tdx) | Not supported |
| Resource storage (res.rifo/ridx/rdic) | Not supported |
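For readers unfamiliar with the stardict_strcmp ordering listed above: the StarDict format documentation describes it as an ASCII case-insensitive comparison that falls back to a plain byte comparison on ties. The following is a minimal sketch of that rule, not the code used inside opendict-rs; the function body is an illustration written for this README section.

```rust
use std::cmp::Ordering;

/// Sketch of the StarDict index ordering: compare with ASCII letters folded
/// to lowercase first, then break ties with an exact byte comparison.
/// Illustrative only; not taken from the opendict-rs source.
pub fn stardict_strcmp(a: &str, b: &str) -> Ordering {
    let folded_a = a.bytes().map(|c| c.to_ascii_lowercase());
    let folded_b = b.bytes().map(|c| c.to_ascii_lowercase());
    // Iterator::cmp is lexicographic, so a strict prefix sorts first.
    folded_a
        .cmp(folded_b)
        .then_with(|| a.as_bytes().cmp(b.as_bytes()))
}
```

Sorting `["banana", "Apple", "apple", "Cherry"]` with this comparator yields `Apple`, `apple`, `banana`, `Cherry`, whereas a plain byte sort would put both capitalised words first. Any binary search over a .idx file has to use the same ordering the index was written with.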
MDict
| Feature | Status |
|---|---|
| v2.0 and v3.0 formats | Supported |
| Keyword index encryption (RIPEMD-128 for v2, xxhash64 for v3) | Supported |
| Per-block nibble-swap XOR decryption | Supported |
| zlib and LZO compression | Supported |
| Encodings: UTF-8, UTF-16LE, GBK/GB2312/GB18030, Big5 via encoding_rs | Supported |
| Adler32 checksum verification | Supported |
| .mdd resource files (CSS, images, fonts) | Supported |
| v1.2 format | Not supported |
| Keyword header encryption (Salsa20) | Not supported |
| StyleSheet / Compact mode | Not supported |
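To make the "nibble-swap XOR" row above concrete, here is a minimal sketch of the byte-level transform as it appears in the mdict-analysis reference implementation, to the best of my reading: each ciphertext byte has its nibbles swapped and is then XORed with the previous ciphertext byte, the low byte of its index, and a repeating key. The function names and the 0x36 starting value are assumptions based on that reference, not a copy of the opendict-rs source, and key derivation (RIPEMD-128 or xxhash64) is left out entirely.

```rust
/// Decrypt `data` in place. Sketch only: names and the 0x36 seed are
/// assumptions drawn from the mdict-analysis reference, not opendict-rs code.
pub fn nibble_swap_xor_decrypt(data: &mut [u8], key: &[u8]) {
    assert!(!key.is_empty(), "key must not be empty");
    let mut previous: u8 = 0x36; // initial chaining value (assumed)
    for (i, byte) in data.iter_mut().enumerate() {
        let cipher = *byte;
        // rotate_left(4) on a u8 exchanges the high and low nibbles.
        let swapped = cipher.rotate_left(4);
        // `i as u8` keeps only the low 8 bits of the index.
        *byte = swapped ^ previous ^ (i as u8) ^ key[i % key.len()];
        previous = cipher; // chain on the ciphertext, not the plaintext
    }
}

/// Inverse of the function above, included so the sketch can be round-tripped.
pub fn nibble_swap_xor_encrypt(data: &mut [u8], key: &[u8]) {
    assert!(!key.is_empty(), "key must not be empty");
    let mut previous: u8 = 0x36;
    for (i, byte) in data.iter_mut().enumerate() {
        let mixed = *byte ^ previous ^ (i as u8) ^ key[i % key.len()];
        *byte = mixed.rotate_left(4);
        previous = *byte;
    }
}
```

Because each step chains on the previous ciphertext byte, decryption has to run front to back over the whole block; it cannot start in the middle without knowing the byte just before.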
Both formats
- Memory-mapped I/O for dictionary data
- Binary search with prefix search
- Single-entry block cache to avoid redundant decompression (MDict)
- Optional disk caching of .dict.dz/.idx.gz decompression (StarDict)
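The "binary search with prefix search" item can be sketched roughly as follows: find the first headword that is not less than the prefix, then walk forward while headwords still start with it. This is a simplified illustration over an in-memory slice sorted in plain byte order; the free function and its signature are hypothetical, and the real index may use a different ordering (for example stardict_strcmp) and a different storage layout.

```rust
/// Return up to `limit` headwords starting with `prefix`.
/// `sorted` must be in ascending byte order (Rust's default `str` ordering).
/// Hypothetical helper for illustration; not the opendict-rs API.
pub fn search_prefix<'a>(sorted: &[&'a str], prefix: &str, limit: usize) -> Vec<&'a str> {
    // Binary search: index of the first word that is >= prefix.
    let start = sorted.partition_point(|word| *word < prefix);
    sorted[start..]
        .iter()
        // All matches are contiguous in a sorted list, so stop at the first miss.
        .take_while(|word| word.starts_with(prefix))
        .take(limit)
        .copied()
        .collect()
}
```

With the list `["hat", "heap", "hello", "help", "helper", "hero", "zebra"]`, calling `search_prefix(&words, "hel", 10)` returns `hello`, `help`, and `helper`. The cost is one binary search plus one step per returned match, so it does not depend on how many unrelated entries the dictionary holds.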
Bindings
Node.js
Native addon via napi-rs. Auto-detects dictionary format.
import from '@opendict-rs/node'
const dict =
console.log
console.log
const entries = dict.
if
const matches = dict.
Build from source:
&& &&
Expo (React Native)
Native module via uniffi + Expo Modules. Supports iOS and Android.
import { Dictionary } from '@opendict-rs/expo'
const dict = new Dictionary('/path/to/dictionary-dir')
console.log(dict.info.name)
console.log(dict.wordCount())
const entries = dict.lookup('hello')
const matches = dict.searchPrefix('hel', 10)
dict.close() // free native resources when done
Requires cross-compiling the Rust library for your target platform. See examples/expo-test for a complete example app.
Performance
Release mode, benchmarked against real dictionaries.
Load time
| Dictionary | Format | Words | Load |
|---|---|---|---|
| Langdao CN-EN | StarDict | 405k | 10 ms |
| Modern Chinese | StarDict | 58k | 1 ms |
| Korean-English | StarDict | 50k | 2 ms |
| Spanish-English | StarDict | 99k | 21 ms |
| CN-EN (xinshiji) | MDict | 137k | 34 ms |
| Oxford OED | MDict | 277k | 64 ms |
| JP Names | MDict | 67k | 22 ms |
Lookup (sequential, all words)
| Dictionary | Format | ns/word |
|---|---|---|
| Langdao CN-EN | StarDict | 381 |
| Modern Chinese | StarDict | 437 |
| Korean-English | StarDict | 955 |
| Spanish-English | StarDict | 436 |
| CN-EN (xinshiji) | MDict | 1,479 |
| Oxford OED | MDict | 8,649 |
| JP Names | MDict | 919 |
Prefix search runs in 1-4 µs across all dictionaries.
MDict with mmap instead of reading the file into a Vec, measured on the Oxford OED (277k words, ~230 MB .mdx): RSS dropped from 234 MB to 24 MB (-90%) and load time from 101 ms to 64 ms (-37%).
Tests
Dependencies
= "1" # zlib decompression
= "0.9" # memory-mapped I/O
encoding_rs = "0.8" # character encoding (GBK, Big5, etc.)
= "2" # adler32 checksums
= "0.2" # LZO decompression
= "0.8" # v3 key derivation
= "0.4" # optional warning logs
Contributing
Contributions are very welcome — bug fixes, performance improvements, docs, tests, additional fixtures, all of it.
The biggest gap is support for more dictionary formats (Lingoes,
DSL, Babylon BGL, XDXF, plain text). I'd like opendict-rs to grow into
a genuinely format-agnostic reader, but I'm unlikely to get to those any
time soon — if you need one, opening a PR is the fastest way to make it
happen. The existing StarDict and MDict modules under src/ are good
templates for how a new format slots in behind the Dictionary trait.
See CONTRIBUTING.md for the repo layout, build steps for each binding (Rust core, Node, Expo), and the rule for keeping the bindings thin.
Acknowledgements
The MDict implementation was built with reference to:
- mdict-analysis — Xiaoqiang Wang's Python analysis of the MDict format
- mdict — Jeka Kiselyov's JavaScript MDict reader
- writemdict — Zhanshi Liu's Python MDict writer (useful for understanding the binary format)
License
MIT