wubi — self-developed Wubi 86 encoder + dictionary.
Phase 1 deliverable: a self-contained, zero-allocation encoder
(encode_into, encode) backed by a pluggable lookup closure.
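The idea of a zero-allocation encoder driven by a pluggable lookup closure can be sketched as follows. This is a minimal illustration, not the crate's implementation: `encode_into_sketch`, `SketchError`, and `toy_lookup` are hypothetical names, the real signatures of `encode_into` and `encode_with_lookup` may differ, and the sketch omits the identification code that Wubi 86 appends when a character has fewer than four roots.

```rust
/// Error type for this sketch only (hypothetical; not the crate's `EncodeError`).
#[derive(Debug, PartialEq)]
enum SketchError {
    Empty,
    UnknownRoot(char),
}

/// Writes up to four code letters into the caller's buffer and returns how
/// many were written. Nothing is allocated on the heap: the output lives in
/// `out`, and the root-to-letter table is whatever `lookup` closes over.
fn encode_into_sketch<F>(
    roots: &[char],
    lookup: F,
    out: &mut [u8; 4],
) -> Result<usize, SketchError>
where
    F: Fn(char) -> Option<u8>,
{
    if roots.is_empty() {
        return Err(SketchError::Empty);
    }
    // Up to four roots: take them all. More than four: the first three plus
    // the last one.
    let n = roots.len().min(4);
    for i in 0..n {
        let root = if i == 3 { roots[roots.len() - 1] } else { roots[i] };
        out[i] = lookup(root).ok_or(SketchError::UnknownRoot(root))?;
    }
    Ok(n)
}

/// Toy lookup covering only the four roots of 照 (assumed decomposition:
/// 日 刀 口 灬); a real table would cover the full Wubi 86 root set.
fn toy_lookup(root: char) -> Option<u8> {
    match root {
        '日' => Some(b'j'),
        '刀' => Some(b'v'),
        '口' => Some(b'k'),
        '灬' => Some(b'o'),
        _ => None,
    }
}

fn main() {
    let mut buf = [0u8; 4];
    let n = encode_into_sketch(&['日', '刀', '口', '灬'], toy_lookup, &mut buf).unwrap();
    // Prints "jvko".
    println!("{}", std::str::from_utf8(&buf[..n]).unwrap());
}
```

Because the lookup is a generic `Fn` parameter, the same routine could run against a compile-time table at runtime or against a plain `match` inside `build.rs`, which is presumably why the codec takes a closure rather than a fixed table.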
Public modules:
- codec: pure algorithm + types; no external imports; usable from build.rs via #[path].
- stroke: backward-compat re-exports from codec.
- decomp: owned Decomp and seed-file parser.
- encode: high-level encoder API (uses runtime PHF/HashMap tables).
- zigen: 字根 (root component) → letter lookup.
- jianma: 一级简码 (level-1 short code) lookup.
use wubi::{encode, embedded_seed};
for (ch, decomp) in embedded_seed() {
    let code = encode(&decomp).unwrap();
    println!("{ch}\t{code}");
}
Re-exports
- pub use codec::DecompRef;
- pub use codec::EncodeError;
- pub use codec::Shape;
- pub use codec::Stroke;
- pub use codec::encode_with_lookup;
- pub use decomp::Decomp;
- pub use decomp::embedded_seed;
- pub use dict::L0Snapshot;
- pub use dict::WubiDict;
- pub use dict::PROMOTE_THRESHOLD;
- pub use encode::EncodedCode;
- pub use encode::encode;
- pub use encode::encode_into;
- pub use jianma::iter_jianma1;
- pub use jianma::lookup_jianma1;
- pub use layer::DEFAULT_LAYER_PREFS;
- pub use layer::LAYER_BASE;
- pub use layer::LAYER_COUNT;
- pub use layer::Layer;
- pub use zigen::iter as iter_zigen;
- pub use zigen::lookup as lookup_zigen;
Modules
- codec
- Self-contained Wubi 86 codec — pure algorithm + types, zero external imports.
- decomp
- Owned character-decomposition data (char → Decomp) and a parser for data/seed.txt.
- dict
- FST-backed Wubi dictionary with a two-tier ranking model.
- encode
- High-level encoder API over crate::codec: zero-alloc encode_into plus an ergonomic encode that returns a stack-allocated EncodedCode.
- jianma
- 一级简码 (level-1 short code) lookup, backed by a compile-time phf::Map<u8, char> generated by build.rs from data/jianma1.txt.
- layer
- Layer taxonomy for dictionary entries.
- stroke
- Backward-compatible re-exports from crate::codec.
- zigen
- 字根 (root component) → letter lookup, backed by a compile-time phf::Map<char, u8> generated by build.rs from data/zigen86.txt.
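As an illustration of the kind of key → character table the jianma module exposes, here is a minimal sketch of a level-1 short code lookup. It makes several assumptions: a plain array stands in for the generated phf::Map<u8, char>, the table is the commonly published Wubi 86 level-1 list rather than the contents of data/jianma1.txt, and `lookup_jianma1_sketch` and `iter_jianma1_sketch` are hypothetical names whose signatures may not match the crate's `lookup_jianma1` and `iter_jianma1`.

```rust
/// Level-1 short codes for keys 'a'..='y', indexed by `key - b'a'`.
/// ('z' carries no level-1 code in Wubi 86.)
const JIANMA1: [char; 25] = [
    '工', '了', '以', '在', '有', // a b c d e
    '地', '一', '上', '不', '是', // f g h i j
    '中', '国', '同', '民', '为', // k l m n o
    '这', '我', '的', '要', '和', // p q r s t
    '产', '发', '人', '经', '主', // u v w x y
];

/// Returns the level-1 character for a lowercase ASCII key, if any.
fn lookup_jianma1_sketch(key: u8) -> Option<char> {
    if (b'a'..=b'y').contains(&key) {
        Some(JIANMA1[(key - b'a') as usize])
    } else {
        None
    }
}

/// Iterates over all (key, character) pairs in key order.
fn iter_jianma1_sketch() -> impl Iterator<Item = (u8, char)> {
    JIANMA1
        .iter()
        .enumerate()
        .map(|(i, &ch)| (b'a' + i as u8, ch))
}

fn main() {
    for (key, ch) in iter_jianma1_sketch() {
        println!("{}\t{}", key as char, ch);
    }
}
```

A u8 key keeps the lookup allocation-free and cheap to hash, which fits the phf::Map<u8, char> shape described above; the array here is just the simplest self-contained substitute.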