//! `inputx-wubi-data` — embedded Wubi 86 IDFv1 dict + lookup helpers
//! for the [`inputx-wubi`](https://crates.io/crates/inputx-wubi)
//! engine, packaged as a publishable stone.
//!
//! Successor to [`inputx-wubi-cement`](https://crates.io/crates/inputx-wubi-cement).
//! Under the v1.5 D11 taxonomy correction (2026-05), "cement" means an
//! application's own source code (your `wubi.rs` / `engine.rs`), not a
//! published crate. The historical `-cement`-suffixed crate is
//! deprecated and re-exports from this crate for backward compatibility.
//!
//! ## What's in the box
//!
//! - [`EMBEDDED_WUBI_IDF`]: IDFv1 binary dict blob with the wubi
//!   `Layer` enum index encoded in `EntryFlags::engine_tag()`
//!   (v1.4.7 sub-phase A4 step 2).
//! - [`wubi_idf_reader`]: process-global `OnceLock<IdfReader>` over
//!   the embedded blob; amortizes the 4 MB parse and SHA-256
//!   verification across the process lifetime.
//! - [`layer_from_idf_tag`]: inverse of `Layer::as_u8`; decodes an
//!   IDF entry's `engine_tag` back into the originating wubi `Layer`.
//! - `table` module: process-global stateful `WubiDict` cache,
//!   per-code lookup helpers (`lookup`, `lookup_with_scores`,
//!   `lookup_with_layer`, `lookup_with_freq_layer`,
//!   `prefix_predictions`, `record_pick`, `export_l0`, `import_l0`),
//!   a rare-CJK toggle (`set_show_rare` / `show_rare`), and a warmup
//!   helper.
//!
//! ## What's NOT here
//!
//! - **Stateful `WubiEngine`** (buffer / `handle_letter` /
//!   auto-commit / commit_index / L0 pin state machine): this is
//!   classified as application cement per the v1.5 D11 correction
//!   and now lives in the Inputx monorepo's
//!   [`inputx-core/src/wubi/engine.rs`](https://github.com/goliajp/inputx/blob/develop/core/crates/inputx-core/src/wubi/engine.rs).
//!   IME implementers copying this stone are expected to bring their
//!   own state machine matching their UI ergonomics.
use std::sync::OnceLock;
use IdfReader;
pub use ;
/// Re-export of the wubi L0 snapshot type so hosts can build /
/// destructure it without depending on the `inputx-wubi` crate
/// directly.
pub use L0Snapshot;
/// Embedded IDFv1 wubi dict blob, sourced from
/// `inputx-wubi-data/data/words.idf` at compile time. Each entry's
/// `EntryFlags::engine_tag()` carries the wubi `Layer` enum index
/// (v1.4.7 sub-phase A4 step 2 schema bump), so cement-side fills
/// can reconstruct `(word, layer, raw_freq)` without re-reading the
/// `inputx_wubi::WubiDict` table.
pub const EMBEDDED_WUBI_IDF: &[u8] =
include_bytes!;
/// Process-global [`IdfReader`] over [`EMBEDDED_WUBI_IDF`]. Parses
/// the header, FST, and entry-table sections of the 4 MB blob once,
/// so the parse cost (a few milliseconds) is paid a single time per
/// process; subsequent `wubi_idf_reader().lookup(code)` calls are
/// O(|code|) FST walks with zero allocation per query.
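///
/// A minimal sketch of the parse-once caching pattern described
/// above, not this crate's actual implementation. `StandInReader`,
/// `DEMO_BLOB`, and `stand_in_reader` are hypothetical names used
/// only for illustration; the real `IdfReader` constructor is not
/// shown in this file and is not assumed here.
///
/// ```rust
/// use std::sync::OnceLock;
///
/// // Hypothetical stand-in for `IdfReader`; it only records the blob size.
/// struct StandInReader {
///     entry_count: usize,
/// }
///
/// impl StandInReader {
///     // Placeholder "parse" step. A real reader would validate the
///     // header, FST, and checksum here.
///     fn parse(blob: &'static [u8]) -> Self {
///         StandInReader { entry_count: blob.len() }
///     }
/// }
///
/// // Hypothetical stand-in for the embedded dictionary blob.
/// static DEMO_BLOB: &[u8] = &[1, 2, 3];
///
/// // Process-global accessor: the closure runs at most once, and
/// // every later call returns the already-initialized value.
/// fn stand_in_reader() -> &'static StandInReader {
///     static READER: OnceLock<StandInReader> = OnceLock::new();
///     READER.get_or_init(|| StandInReader::parse(DEMO_BLOB))
/// }
/// ```
///
/// Under these assumptions, two calls to `stand_in_reader()` return
/// the same `&'static` reference, so the parse cost is paid once even
/// when several threads race on the first call.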
/// Decode an IDF wubi entry's `EntryFlags::engine_tag()` back into
/// the originating `inputx_wubi::Layer` variant. Falls back to
/// `Layer::Auto` on out-of-range bytes (defensive — the writer only
/// emits 0..=5).
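///
/// A minimal sketch of this decode-with-fallback shape, assuming a
/// six-variant enum. `DemoLayer`, its `Slot1`..`Slot5` variants, the
/// numeric value given to `Auto`, and `demo_layer_from_tag` are all
/// placeholders; only the `Auto` fallback and the `0..=5` range come
/// from the description above.
///
/// ```rust
/// // Hypothetical stand-in for `inputx_wubi::Layer`.
/// #[derive(Debug, Clone, Copy, PartialEq, Eq)]
/// #[repr(u8)]
/// enum DemoLayer {
///     Auto = 0, // assumed index; the real mapping is not shown here
///     Slot1 = 1,
///     Slot2 = 2,
///     Slot3 = 3,
///     Slot4 = 4,
///     Slot5 = 5,
/// }
///
/// impl DemoLayer {
///     // Forward mapping: enum variant to its tag byte.
///     fn as_u8(self) -> u8 {
///         self as u8
///     }
/// }
///
/// // Reverse mapping: tag byte back to a variant, with any
/// // out-of-range byte falling back to `Auto`.
/// fn demo_layer_from_tag(tag: u8) -> DemoLayer {
///     match tag {
///         0 => DemoLayer::Auto,
///         1 => DemoLayer::Slot1,
///         2 => DemoLayer::Slot2,
///         3 => DemoLayer::Slot3,
///         4 => DemoLayer::Slot4,
///         5 => DemoLayer::Slot5,
///         _ => DemoLayer::Auto,
///     }
/// }
/// ```
///
/// An explicit `match` keeps the decoder total over all 256 byte
/// values without `unsafe` transmutes, at the cost of updating one
/// arm whenever a variant is added.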