//! Tokenizer support — granular, capability-named feature gates.
//!
//! The public module API is composed of independently selectable,
//! minimal-dependency features. Each is named `tokenizer` or carries the
//! `tokenizer-` prefix; the umbrella features `embeddings` / `lm` / `vlm` /
//! `audio` only combine them:
//!
//! - `tokenizer` — [`Tokenizer`](crate::tokenizer::Tokenizer) load +
//! `encode`/`decode` + special tokens read from `tokenizer.json`. Pulls
//! **only** the `tokenizers` crate (no `serde_json`, no `minijinja`).
//! - `tokenizer-config` — parse `tokenizer_config.json` (bos/eos/unk,
//! `chat_template`, added tokens). Adds `serde_json`.
//! - `tokenizer-stream` — `StreamingDetokenizer` +
//! `NaiveStreamingDetokenizer` (no `serde_json`).
//! - `tokenizer-gpt2` — GPT-2 bytes↔unicode table (committed
//! `cargo xtask-codegen` artifact).
//! - `tokenizer-bpe` — GPT-2 byte-level streaming detok + decoder-class
//! inference (pulls `tokenizer-stream` + `tokenizer-gpt2`).
//! - `tokenizer-spm` — SentencePiece (+no-space) streaming detok +
//! decoder-class inference.
//! - `tokenizer-chat` — jinja `apply_chat_template` (pulls `tokenizer-config`).
//! - `tokenizer-deepseek-v32` — the one shipped chat-template override.
//! - `tokenizer-tools` — tool-call parsers + `infer_tool_parser`.
//!
//! When a model's `tokenizer.json` decoder calls for the SPM or BPE streaming
//! detokenizer but that feature is disabled, the
//! [`Tokenizer::detokenizer`](crate::tokenizer::Tokenizer::detokenizer)
//! factory falls back to the naive detokenizer and warns once. It never
//! panics or hard-errors.
//!
//! This module is a faithful Rust port of `mlx-lm`'s tokenizer surface,
//! cross-referenced against `mlx-swift-lm`'s `MLXLMCommon` abstractions.
//! Intentionally **not** ported: the Python `NewlineTokenizer` and
//! `AutoTokenizer.register(...)`, because model-specific tokenizer
//! registration belongs to each model architecture and is out of scope.
//! Loading is local-path only; there is no Hugging Face Hub network download.
/// Committed codegen tables (`@generated by cargo xtask-codegen` from
/// `mlxrs/data/tokenizer/`). Replaces the old `build.rs` + `OUT_DIR`
/// `include!`s so a normal build never compiles `tokenizers`/`serde_json`/
/// `toml`. Only compiled when a consuming capability feature is enabled.
/// SentencePiece Unigram / BPE tokenizer (protobuf reader + Viterbi
/// lattice + byte-fallback). Standalone, self-contained — uses neither
/// the `tokenizers` crate nor [`wrapper::Tokenizer`]; loads from a raw
/// `*.model` protobuf or the JSON `tokenizer.json` model subtree (with
/// `tokenizer-config`). Gated under `audio` for now — the first caller
/// is `crate::audio::stt::streaming`; promote to a standalone feature
/// when a non-audio caller needs it.
pub use ;
/// Re-export the SPM Unigram/BPE tokenizer top-level surface — gated
/// under `audio` for now (see [`sentencepiece`] for the rationale).
pub use ;
/// SPM/BPE streaming detokenizers. Each is gated on its own feature; the
/// naive detokenizer + trait come with bare `tokenizer-stream`.
pub use BpeStreamingDetokenizer;
pub use SpmStreamingDetokenizer;
/// Decoder-class inference needs `serde_json` (parses the `decoder` node), so
/// it is available only when `tokenizer-spm` or `tokenizer-bpe` is enabled.
pub use infer_detokenizer_class;
pub use ;
/// Tool-call parsing (`tokenizer-tools`) — the per-format parsers, the
/// streaming [`tools::ToolCallProcessor`], and the selectors.
pub use ;
pub use ;