1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
//! Unicode TR39 confusable character mappings (multi-target).
//!
//! Auto-generated from confusables.txt by scripts/gen_confusables.py.
//! Version: 17.0.0
//!
//! Contains ~2,063 non-Latin → Latin mappings and ~1,369 non-Cyrillic →
//! Cyrillic mappings. Uses compile-time perfect hash maps (`phf`) for O(1)
//! lookups. Covers Cyrillic, Greek, Armenian, Georgian, CJK compatibility,
//! mathematical symbols, fullwidth forms, and other confusable characters.
//!
//! PHF maps generated by build.rs from src/tables/data/confusables_to_*.tsv.
//! To update the data, regenerate with: python scripts/gen_confusables.py
// Non-Latin → Latin confusable mappings.
include!;
// Non-Cyrillic → Cyrillic confusable mappings.
include!;
/// Look up a confusable mapping for a character to the target script.
///
/// Returns the target-script equivalent if the character is a known
/// confusable, or None if it is not.
///
/// Supported target scripts: `"latin"`, `"cyrillic"`.
/// Resolve a `target_script` to its confusables PHF map, once.
///
/// Lets callers hoist the `match target_script` out of a per-character loop
/// (#236 / #233 review item) and probe the resolved map directly. Returns
/// `None` for an unknown script.
///
/// Note: there is intentionally **no** ASCII fast path built on top of this —
/// the latin table maps ASCII source code points (e.g. U+007C `|`→`l`,
/// U+0022 `"`→`''`, U+0060 `` ` ``→`'`), so ASCII input is *not* identity even
/// for `target="latin"`.