pub struct RomanizationMap(/* private fields */);
A Thai-word → RTGS-romanization lookup table.
Built from tab-separated data via RomanizationMap::from_tsv.
Lookup is O(log n) in the number of entries, via BTreeMap.
Implementations§
impl RomanizationMap
pub fn from_tsv(data: &str) -> Self
Parse a tab-separated romanization table.
Format: thai_word\trtgs_romanization — one entry per line.
Lines beginning with # and blank lines are skipped.
For duplicate keys, the last entry wins.
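The parsing rules above can be sketched over a plain BTreeMap. This is only an illustration, not the crate's actual implementation: the function name parse_tsv_sketch is hypothetical, and silently ignoring a line that has no tab is an assumption, since the documentation does not say how malformed lines are handled.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the parsing step of `from_tsv`.
/// Builds a Thai-word -> romanization table from tab-separated text.
fn parse_tsv_sketch(data: &str) -> BTreeMap<String, String> {
    let mut table = BTreeMap::new();
    for line in data.lines() {
        // Blank lines and lines beginning with `#` are skipped.
        if line.trim().is_empty() || line.starts_with('#') {
            continue;
        }
        // Assumption: a line without a tab is ignored rather than reported.
        if let Some((thai, rtgs)) = line.split_once('\t') {
            // `insert` overwrites an existing key, so the last duplicate wins.
            table.insert(thai.to_string(), rtgs.to_string());
        }
    }
    table
}
```

Because insert replaces any earlier value for the same key, the "last entry wins" rule falls out of the map type without extra bookkeeping.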
pub fn romanize(&self, word: &str) -> Option<&str>
Look up the RTGS romanization for a pre-segmented Thai word.
Returns None if the word is not in the table.
The returned &str borrows from the map, so a hit involves no allocation or copy.
§Example
use kham_core::romanizer::RomanizationMap;
let map = RomanizationMap::from_tsv("กิน\tkin\n");
assert_eq!(map.romanize("กิน"), Some("kin"));
assert_eq!(map.romanize("xyz"), None);
pub fn romanize_or_raw<'a>(&'a self, word: &'a str) -> &'a str
Return the RTGS romanization for word, or word unchanged if not found.
§Example
use kham_core::romanizer::RomanizationMap;
let map = RomanizationMap::from_tsv("กิน\tkin\n");
assert_eq!(map.romanize_or_raw("กิน"), "kin");
assert_eq!(map.romanize_or_raw("xyz"), "xyz");
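The shared lifetime 'a in the romanize_or_raw signature is needed because the result may borrow from either the map or the input word. A minimal sketch of how both lookups could be written over a BTreeMap<String, String> follows. The free functions lookup and lookup_or_raw are hypothetical names used for illustration, and the private fields of RomanizationMap are assumed, not confirmed, to hold such a map.

```rust
use std::collections::BTreeMap;

/// Hypothetical equivalent of `romanize`: the result borrows from the table,
/// so a hit returns a reference instead of a freshly allocated String.
fn lookup<'a>(table: &'a BTreeMap<String, String>, word: &str) -> Option<&'a str> {
    table.get(word).map(String::as_str)
}

/// Hypothetical equivalent of `romanize_or_raw`: the result borrows from the
/// table on a hit and from `word` on a miss, so both inputs share lifetime 'a.
fn lookup_or_raw<'a>(table: &'a BTreeMap<String, String>, word: &'a str) -> &'a str {
    lookup(table, word).unwrap_or(word)
}
```

Neither function allocates: the fallback path hands back the caller's own slice.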
pub fn romanize_tokens(&self, tokens: &[&str]) -> Vec<String>
Romanize a slice of pre-segmented token strings.
Returns a Vec<String> aligned 1:1 with the input slice. Tokens not
found in the table are returned unchanged (same behaviour as
romanize_or_raw).
§Example
use kham_core::romanizer::RomanizationMap;
let map = RomanizationMap::from_tsv("กิน\tkin\nปลา\tpla\n");
let out = map.romanize_tokens(&["กิน", "ปลา"]);
assert_eq!(out, vec!["kin", "pla"]);
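The fallback behaviour for unknown tokens can be illustrated with a small sketch over a plain BTreeMap. The name romanize_tokens_sketch is hypothetical and the code is an assumption about how such a method might be written, not the crate's source.

```rust
use std::collections::BTreeMap;

/// Hypothetical equivalent of `romanize_tokens`: one output String per input
/// token, in the same order; tokens missing from the table pass through.
fn romanize_tokens_sketch(table: &BTreeMap<String, String>, tokens: &[&str]) -> Vec<String> {
    tokens
        .iter()
        .map(|&tok| {
            table
                .get(tok)
                .map(String::as_str)
                .unwrap_or(tok) // miss: keep the original token
                .to_string() // owned output, unlike the borrowing lookups
        })
        .collect()
}
```

Since the output is aligned 1:1 with the input, a caller can join it with spaces to obtain a romanized phrase, for example out.join(" ").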