pub struct SpellChecker { /* private fields */ }
Thai spell checker using edit distance and lk82 phonetic ranking.
Candidates are drawn from the built-in 62k-word dictionary and filtered to edit distance ≤ 2. Phonetically similar candidates (matching lk82 code) are ranked above purely orthographic matches.
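The distance filter described above can be sketched as follows. This is a minimal illustration, assuming plain Levenshtein distance counted per Unicode scalar value; the metric and unit the crate actually uses are not stated here, and `edit_distance` and `near_misses` are hypothetical helper names, not part of the crate's API.

```rust
/// Levenshtein distance counted in Unicode scalar values (`char`s), so each
/// Thai consonant, vowel sign, or tone mark counts as one unit.
/// (Assumption: the real checker may use a different metric or unit.)
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the chars of `a` processed so far
    // and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        curr.push(i + 1);
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Keep only the dictionary words within `max_distance` edits of `word`.
fn near_misses<'a>(word: &str, dictionary: &[&'a str], max_distance: usize) -> Vec<&'a str> {
    dictionary
        .iter()
        .copied()
        .filter(|candidate| edit_distance(word, candidate) <= max_distance)
        .collect()
}
```

Under this sketch, "กานข้าว" and "กินข้าว" differ by a single substituted vowel, so the second would pass a threshold of 2 when the first is the query.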
§Examples
use kham_core::spell::SpellChecker;
let checker = SpellChecker::builtin();
// Misspelled word — expect near-miss suggestions
let suggestions = checker.suggestions("กานข้าว", 5);
assert!(suggestions.iter().all(|s| s.edit_distance <= 2));
A correctly spelled word is returned with edit_distance 0 if it is in the dictionary:
use kham_core::spell::SpellChecker;
let checker = SpellChecker::builtin();
let suggestions = checker.suggestions("กิน", 10);
assert!(suggestions.iter().any(|s| s.word == "กิน" && s.edit_distance == 0));
§Implementations
impl SpellChecker
pub fn builtin() -> Self
Create a spell checker backed by the built-in dictionary and TNC frequency table.
Construction loads the TNC frequency map (~106k entries), so reuse the
returned instance rather than calling builtin() on every query.
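One way to follow the reuse advice is to build the checker once and share it. The sketch below uses `std::sync::OnceLock` with a hypothetical stand-in type `Checker` whose frequency values are invented for illustration. Whether the real `SpellChecker` can be stored in a `static` like this (that is, whether it is `Send + Sync`) is an assumption that is not confirmed here.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the costly constructor ran (demonstration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a checker that is expensive to construct.
struct Checker {
    frequencies: HashMap<&'static str, u32>,
}

impl Checker {
    fn builtin() -> Self {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        // Invented sample values standing in for a large frequency table.
        let frequencies = HashMap::from([("กิน", 1200), ("ข้าว", 950)]);
        Checker { frequencies }
    }
}

/// Build the checker on first use and return the same instance afterwards.
fn shared_checker() -> &'static Checker {
    static CHECKER: OnceLock<Checker> = OnceLock::new();
    CHECKER.get_or_init(Checker::builtin)
}
```

Every caller of `shared_checker()` then receives a reference to the same instance, so the constructor cost is paid only once per process.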
§Examples
use kham_core::spell::SpellChecker;
let checker = SpellChecker::builtin();
assert!(!checker.suggestions("สวัดสี", 5).is_empty());
pub fn suggestions(&self, word: &str, max_n: usize) -> Vec<Suggestion>
Return up to max_n spelling suggestions for word.
Only candidates with edit distance ≤ 2 are returned. Results are sorted by phonetic match (lk82), then edit distance, then TNC frequency.
Returns an empty Vec when word is empty or max_n is zero.
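The three-level ordering can be expressed as a single sort key. The following is a sketch under stated assumptions: `Candidate` and its fields are hypothetical, the sample data in any usage is invented, and "higher TNC frequency first" is an assumed reading of the frequency tie-break.

```rust
use std::cmp::Reverse;

/// Hypothetical candidate record; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    word: String,
    phonetic_match: bool,
    edit_distance: usize,
    frequency: u64,
}

impl Candidate {
    fn new(word: &str, phonetic_match: bool, edit_distance: usize, frequency: u64) -> Self {
        Candidate { word: word.to_string(), phonetic_match, edit_distance, frequency }
    }
}

/// Filter to distance <= 2, then order by phonetic match, edit distance,
/// and frequency, and keep at most `max_n` results.
fn rank(mut candidates: Vec<Candidate>, max_n: usize) -> Vec<Candidate> {
    candidates.retain(|c| c.edit_distance <= 2);
    candidates.sort_by_key(|c| {
        (
            Reverse(c.phonetic_match), // phonetic matches first
            c.edit_distance,           // then fewer edits
            Reverse(c.frequency),      // then more frequent words (assumed)
        )
    });
    candidates.truncate(max_n);
    candidates
}
```

With `max_n` set to zero the truncation leaves an empty vector, which is consistent with the documented behaviour for that case.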
§Examples
use kham_core::spell::SpellChecker;
let checker = SpellChecker::builtin();
// Empty input → no suggestions
assert!(checker.suggestions("", 5).is_empty());
// max_n = 0 → no suggestions
assert!(checker.suggestions("กาน", 0).is_empty());
// Results respect the distance threshold
let suggs = checker.suggestions("กิน", 10);
assert!(suggs.iter().all(|s| s.edit_distance <= 2));
pub fn did_you_mean(&self, word: &str) -> Option<String>
Return the single best spelling correction for word, or None if the
word appears correctly in the dictionary (edit distance 0).
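The relationship between a ranked suggestion list and a single correction might look like the sketch below. The `Suggestion` struct here is a hypothetical stand-in that mirrors only the `word` and `edit_distance` fields used in the examples, and returning `None` for an empty list is an assumption rather than documented behaviour.

```rust
/// Hypothetical suggestion record with the two fields used in the examples.
struct Suggestion {
    word: String,
    edit_distance: usize,
}

/// Pick the best correction from an already ranked list of suggestions.
fn did_you_mean(ranked: &[Suggestion]) -> Option<String> {
    // An exact dictionary hit (distance 0) means there is nothing to correct.
    if ranked.iter().any(|s| s.edit_distance == 0) {
        return None;
    }
    // Otherwise the top-ranked suggestion is the correction, if one exists.
    ranked.first().map(|s| s.word.clone())
}
```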
pub fn correct_text(&self, text: &str) -> String
Correct an entire Thai text by replacing Unknown tokens with their best spelling suggestion.
Segments text with the built-in tokenizer, then for every
TokenKind::Unknown token that is at least 2 characters long, looks up
the best suggestion and substitutes it. All other tokens (including known
Thai words, numbers, Latin, punctuation) are passed through unchanged.
Returns the input unchanged when no Unknown tokens are found or no correction candidates exist.
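The replacement pass described above can be sketched over a pre-tokenized input. The `Token` and `TokenKind` types below are simplified, hypothetical stand-ins for the tokenizer's types, `best` stands in for the best-suggestion lookup, and measuring the 2-character minimum in `char`s is an assumption.

```rust
/// Hypothetical token model; the real tokenizer's types may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    Thai,
    Unknown,
    Other,
}

struct Token<'a> {
    text: &'a str,
    kind: TokenKind,
}

/// Rebuild the text, replacing eligible Unknown tokens with `best(token)`.
fn correct_tokens<F>(tokens: &[Token<'_>], best: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::new();
    for token in tokens {
        // Only Unknown tokens of at least 2 characters are looked up.
        let eligible =
            token.kind == TokenKind::Unknown && token.text.chars().count() >= 2;
        let replacement = if eligible { best(token.text) } else { None };
        match replacement {
            Some(fix) => out.push_str(&fix),
            // Known tokens, short tokens, and tokens with no candidate
            // pass through unchanged.
            None => out.push_str(token.text),
        }
    }
    out
}
```

If no token is eligible or `best` never returns a candidate, the output is the concatenation of the original token texts, which matches the "input unchanged" case.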
§Example
use kham_core::spell::SpellChecker;
let checker = SpellChecker::builtin();
// A correctly spelled sentence is expected to pass through unchanged;
// the assertion below only checks that some output is produced.
let out = checker.correct_text("กินข้าวกับปลา");
assert!(!out.is_empty());