pub struct FtsToken {
    pub text: String,
    pub position: usize,
    pub kind: TokenKind,
    pub is_stop: bool,
    pub synonyms: Vec<String>,
    pub trigrams: Vec<String>,
    pub pos: Option<PosTag>,
    pub ne: Option<NamedEntityKind>,
    pub confidence: f32,
}
A token produced by the FTS pipeline, ready for lexeme indexing.
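As a rough illustration of "ready for lexeme indexing", the sketch below turns a token sequence into (term, position) pairs. It uses a trimmed local stand-in named `Token` rather than the real `FtsToken`, and `index_terms` is a hypothetical helper, not part of this crate's API. It assumes that stopwords are skipped and that synonyms are indexed at the same position as the token they expand; the crate itself may handle both differently.

```rust
/// Trimmed local stand-in for `FtsToken`; the real struct has more fields.
struct Token {
    text: String,
    position: usize,
    is_stop: bool,
    synonyms: Vec<String>,
}

/// Hypothetical helper: collect (term, position) pairs for indexing.
/// Assumption: stopwords are dropped and synonyms share the token's position.
fn index_terms(tokens: &[Token]) -> Vec<(String, usize)> {
    let mut terms = Vec::new();
    for tok in tokens.iter().filter(|t| !t.is_stop) {
        terms.push((tok.text.clone(), tok.position));
        for syn in &tok.synonyms {
            terms.push((syn.clone(), tok.position));
        }
    }
    terms
}

fn main() {
    let tokens = vec![
        Token {
            text: "the".to_string(),
            position: 0,
            is_stop: true,
            synonyms: vec![],
        },
        Token {
            text: "car".to_string(),
            // Position 1 is skipped here to mimic a whitespace gap.
            position: 2,
            is_stop: false,
            synonyms: vec!["automobile".to_string()],
        },
    ];

    for (term, pos) in index_terms(&tokens) {
        println!("{}\t{}", pos, term);
    }
}
```

Keeping a synonym at the position of its source token is one common way to let phrase queries match either form; whether this pipeline does so is not stated here.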
Fields

text: String
    The token text (owned; may be normalised).

position: usize
    Ordinal position in the token sequence (0-based, with gaps for whitespace).

kind: TokenKind
    Script / category of the original token.

is_stop: bool
    true if this token matches the stopword list.

synonyms: Vec<String>
    Synonym expansions (empty if none are configured or nothing matches).

trigrams: Vec<String>
    Character trigrams, populated only for TokenKind::Unknown tokens.

pos: Option<PosTag>
    Primary part-of-speech tag from the lookup table, or None if the word is not in the table (OOV) or is not a Thai token.

ne: Option<NamedEntityKind>
    Named entity category, or None if the token is not in the NE gazetteer. When set, kind is TokenKind::Named(ne).

confidence: f32
    Segmentation confidence in the range [0.0, 1.0].
    0.0 = Unknown token (no dictionary evidence).
    1.0 = unambiguous high-frequency dictionary match.
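
The field descriptions imply a few relationships between fields. The sketch below checks three of them on a trimmed stand-in struct: confidence lies in [0.0, 1.0], a set ne implies kind is TokenKind::Named(ne), and trigrams are non-empty only for TokenKind::Unknown. The enum variants `Person` and `Thai`, the `Token` stand-in, and `check_invariants` are all hypothetical and exist only for illustration; the crate may not expose or enforce anything like this.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum NamedEntityKind {
    Person, // hypothetical variant, for illustration only
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    Thai, // hypothetical variant, for illustration only
    Unknown,
    Named(NamedEntityKind),
}

/// Trimmed local stand-in for `FtsToken`.
struct Token {
    kind: TokenKind,
    ne: Option<NamedEntityKind>,
    trigrams: Vec<String>,
    confidence: f32,
}

/// Hypothetical check of the relationships described in the field docs.
fn check_invariants(tok: &Token) -> Result<(), String> {
    // NaN also fails this range check.
    if !(0.0..=1.0).contains(&tok.confidence) {
        return Err(format!("confidence {} is outside [0.0, 1.0]", tok.confidence));
    }
    if let Some(ne) = tok.ne {
        if tok.kind != TokenKind::Named(ne) {
            return Err("ne is set but kind is not Named(ne)".to_string());
        }
    }
    if tok.kind != TokenKind::Unknown && !tok.trigrams.is_empty() {
        return Err("trigrams present on a non-Unknown token".to_string());
    }
    Ok(())
}

fn main() {
    let named = Token {
        kind: TokenKind::Named(NamedEntityKind::Person),
        ne: Some(NamedEntityKind::Person),
        trigrams: vec![],
        confidence: 0.9,
    };
    let broken = Token {
        kind: TokenKind::Thai,
        ne: None,
        trigrams: vec!["abc".to_string()],
        confidence: 0.5,
    };
    println!("{:?}", check_invariants(&named));
    println!("{:?}", check_invariants(&broken));
}
```

A check like this is most useful in tests or debug assertions around whatever code constructs tokens, since the fields are public and nothing in the type itself prevents inconsistent values.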