Struct notmecab::LexerToken

pub struct LexerToken {
    pub left_context: u16,
    pub right_context: u16,
    pub pos: u16,
    pub cost: i16,
    pub original_id: u32,
    pub feature_offset: u32,
    pub start: usize,
    pub end: usize,
    pub kind: TokenType,
}

Fields§

§left_context: u16

Used internally during lattice pathfinding.

§right_context: u16

Used internally during lattice pathfinding.

§pos: u16

I don’t know what this is.

§cost: i16

Used internally during lattice pathfinding.

§original_id: u32

Unique identifier of the specific lexeme realization this is, from the mecab dictionary. Changes between dictionary versions.

§feature_offset: u32

Feed this to read_feature_string to get this token’s “feature” string.

The feature string contains almost all useful information, including things like part of speech, spelling, pronunciation, etc.

The exact format of the feature string is dictionary-specific.

feature_offset is currently !0u32 (i.e. 0xFFFFFFFF) for tokens of the kind TokenType::UNK. Feeding this value to read_feature_string will result in a blank string, not an error.
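As a minimal sketch of the sentinel described above, the following checks whether a `feature_offset` value refers to a real feature string. The constant name `UNK_FEATURE_OFFSET` and the helper `has_feature_string` are hypothetical and not part of the crate; the sentinel value is the one stated above, which is described as the current behavior and may change.

```rust
/// Sentinel value that, per the text above, is currently used for tokens
/// of the kind `TokenType::UNK`. The constant name is an assumption made
/// for this sketch, not an item exported by the crate.
const UNK_FEATURE_OFFSET: u32 = !0u32; // same bit pattern as 0xFFFFFFFF

/// Hypothetical helper: returns true when the offset is not the UNK sentinel,
/// i.e. when looking it up should yield a non-blank feature string.
fn has_feature_string(feature_offset: u32) -> bool {
    feature_offset != UNK_FEATURE_OFFSET
}
```

A caller could use such a check to skip the lookup for unknown tokens, although the lookup is documented as returning a blank string rather than an error in that case.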

§start: usize

Location, in codepoints, of the surface of this LexerToken in the string it was parsed from.

§end: usize

Corresponding ending location, in codepoints, exclusive (i.e. when start+1 == end, the LexerToken’s surface is one codepoint long).
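Because start and end count codepoints rather than bytes, they cannot be used directly to slice a Rust `&str`. The sketch below shows one way to recover a token's surface from the original text; the helper name `surface_of` is hypothetical and not part of the crate.

```rust
/// Hypothetical helper: returns the surface covered by the half-open
/// codepoint range `start..end` of `text`.
///
/// `chars()` iterates over Unicode scalar values, so skipping `start`
/// items and taking `end - start` items matches the codepoint-based
/// offsets described above. An empty or inverted range yields "".
fn surface_of(text: &str, start: usize, end: usize) -> String {
    text.chars()
        .skip(start)
        .take(end.saturating_sub(start))
        .collect()
}
```

For multi-byte text such as Japanese, slicing by byte index with the same numbers would either panic or return the wrong span, which is why the sketch walks `chars()` instead.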

§kind: TokenType

Origin of token. BOS and UNK are virtual origins (“beginning/ending-of-string” and “unknown”, respectively). Normal means it came from the mecab dictionary.

The BOS (beginning/ending-of-string) tokens are stripped away in parse_to_lexertokens.
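The stripping of BOS tokens can be illustrated with stand-in types. The `TokenType` enum and `Token` struct below only mirror the names used on this page and are not the crate's own definitions; `strip_bos` is a hypothetical helper that sketches the kind of filtering described above, not the crate's implementation.

```rust
/// Stand-in for the crate's `TokenType`, listing only the variants
/// named in the text above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TokenType {
    Normal, // came from the mecab dictionary
    UNK,    // virtual origin: unknown
    BOS,    // virtual origin: beginning/ending-of-string
}

/// Stand-in token carrying only the fields this sketch needs.
#[derive(Clone, Debug)]
struct Token {
    start: usize,
    end: usize,
    kind: TokenType,
}

/// Hypothetical helper: drops the BOS tokens and keeps
/// Normal and UNK tokens in their original order.
fn strip_bos(tokens: Vec<Token>) -> Vec<Token> {
    tokens
        .into_iter()
        .filter(|t| t.kind != TokenType::BOS)
        .collect()
}
```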

Trait Implementations§

Returns a copy of the value.
Performs copy-assignment from source.
Formats the value using the given formatter.
