Struct rust_tokenizers::TokensWithOffsets
pub struct TokensWithOffsets {
    pub tokens: Vec<String>,
    pub offsets: Vec<Option<Offset>>,
    pub reference_offsets: Vec<Vec<OffsetSize>>,
    pub masks: Vec<Mask>,
}
Tokenized sequence
Intermediate result of tokenization, before encoding, addition of special tokens, and truncation.
Fields
tokens: Vec<String>
Vector of token strings
offsets: Vec<Option<Offset>>
Offset information (as start and end positions) relative to the original text. Tokens that cannot be related to the original source are registered as None.
reference_offsets: Vec<Vec<OffsetSize>>
Offset information (as a sequence of positions) relative to the original text. The element type is Vec<OffsetSize>, which has no None value, so a token that cannot be related to the original source presumably carries an empty vector of positions.
masks: Vec<Mask>
Mask values describing the type of each token. This vector has the same length as tokens.
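The four fields are parallel vectors, so entry i of each one describes the same token. The sketch below illustrates that layout and how an offset can be mapped back to the source text. It does not use the crate itself: the `Offset`, `OffsetSize`, and `Mask` definitions are simplified local stand-ins, and `sample`, `is_aligned`, and `source_spans` are hypothetical helpers written for this illustration. It also assumes that offsets count characters and that `end` is exclusive.

```rust
// Local stand-ins mirroring the documented field layout. The real
// definitions live in the rust_tokenizers crate and may differ.
type OffsetSize = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Offset {
    begin: OffsetSize,
    end: OffsetSize,
}

// Reduced, illustrative set of mask values (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mask {
    None,
    Begin,
    Continuation,
}

struct TokensWithOffsets {
    tokens: Vec<String>,
    offsets: Vec<Option<Offset>>,
    reference_offsets: Vec<Vec<OffsetSize>>,
    masks: Vec<Mask>,
}

/// Hand-built value for the text "hello worlds" (hypothetical output,
/// not produced by a real tokenizer).
fn sample() -> TokensWithOffsets {
    TokensWithOffsets {
        tokens: vec!["hello".into(), "world".into(), "##s".into()],
        offsets: vec![
            Some(Offset { begin: 0, end: 5 }),
            Some(Offset { begin: 6, end: 11 }),
            Some(Offset { begin: 11, end: 12 }),
        ],
        reference_offsets: vec![(0..5).collect(), (6..11).collect(), vec![11]],
        masks: vec![Mask::None, Mask::Begin, Mask::Continuation],
    }
}

/// Checks that all four vectors have one entry per token.
fn is_aligned(t: &TokensWithOffsets) -> bool {
    let n = t.tokens.len();
    t.offsets.len() == n && t.reference_offsets.len() == n && t.masks.len() == n
}

/// Returns the piece of `text` each token maps back to, or `None` for
/// tokens without an offset. Assumes character positions, exclusive end.
fn source_spans(text: &str, t: &TokensWithOffsets) -> Vec<Option<String>> {
    t.offsets
        .iter()
        .map(|o| {
            o.as_ref().map(|off| {
                text.chars()
                    .skip(off.begin as usize)
                    .take(off.end.saturating_sub(off.begin) as usize)
                    .collect()
            })
        })
        .collect()
}
```

Calling `source_spans("hello worlds", &sample())` recovers the source text behind each token, so the continuation token `##s` maps back to the plain character `s`. Walking the text with `chars()` rather than slicing by byte index keeps the helper safe on non-ASCII input under the character-offset assumption.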
Trait Implementations
Auto Trait Implementations
impl RefUnwindSafe for TokensWithOffsets
impl Send for TokensWithOffsets
impl Sync for TokensWithOffsets
impl Unpin for TokensWithOffsets
impl UnwindSafe for TokensWithOffsets
Blanket Implementations
Mutably borrows from an owned value.