pub struct TokenizedInput {
    pub token_ids: Vec<i64>,
    pub segment_ids: Vec<i8>,
    pub special_tokens_mask: Vec<i8>,
    pub overflowing_tokens: Vec<i64>,
    pub num_truncated_tokens: usize,
    pub token_offsets: Vec<Option<Offset>>,
    pub reference_offsets: Vec<Vec<OffsetSize>>,
    pub mask: Vec<Mask>,
}

Tokenized input, ready for processing by language models.

This represents the final output of the encoding process (a tokenized sentence with encoded values).

Fields§

§token_ids: Vec<i64>

Vector of token IDs

§segment_ids: Vec<i8>

Vector of segment IDs. For example, in BERT, segments are separated by a [SEP] marker, and each marker increments the segment ID. This vector has the same length as token_ids.
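As a rough illustration of the numbering described above, the sketch below derives segment IDs by incrementing a counter after each separator. The helper name `segment_ids_from_separators` and the `sep_id` parameter are assumptions made for this example; they are not part of this crate's API, and the tokenizers may compute segments differently.

```rust
/// Hypothetical helper (not part of the crate): derives segment IDs from
/// token IDs, assuming each separator token closes the current segment.
fn segment_ids_from_separators(token_ids: &[i64], sep_id: i64) -> Vec<i8> {
    let mut segment: i8 = 0;
    let mut segment_ids = Vec::with_capacity(token_ids.len());
    for &id in token_ids {
        // The separator itself is counted in the segment it closes.
        segment_ids.push(segment);
        if id == sep_id {
            // Saturate rather than overflow on pathological inputs.
            segment = segment.saturating_add(1);
        }
    }
    segment_ids
}
```

The returned vector always has the same length as the input, matching the length guarantee stated for this field.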

§special_tokens_mask: Vec<i8>

Flags tokens as special tokens (1) or not (0). This vector has the same length as token_ids.
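One common use of this mask is to drop special tokens before inspecting the remaining IDs. The following is a minimal sketch; `strip_special_tokens` is a hypothetical helper name, not a function provided by the crate.

```rust
/// Hypothetical helper (not part of the crate): keeps only the token IDs
/// whose entry in `special_tokens_mask` is 0 (i.e. not a special token).
fn strip_special_tokens(token_ids: &[i64], special_tokens_mask: &[i8]) -> Vec<i64> {
    token_ids
        .iter()
        .zip(special_tokens_mask.iter())
        // Keep pairs flagged as regular tokens.
        .filter(|&(_, &flag)| flag == 0)
        .map(|(&id, _)| id)
        .collect()
}
```

Because `zip` stops at the shorter input, the sketch relies on the two vectors having the same length, as documented above.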

§overflowing_tokens: Vec<i64>

Vector containing overflowing tokens, populated following a truncation step

§num_truncated_tokens: usize

Number of overflowing tokens following a truncation step. This equals the length of overflowing_tokens.
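The relationship between the two truncation fields can be sketched as follows. This example simply cuts the tail of a single sequence; `truncate_tail` and `max_len` are hypothetical names, and the sketch ignores whatever truncation strategy the tokenizer actually applies.

```rust
/// Hypothetical helper (not part of the crate): truncates `token_ids` in
/// place to at most `max_len` entries and returns the removed tail together
/// with its length.
fn truncate_tail(token_ids: &mut Vec<i64>, max_len: usize) -> (Vec<i64>, usize) {
    let overflowing_tokens = if token_ids.len() > max_len {
        // `split_off` leaves the first `max_len` IDs in `token_ids`.
        token_ids.split_off(max_len)
    } else {
        Vec::new()
    };
    // By construction, the count equals the length of the overflow vector.
    let num_truncated_tokens = overflowing_tokens.len();
    (overflowing_tokens, num_truncated_tokens)
}
```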

§token_offsets: Vec<Option<Offset>>

Offset information (as start and end positions) in relation to the original text. Tokens that cannot be related to the original source are registered as None.
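To show how such offsets might be used, the sketch below recovers the text of a token from the original string. It uses a local stand-in for `Offset` with `begin` and `end` fields, and it assumes that positions count characters (not bytes) and that `end` is exclusive; all of these are assumptions for illustration and should be checked against the crate's own `Offset` definition.

```rust
/// Local stand-in for the crate's `Offset` type (field names are assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Offset {
    begin: u32,
    end: u32,
}

/// Hypothetical helper: returns the slice of `original` covered by `offset`,
/// or `None` for tokens without a source position (e.g. special tokens).
fn token_text(original: &str, offset: Option<Offset>) -> Option<String> {
    let offset = offset?;
    // Reject malformed offsets where `end` precedes `begin`.
    let len = offset.end.checked_sub(offset.begin)? as usize;
    Some(
        original
            .chars()
            .skip(offset.begin as usize)
            .take(len)
            .collect(),
    )
}
```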

§reference_offsets: Vec<Vec<OffsetSize>>

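Offset information (as a sequence of positions) in relation to the original text. Each entry is a Vec<OffsetSize> rather than an Option, so tokens that cannot be related to the original source cannot be stored as None here; they are presumably recorded as an empty vector.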

§mask: Vec<Mask>

Per-token masks providing information on the type of each token. This vector has the same length as token_ids.
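Several fields above are documented as having the same length as token_ids. A quick consistency check along those lines could look like the sketch below; `lengths_are_consistent` is a hypothetical helper, and it is generic over the mask element type so that it does not depend on the crate's `Mask` definition.

```rust
/// Hypothetical helper (not part of the crate): checks that the per-token
/// vectors all have the same length as `token_ids`.
fn lengths_are_consistent<M>(
    token_ids: &[i64],
    segment_ids: &[i8],
    special_tokens_mask: &[i8],
    mask: &[M],
) -> bool {
    let n = token_ids.len();
    segment_ids.len() == n && special_tokens_mask.len() == n && mask.len() == n
}
```

With a real `TokenizedInput`, the arguments would be borrowed from the corresponding public fields.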

Trait Implementations§

§impl Clone for TokenizedInput

clone: Returns a copy of the value.
clone_from: Performs copy-assignment from source.

§impl Debug for TokenizedInput

fmt: Formats the value using the given formatter.

§impl PartialEq for TokenizedInput

eq: Tests for self and other values to be equal, and is used by ==.
ne: Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

§impl PartialOrd for TokenizedInput

partial_cmp: Returns an ordering between self and other values if one exists.
lt: Tests less than (for self and other) and is used by the < operator.
le: Tests less than or equal to (for self and other) and is used by the <= operator.
gt: Tests greater than (for self and other) and is used by the > operator.
ge: Tests greater than or equal to (for self and other) and is used by the >= operator.
