pub struct TokenIdsWithSpecialTokens {
    pub token_ids: Vec<i64>,
    pub segment_ids: Vec<i8>,
    pub special_tokens_mask: Vec<i8>,
    pub token_offsets: Vec<Option<Offset>>,
    pub reference_offsets: Vec<Vec<OffsetSize>>,
    pub mask: Vec<Mask>,
}
Encoded input with special tokens
Intermediate tokenization result, produced after encoding and the addition of special tokens but before truncation to a maximum length.
Fields
token_ids: Vec<i64>
Vector of token IDs
segment_ids: Vec<i8>
Vector of segment IDs (for example, BERT segments are separated by a [SEP] marker, each one incrementing the segment ID). This vector has the same length as token_ids.
special_tokens_mask: Vec<i8>
Flags tokens as special tokens (1) or not (0). This vector has the same length as token_ids.
token_offsets: Vec<Option<Offset>>
Offset information (as start and end positions) in relation to the original text. Tokens that cannot be related to the original source are registered as None.
reference_offsets: Vec<Vec<OffsetSize>>
Offset information (as a sequence of positions) in relation to the original text. Tokens that cannot be related to the original source are registered as None.
mask: Vec<Mask>
Mask values providing information on the type of each token. This vector has the same length as token_ids.
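The field descriptions above imply that the per-token vectors line up index by index. The following is a minimal sketch of how that could be checked and how token_offsets might be used to recover the source text of a token. It does not depend on the crate: Offset, OffsetSize and Mask are local stand-ins, and their field names (begin, end), the u32 alias and the Mask variants are assumptions made for illustration. The helpers lengths_are_consistent, source_text and sample_encoding are hypothetical, the token IDs are made up, and the sketch further assumes that offsets count characters with an exclusive end and that tokens without a source position have an empty entry in reference_offsets.

```rust
// Stand-in definitions mirroring the documented fields. In real code these
// types would come from the tokenizer crate rather than being redefined here.
pub type OffsetSize = u32; // assumption: positions are unsigned integers

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Offset {
    pub begin: OffsetSize, // assumed field name (inclusive start)
    pub end: OffsetSize,   // assumed field name (exclusive end)
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mask {
    Regular, // hypothetical variant: ordinary token
    Special, // hypothetical variant: special token such as [CLS] or [SEP]
}

#[derive(Debug, Clone)]
pub struct TokenIdsWithSpecialTokens {
    pub token_ids: Vec<i64>,
    pub segment_ids: Vec<i8>,
    pub special_tokens_mask: Vec<i8>,
    pub token_offsets: Vec<Option<Offset>>,
    pub reference_offsets: Vec<Vec<OffsetSize>>,
    pub mask: Vec<Mask>,
}

/// Returns true when the vectors documented as having the same length as
/// `token_ids` (segment_ids, special_tokens_mask, mask) actually do.
pub fn lengths_are_consistent(e: &TokenIdsWithSpecialTokens) -> bool {
    let n = e.token_ids.len();
    e.segment_ids.len() == n && e.special_tokens_mask.len() == n && e.mask.len() == n
}

/// Looks up the source text covered by the token at `index`.
/// Returns None for an out-of-range index or a token without an offset.
/// Assumes offsets count characters (not bytes) and that `end` is exclusive.
pub fn source_text(e: &TokenIdsWithSpecialTokens, text: &str, index: usize) -> Option<String> {
    let offset = e.token_offsets.get(index).copied().flatten()?;
    let len = offset.end.checked_sub(offset.begin)? as usize;
    Some(text.chars().skip(offset.begin as usize).take(len).collect())
}

/// Builds a hand-written encoding of "hello" wrapped in two special tokens.
/// The token IDs are invented for the example.
pub fn sample_encoding() -> TokenIdsWithSpecialTokens {
    TokenIdsWithSpecialTokens {
        token_ids: vec![101, 7592, 102],
        segment_ids: vec![0, 0, 0],
        special_tokens_mask: vec![1, 0, 1],
        token_offsets: vec![None, Some(Offset { begin: 0, end: 5 }), None],
        // Assumption: tokens with no source position get an empty entry.
        reference_offsets: vec![vec![], vec![0, 1, 2, 3, 4], vec![]],
        mask: vec![Mask::Special, Mask::Regular, Mask::Special],
    }
}
```

With these helpers, lengths_are_consistent(&sample_encoding()) holds, source_text returns the word for the middle token, and it returns None for the two special tokens because their token_offsets entries are None.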
Trait Implementations
impl Clone for TokenIdsWithSpecialTokens
fn clone(&self) -> TokenIdsWithSpecialTokens
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.