Struct rust_tokenizers::TokenRef
pub struct TokenRef<'a> {
    pub text: &'a str,
    pub offset: Offset,
    pub reference_offsets: &'a [OffsetSize],
    pub mask: Mask,
}
A reference token that points back to the original text and holds its string representation as a string slice (&str).
Fields
text: &'a str
String representation
offset: Offset
Start and end positions of the token with respect to the original text
reference_offsets: &'a [OffsetSize]
Sequence of positions, with respect to the original text, contained in the token. For example, if the token offset is start: 4, end: 10, the corresponding reference_offsets are [4, 5, 6, 7, 8, 9] (the end position is not included).
mask: Mask
Mask indicating the type of the token
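The relationship between offset and reference_offsets described above can be sketched with the standard library alone. This is an illustration, not code from the crate. The `OffsetSize` alias is a local stand-in assumed to be `u32`, and the helper names `contiguous_reference_offsets` and `span_of` are hypothetical. It also assumes the token covers a contiguous run of positions, which need not hold for every token.

```rust
// Local stand-in for the crate's `OffsetSize` alias (assumed to be `u32` here).
type OffsetSize = u32;

/// Hypothetical helper: lists every position covered by the half-open
/// span `[start, end)`, mirroring the `reference_offsets` field.
fn contiguous_reference_offsets(start: OffsetSize, end: OffsetSize) -> Vec<OffsetSize> {
    (start..end).collect()
}

/// Hypothetical helper: recovers `(start, end)` from a contiguous list of
/// positions, with `end` exclusive. Returns `None` for an empty list.
fn span_of(reference_offsets: &[OffsetSize]) -> Option<(OffsetSize, OffsetSize)> {
    let first = *reference_offsets.first()?;
    let last = *reference_offsets.last()?;
    Some((first, last + 1))
}
```

Calling `contiguous_reference_offsets(4, 10)` yields `[4, 5, 6, 7, 8, 9]`, matching the example in the field description, and `span_of(&[4, 5, 6, 7, 8, 9])` gives back `Some((4, 10))`.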
Implementations
impl<'a> TokenRef<'a>
pub fn new(text: &'a str, offsets: &'a [OffsetSize]) -> TokenRef<'a>
Creates a new token reference from a text and a list of offsets.
Parameters
- text (&str): text reference
- offsets (&[OffsetSize]): reference positions with respect to the original text
Example
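use rust_tokenizers::TokenRef;
// The original text is shown only for context; it is not passed to `new`.
let _original_text = "Hello, world";
let text = "world";
// Positions of "world" within the original text (characters 7 through 11).
let offsets = &[7, 8, 9, 10, 11];
let token_ref = TokenRef::new(text, offsets);
Trait Implementations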
impl<'a> ConsolidatableTokens<TokenRef<'a>> for Vec<TokenRef<'a>>
fn iter_consolidate_tokens(&self) -> ConsolidatedTokenIterator<'_, TokenRef<'a>>
Creates an iterator from a sequence of ConsolidatableTokens.