pub struct XlmRobertaTokenizer { /* private fields */ }
Tokenizer for XLM-RoBERTa models.
XLM-RoBERTa uses the SentencePiece tokenizer. However, we cannot use it in the intended way, on raw sentences: we would have to detokenize the sentences first, and there is no guarantee that every token would then receive a piece of its own, which sequence labeling requires. So instead, we use SentencePiece as a subword tokenizer, applied to the individual tokens.
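The subword approach described above can be illustrated with a minimal, self-contained sketch. It does not use this crate or the real SentencePiece model: `split_word`, `tokenize_words`, `SentenceWithPieces`, and the tiny vocabulary are hypothetical stand-ins, and greedy longest-match is swapped in for the actual piece model. The point is only the shape of the result: every token is split on its own, so each token is guaranteed at least one piece, and the index of its first piece is recorded for use in sequence labeling.

```rust
use std::collections::HashSet;

/// Pieces of a sentence plus, for each token, the index of its first piece.
/// (Hypothetical type for this sketch, not the crate's own.)
#[derive(Debug, PartialEq)]
struct SentenceWithPieces {
    pieces: Vec<String>,
    token_offsets: Vec<usize>,
}

/// Stand-in for a piece model: greedy longest-match against a fixed
/// vocabulary. The first piece of a word carries the "▁" word-start marker,
/// as SentencePiece pieces do. Words that cannot be covered become "<unk>".
fn split_word(word: &str, vocab: &HashSet<&str>) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    if chars.is_empty() {
        return vec!["<unk>".to_string()];
    }
    let mut pieces = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let mut end = chars.len();
        let mut found = None;
        // Try the longest candidate first, then shrink it from the right.
        while end > start {
            let mut candidate: String = chars[start..end].iter().collect();
            if start == 0 {
                candidate = format!("▁{}", candidate);
            }
            if vocab.contains(candidate.as_str()) {
                found = Some(candidate);
                break;
            }
            end -= 1;
        }
        match found {
            Some(piece) => {
                pieces.push(piece);
                start = end;
            }
            // No piece matches at this position: give up on the whole word.
            None => return vec!["<unk>".to_string()],
        }
    }
    pieces
}

/// Split each token separately, so every token owns at least one piece.
fn tokenize_words(words: &[&str], vocab: &HashSet<&str>) -> SentenceWithPieces {
    let mut pieces = Vec::new();
    let mut token_offsets = Vec::new();
    for word in words {
        token_offsets.push(pieces.len());
        pieces.extend(split_word(word, vocab));
    }
    SentenceWithPieces { pieces, token_offsets }
}

fn main() {
    let vocab: HashSet<&str> = ["▁token", "izer", "s", "▁split", "▁words"]
        .iter()
        .copied()
        .collect();
    let out = tokenize_words(&["tokenizers", "split", "words"], &vocab);
    println!("{:?}", out.pieces);
    println!("{:?}", out.token_offsets);
}
```

Because the offsets point at the first piece of each token, a label predicted for that piece can be mapped straight back to the token it came from.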
Implementations§
impl XlmRobertaTokenizer
pub fn new(spp: SentencePieceProcessor) -> Self
pub fn open<P>(model: P) -> Result<Self, TokenizerError> where P: AsRef<Path>,
Trait Implementations§
impl From<SentencePieceProcessor> for XlmRobertaTokenizer
fn from(spp: SentencePieceProcessor) -> Self
Converts to this type from the input type.
impl Tokenize for XlmRobertaTokenizer
Auto Trait Implementations§
impl RefUnwindSafe for XlmRobertaTokenizer
impl Send for XlmRobertaTokenizer
impl Sync for XlmRobertaTokenizer
impl Unpin for XlmRobertaTokenizer
impl UnwindSafe for XlmRobertaTokenizer
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.