pub trait Tokenize: Send + Sync {
    // Required method
    fn tokenize(&self, sentence: Sentence) -> SentenceWithPieces;
}
Trait for wordpiece tokenizers.
Required Methods
fn tokenize(&self, sentence: Sentence) -> SentenceWithPieces
Tokenize the tokens in a sentence into word pieces.
Implementations must prefix the first piece that corresponds to a token with one or more special pieces marking the beginning of the sentence. The representation of such a piece can be used for special purposes, such as classification or acting as the pseudo-root in dependency parsing.
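
As an illustration of this contract, here is a minimal sketch of an implementation. It does not use the crate's real types: `Sentence` and `SentenceWithPieces` below are simplified, hypothetical stand-ins, and `ChunkTokenizer`, its fixed-width splitting rule, the `[CLS]` marker, and the `##` continuation prefix are all assumptions made for the example (the last two borrowed from BERT conventions). A real wordpiece tokenizer would look pieces up in a vocabulary instead.

```rust
// Hypothetical stand-ins for the crate's `Sentence` and `SentenceWithPieces`.
// The real types are not shown on this page and will differ.
pub struct Sentence {
    pub tokens: Vec<String>,
}

pub struct SentenceWithPieces {
    /// All pieces of the sentence, starting with the special piece.
    pub pieces: Vec<String>,
    /// For each token, the index in `pieces` of its first piece.
    pub token_offsets: Vec<usize>,
}

pub trait Tokenize: Send + Sync {
    fn tokenize(&self, sentence: Sentence) -> SentenceWithPieces;
}

/// Toy tokenizer (an assumption for this sketch): splits each token into
/// chunks of at most `max_len` characters instead of using a vocabulary.
pub struct ChunkTokenizer {
    pub max_len: usize,
}

impl Tokenize for ChunkTokenizer {
    fn tokenize(&self, sentence: Sentence) -> SentenceWithPieces {
        // The special beginning-of-sentence piece precedes all token pieces.
        let mut pieces = vec!["[CLS]".to_string()];
        let mut token_offsets = Vec::with_capacity(sentence.tokens.len());

        for token in &sentence.tokens {
            // Record where this token's pieces start.
            token_offsets.push(pieces.len());

            let chars: Vec<char> = token.chars().collect();
            // `max(1)` guards against a zero chunk size, which would panic.
            for (i, chunk) in chars.chunks(self.max_len.max(1)).enumerate() {
                let text: String = chunk.iter().collect();
                // Mark continuation pieces with the `##` prefix.
                pieces.push(if i == 0 { text } else { format!("##{}", text) });
            }
        }

        SentenceWithPieces { pieces, token_offsets }
    }
}
```

With `max_len` set to 4, the tokens `playing` and `chess` become the pieces `[CLS]`, `play`, `##ing`, `ches`, `##s`, with token offsets 1 and 3. Index 0 is reserved for the special piece, whose representation a downstream model could use for classification or as the pseudo-root.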