[−][src]Trait tokenizers::tokenizer::Trainer
A Trainer
has the responsibility to train a model. We feed it with lines/sentences
and it returns a Model
when done.
Required methods
fn should_show_progress(&self) -> bool
Whether we should show progress during the training.
fn train(
&self,
words: HashMap<String, u32>
) -> Result<(Box<dyn Model>, Vec<AddedToken>)>
&self,
words: HashMap<String, u32>
) -> Result<(Box<dyn Model>, Vec<AddedToken>)>
The actual training method. This will return a new trained Model as well as a list
of special_tokens
to be added directly to the tokenizer along with the model.
fn process_tokens(&self, words: &mut HashMap<String, u32>, tokens: Vec<String>)
Process a bunch of token, counting them as relevant.
Implementors
impl Trainer for BpeTrainer
[src]
fn train(
&self,
word_counts: HashMap<String, u32>
) -> Result<(Box<dyn Model>, Vec<AddedToken>)>
[src]
&self,
word_counts: HashMap<String, u32>
) -> Result<(Box<dyn Model>, Vec<AddedToken>)>
Train a BPE model
fn process_tokens(&self, words: &mut HashMap<String, u32>, tokens: Vec<String>)
[src]
Process a bunch of tokens, counting them
fn should_show_progress(&self) -> bool
[src]
Whether we should show progress