Trait tokenizers::tokenizer::Trainer

pub trait Trainer {
    type Model: Model + Sized;

    // Required methods
    fn should_show_progress(&self) -> bool;
    fn train(&self, model: &mut Self::Model) -> Result<Vec<AddedToken>>;
    fn feed<I, S, F>(&mut self, iterator: I, process: F) -> Result<()>
       where I: Iterator<Item = S> + Send,
             S: AsRef<str> + Send,
             F: Fn(&str) -> Result<Vec<String>> + Sync;
}

A Trainer is responsible for training a Model. It is first fed lines or sentences through feed, and train then trains the given Model on what was fed.

Required Associated Types§

type Model: Model + Sized

Required Methods§


fn should_show_progress(&self) -> bool

Returns whether progress should be shown during training.


fn train(&self, model: &mut Self::Model) -> Result<Vec<AddedToken>>

The actual training method. It trains the given Model in place, through the mutable reference, and returns a list of special tokens to be added directly to the tokenizer along with the model.
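As a rough illustration of the signature above, the following sketch shows a method with the same shape as train: it writes its result into the model it receives and returns only the tokens to add. Everything here is a hypothetical stand-in written for this example (`ToyModel`, `ToyTrainer`, `min_frequency`, the `Result` alias, and the one-field `AddedToken`); none of it is the crate's own code, and the vocabulary-building rule is just one simple possibility.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the crate's `Result`, `AddedToken` and a `Model`.
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

#[derive(Debug, Clone, PartialEq)]
struct AddedToken {
    content: String,
}

#[derive(Debug, Default)]
struct ToyModel {
    vocab: HashMap<String, u32>,
}

// Hypothetical trainer: `words` holds counts that a `feed` call would collect.
struct ToyTrainer {
    special_tokens: Vec<AddedToken>,
    words: HashMap<String, u64>,
    min_frequency: u64,
}

impl ToyTrainer {
    // Same shape as `Trainer::train`: mutate the model, return the added tokens.
    fn train(&self, model: &mut ToyModel) -> Result<Vec<AddedToken>> {
        let mut vocab = HashMap::new();
        // Special tokens receive the first ids.
        for token in &self.special_tokens {
            let id = vocab.len() as u32;
            vocab.entry(token.content.clone()).or_insert(id);
        }
        // Keep frequent words, ordered by count (descending) and then by
        // spelling, so that id assignment is deterministic.
        let mut words: Vec<(&String, &u64)> = self
            .words
            .iter()
            .filter(|(_, count)| **count >= self.min_frequency)
            .collect();
        words.sort_by(|a, b| b.1.cmp(a.1).then_with(|| a.0.cmp(b.0)));
        for (word, _) in words {
            let id = vocab.len() as u32;
            vocab.entry(word.clone()).or_insert(id);
        }
        model.vocab = vocab;
        Ok(self.special_tokens.clone())
    }
}

// Small driver used to exercise the sketch.
fn train_demo() -> Result<(ToyModel, Vec<AddedToken>)> {
    let trainer = ToyTrainer {
        special_tokens: vec![AddedToken { content: "[UNK]".to_string() }],
        words: HashMap::from([
            ("hello".to_string(), 3),
            ("world".to_string(), 2),
            ("rare".to_string(), 1),
        ]),
        min_frequency: 2,
    };
    let mut model = ToyModel::default();
    let added = trainer.train(&mut model)?;
    Ok((model, added))
}
```

The caller keeps ownership of the model throughout; only the returned tokens need to be registered with the tokenizer afterwards.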


fn feed<I, S, F>(&mut self, iterator: I, process: F) -> Result<()>
where I: Iterator<Item = S> + Send, S: AsRef<str> + Send, F: Fn(&str) -> Result<Vec<String>> + Sync,

Processes an iterator of sequences, calling process on each sequence to pre-process it as relevant. For each sequence, process returns the resulting strings as a Vec<String>.
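A minimal sketch of what a feed implementation might look like, assuming the trainer only needs word counts. `CountingTrainer`, its `words` field, and the `Result` alias are hypothetical names introduced for this example. The `Send` and `Sync` bounds in the signature leave room for an implementation to process sequences in parallel; this sketch runs sequentially for simplicity.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the crate's `Result` alias.
type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// Hypothetical trainer that accumulates word counts from the fed sequences.
#[derive(Default)]
struct CountingTrainer {
    words: HashMap<String, u64>,
}

impl CountingTrainer {
    // Same generic shape as `Trainer::feed`.
    fn feed<I, S, F>(&mut self, iterator: I, process: F) -> Result<()>
    where
        I: Iterator<Item = S> + Send,
        S: AsRef<str> + Send,
        F: Fn(&str) -> Result<Vec<String>> + Sync,
    {
        for sequence in iterator {
            // `process` pre-processes one sequence into words; errors propagate.
            for word in process(sequence.as_ref())? {
                *self.words.entry(word).or_insert(0) += 1;
            }
        }
        Ok(())
    }
}

// Small driver: lowercase and split on whitespace as the pre-processing step.
fn feed_demo() -> Result<HashMap<String, u64>> {
    let mut trainer = CountingTrainer::default();
    let lines = vec!["Hello world", "hello again"];
    trainer.feed(lines.into_iter(), |sequence| {
        Ok(sequence
            .split_whitespace()
            .map(|word| word.to_lowercase())
            .collect())
    })?;
    Ok(trainer.words)
}
```

Because the pre-processing is passed in as a closure, the same trainer can be reused with different normalization or pre-tokenization steps.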

Object Safety§

This trait is not object safe.
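Under Rust's rules, a method with its own type parameters, such as feed<I, S, F>, prevents a trait from being used as a trait object, so `Box<dyn Trainer>` is not available. The sketch below uses a simplified, hypothetical trait (`MiniTrainer`, with `LineCounter`, `WordCounter`, and `AnyTrainer` invented for this example) to show two common alternatives: static dispatch through a generic parameter, and an enum that wraps the concrete types.

```rust
// Simplified, hypothetical trait: `feed` is generic, like `Trainer::feed`,
// which is what rules out `dyn MiniTrainer`.
trait MiniTrainer {
    fn name(&self) -> &'static str;
    fn feed<I: Iterator<Item = String>>(&mut self, iterator: I) -> usize;
}

struct LineCounter {
    seen: usize,
}

struct WordCounter {
    seen: usize,
}

impl MiniTrainer for LineCounter {
    fn name(&self) -> &'static str {
        "lines"
    }
    // Counts sequences and returns the running total.
    fn feed<I: Iterator<Item = String>>(&mut self, iterator: I) -> usize {
        self.seen += iterator.count();
        self.seen
    }
}

impl MiniTrainer for WordCounter {
    fn name(&self) -> &'static str {
        "words"
    }
    // Counts whitespace-separated words and returns the running total.
    fn feed<I: Iterator<Item = String>>(&mut self, iterator: I) -> usize {
        self.seen += iterator
            .map(|line| line.split_whitespace().count())
            .sum::<usize>();
        self.seen
    }
}

// Alternative 1: static dispatch through a generic parameter.
fn run<T: MiniTrainer>(trainer: &mut T, lines: Vec<String>) -> usize {
    trainer.feed(lines.into_iter())
}

// Alternative 2: an enum over the concrete trainers, chosen at runtime.
enum AnyTrainer {
    Lines(LineCounter),
    Words(WordCounter),
}

impl MiniTrainer for AnyTrainer {
    fn name(&self) -> &'static str {
        match self {
            AnyTrainer::Lines(t) => t.name(),
            AnyTrainer::Words(t) => t.name(),
        }
    }
    fn feed<I: Iterator<Item = String>>(&mut self, iterator: I) -> usize {
        match self {
            AnyTrainer::Lines(t) => t.feed(iterator),
            AnyTrainer::Words(t) => t.feed(iterator),
        }
    }
}
```

The enum approach keeps a single concrete type for callers that must pick a trainer at runtime, at the cost of listing every variant explicitly.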

Implementors§