Trait PostProcessor 

pub trait PostProcessor {
    // Required methods
    fn added_tokens(&self, is_pair: bool) -> usize;
    fn process_encodings(
        &self,
        encodings: Vec<Encoding>,
        add_special_tokens: bool,
    ) -> Result<Vec<Encoding>>;

    // Provided method
    fn process(
        &self,
        encoding: Encoding,
        pair_encoding: Option<Encoding>,
        add_special_tokens: bool,
    ) -> Result<Encoding> { ... }
}

A PostProcessor is responsible for post-processing the encoded output of the Tokenizer. It adds any special tokens that a language model requires.
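As a rough illustration of this contract, the sketch below implements a BERT-style post-processor against simplified stand-in types. The `Encoding` struct here is a toy that holds only token strings, `Result` is a local alias, and `ToyBertProcessing` and `toy_encoding` are invented names; none of these are the crate's real definitions, which carry more fields and behavior.

```rust
// Stand-in types for illustration only (assumption): the real `Encoding`
// and `Result` come from the tokenizers crate and are richer than this.
pub type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Encoding {
    pub tokens: Vec<String>,
}

// Reduced copy of the trait, required methods only.
pub trait PostProcessor {
    fn added_tokens(&self, is_pair: bool) -> usize;
    fn process_encodings(
        &self,
        encodings: Vec<Encoding>,
        add_special_tokens: bool,
    ) -> Result<Vec<Encoding>>;
}

// Hypothetical processor producing `[CLS] a [SEP]` or `[CLS] a [SEP] b [SEP]`.
pub struct ToyBertProcessing;

impl PostProcessor for ToyBertProcessing {
    fn added_tokens(&self, is_pair: bool) -> usize {
        // One [CLS] plus one [SEP] per sequence.
        if is_pair { 3 } else { 2 }
    }

    fn process_encodings(
        &self,
        mut encodings: Vec<Encoding>,
        add_special_tokens: bool,
    ) -> Result<Vec<Encoding>> {
        if !add_special_tokens {
            return Ok(encodings);
        }
        for (i, enc) in encodings.iter_mut().enumerate() {
            if i == 0 {
                // Only the first sequence receives the leading [CLS].
                enc.tokens.insert(0, "[CLS]".to_string());
            }
            enc.tokens.push("[SEP]".to_string());
        }
        Ok(encodings)
    }
}

// Small helper to build a toy encoding from string slices.
pub fn toy_encoding(tokens: &[&str]) -> Encoding {
    Encoding {
        tokens: tokens.iter().map(|t| t.to_string()).collect(),
    }
}
```

In this sketch the value returned by `added_tokens` matches what `process_encodings` actually inserts for one or two sequences, which is the consistency an implementor would presumably want to maintain.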

Required Methods

fn added_tokens(&self, is_pair: bool) -> usize

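Returns the number of tokens that will be added during the processing step. The is_pair flag indicates whether the input is a pair of sequences rather than a single sequence.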
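One plausible use of this count is to reserve room for special tokens before truncating to a maximum length. The sketch below is an assumption about how a caller might do that, not a description of the crate's truncation logic; `content_budget` and `FixedOverhead` are hypothetical names, and the trait is reduced to the single method the example needs.

```rust
// Reduced stand-in for the trait (assumption): only the method used here.
pub trait PostProcessor {
    fn added_tokens(&self, is_pair: bool) -> usize;
}

// Hypothetical processor with a fixed overhead:
// 2 special tokens for a single sequence, 3 for a pair.
pub struct FixedOverhead;

impl PostProcessor for FixedOverhead {
    fn added_tokens(&self, is_pair: bool) -> usize {
        if is_pair { 3 } else { 2 }
    }
}

// Number of content tokens that still fit once special tokens are reserved.
// `saturating_sub` avoids underflow when `max_len` is below the overhead.
pub fn content_budget(p: &dyn PostProcessor, max_len: usize, is_pair: bool) -> usize {
    max_len.saturating_sub(p.added_tokens(is_pair))
}
```

With a limit of 512 tokens, this leaves 510 content tokens for a single sequence and 509 for a pair under the assumed overhead.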

fn process_encodings(&self, encodings: Vec<Encoding>, add_special_tokens: bool) -> Result<Vec<Encoding>>

Processes any number of encodings and returns a series of encodings, possibly merging some of them.

Provided Methods

fn process(&self, encoding: Encoding, pair_encoding: Option<Encoding>, add_special_tokens: bool) -> Result<Encoding>

Processes both encodings and returns a new, merged encoding.
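A provided method of this shape can be expressed in terms of the required process_encodings. The sketch below shows one way such a delegation might look, using toy stand-in types; it is not a copy of the crate's implementation, and the `merge` helper and `Passthrough` processor are hypothetical.

```rust
// Stand-in types for illustration only (assumption), not the crate's own.
pub type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Encoding {
    pub tokens: Vec<String>,
}

pub trait PostProcessor {
    // Required: implementors decide how special tokens are added.
    fn process_encodings(
        &self,
        encodings: Vec<Encoding>,
        add_special_tokens: bool,
    ) -> Result<Vec<Encoding>>;

    // Provided: gather one or two encodings, delegate, then merge the result.
    fn process(
        &self,
        encoding: Encoding,
        pair_encoding: Option<Encoding>,
        add_special_tokens: bool,
    ) -> Result<Encoding> {
        let mut encodings = vec![encoding];
        // `Option` is iterable, so this appends the pair only when present.
        encodings.extend(pair_encoding);
        let processed = self.process_encodings(encodings, add_special_tokens)?;
        Ok(merge(processed))
    }
}

// Hypothetical merge: concatenates token lists in order.
pub fn merge(encodings: Vec<Encoding>) -> Encoding {
    Encoding {
        tokens: encodings.into_iter().flat_map(|e| e.tokens).collect(),
    }
}

// Hypothetical processor that adds nothing, to exercise the provided method.
pub struct Passthrough;

impl PostProcessor for Passthrough {
    fn process_encodings(
        &self,
        encodings: Vec<Encoding>,
        _add_special_tokens: bool,
    ) -> Result<Vec<Encoding>> {
        Ok(encodings)
    }
}
```

Keeping the merge step in the provided method means an implementor only has to write process_encodings, and single-sequence and pair inputs share one code path.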

Implementations

impl dyn PostProcessor

pub fn default_process(encodings: Vec<Encoding>, _add_special_tokens: bool) -> Result<Vec<Encoding>>

Implementors