Crate minbpe


Re-exports

pub use basic::BasicTokenizer;
pub use regex::AllowedSpecial;
pub use regex::RegexTokenizerStruct;
pub use regex::RegexTokenizerTrait;
pub use base::*;

Modules

base
Contains the base Tokenizer struct and a few common helper functions. The base struct also contains the common save/load functionality. The interface could be much stricter, e.g. by isolating all regex/pattern parts in the RegexTokenizer, but some concessions are made for simplicity.
basic
regex
test_common
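
The base module's description mentions common helper functions without listing them on this page. As an illustration only, a byte-pair-encoding tokenizer typically needs two such helpers: one that counts adjacent token pairs and one that merges a chosen pair into a new token id. The sketch below is not taken from the crate; the names pair_counts and merge_pair are hypothetical and the crate's actual signatures may differ.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not the crate's API): count how often each
/// adjacent pair of token ids occurs in `ids`.
fn pair_counts(ids: &[u32]) -> HashMap<(u32, u32), usize> {
    let mut counts = HashMap::new();
    for w in ids.windows(2) {
        *counts.entry((w[0], w[1])).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical helper (not the crate's API): scan left to right and
/// replace each non-overlapping occurrence of `pair` with `new_id`.
fn merge_pair(ids: &[u32], pair: (u32, u32), new_id: u32) -> Vec<u32> {
    let mut out = Vec::with_capacity(ids.len());
    let mut i = 0;
    while i < ids.len() {
        if i + 1 < ids.len() && ids[i] == pair.0 && ids[i + 1] == pair.1 {
            // The pair matches: emit the merged id and skip both inputs.
            out.push(new_id);
            i += 2;
        } else {
            out.push(ids[i]);
            i += 1;
        }
    }
    out
}
```

A training loop would typically call pair_counts on the current sequence, pick the most frequent pair, and pass it to merge_pair with the next unused token id, repeating until the target vocabulary size is reached.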