pub trait Tokenizer:
Send
+ Sync
+ Debug {
// Required method
fn tokenize(&self, text: &str) -> Vec<String>;
}Expand description
Decompose a string into BM25 search terms.
Implementors choose the tokenization rule (whitespace +
lowercase, BPE, sentencepiece, etc.) — BM25Index will index
and search using whatever tokens this returns.
Send + Sync so a BM25Index carrying a dyn Tokenizer can be
shared across threads. Debug so the index’s Debug derive
works without manual impl.
Required Methods§
Dyn Compatibility§
This trait is dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".