Struct tantivy_analysis_contrib::icu::ICUTokenizer
pub struct ICUTokenizer;
Available on crate feature icu only.
ICU tokenizer. It does not (yet) behave like its Lucene counterpart. Getting a tokenizer is simple:
use tantivy_analysis_contrib::icu::ICUTokenizer;
let tokenizer = ICUTokenizer;
Example
Here is an example of a tokenization result:
use tantivy::tokenizer::{TextAnalyzer, Token};
use tantivy_analysis_contrib::icu::ICUTokenizer;
let mut tmp = TextAnalyzer::builder(ICUTokenizer::default()).build();
let mut token_stream = tmp.token_stream("我是中国人。 1234 Tests ");
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "我".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "是".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "中".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "国".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "人".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "1234".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "Tests".to_string());
assert_eq!(None, token_stream.next());
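The token boundaries in the example above follow a visible pattern: each Han ideograph is its own token, runs of letters or digits form a single token, and punctuation and whitespace are dropped. The sketch below reproduces that pattern with the standard library only. It is not ICU's segmentation algorithm and not this crate's implementation; `is_han` and `sketch_tokens` are hypothetical names introduced for illustration, and the Han check covers only the basic CJK Unified Ideographs block.

```rust
/// Hypothetical helper: true for characters in the basic
/// CJK Unified Ideographs block (U+4E00..=U+9FFF) only.
fn is_han(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Hypothetical stand-in that mimics the boundaries seen in the
/// example above. It is NOT the ICU algorithm.
fn sketch_tokens(text: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut run = String::new(); // current run of letters/digits

    for c in text.chars() {
        if is_han(c) {
            // A Han ideograph ends any pending run and stands alone.
            if !run.is_empty() {
                tokens.push(std::mem::take(&mut run));
            }
            tokens.push(c.to_string());
        } else if c.is_alphanumeric() {
            // Letters and digits accumulate into one token.
            run.push(c);
        } else if !run.is_empty() {
            // Punctuation or whitespace ends the pending run.
            tokens.push(std::mem::take(&mut run));
        }
    }
    if !run.is_empty() {
        tokens.push(run);
    }
    tokens
}

fn main() {
    let tokens = sketch_tokens("我是中国人。 1234 Tests ");
    assert_eq!(tokens, ["我", "是", "中", "国", "人", "1234", "Tests"]);
    println!("{:?}", tokens);
}
```

For real text, rely on ICUTokenizer itself; the sketch only makes the shape of the expected output easier to read.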
Trait Implementations
impl Clone for ICUTokenizer
fn clone(&self) -> ICUTokenizer
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ICUTokenizer
impl Default for ICUTokenizer
fn default() -> ICUTokenizer
Returns the “default value” for a type.
impl Tokenizer for ICUTokenizer
type TokenStream<'a> = ICUTokenizerTokenStream<'a>
The token stream returned by this Tokenizer.
fn token_stream<'a>(&'a mut self, text: &'a str) -> Self::TokenStream<'a>
Creates a token stream for a given str.
impl Copy for ICUTokenizer
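The two snippets at the top of this page construct the tokenizer differently (`ICUTokenizer` and `ICUTokenizer::default()`). Because the type is a unit struct that implements Clone, Copy, and Default, these forms yield the same value. A minimal sketch using a hypothetical stand-in type, `UnitTokenizer`, so that it compiles without the crate; the `PartialEq` derive is an addition made here only to allow comparisons and is not listed among the traits of the real type.

```rust
// Hypothetical stand-in for illustration only; the real type is
// tantivy_analysis_contrib::icu::ICUTokenizer.
// PartialEq is NOT documented for the real type; it is derived here
// only so the values can be compared.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct UnitTokenizer;

/// Builds the same value three ways: literal, Default, and Clone.
fn build_all() -> [UnitTokenizer; 3] {
    let literal = UnitTokenizer;
    let defaulted = UnitTokenizer::default();
    let cloned = literal.clone();
    // `literal` is still usable after the clone because the type is Copy.
    [literal, defaulted, cloned]
}

fn main() {
    let all = build_all();
    // Every construction path gives an identical value.
    assert!(all.iter().all(|t| *t == UnitTokenizer));
    // A unit struct carries no data, so copying it costs nothing.
    assert_eq!(std::mem::size_of::<UnitTokenizer>(), 0);
    println!("{:?}", all[0]);
}
```

Note that `token_stream` takes `&mut self`, so the binding that holds the tokenizer (or the analyzer built from it) has to be declared `mut`, as in the example above.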
Auto Trait Implementations
impl Freeze for ICUTokenizer
impl RefUnwindSafe for ICUTokenizer
impl Send for ICUTokenizer
impl Sync for ICUTokenizer
impl Unpin for ICUTokenizer
impl UnwindSafe for ICUTokenizer
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.