Struct nlprule::tokenizer::Tokenizer

pub struct Tokenizer { /* fields omitted */ }

The complete tokenizer, which performs tagging, chunking and disambiguation.

Implementations

impl Tokenizer

pub fn new<P: AsRef<Path>>(p: P) -> Result<Self, Error>

Creates a new tokenizer from the path to a serialized tokenizer binary.

Errors

  • If the file cannot be opened.
  • If the file content cannot be deserialized to a tokenizer.

pub fn from_reader<R: Read>(reader: R) -> Result<Self, Error>

Creates a new tokenizer from a reader.
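
The relationship between a path-based and a reader-based constructor can be sketched with the standard library alone. The `Model` type below is a hypothetical stand-in, not nlprule's implementation: it simply stores the raw bytes, where a real tokenizer would deserialize them. Whether `Tokenizer::new` delegates to `from_reader` internally is an assumption of this sketch.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::path::Path;

/// Hypothetical stand-in for a type that is loaded from a binary blob.
#[derive(Debug, PartialEq)]
struct Model {
    bytes: Vec<u8>,
}

impl Model {
    /// Reader-based constructor: accepts files, in-memory buffers, etc.
    fn from_reader<R: Read>(mut reader: R) -> io::Result<Self> {
        let mut bytes = Vec::new();
        reader.read_to_end(&mut bytes)?;
        Ok(Model { bytes })
    }

    /// Path-based constructor: opens the file, then delegates to `from_reader`.
    fn new<P: AsRef<Path>>(p: P) -> io::Result<Self> {
        let file = File::open(p.as_ref())?;
        Self::from_reader(BufReader::new(file))
    }
}

fn main() {
    // A byte slice (for example from `include_bytes!`) implements `Read`.
    let model = Model::from_reader(&[1u8, 2, 3][..]).unwrap();
    assert_eq!(model.bytes, vec![1, 2, 3]);

    // A file that cannot be opened surfaces as an `Err`, not a panic.
    assert!(Model::new("this/path/does/not/exist.bin").is_err());
}
```

The reader-based form is useful when the binary is embedded in the executable or fetched over the network, since no file on disk is required.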

pub fn rules(&self) -> &Vec<DisambiguationRule>

pub fn tagger(&self) -> &Arc<Tagger>

pub fn chunker(&self) -> &Option<Chunker>

pub fn options(&self) -> &Arc<TokenizerOptions>

pub fn disambiguate<'t>(
    &'t self,
    tokens: Vec<IncompleteToken<'t>>
) -> Vec<DisambiguatedToken<'t>>

Applies rule-based disambiguation to the tokens. This does not change the number of tokens, but it can change their content arbitrarily.
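
The invariant described above (same number of tokens in and out, content free to change) can be illustrated with a toy example. Everything here is hypothetical: `ToyToken`, the tag names, and the single hard-coded rule are illustrative only and unrelated to nlprule's actual `DisambiguationRule` format.

```rust
/// Hypothetical token: a word plus its candidate part-of-speech tags.
#[derive(Debug, Clone, PartialEq)]
struct ToyToken<'t> {
    text: &'t str,
    tags: Vec<&'static str>,
}

/// Toy rule: directly after "the", drop the verb tag "VB" from a token,
/// unless that would leave the token without any tag.
/// Tokens are only rewritten in place; none are added or removed.
fn disambiguate<'t>(tokens: Vec<ToyToken<'t>>) -> Vec<ToyToken<'t>> {
    let mut out = tokens;
    for i in 1..out.len() {
        if out[i - 1].text.eq_ignore_ascii_case("the") {
            let kept: Vec<&'static str> = out[i]
                .tags
                .iter()
                .copied()
                .filter(|tag| *tag != "VB")
                .collect();
            if !kept.is_empty() {
                out[i].tags = kept;
            }
        }
    }
    out
}

fn main() {
    let tokens = vec![
        ToyToken { text: "the", tags: vec!["DT"] },
        ToyToken { text: "run", tags: vec!["NN", "VB"] },
    ];
    let count_before = tokens.len();
    let result = disambiguate(tokens);

    // The token count is unchanged; only the tags of "run" were narrowed.
    assert_eq!(result.len(), count_before);
    assert_eq!(result[1].tags, vec!["NN"]);
}
```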

pub fn sentencize<'t>(&'t self, text: &'t str) -> Vec<Vec<IncompleteToken<'t>>>

Splits the text into sentences and tokenizes each sentence.

pub fn pipe<'t>(&'t self, text: &'t str) -> Vec<Vec<Token<'t>>>

Applies the entire tokenization pipeline including sentencization, tagging, chunking and disambiguation.
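
Both `sentencize` and `pipe` return a nested vector (one inner vector per sentence) whose tokens share the lifetime `'t` of the input text. The sketch below mirrors only that shape using a naive, hypothetical `toy_pipe` function built on the standard library; it splits on sentence-ending punctuation and whitespace and performs no tagging, chunking or disambiguation.

```rust
/// Toy stand-in for a sentence-then-token split.
/// The returned `&'t str` tokens borrow from `text`, so `text` must outlive them.
fn toy_pipe<'t>(text: &'t str) -> Vec<Vec<&'t str>> {
    let is_end = |c: char| matches!(c, '.' | '!' | '?');
    text.split_inclusive(is_end)
        .map(|sentence| {
            sentence
                .split_whitespace()
                // Strip the sentence-ending punctuation from each word.
                .map(|word| word.trim_end_matches(is_end))
                .filter(|word| !word.is_empty())
                .collect::<Vec<_>>()
        })
        // Drop sentences that contain no tokens (e.g. trailing whitespace).
        .filter(|tokens| !tokens.is_empty())
        .collect()
}

fn main() {
    let text = "A quick test. Another one!";
    let sentences = toy_pipe(text);

    // Outer loop: sentences. Inner loop: tokens of that sentence.
    for (i, sentence) in sentences.iter().enumerate() {
        for token in sentence {
            println!("sentence {}: {}", i, token);
        }
    }

    assert_eq!(
        sentences,
        vec![vec!["A", "quick", "test"], vec!["Another", "one"]]
    );
}
```

Because the tokens borrow from the input, the text has to be kept alive for as long as the result is in use; the same constraint follows from the `'t` lifetime in the signatures above.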

Trait Implementations

impl Default for Tokenizer

impl<'de> Deserialize<'de> for Tokenizer

impl Serialize for Tokenizer

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> DeserializeOwned for T where
    T: for<'de> Deserialize<'de>

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>

impl<T> Pointable for T

type Init = T

The type for initializers.

impl<T, U> TryFrom<U> for T where
    U: Into<T>

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.