language-tokenizer 0.1.0

A text tokenizer for linguistic purposes such as text matching. Supports more than 40 languages, including English, French, Russian, Japanese, and Thai.
There is currently very little structured metadata to build this page from. Check the main library docs, README, or Cargo.toml in case the author documented the features there.

This version has 11 feature flags, 0 of them enabled by default.

chinese-icu
chinese-lindera
full
japanese-icu
japanese-ipadic-lindera
japanese-ipadic-neologd-lindera
japanese-unidic-lindera
korean-lindera
serde
snowball
southeast-asian
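
Since none of these flags is enabled by default, each must be requested explicitly in the dependent crate's Cargo.toml. A minimal sketch using standard Cargo feature syntax (the choice of `japanese-ipadic-lindera` and `serde` here is only an illustration picked from the list above; the page does not document what each flag enables):

```toml
# Hypothetical consumer Cargo.toml enabling two of the
# crate's feature flags; no features are on by default.
[dependencies]
language-tokenizer = { version = "0.1.0", features = ["japanese-ipadic-lindera", "serde"] }
```

Alternatively, the `full` flag presumably enables everything at once, at the cost of pulling in every optional dictionary and backend dependency.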