charabia 0.9.1

A simple library to detect the language, tokenize the text and normalize the tokens

Very little structured metadata is currently available to build this page from. Check the main library docs, the README, or Cargo.toml in case the author documented the features there.

This version has 20 feature flags, 14 of them enabled by default.

default

chinese (default)

german-segmentation (default)

This feature flag does not enable additional features.

greek (default)

This feature flag does not enable additional features.

hebrew (default)

This feature flag does not enable additional features.

japanese (default)

khmer (default)

This feature flag does not enable additional features.

korean (default)

swedish-recomposition (default)

This feature flag does not enable additional features.

thai (default)

This feature flag does not enable additional features.

turkish (default)

This feature flag does not enable additional features.

vietnamese (default)

This feature flag does not enable additional features.

chinese-normalization (default)

This feature flag does not enable additional features.

chinese-segmentation (default)

japanese-segmentation-unidic (default)

chinese-normalization-pinyin

japanese-segmentation-ipadic

japanese-transliteration

latin-camelcase

latin-snakecase

lindera
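
The flags listed above are selected through the dependency entry in Cargo.toml. The fragment below is a minimal sketch, not a recommended configuration: it assumes a project that wants to opt out of the default set and enable only a few flags. The particular flags chosen (greek, hebrew, latin-camelcase) are an arbitrary illustration taken from the list above.

```toml
# Hypothetical dependency entry; the feature selection is illustrative only.
[dependencies.charabia]
version = "0.9.1"
# Turn off the flags marked "(default)" above.
default-features = false
# Re-enable two default flags and add one opt-in flag.
features = ["greek", "hebrew", "latin-camelcase"]
```

Omitting `default-features = false` keeps every flag marked "(default)" enabled, and anything named in `features` is added on top of that set.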