Litsea is an extremely compact word segmentation and part-of-speech (POS) tagging library implemented in Rust.
It performs word segmentation using a compact pre-trained model based on AdaBoost binary classification, inspired by TinySegmenter and TinySegmenterMaker. It also supports joint word segmentation and POS tagging using an Averaged Perceptron with Universal POS (UPOS) tags.
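To illustrate the idea of segmentation as binary classification, here is a minimal sketch in the spirit of TinySegmenter: each position between two characters is scored by summing the weights of the features that fire there, and a positive score is treated as a word boundary. Everything in it (`ToyModel`, `boundary_features`, `segment`, the feature templates, and the weights) is hypothetical and exists only for this example. It is not Litsea's API, feature set, or model format.

```rust
use std::collections::HashMap;

/// Hypothetical linear model: feature string -> weight, plus a bias.
/// Illustrative only; not Litsea's actual types or model format.
struct ToyModel {
    weights: HashMap<String, f64>,
    bias: f64,
}

impl ToyModel {
    /// Sum the weights of the features that fire, plus the bias.
    /// Features missing from the model contribute nothing.
    fn score(&self, features: &[String]) -> f64 {
        self.bias
            + features
                .iter()
                .filter_map(|f| self.weights.get(f))
                .sum::<f64>()
    }
}

/// Features for the candidate boundary just before `chars[i]`.
/// Assumes `1 <= i < chars.len()`. The templates are made up for this sketch.
fn boundary_features(chars: &[char], i: usize) -> Vec<String> {
    let prev = chars[i - 1];
    let next = chars[i];
    vec![
        format!("P:{}", prev),
        format!("N:{}", next),
        format!("PN:{}{}", prev, next),
    ]
}

/// Split `text` wherever the model scores a boundary above zero.
fn segment(model: &ToyModel, text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut words = Vec::new();
    let mut current = String::new();
    for (i, &c) in chars.iter().enumerate() {
        // Positive score: close the current word before this character.
        if i > 0 && model.score(&boundary_features(&chars, i)) > 0.0 {
            words.push(std::mem::take(&mut current));
        }
        current.push(c);
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```

For instance, with a weight of `1.0` on the hypothetical features `PN:れは` and `PN:はペ` and a bias of `-0.5`, `segment(&model, "これはペン")` yields `["これ", "は", "ペン"]`. A trained model would learn such weights from annotated text rather than having them set by hand.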
Supported Languages
- Japanese
- Chinese (Simplified and Traditional)
- Korean
Re-exports
- pub use adaboost::AdaBoost;
- pub use error::LitseaError;
- pub use error::Result;
- pub use extractor::Extractor;
- pub use language::Language;
- pub use metrics::BinaryMetrics;
- pub use metrics::MulticlassMetrics;
- pub use perceptron::AveragedPerceptron;
- pub use segmenter::Segmenter;
- pub use trainer::PosTrainer;
- pub use trainer::Trainer;
- pub use upos::SegmentLabel;
- pub use upos::Upos;
Modules
- adaboost
- error: Error types for the litsea library.
- extractor
- language
- metrics: Evaluation metrics for the learners.
- perceptron
- segmenter
- trainer
- upos