nlpO3
Thai Natural Language Processing library in Rust, with Python and Node bindings. Formerly oxidized-thainlp.
Features
- Thai word tokenizer
  - Uses a maximal-matching, dictionary-based tokenization algorithm and honors Thai Character Cluster boundaries
  - 2x faster than a comparable pure-Python implementation (PyThaiNLP's newmm)
  - Supports custom dictionaries
  - Includes a default dictionary (62,000 words, copied from PyThaiNLP)
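To illustrate what dictionary-based maximal matching means, here is a minimal sketch in Rust. It is not nlpO3's implementation: the function name `segment_min_words` is hypothetical, the dictionary is a plain `HashSet` standing in for a custom dictionary, and Thai Character Cluster boundaries are ignored. It only shows the core idea of choosing the segmentation with the fewest dictionary words, with unknown characters falling back to single-character tokens.

```rust
use std::collections::HashSet;

/// Split `text` into the fewest tokens possible, preferring dictionary words.
/// Characters not covered by any dictionary word become one-character tokens.
/// (Hypothetical helper for illustration; not part of the nlpO3 API.)
fn segment_min_words(text: &str, dict: &HashSet<&str>) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let n = chars.len();
    // Longest dictionary entry, in characters; bounds the inner search window.
    let max_len = dict
        .iter()
        .map(|w| w.chars().count())
        .max()
        .unwrap_or(1)
        .max(1);

    // best[i] = (token count, start index of the last token) for chars[..i].
    let mut best: Vec<(usize, usize)> = vec![(0, 0); n + 1];

    for end in 1..=n {
        // Fallback: the last character stands alone as a token.
        best[end] = (best[end - 1].0 + 1, end - 1);
        let lo = end.saturating_sub(max_len);
        for start in lo..end {
            let candidate: String = chars[start..end].iter().collect();
            // Accept a dictionary word only if it strictly lowers the token count.
            if dict.contains(candidate.as_str()) && best[start].0 + 1 < best[end].0 {
                best[end] = (best[start].0 + 1, start);
            }
        }
    }

    // Walk the back-pointers from the end of the text to recover the tokens.
    let mut tokens = Vec::new();
    let mut end = n;
    while end > 0 {
        let start = best[end].1;
        tokens.push(chars[start..end].iter().collect::<String>());
        end = start;
    }
    tokens.reverse();
    tokens
}
```

With a dictionary containing ห้อง, สมุด, ห้องสมุด, ประชา, ชน, and ประชาชน, calling `segment_min_words("ห้องสมุดประชาชน", &dict)` yields the two tokens ห้องสมุด and ประชาชน rather than a split into four shorter words. A production tokenizer would typically use a trie instead of rebuilding candidate strings, and would restrict split points to valid character-cluster boundaries.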
Usage
Command line interface
Bindings
As Rust library
In Cargo.toml:
[dependencies]
# ...
nlpo3 = "1.2.0"
Build
Requirements
Steps
Generic test:
    cargo test
Build the API documentation and open it to check:
    cargo doc --open
Build (remove --release to keep debug information):
    cargo build --release
Check target/ for build artifacts.
Issues
Please report issues at https://github.com/PyThaiNLP/nlpo3/issues