rustling 0.7.0

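Rustling is a fast library for computational linguistics, usable from both Python and Rust. Measured speedups over comparable Python libraries are listed under Performance.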

Documentation: Python | Rust

Features

  • N-grams
  • Language models
  • Hidden Markov models
  • Word segmentation
  • Part-of-speech tagging
  • CHAT parsing for TalkBank and CHILDES data
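
To make the first item concrete, here is a minimal sketch of what n-gram counting means, written in plain Python with only the standard library. It does not use rustling and does not reflect rustling's API; the function names `ngrams` and `count_ngrams` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from typing import Iterable, Iterator, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every contiguous run of n tokens, left to right."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields nothing because the range is empty.
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def count_ngrams(sentences: Iterable[Sequence[str]], n: int) -> Counter:
    """Count n-grams over tokenized sentences (hypothetical helper, not rustling's API)."""
    counts: Counter = Counter()
    for sentence in sentences:
        # N-grams never span a sentence boundary in this sketch.
        counts.update(ngrams(sentence, n))
    return counts


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
    bigrams = count_ngrams(corpus, 2)
    print(bigrams[("the", "cat")])  # prints 2
```

Counts like these are the raw material for an n-gram language model. For the library's actual interface, see the Python and Rust documentation.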

Performance

Component          Task                        Speedup   Compared to
Language Models    Fit                         11x       NLTK
Language Models    Score                       2x        NLTK
Language Models    Generate                    86-107x   NLTK
Word Segmentation  LongestStringMatching       9x        wordseg
POS Tagging        Training                    5x        NLTK
POS Tagging        Tagging                     17x       NLTK
HMM                Fit                         14x       hmmlearn
HMM                Predict                     0.9x      hmmlearn
HMM                Score                       5x        hmmlearn
CHAT Parsing       Reading from a ZIP archive  30x       pylangacq
CHAT Parsing       Reading from strings        35x       pylangacq
CHAT Parsing       Parsing utterances          15x       pylangacq
CHAT Parsing       Parsing tokens              8x        pylangacq

See benchmarks/ for reproduction scripts.
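
As a reading aid for the speedup figures, the sketch below shows one common way such a ratio is computed: the baseline's wall-clock time divided by the candidate's. This is an assumption about how the numbers are defined, not a description of the scripts in benchmarks/, and the helpers `best_time` and `speedup` are hypothetical. Under this definition a value below 1x, such as 0.9x, means the candidate was slower than the baseline.

```python
import time
from typing import Callable


def best_time(func: Callable[[], object], repeats: int = 5) -> float:
    """Return the fastest of several wall-clock timings of func(), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Baseline time divided by candidate time (assumed definition of 'Nx')."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Stand-in workloads for illustration only; a real comparison would call
    # the two libraries on identical input data.
    slow = best_time(lambda: sum(i * i for i in range(200_000)))
    fast = best_time(lambda: sum(i * i for i in range(20_000)))
    print(f"{speedup(slow, fast):.1f}x")
```

Taking the minimum over several runs reduces noise from other processes, but the printed ratio still varies by machine.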

Installation

Python

Using pip:

pip install rustling

Using conda:

conda install -c conda-forge rustling

For Pyodide, pre-built WASM wheels are available from each GitHub release. Look for the .whl file with emscripten in the filename. These wheels are built with multithreading disabled, because Pyodide does not support it.

Rust

cargo add rustling

License

MIT License