tiniestsegmenter 0.3.0

Compact Japanese segmenter

TiniestSegmenter

A port of TinySegmenter written in pure, safe Rust with no dependencies. Bindings are available for both Rust and Python.

TinySegmenter is an n-gram word tokenizer for Japanese text originally built by Taku Kudo (2008).
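To give a feel for what "n-gram" means here, the sketch below classifies each character by script and builds bigrams over those classes, the kind of feature a TinySegmenter-style model scores when deciding where a word boundary falls. It is an illustration only, not this crate's implementation: the function names `char_type`, `type_signature`, and `type_bigrams` are hypothetical, the class letters and Unicode ranges are simplified assumptions, and no trained weights are involved.

```rust
/// Map a character to a coarse script class (simplified, assumed ranges):
/// I = hiragana, K = katakana, H = kanji, A = ASCII letter, N = ASCII digit, O = other.
fn char_type(c: char) -> char {
    match c {
        '\u{3041}'..='\u{3093}' => 'I',        // hiragana ぁ..ん
        '\u{30A1}'..='\u{30F4}' | 'ー' => 'K', // katakana ァ..ヴ plus the long-vowel mark
        '\u{4E00}'..='\u{9FA0}' => 'H',        // common kanji block
        'a'..='z' | 'A'..='Z' => 'A',
        '0'..='9' => 'N',
        _ => 'O',
    }
}

/// Replace every character of `text` with its script class.
fn type_signature(text: &str) -> String {
    text.chars().map(char_type).collect()
}

/// Build bigrams over the script classes of adjacent characters.
fn type_bigrams(text: &str) -> Vec<String> {
    let types: Vec<char> = text.chars().map(char_type).collect();
    types.windows(2).map(|w| w.iter().collect()).collect()
}
```

Calling `type_signature("ジャガイモが好きです。")` shows where the script changes (katakana to hiragana to kanji), which is often, though not always, where a word boundary sits. A real segmenter combines many such character and class n-grams with learned scores rather than relying on script changes alone.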

Usage

Add the crate to your project: cargo add tiniestsegmenter.

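use tiniestsegmenter as ts;

fn main() {
    // Split a Japanese sentence into word tokens.
    let tokens: Vec<&str> = ts::tokenize("ジャガイモが好きです。");
    // Print the tokens so the result is visible and the binding is used.
    println!("{:?}", tokens);
}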