Crate vaporetto

Vaporetto

Vaporetto is a fast and lightweight tokenizer based on pointwise prediction.
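To make the term "pointwise prediction" concrete, here is a minimal, self-contained sketch. It assumes the usual meaning of the term: each gap between two adjacent characters is classified independently as a word boundary or not, using only the characters around it. The weight table and the function names (`toy_weights`, `predict_boundaries`, `to_tokenized`, `tokenize`) are hypothetical and hand-written for illustration; this is not Vaporetto's model format, feature set, or API.

```rust
use std::collections::HashMap;

/// Hypothetical hand-written weights keyed by the character pair around a gap.
/// A positive weight votes for a word boundary, a negative one against it.
fn toy_weights() -> HashMap<(char, char), i32> {
    let mut w = HashMap::new();
    w.insert(('火', '星'), -3);
    w.insert(('星', '猫'), 2);
    w.insert(('猫', 'だ'), 1);
    w
}

/// Scores every gap between adjacent characters independently (pointwise).
/// Pairs missing from the table score 0, which counts as "no boundary".
fn predict_boundaries(chars: &[char], weights: &HashMap<(char, char), i32>) -> Vec<bool> {
    chars
        .windows(2)
        .map(|w| weights.get(&(w[0], w[1])).copied().unwrap_or(0) > 0)
        .collect()
}

/// Renders the characters with a space at every predicted boundary.
fn to_tokenized(chars: &[char], boundaries: &[bool]) -> String {
    let mut out = String::new();
    for (i, &c) in chars.iter().enumerate() {
        out.push(c);
        if i < boundaries.len() && boundaries[i] {
            out.push(' ');
        }
    }
    out
}

/// Convenience wrapper: split into characters, predict, then render.
fn tokenize(text: &str, weights: &HashMap<(char, char), i32>) -> String {
    let chars: Vec<char> = text.chars().collect();
    let boundaries = predict_boundaries(&chars, weights);
    to_tokenized(&chars, &boundaries)
}

fn main() {
    let weights = toy_weights();
    println!("{}", tokenize("火星猫だ", &weights));
}
```

Because each gap is decided on its own, the predictions need no search over whole-sentence segmentations. A real model would score each gap with learned weights over richer context than a single character pair.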

Examples

use std::fs::File;
use std::io::{prelude::*, stdin, BufReader};

use vaporetto::{Model, Predictor, Sentence};

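// Load a trained model from disk and build a predictor from it.
let mut f = BufReader::new(File::open("model.bin").unwrap());
let model = Model::read(&mut f).unwrap();
let mut predictor = Predictor::new(model);

// Tokenize each line read from standard input.
for line in stdin().lock().lines() {
    let s = Sentence::from_raw(line.unwrap()).unwrap();
    // Predict the boundaries, then render the sentence as a tokenized string.
    let s = predictor.predict(s);
    let toks = s.to_tokenized_string().unwrap();
    println!("{}", toks);
}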

Training requires the crate feature train. For more details, see Trainer.

Structs

Dataset (feature: train)

Dataset manager.

Model data created by KyTea.

Model

Model data.

MultithreadPredictor (feature: multithreading)

Predictor for multithreading.

Predictor

Predictor.

Sentence

Sentence with boundary annotations.

Trainer (feature: train)

Trainer.

Enums

Boundary type.

Character type.