Crate vaporetto


§Vaporetto

Vaporetto is a fast and lightweight tokenizer based on pointwise prediction.
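To make "pointwise prediction" concrete, the following is a minimal toy sketch, not Vaporetto's actual implementation. It assumes that each boundary between two adjacent characters is scored independently from local context only (here, just the character pair spanning the boundary). The names tokenize and toy_weights, and the weight values, are hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical hand-written weights keyed by the character pair that spans
/// a boundary. A positive score means "split here". A real model would learn
/// such weights from annotated data and use richer features.
fn toy_weights() -> HashMap<(char, char), i32> {
    let mut weights = HashMap::new();
    weights.insert(('長', 'は'), 1);
    weights.insert(('は', '火'), 1);
    weights.insert(('社', '長'), -1);
    weights
}

/// Toy pointwise tokenizer: every boundary is decided on its own,
/// without looking at decisions made for other boundaries.
fn tokenize(text: &str, weights: &HashMap<(char, char), i32>) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut tokens = Vec::new();
    let mut current = String::new();
    for (i, &c) in chars.iter().enumerate() {
        current.push(c);
        if let Some(&next) = chars.get(i + 1) {
            // Unknown character pairs default to a score of 0 (no split).
            let score = weights.get(&(c, next)).copied().unwrap_or(0);
            if score > 0 {
                tokens.push(std::mem::take(&mut current));
            }
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}
```

With these toy weights, tokenize("社長は火星", &toy_weights()) yields the tokens 社長, は, and 火星. Because each decision depends only on local features, the boundaries can be scored in a single pass over the text.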

§Examples

use std::fs::File;

use vaporetto::{Model, Predictor, Sentence};

// Load a trained model from a file.
let f = File::open("../resources/model.bin")?;
let model = Model::read(f)?;
// The second argument enables tag prediction.
let predictor = Predictor::new(model, true)?;

let mut buf = String::new();

let mut s = Sentence::default();

s.update_raw("まぁ社長は火星猫だ")?;
predictor.predict(&mut s);
s.fill_tags();
s.write_tokenized_text(&mut buf);
assert_eq!(
    "まぁ/名詞/マー 社長/名詞/シャチョー は/助詞/ワ 火星/名詞/カセー 猫/名詞/ネコ だ/助動詞/ダ",
    buf,
);

s.update_raw("まぁ良いだろう")?;
predictor.predict(&mut s);
s.fill_tags();
s.write_tokenized_text(&mut buf);
assert_eq!(
    "まぁ/副詞/マー 良い/形容詞/ヨイ だろう/助動詞/ダロー",
    buf,
);

Tag prediction requires the crate feature tag-prediction.

Training requires the crate feature train. For more details, see Trainer.
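The tokenized text in the example above separates tokens with spaces and joins each surface form to its tags with slashes. As a rough sketch of how downstream code might consume that text, the helper below splits one line back into (surface, tags) pairs using only the standard library. The function name parse_tokenized is hypothetical and not part of this crate, and the sketch assumes that no surface or tag contains a space or slash of its own.

```rust
/// Splits one line of `surface/tag/tag ...` text into (surface, tags) pairs.
/// Assumption: surfaces and tags never contain a space or a slash themselves.
fn parse_tokenized(line: &str) -> Vec<(String, Vec<String>)> {
    line.split(' ')
        // Skip empty pieces produced by leading, trailing, or doubled spaces.
        .filter(|piece| !piece.is_empty())
        .map(|piece| {
            let mut parts = piece.split('/');
            // The first field is the surface form; the rest are tags.
            let surface = parts.next().unwrap_or("").to_string();
            let tags = parts.map(str::to_string).collect();
            (surface, tags)
        })
        .collect()
}
```

For the second sentence in the example, this returns three pairs, the second of which has the surface 良い and the tags 形容詞 and ヨイ. For structured access inside a program, the Token values returned by Sentence::iter_tokens() are likely the better choice than re-parsing text.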

Modules§

errors
Definition of errors.

Structs§

KyteaModel (feature: kytea)
Model data created by KyTea.
Model
Model data.
Predictor
Predictor created from the model.
Sentence
Sentence data containing boundary and tag annotations.
Token
Token information.
TokenIterator
Iterator returned by Sentence::iter_tokens().
Trainer (feature: train)
Trainer.
WordWeightRecord
Record of weights for each word.

Enums§

CharacterBoundary
Boundary type.
CharacterType
Character type.
SolverType (feature: train)
Solver type.

Constants§

VERSION
Version number of this library.