mecrab-word2vec: Pure Rust Word2Vec implementation
Fast, memory-efficient word2vec training optimized for Japanese morphological analysis.
§Features
- Skip-gram with negative sampling
- Multi-threaded training with Rayon
- Direct MCV1 format output
- Memory-efficient streaming
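To make the first feature concrete, here is a minimal sketch of one skip-gram update with negative sampling for a single (center, context) pair. It shows the textbook algorithm only and is not taken from this crate's source; the names `sgns_step`, `input`, `outputs`, and `lr` are hypothetical.

```rust
/// Logistic sigmoid.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// One SGD step of skip-gram with negative sampling (illustrative sketch,
/// not the crate's API). `input` is the center word's vector; `outputs`
/// holds the output vectors, indexed by word id.
fn sgns_step(
    input: &mut [f32],
    outputs: &mut [Vec<f32>],
    context: usize,
    negatives: &[usize],
    lr: f32,
) {
    let mut grad_input = vec![0.0f32; input.len()];
    // The true context word has label 1; each negative sample has label 0.
    let targets = std::iter::once((context, 1.0f32))
        .chain(negatives.iter().map(|&n| (n, 0.0f32)));
    for (word, label) in targets {
        let out = &mut outputs[word];
        let dot: f32 = input.iter().zip(out.iter()).map(|(a, b)| a * b).sum();
        // Gradient of the log-likelihood w.r.t. the dot product, scaled by lr.
        let g = (label - sigmoid(dot)) * lr;
        for i in 0..input.len() {
            grad_input[i] += g * out[i]; // accumulate, apply after the loop
            out[i] += g * input[i];
        }
    }
    for (x, d) in input.iter_mut().zip(&grad_input) {
        *x += d;
    }
}
```

After one step, the center vector scores higher against the true context word and lower against the sampled negatives.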
§Example
use mecrab_word2vec::Word2VecBuilder;

let model = Word2VecBuilder::new()
    .vector_size(100)
    .window_size(5)
    .negative_samples(5)
    .min_count(10)
    .epochs(3)
    .threads(8)
    .build()?;
model.train_from_file("corpus.txt")?;
model.save_text("vectors.txt")?;

Structs§
- Vocabulary - Vocabulary with frequency counts and subsampling
- Word2Vec - Word2Vec model
- Word2VecBuilder - Builder for Word2Vec model
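
The subsampling mentioned for `Vocabulary` can be illustrated with the keep-probability formula used by the original word2vec C implementation. Whether this crate uses the same formula is an assumption; the function name `keep_probability` and the parameter `sample` are hypothetical and not part of the documented API.

```rust
/// Probability of keeping one occurrence of a word during training
/// (illustrative sketch; formula from the original word2vec C code).
/// `count` is the word's frequency, `total` the corpus token count,
/// and `sample` the subsampling threshold.
fn keep_probability(count: u64, total: u64, sample: f64) -> f64 {
    if sample <= 0.0 || count == 0 || total == 0 {
        return 1.0; // subsampling disabled, or nothing to scale
    }
    let f = count as f64 / total as f64; // relative frequency
    let p = ((f / sample).sqrt() + 1.0) * (sample / f);
    p.min(1.0) // rare words are always kept
}
```

With a threshold of 1e-3, a word that makes up 10% of the corpus is kept about 11% of the time, while a word with relative frequency 1e-5 is always kept.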