A tokenizer implemented in Rust, modeled on Lucene's EnglishAnalyzer (lowercasing, English stop-word removal, and Porter stemming).
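A minimal sketch of an EnglishAnalyzer-style pipeline, to illustrate the kind of processing described above: lowercase the input, split on non-alphanumeric characters, and drop common English stop words. This is an illustration, not the project's actual implementation; the stop-word list here is abbreviated, and Lucene's EnglishAnalyzer additionally applies possessive-'s removal and Porter stemming, which are omitted for brevity.

```rust
/// Sketch of an EnglishAnalyzer-like tokenizer: lowercase, split on
/// non-alphanumeric characters, filter an (abbreviated) stop-word list.
/// Possessive stripping and Porter stemming are intentionally omitted.
fn tokenize(text: &str) -> Vec<String> {
    const STOP_WORDS: &[&str] = &[
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
        "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these", "they", "this",
        "to", "was", "will", "with",
    ];
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty() && !STOP_WORDS.contains(t))
        .map(String::from)
        .collect()
}

fn main() {
    // "The" is dropped as a stop word; punctuation acts as a separator.
    println!("{:?}", tokenize("The Quick Brown-Fox!"));
}
```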
Print the tokenizer's command-line help (`cargo r -r` is shorthand for `cargo run --release`; the `--` separates cargo's own flags from the binary's):

cargo r -r -- --help

Tokenize wiki.txt and write the result to wiki_tocken_f10.json (judging from the `_f10` suffix, `-f 10` presumably sets a minimum-frequency threshold of 10):

cargo r -r -- -i wiki.txt -o wiki_tocken_f10.json -f 10