detone 1.0.0

Decompose Vietnamese tone marks

detone

Apache 2 / MIT dual-licensed

An iterator adapter that wraps an iterator over char. The input must be a sequence of chars in Normalization Form C (this precondition is not checked!). The adapter yields chars in one of two forms: either only the tone marks that wouldn't otherwise fit into windows-1258 are decomposed, or the text is decomposed into orthographic units.

Use cases include preprocessing before encoding Vietnamese text into windows-1258 or converting precomposed Vietnamese text into a form that looks like it was written with the (non-IME) Vietnamese keyboard layout (e.g. for machine learning training or benchmarking purposes).
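
To illustrate the two output forms described above, here is a minimal std-only sketch. It is not detone's implementation or API: split_tone, decompose_sketch and the orthographic flag are hypothetical names, and the table covers only five characters chosen for illustration. It assumes that á has a precomposed slot in windows-1258 while ế, ệ, ờ and ả do not.

```rust
/// Hypothetical helper (not part of detone): splits one precomposed
/// character into a base letter and a combining tone mark.
/// Only a handful of characters are covered, for illustration.
fn split_tone(c: char, orthographic: bool) -> Option<(char, char)> {
    match c {
        // Assumed to have no precomposed slot in windows-1258,
        // so these are decomposed in both modes.
        '\u{1EBF}' => Some(('\u{00EA}', '\u{0301}')), // ế -> ê + acute
        '\u{1EC7}' => Some(('\u{00EA}', '\u{0323}')), // ệ -> ê + dot below
        '\u{1EDD}' => Some(('\u{01A1}', '\u{0300}')), // ờ -> ơ + grave
        '\u{1EA3}' => Some(('a', '\u{0309}')),        // ả -> a + hook above
        // Assumed to fit into windows-1258 as is, so it is decomposed
        // only when orthographic units are requested.
        '\u{00E1}' if orthographic => Some(('a', '\u{0301}')), // á -> a + acute
        _ => None,
    }
}

/// Hypothetical adapter: yields each input char unchanged, or as
/// base letter followed by tone mark when split_tone has an entry.
fn decompose_sketch<I>(iter: I, orthographic: bool) -> impl Iterator<Item = char>
where
    I: Iterator<Item = char>,
{
    iter.flat_map(move |c| {
        let (base, mark) = match split_tone(c, orthographic) {
            Some((b, m)) => (b, Some(m)),
            None => (c, None),
        };
        // Option<char> is IntoIterator, so it chains as zero or one item.
        std::iter::once(base).chain(mark)
    })
}
```

Collecting decompose_sketch(text.chars(), false) into a String leaves á untouched, while passing true also splits it into a followed by U+0301. Note that ệ is split into ê plus a dot below, which differs from its Unicode canonical decomposition (ẹ plus a circumflex); that difference is why plain NFD is not a substitute here.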

Licensing

Please see the file named COPYRIGHT.

Documentation

Generated API documentation is available online.

Release Notes

1.0.0

  • Initial release.