Crate utf8proc


Rust bindings to the utf8proc library supporting normalization, case-folding, and character class testing.

A statically linked binary is under 350K, yet the crate can replace the functionality of the unicode-width and unicode-normalization crates, among others. The underlying utf8proc library is used for Unicode support in the Julia programming language.
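To illustrate why normalization and case-folding matter, here is a minimal sketch that uses only the Rust standard library. It deliberately does not call this crate, and the `code_points` helper is a hypothetical name introduced for the example. It shows two strings that render identically but compare unequal, and a pair of spellings that simple lowercasing fails to match.

```rust
/// Hypothetical helper: lists the code points of `s` as `U+XXXX` strings.
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    // "é" as a single precomposed code point.
    let precomposed = "\u{00E9}";
    // "é" as the letter "e" followed by a combining acute accent.
    let decomposed = "e\u{0301}";

    // Both render the same, but a plain comparison sees different code points.
    assert_ne!(precomposed, decomposed);
    println!("{:?}", code_points(precomposed)); // ["U+00E9"]
    println!("{:?}", code_points(decomposed)); // ["U+0065", "U+0301"]

    // Lowercasing is not the same as case folding: "ß" stays "ß" when
    // lowercased, so these two spellings do not match.
    assert_ne!("straße".to_lowercase(), "STRASSE".to_lowercase());
}
```

Normalizing both strings to the same form (for example NFC) before comparing removes the first mismatch, and full Unicode case folding, which maps "ß" to "ss", removes the second.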

§Limitations

The underlying utf8proc library does not support any “derived properties”, including the XID_Start/XID_Continue properties. Emulating this functionality would be slow, require additional static storage, or both. For this purpose, the unicode-ident crate is recommended: it is very fast and needs only ~10KiB of static storage.

It also does not support lookup or resolution of character names. For that, consider the unicode_names2 crate.

The safe bindings do not (yet) wrap all the functionality that the C library provides. PRs are welcome.

Modules§

case
Functionality for Unicode case mappings: case conversion, case detection, and caseless matching.
grapheme
Unicode grapheme handling.
properties
Defines the CharProperties type, which contains full information on a Unicode codepoint.
transform
Operations to transform strings, including the map function.

Structs§

Error
Indicates an error that occurred in utf8proc.

Enums§

ErrorKind
Indicates the type of the underlying utf8proc::Error.

Functions§

unicode_version
Return the Unicode version utf8proc was compiled with.
version
Return the version of the underlying utf8proc library.