Rust bindings to the utf8proc library supporting normalization, case-folding, and character class testing.
A statically linked binary is under 350K, yet can replace the functionality of the unicode-width and unicode-normalization crates, among others.
The underlying utf8proc library is used to implement Unicode support in the Julia programming language.
§Limitations
The underlying utf8proc library does not support any “derived properties”, including the XID_Start/XID_Continue properties.
For this purpose, the unicode-ident crate is recommended.
It is very fast and needs only ~10KiB of static storage.
Emulating this functionality would be slow and/or require additional static storage.
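To illustrate the storage cost mentioned above, here is a minimal sketch of a table-driven XID_Start check. It is not part of this crate or of unicode-ident: the helper name `is_xid_start_latin1` and the table are hypothetical, and the table is deliberately truncated to code points below U+0100. A complete table would have to cover the whole Unicode range, which is where the extra static storage comes from.

```rust
use std::cmp::Ordering;

/// Inclusive code point ranges that have the XID_Start property,
/// truncated to U+0000..=U+00FF for illustration only.
const XID_START_LATIN1: &[(u32, u32)] = &[
    (0x41, 0x5A), // A-Z
    (0x61, 0x7A), // a-z
    (0xAA, 0xAA), // FEMININE ORDINAL INDICATOR
    (0xB5, 0xB5), // MICRO SIGN
    (0xBA, 0xBA), // MASCULINE ORDINAL INDICATOR
    (0xC0, 0xD6), // skips U+00D7 MULTIPLICATION SIGN
    (0xD8, 0xF6), // skips U+00F7 DIVISION SIGN
    (0xF8, 0xFF),
];

/// Hypothetical helper: binary search over the sorted range table.
/// Returns false for every code point above U+00FF because the table stops there.
fn is_xid_start_latin1(c: char) -> bool {
    let cp = c as u32;
    XID_START_LATIN1
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                Ordering::Less
            } else if lo > cp {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}
```

For real identifier checks, the unicode-ident crate recommended above remains the better choice.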
utf8proc also does not support lookup or resolution of character names.
For that, consider the unicode_names2 crate.
The safe bindings do not (yet) wrap all the functionality that the C library does. PRs are welcome.
Modules§
- case
- Functionality for Unicode case mappings: case conversion, case detection, and caseless matching.
- grapheme
- Unicode grapheme handling.
- properties
- Defines the CharProperties type, which contains full information on a Unicode codepoint.
- transform
- Operations to transform strings, including the map function.
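As background for the caseless matching mentioned under the case module, the sketch below uses only the Rust standard library and does not call this crate. The helper name `lowercase_eq` is hypothetical. It shows why comparing lowercased strings is only an approximation: Unicode full case folding maps ß to ss, while lowercasing leaves ß unchanged.

```rust
/// Hypothetical helper: compares two strings after lowercasing both with std.
/// This is plain lowercasing, not Unicode case folding.
fn lowercase_eq(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}
```

With this helper, `lowercase_eq("HELLO", "hello")` is true, but `lowercase_eq("Straße", "STRASSE")` is false, even though the two strings are equal under full case folding.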
Structs§
- Error
- Indicates an error that occurred in utf8proc.
Enums§
- ErrorKind
- Indicates the type of the underlying utf8proc::Error.
Functions§
- unicode_version
- Return the Unicode version utf8proc was compiled with.
- version
- Return the version of the underlying utf8proc library.