Crate simdutf

Unicode validation and transcoding at billions of characters per second.

This crate is the Rust binding of simdutf.

§Compilation

This crate works out of the box as long as you have a C++11-compatible toolchain installed correctly.

simdutf links against the C++ standard library, which adds a dynamic linking dependency.

For more details, see the simdutf documentation and the cc crate documentation.

Here is an example of a build for local benchmarking, tuned to the host CPU:

export RUSTFLAGS='-C target-cpu=native'
export CXXFLAGS='-march=native'
cargo build --release

Structs§

Encoding
The encoding of a string, defined as a bitflags type.
Result
The result type of validation and transcoding.

Enums§

Base64Options
The options type of base64 encoding and decoding.
ErrorCode
The error code type of validation and transcoding.

Functions§

autodetect_encodings
Autodetect the possible encodings of the input in one pass.
autodetect_single_encoding
Autodetect the encoding of the input.
base64_to_binary_safe
Convert base64 string into binary data.
binary_to_base64
Convert binary data into base64.
change_endianness_utf16
Change the endianness of UTF-16 string.
convert_latin1_to_utf8
Convert possibly broken Latin1 string into UTF-8 string.
convert_latin1_to_utf16
Convert possibly broken Latin1 string into UTF-16 string.
convert_latin1_to_utf32
Convert possibly broken Latin1 string into UTF-32 string.
convert_latin1_to_utf16be
Convert possibly broken Latin1 string into UTF-16BE string.
convert_latin1_to_utf16le
Convert possibly broken Latin1 string into UTF-16LE string.
convert_utf8_to_latin1
Convert possibly broken UTF-8 string into Latin1 string.
convert_utf8_to_latin1_with_errors
Convert possibly broken UTF-8 string into Latin1 string.
convert_utf8_to_utf16
Convert possibly broken UTF-8 string into UTF-16 string.
convert_utf8_to_utf32
Convert possibly broken UTF-8 string into UTF-32 string.
convert_utf8_to_utf16_with_errors
Convert possibly broken UTF-8 string into UTF-16 string.
convert_utf8_to_utf16be
Convert possibly broken UTF-8 string into UTF-16BE string.
convert_utf8_to_utf16be_with_errors
Convert possibly broken UTF-8 string into UTF-16BE string.
convert_utf8_to_utf16le
Convert possibly broken UTF-8 string into UTF-16LE string.
convert_utf8_to_utf16le_with_errors
Convert possibly broken UTF-8 string into UTF-16LE string.
convert_utf8_to_utf32_with_errors
Convert possibly broken UTF-8 string into UTF-32 string.
convert_utf16_to_latin1
Convert possibly broken UTF-16 string into Latin1 string.
convert_utf16_to_latin1_with_errors
Convert possibly broken UTF-16 string into Latin1 string.
convert_utf16_to_utf8
Convert possibly broken UTF-16 string into UTF-8 string.
convert_utf16_to_utf8_with_errors
Convert possibly broken UTF-16 string into UTF-8 string.
convert_utf16_to_utf32
Convert possibly broken UTF-16 string into UTF-32 string.
convert_utf16_to_utf32_with_errors
Convert possibly broken UTF-16 string into UTF-32 string.
convert_utf16be_to_latin1
Convert possibly broken UTF-16BE string into Latin1 string.
convert_utf16be_to_latin1_with_errors
Convert possibly broken UTF-16BE string into Latin1 string.
convert_utf16be_to_utf8
Convert possibly broken UTF-16BE string into UTF-8 string.
convert_utf16be_to_utf8_with_errors
Convert possibly broken UTF-16BE string into UTF-8 string.
convert_utf16be_to_utf32
Convert possibly broken UTF-16BE string into UTF-32 string.
convert_utf16be_to_utf32_with_errors
Convert possibly broken UTF-16BE string into UTF-32 string.
convert_utf16le_to_latin1
Convert possibly broken UTF-16LE string into Latin1 string.
convert_utf16le_to_latin1_with_errors
Convert possibly broken UTF-16LE string into Latin1 string.
convert_utf16le_to_utf8
Convert possibly broken UTF-16LE string into UTF-8 string.
convert_utf16le_to_utf8_with_errors
Convert possibly broken UTF-16LE string into UTF-8 string.
convert_utf16le_to_utf32
Convert possibly broken UTF-16LE string into UTF-32 string.
convert_utf16le_to_utf32_with_errors
Convert possibly broken UTF-16LE string into UTF-32 string.
convert_utf32_to_latin1
Convert possibly broken UTF-32 string into Latin1 string.
convert_utf32_to_utf8
Convert possibly broken UTF-32 string into UTF-8 string.
convert_utf32_to_utf8_with_errors
Convert possibly broken UTF-32 string into UTF-8 string.
convert_utf32_to_utf16
Convert possibly broken UTF-32 string into UTF-16 string.
convert_utf32_to_utf16_with_errors
Convert possibly broken UTF-32 string into UTF-16 string.
convert_utf32_to_utf16be
Convert possibly broken UTF-32 string into UTF-16BE string.
convert_utf32_to_utf16be_with_errors
Convert possibly broken UTF-32 string into UTF-16BE string.
convert_utf32_to_utf16le
Convert possibly broken UTF-32 string into UTF-16LE string.
convert_utf32_to_utf16le_with_errors
Convert possibly broken UTF-32 string into UTF-16LE string.
convert_valid_utf8_to_latin1
Convert valid UTF-8 string into Latin1 string.
convert_valid_utf8_to_utf16
Convert valid UTF-8 string into UTF-16 string.
convert_valid_utf8_to_utf32
Convert valid UTF-8 string into UTF-32 string.
convert_valid_utf8_to_utf16be
Convert valid UTF-8 string into UTF-16BE string.
convert_valid_utf8_to_utf16le
Convert valid UTF-8 string into UTF-16LE string.
convert_valid_utf16_to_latin1
Convert valid UTF-16 string into Latin1 string.
convert_valid_utf16_to_utf8
Convert valid UTF-16 string into UTF-8 string.
convert_valid_utf16_to_utf32
Convert valid UTF-16 string into UTF-32 string.
convert_valid_utf16be_to_latin1
Convert valid UTF-16BE string into Latin1 string.
convert_valid_utf16be_to_utf8
Convert valid UTF-16BE string into UTF-8 string.
convert_valid_utf16be_to_utf32
Convert valid UTF-16BE string into UTF-32 string.
convert_valid_utf16le_to_latin1
Convert valid UTF-16LE string into Latin1 string.
convert_valid_utf16le_to_utf8
Convert valid UTF-16LE string into UTF-8 string.
convert_valid_utf16le_to_utf32
Convert valid UTF-16LE string into UTF-32 string.
convert_valid_utf32_to_utf8
Convert valid UTF-32 string into UTF-8 string.
convert_valid_utf32_to_utf16
Convert valid UTF-32 string into UTF-16 string.
convert_valid_utf32_to_utf16be
Convert valid UTF-32 string into UTF-16BE string.
convert_valid_utf32_to_utf16le
Convert valid UTF-32 string into UTF-16LE string.
count_utf8
Count the number of code points in the UTF-8 string.
count_utf16
Count the number of code points in the UTF-16 string.
count_utf16be
Count the number of code points in the UTF-16BE string.
count_utf16le
Count the number of code points in the UTF-16LE string.
latin1_length_from_utf8
Count the number of code units that the UTF-8 string would require in Latin1 format.
latin1_length_from_utf16
Count the number of code units that the UTF-16 string would require in Latin1 format.
latin1_length_from_utf32
Count the number of code units that the UTF-32 string would require in Latin1 format.
utf8_length_from_latin1
Count the number of code units that the Latin1 string would require in UTF-8 format.
utf8_length_from_utf16
Count the number of code units that the UTF-16 string would require in UTF-8 format.
utf8_length_from_utf32
Count the number of code units that the UTF-32 string would require in UTF-8 format.
utf8_length_from_utf16be
Count the number of code units that the UTF-16BE string would require in UTF-8 format.
utf8_length_from_utf16le
Count the number of code units that the UTF-16LE string would require in UTF-8 format.
utf16_length_from_latin1
Count the number of code units that the Latin1 string would require in UTF-16 format.
utf16_length_from_utf8
Count the number of code units that the UTF-8 string would require in UTF-16 format.
utf16_length_from_utf32
Count the number of code units that the UTF-32 string would require in UTF-16 format.
utf32_length_from_utf8
Count the number of code units that the UTF-8 string would require in UTF-32 format.
utf32_length_from_utf16
Count the number of code units that the UTF-16 string would require in UTF-32 format.
utf32_length_from_utf16be
Count the number of code units that the UTF-16BE string would require in UTF-32 format.
utf32_length_from_utf16le
Count the number of code units that the UTF-16LE string would require in UTF-32 format.
validate_ascii
Validate the ASCII string.
validate_ascii_with_errors
Validate the ASCII string.
validate_utf8
Validate the UTF-8 string.
validate_utf8_with_errors
Validate the UTF-8 string.
validate_utf16
Validate the UTF-16 string.
validate_utf32
Validate the UTF-32 string.
validate_utf16_with_errors
Validate the UTF-16 string.
validate_utf16be
Validate the UTF-16BE string.
validate_utf16be_with_errors
Validate the UTF-16BE string.
validate_utf16le
Validate the UTF-16LE string.
validate_utf16le_with_errors
Validate the UTF-16LE string.
validate_utf32_with_errors
Validate the UTF-32 string.
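
To make the terms used in the summaries above concrete (code points versus code units, endianness changes, and validation that reports where an error occurs), here is a minimal scalar sketch built only on the Rust standard library. It does not call this crate: the helper names below are hypothetical, and the actual signatures of the simdutf functions are not shown on this page, so treat the sketch as an illustration of the semantics rather than of the API.

```rust
// Scalar illustrations of the operations listed above, using only `std`.
// All function names here are hypothetical helpers, not the crate's API.

/// Code point count of UTF-8 input: every byte that is not a
/// continuation byte (bit pattern 10xxxxxx) starts one code point.
fn count_code_points_utf8(input: &[u8]) -> usize {
    input.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}

/// Number of 16-bit code units a valid UTF-8 string needs in UTF-16.
/// Code points above U+FFFF take two units (a surrogate pair).
fn utf16_len_from_utf8(input: &str) -> usize {
    input.encode_utf16().count()
}

/// Number of 32-bit code units a valid UTF-8 string needs in UTF-32,
/// which is simply its number of code points.
fn utf32_len_from_utf8(input: &str) -> usize {
    input.chars().count()
}

/// Swap the byte order of every UTF-16 code unit (LE <-> BE).
fn swap_endianness_utf16(input: &[u16]) -> Vec<u16> {
    input.iter().map(|unit| unit.swap_bytes()).collect()
}

/// Validate UTF-8 and, on failure, report the byte offset of the first
/// invalid sequence. Returns `None` when the input is valid.
fn first_invalid_utf8(input: &[u8]) -> Option<usize> {
    std::str::from_utf8(input).err().map(|e| e.valid_up_to())
}

fn main() {
    // "héllo 🌍": 7 code points, 11 UTF-8 bytes.
    let text = "h\u{e9}llo \u{1F30D}";

    println!("utf8 bytes:    {}", text.len());
    println!("code points:   {}", count_code_points_utf8(text.as_bytes()));
    println!("utf16 units:   {}", utf16_len_from_utf8(text));
    println!("utf32 units:   {}", utf32_len_from_utf8(text));
    println!("swapped:       {:04X?}", swap_endianness_utf16(&[0x0041, 0xD83C]));
    println!("first invalid: {:?}", first_invalid_utf8(b"ab\xFFcd"));
}
```

The length helpers show why a `*_length_from_*` count is useful before transcoding: it gives the number of output code units to allocate. The validation helper mirrors the idea behind the `_with_errors` variants, which pair a success or failure indication with a position in the input.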