ICU collation support for Rust

This crate provides collation (locale-sensitive string ordering) as implemented by the ICU library, specifically the functionality exposed through its C API in the header ucol.h.
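To illustrate why locale-sensitive ordering matters, the sketch below sorts a few Serbian (Latin) letters using only the Rust standard library. It does not use this crate; the helper name byte_order_sorted is made up for the illustration. Rust's default str ordering compares UTF-8 bytes, so the result differs from the alphabetical order a collator would produce.

```rust
/// Sorts by Rust's default `str` ordering, which compares UTF-8 bytes
/// (equivalently, Unicode code points) and knows nothing about locales.
/// Hypothetical helper, for illustration only.
fn byte_order_sorted(mut words: Vec<&str>) -> Vec<&str> {
    words.sort();
    words
}

fn main() {
    let mixed_up = vec!["d", "dž", "đ", "a", "b", "c", "č", "ć"];
    let sorted = byte_order_sorted(mixed_up);
    // Words starting with č, ć and đ land after every word that starts
    // with an ASCII letter, which is not the Serbian alphabetical order.
    println!("{:?}", sorted);
}
```

A collator for the "sr-Latn" locale instead places č and ć directly after c, and dž and đ directly after d.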

The main type is UCollator, which can be created from a &str locale identifier (for example "sr-Latn") using UCollator::try_from.

A detailed discussion of collation is out of scope for source code documentation. Interested readers can consult the collation documentation in the ICU User Guide.

Are you missing some features from this crate? Consider reporting an issue or even contributing the functionality.

Examples

Some example code for the use of collation is given below.

First, the lower-level API, UCollator::strcoll, operates on ustring::UChar, so the inputs must be converted to ustring::UChar before use. This function is mostly used in algorithms that compose Unicode functionality.

use rust_icu_ustring as ustring;
use rust_icu_ucol as ucol;
use std::convert::TryFrom;
let collator = ucol::UCollator::try_from("sr-Latn").expect("collator");
let mut mixed_up = vec!["d", "dž", "đ", "a", "b", "c", "č", "ć"];
mixed_up.sort_by(|a, b| {
    // Convert both operands to UChar before each comparison.
    let first = ustring::UChar::try_from(*a).expect("first");
    let second = ustring::UChar::try_from(*b).expect("second");
    collator.strcoll(&first, &second)
});
let alphabet = vec!["a", "b", "c", "č", "ć", "d", "dž", "đ"];
assert_eq!(alphabet, mixed_up);
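The conversion step in the example above can be pictured with the standard library alone. The sketch below assumes that ustring::UChar holds UTF-16 code units, as UChar strings do in ICU's C API; it is not this crate's implementation, and the helper name to_utf16 is made up for the illustration.

```rust
/// Re-encodes a UTF-8 `&str` as UTF-16 code units, the representation
/// that ICU's C API uses for `UChar` strings.
/// Hypothetical helper, for illustration only.
fn to_utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

fn main() {
    // Each input string is re-encoded into a freshly allocated buffer.
    let units = to_utf16("dž");
    println!("{:x?}", units);
}
```

Because the closure passed to sort_by runs once per comparison, converting inside it repeats this work many times for the same string; converting the inputs once before sorting is one way to avoid that.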

A more idiomatic Rust API is UCollator::strcoll_utf8, which operates on any AsRef<str> and can be used without converting the input data ahead of time.

use rust_icu_ucol as ucol;
use std::convert::TryFrom;
let collator = ucol::UCollator::try_from("sr-Latn").expect("collator");
let mut mixed_up = vec!["d", "dž", "đ", "a", "b", "c", "č", "ć"];
mixed_up.sort_by(|a, b| collator.strcoll_utf8(a, b).expect("strcoll_utf8"));
let alphabet = vec!["a", "b", "c", "č", "ć", "d", "dž", "đ"];
assert_eq!(alphabet, mixed_up);

Structs

Functions

Creates an enumeration of all available locales supporting collation.