This crate provides a basic implementation of the Unicode Collation Algorithm. It exposes one main function, collate, along with a few options that can be passed to it. (The collate_no_tiebreak function is a variation that skips the final byte-value tiebreak.) Despite the bare-bones API, this implementation conforms to the standard and supports the CLDR root collation order, so it may be useful even at this early stage of development.

Structs

This struct specifies the options to be passed to the collate function. You can choose between two tables (DUCET and CLDR root), and between two approaches to the handling of variable-weight characters (“non-ignorable” and “shifted”). The default, and a good starting point for Unicode collation, is to use the CLDR table with the “shifted” approach.

Enums

This enum selects which table of character weights to use.

Functions

This is the main public function in the library. It accepts two string references or byte slices, plus a CollationOptions struct, and returns an Ordering value. It is designed to be used with the sort_by method on slices in the standard library. Simple usage might look like the following…
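The original example is not reproduced on this page, so here is a minimal, self-contained sketch of the same pattern: a comparator that takes two byte-viewable inputs plus an options struct, returns an Ordering, and is passed to sort_by. Everything named here (StandInOptions, its shifted field, collate_stand_in, sort_words) is hypothetical and is not the crate's API. The stand-in only folds ASCII case and optionally skips ASCII spaces and punctuation; it does not implement the Unicode Collation Algorithm.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's options struct. The real field
/// names are not shown in this document.
struct StandInOptions {
    /// When true, ASCII spaces and punctuation are skipped when building
    /// the comparison key (a rough analogue of the "shifted" approach).
    shifted: bool,
}

impl Default for StandInOptions {
    fn default() -> Self {
        StandInOptions { shifted: true }
    }
}

/// Stand-in comparator with the shape described above: two inputs that can
/// be viewed as bytes, an options struct, and an `Ordering` result.
/// NOT the Unicode Collation Algorithm; it only folds ASCII case.
fn collate_stand_in<T: AsRef<[u8]> + ?Sized>(a: &T, b: &T, opts: &StandInOptions) -> Ordering {
    let (a, b): (&[u8], &[u8]) = (a.as_ref(), b.as_ref());

    // Build a simplified comparison key for one input.
    let key = |s: &[u8]| -> Vec<u8> {
        s.iter()
            .filter(|byte| !(opts.shifted && (byte.is_ascii_punctuation() || **byte == b' ')))
            .map(|byte| byte.to_ascii_lowercase())
            .collect()
    };

    // Compare keys first, then fall back to raw bytes on a tie.
    key(a).cmp(&key(b)).then_with(|| a.cmp(b))
}

/// Sort a slice of strings in place using the stand-in comparator.
fn sort_words(words: &mut [&str], opts: &StandInOptions) {
    words.sort_by(|a, b| collate_stand_in(*a, *b, opts));
}
```

With the crate itself, the closure passed to sort_by would presumably call collate with a CollationOptions value in place of the stand-ins used here.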

This is a variation on the collate function, to which it is almost identical. The difference is that, when two strings compare as equal under the Unicode Collation Algorithm, this function does not attempt to “break the tie” by falling back to byte-value comparison.
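The difference between the two behaviors can be sketched with a toy comparison key. The names below (fold, compare_no_tiebreak, compare_with_tiebreak) are hypothetical, and ASCII case folding merely stands in for a real collation key so that ties can occur; this is an illustration of the tiebreak idea, not the crate's implementation.

```rust
use std::cmp::Ordering;

/// Hypothetical comparison key: ASCII letters folded to lowercase.
/// A stand-in for a real collation key, used only to produce ties.
fn fold(s: &str) -> String {
    s.to_ascii_lowercase()
}

/// Compare by key only; strings with equal keys come back as `Equal`.
fn compare_no_tiebreak(a: &str, b: &str) -> Ordering {
    fold(a).cmp(&fold(b))
}

/// Compare by key, then fall back to byte-value comparison on a tie,
/// so that only byte-identical strings are reported as `Equal`.
fn compare_with_tiebreak(a: &str, b: &str) -> Ordering {
    compare_no_tiebreak(a, b).then_with(|| a.as_bytes().cmp(b.as_bytes()))
}
```

Because sort_by is a stable sort, a comparator without a tiebreak leaves tied strings in their original relative order, whereas the tiebreaking version yields the same final order regardless of input order.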