This crate provides a basic implementation of the Unicode Collation Algorithm. There is really
just one function, collate, and a few options that can be passed to it. (The collate_no_tiebreak
function is a variation whose behavior is a bit more strict.) Despite the bare-bones API, this
implementation conforms to the standard and allows for the use of the CLDR root collation order;
so it may indeed be useful, even in this early stage of development.
Structs
This struct specifies the options to be passed to the collate function. You can choose between
two tables (DUCET and CLDR root), and between two approaches to the handling of variable-weight
characters (“non-ignorable” and “shifted”). The default, and a good starting point for Unicode
collation, is to use the CLDR table with the “shifted” approach.
Enums
This enum provides for a choice of which table of character weights to use.
Functions
This is the main public function in the library. It accepts as arguments two string references
or byte slices, and a CollationOptions struct. It returns an Ordering value. This is designed to
be used in conjunction with the sort_by function in the standard library. Simple usage might look
like the following…
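As a rough sketch of the sort_by pattern only: the comparator below, stand_in_collate, is a
hypothetical placeholder that compares lowercased text. It is not this crate's collate, which
additionally takes a CollationOptions argument; sort_words is likewise an illustrative helper.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a collation function; NOT the crate's `collate`.
/// It simply compares the lowercased forms of the two strings.
fn stand_in_collate(a: &str, b: &str) -> Ordering {
    a.to_lowercase().cmp(&b.to_lowercase())
}

/// Sorts a list of words by handing the comparator to `sort_by`,
/// which expects a closure returning an `Ordering`.
fn sort_words(mut words: Vec<&str>) -> Vec<&str> {
    words.sort_by(|a, b| stand_in_collate(a, b));
    words
}
```

Any function that takes two strings and returns an Ordering can be dropped into the closure in
the same way, which is why the return type matters for use with sort_by.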
This is a variation on the collate function, to which it is almost identical. The difference is
that, in the event that two strings are ordered equally per the Unicode Collation Algorithm, this
function will not attempt to “break the tie” by using byte-value comparison.
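The difference between the two behaviors can be sketched with toy comparators. Both functions
below are hypothetical: a lowercased-text comparison stands in for the Unicode Collation
Algorithm, so the inputs that tie here are not the ones that tie under the real algorithm (where
ties arise, for example, between canonically equivalent strings).

```rust
use std::cmp::Ordering;

/// Toy stand-in for a collation comparison: compares lowercased text only,
/// so strings that differ only in case are reported as equal.
fn toy_compare_no_tiebreak(a: &str, b: &str) -> Ordering {
    a.to_lowercase().cmp(&b.to_lowercase())
}

/// Same comparison, but a tie is broken by comparing the raw bytes,
/// so only byte-identical strings are reported as equal.
fn toy_compare_with_tiebreak(a: &str, b: &str) -> Ordering {
    toy_compare_no_tiebreak(a, b).then_with(|| a.as_bytes().cmp(b.as_bytes()))
}
```

With a tiebreak, the result is a total order that is Equal only for identical input; without one,
distinct strings may compare as Equal and keep their original relative order under a stable sort.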