pub struct Collator {
    pub tailoring: Tailoring,
    pub shifting: bool,
    /* private fields */
}

The Collator struct is the entry point for this library’s API. It defines the options to be used in collation. The method collate or collate_no_tiebreak will then compare two string references (or byte slices) according to the selected options, and return an Ordering value.

You can choose between two tables of character weights: DUCET and CLDR. With the CLDR table, there is a further choice of locale tailoring. The Root locale represents the table in its unmodified form. The ArabicScript locale shifts the weights of Arabic-script letters so that they sort before the Latin script. Further locales will be added over time.

You can also choose between two approaches to the handling of variable-weight characters: “non-ignorable” and “shifted.”
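The difference between the two approaches can be sketched with a toy model that uses only the standard library. This is not the Unicode Collation Algorithm and not this library's implementation. It assumes that only ASCII whitespace and punctuation are "variable," and the function names (`is_variable`, `compare_non_ignorable`, `compare_shifted`, `sorted_with`) are hypothetical.

```rust
use std::cmp::Ordering;

// Toy assumption: only ASCII whitespace and punctuation count as "variable".
fn is_variable(c: char) -> bool {
    c.is_ascii_whitespace() || c.is_ascii_punctuation()
}

// "Non-ignorable": variable characters are compared like any other character.
fn compare_non_ignorable(a: &str, b: &str) -> Ordering {
    a.chars().cmp(b.chars())
}

// "Shifted": variable characters are skipped in the first pass and are only
// consulted afterwards, to separate strings that would otherwise tie.
fn compare_shifted(a: &str, b: &str) -> Ordering {
    let primary_a = a.chars().filter(|c| !is_variable(*c));
    let primary_b = b.chars().filter(|c| !is_variable(*c));
    primary_a.cmp(primary_b).then_with(|| a.cmp(b))
}

// Helper: return a sorted copy of `words` under the given comparison.
fn sorted_with(
    words: &[&'static str],
    cmp: fn(&str, &str) -> Ordering,
) -> Vec<&'static str> {
    let mut sorted = words.to_vec();
    sorted.sort_by(|a, b| cmp(a, b));
    sorted
}
```

In this toy model, "de luge" sorts before "death" under the non-ignorable comparison, because the space is compared as a character. Under the shifted comparison it sorts after "death" and next to "deluge."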

The default for Collator is to use the CLDR table with the Root locale, and the “shifted” approach. This is a good starting point for collation in many languages.

Fields

tailoring: Tailoring

The table of weights to be used: DUCET or CLDR (with a choice of locale for the latter)

shifting: bool

The approach to handling variable-weight characters: true selects “shifted” (recommended), and false selects “non-ignorable.”

Implementations

Create a new Collator with the specified options. Please note that it is also possible to call Collator::default().

The collate method is the primary method in the library. It accepts two string references or byte slices, compares them using the chosen options, and returns an Ordering value. It is designed to be passed to the sort_by function in the standard library. Simple usage might look like the following…

use feruca::Collator;

let mut collator = Collator::default();

let mut names = ["Peng", "Peña", "Ernie", "Émile"];
names.sort_by(|a, b| collator.collate(a, b));

let expected = ["Émile", "Ernie", "Peña", "Peng"];
assert_eq!(names, expected);

Significantly, in the event that two strings are ordered equally per the Unicode Collation Algorithm, this method will use byte-value comparison (i.e., the traditional, naïve way of sorting strings) as a tiebreaker. While this is probably appropriate in most cases, it can be avoided by using the collate_no_tiebreak method.
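For contrast, here is a minimal sketch of what byte-value comparison does on its own, using only the standard library. The helper name `byte_sorted` is hypothetical, and the accented letters are written as escapes on the assumption that they are single precomposed code points.

```rust
// Return a copy of `names` sorted purely by UTF-8 byte values.
fn byte_sorted(names: &[&'static str]) -> Vec<&'static str> {
    let mut sorted = names.to_vec();
    sorted.sort_by(|a, b| a.as_bytes().cmp(b.as_bytes()));
    sorted
}

// The same names as in the example above, with the accented letters
// spelled out as precomposed code points (U+00F1 and U+00C9).
fn sample_names() -> [&'static str; 4] {
    ["Peng", "Pe\u{f1}a", "Ernie", "\u{c9}mile"]
}
```

Sorting the sample names this way puts "Émile" last and "Peng" before "Peña," because every byte of a non-ASCII character is larger than any ASCII byte.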

The collate_no_tiebreak method is a variation on collate, to which it is almost identical. The difference is that, when two strings are ordered equally per the Unicode Collation Algorithm, this method returns that result as is and does not attempt to “break the tie” by using byte-value comparison.
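The effect of the tiebreaker can be sketched with a stand-in comparison. The `toy_key` function below is a hypothetical placeholder that simply ignores ASCII case. It does not reflect how the Unicode Collation Algorithm treats case, and serves only to produce strings that tie. All function names here are assumptions, not part of this library.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a collation key: ASCII case is ignored,
// so "abc" and "ABC" produce the same key.
fn toy_key(s: &str) -> String {
    s.to_ascii_lowercase()
}

// Without a tiebreaker, strings with equal keys compare as Equal.
fn compare_no_tiebreak(a: &str, b: &str) -> Ordering {
    toy_key(a).cmp(&toy_key(b))
}

// With a tiebreaker, Equal results fall back to byte-value comparison.
fn compare_with_tiebreak(a: &str, b: &str) -> Ordering {
    compare_no_tiebreak(a, b).then_with(|| a.as_bytes().cmp(b.as_bytes()))
}

// Helper: return a sorted copy of `items` under the given comparison.
fn sorted_by_fn(
    items: &[&'static str],
    cmp: fn(&str, &str) -> Ordering,
) -> Vec<&'static str> {
    let mut sorted = items.to_vec();
    sorted.sort_by(|a, b| cmp(a, b));
    sorted
}
```

Because sort_by is a stable sort, tied strings keep their input order when no tiebreaker is applied. With the tiebreaker, the result no longer depends on the input order.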

Trait Implementations

Formats the value using the given formatter.

Returns the “default value” for a type.
