The Collator struct is the entry point for this library’s API. It defines the options to be used in collation. The collate and collate_no_tiebreak methods then compare two string references (or byte slices) according to the selected options and return an Ordering value.
You can choose between two tables of character weights: DUCET and CLDR. With the CLDR table, there is a further choice of locale tailoring. The Root locale represents the table in its unmodified form. The ArabicScript locale shifts the weights of Arabic-script letters so that they sort before the Latin script. Further locales will be added over time.
You can also choose between two approaches to the handling of variable-weight characters: “non-ignorable” and “shifted.”
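As a rough intuition for the two approaches, here is a toy sketch that does not use this library or real Unicode Collation Algorithm weights. It assumes, purely for illustration, that the variable-weight characters are ASCII whitespace and punctuation; the function names (is_variable, strip_variable, toy_non_ignorable, toy_shifted) are hypothetical. In the toy “non-ignorable” comparison those characters take part like any other character; in the toy “shifted” comparison they are ignored at first and consulted only to separate otherwise equal strings.

```rust
use std::cmp::Ordering;

// Toy assumption: only ASCII whitespace and punctuation count as "variable".
fn is_variable(c: char) -> bool {
    c.is_ascii_whitespace() || c.is_ascii_punctuation()
}

// Drop the variable characters, keeping everything else in order.
fn strip_variable(s: &str) -> String {
    s.chars().filter(|c| !is_variable(*c)).collect()
}

// Toy "non-ignorable": variable characters are compared like any others.
fn toy_non_ignorable(a: &str, b: &str) -> Ordering {
    a.cmp(b)
}

// Toy "shifted": ignore variable characters first, then use the full
// strings only to separate inputs that are otherwise equal.
fn toy_shifted(a: &str, b: &str) -> Ordering {
    strip_variable(a)
        .cmp(&strip_variable(b))
        .then_with(|| a.cmp(b))
}

fn main() {
    let mut words = ["deluge", "de luge", "death", "de-luge"];

    words.sort_by(|a, b| toy_non_ignorable(a, b));
    assert_eq!(words, ["de luge", "de-luge", "death", "deluge"]);

    words.sort_by(|a, b| toy_shifted(a, b));
    assert_eq!(words, ["death", "de luge", "de-luge", "deluge"]);
}
```

The practical difference is where strings containing spaces or hyphens land: grouped ahead of the letters in the first ordering, interleaved with their unpunctuated neighbours in the second.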
The default for Collator is to use the CLDR table with the Root locale, and the “shifted” approach. This is a good starting point for collation in many languages.
Fields
tailoring: Tailoring
The table of weights to be used: DUCET or CLDR (with a choice of locale for the latter)
shifting: bool
The approach to handling variable-weight characters (“non-ignorable” or “shifted”). For our purposes, shifting is either true (“shifted,” recommended) or false (“non-ignorable”).
Implementations
impl Collator
pub fn new(tailoring: Tailoring, shifting: bool) -> Self
Create a new Collator with the specified options. Please note that it is also possible to call Collator::default().
pub fn collate<T: AsRef<[u8]> + Eq + Ord + ?Sized>(
&mut self,
a: &T,
b: &T
) -> Ordering
This is the primary method in the library. It accepts as arguments two string references or byte slices, compares them using the options chosen, and returns an Ordering value. It is designed to be passed to the sort_by function in the standard library. Simple usage might look like the following:
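use feruca::Collator;
// Default options: CLDR table, Root locale, shifted approach.
let mut collator = Collator::default();
let mut names = ["Peng", "Peña", "Ernie", "Émile"];
// collate takes &mut self, so the closure borrows the collator mutably.
names.sort_by(|a, b| collator.collate(a, b));
let expected = ["Émile", "Ernie", "Peña", "Peng"];
assert_eq!(names, expected);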
Significantly, in the event that two strings are ordered equally per the Unicode Collation Algorithm, this method will use byte-value comparison (i.e., the traditional, naïve way of sorting strings) as a tiebreaker. While this is probably appropriate in most cases, it can be avoided by using the collate_no_tiebreak method.
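The two-stage comparison described above can be sketched with the standard library alone. This is a toy illustration, not this library’s implementation: toy_key is a hypothetical stand-in that merely lowercases ASCII bytes where the real method would derive a key from the Unicode Collation Algorithm, and the names toy_collate and toy_collate_no_tiebreak are invented for the example. The generic bounds are modeled loosely on the signature shown above.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a collation key: ASCII case folding only.
// The real Unicode Collation Algorithm is far more involved.
fn toy_key(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().map(|b| b.to_ascii_lowercase()).collect()
}

// Stage one only: inputs with equal keys compare as Equal.
fn toy_collate_no_tiebreak<T: AsRef<[u8]> + ?Sized>(a: &T, b: &T) -> Ordering {
    toy_key(a.as_ref()).cmp(&toy_key(b.as_ref()))
}

// Stage one, then a byte-value comparison if the keys tie.
fn toy_collate<T: AsRef<[u8]> + Ord + ?Sized>(a: &T, b: &T) -> Ordering {
    toy_collate_no_tiebreak(a, b).then_with(|| a.cmp(b))
}

fn main() {
    // Equal keys: the no-tiebreak variant reports a tie.
    assert_eq!(toy_collate_no_tiebreak("apple", "Apple"), Ordering::Equal);
    // The tiebreak falls back to bytes, where 'a' (0x61) > 'A' (0x41).
    assert_eq!(toy_collate("apple", "Apple"), Ordering::Greater);
    // Byte slices work as well as string slices.
    assert_eq!(toy_collate(&b"abc"[..], &b"ABC"[..]), Ordering::Greater);
    // Different keys never reach the tiebreak.
    assert_eq!(toy_collate("apple", "banana"), Ordering::Less);
}
```

With the tiebreak, the comparison returns Equal only when the two inputs are byte-for-byte identical.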
pub fn collate_no_tiebreak<T: AsRef<[u8]> + Eq + Ord + ?Sized>(
&mut self,
a: &T,
b: &T
) -> Ordering
This is a variation on collate, to which it is almost identical. The difference is that, in the event that two strings are ordered equally per the Unicode Collation Algorithm, this method will not attempt to “break the tie” by using byte-value comparison.
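One consequence worth keeping in mind when sorting: the standard library’s sort_by is a stable sort, so elements that compare as equal keep their input order. The sketch below uses hypothetical toy comparators (ASCII case folding standing in for the Unicode Collation Algorithm, which it is not) to show that, without a tiebreak, the sorted result can depend on the order of the input, whereas with a tiebreak it does not.

```rust
use std::cmp::Ordering;

// Toy comparator without a tiebreak: ASCII case-insensitive only.
fn toy_no_tiebreak(a: &str, b: &str) -> Ordering {
    a.to_ascii_lowercase().cmp(&b.to_ascii_lowercase())
}

// Same comparator, falling back to byte-value order on a tie.
fn toy_with_tiebreak(a: &str, b: &str) -> Ordering {
    toy_no_tiebreak(a, b).then_with(|| a.cmp(b))
}

fn main() {
    let mut x = ["apple", "Apple", "APPLE"];
    let mut y = ["APPLE", "apple", "Apple"];

    // All elements tie, and the stable sort leaves each input as it was.
    x.sort_by(|a, b| toy_no_tiebreak(a, b));
    y.sort_by(|a, b| toy_no_tiebreak(a, b));
    assert_eq!(x, ["apple", "Apple", "APPLE"]);
    assert_eq!(y, ["APPLE", "apple", "Apple"]);

    // With the tiebreak, both inputs converge on the same order.
    x.sort_by(|a, b| toy_with_tiebreak(a, b));
    y.sort_by(|a, b| toy_with_tiebreak(a, b));
    assert_eq!(x, ["APPLE", "Apple", "apple"]);
    assert_eq!(x, y);
}
```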
Trait Implementations
Auto Trait Implementations
impl RefUnwindSafe for Collator
impl Send for Collator
impl Sync for Collator
impl Unpin for Collator
impl UnwindSafe for Collator
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.