Crate lexical_sort

This is a library to compare and sort strings (or file paths) lexicographically. This means that non-ASCII characters such as á or ß are treated like their closest ASCII character: á is treated as a, ß is treated as ss, etc.

Lexical comparisons are case-insensitive. Alphanumeric characters are sorted after all other characters (punctuation, whitespace, special characters, emojis, ...).
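The ordering rule above can be illustrated with the standard library alone. This is a minimal sketch, not this crate's implementation: `char_key` and `punctuation_first_cmp` are hypothetical names, case folding is ASCII-only, and no transliteration is performed.

```rust
use std::cmp::Ordering;

/// Hypothetical sort key: non-alphanumeric characters get `false`, which
/// orders before `true`, so they sort ahead of letters and digits.
/// ASCII letters are lowercased so that case is ignored.
fn char_key(c: char) -> (bool, char) {
    (c.is_alphanumeric(), c.to_ascii_lowercase())
}

/// Hypothetical comparator: compares two strings key by key.
fn punctuation_first_cmp(a: &str, b: &str) -> Ordering {
    a.chars().map(char_key).cmp(b.chars().map(char_key))
}
```

With this comparator "~x" sorts before "ax", whereas the standard library's `str::cmp` puts "~x" after "ax" because `~` has a higher code point than `a`.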

It is possible to enable natural sorting, which also handles ASCII numbers. For example, 50 is less than 100 with natural sorting turned on. It's also possible to skip characters that aren't alphanumeric, so e.g. f-5 is next to f5.
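To make the idea of natural ordering concrete, here is a standard-library-only sketch. It is an assumption about how such a comparison can be written, not the code this crate uses; `natural_cmp_sketch` and `strip_zeros` are hypothetical names. Runs of ASCII digits are compared by numeric value, everything else byte by byte.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: drops leading zeros from a run of ASCII digits.
fn strip_zeros(digits: &[u8]) -> &[u8] {
    let zeros = digits.iter().take_while(|&&c| c == b'0').count();
    &digits[zeros..]
}

/// Hypothetical comparator: treats runs of ASCII digits as numbers.
fn natural_cmp_sketch(a: &str, b: &str) -> Ordering {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i].is_ascii_digit() && b[j].is_ascii_digit() {
            // Find the end of the digit run in each string.
            let i_end = i + a[i..].iter().take_while(|c| c.is_ascii_digit()).count();
            let j_end = j + b[j..].iter().take_while(|c| c.is_ascii_digit()).count();
            let x = strip_zeros(&a[i..i_end]);
            let y = strip_zeros(&b[j..j_end]);
            // Without leading zeros, a longer run is a larger number;
            // runs of equal length are compared digit by digit.
            // Nothing is parsed, so arbitrarily long numbers cannot overflow.
            let ord = x.len().cmp(&y.len()).then_with(|| x.cmp(y));
            if ord != Ordering::Equal {
                return ord;
            }
            i = i_end;
            j = j_end;
        } else {
            // Outside digit runs, compare the raw bytes.
            let ord = a[i].cmp(&b[j]);
            if ord != Ordering::Equal {
                return ord;
            }
            i += 1;
            j += 1;
        }
    }
    // Whichever string still has bytes left is the greater one.
    (a.len() - i).cmp(&(b.len() - j))
}
```

For example, `natural_cmp_sketch("file50", "file100")` returns `Ordering::Less`, while the plain `"50".cmp("100")` returns `Ordering::Greater` because it compares the first differing character.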

If different strings have the same ASCII representation (e.g. "Foo" and "fóò"), the comparison falls back to the standard library's default string comparison, so sorting is deterministic.
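The fallback can be pictured as a two-stage comparison. The sketch below uses only the standard library and a hypothetical function name, `folded_cmp_with_fallback`; it folds ASCII case only and does not transliterate, so it demonstrates the tie-break pattern rather than this crate's actual comparison.

```rust
use std::cmp::Ordering;

/// Hypothetical comparator: compares a simplified (ASCII-lowercased) form
/// first, then breaks ties with the standard library's `str::cmp`.
fn folded_cmp_with_fallback(a: &str, b: &str) -> Ordering {
    let folded_a = a.chars().map(|c| c.to_ascii_lowercase());
    let folded_b = b.chars().map(|c| c.to_ascii_lowercase());
    folded_a
        .cmp(folded_b)
        // Only reached when the simplified forms are identical.
        .then_with(|| a.cmp(b))
}
```

Because the tie-break only returns `Ordering::Equal` for identical strings, "Foo" and "foo" always end up in the same relative order regardless of their position in the input.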

NOTE: This crate doesn't attempt to be correct for every locale, but it should work reasonably well for a wide range of locales, while providing excellent performance.

Usage

To sort strings or paths, you can use the StringSort or PathSort trait:

use lexical_sort::{StringSort, natural_lexical_cmp};

let mut strings = vec!["ß", "é", "100", "hello", "world", "50", ".", "B!"];
strings.string_sort_unstable(natural_lexical_cmp);

assert_eq!(&strings, &[".", "50", "100", "B!", "é", "hello", "ß", "world"]);

There are eight comparison functions:

Function                        lexicographical  natural  skips non-alphanumeric chars
cmp                             no               no       no
only_alnum_cmp                  no               no       yes
lexical_cmp                     yes              no       no
lexical_only_alnum_cmp          yes              no       yes
natural_cmp                     no               yes      no
natural_only_alnum_cmp          no               yes      yes
natural_lexical_cmp             yes              yes      no
natural_lexical_only_alnum_cmp  yes              yes      yes

Note that only the functions that sort lexicographically are case-insensitive.

Modules

iter

Iterators to transliterate Unicode to ASCII. Note that only alphanumeric characters are transliterated, and not all of them are supported.

Traits

PathSort

A trait to sort paths and OsStrings. This is a convenient wrapper for the standard library sort functions.

StringSort

A trait to sort strings. This is a convenient wrapper for the standard library sort functions.

Functions

cmp

Compares strings without lexicographic or natural ordering and without skipping non-alphanumeric characters

lexical_cmp

Compares strings lexicographically

lexical_only_alnum_cmp

Compares strings lexicographically, skipping non-alphanumeric characters

natural_cmp

Compares strings naturally

natural_lexical_cmp

Compares strings naturally and lexicographically

natural_lexical_only_alnum_cmp

Compares strings naturally and lexicographically, skipping non-alphanumeric characters

natural_only_alnum_cmp

Compares strings naturally, skipping non-alphanumeric characters

only_alnum_cmp

Compares strings, skipping non-alphanumeric characters
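As an illustration of what skipping non-alphanumeric characters means, here is a standard-library-only sketch. `only_alnum_cmp_sketch` is a hypothetical name, and the sketch deliberately leaves out the tie-break this crate applies when two strings compare equal after skipping.

```rust
use std::cmp::Ordering;

/// Hypothetical comparator: ignores every character that is not
/// alphanumeric and compares what remains by code point.
fn only_alnum_cmp_sketch(a: &str, b: &str) -> Ordering {
    let a_alnum = a.chars().filter(|c| c.is_alphanumeric());
    let b_alnum = b.chars().filter(|c| c.is_alphanumeric());
    a_alnum.cmp(b_alnum)
}
```

In this sketch "f-5" and "f5" compare as equal, which is why a stable sort keeps them next to each other; a real comparator would add a tie-break so the result is deterministic.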