Natural Lexicographic Order Sort
Provides hybrid sorting logic for Rust, combining natural and lexicographical ordering for strings (&str) and byte slices (&[u8]).
Overview
Standard sorting methods often fall short when dealing with lists containing both fixed-format identifiers and human-readable names with embedded numbers.
- Lexicographical sort: Works well for fixed-length IDs (e.g.,
id-001,id-002) but sorts filenames counter-intuitively (file10.txtbeforefile2.txt). - Natural sort: Handles filenames correctly (
file2.txtbeforefile10.txt) but might not be the desired behaviour for fixed-length, zero-padded IDs where lexicographical order is preferred.
natlex_sort offers a hybrid approach to address this:
- Items of the same length are compared lexicographically.
- Items of different lengths are compared using natural ordering.
This allows fixed-length IDs to sort byte-wise while variable-length names are sorted naturally.
Case-sensitive and case-insensitive variants are provided. The string-based natural sorting relies on the excellent natord crate.
Features
- Hybrid comparison functions for
&strand&[u8]. - Case-sensitive (
nat_lex_cmp,nat_lex_byte_cmp) and case-insensitive (nat_lex_cmp_ignore,nat_lex_byte_cmp_ignore) variants. - Convenience functions to sort slices in place (
nat_lex_sort,nat_lex_sort_bytes, etc.). NatLexOrderedStringwrapper type for use with ordered collections likeBTreeMap.- Relies on
natordfor robust natural sorting of strings. - Custom byte-slice natural comparison logic suitable for ASCII text with numbers.
Installation
Add natlex_sort to your Cargo.toml:
[]
= "0.1.0" # Replace with the latest version from crates.io
Usage Examples
use ;
use BTreeMap;
// --- Sorting Mixed Strings ---
let mut items = vec!;
// Case-sensitive sort
nat_lex_sort;
// Expected: "Item1", "id_001", "id_002", "item2", "item10"
// - "id_001", "id_002" (same length -> lexicographical)
// - "Item1", "item2", "item10" (different lengths -> natural, 'I' < 'i')
// - Group comparison: "Item1" vs "id_001" (diff length -> natural, 'I' < 'i')
assert_eq!;
// --- Case-Insensitive Sorting ---
let mut items_ci = vec!;
nat_lex_sort_ignore_case;
// Expected: "ID_002", "id_001", "Item1", "item2", "item10"
// - "ID_002", "id_001" (same length -> case-insensitive lex, 'D' < 'd' tie-breaker)
// - "Item1", "item2", "item10" (different lengths -> case-insensitive natural)
// - Group comparison: "ID_002" vs "Item1" (diff length -> natural ignore case, 'I'=='I', then 'D' < 't')
assert_eq!;
// --- Using the Wrapper Type ---
let mut map = new;
// Keys are ordered using case-sensitive nat_lex_cmp
map.insert;
map.insert; // Same length
map.insert; // Same length
map.insert;
let ordered_keys: = map.keys.map.collect;
// Expected: "config_2", "config_10", "id_abc", "id_abd"
// ('c' < 'i')
assert_eq!;
// --- See documentation for byte slice sorting and direct comparator usage ---
API Documentation
Detailed API documentation can be found on docs.rs.
The main components are:
- Comparison Functions:
nat_lex_cmp,nat_lex_cmp_ignore(for&str)nat_lex_byte_cmp,nat_lex_byte_cmp_ignore(for&[u8])
- Sorting Functions:
nat_lex_sort,nat_lex_sort_ignore_case(for&[T: AsRef<str>])nat_lex_sort_bytes,nat_lex_sort_bytes_ignore_case(for&[&[u8]])
- Wrapper Type:
NatLexOrderedString(owningStringwrapper usingnat_lex_cmp)
Behavior Notes
- Case-Insensitive Tie-Breaking: When comparing items of the same length using the
_ignore_casevariants, if the items are identical ignoring case (e.g.,"FileA"vs"filea"), a standard case-sensitive comparison (a.cmp(b)) is used as a tie-breaker to ensure a stable, total order. - Byte Slice Logic: The byte comparison functions (
nat_lex_byte_cmp,nat_lex_byte_cmp_ignore) implement custom natural sorting logic suitable for ASCII text with embedded numbers. See the documentation for details. Behavior with non-ASCII bytes in the natural sorting path might be less predictable than the&strversions which usenatord.
License
This project is licensed under the Mozilla Public License Version 2.0 (MPL-2.0).
Contributing
Contributions are welcome! Please feel free to open an issue for bugs, feature requests, or questions. If you'd like to submit a pull request, please open an issue first to discuss the proposed changes.
When contributing, please ensure:
- Code is formatted using
cargo fmt. - Tests pass (
cargo test). - New functionality includes relevant tests.
- Clippy lints are addressed (
cargo clippy).