# Natural Lexicographic Order Sort
[](https://crates.io/crates/natlex_sort)
[](https://docs.rs/natlex_sort)
[](https://opensource.org/licenses/MPL-2.0)
Provides hybrid sorting logic for Rust, combining natural and lexicographical ordering for strings (`&str`) and byte slices (`&[u8]`).
## Overview
Standard sorting methods often fall short when dealing with lists containing both fixed-format identifiers and human-readable names with embedded numbers.
* **Lexicographical sort:** Works well for fixed-length IDs (e.g., `id-001`, `id-002`) but sorts filenames counter-intuitively (`file10.txt` before `file2.txt`).
* **Natural sort:** Handles filenames correctly (`file2.txt` before `file10.txt`) but might not be the desired behaviour for fixed-length, zero-padded IDs where lexicographical order is preferred.
`natlex_sort` offers a hybrid approach to address this:
1. Items of the **same length** are compared **lexicographically**.
2. Items of **different lengths** are compared using **natural ordering**.
This allows fixed-length IDs to sort byte-wise while variable-length names are sorted naturally.
Case-sensitive and case-insensitive variants are provided. The string-based natural sorting relies on the excellent [`natord`](https://crates.io/crates/natord) crate.
## Features
* Hybrid comparison functions for `&str` and `&[u8]`.
* Case-sensitive (`nat_lex_cmp`, `nat_lex_byte_cmp`) and case-insensitive (`nat_lex_cmp_ignore`, `nat_lex_byte_cmp_ignore`) variants.
* Convenience functions to sort slices in place (`nat_lex_sort`, `nat_lex_sort_bytes`, etc.).
* `NatLexOrderedString` wrapper type for use with ordered collections like `BTreeMap`.
* Relies on `natord` for robust natural sorting of strings.
* Custom byte-slice natural comparison logic suitable for ASCII text with numbers.
## Installation
Add `natlex_sort` to your `Cargo.toml`:
```toml
[dependencies]
natlex_sort = "0.1.0" # Replace with the latest version from crates.io
```
## Usage Examples
```rust
use natlex_sort::{nat_lex_sort, nat_lex_sort_ignore_case, NatLexOrderedString};
use std::collections::BTreeMap;
// --- Sorting Mixed Strings ---
let mut items = vec![
"item10",
"item2",
"id_002", // Same length as id_001
"id_001",
"Item1", // Different length from item2/item10
];
// Case-sensitive sort
nat_lex_sort(&mut items);
// Expected: "Item1", "id_001", "id_002", "item2", "item10"
// - "id_001", "id_002" (same length -> lexicographical)
// - "Item1", "item2", "item10" (different lengths -> natural, 'I' < 'i')
// - Group comparison: "Item1" vs "id_001" (diff length -> natural, 'I' < 'i')
assert_eq!(items, vec!["Item1", "id_001", "id_002", "item2", "item10"]);
// --- Case-Insensitive Sorting ---
let mut items_ci = vec![
"item10",
"item2",
"ID_002", // Different case
"id_001",
"Item1",
];
nat_lex_sort_ignore_case(&mut items_ci);
// Expected: "ID_002", "id_001", "Item1", "item2", "item10"
// - "ID_002", "id_001" (same length -> case-insensitive lex, 'D' < 'd' tie-breaker)
// - "Item1", "item2", "item10" (different lengths -> case-insensitive natural)
// - Group comparison: "ID_002" vs "Item1" (diff length -> natural ignore case, 'I'=='I', then 'D' < 't')
assert_eq!(items_ci, vec!["ID_002", "id_001", "Item1", "item2", "item10"]);
// --- Using the Wrapper Type ---
let mut map = BTreeMap::new();
// Keys are ordered using case-sensitive nat_lex_cmp
map.insert(NatLexOrderedString::from("config_10"), "Value C");
map.insert(NatLexOrderedString::from("id_abc"), "Value A"); // Same length
map.insert(NatLexOrderedString::from("id_abd"), "Value B"); // Same length
map.insert(NatLexOrderedString::from("config_2"), "Value D");
```
## API Documentation
Detailed API documentation can be found on [docs.rs](https://docs.rs/natlex_sort).
The main components are:
* **Comparison Functions:**
* `nat_lex_cmp`, `nat_lex_cmp_ignore` (for `&str`)
* `nat_lex_byte_cmp`, `nat_lex_byte_cmp_ignore` (for `&[u8]`)
* **Sorting Functions:**
* `nat_lex_sort`, `nat_lex_sort_ignore_case` (for `&[T: AsRef<str>]`)
* `nat_lex_sort_bytes`, `nat_lex_sort_bytes_ignore_case` (for `&[&[u8]]`)
* **Wrapper Type:**
* `NatLexOrderedString` (owning `String` wrapper using `nat_lex_cmp`)
## Behavior Notes
* **Case-Insensitive Tie-Breaking:** When comparing items of the same length using the `_ignore_case` variants, if the items are identical ignoring case (e.g., `"FileA"` vs `"filea"`), a standard **case-sensitive** comparison (`a.cmp(b)`) is used as a tie-breaker to ensure a stable, total order.
* **Byte Slice Logic:** The byte comparison functions (`nat_lex_byte_cmp`, `nat_lex_byte_cmp_ignore`) implement custom natural sorting logic suitable for ASCII text with embedded numbers. See the documentation for details. Behavior with non-ASCII bytes in the natural sorting path might be less predictable than the `&str` versions which use `natord`.
## License
This project is licensed under the [Mozilla Public License Version 2.0](https://opensource.org/licenses/MPL-2.0) (MPL-2.0).
## Contributing
Contributions are welcome! Please feel free to open an issue for bugs, feature requests, or questions. If you'd like to submit a pull request, please open an issue first to discuss the proposed changes.
When contributing, please ensure:
* Code is formatted using `cargo fmt`.
* Tests pass (`cargo test`).
* New functionality includes relevant tests.
* Clippy lints are addressed (`cargo clippy`).