# Overclocked Sort (v0.2.1)
A hyper-optimized, auto-adaptive, and zero-dependency Parallel Hybrid Counting/Radix Sort for Rust.
Built explicitly for **High-Performance Computing (HPC)**, **Big Data (1B+ records)**, and **L2 Cache efficiency**.
It dynamically chooses between parallel counting sort (for dense ranges) and comparison-based fallback paths (for sparse or pattern-heavy ranges).
## 🚀 Features
- **Generic API**: Supports `u32`, `i32`, and custom objects (like `KeyPtr`) via the `SortableKey` trait.
- **Zero-Dependency Parallelism**: Leverages pure `std::thread::scope` for Thread-pool-less parallel data processing. No `rayon` overhead.
- **Pattern-Aware Routing**: Uses lightweight probes for monotonic/near-monotonic inputs and routes them to the most suitable path.
- **Parallel Histogram & Scatter**: Uses parallel prefix-sum arrays and concurrent Memory Scatter for extreme memory bandwidth saturation.
- **Edge-Case Guardrails**: Handles already-sorted, reverse-sorted, almost-sorted, and all-equal inputs near parity with `std::sort_unstable`.
## 📦 Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
overclocked_sort = "0.2.1"
```
## 🛠 Usage
### 1. Sorting Primitives (In-place)
```rust
use overclocked_sort::overclocked_sort;
fn main() {
let mut arr: Vec<i32> = vec![999, 1, 54, 2, 7000, 10];
// Sort directly in-place!
overclocked_sort(&mut arr);
assert_eq!(arr, vec![1, 2, 10, 54, 999, 7000]);
}
```
### 2. Sorting Objects with Payloads (`KeyPtr`)
When you need to sort indices, database records, or pointer addresses (scatter-gathering):
```rust
use overclocked_sort::{overclocked_sort, KeyPtr};
fn main() {
let mut db_records = vec![
KeyPtr { key: 5, ptr: 0xFFFF_0001 },
KeyPtr { key: 1, ptr: 0xFFFF_0002 },
];
overclocked_sort(&mut db_records);
assert_eq!(db_records[0].key, 1);
assert_eq!(db_records[1].key, 5);
}
```
## 📊 Benchmarks vs `std::sort_unstable`
| **random_dense (2M)** | **24.242 ms** | 40.806 ms | **1.683x** |
| **random_sparse (2M)** | **34.193 ms** | 53.466 ms | **1.564x** |
| **already_sorted (2M)** | 0.908 ms | **0.866 ms** | 0.954x |
| **reverse_sorted (2M)** | 1.458 ms | **1.414 ms** | 0.970x |
| **all_equal (2M)** | **0.882 ms** | 0.888 ms | **1.007x** |
| **sawtooth_1024 (2M)** | **9.708 ms** | 22.109 ms | **2.277x** |
Detailed multi-case benchmark table is available in `benchmark_results_matrix.md`.
## ⚙️ Pseudocode (v0.2.1)
```text
function overclocked_sort(arr):
if len(arr) <= 1:
return
# 1) Lightweight probe to quickly detect monotonic-like inputs.
probe_desc_breaks = sampled_desc_breaks(arr)
if probe_desc_breaks == 0 or probe_desc_breaks >= probe_points - 2:
std_sort_unstable(arr)
return
# 2) Single pass: monotonic flags + min/max + desc_breaks.
stats = scan_keys(arr)
if stats.non_decreasing:
return
if stats.non_increasing:
reverse(arr)
return
range = stats.max_key - stats.min_key + 1
max_limit = min(1_000_000, max(1, len(arr)/4))
# 3) Near-sorted sparse-ish patterns are delegated to std PDQ path.
if range > max_limit and is_near_monotonic(stats.desc_breaks, len(arr)):
std_sort_unstable(arr)
return
# 4) Dense range uses parallel counting, otherwise radix fallback.
if range <= max_limit:
if primitive_type(arr):
parallel_counting_fill(arr, stats.min_key, range)
else:
parallel_counting_scatter(arr, stats.min_key, range)
else:
radix_sort_by_key(arr)
```
## 📄 License
MIT License. See `LICENSE` for details.