Overclocked Sort (v0.2.0)
A hyper-optimized, auto-adaptive, and zero-dependency Parallel Hybrid Counting/Radix Sort for Rust. Built explicitly for High-Performance Computing (HPC), Big Data (1B+ records), and L2 Cache efficiency.
It dynamically chooses between highly-concurrent Parallel Counting Sort (for Dense ranges) and std::slice::sort_unstable (for Sparse ranges) in real-time, completely preventing Cache Thrashing and Out-of-Memory (OOM) Panics.
🚀 Features
- Generic API (New in v0.2.0): Supports
u32,i32,i64, and custom objects (likeKeyPtr) out of the box via theSortableKeytrait. - Zero-Dependency Parallelism: Leverages pure
std::thread::scopefor Thread-pool-less parallel data processing. Norayonoverhead. - Data-Aware Hybrid Partitioning: Scans data dynamically $O(N)$ and splits massive inputs (like $10^9$ elements uniformly distributed) into a Dense Stream (resolved by Overclocked Counting in milliseconds) and a Sparse Stream (resolved by In-place Quicksort).
- Parallel Histogram & Scatter: Uses parallel prefix-sum arrays and concurrent Memory Scatter for extreme memory bandwidth saturation.
- Cache-Oblivious: Avoids L3 cache misses intrinsically related to standard Radix Sort variants by allocating localized count buckets.
📦 Installation
Add this to your Cargo.toml:
[]
= "0.2.0"
🛠 Usage
1. Sorting Primitives (In-place)
use overclocked_sort;
2. Sorting Objects with Payloads (KeyPtr)
When you need to sort indices, database records, or pointer addresses (scatter-gathering):
use ;
📊 Benchmarks vs std::sort_unstable
| Test Case (Dataset) | Overclocked Sort (v0.2.0) | std::sort_unstable |
Diff |
|---|---|---|---|
| Dense Data (50M) | 100.2 ms | 510.1 ms | ~5.1x Faster |
| Sparse Data (10M) | 161.0 ms | 144.3 ms | Hybrid Fallback Safe |
| Already Sorted (50M) | 55.2 ms | 12.8 ms | Fast $O(N)$ check |
| Structures (20M KeyPtrs) | 328.3 ms | 351.9 ms | Memory Bandwidth Limit |
| Massive Mixed (100M) | 5.5 s | > 8.0 s | In-place Partitioning |
Test Spec: CPU Core i7/i9 equivalent, Rust 1.70+, Release mode
⚙️ Architecture: Dual-Stream Coordinator
If the system detects that allocating buckets would exceed ~OOM limits, it runs a Two-Pointer In-place Partition:
Dense Stream: Data< Limitsmapped to Left. Solved concurrently by Parallel Counting Sort.Sparse Stream: Data> Limitsmapped to Right. Delegated safely to PDQSort / Unstable Sort.
📄 License
MIT License. See LICENSE for details.