djbsort 0.2.0

Constant-time sorting network (djbsort) with SIMD optimization
Documentation
# djbsort

A Rust port of the [Zig implementation](https://github.com/jedisct1/zig-djbsort) of DJB's constant-time sorting network ([sorting.cr.yp.to](https://sorting.cr.yp.to/)).

The sorting network is data-oblivious: the sequence of comparisons and swaps depends only on the array length, never on the values. This makes it immune to timing side-channels, which matters when sorting sensitive data in cryptographic contexts.

All code paths are constant-time. There are no non-CT options.

## Usage

The crate provides two functions.

`sort` is the fast path for numeric primitives. It uses SIMD vectorization (NEON on aarch64, AVX2 on x86_64) with a scalar fallback:

```rust
use djbsort::{sort, Order};

let mut data = [42i32, -7, 13, 0, -99];
sort(&mut data, Order::Asc);
assert_eq!(data, [-99, -7, 0, 13, 42]);
```

All Rust numeric types are supported: `i8` through `i128`, `u8` through `u128`, `isize`, `usize`, `f32`, and `f64`.

`sort_by` handles arbitrary types with a user-supplied comparator. It uses the same network topology and swaps bytes via XOR with an assembly barrier to prevent the compiler from introducing branches. The type must implement `CtSwappable` (all bit patterns valid, no padding), and the result is timing-safe as long as your comparator is too:

```rust
use djbsort::sort_by;

let mut keys = [5u64, 3, 8, 1, 4];
sort_by(&mut keys, |a, b| a < b);
assert_eq!(keys, [1, 3, 4, 5, 8]);
```

## Float ordering

Floats are sorted using a total order: `-NaN < -inf < ... < -0.0 < +0.0 < ... < +inf < +NaN`. Internally they are bitcast to integers via the "useint" technique, sorted as integers, and bitcast back. This means floats sort at the same speed as same-width integers.

## Constant-time guarantees

On x86_64, x86, and aarch64, scalar `min`/`max` operations compile to constant-time instructions (cmov, csel). On other architectures, the library uses XOR-masked arithmetic with an assembly barrier to avoid branches. The generic `sort_by` path always uses XOR-masked byte swaps regardless of architecture.