# Sliding Ring

[![Crates.io](https://img.shields.io/crates/v/sliding-ring.svg)](https://crates.io/crates/sliding-ring)
[![lib.rs](https://img.shields.io/badge/lib.rs-sliding--ring-orange.svg)](https://lib.rs/crates/sliding-ring)
[![Documentation](https://docs.rs/sliding-ring/badge.svg)](https://docs.rs/sliding-ring)
[![License](https://img.shields.io/crates/l/sliding-ring.svg)](#license)
[![Downloads](https://img.shields.io/crates/d/sliding-ring.svg)](https://crates.io/crates/sliding-ring)

`sliding-ring` provides a cache-friendly, allocation-free ring buffer that keeps its slots indexed relative to a moving anchor coordinate. It was extracted from our high-performance orderbook engine and generalized so any `Sized` slot type can participate in consistent sliding-window semantics.

## Installation

```bash
cargo add sliding-ring
# Enable the branchless clearing routines if you need predictably low latency clears.
cargo add sliding-ring --features branchless
```

The crate-level docs embed this `README.md`, so `cargo doc --open -p sliding-ring` renders the same content alongside the API docs locally.

## Why this exists
Traditional ring buffers optimize for FIFO semantics, but exchange and telemetry workloads often care about the *best* coordinate (highest bid, current time bucket, etc.) and need random access around that best. `SlidingRing` keeps a 256-slot ring that re-anchors in O(1) time, clearing the newly entered slots while preserving the rest. This enables predictable performance, avoids per-slot allocations, and keeps the CPU cache warm even under heavy churn.

## When to reach for it
- Aggregated orderbooks keyed by discrete price ticks.
- Sliding time buckets for counters, rate-limiters, or observability windows.
- Any moving reference window where only ±255 offsets from the anchor matter.

## When it is the wrong tool
- You need more than 256 simultaneous offsets (use a wider structure or a map).
- The anchor jumps sparsely across a huge domain, making the clearing work wasteful.
- Slots own large heap allocations that would be expensive to recreate on every clear (store handles or IDs instead).

## Model
- `anchor` denotes the absolute coordinate represented by the pointer slot.
- Moving the anchor by `k` steps rewires indices via `(pointer + k) mod 256`.
- Small moves clear only the slots that just became visible; moves with `|k| >= 256` clear the entire ring (see the sketch after this list).
- [`shift_to_keep_new_base`](https://docs.rs/sliding-ring/latest/sliding_ring/struct.SlidingRing.html#method.shift_to_keep_new_base) preserves the new base slot when scanning for the next best coordinate.
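
The arithmetic behind those rules can be sketched in isolation; this is an illustration of the model, not the crate's internal code:

```rust
// Illustration only: the real logic lives inside the crate's shift routines.
const RING_LEN: i64 = 256;

/// New pointer index and number of slots to clear after moving the anchor by `k`.
fn reanchor(pointer: i64, k: i64) -> (i64, i64) {
    let new_pointer = (pointer + k).rem_euclid(RING_LEN);
    // Each step exposes one previously out-of-window slot; once |k| reaches
    // the ring length, every slot has to be cleared.
    let to_clear = k.abs().min(RING_LEN);
    (new_pointer, to_clear)
}

assert_eq!(reanchor(10, 3), (13, 3));       // small move: clear 3 slots
assert_eq!(reanchor(10, -300), (222, 256)); // |k| >= 256: full clear
```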

## Performance guardrails
- **No dynamic dispatch:** `SlidingRing` is generic over `T: Slot`, so every slot type is monomorphized at compile time and the compiler inlines your `Slot` methods. Do not box `dyn Slot`; the struct stores `[T; 256]`, meaning `T` must stay `Sized`.
- **Anchor slices for SIMD:** `slices_from_anchor*` return the two contiguous spans starting at the pointer so you can run vectorized scans or clears without modulo math. This is the primary primitive for ultra-low-latency users.
- **Stable SIMD workflow:** Pair `slices_from_anchor*` with the [`wide`](https://crates.io/crates/wide) crate (see the `simd_scan` example below) to stay on stable Rust while still using vector instructions.
- **Raw-pointer iterators:** `iter()` / `iter_mut()` walk storage order, while `iter_from_anchor*` emit `(price, slot)` pairs in buy/sell order (see the sketch after this list). Both rely on pointer arithmetic, so they remain zero-cost abstractions.
- **Inline hot paths:** ring methods and bundled `Slot` implementations are annotated with `#[inline(always)]` so LLVM can flatten the tiny helpers.
- **Memory over CPU:** we willingly dedicate 256 slots per ring to keep operations branch-predictable. If memory becomes tighter than latency, this crate is not the right fit.
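
The two iterator families differ only in what they hand you. A short sketch, assuming `u64` slots and the accessors shown in the examples further down:

```rust,ignore
use sliding_ring::{Direction, SlidingRing};

let mut ring = SlidingRing::<u64>::new(100);
*ring.slot_mut(101) = 5;
*ring.slot_mut(103) = 7;

// Storage-order walk: a bulk pass with no coordinate bookkeeping.
let total: u64 = ring.iter().sum();
assert_eq!(total, 12);

// Anchor-order walk: each slot arrives paired with its absolute coordinate.
let anchored: u64 = ring
    .iter_from_anchor(Direction::Forward)
    .map(|(_, slot)| slot)
    .sum();
assert_eq!(anchored, total);
```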

### Ultra-low-latency workflow
1. Call `slices_from_anchor*()` to obtain the tail/head spans.
2. Run SIMD loads/stores or hand-unrolled loops directly over those spans (tail first, then head) to scan for non-zero slots, update counters, etc.
3. If you also need absolute coordinates, pair the SIMD scan with `iter_from_anchor*()`: both share the same traversal order, so SIMD findings map back to prices with minimal bookkeeping. A scalar sketch of steps 1 and 2 follows.
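
A scalar version of steps 1 and 2, assuming `u64` slots; swap the inner loop for SIMD as in the `simd_scan` example when latency matters:

```rust,ignore
use sliding_ring::SlidingRing;

let mut ring = SlidingRing::<u64>::new(1_000);
*ring.slot_mut(1_001) = 7;
*ring.slot_mut(1_010) = 3;

// Step 1: the two contiguous spans starting at the pointer slot.
let (tail, head) = ring.slices_from_anchor();

// Step 2: scan tail first, then head, with no modulo math in the loop body.
let occupied = tail.iter().chain(head.iter()).filter(|&&v| v != 0).count();
assert_eq!(occupied, 2);
```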

## Slot trait
Every slot type must implement the lightweight [`Slot`](https://docs.rs/sliding-ring/latest/sliding_ring/trait.Slot.html) trait, which defines how to create and clear a value. The crate ships ready-made implementations for:
- All integer primitives (`i*`, `u*`, and `usize`/`isize`).
- `Option<T>` (clears back to `None`).
- `Cell<T>` where `T: Slot + Copy` so interior-mutable slots can be hosted.

Custom data can implement `Slot` to retain domain-specific state or perform branchless clears (e.g., resetting SIMD lanes, bitsets, etc.).
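
The bundled implementations mean primitive-backed rings need no custom trait code at all. A small sketch, assuming the `slot_mut` accessor used in the examples below:

```rust,ignore
use sliding_ring::SlidingRing;

// Integer slots clear back to zero on re-anchoring...
let mut counters = SlidingRing::<u64>::new(0);
*counters.slot_mut(3) += 1;

// ...while `Option<T>` slots clear back to `None`.
let mut latest = SlidingRing::<Option<u32>>::new(0);
*latest.slot_mut(3) = Some(42);
assert_eq!(*latest.slot_mut(3), Some(42));
```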

## Example
```rust
use sliding_ring::{SlidingRing, Slot};

#[derive(Default)]
struct Bucket {
    hits: u64,
}

impl Slot for Bucket {
    fn init() -> Self {
        Self { hits: 0 }
    }

    fn clear(&mut self) {
        self.hits = 0;
    }
}

let mut ring = SlidingRing::<Bucket>::new(10_000);
ring.slot_mut(10_005).hits += 1;
ring.shift_to(10_005);
assert_eq!(ring.slot_at_pointer().hits, 1);
```

### Telemetry counter
```rust
use sliding_ring::{Direction, SlidingRing, Slot};

#[derive(Default, Clone, Copy)]
struct Bucket { hits: u64 }

impl Slot for Bucket {
    fn init() -> Self { Self { hits: 0 } }
    fn clear(&mut self) { self.hits = 0; }
}

let mut ring = SlidingRing::<Bucket>::new(0);
ring.slot_mut(5).hits += 1;
ring.slot_mut(6).hits += 3;
let sum: u64 = ring
    .iter_from_anchor(Direction::Forward)
    .map(|(_, bucket)| bucket.hits)
    .sum();
assert_eq!(sum, 4);
```

### SIMD scanning
```rust,ignore
use sliding_ring::SlidingRing;
use wide::u64x4;

let mut ring = SlidingRing::<u64>::new(1_000);
for i in 0..256 {
    *ring.slot_mut(1_000 + i as i128) = i as u64;
}
let (tail, head) = ring.slices_from_anchor();
let needle = u64x4::splat(42);

for chunk in tail.chunks_exact(4).chain(head.chunks_exact(4)) {
    let vec = u64x4::new([chunk[0], chunk[1], chunk[2], chunk[3]]);
    if vec
        .cmp_eq(needle)
        .to_array()
        .iter()
        .any(|&lane| lane == u64::MAX)
    {
        println!("found 42");
        break;
    }
}
```

Run `cargo run --example telemetry` or `cargo run --example simd_scan` for the full samples. The SIMD example relies on the `wide` crate, so it stays on stable Rust while still demonstrating contiguous-slice scanning.

## Documentation
- API reference on [docs.rs](https://docs.rs/sliding-ring).
- Run `cargo doc --open -p sliding-ring` for local browsing.

## Feature flags
- `branchless` – enables branchless variants of the re-anchoring routines. They cost a few extra instructions per slot but avoid unpredictable branches when clearing. Enable with `--features branchless` when compiling/testing.

## Status
This crate is extracted from production systems but is still evolving. Expect additions such as configurable ring widths, additional iterators, and more slot utilities as we work toward a polished open-source release.

## Examples
- `cargo run --example telemetry` – basic per-slot counter with anchor-ordered reduction.
- `cargo run --example simd_scan` – demonstrates splitting the ring into contiguous spans and scanning with SIMD on stable via the `wide` crate.

## Benchmarks
- `cargo bench -p sliding-ring` runs the Criterion suite measuring anchor shifts and slot updates.
- Bazel users can run `bazel run //lib/collections/sliding-ring:sliding_ring_bench` (tagged `benchmark`) to launch the same binary.

## License
Licensed under either of

- Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in `sliding-ring` by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.