bitwheel 0.6.0

High-performance fixed capacity timer wheel
Documentation
# BitWheel

High-performance timer primitives for Rust, optimized for consistent low-latency in event-driven applications.

## Two Data Structures, Two Use Cases

| | BitWheel | IntervalHeap |
|--|----------|------------|
| **Purpose** | High-throughput ephemeral timers | Cancellable periodic tickers |
| **Examples** | Order timeouts, rate limits, retries | Heartbeats, monitoring sweeps, metrics flush |
| **Volume** | Thousands to millions | Tens to hundreds |
| **Handle stability** | Invalid after fire | Stable until cancelled |
| **Complexity** | O(1) insert/cancel/poll | O(log n) insert/remove, O(1) poll |

**Use BitWheel when**: You need maximum throughput for short-lived timers that fire once and disappear.

**Use IntervalHeap when**: You need periodic callbacks with stable handles for cancellation.

## Quick Start

### BitWheel — Ephemeral Timers
```rust
use bitwheel::{BitWheel, BalancedWheel, Timer};
use std::time::{Duration, Instant};

struct OrderTimeout {
    order_id: u64,
}

impl Timer for OrderTimeout {
    type Context = Vec<u64>;

    fn fire(&mut self, _now: Instant, ctx: &mut Self::Context) {
        ctx.push(self.order_id);
    }
}

fn main() {
    let mut wheel: Box<BalancedWheel<OrderTimeout>> = BitWheel::boxed();
    
    // Insert a timer
    let when = Instant::now() + Duration::from_millis(100);
    let handle = wheel.insert(when, OrderTimeout { order_id: 12345 }).unwrap();
    
    // Cancel if order fills before timeout
    if let Some(timer) = wheel.cancel(handle) {
        println!("Cancelled order {}", timer.order_id);
    }
    
    // Poll in event loop
    let mut expired_orders = Vec::new();
    wheel.poll(Instant::now(), &mut expired_orders).unwrap();
}
```

### IntervalHeap — Periodic Intervals
```rust
use bitwheel::IntervalHeap;
use std::time::{Duration, Instant};

fn main() {
    let mut tickers: IntervalHeap<(), 32> = IntervalHeap::new();
    
    // Register a heartbeat ticker
    let handle = tickers.insert(
        Duration::from_secs(30),
        Instant::now(),
        |_ctx| println!("heartbeat"),
    ).unwrap();
    
    // Poll in event loop — tickers automatically reschedule
    tickers.poll(Instant::now(), &mut ());
    
    // Cancel when connection closes
    tickers.remove(handle);
}
```

## BitWheel

### Features

- **O(1) operations** — insert, cancel, and poll are constant-time regardless of timer count
- **Predictable tail latency** — no cascading between gears eliminates latency spikes
- **Zero allocation on hot path** — all memory pre-allocated; trades memory for determinism
- **Bounded capacity by design** — insert failure signals backpressure, not a bug
- **Failover option** — `BitWheelWithFailover` gracefully degrades to BTreeMap when full
- **Compile-time configuration** — tune resolution, capacity, and range via const generics

### Preconfigured Wheels

| Type | Resolution | Range | Slot Cap | Use Case |
|------|------------|-------|----------|----------|
| `BalancedWheel` | 4ms | ~19 hours | 16 | General purpose |
| `BalancedFastWheel` | 4ms | ~19 hours | 64 | Minimal probing |
| `BalancedLightWheel` | 4ms | ~19 hours | 8 | Memory constrained |
| `PreciseWheel` | 1ms | ~4.7 hours | 16 | Sub-10ms timeouts |
| `BurstWheel` | 4ms | ~19 hours | 32 | High timer density |
| `ExtendedWheel` | 16ms | ~3 days | 16 | Long-running timers |

Each has a `WithFailover` variant (e.g., `BalancedWheelWithFailover`) that falls back to BTreeMap on overflow.

```rust
use bitwheel::{PreciseWheel, BurstWheelWithFailover, BitWheel, BitWheelWithFailover};

// 1ms resolution for tight timeouts
let mut precise: Box<PreciseWheel<MyTimer>> = BitWheel::boxed();

// High capacity with safety net
let mut burst: Box<BurstWheelWithFailover<MyTimer>> = BitWheelWithFailover::boxed();
```

### Design Philosophy

#### Insert Can Fail (And That's OK)

Unlike most timer implementations, `BitWheel::insert` returns `Result`. When slots overflow:
```rust
match wheel.insert(when, timer) {
    Ok(handle) => { /* timer scheduled */ }
    Err(InsertError(timer)) => {
        // Capacity exceeded — apply backpressure
        // This is a feature: your config doesn't match your workload
    }
}
```

For systems where dropping timers is unacceptable, use `BitWheelWithFailover`:
```rust
use bitwheel::{BalancedWheelWithFailover, BitWheelWithFailover};

let mut wheel: Box<BalancedWheelWithFailover<MyTimer>> = 
    BitWheelWithFailover::boxed();

// Never fails — overflows go to BTreeMap
let handle = wheel.insert(when, timer);
```

#### Precision vs. Predictability

Traditional hierarchical wheels cascade timers between levels as time advances. This causes latency spikes when many timers move simultaneously.

BitWheel trades precision for predictability:
- Timers in higher gears (longer delays) have coarser resolution
- A 1-hour timer might fire within a ~64ms window — acceptable for most use cases
- No cascading means no surprise latency spikes

#### Memory Tradeoff

BitWheel pre-allocates all slot storage at construction:

| Configuration | Approximate Memory |
|---------------|-------------------|
| `BalancedWheel` | ~200 KB |
| `BalancedFastWheel` | ~800 KB |
| `BurstWheel` | ~400 KB |

This trades memory for determinism — no allocator calls during insert/poll.

### How It's Fast

**Hierarchical gear design**: 64 slots per gear, each gear 64× coarser than the previous. 5 gears cover ~19 hours at 4ms resolution.

**Skip-ahead optimization**: Tracks `next_fire_tick` globally. Empty polls jump directly to the next timer instead of iterating idle slots.

**Bitmap-accelerated lookup**: Each gear maintains a 64-bit occupied bitmap. Finding the next non-empty slot is `rotate_right` + `trailing_zeros` — a few CPU cycles.

**Dirty tracking**: Gear metadata only recomputes when timers are removed, not on every poll.

**Fixed-size slabs with intrusive lists**: Each slot is a pre-allocated slab with O(1) insert/remove via intrusive linked lists. No pointer chasing through heap allocations.

**No cascading**: Timers stay in their original slot until they fire. Higher gears have lower precision, but zero overhead from timer migration.

**Power-of-2 arithmetic**: Resolution must be a power of 2, enabling bit shifts instead of division on the hot path.

### Advanced Configuration
```rust
use bitwheel::BitWheel;

// BitWheel<Timer, NUM_GEARS, RESOLUTION_MS, SLOT_CAP, MAX_PROBES>
type CustomWheel<T> = BitWheel<T, 4, 1, 64, 8>;
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `NUM_GEARS` | 5 | Hierarchy levels (more = longer max delay) |
| `RESOLUTION_MS` | 4 | Base tick resolution in milliseconds (must be power of 2) |
| `SLOT_CAP` | 32 | Timers per slot before probing |
| `MAX_PROBES` | 3 | Linear probe distance on collision |

### Capacity Planning

Each gear has 64 slots. When a slot fills to `SLOT_CAP`, inserts probe linearly to adjacent slots.

**Total capacity**: `64 × NUM_GEARS × SLOT_CAP` timers (theoretical max)

**Hotspot capacity**: `SLOT_CAP × MAX_PROBES` timers at the same target time

#### The Probing Tradeoff

| Config | Latency | Memory | Best For |
|--------|---------|--------|----------|
| Large `SLOT_CAP`, small `MAX_PROBES` | Tight tails | Higher | Latency-critical systems |
| Small `SLOT_CAP`, large `MAX_PROBES` | Variable tails | Lower | Memory-constrained systems |

**Why probing hurts tails**: Each probe is a potential cache miss and adds branches. With `MAX_PROBES=2`, insert is almost always single-slot. With `MAX_PROBES=16`, worst-case insert touches 16 slots.

#### Sizing for Your Workload

1. **Estimate hotspot density**: How many timers fire within one tick? (e.g., 1000 orders/sec × 100ms timeout = 100 timers per 1ms tick)

2. **Set `SLOT_CAP`** to handle typical density without probing

3. **Set `MAX_PROBES`** for burst headroom (2-4× typical density)

4. **Use failover** if drops are unacceptable: `BitWheelWithFailover` catches overflow in a BTreeMap

---

## IntervalHeap

A min-heap for low-volume periodic tickers with stable handles.

### Features

- **Stable handles** — Handle remains valid through any number of fires until explicitly removed
- **O(log n) insert/remove** — Heap operations scale logarithmically  
- **O(1) poll** — Peek at min is constant time; fires are O(log n) each
- **Zero allocation** — Fixed-capacity array, no heap alloc on hot path
- **Automatic rescheduling** — Intervals reschedule themselves after firing
- **Configurable capacity** — Const generic for compile-time sizing

### API
```rust
use bitwheel::IntervalHeap;
use std::time::{Duration, Instant};

// Create with capacity for 32 tickers
let mut heap: IntervalHeap<MyContext, 32> = IntervalHeap::new();

// Insert a ticker (period, first fire time, callback)
let handle = heap.insert(
    Duration::from_secs(30),
    Instant::now() + Duration::from_secs(30),
    |ctx| { /* heartbeat logic */ },
).unwrap();

// Poll fires due tickers and reschedules them
let fired_count = heap.poll(Instant::now(), &mut ctx);

// Remove when no longer needed
heap.remove(handle);

// Query state
heap.is_active(handle);      // Check if handle is still valid
heap.duration_until_next();  // Time until next fire
heap.len();                  // Number of active tickers
```

### Typical Use Cases

| Interval | Period | Example |
|--------|--------|---------|
| Venue heartbeat | 30s | Exchange liveness check |
| Position sweep | 1s | Reconciliation scan |
| Metrics flush | 10s | Push to monitoring |
| Connection keepalive | 60s | TCP keepalive |
| Risk check | 100ms | Portfolio limits |

### Capacity Guidelines

IntervalHeap is designed for small n (tens to low hundreds). For a trading system:

```rust
// Typical ticker inventory:
// - 5 venue heartbeats
// - 1 position sweep  
// - 1 metrics flush
// - 1 risk check
// - ~10 spare
// Total: ~20 tickers

type TradingIntervals<Ctx> = IntervalHeap<Ctx, 32>;  // Comfortable headroom
```

---

## Benchmarks

Measured with HDR histograms (100k warmup, 1M iterations, `--release`).

### BitWheel — Ephemeral Timer Performance

#### Realistic Trading Simulation

Order timeout management: 80% inserts, 5% cancels, continuous polling.

| Operation | BitWheel | BitWheel+Failover | BTreeMap | HHWT |
|-----------|----------|-------------------|----------|------|
| Insert p50 | **24 ns** | 27 ns | 42 ns | 28 ns |
| Insert p99.9 | **61 ns** | 66 ns | 79 ns | 87 ns |
| Poll p50 | **15 ns** | 17 ns | 17 ns | 19 ns |
| Poll p99.9 | **61 ns** | 65 ns | 149 ns | 1515 ns |
| Cancel p50 | **14 ns** | 14 ns | 54 ns | 20 ns |
| Cancel p99.9 | **33 ns** | 36 ns | 156 ns | 45 ns |

#### Interleaved Insert Latency

Timers inserted between existing entries — stresses tree rebalancing.

| Percentile | BitWheel | BitWheel+Failover | BTreeMap | HHWT |
|------------|----------|-------------------|----------|------|
| p50 | **26 ns** | 27 ns | 34 ns | 39 ns |
| p90 | **28 ns** | 31 ns | 49 ns | 54 ns |
| p99 | **34 ns** | 50 ns | 100 ns | 278 ns |
| p99.9 | **48 ns** | 75 ns | 131 ns | 990 ns |
| p99.99 | **131 ns** | 149 ns | 201 ns | 2415 ns |

### IntervalHeap — Periodic Interval Performance

#### Heartbeat Scenario (5 venue heartbeats, continuous operation)

| Operation | p50 | p99 | p99.9 |
|-----------|-----|-----|-------|
| Insert | 15 ns | 35 ns | 53 ns |
| Poll | 12 ns | 99 ns | 134 ns |
| Remove | 15 ns | 30 ns | 52 ns |

#### Steady State (10 tickers firing @ 1ms)

| Percentile | Latency |
|------------|---------|
| p50 | 119 ns |
| p90 | 181 ns |
| p99 | 192 ns |
| p99.9 | 631 ns |

Comparisons:
- **BTreeMap**: `BTreeMap<(u64, u32), T>` - similar usage on async runtimes
- **HHWT**: [hierarchical_hash_wheel_timer](https://crates.io/crates/hierarchical_hash_wheel_timer) v1.3

---

## Event Loop Integration
```rust
use bitwheel::{BitWheel, BalancedWheel, IntervalHeap, Timer};
use std::time::{Duration, Instant};

struct TimerSystem<T: Timer> {
    wheel: Box<BalancedWheel<T>>,
    tickers: IntervalHeap<T::Context, 32>,
}

impl<T: Timer> TimerSystem<T> {
    fn poll(&mut self, now: Instant, ctx: &mut T::Context) {
        // Poll tickers first (heartbeats are time-critical)
        self.tickers.poll(now, ctx);
        
        // Then ephemeral timers
        if let Err(e) = self.wheel.poll(now, ctx) {
            eprintln!("{} timers lost to overflow", e.0);
        }
    }
    
    fn duration_until_next(&self) -> Option<Duration> {
        let wheel_next = self.wheel.duration_until_next();
        let ticker_next = self.tickers.duration_until_next();
        
        match (wheel_next, ticker_next) {
            (Some(w), Some(t)) => Some(w.min(t)),
            (Some(d), None) | (None, Some(d)) => Some(d),
            (None, None) => None,
        }
    }
}
```

## License

MIT OR Apache-2.0