rusted-ring
A high-performance, cache-optimized ring buffer library for Rust, designed for lock-free, zero-copy event processing. It offers two complementary patterns: smart-pointer semantics (RingPtr) for multi-actor sharing, and sequential Reader/Writer processing for high-throughput pipelines.
Features
- Cache-line aligned ring buffers for optimal CPU cache performance
- Lock-free operations using atomic memory ordering
- T-shirt sized pools for different event categories (XS, S, M, L, XL)
- Zero-copy operations with Pod/Zeroable support
- Dual processing patterns: RingPtr for sharing, Reader/Writer for sequential throughput
- Reference counting for safe slot reuse across multiple consumers
- Mobile optimized for ARM and x86 architectures with device-aware pool sizing
- LMAX-inspired sequential processing for ultra-low latency
Core Architecture: Two Complementary Patterns
Pattern A: RingPtr - Multi-Actor Event Sharing
For scenarios where events need to be shared across multiple actors with variable processing speeds and lifetimes:
```rust
use rusted_ring::{RingAllocator, RingPtr, PooledEvent};

// Allocate an event in the ring buffer
// (illustrative: exact constructor and allocator signatures may differ)
let allocator = RingAllocator::new();
let ring_ptr: RingPtr<PooledEvent<1024>> = allocator.allocate_m_event(event)?;

// Share across multiple actors (reference counting)
let shared_ptr1 = ring_ptr.clone(); // Database writer
let shared_ptr2 = ring_ptr.clone(); // Search indexer
let shared_ptr3 = ring_ptr.clone(); // Analytics processor
let shared_ptr4 = ring_ptr;         // Audit logger

// Send to different actors
database_tx.send(shared_ptr1)?;
indexing_tx.send(shared_ptr2)?;
analytics_tx.send(shared_ptr3)?;
audit_tx.send(shared_ptr4)?;

// Each actor processes independently; the slot is freed when all finish
```
Ideal for:
- Multi-actor systems with fan-out processing
- Variable processing speeds across consumers
- Cross-system boundaries (FFI, networking)
- Event persistence and replication
- Complex routing and error handling
- Systems requiring reference counting semantics
Pattern B: Reader/Writer - Sequential High-Throughput
For scenarios requiring maximum throughput with predictable, sequential processing:
```rust
use rusted_ring::{RingBuffer, Reader, Writer};
use std::sync::Arc;
use std::thread::spawn;

// Create a ring buffer for the high-throughput pipeline
// (illustrative: exact type parameters and constructors may differ)
let ring = Arc::new(RingBuffer::new());
let mut writer = Writer::new(Arc::clone(&ring));
let mut reader = Reader::new(Arc::clone(&ring));

// Producer thread: ultra-fast sequential writing
spawn(move || {
    while let Some(event) = next_event() {
        writer.write(event); // backs off when the ring is full
    }
});

// Consumer thread: sequential processing pipeline
spawn(move || {
    while let Some(event) = reader.read() {
        process(event);
    }
});
```
Ideal for:
- Sequential operator pipelines
- Batch processing workflows (sort → fold → reduce)
- High-frequency data streaming
- Real-time systems with strict ordering requirements
- Single-producer, single-consumer scenarios
- Maximum throughput applications
Core Types
PooledEvent
Fixed-size event structure optimized for zero-copy operations:
RingPtr - Smart Pointer to Ring Buffer Slots
Acts like `Arc<T>` but points into ring buffer memory instead of the heap:
```rust
use rusted_ring::{RingPtr, PooledEvent};

// Allocation (illustrative signatures)
let ring_ptr: RingPtr<PooledEvent<256>> = allocator.allocate_s_event(event)?;

// Sharing (increments the reference count)
let shared_ptr = ring_ptr.clone();

// Access (zero-copy deref into the ring buffer slot)
let event_data = &ring_ptr.data;
let event_type = ring_ptr.event_type;

// Automatic cleanup when all RingPtrs drop
```
Dual Ring Buffer Architecture
T-Shirt Sizing for Event Categories
Pre-defined event sizes for optimal memory usage:
```rust
// Automatic size selection (illustrative: helper and enum names are assumptions)
let size = estimate_size(payload.len());
match size {
    EventSize::XS => allocator.allocate_xs_event(payload),
    EventSize::S => allocator.allocate_s_event(payload),
    EventSize::M => allocator.allocate_m_event(payload),
    EventSize::L => allocator.allocate_l_event(payload),
    EventSize::XL => allocator.allocate_xl_event(payload),
};
```
Performance Characteristics by Pattern
Multi-Actor Sharing (RingPtr)
- Allocation: ~10-50ns per event vs ~100-500ns malloc
- Sharing: ~1-3 CPU cycles per clone (atomic ref count)
- Fan-out: Zero-copy distribution to N actors
- Memory: Predictable pools, no heap fragmentation
- Cleanup: Automatic when all references drop
Sequential Processing (Reader/Writer)
- Throughput: 10-50x faster than channel-based pipelines
- Latency: Microsecond-level event processing
- Backpressure: Natural ring buffer full detection
- Cache: Optimized sequential access patterns
- Ordering: Strict FIFO processing guarantees
Usage Examples
Example 1: Event Distribution System
```rust
use rusted_ring::{RingAllocator, RingPtr, PooledEvent};
use std::sync::{Arc, LazyLock};

// Initialize a global allocator (illustrative constructor)
static ALLOCATOR: LazyLock<RingAllocator> = LazyLock::new(RingAllocator::new);

// Event wrapper for your application
// ...

// Incoming event handler
// ...

// Actor processing
// ...
```
Example 2: High-Throughput Data Pipeline
```rust
use rusted_ring::{RingBuffer, Reader, Writer};
use std::sync::Arc;

// Pipeline stages
// ...
```
Example 3: Batch Processing with Ring Buffers
```rust
use rusted_ring::{RingBuffer, Reader, Writer};

// Efficient batch processing using sequential ring buffer access
// ...
```
Memory Requirements by Configuration
Default Configuration (~2.5MB total)
XS: 64B × 2000 = 128KB
S: 256B × 1000 = 256KB
M: 1KB × 300 = 307KB
L: 4KB × 60 = 245KB
XL: 16KB × 15 = 245KB
Mobile Optimized (~600KB total)
XS: 64B × 500 = 32KB
S: 256B × 250 = 64KB
M: 1KB × 100 = 100KB
L: 4KB × 20 = 80KB
XL: 16KB × 5 = 80KB
High-Throughput Server (~8MB total)
XS: 64B × 4000 = 256KB
S: 256B × 2000 = 512KB
M: 1KB × 600 = 614KB
L: 4KB × 120 = 491KB
XL: 16KB × 30 = 491KB
Device-Aware Initialization
```rust
use rusted_ring::{RingAllocator, DeviceConfig};

// Automatic device detection and optimization
// ...

// Custom configuration (field values were elided in the original)
let custom_config = DeviceConfig { /* per-tier pool sizes */ };
let allocator = RingAllocator::new_with_config(custom_config);
```
When to Use Each Pattern
Use RingPtr when:
- Events need to be shared across multiple actors
- Processing speeds vary significantly between consumers
- You need reference counting semantics
- Events cross system boundaries (FFI, networking)
- Complex routing and error handling is required
- Fan-out distribution patterns
Use Reader/Writer when:
- Maximum throughput is the primary goal
- Processing follows strict sequential ordering
- Single-producer, single-consumer scenarios
- Batch processing workflows
- Real-time systems with predictable latency requirements
- Pipeline architectures with multiple stages
Compile-time Safety
Built-in guards prevent stack overflow from oversized ring buffers:
```rust
const MAX_STACK_BYTES: usize = 1_048_576; // 1MB stack limit

// Compile-time size validation (illustrative reconstruction):
// compilation fails if the ring buffer would exceed the limit.
const _STACK_GUARD: () = assert!(core::mem::size_of::<RingBuffer>() <= MAX_STACK_BYTES);
```
Memory Ordering & Safety
Careful memory ordering ensures lock-free safety:
- Writers: Use `Release` ordering when updating cursors
- Readers: Use `Acquire` ordering when reading positions
- Reference counting: Uses `AcqRel` ordering for atomicity
- Slot reuse: Protected by generation numbers (ABA prevention)
- Cache optimization: All structures are 64-byte aligned
License
MPL-2.0