arena-b
arena-b is a high-performance bump allocator / arena allocator and memory pool crate for Rust. It is designed for allocation-heavy workloads such as parsers, compilers, game engines, simulations, and data processing, while keeping a clean, idiomatic Rust API.
The core type is arena_b::Arena, a bump allocator that lets you allocate many values cheaply and reclaim them all at once when the arena is reset or dropped.
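To make the bump-allocation idea concrete, here is a minimal, std-only sketch of a bump cursor over one fixed buffer. `ToyBump` and its methods are hypothetical names invented for this illustration; they are not arena-b's types or its implementation, and the sketch hands out offsets rather than references to stay in safe Rust.

```rust
/// Toy bump arena: hands out offsets into one fixed buffer.
/// Illustration only; this is not arena-b's implementation.
struct ToyBump {
    buf: Vec<u8>,
    offset: usize,
}

impl ToyBump {
    fn with_capacity(cap: usize) -> Self {
        ToyBump { buf: vec![0; cap], offset: 0 }
    }

    /// Reserve `size` bytes aligned to `align` (a power of two).
    /// Returns the start offset, or `None` if the buffer is full.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round the cursor up to the requested alignment.
        let start = (self.offset + align - 1) & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None;
        }
        self.offset = end; // the "bump"
        Some(start)
    }

    /// Reclaim every allocation at once by moving the cursor back to zero.
    fn reset(&mut self) {
        self.offset = 0;
    }
}

fn main() {
    let mut arena = ToyBump::with_capacity(64);
    let a = arena.alloc(3, 1).unwrap();
    let b = arena.alloc(8, 8).unwrap(); // cursor is rounded up to 8 first
    arena.buf[a..a + 3].copy_from_slice(b"abc");
    arena.buf[b..b + 8].copy_from_slice(&42u64.to_le_bytes());
    println!("a = {a}, b = {b}, used = {}", arena.offset);
    arena.reset();
    println!("used after reset = {}", arena.offset);
}
```

Allocation is one addition and one bounds check, and reset is a single store, which is why this pattern suits workloads that free everything together.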
✨ v0.5.0 Highlights
- 🚀 Fast Reset API: Arena checkpoint functionality for frame-based patterns
- 🛡️ Memory Safety: Debug guards and use-after-rewind detection
- 💾 Virtual Memory: Reserve/commit pattern using VirtualAlloc/mmap for large allocations
- 🧵 Thread-Local Caching: Per-thread allocation buffers for reduced contention
- ⚡ Lock-Free Optimizations: Atomic operations for better concurrent performance
- 🔄 100% Backward Compatible: Drop-in upgrade from v0.4.x
Installation
arena-b is published on crates.io.
Add it to your Cargo.toml:
[dependencies]
arena-b = "0.5"
Or, using cargo add:
cargo add arena-b
Feature Flags
arena-b provides several optional features that can be enabled for specific use cases:
[dependencies]
arena-b = { version = "0.5", features = ["debug", "virtual_memory", "thread_local", "lockfree"] }
Available features:
- debug (default: disabled): Memory safety debugging with guards and use-after-rewind detection
- virtual_memory (default: disabled): Virtual memory strategy for large arena allocations
- thread_local (default: disabled): Per-thread allocation buffers for reduced contention
- lockfree (default: disabled): Lock-free optimizations for better concurrent performance
- stats (default: enabled): Per-allocation statistics tracking
Using a local checkout (for contributors)
If you are hacking on arena-b itself, depend on it via a local path:
[dependencies]
arena-b = { path = "../bumper" }
Getting started
The simplest way to use arena-b is with an arena:
use arena_b::Arena;
🚀 v0.5.0: Fast Reset API
Use the new checkpoint functionality for frame-based allocation patterns:
use arena_b::Arena;
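The checkpoint pattern itself can be sketched without the crate. The example below is a toy model, assuming a checkpoint is nothing more than the arena's length at the time it was taken; `FrameArena`, `Checkpoint`, `checkpoint`, and `rewind` are hypothetical names for this illustration and may not match arena-b's API.

```rust
/// Toy arena storing u32 values in a Vec; illustration only.
struct FrameArena {
    items: Vec<u32>,
}

/// Hypothetical checkpoint: the arena length when it was taken.
#[derive(Clone, Copy)]
struct Checkpoint(usize);

impl FrameArena {
    fn new() -> Self {
        FrameArena { items: Vec::new() }
    }

    fn alloc(&mut self, value: u32) -> usize {
        self.items.push(value);
        self.items.len() - 1
    }

    fn checkpoint(&self) -> Checkpoint {
        Checkpoint(self.items.len())
    }

    /// Discard everything allocated after `cp` in one step.
    /// The backing capacity is kept, so the next frame reuses it.
    fn rewind(&mut self, cp: Checkpoint) {
        self.items.truncate(cp.0);
    }
}

fn main() {
    let mut arena = FrameArena::new();
    arena.alloc(1); // long-lived data allocated before the frame loop
    let cp = arena.checkpoint();

    for frame in 0..3 {
        for i in 0..100 {
            arena.alloc(frame * 100 + i); // per-frame scratch data
        }
        arena.rewind(cp); // end of frame: drop the scratch data at once
    }

    println!("live = {}", arena.items.len());
}
```

Only the data allocated before the checkpoint survives each frame, and no per-item free is ever issued.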
🛡️ v0.5.0: Memory Safety Debugging
Enable debug mode for memory safety validation:
use arena_b::Arena;
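One common way to detect use-after-rewind is a generation counter: every handle remembers the generation it was created in, and a rewind bumps the counter. The sketch below shows that general technique only; it is an assumption about how such a guard could work, not a description of arena-b's debug feature, and all names in it are hypothetical.

```rust
/// Toy arena with a generation counter; illustration only.
struct GuardedArena {
    items: Vec<u64>,
    generation: u32,
}

/// Handle that remembers which generation it was allocated in.
#[derive(Clone, Copy, Debug)]
struct Handle {
    index: usize,
    generation: u32,
}

#[derive(Debug, PartialEq)]
enum AccessError {
    UseAfterRewind,
}

impl GuardedArena {
    fn new() -> Self {
        GuardedArena { items: Vec::new(), generation: 0 }
    }

    fn alloc(&mut self, value: u64) -> Handle {
        self.items.push(value);
        Handle { index: self.items.len() - 1, generation: self.generation }
    }

    /// Rewind to empty and invalidate every outstanding handle.
    fn rewind(&mut self) {
        self.items.clear();
        self.generation += 1;
    }

    /// Reject handles that were created before the last rewind.
    fn get(&self, handle: Handle) -> Result<u64, AccessError> {
        if handle.generation != self.generation {
            return Err(AccessError::UseAfterRewind);
        }
        Ok(self.items[handle.index])
    }
}

fn main() {
    let mut arena = GuardedArena::new();
    let stale = arena.alloc(7);
    println!("{:?}", arena.get(stale)); // Ok(7)

    arena.rewind();
    let fresh = arena.alloc(9);
    println!("{:?}", arena.get(stale)); // Err(UseAfterRewind)
    println!("{:?}", arena.get(fresh)); // Ok(9)
}
```

Without the generation check, the stale handle would silently read the value stored by the later allocation at the same index.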
💾 v0.5.0: Virtual Memory Strategy
Use virtual memory for large arena allocations:
use arena_b::Arena;
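The reserve/commit idea is to set aside a large address range up front and back it with memory only as the bump cursor advances. The sketch below models just the bookkeeping in plain Rust, with no OS calls; `ReservedRegion` is a hypothetical type and the 4096-byte page size is an assumption for the example.

```rust
const PAGE: usize = 4096; // assumed page size for this sketch

/// Bookkeeping model of a reserve/commit region; makes no OS calls.
struct ReservedRegion {
    reserved: usize,  // address space set aside up front
    committed: usize, // portion that would be backed by memory so far
    used: usize,      // bump cursor
}

fn round_up_to_page(n: usize) -> usize {
    (n + PAGE - 1) / PAGE * PAGE
}

impl ReservedRegion {
    fn reserve(bytes: usize) -> Self {
        ReservedRegion { reserved: round_up_to_page(bytes), committed: 0, used: 0 }
    }

    fn alloc(&mut self, size: usize) -> Option<usize> {
        let end = self.used.checked_add(size)?;
        if end > self.reserved {
            return None; // the reservation itself is exhausted
        }
        if end > self.committed {
            // A real implementation would commit pages through the OS here,
            // for example VirtualAlloc with MEM_COMMIT on Windows or
            // mprotect on a PROT_NONE mapping on Unix.
            self.committed = round_up_to_page(end);
        }
        let start = self.used;
        self.used = end;
        Some(start)
    }
}

fn main() {
    let mut region = ReservedRegion::reserve(1 << 20); // reserve 1 MiB
    region.alloc(100).unwrap();
    println!("committed after 100 B: {}", region.committed);
    region.alloc(5000).unwrap();
    println!("committed after 5100 B: {}", region.committed);
}
```

The reserved size bounds how far the arena can grow without moving, while the committed size tracks what the process actually pays for.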
🧵 v0.5.0: Thread-Local Caching
Enable thread-local caching for reduced contention:
use arena_b::Arena;
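The reason per-thread buffers reduce contention can be shown with a small std-only model: each thread claims a whole block from a shared atomic cursor, then bumps inside that block without touching shared state. Everything here (`BLOCK`, `SHARED_CURSOR`, `REFILLS`, `alloc`) is a hypothetical sketch of the general technique, not arena-b's thread_local feature.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

const BLOCK: usize = 1024; // bytes each thread claims per refill (assumed)

static SHARED_CURSOR: AtomicUsize = AtomicUsize::new(0);
static REFILLS: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // (next free offset, end of this thread's current block)
    static LOCAL: Cell<(usize, usize)> = Cell::new((0, 0));
}

/// Returns an offset in a shared address range.
/// The shared cursor is touched only when the local block runs out.
fn alloc(size: usize) -> usize {
    assert!(size <= BLOCK);
    LOCAL.with(|local| {
        let (mut next, mut end) = local.get();
        if next + size > end {
            // Slow path: claim a fresh block from the shared cursor.
            next = SHARED_CURSOR.fetch_add(BLOCK, Ordering::Relaxed);
            end = next + BLOCK;
            REFILLS.fetch_add(1, Ordering::Relaxed);
        }
        // Fast path: a plain, uncontended bump.
        local.set((next + size, end));
        next
    })
}

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..64 {
                    alloc(16); // 64 * 16 bytes fills exactly one block
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // 256 allocations, but only one shared atomic operation per thread.
    println!("shared refills: {}", REFILLS.load(Ordering::Relaxed));
}
```

The trade-off is that part of a block may be left unused when a thread exits or when a request does not fit in what remains of the current block.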
⚡ v0.5.0: Lock-Free Optimizations
Enable lock-free operations for better concurrent performance:
use arena_b::Arena;
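A lock-free bump allocation is typically a compare-and-swap loop on the cursor. The sketch below shows that loop with std atomics; `AtomicBump` is a hypothetical type for illustration and makes no claim about how arena-b's lockfree feature is implemented.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Toy lock-free bump cursor; hands out offsets, not pointers.
struct AtomicBump {
    cursor: AtomicUsize,
    capacity: usize,
}

impl AtomicBump {
    fn new(capacity: usize) -> Self {
        AtomicBump { cursor: AtomicUsize::new(0), capacity }
    }

    /// Claim `size` bytes with a compare-and-swap loop instead of a mutex.
    fn alloc(&self, size: usize) -> Option<usize> {
        let mut current = self.cursor.load(Ordering::Relaxed);
        loop {
            let end = current.checked_add(size)?;
            if end > self.capacity {
                return None;
            }
            match self.cursor.compare_exchange_weak(
                current,
                end,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(current),
                // Another thread moved the cursor; retry from its value.
                Err(observed) => current = observed,
            }
        }
    }
}

fn main() {
    let arena = Arc::new(AtomicBump::new(4 * 100 * 8));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let arena = Arc::clone(&arena);
            thread::spawn(move || {
                (0..100).filter_map(|_| arena.alloc(8)).collect::<Vec<usize>>()
            })
        })
        .collect();

    let mut offsets: Vec<usize> = handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect();
    offsets.sort_unstable();
    offsets.dedup();
    // Every allocation received its own, non-overlapping offset.
    println!("distinct offsets: {}", offsets.len());
}
```

A failed exchange costs only a retry, so threads never block each other, although heavy contention on a single cursor can still cause repeated retries.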
Scoped allocations
Use Arena::scope to allocate many temporary values and free them all at once:
use arena_b::Arena;
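A scope can be modelled as a closure that runs between "remember the cursor" and "restore the cursor". The toy below assumes that shape; `ScopedArena` and the signature of its `scope` method are hypothetical and may differ from Arena::scope in arena-b.

```rust
/// Toy arena of strings; illustration only.
struct ScopedArena {
    items: Vec<String>,
}

impl ScopedArena {
    fn new() -> Self {
        ScopedArena { items: Vec::new() }
    }

    fn alloc(&mut self, s: &str) -> usize {
        self.items.push(s.to_owned());
        self.items.len() - 1
    }

    /// Run `f`, then free everything it allocated in one step.
    /// Only the closure's return value escapes the scope.
    fn scope<R>(&mut self, f: impl FnOnce(&mut Self) -> R) -> R {
        let mark = self.items.len();
        let result = f(self);
        self.items.truncate(mark);
        result
    }
}

fn main() {
    let mut arena = ScopedArena::new();
    arena.alloc("persistent");

    let total_len = arena.scope(|a| {
        a.alloc("tmp-1");
        a.alloc("tmp-22");
        a.items.iter().map(|s| s.len()).sum::<usize>()
    });

    println!("total_len = {total_len}, live = {}", arena.items.len());
}
```

Because the closure can return a computed value but the temporaries are truncated afterwards, results survive while scratch data does not.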
Pool allocator
Use Pool<T> when you have many values of the same type that are reused:
use arena_b::Pool;
// `name` is returned to the pool on drop
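The return-on-drop behaviour can be illustrated with a small free-list pool. `ToyPool` and `ToyPooled` below are hypothetical stand-ins written for this sketch; they are single-threaded, rely on `Default` to create new values, and are not the crate's Pool<T> or Pooled<T>.

```rust
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::rc::Rc;

/// Toy pool: a shared free list of reusable values.
struct ToyPool<T> {
    free: Rc<RefCell<Vec<T>>>,
}

/// RAII wrapper that hands its value back to the free list on drop.
struct ToyPooled<T> {
    value: Option<T>,
    free: Rc<RefCell<Vec<T>>>,
}

impl<T: Default> ToyPool<T> {
    fn new() -> Self {
        ToyPool { free: Rc::new(RefCell::new(Vec::new())) }
    }

    /// Reuse an idle value if one exists, otherwise create a new one.
    fn get(&self) -> ToyPooled<T> {
        let value = self.free.borrow_mut().pop().unwrap_or_default();
        ToyPooled { value: Some(value), free: Rc::clone(&self.free) }
    }

    fn idle(&self) -> usize {
        self.free.borrow().len()
    }
}

impl<T> Deref for ToyPooled<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.value.as_ref().expect("value is present until drop")
    }
}

impl<T> DerefMut for ToyPooled<T> {
    fn deref_mut(&mut self) -> &mut T {
        self.value.as_mut().expect("value is present until drop")
    }
}

impl<T> Drop for ToyPooled<T> {
    fn drop(&mut self) {
        if let Some(value) = self.value.take() {
            self.free.borrow_mut().push(value);
        }
    }
}

fn main() {
    let pool: ToyPool<String> = ToyPool::new();
    {
        let mut name = pool.get();
        name.push_str("Alice");
        println!("in use: {}", *name);
    } // `name` goes back to the free list here
    println!("idle slots: {}", pool.idle());

    let mut reused = pool.get();
    reused.clear(); // this toy does not reset values; the caller does
    println!("idle slots while reused: {}", pool.idle());
}
```

In this toy the recycled String keeps its old contents and heap capacity, so callers reset it before use; the capacity reuse is where the savings for repeated same-type allocations come from.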
Features
- Bump arena (Arena)
  - alloc, alloc_default
  - alloc_slice_copy, alloc_slice_uninit, alloc_str
  - Multi-chunk growth when the arena is full
  - scope API for scoped allocations with automatic reclamation
  - reset, stats, and bytes_allocated
  - NEW v0.5.0: Fast reset API with checkpoints
  - NEW v0.5.0: Memory safety debugging
  - NEW v0.5.0: Virtual memory strategy
  - NEW v0.5.0: Thread-local caching
  - NEW v0.5.0: Lock-free optimizations
- Configurable arenas (ArenaBuilder)
  - Control initial_capacity
  - Hooks for future chunk_size and thread_safe configuration
- Pool allocator (Pool<T>)
  - Slot-based allocator for many values of the same type
  - Pooled<T> RAII wrapper that returns slots to the pool on drop
  - PoolStats for capacity and usage information
- Thread-safe wrapper (SyncArena)
  - Wraps Arena in a Mutex for multi-threaded use
  - Safe to share via Arc<SyncArena> across threads
- Feature flags
  - debug: Memory safety debugging with guards and use-after-rewind detection
  - virtual_memory: Virtual memory strategy for large allocations
  - thread_local: Per-thread allocation buffers
  - lockfree: Lock-free optimizations
  - stats: Per-allocation statistics (enabled by default)
- Tooling and quality
  - Criterion benchmarks comparing Arena, Pool, Box, and Vec
  - Property-based tests using proptest
  - Comprehensive test suite for all v0.5.0 features
  - GitHub Actions CI: fmt, clippy, tests, docs, and a short bench
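The "arena in a Mutex, shared through an Arc" arrangement listed above for SyncArena can be sketched with std types alone. `ToyArena` and `ToySyncArena` are hypothetical names for this illustration; the real SyncArena's methods and internals are not shown here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Single-threaded toy arena; illustration only.
struct ToyArena {
    items: Vec<u32>,
}

/// Wrapper that serialises access to the arena through a mutex.
struct ToySyncArena {
    inner: Mutex<ToyArena>,
}

impl ToySyncArena {
    fn new() -> Self {
        ToySyncArena { inner: Mutex::new(ToyArena { items: Vec::new() }) }
    }

    /// Takes `&self`, so many threads can call it through an `Arc`.
    fn alloc(&self, value: u32) -> usize {
        let mut arena = self.inner.lock().expect("arena mutex poisoned");
        arena.items.push(value);
        arena.items.len() - 1
    }

    fn len(&self) -> usize {
        self.inner.lock().expect("arena mutex poisoned").items.len()
    }
}

fn main() {
    let arena = Arc::new(ToySyncArena::new());
    let workers: Vec<_> = (0..4u32)
        .map(|t| {
            let arena = Arc::clone(&arena);
            thread::spawn(move || {
                for i in 0..250 {
                    arena.alloc(t * 1000 + i);
                }
            })
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    println!("allocated = {}", arena.len());
}
```

Every allocation takes the lock, which is simple and safe but serialises the threads; that cost is what the thread-local and lock-free options aim to reduce.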
Documentation
See the docs/ directory:
- docs/guide.md – Getting started with Arena, Pool, and SyncArena.
- docs/strategies.md – When to use an arena vs a pool.
- docs/advanced.md – Configuration, stats feature, thread safety, and benchmarking.
- docs/architecture.md – Internal design, invariants, and unsafe code strategy.
Examples
Real-world inspired examples are in examples/:
- examples/parser_expr.rs – Expression parser building an AST in an arena.
- examples/game_loop.rs – Per-frame allocations in a game loop using scopes.
- examples/graph_pool.rs – Graph traversal using a pool allocator.
- examples/string_intern.rs – String interning backed by an arena.
- NEW v0.5.0: examples/v0.5_features.rs – Demonstrates all new v0.5.0 features
- NEW v0.5.0: examples/virtual_memory_demo.rs – Virtual memory usage example
- NEW v0.5.0: examples/debug_safety.rs – Memory safety debugging example
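Run an example with cargo run --example NAME, where NAME is the example's file name without the .rs extension, for instance:
cargo run --example parser_expr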
Performance snapshot
Version 0.5.0 - Advanced Features & Performance:
The 0.5.0 release builds on the performance optimizations of v0.3.0 and adds advanced features for production use:
Key Performance Improvements
- Fast Reset API: Checkpoint-based bulk deallocation (10-100x faster than individual frees)
- Thread-Local Caching: Per-thread buffers reduce atomic contention (20-40% improvement in multi-threaded scenarios)
- Lock-Free Operations: Atomic operations for better concurrent performance
- Virtual Memory: Reserve/commit pattern for large allocations (reduced memory pressure)
- Memory Safety: Debug guards with minimal overhead when disabled
v0.5.0 Feature Performance
| Feature | Performance Impact | Use Case |
|---|---|---|
| Fast Reset API | 10-100x faster bulk deallocation | Frame-based allocation patterns |
| Thread-Local Cache | 20-40% improvement in contention | Multi-threaded scenarios |
| Lock-Free Ops | 15-25% better concurrent performance | High-contention workloads |
| Virtual Memory | Reduced memory pressure | Large arena allocations |
| Debug Safety | <5% overhead when disabled | Development and testing |
Previous Performance Improvements (v0.3.0)
- Lock-Free Atomic Operations: Lock-free allocation fast-path with compare-and-swap operations for better concurrent performance
- Advanced Memory Pooling: Size-class based memory pooling for small objects (8-4096 bytes) reduces allocation overhead
- SIMD Acceleration: AVX2-optimized vectorized memory operations with prefetching for large data copies
- Cache-Friendly Design: 64-byte cache-line aligned structures throughout to reduce false sharing
- Hardware Prefetching: Intelligent memory prefetching for better cache utilization
- Specialized Fast Paths: Dedicated allocation functions for common types (u8, u32, u64)
Benchmark Results
Small Object Performance:
- Small object allocation: 2-3x faster than standard allocators
- Memory pool efficiency: 40-60% faster for repeated small allocations
- Concurrent patterns: 35% improvement with scope-based allocation
- v0.5.0 thread-local: Additional 20-40% improvement in multi-threaded scenarios
Large Data Operations:
- Large slice copies: Up to 3x faster for 16KB+ arrays using SIMD
- Vectorized operations: 256-bit AVX2 throughput optimization
- Prefetching benefits: 15-25% improvement in cache-bound workloads
- v0.5.0 virtual memory: Reduced memory pressure for large allocations
Mixed Workloads:
- Realistic allocation patterns: 50-70% overall performance improvement
- Memory efficiency: Reduced fragmentation and better locality
- Zero-overhead stats: No performance impact when disabled
- v0.5.0 fast reset: 10-100x faster bulk deallocation patterns
Performance Comparison
| Operation | v0.2.0 | v0.3.0 | v0.5.0 | Improvement (v0.2.0 to v0.5.0) |
|---|---|---|---|---|
| Small object alloc | 52µs | 18µs | 16µs | 3.3x |
| SIMD copy (16KB) | 385ns | 105ns | 100ns | 3.9x |
| Mixed workload | 62µs | 28µs | 24µs | 2.6x |
| Scope reuse | 1.3µs | 0.74µs | 0.70µs | 1.9x |
| Memory pool | 287µs | 125µs | 115µs | 2.5x |
| Fast reset | N/A | N/A | 0.02µs | ~100x vs individual frees |
Benchmarks are in benches/arena_vs_box.rs, benches/optimization_benchmarks.rs, and benches/advanced_benchmarks.rs and use Criterion. Run cargo bench to compare Arena, Pool, Box, and Vec on your hardware, with and without stats.
Rendering / game engine use case
In a renderer or game engine you often allocate a lot of temporary data per frame (transforms, scratch buffers, intermediate results) and then throw it away.
Using Arena::scope for per-frame scratch data lets you:
- Allocate many small objects per frame with very cheap pointer bumps.
- Free everything from that frame in one shot at the end of the scope.
- Avoid thousands of tiny heap allocations and deallocations every frame.
- Reduce heap fragmentation, which can cause random frame-time spikes.
NEW v0.5.0: Use the fast reset API for even more efficient frame-based allocation:
use arena_b::Arena;
The end result is more stable and predictable frame times, which translates into smoother rendering and fewer stutters, especially on long-running scenes.
Status
- Implemented:
  - Bump Arena with multi-chunk support and scopes
  - Pool<T> allocator with RAII Pooled<T> and PoolStats
  - SyncArena for thread-safe use
  - ArenaBuilder and stats feature
  - Benchmarks, tests, CI, and docs
  - NEW in v0.2.0: Cache-optimized memory layout and SIMD acceleration
  - NEW in v0.2.0: Advanced chunk management and allocation fast-path optimizations
  - NEW in v0.3.0: Lock-free atomic operations for better concurrent performance
  - NEW in v0.3.0: Advanced memory pooling with size classes
  - NEW in v0.3.0: SIMD optimizations with prefetching
  - NEW in v0.3.0: Specialized allocation functions for common types
  - NEW in v0.3.0: Cache-friendly design with 64-byte alignment
  - NEW in v0.5.0: Fast reset API with checkpoints
  - NEW in v0.5.0: Memory safety debugging with guards
  - NEW in v0.5.0: Virtual memory strategy
  - NEW in v0.5.0: Thread-local caching
  - NEW in v0.5.0: Lock-free optimizations
  - NEW in v0.5.0: Comprehensive test suite for all features
- Planned (for future releases):
  - Allocation coalescing and defragmentation
  - Slab allocator with multiple size classes
  - More advanced debugging and visualization helpers
  - no_std support and async-friendly integrations
  - ARM NEON optimizations for slice copies
arena-b aims to be a fast, ergonomic Rust arena allocator and memory pool library that feels native to Rust while offering production-grade safety and documentation.
License
Licensed under either of:
- MIT license
- Apache License, Version 2.0
at your option.
See the LICENSE file for details.