Batchy
Transparently batch concurrent requests into efficient bulk operations.
Batchy merges multiple concurrent requests into larger batches and forwards each batch to a single processing call. It suits ML inference, database queries, API calls, and any other workload where batching improves throughput.
Features
- Transparent batching: Callers submit single requests and receive single responses - batching is invisible at the call site
- Configurable strategy: Control max batch size, queue size, and wait time
- Backpressure: Built-in queue limits prevent memory exhaustion under high load
- Error handling: Batch-level errors are delivered to all affected callers
- Async-native: Built on tokio for high-performance async workloads
Quick Start
[dependencies]
batchy = "0.1"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
Usage
use ;
use tokio;
async
Configuration
| Field | Default | Description |
|---|---|---|
| max_batch | 32 | Maximum number of requests merged into one processing call |
| queue_size | 128 | Size of the internal request queue (backpressure when full) |
| max_wait_ms | 50 | Maximum time, in milliseconds, to wait for a batch to fill under low load |
use BatcherConfig;
// Using builder pattern
let config = default
.max_batch
.max_wait_ms
.build
.unwrap;
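To illustrate what queue_size means for backpressure, here is a minimal sketch of a bounded request queue. It is not Batchy's implementation: it uses the standard library's `sync_channel` rather than tokio, and `SketchConfig`, `bounded_queue`, and `try_submit` are hypothetical names invented for this example. Whether Batchy makes callers wait or rejects requests when the queue is full is not stated here; the sketch assumes rejection so the limit is easy to observe.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical stand-in for the three settings in the table above.
/// This is NOT Batchy's own config type.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchConfig {
    pub max_batch: usize,
    pub queue_size: usize,
    pub max_wait_ms: u64,
}

impl Default for SketchConfig {
    fn default() -> Self {
        // Default values copied from the configuration table.
        SketchConfig { max_batch: 32, queue_size: 128, max_wait_ms: 50 }
    }
}

/// Creates a bounded queue that buffers at most `queue_size` pending requests.
pub fn bounded_queue<T>(cfg: &SketchConfig) -> (SyncSender<T>, Receiver<T>) {
    sync_channel(cfg.queue_size)
}

/// Tries to enqueue without blocking. If the queue is full (or the worker
/// side has gone away), the request is handed back to the caller.
pub fn try_submit<T>(tx: &SyncSender<T>, req: T) -> Result<(), T> {
    match tx.try_send(req) {
        Ok(()) => Ok(()),
        Err(TrySendError::Full(r)) | Err(TrySendError::Disconnected(r)) => Err(r),
    }
}
```

Because the buffer is bounded, memory use under load is capped by queue_size times the size of one request, regardless of how fast callers submit.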
Batching Strategy
- First request arrives, worker starts timer
- Worker accumulates requests until:
  - max_batch is reached (immediate processing), OR
  - max_wait_ms expires (process what we have)
- Processing call executes with accumulated batch
- Results are fanned back to respective callers
Under high load: batches fill to max_batch for maximum throughput.
Under low load: each request waits at most max_wait_ms for bounded latency.
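The accumulation steps above can be sketched as a small collection loop. This is an illustration under stated assumptions, not Batchy's worker: it uses a blocking `std::sync::mpsc` channel instead of tokio, and `collect_batch` is a hypothetical helper name. The timer starts when the first request arrives, and the loop stops as soon as either limit is hit.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Collects one batch from `rx` (hypothetical helper, not part of Batchy).
/// Returns `None` once every sender is gone and no request is left.
pub fn collect_batch<T>(
    rx: &Receiver<T>,
    max_batch: usize,
    max_wait: Duration,
) -> Option<Vec<T>> {
    // Block until the first request arrives, then start the timer.
    let first = rx.recv().ok()?;
    let deadline = Instant::now() + max_wait;
    let mut batch = vec![first];

    // Accumulate until the batch is full or the timer expires.
    while batch.len() < max_batch {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(req) => batch.push(req),
            // Timer expired: process what we have.
            Err(RecvTimeoutError::Timeout) => break,
            // All senders dropped: flush what we have.
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    Some(batch)
}
```

With a backlog of requests the loop exits on the size limit without ever waiting; with a trickle of requests it exits on the deadline, which is what bounds per-request latency.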
Error Handling
The processor returns Result<Vec<Res>, E>:
- Err(e): The entire batch failed; every caller in that batch receives a clone of e
- Ok(results): One result per input, in the same order
let batcher = new;
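The two outcomes described in this section can be sketched as a fan-out step that turns one batch-level result into one result per caller. `fan_out` is a hypothetical helper written for this illustration, not a function exported by Batchy. It assumes that a successful batch returns exactly one result per input, and it does not handle a length mismatch.

```rust
/// Turns one batch-level outcome into one outcome per caller
/// (hypothetical helper, not part of Batchy's API).
/// `batch_len` is the number of requests that were in the batch.
pub fn fan_out<Res, E: Clone>(
    outcome: Result<Vec<Res>, E>,
    batch_len: usize,
) -> Vec<Result<Res, E>> {
    match outcome {
        // Success: result i belongs to the caller that submitted input i.
        Ok(results) => results.into_iter().map(Ok).collect(),
        // Failure: every caller in the batch gets its own clone of the error.
        Err(e) => (0..batch_len).map(|_| Err(e.clone())).collect(),
    }
}
```

The `E: Clone` bound is what allows a single batch error to reach several callers, which is why an error type used this way needs to be cloneable.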
CPU-Heavy Work
For CPU-intensive operations like ML inference, wrap your work in spawn_blocking:
let batcher = new;
Use Cases
- ML Inference: Batch multiple inputs for GPU efficiency
- Database Queries: Combine individual queries into bulk operations
- API Calls: Aggregate requests to respect rate limits
- File I/O: Batch disk writes for better throughput
License
MIT