Batchy
Transparently batch concurrent requests into efficient bulk operations.
Batchy merges multiple concurrent requests into larger batches, forwarding them to a single processing call. It's perfect for ML inference, database queries, API calls, or any scenario where batching improves throughput.
Features
- Transparent batching: Callers submit single requests and receive single responses - batching is invisible at the call site
- Configurable strategy: Control max batch size, queue size, and wait time
- Backpressure: Built-in queue limits prevent memory exhaustion under high load
- Error handling: Batch-level errors are delivered to all affected callers
- Two modes: Async
Batcherfor async workloads,SyncBatcherfor sync workloads with thread-local resources
Quick Start
[]
= "0.1"
= { = "1", = ["rt-multi-thread", "macros"] }
Async Batcher
For async workloads:
use ;
async
CPU-Heavy Async Work
For CPU-intensive async operations, wrap in spawn_blocking:
let batcher = new;
Synchronous Batcher
For sync workloads that need thread-local resources (like fastembed's TextEmbedding):
use ;
async
This avoids spawn_blocking overhead and keeps thread-local state alive on the worker thread.
Configuration
| Field | Default | Description |
|---|---|---|
max_batch |
32 | Maximum requests merged into one processing call |
queue_size |
128 | Size of the internal request queue (backpressure when full) |
max_wait_ms |
50 | Maximum wait time for batch to fill under low load |
use BatcherConfig;
use BatcherConfigBuilder;
// Using builder pattern
let config = default
.max_batch
.max_wait_ms
.build
.unwrap;
Batching Strategy
- First request arrives, worker starts timer
- Worker accumulates requests until:
max_batchis reached (immediate processing), ORmax_wait_msexpires (process what we have)
- Processing call executes with accumulated batch
- Results are fanned back to respective callers
Under high load: batches fill to max_batch for maximum throughput.
Under low load: each request waits at most max_wait_ms for bounded latency.
Error Handling
The processor returns Result<Vec<Res>, E>:
Err(e): The entire batch failed - every caller in that batch receives a clone ofeOk(results): One result per input, in the same order
let batcher = new;
Use Cases
- ML Inference: Batch multiple inputs for GPU efficiency (
BatcherorSyncBatcher) - Database Queries: Combine individual queries into bulk operations (
Batcher) - API Calls: Aggregate requests to respect rate limits (
Batcher) - File I/O: Batch disk writes for better throughput (
SyncBatcher) - Embedding Models: Thread-local model instances with sync processing (
SyncBatcher)
License
MIT