Adaptive batch prediction for geospatial ML inference
This module provides:
- PredictionRequest/PredictionResult: typed request/response pairs
- AdaptiveBatcher: adjusts batch size based on observed latency so that throughput is maximised while keeping per-batch latency near a configurable target.
§Algorithm
After each batch completes, the batcher computes a rolling average of the
observed latency over the last N observations and compares it to target_latency_ms:
- Too slow (avg > target): shrink towards min_batch_size
- Too fast (avg < target): grow towards max_batch_size
The magnitude of each adjustment is controlled by adaptation_rate.
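The adjustment loop described above can be sketched as follows. This is an illustration only, not the crate's implementation: `SketchConfig`, `SketchBatcher`, `record_latency`, and the `window` field are hypothetical names, and the sketch assumes that adaptation_rate is the fraction of the remaining distance to the bound covered in each step.

```rust
use std::collections::VecDeque;

/// Hypothetical configuration for this sketch (not the crate's `AdaptiveBatchConfig`).
#[derive(Debug, Clone)]
pub struct SketchConfig {
    pub min_batch_size: usize,
    pub max_batch_size: usize,
    pub target_latency_ms: f64,
    /// Assumed meaning: fraction of the remaining distance to the bound
    /// covered per adjustment, expected in (0.0, 1.0].
    pub adaptation_rate: f64,
    /// N, the number of recent latency observations in the rolling average.
    pub window: usize,
}

/// Hypothetical controller that tracks recent latencies and the current batch size.
pub struct SketchBatcher {
    config: SketchConfig,
    batch_size: usize,
    latencies: VecDeque<f64>,
}

impl SketchBatcher {
    /// Assumes `min_batch_size <= max_batch_size`; `clamp` panics otherwise.
    pub fn new(config: SketchConfig, initial_batch_size: usize) -> Self {
        let batch_size =
            initial_batch_size.clamp(config.min_batch_size, config.max_batch_size);
        Self { config, batch_size, latencies: VecDeque::new() }
    }

    pub fn batch_size(&self) -> usize {
        self.batch_size
    }

    /// Records one batch latency and returns the batch size to use next.
    pub fn record_latency(&mut self, latency_ms: f64) -> usize {
        let cfg = &self.config;

        // Keep at most N observations (at least one, so the average is defined).
        while self.latencies.len() >= cfg.window.max(1) {
            self.latencies.pop_front();
        }
        self.latencies.push_back(latency_ms);
        let avg = self.latencies.iter().sum::<f64>() / self.latencies.len() as f64;

        // Too slow: move towards the minimum. Too fast: move towards the maximum.
        let current = self.batch_size as f64;
        let bound = if avg > cfg.target_latency_ms {
            cfg.min_batch_size as f64
        } else if avg < cfg.target_latency_ms {
            cfg.max_batch_size as f64
        } else {
            current
        };

        // Step a fraction of the remaining distance, then round and clamp.
        let next = current + cfg.adaptation_rate * (bound - current);
        self.batch_size =
            (next.round() as usize).clamp(cfg.min_batch_size, cfg.max_batch_size);
        self.batch_size
    }
}
```

A caller would invoke `record_latency` once per completed batch and use the returned value as the size of the next batch. Because the result is rounded to a whole number, a small adaptation_rate can leave the size stuck one step short of a bound; a production controller might enforce a minimum step of one.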
Structs§
- AdaptiveBatchConfig - Adaptive batch sizing configuration
- AdaptiveBatcher - Adaptive batch size controller
- PredictionRequest - Single prediction request carrying raw float tensors.
- PredictionResult - Single prediction result produced by an inference run.