pub struct BatchConfig {
pub max_batch_size: usize,
pub max_latency: Duration,
pub queue_capacity: Option<usize>,
pub response_timeout: Option<Duration>,
pub max_in_flight_per_feed: usize,
pub startup_timeout: Option<Duration>,
}
Configuration for a batch coordinator.
Controls batch formation: how many items accumulate before dispatch and how long to wait for a full batch.
§Tradeoffs
max_batch_size: Larger batches improve throughput (better GPU utilization) but increase per-frame latency because each frame waits for the batch to fill.
max_latency: Lower values reduce worst-case latency for partial batches but may dispatch smaller, less efficient batches.
Reasonable starting points for multi-feed inference:
max_batch_size: 4–16 (depends on GPU memory / model size)
max_latency: 20–100ms (depends on frame rate / latency tolerance)
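As a rough sanity check on these starting points, one can estimate how many frames arrive across all feeds during a single max_latency window. If that estimate is well below max_batch_size, most batches will likely be dispatched partial. The helper below is a hypothetical back-of-the-envelope sketch, not part of this crate; the function name and its parameters are assumptions made for illustration.

```rust
use std::time::Duration;

/// Hypothetical helper: estimate how many frames arrive across all feeds
/// during one `max_latency` window, assuming evenly paced frames.
fn expected_items_per_window(feeds: u32, fps_per_feed: f64, max_latency: Duration) -> f64 {
    f64::from(feeds) * fps_per_feed * max_latency.as_secs_f64()
}
```

For example, 8 feeds at 25 fps with a 40ms window yield roughly 8 frames per window, so a max_batch_size near 8 would usually fill before the deadline.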
Fields§
max_batch_size: usize
Maximum items in a single batch.
When this many items accumulate, the batch is dispatched immediately without waiting for max_latency.
Must be ≥ 1.
max_latency: Duration
Maximum time to wait for a full batch before dispatching a partial one.
After the first item arrives, the coordinator waits up to this duration for more items. If the batch is still not full when the deadline expires, it is dispatched as-is.
Must be > 0.
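The interaction between max_batch_size and max_latency can be sketched with a standard-library channel. This is a minimal illustration of the documented behavior (the clock starts at the first item; a full batch dispatches immediately; a partial batch dispatches at the deadline). It is not the coordinator's actual implementation, and collect_batch is a hypothetical name.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Block for the first item, then gather more until the batch is full or
/// `max_latency` has elapsed since the first item arrived.
/// Returns `None` once all senders are gone and the queue is empty.
fn collect_batch<T>(
    rx: &Receiver<T>,
    max_batch_size: usize,
    max_latency: Duration,
) -> Option<Vec<T>> {
    // The latency clock starts only when the first item arrives.
    let first = rx.recv().ok()?;
    let deadline = Instant::now() + max_latency;

    let mut batch = Vec::with_capacity(max_batch_size);
    batch.push(first);

    while batch.len() < max_batch_size {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(item) => batch.push(item),
            // Deadline expired or all senders gone: dispatch the partial batch as-is.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    Some(batch)
}
```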
queue_capacity: Option<usize>
Submission queue capacity.
Controls how many pending items can be buffered before submit_and_wait returns BatchSubmitError::QueueFull.
Defaults to max_batch_size * 4 (minimum 4) when None.
When specified, must be ≥ max_batch_size.
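The default rule above can be written out as a small function. This is a sketch of the documented arithmetic only; effective_queue_capacity is a hypothetical name and the crate may compute the value differently internally.

```rust
/// Hypothetical helper mirroring the documented default:
/// `max_batch_size * 4`, but never less than 4, when no capacity is given.
fn effective_queue_capacity(max_batch_size: usize, queue_capacity: Option<usize>) -> usize {
    queue_capacity.unwrap_or_else(|| max_batch_size.saturating_mul(4).max(4))
}
```

With max_batch_size = 8 and no explicit capacity this gives 32. An explicit value is passed through unchanged here; the ≥ max_batch_size constraint is left to validation.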
response_timeout: Option<Duration>
Safety timeout added beyond max_latency when a feed thread waits for a batch response.
The total wait is max_latency + response_timeout. This bounds
how long a feed thread can block if the coordinator is wedged or
processing is severely delayed.
In practice, responses arrive within max_latency + processing_time.
This safety margin exists only to guarantee eventual unblocking.
Defaults to 5 seconds when None. Must be > 0 when specified.
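The upper bound on how long a feed thread blocks follows directly from the two durations. The sketch below only restates that sum; total_response_wait and DEFAULT_RESPONSE_TIMEOUT are hypothetical names chosen for illustration.

```rust
use std::time::Duration;

/// Documented default safety margin when `response_timeout` is `None`.
const DEFAULT_RESPONSE_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical helper: the longest a feed thread waits for a batch response.
fn total_response_wait(max_latency: Duration, response_timeout: Option<Duration>) -> Duration {
    max_latency + response_timeout.unwrap_or(DEFAULT_RESPONSE_TIMEOUT)
}
```

For a 50ms max_latency and the default margin, a feed thread unblocks after at most 5.05 seconds.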
max_in_flight_per_feed: usize
Maximum number of in-flight submissions allowed per feed.
An item is “in-flight” from the moment it enters the submission
queue until the coordinator routes its result back (or drains it
at shutdown). When a feed reaches this limit, further
submit_and_wait calls fail
immediately with BatchSubmitError::InFlightCapReached
rather than adding to the queue.
This prevents a feed from accumulating orphaned items in the
shared queue after timeouts: when submit_and_wait times out,
the item remains in-flight inside the coordinator. Without a
cap, the feed could immediately submit another frame, stacking
multiple items and crowding other feeds.
Default: 1 — each feed contributes at most one item to the shared queue at any time. Must be ≥ 1.
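One way to picture the cap is a per-feed counter that is incremented on submission and decremented only when the coordinator routes or drains the item, not when the submitter times out. The type below is a hypothetical sketch of that idea using an atomic counter; it is an assumption for illustration and does not describe the crate's internals.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical per-feed in-flight tracker.
struct FeedInFlight {
    current: AtomicUsize,
    max: usize,
}

impl FeedInFlight {
    fn new(max: usize) -> Self {
        Self { current: AtomicUsize::new(0), max }
    }

    /// Try to reserve a slot. `false` corresponds to the
    /// `InFlightCapReached` outcome described above.
    fn try_acquire(&self) -> bool {
        self.current
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                (n < self.max).then_some(n + 1)
            })
            .is_ok()
    }

    /// Called when the result is routed back or the item is drained,
    /// not when the submitting thread gives up waiting.
    fn release(&self) {
        self.current.fetch_sub(1, Ordering::AcqRel);
    }
}
```

With a cap of 1, a feed whose previous submission timed out cannot enqueue another frame until the coordinator has finished with the first one.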
startup_timeout: Option<Duration>
Maximum time to wait for BatchProcessor::on_start() to complete before returning an error.
GPU-backed processors (e.g. TensorRT engine compilation) may need significantly longer than CPU-only models. Set this to accommodate worst-case first-run warm-up on the target hardware.
Defaults to 30 seconds when None. Must be > 0 when specified.
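A bounded startup wait of this kind can be sketched by running the start hook on its own thread and waiting on a channel with a timeout. This is an illustrative pattern under stated assumptions, not the crate's implementation: run_with_startup_timeout is a hypothetical name, the closure stands in for on_start(), and errors are plain strings.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: run `on_start` on a worker thread and wait for it
/// for at most `startup_timeout` (30 seconds when `None`).
fn run_with_startup_timeout<F>(
    on_start: F,
    startup_timeout: Option<Duration>,
) -> Result<(), String>
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let timeout = startup_timeout.unwrap_or(Duration::from_secs(30));
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        // Ignore send errors: the receiver is gone if the caller already timed out.
        let _ = tx.send(on_start());
    });

    match rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(mpsc::RecvTimeoutError::Timeout) => Err("startup timed out".to_string()),
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err("startup thread exited without reporting".to_string())
        }
    }
}
```

Note that in this sketch a timed-out start hook keeps running in the background; the timeout only bounds how long the caller waits.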
Implementations§
impl BatchConfig
pub fn new(max_batch_size: usize, max_latency: Duration) -> Result<Self, ConfigError>
Create a validated batch configuration.
§Errors
Returns ConfigError::InvalidPolicy
if max_batch_size is 0 or max_latency is zero.
pub fn with_queue_capacity(self, capacity: Option<usize>) -> Self
Set the submission queue capacity.
When specified, must be ≥ max_batch_size. Pass None for the
default (max_batch_size * 4, minimum 4).
pub fn with_response_timeout(self, timeout: Option<Duration>) -> Self
Set the response safety timeout.
This is the safety margin added beyond max_latency when blocking
for a batch response. Pass None for the default (5 seconds).
Must be > 0 when specified.
pub fn with_max_in_flight_per_feed(self, max: usize) -> Self
Set the maximum number of in-flight submissions per feed.
Default is 1. Must be ≥ 1.
pub fn with_startup_timeout(self, timeout: Option<Duration>) -> Self
Set the maximum time to wait for on_start() to complete.
Pass None for the default (30 seconds). GPU-backed processors
(e.g. TensorRT engine build on first run) may need 2–5 minutes.
Must be > 0 when specified.
pub fn validate(&self) -> Result<(), ConfigError>
Validate all configuration fields.
Called internally by BatchCoordinator::start.
Also available for early validation before passing a config to the runtime.
§Errors
Returns ConfigError::InvalidPolicy
if any field violates its constraints.
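The per-field constraints listed on this page can be collected into a single check. The sketch below uses a local stand-in struct that mirrors the documented fields and returns string errors; BatchConfigSketch and check_constraints are hypothetical names, and the real validate may differ in ordering and error detail.

```rust
use std::time::Duration;

/// Local stand-in mirroring the documented fields; not the crate's own type.
struct BatchConfigSketch {
    max_batch_size: usize,
    max_latency: Duration,
    queue_capacity: Option<usize>,
    response_timeout: Option<Duration>,
    max_in_flight_per_feed: usize,
    startup_timeout: Option<Duration>,
}

/// Hypothetical check applying each documented constraint in turn.
fn check_constraints(cfg: &BatchConfigSketch) -> Result<(), String> {
    if cfg.max_batch_size == 0 {
        return Err("max_batch_size must be >= 1".into());
    }
    if cfg.max_latency.is_zero() {
        return Err("max_latency must be > 0".into());
    }
    if matches!(cfg.queue_capacity, Some(cap) if cap < cfg.max_batch_size) {
        return Err("queue_capacity must be >= max_batch_size".into());
    }
    if matches!(cfg.response_timeout, Some(t) if t.is_zero()) {
        return Err("response_timeout must be > 0 when specified".into());
    }
    if cfg.max_in_flight_per_feed == 0 {
        return Err("max_in_flight_per_feed must be >= 1".into());
    }
    if matches!(cfg.startup_timeout, Some(t) if t.is_zero()) {
        return Err("startup_timeout must be > 0 when specified".into());
    }
    Ok(())
}
```

Because the with_* builder methods return Self rather than a Result, out-of-range values set through them would only surface at a check like this one, which is one reason to call validate early.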
Trait Implementations§
impl Clone for BatchConfig
fn clone(&self) -> BatchConfig
fn clone_from(&mut self, source: &Self)
impl Debug for BatchConfig
Auto Trait Implementations§
impl Freeze for BatchConfig
impl RefUnwindSafe for BatchConfig
impl Send for BatchConfig
impl Sync for BatchConfig
impl Unpin for BatchConfig
impl UnsafeUnpin for BatchConfig
impl UnwindSafe for BatchConfig