pub struct BatchSizer {
    pub vram_utilization: f32,
}
Computes the maximum batch size that fits safely within VRAM.
Uses the exact per-pair device memory layout of the GPU compare kernel. No
field-length estimation is required, because every string is padded to
STRING_STRIDE bytes on the device regardless of its actual content.
§Per-pair device memory layout
| Buffer | Bytes per pair |
|---|---|
| d_data_a + d_data_b (u8, STRING_STRIDE each) | 2 × n_fields × 64 |
| d_lens_a + d_lens_b (u16) | 2 × n_fields × 2 |
| d_ids_a + d_ids_b (u64) | 2 × 8 |
| d_weights + d_probs (f32) | 2 × 4 |
| d_levels (u32) | n_fields × 4 |

Total: n_fields × 136 + 24 bytes per pair (exact, no estimation).
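The total can be checked by summing the table rows directly. The following is a minimal sketch, not the crate's own code: the function name bytes_per_pair is hypothetical, and STRING_STRIDE = 64 is an assumption read off the first table row rather than taken from the crate source.

```rust
use std::mem::size_of;

/// Assumed stride in bytes, inferred from the "× 64" in the layout table.
const STRING_STRIDE: usize = 64;

/// Device bytes consumed by one candidate pair (hypothetical helper).
fn bytes_per_pair(n_fields: usize) -> usize {
    // d_data_a + d_data_b: one padded string per field, on each side.
    let data = 2 * n_fields * STRING_STRIDE;
    // d_lens_a + d_lens_b: one u16 length per field, on each side.
    let lens = 2 * n_fields * size_of::<u16>();
    // d_ids_a + d_ids_b: one u64 record id per side.
    let ids = 2 * size_of::<u64>();
    // d_weights + d_probs: one f32 each per pair.
    let weights_and_probs = 2 * size_of::<f32>();
    // d_levels: one u32 per field.
    let levels = n_fields * size_of::<u32>();

    data + lens + ids + weights_and_probs + levels
}
```

The per-field terms add up to 128 + 4 + 4 = 136 bytes and the per-pair terms to 16 + 8 = 24 bytes, so a 10-field schema needs 1,384 bytes per pair.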
§Example
use zer_compute::batch_sizer::BatchSizer;
let sizer = BatchSizer::new();
// 3 GB available VRAM (e.g. after OS + model overhead), 10 fields
let available = 3u64 * 1024 * 1024 * 1024;
let max = sizer.max_batch_size(available, 10);
assert!(max > 1_000_000, "should easily fit millions of pairs");
Fields§
§vram_utilization: f32
Fraction of available VRAM to commit to the comparison batch. Default: 0.75.
Implementations§
impl BatchSizer
pub fn new() -> Self
pub fn with_utilization(self, fraction: f32) -> Self
Override the utilization fraction (0.0 < fraction ≤ 1.0).
pub fn max_batch_size(
    &self,
    available_vram_bytes: u64,
    num_fields: usize,
) -> usize
Compute the maximum number of pairs that fit in available_vram_bytes of VRAM
for a schema with num_fields fields.
The formula matches the GPU compare kernel buffer layout exactly. No avg_field_len
estimate is needed, because device buffers always use STRING_STRIDE bytes per string.
Returns at least 1 so callers never divide by zero.
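As a rough illustration of the calculation described above, here is a sketch under stated assumptions. BatchSizerSketch is a hypothetical stand-in, not the crate's BatchSizer: the real implementation may round differently or validate the utilization fraction, neither of which is documented here. Only the 0.75 default, the 136 × n_fields + 24 layout, and the floor of 1 come from this page.

```rust
/// Hypothetical stand-in for BatchSizer, for illustration only.
#[derive(Clone, Debug)]
pub struct BatchSizerSketch {
    pub vram_utilization: f32,
}

impl BatchSizerSketch {
    /// Starts from the documented default utilization of 0.75.
    pub fn new() -> Self {
        Self { vram_utilization: 0.75 }
    }

    /// Stores the fraction as given. Whether the real method validates
    /// the 0.0 < fraction <= 1.0 range is not documented, so this does not.
    pub fn with_utilization(mut self, fraction: f32) -> Self {
        self.vram_utilization = fraction;
        self
    }

    /// Pairs that fit in the utilized share of VRAM, never less than 1.
    pub fn max_batch_size(&self, available_vram_bytes: u64, num_fields: usize) -> usize {
        // Exact per-pair footprint from the device memory layout table.
        let bytes_per_pair = num_fields as u64 * 136 + 24;
        // Usable budget, truncated to whole bytes (rounding mode assumed).
        let budget = (available_vram_bytes as f64 * f64::from(self.vram_utilization)) as u64;
        // Floor of 1 so callers never divide by zero.
        ((budget / bytes_per_pair) as usize).max(1)
    }
}
```

Under these assumptions, 3 GiB of available VRAM at the default utilization with 10 fields gives 2,415,919,104 / 1,384, which truncates to 1,745,606 pairs.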
pub const fn min_batch_for_gpu() -> usize
Minimum batch size to justify a GPU kernel launch. Batches smaller than this are routed to the CPU path transparently.
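The routing rule might look like the sketch below. The Backend enum and choose_backend function are hypothetical names introduced for illustration, and the threshold is passed in as a parameter because the value returned by min_batch_for_gpu() is not stated on this page.

```rust
/// Hypothetical execution target for a batch of pairs.
#[derive(Debug, PartialEq, Eq)]
enum Backend {
    Cpu,
    Gpu,
}

/// Picks the CPU path for batches below the GPU launch threshold.
/// `min_batch_for_gpu` stands in for BatchSizer::min_batch_for_gpu().
fn choose_backend(batch_len: usize, min_batch_for_gpu: usize) -> Backend {
    if batch_len < min_batch_for_gpu {
        Backend::Cpu
    } else {
        Backend::Gpu
    }
}
```

A batch exactly at the threshold goes to the GPU in this sketch; whether the crate treats the boundary the same way is an assumption.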
Trait Implementations§
impl Clone for BatchSizer
fn clone(&self) -> BatchSizer
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for BatchSizer
Auto Trait Implementations§
impl Freeze for BatchSizer
impl RefUnwindSafe for BatchSizer
impl Send for BatchSizer
impl Sync for BatchSizer
impl Unpin for BatchSizer
impl UnsafeUnpin for BatchSizer
impl UnwindSafe for BatchSizer
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.