Constant DEFAULT_BATCH_SIZE 

pub const DEFAULT_BATCH_SIZE: usize = 32;

Number of messages processed per model-inference and write batch. e5 truncates each input at 512 tokens, so a 32-row batch is padded to at most 32 × 512 = 16,384 token positions, which keeps the transient attention memory bounded.
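
A minimal sketch of how a caller might use this constant, assuming messages arrive as a slice and are chunked before inference. The names `MAX_TOKENS`, `batches`, and `max_padded_tokens` are hypothetical and not part of this crate; only `DEFAULT_BATCH_SIZE` and the 512-token limit come from the description above.

```rust
/// Mirrors the documented constant.
pub const DEFAULT_BATCH_SIZE: usize = 32;

/// Hypothetical name for the e5 truncation limit stated in the description.
pub const MAX_TOKENS: usize = 512;

/// Hypothetical helper: splits `messages` into consecutive batches of at most
/// `DEFAULT_BATCH_SIZE` items. The final batch may be shorter.
pub fn batches<T>(messages: &[T]) -> std::slice::Chunks<'_, T> {
    messages.chunks(DEFAULT_BATCH_SIZE)
}

/// Hypothetical helper: upper bound on padded token positions in one batch,
/// assuming every row is padded to the truncation limit (the worst case).
pub fn max_padded_tokens(batch_len: usize) -> usize {
    batch_len.min(DEFAULT_BATCH_SIZE) * MAX_TOKENS
}
```

Under these assumptions, 70 messages split into batches of 32, 32, and 6, and no batch exceeds 16,384 padded token positions.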