

Function total_memory_transfer 

pub fn total_memory_transfer(
    model: &ModelConfig,
    hardware: &HardwareConfig,
    request_seq_lens: &[u32],
) -> f64

Calculates the total number of bytes transferred from memory in one iteration.

Formula: total_bytes = model_weights + sum(kv_cache for each request)
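The formula can be illustrated with a minimal sketch. This is not the crate's implementation: `SketchModel`, its fields `weight_bytes` and `kv_bytes_per_token`, and the function `total_memory_transfer_sketch` are hypothetical names introduced here for illustration. The sketch assumes that each request's KV cache size grows linearly with its sequence length, and it leaves out the `hardware` parameter because the description does not say how `HardwareConfig` enters the calculation.

```rust
/// Hypothetical stand-in for the real `ModelConfig` (field names are assumptions).
pub struct SketchModel {
    /// Total size of the model weights in bytes.
    pub weight_bytes: f64,
    /// Assumed KV cache bytes read per token of context.
    pub kv_bytes_per_token: f64,
}

/// Sketch of: total_bytes = model_weights + sum(kv_cache for each request).
/// `request_seq_lens` holds one sequence length per request in the iteration.
pub fn total_memory_transfer_sketch(model: &SketchModel, request_seq_lens: &[u32]) -> f64 {
    // KV cache traffic: assumed to be linear in each request's sequence length.
    let kv_bytes: f64 = request_seq_lens
        .iter()
        .map(|&len| f64::from(len) * model.kv_bytes_per_token)
        .sum();

    // The weights are counted once per iteration, regardless of batch size.
    model.weight_bytes + kv_bytes
}
```

Under these assumptions, an empty `request_seq_lens` slice yields the weight bytes alone, and each additional request adds only its own KV cache term.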