kv_cache_bytes

Function kv_cache_bytes 

pub fn kv_cache_bytes(seq_len: u32, model: &ModelConfig) -> f64
Calculates the number of bytes transferred from memory for the KV cache at a given sequence length.

Formula: kv_bytes = kv_cache_bytes_per_token * seq_len
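
The formula can be illustrated with a minimal sketch. The `ModelConfig` struct below is a hypothetical stand-in, since the real type is not shown on this page, and the assumption that it exposes `kv_cache_bytes_per_token` as a plain `f64` field is ours; the crate may compute that value differently. The function is named `kv_cache_bytes_sketch` to make clear it is not the crate's implementation.

```rust
/// Hypothetical stand-in for the crate's `ModelConfig`.
/// The real struct is not shown here and may look different.
pub struct ModelConfig {
    /// Assumed field: bytes of KV cache held per token.
    pub kv_cache_bytes_per_token: f64,
}

/// Sketch of the documented formula:
/// kv_bytes = kv_cache_bytes_per_token * seq_len
pub fn kv_cache_bytes_sketch(seq_len: u32, model: &ModelConfig) -> f64 {
    // u32 -> f64 is lossless, so `f64::from` is used instead of an `as` cast.
    model.kv_cache_bytes_per_token * f64::from(seq_len)
}
```

Under these assumptions, a model with 131,072 bytes of KV cache per token and a sequence length of 4,096 gives 131,072 * 4,096 = 536,870,912 bytes (512 MiB). The result grows linearly with `seq_len`, and a sequence length of 0 yields 0.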