Function tmp_buffer_bytes 

pub fn tmp_buffer_bytes(num_heads: u32, head_dim: u32) -> usize

Compute the size in bytes of the temporary buffer needed for TQ SDPA.

The buffer is sized for the maximum NWG of 32, regardless of the NWG chosen adaptively at run time, because it is allocated once at model load time and reused for all context lengths.
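
The worst-case sizing idea can be illustrated with a minimal sketch. The layout here is an assumption and is not taken from the crate's source: each workgroup is assumed to write one partial output of `head_dim` `f32` values per head, plus two `f32` scalars (a running maximum and a running sum) for the final softmax reduction. The name `tmp_buffer_bytes_sketch` and the constant `MAX_NWG` are hypothetical; the actual formula used by `tmp_buffer_bytes` may differ.

```rust
/// Upper bound on the number of workgroups (NWG), as stated in the description.
/// The constant name is hypothetical.
const MAX_NWG: usize = 32;

/// Hypothetical worst-case sizing for a split-reduction scratch buffer.
///
/// Assumed layout (not confirmed by the source): for every head and every
/// workgroup, `head_dim` f32 partial outputs plus 2 f32 softmax statistics.
pub fn tmp_buffer_bytes_sketch(num_heads: u32, head_dim: u32) -> usize {
    // Partial output vector plus running max and running sum.
    let floats_per_partial = head_dim as usize + 2;
    // Always multiply by MAX_NWG, never by the NWG picked at run time,
    // so one allocation covers every context length.
    num_heads as usize * MAX_NWG * floats_per_partial * std::mem::size_of::<f32>()
}
```

Under these assumptions, 32 heads with a head dimension of 128 would need 532,480 bytes. Because the size depends only on `num_heads` and `head_dim`, it can be computed once at model load time and the same buffer reused for every call.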