Struct SdpaParams

pub struct SdpaParams {
    pub n_heads: u32,
    pub n_kv_heads: u32,
    pub head_dim: u32,
    pub seq_len: u32,
    pub kv_seq_len: u32,
    pub scale: f32,
    pub kv_capacity: u32,
}

Parameters for the scaled dot-product attention (SDPA) kernel.

These describe the tensor shapes and head configuration for the attention computation.
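A minimal sketch of filling in these parameters for a prefill pass, where every query position attends over the same set of positions. The struct is a local mirror of the definition above so the snippet compiles on its own, and `prefill_params` is a hypothetical helper that is not part of the crate. The `1.0 / sqrt(head_dim)` scale is the typical choice described under `scale` and may not suit every model.

```rust
// Local mirror of the documented struct so this sketch is self-contained;
// real code would import SdpaParams from the crate instead.
#[derive(Clone, Copy, Debug)]
pub struct SdpaParams {
    pub n_heads: u32,
    pub n_kv_heads: u32,
    pub head_dim: u32,
    pub seq_len: u32,
    pub kv_seq_len: u32,
    pub scale: f32,
    pub kv_capacity: u32,
}

/// Hypothetical helper (not part of the crate): parameters for a prefill
/// pass over `tokens` positions with tightly packed KV buffers.
pub fn prefill_params(n_heads: u32, n_kv_heads: u32, head_dim: u32, tokens: u32) -> SdpaParams {
    SdpaParams {
        n_heads,
        n_kv_heads,
        head_dim,
        seq_len: tokens,
        kv_seq_len: tokens,
        // Typical scaling; models that normalize Q and K may need 1.0 instead.
        scale: 1.0 / (head_dim as f32).sqrt(),
        // Tightly packed buffers: capacity equals kv_seq_len.
        kv_capacity: tokens,
    }
}
```

Because the struct is `Copy`, a value built this way can be passed by value and reused without an explicit clone.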

Fields§

§n_heads: u32

Number of query attention heads (e.g. 16 for Gemma 4).

§n_kv_heads: u32

Number of key/value attention heads (may be fewer than n_heads for grouped-query attention, GQA).
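To illustrate how the two head counts relate under GQA, the sketch below maps a query head to the KV head it shares. It assumes the common convention that consecutive query heads are grouped onto one KV head and that `n_heads` is a multiple of `n_kv_heads`; the page does not state how the kernel itself performs this mapping, and `kv_head_for` is a hypothetical name.

```rust
/// Hypothetical helper: index of the KV head used by query head `q_head`,
/// assuming consecutive query heads share one KV head.
/// Returns None when the head counts do not divide evenly or the
/// query head index is out of range.
pub fn kv_head_for(q_head: u32, n_heads: u32, n_kv_heads: u32) -> Option<u32> {
    if n_kv_heads == 0 || n_heads % n_kv_heads != 0 || q_head >= n_heads {
        return None;
    }
    // Number of query heads that share each KV head.
    let group_size = n_heads / n_kv_heads;
    Some(q_head / group_size)
}
```

With 16 query heads and 8 KV heads, each KV head serves a group of two query heads; when the counts are equal, the mapping is the identity (standard multi-head attention).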

§head_dim: u32

Dimension of each attention head.

§seq_len: u32

Query sequence length.

§kv_seq_len: u32

Key/value sequence length (may differ from seq_len in decode mode).
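As a rough illustration of how the two lengths diverge, the sketch below computes them for a single attention call. It assumes the keys and values for the new tokens are appended to the cache before attention runs, which this page does not confirm; `sequence_lengths` is a hypothetical helper.

```rust
/// Hypothetical helper: (seq_len, kv_seq_len) for one attention call,
/// given `cached` positions already in the KV cache and `new_tokens`
/// query positions processed in this call.
/// Assumes the new tokens' keys and values are already appended to the cache.
pub fn sequence_lengths(cached: u32, new_tokens: u32) -> (u32, u32) {
    let seq_len = new_tokens;
    let kv_seq_len = cached + new_tokens;
    (seq_len, kv_seq_len)
}
```

Under that assumption, prefill over a 128-token prompt gives equal lengths, while a decode step after it has one query position attending over 129 cached positions.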

§scale: f32

Attention score scaling factor. Typically 1.0 / sqrt(head_dim), but models like Gemma 4 (which use QK norms) require scale = 1.0.
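The two cases described above can be sketched as a small helper. `attention_scale` and its `uses_qk_norm` flag are hypothetical names introduced for illustration; whether a given model needs `1.0` depends on its architecture.

```rust
/// Hypothetical helper: choose the attention score scaling factor.
/// `uses_qk_norm` marks models that normalize Q and K and therefore
/// need no additional scaling.
pub fn attention_scale(head_dim: u32, uses_qk_norm: bool) -> f32 {
    if uses_qk_norm {
        1.0
    } else {
        // Typical scaling: 1 / sqrt(head_dim).
        1.0 / (head_dim as f32).sqrt()
    }
}
```

For example, a head dimension of 64 without QK norms yields a scale of 0.125.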

§kv_capacity: u32

KV cache capacity: the stride (in positions) between KV heads in the cache buffer. When the KV cache is pre-allocated to a fixed capacity larger than kv_seq_len, set this to that capacity so the kernel reads the correct memory offsets. When the KV buffers are tightly packed (no extra capacity), set it equal to kv_seq_len. A value of 0 is treated as “use kv_seq_len as capacity”, for backwards compatibility.
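To make the stride concrete, the sketch below resolves the effective capacity and computes an element offset into a KV buffer. It assumes a `[n_kv_heads, capacity, head_dim]` layout in which the `head_dim` elements of one position are contiguous; the page only states that the stride between heads is measured in positions, so the inner layout is an assumption, and both function names are hypothetical.

```rust
/// Hypothetical helper: resolve the capacity used for addressing.
/// A kv_capacity of 0 falls back to kv_seq_len, as documented above.
pub fn effective_kv_capacity(kv_seq_len: u32, kv_capacity: u32) -> u32 {
    if kv_capacity == 0 {
        kv_seq_len
    } else {
        kv_capacity
    }
}

/// Hypothetical helper: element offset of the first value at (kv_head, pos),
/// assuming a [n_kv_heads, capacity, head_dim] layout.
pub fn kv_element_offset(kv_head: u32, pos: u32, head_dim: u32, capacity: u32) -> usize {
    // Heads are `capacity` positions apart; each position holds `head_dim` elements.
    (kv_head as usize * capacity as usize + pos as usize) * head_dim as usize
}
```

With `head_dim = 64`, KV head 1 starts at element 262144 in a cache pre-allocated for 4096 positions, but at element 6400 in a tightly packed buffer holding 100 positions. Passing the wrong capacity would therefore make every head after the first read from the wrong location.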

Trait Implementations§

impl Clone for SdpaParams

fn clone(&self) -> SdpaParams

fn clone_from(&mut self, source: &Self)

impl Debug for SdpaParams

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Copy for SdpaParams

Auto Trait Implementations§

impl Freeze for SdpaParams

impl RefUnwindSafe for SdpaParams

impl Send for SdpaParams

impl Sync for SdpaParams

impl Unpin for SdpaParams

impl UnsafeUnpin for SdpaParams

impl UnwindSafe for SdpaParams

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
