pub struct AttnConfig {
    pub num_heads: usize,
    pub num_kv_heads: usize,
    pub head_dim: usize,
    pub causal: bool,
    pub scale: f32,
    pub kv_seq_stride: usize,
    pub sliding_window: usize,
}
Configuration for attention dispatch.
Fields
num_heads: usize
num_kv_heads: usize
head_dim: usize
causal: bool
scale: f32
kv_seq_stride: usize
Stride (in rows) between head blocks in the KV buffer.
0 means contiguous (use kv_len, legacy behaviour).
Set to cache_capacity when flashing against a pre-allocated cache
that only has kv_len valid slots out of cache_capacity.
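The stride rule above can be read as an index computation. The sketch below is an assumption rather than the crate's own code: it supposes a row-major KV buffer laid out as [kv_head][row][head_dim], and the helper names effective_stride and kv_offset are hypothetical.

```rust
/// Hypothetical helper: number of rows between consecutive head blocks.
/// A stride of 0 falls back to `kv_len` (the contiguous layout).
fn effective_stride(kv_seq_stride: usize, kv_len: usize) -> usize {
    if kv_seq_stride == 0 { kv_len } else { kv_seq_stride }
}

/// Hypothetical helper: flat element offset of (kv_head, pos, dim),
/// assuming a row-major [kv_head][row][head_dim] buffer.
fn kv_offset(
    kv_head: usize,
    pos: usize,
    dim: usize,
    head_dim: usize,
    stride_rows: usize,
) -> usize {
    (kv_head * stride_rows + pos) * head_dim + dim
}

fn main() {
    let (head_dim, kv_len, cache_capacity) = (4, 3, 8);

    // Contiguous buffer: head 1 starts right after head 0's kv_len rows.
    let contiguous = effective_stride(0, kv_len);
    println!("{}", kv_offset(1, 0, 0, head_dim, contiguous)); // 12

    // Pre-allocated cache: head 1 starts after cache_capacity rows,
    // even though only kv_len of them hold valid data.
    let cached = effective_stride(cache_capacity, kv_len);
    println!("{}", kv_offset(1, 0, 0, head_dim, cached)); // 32
}
```

Under this assumed layout, passing the wrong stride for a pre-allocated cache would read head 1's data from inside head 0's unused slots, which is why the stride has to match cache_capacity rather than kv_len.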
sliding_window: usize
Sliding-window attention size (Mistral v0.1, Gemma).
0 = disabled (full causal attention).
w > 0 = each query position attends to the previous w KV positions
(the upper end is still the causal bound, pos_offset + qi + 1).
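A minimal sketch of the window rule, assuming causal attention and assuming that the w positions include the query's own position (the text above does not settle that point). The function attend_range is hypothetical and is not part of the documented API; pos_offset, qi and kv_len follow the names used in the field descriptions.

```rust
use std::ops::Range;

/// Hypothetical helper: half-open range of KV positions that query row `qi`
/// may attend to under causal masking with an optional sliding window.
fn attend_range(
    pos_offset: usize,
    qi: usize,
    kv_len: usize,
    sliding_window: usize,
) -> Range<usize> {
    // Exclusive upper end: the causal bound, clamped to the valid KV length.
    let end = (pos_offset + qi + 1).min(kv_len);
    // Lower end: 0 when the window is disabled, otherwise `w` positions back
    // from the upper end (saturating so short sequences start at 0).
    let start = if sliding_window == 0 {
        0
    } else {
        end.saturating_sub(sliding_window)
    };
    start..end
}

fn main() {
    // Window disabled: the query at absolute position 5 sees 0..6.
    println!("{:?}", attend_range(4, 1, 16, 0));
    // Window of 3: the same query sees positions 3, 4 and 5.
    println!("{:?}", attend_range(4, 1, 16, 3));
}
```

If the real kernel excludes the current position from the window, the lower end would shift by one; the clamping of the upper end stays the same.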
Trait Implementations
impl Clone for AttnConfig
fn clone(&self) -> AttnConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for AttnConfig
Auto Trait Implementations
impl Freeze for AttnConfig
impl RefUnwindSafe for AttnConfig
impl Send for AttnConfig
impl Sync for AttnConfig
impl Unpin for AttnConfig
impl UnsafeUnpin for AttnConfig
impl UnwindSafe for AttnConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.