pub struct AttentionConfig {
pub batch_size: usize,
pub num_heads: usize,
pub query_len: usize,
pub kv_len: usize,
pub head_dim: usize,
pub causal: bool,
pub scale: f32,
pub precision: KernelPrecision,
pub block_q: usize,
pub block_kv: usize,
}
Configuration for standard attention.
Fields
batch_size: usize
Batch size.
num_heads: usize
Number of attention heads.
query_len: usize
Query sequence length.
kv_len: usize
Key/Value sequence length.
head_dim: usize
Head dimension.
causal: bool
Whether to apply causal masking.
scale: f32
Softmax scale factor (usually 1/sqrt(head_dim)).
precision: KernelPrecision
Computation precision.
block_q: usize
Block size for query tiling.
block_kv: usize
Block size for KV tiling.
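As a rough illustration of how these fields relate, the sketch below computes the conventional softmax scale from head_dim and the number of tiles implied by block_q and block_kv. The helpers `default_scale` and `tile_counts` are hypothetical and not part of this crate. Counting a partial trailing tile as a full tile (ceiling division) is an assumption about how tiling is handled.

```rust
/// Conventional softmax scale, 1/sqrt(head_dim).
/// Hypothetical helper, not part of the documented API.
fn default_scale(head_dim: usize) -> f32 {
    1.0 / (head_dim as f32).sqrt()
}

/// Number of (query, KV) tiles for the given sequence lengths and block sizes.
/// Assumes block sizes are non-zero and that a partial trailing tile still counts.
fn tile_counts(
    query_len: usize,
    kv_len: usize,
    block_q: usize,
    block_kv: usize,
) -> (usize, usize) {
    // Ceiling division written out by hand to stay compatible with older toolchains.
    let q_tiles = (query_len + block_q - 1) / block_q;
    let kv_tiles = (kv_len + block_kv - 1) / block_kv;
    (q_tiles, kv_tiles)
}
```

For example, with head_dim = 64 the scale is 1/8, and a query of length 100 tiled with block_q = 64 needs two query tiles under this assumption.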
Implementations
impl AttentionConfig
pub fn new(batch_size: usize, num_heads: usize, query_len: usize, kv_len: usize, head_dim: usize) -> Self
Create config for the given dimensions.
pub fn with_causal(self, causal: bool) -> Self
Set causal masking.
pub fn with_precision(self, precision: KernelPrecision) -> Self
Set precision mode.
pub fn with_block_sizes(self, block_q: usize, block_kv: usize) -> Self
Set block sizes for tiling.
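The `with_*` methods take `self` by value and return `Self`, so they can be chained after `new`. The sketch below shows that pattern on a stand-in type, `SketchConfig`, which mirrors only a subset of the fields above. It is not the crate's implementation, and the defaults it picks (non-causal, 64x64 blocks, scale of 1/sqrt(head_dim)) are assumptions made for illustration.

```rust
// Stand-in type for illustration only. It mirrors a subset of
// AttentionConfig's fields and is NOT the crate's implementation.
#[derive(Clone, Debug)]
struct SketchConfig {
    head_dim: usize,
    causal: bool,
    scale: f32,
    block_q: usize,
    block_kv: usize,
}

impl SketchConfig {
    // The defaults chosen here are assumptions, not documented behavior.
    fn new(head_dim: usize) -> Self {
        Self {
            head_dim,
            causal: false,
            scale: 1.0 / (head_dim as f32).sqrt(),
            block_q: 64,
            block_kv: 64,
        }
    }

    // Consumes the config and returns it with the causal flag updated.
    fn with_causal(mut self, causal: bool) -> Self {
        self.causal = causal;
        self
    }

    // Consumes the config and returns it with both block sizes updated.
    fn with_block_sizes(mut self, block_q: usize, block_kv: usize) -> Self {
        self.block_q = block_q;
        self.block_kv = block_kv;
        self
    }
}

// Builds a causal config with custom tile sizes in a single expression.
fn build_example() -> SketchConfig {
    SketchConfig::new(128)
        .with_causal(true)
        .with_block_sizes(32, 128)
}
```

Because each setter moves the value, no intermediate mutable binding is needed. Since the type also implements Clone, a base config can be cloned before applying different setters to each copy.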
pub fn validate(&self) -> ConfigResult<()>
Validate configuration values.
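This page does not list which checks `validate` performs. The sketch below shows the kind of checks such a method might plausibly make, using stand-ins throughout: the `ConfigResult` alias over `String` errors, the `Dims` struct, and every rule in `validate_sketch` are assumptions, not the crate's actual definitions.

```rust
// Hypothetical stand-in: the crate's real ConfigResult and its error type
// are not shown on this page.
type ConfigResult<T> = Result<T, String>;

// Hypothetical subset of the configuration fields relevant to validation.
struct Dims {
    query_len: usize,
    kv_len: usize,
    head_dim: usize,
    block_q: usize,
    block_kv: usize,
    scale: f32,
}

// Assumed rules: all sizes non-zero, scale finite and positive.
fn validate_sketch(d: &Dims) -> ConfigResult<()> {
    if d.query_len == 0 || d.kv_len == 0 || d.head_dim == 0 {
        return Err("sequence lengths and head_dim must be non-zero".to_string());
    }
    if d.block_q == 0 || d.block_kv == 0 {
        return Err("block sizes must be non-zero".to_string());
    }
    if !d.scale.is_finite() || d.scale <= 0.0 {
        return Err("scale must be finite and positive".to_string());
    }
    Ok(())
}
```

Calling a check like this right after building a config surfaces bad values, such as a zero block size, before any kernel is launched.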
Trait Implementations
impl Clone for AttentionConfig
fn clone(&self) -> AttentionConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for AttentionConfig
Auto Trait Implementations
impl Freeze for AttentionConfig
impl RefUnwindSafe for AttentionConfig
impl Send for AttentionConfig
impl Sync for AttentionConfig
impl Unpin for AttentionConfig
impl UnwindSafe for AttentionConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.