pub struct AttendConfig {
    pub softmax_scale: f32,
    pub n_kv_groups: usize,
}
Configuration for attention computation during decode.
Passed to CompressedKVCache::decode. Extensible without breaking
the trait signature — new fields can be added here.
Fields
softmax_scale: f32
Softmax scaling factor, typically 1 / sqrt(head_dim).
n_kv_groups: usize
GQA group count: num_attention_heads / num_kv_heads. Set to 1 for MHA (no grouping).
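The two field descriptions above can be turned into a small construction sketch. The struct below is a local mirror of the documented definition so the snippet compiles on its own. The helper `attend_config_for` and its parameters (`head_dim`, `num_attention_heads`, `num_kv_heads`) are hypothetical names introduced for illustration and are not part of the crate's API. The divisibility check is also an assumption about what a sensible caller would enforce.

```rust
/// Local mirror of the documented struct, so this sketch is self-contained.
/// The derives are added here for convenience; the real type may not have them.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AttendConfig {
    pub softmax_scale: f32,
    pub n_kv_groups: usize,
}

/// Hypothetical helper (not part of the documented API): derives both fields
/// from model dimensions, following the field descriptions above.
pub fn attend_config_for(
    head_dim: usize,
    num_attention_heads: usize,
    num_kv_heads: usize,
) -> AttendConfig {
    // Assumption: query heads must split evenly across KV heads.
    assert!(
        num_kv_heads > 0 && num_attention_heads % num_kv_heads == 0,
        "num_attention_heads must be a positive multiple of num_kv_heads"
    );
    AttendConfig {
        // Typical choice: 1 / sqrt(head_dim).
        softmax_scale: 1.0 / (head_dim as f32).sqrt(),
        // GQA group count; equals 1 when every query head has its own KV head (MHA).
        n_kv_groups: num_attention_heads / num_kv_heads,
    }
}
```

Under these assumptions, `attend_config_for(128, 32, 8)` describes a GQA layout with four query heads per KV head, while `attend_config_for(64, 12, 12)` yields `n_kv_groups == 1`, the MHA case.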
Auto Trait Implementations
impl Freeze for AttendConfig
impl RefUnwindSafe for AttendConfig
impl Send for AttendConfig
impl Sync for AttendConfig
impl Unpin for AttendConfig
impl UnsafeUnpin for AttendConfig
impl UnwindSafe for AttendConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.