pub struct FlashAttention { /* private fields */ }
Flash attention with block-wise computation
Computes attention in tiles to minimize memory usage while maintaining numerical stability.
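The following is a minimal sketch of the general tiling idea, not this crate's actual implementation. It processes keys and values one block at a time and keeps a running maximum and running denominator (the "online softmax" technique), so the full score vector is never materialized. The function name `tiled_attention`, the `block_size` parameter, and the 1/sqrt(d) score scaling are assumptions made for illustration.

```rust
/// Hypothetical helper: single-query attention computed in tiles of
/// `block_size` keys. Panics on empty input or mismatched lengths.
fn tiled_attention(
    query: &[f32],
    keys: &[&[f32]],
    values: &[&[f32]],
    block_size: usize,
) -> Vec<f32> {
    assert!(block_size > 0 && !keys.is_empty() && keys.len() == values.len());
    // Assumption: scaled dot-product scores, as in standard attention.
    let scale = 1.0 / (query.len() as f32).sqrt();

    let mut running_max = f32::NEG_INFINITY; // largest score seen so far
    let mut denom = 0.0f32; // sum of exp(score - running_max)
    let mut acc = vec![0.0f32; values[0].len()]; // weighted value sum, same scaling

    for (k_block, v_block) in keys.chunks(block_size).zip(values.chunks(block_size)) {
        // Scores for this tile only; memory use is bounded by block_size.
        let scores: Vec<f32> = k_block
            .iter()
            .map(|k| k.iter().zip(query).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();
        let block_max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let new_max = running_max.max(block_max);

        // Rescale earlier partial results to the new maximum. Subtracting the
        // maximum before exponentiating is what keeps exp() from overflowing.
        let correction = (running_max - new_max).exp();
        denom *= correction;
        for a in acc.iter_mut() {
            *a *= correction;
        }

        // Fold this tile into the running sums.
        for (s, v) in scores.iter().zip(v_block) {
            let w = (s - new_max).exp();
            denom += w;
            for (a, x) in acc.iter_mut().zip(v.iter()) {
                *a += w * x;
            }
        }
        running_max = new_max;
    }

    // Normalize once at the end.
    acc.iter().map(|a| a / denom).collect()
}
```

Up to floating-point rounding, the result does not depend on `block_size`; the block size only bounds how many scores are held in memory at once.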
Implementations§
Trait Implementations§
impl Attention for FlashAttention
fn compute(
    &self,
    query: &[f32],
    keys: &[&[f32]],
    values: &[&[f32]],
) -> AttentionResult<Vec<f32>>
Computes attention over the given query, keys, and values.
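As a sketch of the calling convention only: `keys` and `values` are passed as slices of borrowed rows, so owned `Vec<Vec<f32>>` data has to be re-borrowed first. This page does not show how a `FlashAttention` is constructed or how `AttentionResult` is defined, so the block below uses stand-ins that are assumptions: a local `Attention` trait with the same method signature, `AttentionResult` as an alias for `Result<T, String>`, a hypothetical `PlainAttention` implementor that uses ordinary (non-tiled) softmax attention with 1/sqrt(d) scaling, and a hypothetical `as_rows` helper.

```rust
// Assumption: stand-ins for the crate's items so the sketch compiles on its own.
type AttentionResult<T> = Result<T, String>;

trait Attention {
    fn compute(
        &self,
        query: &[f32],
        keys: &[&[f32]],
        values: &[&[f32]],
    ) -> AttentionResult<Vec<f32>>;
}

/// Hypothetical implementor: plain softmax attention, not FlashAttention.
struct PlainAttention;

impl Attention for PlainAttention {
    fn compute(
        &self,
        query: &[f32],
        keys: &[&[f32]],
        values: &[&[f32]],
    ) -> AttentionResult<Vec<f32>> {
        if keys.is_empty() || keys.len() != values.len() {
            return Err("keys and values must be non-empty and equal in length".to_string());
        }
        if keys.iter().any(|k| k.len() != query.len()) {
            return Err("every key must match the query dimension".to_string());
        }
        let scale = 1.0 / (query.len() as f32).sqrt();
        let scores: Vec<f32> = keys
            .iter()
            .map(|k| k.iter().zip(query).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();
        // Subtract the largest score before exponentiating for stability.
        let peak = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let weights: Vec<f32> = scores.iter().map(|s| (s - peak).exp()).collect();
        let denom: f32 = weights.iter().sum();
        let mut out = vec![0.0f32; values[0].len()];
        for (w, v) in weights.iter().zip(values) {
            for (o, x) in out.iter_mut().zip(v.iter()) {
                *o += w / denom * x;
            }
        }
        Ok(out)
    }
}

/// Hypothetical helper: borrow owned rows in the shape `compute` expects.
fn as_rows(rows: &[Vec<f32>]) -> Vec<&[f32]> {
    rows.iter().map(|r| r.as_slice()).collect()
}
```

With owned `keys` and `values` of type `Vec<Vec<f32>>`, a call would then look like `attn.compute(&query, &as_rows(&keys), &as_rows(&values))`, where `attn` is any value implementing the trait.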
Auto Trait Implementations§
impl Freeze for FlashAttention
impl RefUnwindSafe for FlashAttention
impl Send for FlashAttention
impl Sync for FlashAttention
impl Unpin for FlashAttention
impl UnwindSafe for FlashAttention
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.