pub struct FlashAttention { /* private fields */ }
Block-wise attention computation optimized for CPU cache locality.
Instead of materializing the full N×N attention matrix, it processes the computation in blocks that fit in L1/L2 cache, reducing memory complexity from O(N²) to O(N) in the sequence length N.
Implementations§
impl FlashAttention
pub fn new(config: FlashAttentionConfig) -> Self
Create a new FlashAttention with the given configuration.
pub fn with_dimensions(dimensions: usize) -> Self
Create a new FlashAttention for vectors of the given dimension, using the default configuration.
pub fn config(&self) -> &FlashAttentionConfig
Returns a reference to the configuration.
pub fn attention(
    &self,
    queries: &[Vec<f32>],
    keys: &[Vec<f32>],
    values: &[Vec<f32>],
) -> Vec<Vec<f32>>
Compute scaled dot-product attention using the block-wise algorithm.
For sequences of length N with dimension D:
- Naive: O(N²) memory (full attention matrix)
- Flash: O(N) memory (block-wise accumulation via online softmax)
§Arguments
- queries: Query vectors [N_q × D]
- keys: Key vectors [N_k × D]
- values: Value vectors [N_k × D]
§Returns
Output vectors [N_q × D]
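The block-wise accumulation via online softmax can be illustrated with a small standalone function. This is a minimal sketch and not this crate's actual implementation: the function name `blockwise_attention` and the `block_size` parameter are hypothetical, and the 1/sqrt(D) scaling is assumed from the usual definition of scaled dot-product attention. For each query, it keeps only a running maximum, a running softmax denominator, and an unnormalized output accumulator, so no full score row is ever stored.

```rust
/// Block-wise scaled dot-product attention with an online softmax.
/// `block_size` (keys processed per step) is a hypothetical tuning knob.
fn blockwise_attention(
    queries: &[Vec<f32>],
    keys: &[Vec<f32>],
    values: &[Vec<f32>],
    block_size: usize,
) -> Vec<Vec<f32>> {
    assert_eq!(keys.len(), values.len(), "one value vector per key");
    assert!(block_size > 0, "block_size must be positive");
    let d_v = values.first().map_or(0, |v| v.len());

    queries
        .iter()
        .map(|q| {
            let scale = 1.0 / (q.len() as f32).sqrt();
            let mut running_max = f32::NEG_INFINITY; // largest score seen so far
            let mut denom = 0.0f32; // softmax denominator relative to running_max
            let mut acc = vec![0.0f32; d_v]; // unnormalized weighted sum of values

            for (k_block, v_block) in keys.chunks(block_size).zip(values.chunks(block_size)) {
                // Scores for this block only: O(block_size) scratch memory.
                let scores: Vec<f32> = k_block
                    .iter()
                    .map(|k| q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() * scale)
                    .collect();
                let block_max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
                let new_max = running_max.max(block_max);

                // Rescale everything accumulated under the previous maximum.
                let correction = (running_max - new_max).exp();
                denom *= correction;
                for a in acc.iter_mut() {
                    *a *= correction;
                }

                // Fold this block's contribution into the running state.
                for (s, v) in scores.iter().zip(v_block) {
                    let w = (s - new_max).exp();
                    denom += w;
                    for (a, x) in acc.iter_mut().zip(v) {
                        *a += w * x;
                    }
                }
                running_max = new_max;
            }

            // Normalize once at the end; an empty key set yields zeros.
            if denom > 0.0 {
                for a in acc.iter_mut() {
                    *a /= denom;
                }
            }
            acc
        })
        .collect()
}
```

Because the rescaling step keeps the accumulator consistent with the current maximum, the result should not depend on the block size beyond floating-point rounding.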
pub fn naive_attention(
    &self,
    queries: &[Vec<f32>],
    keys: &[Vec<f32>],
    values: &[Vec<f32>],
) -> Vec<Vec<f32>>
Naive attention implementation for benchmarking comparison.
Materializes the full N×N attention matrix: O(N²) memory.
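For comparison, the naive approach can be sketched as a free function that builds the whole score matrix before applying softmax. This is an illustrative sketch under assumptions (standard 1/sqrt(D) scaling, a hypothetical function name `naive_attention_sketch`), not the method's actual source.

```rust
/// Naive scaled dot-product attention: stores every query-key score at once.
fn naive_attention_sketch(
    queries: &[Vec<f32>],
    keys: &[Vec<f32>],
    values: &[Vec<f32>],
) -> Vec<Vec<f32>> {
    assert_eq!(keys.len(), values.len(), "one value vector per key");

    // Full N_q × N_k score matrix: this is the O(N²) allocation.
    let mut weights: Vec<Vec<f32>> = queries
        .iter()
        .map(|q| {
            let scale = 1.0 / (q.len() as f32).sqrt();
            keys.iter()
                .map(|k| q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() * scale)
                .collect()
        })
        .collect();

    // Row-wise softmax, subtracting the row maximum for numerical stability.
    for row in weights.iter_mut() {
        let max = row.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0f32;
        for s in row.iter_mut() {
            *s = (*s - max).exp();
            sum += *s;
        }
        if sum > 0.0 {
            for s in row.iter_mut() {
                *s /= sum;
            }
        }
    }

    // Each output row is the weighted sum of the value vectors.
    let d_v = values.first().map_or(0, |v| v.len());
    weights
        .iter()
        .map(|row| {
            let mut out = vec![0.0f32; d_v];
            for (w, v) in row.iter().zip(values) {
                for (o, x) in out.iter_mut().zip(v) {
                    *o += w * x;
                }
            }
            out
        })
        .collect()
}
```

When all keys score equally for a query, the softmax weights are uniform and the output is the mean of the value vectors, which makes a convenient sanity check.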
pub fn benchmark(&self, num_vectors: usize) -> BenchmarkResult
Run a benchmark comparing naive vs flash attention.
Generates random vectors and measures wall-clock time for both methods. Also verifies that both implementations produce equivalent results.
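The two ingredients of such a benchmark, wall-clock timing and an equivalence check, might look like the helpers below. Both `time_it` and `max_abs_diff` are hypothetical names introduced for illustration; the fields of `BenchmarkResult` and the tolerance this crate uses are not shown in the source. Since the two algorithms sum floating-point terms in different orders, the comparison is best done against a small tolerance rather than exact equality.

```rust
use std::time::{Duration, Instant};

/// Run a closure once and return its result with the elapsed wall-clock time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Largest element-wise absolute difference between two equally shaped outputs.
fn max_abs_diff(a: &[Vec<f32>], b: &[Vec<f32>]) -> f32 {
    assert_eq!(a.len(), b.len(), "row counts differ");
    a.iter()
        .zip(b)
        .flat_map(|(ra, rb)| {
            assert_eq!(ra.len(), rb.len(), "row lengths differ");
            ra.iter().zip(rb).map(|(x, y)| (x - y).abs())
        })
        .fold(0.0, f32::max)
}
```

A caller could time each attention variant with `time_it` and then accept the run if `max_abs_diff` of the two outputs stays below a chosen threshold such as `1e-4` (an assumed value, not one taken from this crate).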
pub fn self_attention(&self, sequence: &[Vec<f32>]) -> Vec<Vec<f32>>
Compute self-attention: a sequence attends to itself.
Convenience wrapper around attention(q, q, q).
pub fn cross_attention(
    &self,
    queries: &[Vec<f32>],
    kv_sequence: &[Vec<f32>],
) -> Vec<Vec<f32>>
Compute cross-attention between two sequences.
Queries from one sequence attend to keys/values from another.
pub fn memory_estimate(&self, seq_len: usize) -> MemoryEstimate
Estimate peak memory usage in bytes for a given sequence length.
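As a rough illustration of the arithmetic behind such an estimate, the sketch below counts only `f32` buffers. The formulas and function names are assumptions made for this example; the fields of `MemoryEstimate` and what the crate actually counts are not shown in the source. The naive path is charged for the N×N score matrix plus the N×D output, while the block-wise path is charged for the output, one block of scores, and two running scalars.

```rust
const F32_BYTES: usize = std::mem::size_of::<f32>();

/// Assumed naive peak: full N×N score matrix plus the N×D output.
fn naive_peak_bytes(seq_len: usize, dim: usize) -> usize {
    (seq_len * seq_len + seq_len * dim) * F32_BYTES
}

/// Assumed block-wise peak: N×D output, one block of scores,
/// and two scalars (running maximum and softmax denominator).
fn blockwise_peak_bytes(seq_len: usize, dim: usize, block_size: usize) -> usize {
    (seq_len * dim + block_size + 2) * F32_BYTES
}
```

Under these assumptions, a sequence of 1024 vectors with dimension 64 and a block of 64 keys needs about 4.46 MB for the naive path and about 0.26 MB for the block-wise path, and the gap widens quadratically with sequence length.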
Trait Implementations§
Auto Trait Implementations§
impl Freeze for FlashAttention
impl RefUnwindSafe for FlashAttention
impl Send for FlashAttention
impl Sync for FlashAttention
impl Unpin for FlashAttention
impl UnsafeUnpin for FlashAttention
impl UnwindSafe for FlashAttention
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pipe for T
where
    T: ?Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R
where
    Self: Sized,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R
where
    R: 'a,
Borrows self and passes that borrow into the pipe function.
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R
where
    R: 'a,
Mutably borrows self and passes that borrow into the pipe function.
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow_mut<'a, B, R>(&'a mut self, func: impl FnOnce(&'a mut B) -> R) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
Borrows self, then passes self.as_ref() into the pipe function.
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
Mutably borrows self, then passes self.as_mut() into the pipe function.
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
Borrows self, then passes self.deref() into the pipe function.
impl<T> Pointable for T
impl<T> PolicyExt for T
where
    T: ?Sized,
impl<T> Tap for T
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Immutable access to the Borrow<B> of a value.
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
Mutable access to the BorrowMut<B> of a value.
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
Immutable access to the AsRef<R> view of a value.
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
Mutable access to the AsMut<R> view of a value.
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Immutable access to the Deref::Target of a value.
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Mutable access to the Deref::Target of a value.
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
Calls .tap() only in debug builds, and is erased in release builds.
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
Calls .tap_mut() only in debug builds, and is erased in release builds.
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
Calls .tap_borrow() only in debug builds, and is erased in release builds.
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
Calls .tap_ref() only in debug builds, and is erased in release builds.
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
Calls .tap_deref() only in debug builds, and is erased in release builds.