Sparse attention mechanisms for efficient computation on long sequences.
This module provides sparse attention patterns that reduce the time and memory cost of self-attention from O(n²) in sequence length n to sub-quadratic.
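Where the sub-quadratic cost comes from, concretely: each position attends to a bounded set of other positions instead of all n of them. The standalone sketch below builds a banded (local-window) boolean mask and counts the surviving query-key pairs; it does not use the crate's `SparseMaskBuilder` API (whose methods are not documented on this page), and `local_window_mask` is a hypothetical helper for illustration only.

```rust
/// Hypothetical helper (not part of this crate's API): a banded mask in
/// which position `i` may attend only to positions within `window` of it.
fn local_window_mask(n: usize, window: usize) -> Vec<Vec<bool>> {
    (0..n)
        .map(|i| (0..n).map(|j| i.abs_diff(j) <= window).collect())
        .collect()
}

fn main() {
    let (n, window) = (1024, 8);
    let mask = local_window_mask(n, window);
    let allowed: usize = mask.iter().flatten().filter(|&&b| b).count();
    // Dense attention scores n * n pairs; the banded pattern scores
    // O(n * window) of them, which is linear in n for a fixed window.
    println!("{allowed} of {} query-key pairs attend", n * n);
}
```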
Re-exports§
pub use flash::FlashAttention;
pub use linear::LinearAttention;
pub use local_global::LocalGlobalAttention;
pub use mask::AttentionMask;
pub use mask::SparseMaskBuilder;
Modules§
- flash
- Flash attention - memory-efficient attention with tiled computation (a scalar sketch of the idea follows this list)
- linear
- Linear attention using random feature approximation (Performer-style; a sketch follows this list)
- local_global
- Local-Global attention for efficient long-range dependencies
- mask
- Sparse mask utilities for attention patterns
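The flash module's summary names the key idea: tiled computation with a streaming (online) softmax, so the full row of n attention scores is never materialized. The sketch below is not the crate's `FlashAttention` API; it is a minimal scalar rendering of that recurrence for one query, under the standard definition of the technique.

```rust
/// Online-softmax attention for a single query, processed in tiles of
/// `tile` keys. Only a running max `m`, a running normalizer `l`, and a
/// rescaled output accumulator are kept; real flash kernels apply the
/// same rescaling once per tile rather than per element.
fn attend_one_query(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>], tile: usize) -> Vec<f32> {
    let inv_sqrt_d = 1.0 / (q.len() as f32).sqrt();
    let mut m = f32::NEG_INFINITY; // max score seen so far
    let mut l = 0.0f32;            // softmax denominator so far
    let mut out = vec![0.0f32; values[0].len()];

    for start in (0..keys.len()).step_by(tile) {
        for j in start..(start + tile).min(keys.len()) {
            let s = q.iter().zip(&keys[j]).map(|(a, b)| a * b).sum::<f32>() * inv_sqrt_d;
            let m_new = m.max(s);
            let rescale = (m - m_new).exp(); // shrinks earlier partial sums
            let p = (s - m_new).exp();
            l = l * rescale + p;
            for (o, v) in out.iter_mut().zip(&values[j]) {
                *o = *o * rescale + p * v;
            }
            m = m_new;
        }
    }
    out.iter().map(|o| o / l).collect()
}

fn main() {
    let q = vec![0.5, -0.25];
    let keys = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.5, 0.5]];
    let values = vec![vec![1.0], vec![2.0], vec![3.0]];
    println!("{:?}", attend_one_query(&q, &keys, &values, 2));
}
```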
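Likewise for linear: Performer-style linear attention approximates the softmax kernel exp(q·k) with a dot product of positive random features, φ(x) = exp(wᵀx − ‖x‖²/2) / √m for Gaussian rows w, so key-value statistics can be summed once and reused by every query. The sketch below is independent of the crate's `LinearAttention` type, and the fixed `proj` matrix in `main` is a stand-in for i.i.d. N(0, 1) draws.

```rust
/// Positive random feature map: phi(x)[r] = exp(w_r . x - |x|^2 / 2) / sqrt(m).
/// With Gaussian rows w_r, E[phi(q) . phi(k)] = exp(q . k), the softmax kernel.
fn feature_map(x: &[f32], proj: &[Vec<f32>]) -> Vec<f32> {
    let m = proj.len() as f32;
    let half_sq = x.iter().map(|v| v * v).sum::<f32>() / 2.0;
    proj.iter()
        .map(|w| {
            let dot: f32 = w.iter().zip(x).map(|(a, b)| a * b).sum();
            (dot - half_sq).exp() / m.sqrt()
        })
        .collect()
}

/// out_i = phi(q_i)^T [sum_j phi(k_j) v_j^T] / (phi(q_i) . sum_j phi(k_j)).
/// The bracketed sums are built in one pass over the keys, so total cost is
/// O(n * m * d) rather than the O(n^2 * d) of exact softmax attention.
fn linear_attention(
    qs: &[Vec<f32>],
    ks: &[Vec<f32>],
    vs: &[Vec<f32>],
    proj: &[Vec<f32>],
) -> Vec<Vec<f32>> {
    let (m, dv) = (proj.len(), vs[0].len());
    let mut kv = vec![vec![0.0f32; dv]; m]; // sum_j phi(k_j) v_j^T
    let mut ksum = vec![0.0f32; m];         // sum_j phi(k_j)
    for (k, v) in ks.iter().zip(vs) {
        let fk = feature_map(k, proj);
        for r in 0..m {
            ksum[r] += fk[r];
            for c in 0..dv {
                kv[r][c] += fk[r] * v[c];
            }
        }
    }
    qs.iter()
        .map(|q| {
            let fq = feature_map(q, proj);
            let denom: f32 = fq.iter().zip(&ksum).map(|(a, b)| a * b).sum();
            (0..dv)
                .map(|c| fq.iter().enumerate().map(|(r, f)| f * kv[r][c]).sum::<f32>() / denom)
                .collect()
        })
        .collect()
}

fn main() {
    // Fixed projection for brevity; the method draws these rows from N(0, 1).
    let proj = vec![vec![0.6, -0.8], vec![-0.3, 0.9], vec![1.1, 0.2]];
    let qs = vec![vec![0.5, 0.1]];
    let ks = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let vs = vec![vec![1.0], vec![2.0]];
    println!("{:?}", linear_attention(&qs, &ks, &vs, &proj));
}
```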