Module flash

ruvector_attention::sparse


Flash attention - memory-efficient attention with tiled computation

Memory: O(block_size) for the attention scores held at any one time, instead of O(n²) for the full attention matrix, where n is the sequence length.
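The memory bound above comes from never materializing the full score matrix. The following is a minimal sketch of that idea for a single query, using a running (online) softmax over key/value blocks. It is not the crate's implementation: `tiled_attention` and its parameters are hypothetical names chosen for illustration, and the actual `FlashAttention` API may differ.

```rust
/// Hypothetical helper, NOT part of ruvector_attention: computes the attention
/// output for one query while visiting keys/values in blocks of `block_size`.
fn tiled_attention(
    query: &[f32],
    keys: &[Vec<f32>],
    values: &[Vec<f32>],
    block_size: usize,
) -> Vec<f32> {
    assert!(block_size > 0, "block_size must be positive");
    assert_eq!(keys.len(), values.len(), "one value per key");
    let dim = values.first().map_or(0, |v| v.len());
    let scale = 1.0 / (query.len() as f32).sqrt();

    let mut running_max = f32::NEG_INFINITY; // largest score seen so far
    let mut denom = 0.0f32; // softmax denominator, relative to running_max
    let mut acc = vec![0.0f32; dim]; // unnormalized weighted sum of values

    for (k_block, v_block) in keys.chunks(block_size).zip(values.chunks(block_size)) {
        // Scores for this block only: at most `block_size` entries are live.
        let scores: Vec<f32> = k_block
            .iter()
            .map(|k| k.iter().zip(query).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();

        let block_max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let new_max = running_max.max(block_max);

        // Rescale the partial results so they are relative to the new maximum.
        let correction = (running_max - new_max).exp();
        denom *= correction;
        for a in acc.iter_mut() {
            *a *= correction;
        }

        // Fold this block's contribution into the running sums.
        for (s, v) in scores.iter().zip(v_block) {
            let w = (s - new_max).exp();
            denom += w;
            for (a, x) in acc.iter_mut().zip(v) {
                *a += w * x;
            }
        }
        running_max = new_max;
    }

    // Final normalization turns the accumulated sums into softmax-weighted values.
    if denom > 0.0 {
        for a in acc.iter_mut() {
            *a /= denom;
        }
    }
    acc
}
```

Calling this sketch with `block_size` equal to the number of keys reduces it to ordinary softmax attention, and a smaller `block_size` should give the same result up to floating-point rounding while keeping only one block of scores alive at a time.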

Structs

FlashAttention: Flash attention with block-wise computation