Module attention

Attention kernels

Enums

AttentionStrategy
Strategy used to select which attention implementation to run.

Functions

attention
Launches an attention kernel using the given strategy.
attention_autotune
Runs autotuning over attention operations.
flash_attention
Launches a flash attention kernel.
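
Example

The listing above does not show signatures, so the following is only a CPU-side sketch of the computation these kernels perform, softmax(Q Kᵀ / sqrt(d)) V, and not this module's API. Every name in it (`SketchStrategy`, `attention_sketch`, `naive_attention`, `tiled_attention`) is hypothetical and introduced for illustration. The tiled variant uses the running-max and running-sum softmax update that flash attention is generally built on, assuming a single head, no masking, and a non-zero block size.

```rust
// Hypothetical stand-in for a strategy selector; NOT the crate's AttentionStrategy.
enum SketchStrategy {
    Naive,
    Tiled { block: usize },
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Materializes one full row of scores per query, then applies a stable softmax.
fn naive_attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let scale = 1.0 / (q[0].len() as f32).sqrt();
    q.iter()
        .map(|qi| {
            let scores: Vec<f32> = k.iter().map(|kj| dot(qi, kj) * scale).collect();
            let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
            let weights: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
            let sum: f32 = weights.iter().sum();
            let mut out = vec![0.0f32; v[0].len()];
            for (w, vj) in weights.iter().zip(v) {
                for (o, x) in out.iter_mut().zip(vj) {
                    *o += w / sum * x;
                }
            }
            out
        })
        .collect()
}

/// Walks keys/values in blocks, keeping only a running max `m`, a running
/// normalizer `l`, and an unnormalized accumulator per query row.
/// `block` must be non-zero (`chunks` panics on 0).
fn tiled_attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>], block: usize) -> Vec<Vec<f32>> {
    let scale = 1.0 / (q[0].len() as f32).sqrt();
    let d_v = v[0].len();
    q.iter()
        .map(|qi| {
            let mut m = f32::NEG_INFINITY;
            let mut l = 0.0f32;
            let mut acc = vec![0.0f32; d_v];
            for (kb, vb) in k.chunks(block).zip(v.chunks(block)) {
                let scores: Vec<f32> = kb.iter().map(|kj| dot(qi, kj) * scale).collect();
                let block_max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
                let m_new = m.max(block_max);
                // Rescale earlier partial results to the new max (0 on the first block).
                let correction = (m - m_new).exp();
                l *= correction;
                for a in acc.iter_mut() {
                    *a *= correction;
                }
                for (s, vj) in scores.iter().zip(vb) {
                    let p = (s - m_new).exp();
                    l += p;
                    for (a, x) in acc.iter_mut().zip(vj) {
                        *a += p * x;
                    }
                }
                m = m_new;
            }
            acc.iter().map(|a| a / l).collect()
        })
        .collect()
}

/// Dispatches on the hypothetical strategy, mirroring the shape of an
/// "attention with a given strategy" entry point.
fn attention_sketch(
    q: &[Vec<f32>],
    k: &[Vec<f32>],
    v: &[Vec<f32>],
    strategy: SketchStrategy,
) -> Vec<Vec<f32>> {
    match strategy {
        SketchStrategy::Naive => naive_attention(q, k, v),
        SketchStrategy::Tiled { block } => tiled_attention(q, k, v, block),
    }
}
```

Both strategies should agree up to floating-point rounding. The tiled form never holds a full row of scores at once, which is the property that makes the flash-style approach attractive on GPUs.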