Expand description
Scaled dot-product attention kernel.
Matches attention-kernel-v1.yaml.
Attention(Q, K, V) = softmax(Q * K^T / sqrt(d_k)) * V
Each function provides one of three backends:
fn attention_scalar(...)– Pure Rust scalar reference (ground truth)unsafe fn attention_avx2(...)– AVX2 SIMD implementationfn attention_ptx() -> &'static str– PTX assembly source string
Functions§
- attention_
avx2 ⚠ - AVX2 scaled dot-product attention – delegates to scalar.
- attention_
ptx - PTX assembly for fused scaled dot-product attention.
- attention_
scalar - Scaled dot-product attention (scalar reference).