Module gqa

provable_contracts::kernels

Grouped Query Attention kernel.

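Matches gqa-kernel-v1.yaml. KV head broadcasting: each query head reads from the KV head given by kv_head = query_head / (num_heads / num_kv_heads), where both divisions are integer divisions, so consecutive query heads share one KV head.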
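The broadcasting rule above can be sketched as a small index helper. This is an illustration only: the function name `kv_head_for` and the requirement that `num_heads` be an exact multiple of `num_kv_heads` are assumptions, not part of the documented API.

```rust
/// Hypothetical helper (not part of this module): returns the KV head that a
/// given query head reads from under grouped query attention.
///
/// Assumes `num_heads` is an exact multiple of `num_kv_heads`.
fn kv_head_for(query_head: usize, num_heads: usize, num_kv_heads: usize) -> usize {
    assert!(num_kv_heads > 0, "need at least one KV head");
    assert!(num_heads % num_kv_heads == 0, "num_heads must be a multiple of num_kv_heads");
    assert!(query_head < num_heads, "query head index out of range");

    // Number of query heads that share a single KV head.
    let group_size = num_heads / num_kv_heads;

    // Integer division maps each run of `group_size` query heads to one KV head.
    query_head / group_size
}
```

With num_heads = 8 and num_kv_heads = 2, query heads 0 to 3 map to KV head 0 and query heads 4 to 7 map to KV head 1. When num_kv_heads equals num_heads the mapping is the identity (ordinary multi-head attention), and when num_kv_heads is 1 every query head shares KV head 0 (multi-query attention).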

The module exposes three backends, one function per backend:

fn gqa_scalar(...) – Pure Rust scalar reference (ground truth)
unsafe fn gqa_avx2(...) – AVX2 SIMD implementation
fn gqa_ptx() -> &'static str – PTX assembly source string
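To make the role of a scalar reference concrete, here is a minimal sketch of grouped query attention for a single query token. It is not the crate's `gqa_scalar`: the name `gqa_single_token_sketch`, the signature, the flat row-major layouts, and the absence of masking are all assumptions chosen for illustration, since the page elides the real signatures.

```rust
/// Hypothetical single-token GQA reference (illustrative; not this module's API).
///
/// Assumed layouts, all row-major:
///   q: [num_heads, head_dim]
///   k: [num_kv_heads, seq_len, head_dim]
///   v: [num_kv_heads, seq_len, head_dim]
/// Returns the attention output with layout [num_heads, head_dim].
fn gqa_single_token_sketch(
    q: &[f32],
    k: &[f32],
    v: &[f32],
    num_heads: usize,
    num_kv_heads: usize,
    seq_len: usize,
    head_dim: usize,
) -> Vec<f32> {
    assert!(num_kv_heads > 0 && num_heads % num_kv_heads == 0);
    assert!(seq_len > 0 && head_dim > 0);
    assert_eq!(q.len(), num_heads * head_dim);
    assert_eq!(k.len(), num_kv_heads * seq_len * head_dim);
    assert_eq!(v.len(), k.len());

    let group_size = num_heads / num_kv_heads;
    let scale = 1.0 / (head_dim as f32).sqrt();
    let mut out = vec![0.0f32; num_heads * head_dim];

    for h in 0..num_heads {
        // KV head broadcasting: several query heads share one KV head.
        let kv = h / group_size;
        let kv_base = kv * seq_len * head_dim;
        let q_h = &q[h * head_dim..(h + 1) * head_dim];

        // Scaled dot-product score of this query head against every key.
        let mut scores: Vec<f32> = (0..seq_len)
            .map(|t| {
                let k_t = &k[kv_base + t * head_dim..kv_base + (t + 1) * head_dim];
                scale * q_h.iter().zip(k_t).map(|(a, b)| a * b).sum::<f32>()
            })
            .collect();

        // Numerically stable softmax: subtract the maximum before exponentiating.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut denom = 0.0f32;
        for s in scores.iter_mut() {
            *s = (*s - max).exp();
            denom += *s;
        }

        // Weighted sum of the value rows of the shared KV head.
        let out_h = &mut out[h * head_dim..(h + 1) * head_dim];
        for t in 0..seq_len {
            let w = scores[t] / denom;
            let v_t = &v[kv_base + t * head_dim..kv_base + (t + 1) * head_dim];
            for (o, x) in out_h.iter_mut().zip(v_t) {
                *o += w * x;
            }
        }
    }
    out
}
```

A simple sanity check for a sketch like this: with seq_len = 1 the softmax weight is exactly 1, so each query head's output equals the single value row of its shared KV head. A SIMD or GPU backend would typically be validated by comparing its output against such a scalar version within a floating-point tolerance.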

Functions§

gqa_avx2 ⚠ (unsafe): AVX2 Grouped Query Attention – delegates to scalar.
gqa_ptx: PTX assembly for Grouped Query Attention.
gqa_scalar: Grouped Query Attention (scalar reference).