flash_attention op-diff harness — see crate::op_diff.
Dense causal attention. CpuBackend implements the reference; Metal/CUDA run their flash kernels against the same Q/K/V.
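
To make the comparison concrete, here is a minimal sketch of what a dense causal attention reference and an op-diff metric could look like. It is not the crate's actual `CpuBackend` code: the names `causal_attention_ref` and `max_abs_diff`, the single-head row-major `[seq_len, head_dim]` layout, and the `1/sqrt(head_dim)` scaling are all assumptions made for illustration.

```rust
/// Hypothetical reference: single-head dense causal attention.
/// `q`, `k`, `v` are assumed to be row-major `[seq_len, head_dim]` buffers.
fn causal_attention_ref(
    q: &[f32],
    k: &[f32],
    v: &[f32],
    seq_len: usize,
    head_dim: usize,
) -> Vec<f32> {
    assert_eq!(q.len(), seq_len * head_dim);
    assert_eq!(k.len(), seq_len * head_dim);
    assert_eq!(v.len(), seq_len * head_dim);

    // Assumed scaling factor: 1 / sqrt(head_dim).
    let scale = 1.0 / (head_dim as f32).sqrt();
    let mut out = vec![0.0f32; seq_len * head_dim];

    for i in 0..seq_len {
        let qi = &q[i * head_dim..(i + 1) * head_dim];

        // Causal mask: query i only sees keys 0..=i.
        let scores: Vec<f32> = (0..=i)
            .map(|j| {
                let kj = &k[j * head_dim..(j + 1) * head_dim];
                qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() * scale
            })
            .collect();

        // Numerically stable softmax over the visible keys.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
        let denom: f32 = exps.iter().sum();

        // Weighted sum of the value rows.
        for (j, e) in exps.iter().enumerate() {
            let w = e / denom;
            for d in 0..head_dim {
                out[i * head_dim + d] += w * v[j * head_dim + d];
            }
        }
    }
    out
}

/// Hypothetical diff metric: largest element-wise absolute difference
/// between a reference output and a kernel output.
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}
```

A harness along these lines would pass the same Q/K/V to the reference and to a GPU flash kernel, then check that `max_abs_diff` stays under a chosen tolerance. The tolerance itself is a design choice: flash kernels accumulate the softmax in tiles, so small floating-point differences from the dense reference are expected.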