pub fn subgroup_f_add(val: f32) -> f32
Subgroup (wave/warp) floating-point add reduction.
Returns the sum of val across all invocations in the subgroup.
On SPIR-V: calls spirv_std::arch::subgroup_f_add.
On CUDA: uses warp-level __shfl_xor_sync butterfly reduction.
On CPU: returns val unchanged (subgroup size = 1).
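
The XOR butterfly pattern mentioned for the CUDA backend can be modeled on the host as a rough sketch. This is not the crate's implementation: `WARP_SIZE` and `butterfly_f_add` are hypothetical names introduced here, a 32-lane warp is assumed, and an array stands in for the per-lane registers that `__shfl_xor_sync` would exchange. In each round, lane `i` adds the value held by lane `i ^ offset`, and the offset halves until it reaches zero, so after five rounds every lane holds the full sum.

```rust
/// Assumed warp width for this sketch (hypothetical constant, not from the crate).
const WARP_SIZE: usize = 32;

/// Host-side model of an XOR-butterfly add reduction over one warp.
/// `lanes[i]` is the value contributed by lane `i`; the returned array
/// holds what each lane would see after the reduction.
fn butterfly_f_add(mut lanes: [f32; WARP_SIZE]) -> [f32; WARP_SIZE] {
    let mut offset = WARP_SIZE / 2;
    while offset > 0 {
        // Snapshot the current values so every lane reads its partner's
        // value from before this round, as a warp shuffle would.
        let prev = lanes;
        for lane in 0..WARP_SIZE {
            lanes[lane] = prev[lane] + prev[lane ^ offset];
        }
        offset /= 2;
    }
    lanes
}
```

Passing the lane indices 0 through 31 as input leaves 496.0 in every lane. Because floating-point addition is not associative, a butterfly order like this may differ in the last bits from a sequential sum on inputs that are not exactly representable.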