Sub-group optimized SPIR-V kernel generators for Intel GPUs.
This module provides SPIR-V generators that use Intel GPU sub-group operations (a sub-group is analogous to a CUDA warp) for efficient communication between work-items in the same sub-group:
- reduction_subgroup_spirv - Two-phase sub-group reduction
- scan_subgroup_spirv - Inclusive prefix sum via sub-group scan
- gemm_subgroup_spirv - GEMM with sub-group shuffle for A-row broadcast
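To make the reduction and scan patterns above concrete, here is a minimal CPU-side sketch in plain Rust. It does not call this module or generate SPIR-V; it only mimics the data flow. The names `SUBGROUP_SIZE`, `two_phase_reduce`, and `inclusive_scan` are hypothetical, the sub-group width of 8 is an arbitrary choice for illustration, and the split into "per-sub-group partials, then a combine step" is an assumption about what "two-phase" means here.

```rust
/// Hypothetical sub-group width, chosen only for illustration.
const SUBGROUP_SIZE: usize = 8;

/// Phase 1: each sub-group reduces its own lanes to one partial sum.
/// Phase 2: the per-sub-group partials are reduced to a single value.
fn two_phase_reduce(data: &[f32]) -> f32 {
    let partials: Vec<f32> = data
        .chunks(SUBGROUP_SIZE)
        .map(|lanes| lanes.iter().sum::<f32>())
        .collect();
    partials.iter().sum()
}

/// Inclusive prefix sum: scan inside each sub-group, then add the running
/// total of all preceding sub-groups (the carry) to every lane.
fn inclusive_scan(data: &[f32]) -> Vec<f32> {
    let mut out = Vec::with_capacity(data.len());
    let mut carry = 0.0f32;
    for lanes in data.chunks(SUBGROUP_SIZE) {
        let mut running = 0.0f32;
        for &x in lanes {
            running += x;
            out.push(carry + running);
        }
        carry += running;
    }
    out
}
```

On a GPU the inner loops would be replaced by sub-group collective instructions, so each lane obtains its result without a round trip through shared memory; the sketch keeps only the arithmetic.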
All kernels use the OpenCL SPIR-V execution model (Kernel) with
Physical64/OpenCL memory model and require GroupNonUniform family
capabilities.
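As an illustration of what such a module preamble looks like at the word level, the following sketch emits a SPIR-V header, a few `OpCapability` instructions, and `OpMemoryModel Physical64 OpenCL`. The numeric values come from the SPIR-V specification; the function names `inst` and `preamble`, the choice of SPIR-V version 1.3, and the exact set of GroupNonUniform capabilities are assumptions, not a description of what this module actually emits.

```rust
// Opcode and enum values as defined in the SPIR-V specification.
const SPIRV_MAGIC: u32 = 0x0723_0203;
const SPIRV_VERSION_1_3: u32 = 0x0001_0300; // assumed version
const OP_MEMORY_MODEL: u32 = 14;
const OP_CAPABILITY: u32 = 17;
const CAP_ADDRESSES: u32 = 4; // needed for Physical64 addressing
const CAP_KERNEL: u32 = 6;
const CAP_GROUP_NON_UNIFORM: u32 = 61;
const CAP_GROUP_NON_UNIFORM_ARITHMETIC: u32 = 63;
const CAP_GROUP_NON_UNIFORM_SHUFFLE: u32 = 65;
const ADDRESSING_PHYSICAL64: u32 = 2;
const MEMORY_MODEL_OPENCL: u32 = 2;

/// Encode one instruction: the first word packs (word count << 16) | opcode.
fn inst(opcode: u32, operands: &[u32]) -> Vec<u32> {
    let word_count = (operands.len() as u32) + 1;
    let mut words = vec![(word_count << 16) | opcode];
    words.extend_from_slice(operands);
    words
}

/// Hypothetical helper: header, capabilities, and memory model only.
/// `bound` must exceed every result id used later in the module.
fn preamble(bound: u32) -> Vec<u32> {
    // Header: magic, version, generator id (0 = unregistered), bound, schema.
    let mut words = vec![SPIRV_MAGIC, SPIRV_VERSION_1_3, 0, bound, 0];
    for cap in [
        CAP_ADDRESSES,
        CAP_KERNEL,
        CAP_GROUP_NON_UNIFORM,
        CAP_GROUP_NON_UNIFORM_ARITHMETIC,
        CAP_GROUP_NON_UNIFORM_SHUFFLE,
    ] {
        words.extend(inst(OP_CAPABILITY, &[cap]));
    }
    words.extend(inst(
        OP_MEMORY_MODEL,
        &[ADDRESSING_PHYSICAL64, MEMORY_MODEL_OPENCL],
    ));
    words
}
```

A complete module would continue with `OpEntryPoint` using the `Kernel` execution model, followed by type declarations and the function body; those parts are omitted here.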
Functions
- gemm_subgroup_spirv - Generate an OpenCL SPIR-V compute kernel for GEMM with sub-group shuffle.
- reduction_subgroup_spirv - Generate an OpenCL SPIR-V compute kernel for sub-group optimized reduction.
- scan_subgroup_spirv - Generate an OpenCL SPIR-V compute kernel for sub-group scan (prefix sum).