pub fn cluster_launch<A: KernelArgs>(
kernel: &Kernel,
params: &ClusterLaunchParams,
stream: &Stream,
args: &A,
) -> CudaResult<()>Expand description
Launches a kernel with thread block cluster configuration.
On Hopper+ GPUs (compute capability 9.0+), this groups thread blocks into clusters for enhanced cooperation via distributed shared memory.
This function validates the cluster parameters and delegates to the
standard kernel launch. On hardware that supports clusters natively,
the CUDA driver would use cuLaunchKernelEx with cluster attributes.
§Parameters
kernel— the kernel to launch.params— cluster-aware launch parameters.stream— the stream to launch on.args— kernel arguments.
§Errors
Returns CudaError::InvalidValue if the parameters are invalid
(zero dimensions, grid not divisible by cluster, etc.), or another
error from the underlying kernel launch.