Function rcudnn_sys::cudaLaunchCooperativeKernel
pub unsafe extern "C" fn cudaLaunchCooperativeKernel(
func: *const c_void,
gridDim: dim3,
blockDim: dim3,
args: *mut *mut c_void,
sharedMem: usize,
stream: cudaStream_t
) -> cudaError_t
Launches a device function where thread blocks can cooperate and synchronize as they execute.
The function invokes kernel func on a gridDim (gridDim.x × gridDim.y × gridDim.z) grid of blocks. Each block contains blockDim (blockDim.x × blockDim.y × blockDim.z) threads.
The device on which this kernel is invoked must have a non-zero value for the device attribute ::cudaDevAttrCooperativeLaunch.
The total number of blocks launched cannot exceed the maximum number of blocks per multiprocessor as returned by ::cudaOccupancyMaxActiveBlocksPerMultiprocessor (or ::cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags) times the number of multiprocessors as specified by the device attribute ::cudaDevAttrMultiProcessorCount.
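The block-count limit above is simple arithmetic, and it can be checked on the host before launching. The following is a minimal sketch, not part of the bindings: `Dim3` is a local stand-in for the crate's `dim3`, and `blocks_per_sm` and `sm_count` are assumed to have been obtained beforehand from the occupancy query and the ::cudaDevAttrMultiProcessorCount attribute.

```rust
/// Local stand-in for the crate's `dim3`, so the sketch compiles on its own.
#[derive(Clone, Copy, Debug)]
pub struct Dim3 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
}

impl Dim3 {
    /// Number of elements spanned by the three dimensions (x * y * z).
    /// Widened to u64 so the product cannot overflow.
    pub fn volume(self) -> u64 {
        u64::from(self.x) * u64::from(self.y) * u64::from(self.z)
    }
}

/// Upper bound on the total number of blocks in a cooperative launch:
/// maximum active blocks per multiprocessor times the multiprocessor count.
pub fn max_cooperative_blocks(blocks_per_sm: u32, sm_count: u32) -> u64 {
    u64::from(blocks_per_sm) * u64::from(sm_count)
}

/// Returns true when the grid does not exceed the cooperative launch limit.
pub fn grid_fits(grid: Dim3, blocks_per_sm: u32, sm_count: u32) -> bool {
    grid.volume() <= max_cooperative_blocks(blocks_per_sm, sm_count)
}
```

With hypothetical values of 2 blocks per multiprocessor and 80 multiprocessors, a grid of 160 blocks passes the check and a grid of 161 blocks does not.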
The kernel cannot make use of CUDA dynamic parallelism.
If the kernel has N parameters, args should point to an array of N pointers. Each pointer, from args[0] to args[N - 1], points to the region of memory from which the actual parameter will be copied.
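The layout of args can be illustrated from the Rust side. This sketch assumes a hypothetical kernel declared as `__global__ void scale(float *data, float factor, int n)`; `ScaleParams` and `kernel_args` are illustrative names, not part of the bindings. Each entry holds the address of a parameter value, so for a device pointer the entry points at the pointer variable itself, not at device memory.

```rust
use std::ffi::c_void;

/// Host-side copies of the parameters of the hypothetical `scale` kernel.
pub struct ScaleParams {
    pub data: *mut f32, // device pointer, passed by value like any other parameter
    pub factor: f32,
    pub n: i32,
}

/// Builds the array of N = 3 pointers expected in `args`.
/// Entry i is the address of the i-th parameter value.
/// `p` must stay alive and unmoved until the launch call has returned.
pub fn kernel_args(p: &mut ScaleParams) -> [*mut c_void; 3] {
    [
        &mut p.data as *mut *mut f32 as *mut c_void,
        &mut p.factor as *mut f32 as *mut c_void,
        &mut p.n as *mut i32 as *mut c_void,
    ]
}
```

The resulting array can be handed to the launch call as `args.as_mut_ptr()`, which has the `*mut *mut c_void` type shown in the signature.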
For templated functions, pass the function symbol as follows: func_name<template_arg_0,…,template_arg_N>
sharedMem sets the amount of dynamic shared memory, in bytes, that will be available to each thread block.
stream specifies the stream with which the invocation is associated.
Parameters:
func - Device function symbol
gridDim - Grid dimensions
blockDim - Block dimensions
args - Arguments
sharedMem - Shared memory
stream - Stream identifier
Returns: ::cudaSuccess, ::cudaErrorInvalidDeviceFunction, ::cudaErrorInvalidConfiguration, ::cudaErrorLaunchFailure, ::cudaErrorLaunchTimeout, ::cudaErrorLaunchOutOfResources, ::cudaErrorCooperativeLaunchTooLarge, ::cudaErrorSharedObjectInitFailed
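Because the function reports failure only through its return value, callers may want to convert that value into a `Result` before continuing. The sketch below shows one way to do this. `LaunchStatus` is a local stand-in that mirrors only the codes listed above; the real `cudaError_t` comes from the bindings and its Rust representation may differ.

```rust
/// Local stand-in for the subset of `cudaError_t` values listed above.
/// This is an assumption for illustration, not the type from the bindings.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LaunchStatus {
    Success,
    InvalidDeviceFunction,
    InvalidConfiguration,
    LaunchFailure,
    LaunchTimeout,
    LaunchOutOfResources,
    CooperativeLaunchTooLarge,
    SharedObjectInitFailed,
}

/// Converts a launch status into a `Result` so callers can propagate
/// failures with the `?` operator.
pub fn check(status: LaunchStatus) -> Result<(), LaunchStatus> {
    match status {
        LaunchStatus::Success => Ok(()),
        err => Err(err),
    }
}
```

A caller can then distinguish, for example, `Err(LaunchStatus::CooperativeLaunchTooLarge)` (the grid exceeds the block limit) from other launch failures and retry with a smaller grid.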
See also: cudaLaunchCooperativeKernel (C++ API), ::cudaLaunchCooperativeKernelMultiDevice, ::cuLaunchCooperativeKernel