Function rcudnn_sys::cudaLaunchCooperativeKernel

pub unsafe extern "C" fn cudaLaunchCooperativeKernel(
    func: *const c_void, 
    gridDim: dim3, 
    blockDim: dim3, 
    args: *mut *mut c_void, 
    sharedMem: usize, 
    stream: cudaStream_t
) -> cudaError_t


Launches a device function where thread blocks can cooperate and synchronize as they execute.

The function invokes kernel func on a gridDim (gridDim.x × gridDim.y × gridDim.z) grid of blocks. Each block contains blockDim (blockDim.x × blockDim.y × blockDim.z) threads.
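The launch geometry described above can be sketched in plain Rust. This is a minimal illustration, not part of the crate: `Dim3` is a hypothetical stand-in for the bindgen-generated `dim3` (assumed here to carry three unsigned 32-bit extents named x, y and z), and `volume` and `total_threads` are helper names introduced only for this example.

```rust
/// Hypothetical stand-in for the bindgen-generated `dim3`
/// (assumed: three unsigned 32-bit extents named x, y and z).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Dim3 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
}

impl Dim3 {
    /// Product x * y * z, or `None` if it does not fit in a u64.
    pub fn volume(self) -> Option<u64> {
        u64::from(self.x)
            .checked_mul(u64::from(self.y))?
            .checked_mul(u64::from(self.z))
    }
}

/// Total number of threads in a launch: (blocks in the grid) * (threads per block).
pub fn total_threads(grid: Dim3, block: Dim3) -> Option<u64> {
    grid.volume()?.checked_mul(block.volume()?)
}
```

With a 4 × 2 × 1 grid and 256 × 1 × 1 blocks, `volume` gives 8 blocks and `total_threads` gives 2048 threads. Checked arithmetic is used because three 32-bit extents can overflow a 64-bit product.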

The device on which this kernel is invoked must have a non-zero value for the device attribute cudaDevAttrCooperativeLaunch.

The total number of blocks launched cannot exceed the maximum number of blocks per multiprocessor, as returned by cudaOccupancyMaxActiveBlocksPerMultiprocessor (or cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags), multiplied by the number of multiprocessors, as specified by the device attribute cudaDevAttrMultiProcessorCount.
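The block-count limit above is a simple product, which can be checked on the host before launching. The sketch below assumes the two inputs have already been obtained from the occupancy query and the device attribute and converted to unsigned integers; `max_cooperative_blocks` and `fits_cooperative_launch` are hypothetical helper names, not functions provided by this crate.

```rust
/// Upper bound on the grid size of a cooperative launch:
/// (max active blocks per multiprocessor) * (number of multiprocessors).
/// Both inputs are assumed to come from the occupancy query and the
/// cudaDevAttrMultiProcessorCount attribute, respectively.
pub fn max_cooperative_blocks(max_blocks_per_sm: u32, sm_count: u32) -> u64 {
    // Widening to u64 first means the product of two u32 values cannot overflow.
    u64::from(max_blocks_per_sm) * u64::from(sm_count)
}

/// Returns true if a grid with `grid_blocks` blocks stays within the bound.
pub fn fits_cooperative_launch(grid_blocks: u64, max_blocks_per_sm: u32, sm_count: u32) -> bool {
    grid_blocks <= max_cooperative_blocks(max_blocks_per_sm, sm_count)
}
```

For instance, with 2 active blocks per multiprocessor on a device with 80 multiprocessors, the bound is 160 blocks, so a 161-block grid would be expected to fail with cudaErrorCooperativeLaunchTooLarge.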

The kernel cannot make use of CUDA dynamic parallelism.

If the kernel has N parameters, args should point to an array of N pointers. Each pointer, from args[0] to args[N - 1], points to the region of memory from which the actual parameter will be copied.
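One way to assemble such an array from Rust is shown below. This is a sketch under stated assumptions: the kernel signature `kernel(float *data, int n, float scale)` is hypothetical, and `build_args` is a helper introduced only for illustration. Each slot holds the address of a host variable containing the argument value, not the value itself.

```rust
use std::os::raw::c_void;

/// Builds the `args` array for a hypothetical kernel declared as
/// `kernel(float *data, int n, float scale)`.
///
/// Each entry is the address of the host variable holding that argument.
/// The returned raw pointers carry no lifetime, so the caller must keep
/// `data`, `n` and `scale` alive until the launch call has returned.
pub fn build_args(
    data: &mut *mut f32,
    n: &mut i32,
    scale: &mut f32,
) -> [*mut c_void; 3] {
    [
        data as *mut *mut f32 as *mut c_void, // args[0]: address of the device pointer
        n as *mut i32 as *mut c_void,         // args[1]: address of the element count
        scale as *mut f32 as *mut c_void,     // args[2]: address of the scale factor
    ]
}
```

Binding the result to a mutable variable and calling `as_mut_ptr()` on it yields a `*mut *mut c_void`, which matches the type of the args parameter in the signature above.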

For templated functions, pass the function symbol as follows: func_name<template_arg_0,…,template_arg_N>

sharedMem sets the amount of dynamic shared memory, in bytes, that will be available to each thread block.
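Because sharedMem is a byte count, a value derived from an element count has to be scaled by the element size. A minimal sketch follows, with `dynamic_shared_bytes` as a hypothetical helper name; it assumes the host-side type `T` has the same size as the element type used in the kernel.

```rust
use std::mem::size_of;

/// Bytes of dynamic shared memory needed for `elems_per_block` elements of `T`,
/// or `None` if the multiplication overflows `usize`.
/// Assumes `T` has the same size on the host as in the device code.
pub fn dynamic_shared_bytes<T>(elems_per_block: usize) -> Option<usize> {
    elems_per_block.checked_mul(size_of::<T>())
}
```

For a block that needs one f32 per thread with 256 threads, `dynamic_shared_bytes::<f32>(256)` gives `Some(1024)`.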

stream specifies the stream with which the invocation is associated.

Parameters:
func - Device function symbol
gridDim - Grid dimensions
blockDim - Block dimensions
args - Arguments
sharedMem - Shared memory
stream - Stream identifier

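Returns: cudaSuccess, cudaErrorInvalidDeviceFunction, cudaErrorInvalidConfiguration, cudaErrorLaunchFailure, cudaErrorLaunchTimeout, cudaErrorLaunchOutOfResources, cudaErrorCooperativeLaunchTooLarge, cudaErrorSharedObjectInitFailed

Notes:
This function uses standard default stream semantics.
This function may also return error codes from previous, asynchronous launches.
This function may also return cudaErrorInitializationError, cudaErrorInsufficientDriver or cudaErrorNoDevice if this call tries to initialize internal CUDA RT state.
As specified by cudaStreamAddCallback, no CUDA function may be called from a callback. cudaErrorNotPermitted may, but is not guaranteed to, be returned as a diagnostic in such a case.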

See also: cudaLaunchCooperativeKernel (C++ API), cudaLaunchCooperativeKernelMultiDevice, cuLaunchCooperativeKernel