Trait cudarc::driver::safe::LaunchAsync
pub unsafe trait LaunchAsync<Params> {
// Required methods
unsafe fn launch(
self,
cfg: LaunchConfig,
params: Params
) -> Result<(), DriverError>;
unsafe fn launch_on_stream(
self,
stream: &CudaStream,
cfg: LaunchConfig,
params: Params
) -> Result<(), DriverError>;
}
Consumes a CudaFunction to execute asynchronously on the device with params determined by the generic parameter Params.
This is implemented multiple times for different numbers and types of params. In general, the parameter types in Params should implement DeviceRepr.
let my_kernel: CudaFunction = dev.get_func("my_module", "my_kernel").unwrap();
let cfg: LaunchConfig = LaunchConfig {
    grid_dim: (1, 1, 1),
    block_dim: (1, 1, 1),
    shared_mem_bytes: 0,
};
let params = (1i32, 2u64, 3usize);
unsafe { my_kernel.launch(cfg, params) }.unwrap();
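The remark above that the trait is implemented once per shape of parameter tuple can be pictured with a small stand-in in plain Rust. This is only a sketch under assumptions: `FakeRepr`, `FakeKernel`, `FakeLaunch` and `arg_sizes` are hypothetical names invented for illustration, not cudarc items, and nothing is launched on a device.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a "device representable" marker trait.
/// This is NOT cudarc's `DeviceRepr`; it only marks plain copyable values.
trait FakeRepr: Copy {}
impl FakeRepr for i32 {}
impl FakeRepr for u64 {}
impl FakeRepr for usize {}
impl FakeRepr for f32 {}

/// Hypothetical stand-in for a kernel handle.
struct FakeKernel;

/// Generic over the whole parameter tuple, like the trait documented here.
trait FakeLaunch<Params> {
    /// Reports the byte size of each argument instead of launching anything.
    fn arg_sizes(&self, params: Params) -> Vec<usize>;
}

// One impl per tuple arity; the compiler picks the impl from the tuple's type.
impl<A: FakeRepr> FakeLaunch<(A,)> for FakeKernel {
    fn arg_sizes(&self, _params: (A,)) -> Vec<usize> {
        vec![size_of::<A>()]
    }
}

impl<A: FakeRepr, B: FakeRepr> FakeLaunch<(A, B)> for FakeKernel {
    fn arg_sizes(&self, _params: (A, B)) -> Vec<usize> {
        vec![size_of::<A>(), size_of::<B>()]
    }
}

impl<A: FakeRepr, B: FakeRepr, C: FakeRepr> FakeLaunch<(A, B, C)> for FakeKernel {
    fn arg_sizes(&self, _params: (A, B, C)) -> Vec<usize> {
        vec![size_of::<A>(), size_of::<B>(), size_of::<C>()]
    }
}
```

Calling `FakeKernel.arg_sizes((1i32, 2u64, 3usize))` selects the three-element impl purely from the tuple type. Nothing in the type system ties that tuple to what a real kernel expects, which is why the real trait and its methods are marked unsafe.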
§Safety
This is essentially never safe, because there’s no guarantee that Params will work for any CudaFunction passed in. Great care should be taken to ensure that the CudaFunction works with Params and that the correct parameters have &mut in front of them.
Additionally, kernels can mutate data that is marked as immutable, such as &CudaSlice<T>.
See LaunchAsync::launch for more details.
Required Methods§
unsafe fn launch(
self,
cfg: LaunchConfig,
params: Params
) -> Result<(), DriverError>
Launches the CudaFunction with the corresponding Params.
§Safety
This method is very unsafe.
See the CUDA documentation notes on this as well: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#functions
- params can be changed regardless of & or &mut usage.
- params will be changed at some later point after the function returns, because the kernel is executed asynchronously.
- There are no guarantees that the params are the correct number/types/order for func.
- Specifying the wrong values for LaunchConfig can result in accessing/modifying values past memory limits.
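The last point can be illustrated with a small host-side sketch of how a one-dimensional launch shape might be derived from an element count. It assumes a local `SketchConfig` struct that only mirrors the three fields shown in the example at the top of this page. `SketchConfig`, `config_for_len` and `threads_x` are hypothetical names, not part of cudarc.

```rust
/// Hypothetical mirror of the three `LaunchConfig` fields used on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SketchConfig {
    grid_dim: (u32, u32, u32),
    block_dim: (u32, u32, u32),
    shared_mem_bytes: u32,
}

/// Builds a 1-D launch shape with at least `n` threads in total.
/// Returns `None` for shapes that would be zero-sized.
fn config_for_len(n: u32, threads_per_block: u32) -> Option<SketchConfig> {
    if n == 0 || threads_per_block == 0 {
        return None;
    }
    // Ceiling division written so that it cannot overflow `u32`.
    let blocks = n / threads_per_block + u32::from(n % threads_per_block != 0);
    Some(SketchConfig {
        grid_dim: (blocks, 1, 1),
        block_dim: (threads_per_block, 1, 1),
        shared_mem_bytes: 0,
    })
}

/// Total number of threads along x for a 1-D config.
fn threads_x(cfg: &SketchConfig) -> u64 {
    u64::from(cfg.grid_dim.0) * u64::from(cfg.block_dim.0)
}
```

For 1000 elements and 256 threads per block this yields 4 blocks, that is 1024 threads. Because the thread count is rounded up, the kernel itself would still need a bounds check on its index (for example `if (i < n)`) to avoid touching memory past the end of the buffer.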
§Asynchronous mutation
Since this library queues kernels to be launched on a single stream, and really the only way to modify crate::driver::CudaSlice is through kernels, mutating the same crate::driver::CudaSlice with multiple kernels is safe. This is because each kernel is executed sequentially on the stream.
Modifying a value on the host that is in use by a kernel is undefined behavior, but it is hard to do accidentally.
For the same reason, do not pass any values to kernels that can be modified on the host. This is why DeviceRepr is not implemented for Rust primitive references.
§Use after free
Since the drop implementation for crate::driver::CudaSlice also occurs on the device’s single stream, any kernels launched before the drop will complete before the value is actually freed.
If you launch a kernel or drop a value on a different stream, this may not hold.
unsafe fn launch_on_stream(
self,
stream: &CudaStream,
cfg: LaunchConfig,
params: Params
) -> Result<(), DriverError>
Launch the function on a stream concurrent to the device’s default work stream.
§Safety
This method is even more unsafe than LaunchAsync::launch. All the same rules apply, except that kernels may now execute in parallel with each other.
That means that if any of the kernels modify the same memory location, you’ll get race conditions or potentially undefined behavior.