Safe Rust wrappers for the CUDA Runtime API.
The Runtime API is "higher level" than the Driver API: contexts are
implicit (each device has a primary context the runtime uses
automatically), kernels are typically linked at build time by nvcc,
and most operations dispatch to the current thread's current device.
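The "current thread's current device" rule can be pictured with a small pure-Rust model. This is only a sketch: it makes no CUDA calls, and `CURRENT_DEVICE`, `set_device`, `current_device`, and `device_seen_by_new_thread` are hypothetical names chosen for illustration, not part of baracuda-runtime. It assumes the documented runtime behavior that each host thread starts on device 0 and that `cudaSetDevice` affects only the calling thread.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical stand-in for the runtime's per-thread "current device".
    // Every host thread starts out on device 0.
    static CURRENT_DEVICE: Cell<i32> = Cell::new(0);
}

/// Models `cudaSetDevice`: changes the current device of the calling thread only.
fn set_device(ordinal: i32) {
    CURRENT_DEVICE.with(|d| d.set(ordinal));
}

/// Models `cudaGetDevice`: reads the calling thread's current device.
fn current_device() -> i32 {
    CURRENT_DEVICE.with(|d| d.get())
}

/// Spawns a fresh thread and reports which device it sees.
fn device_seen_by_new_thread() -> i32 {
    thread::spawn(current_device)
        .join()
        .expect("worker thread panicked")
}
```

In this model, calling `set_device(1)` on one thread leaves a newly spawned thread on device 0, which is why multi-threaded code usually selects its device explicitly at the start of each worker.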
baracuda-runtime mirrors the Driver-side types where it makes sense
([Device], [Stream], [Event], [DeviceBuffer]) and uses the
CUDA 12.0+ library API ([Library], [Kernel]) to load PTX at
runtime. This is the Runtime-side counterpart of the Driver API's
Module::load_ptx + Module::get_function.
Driver ↔ Runtime interop
CUstream and cudaStream_t are the same C type (both are pointers to
struct CUstream_st), and the same holds for CUevent and cudaEvent_t.
With the driver-interop feature enabled, Stream::as_raw_driver() and
Event::as_raw_driver() return views usable by baracuda-driver
APIs. See [interop].
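Why no conversion is needed can be sketched in plain Rust. The two aliases below mirror the typedefs in the CUDA headers; everything else (`RuntimeStream`, `legacy_default`, `driver_side_consumer`) is hypothetical and only illustrates the idea. The actual return type of Stream::as_raw_driver() in this crate may differ.

```rust
// Opaque handle target, named as in the CUDA headers (`struct CUstream_st`).
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct CUstream_st {
    _private: [u8; 0],
}

// Both APIs define their stream handle as a pointer to the same struct.
#[allow(non_camel_case_types)]
pub type CUstream = *mut CUstream_st; // Driver API name
#[allow(non_camel_case_types)]
pub type cudaStream_t = *mut CUstream_st; // Runtime API name

/// Hypothetical runtime-side wrapper holding a raw runtime handle.
pub struct RuntimeStream {
    raw: cudaStream_t,
}

impl RuntimeStream {
    /// The legacy default stream is the null handle in both APIs.
    pub fn legacy_default() -> Self {
        Self { raw: std::ptr::null_mut() }
    }

    /// No cast is needed: the two aliases name the same pointer type.
    pub fn as_raw_driver(&self) -> CUstream {
        self.raw
    }
}

/// Hypothetical driver-side function that accepts a `CUstream`.
pub fn driver_side_consumer(stream: CUstream) -> bool {
    stream.is_null()
}
```

Passing `RuntimeStream::legacy_default().as_raw_driver()` to `driver_side_consumer` type-checks without any `as` cast, which is the property the interop feature relies on.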