Crate cudarc

Safe abstractions over:

  1. CUDA driver API
  2. NVRTC API
  3. cuRAND API
  4. cuBLAS API

§crate organization

Each of the modules for the above is organized into three levels:

  1. A safe module which provides safe abstractions over the result module
  2. A result module, which is a thin wrapper around the sys module that ensures every function returns a Result
  3. A sys module which contains the raw FFI bindings
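
The three levels can be sketched with a std-only toy. This is not cudarc code: `RawBuffer`, `buffer_create`, `buffer_len`, `buffer_destroy`, `Error`, and `Buffer` are hypothetical names, and plain Rust functions stand in for the FFI bindings. It only illustrates how status codes become `Result` values and how a safe owner type sits on top.

```rust
// Hypothetical stand-in for a C library. None of these names come from cudarc.
mod sys {
    /// Opaque handle type, as a C API would expose it.
    pub struct RawBuffer {
        data: Vec<f32>,
    }

    pub const SUCCESS: i32 = 0;
    pub const ERROR_INVALID_VALUE: i32 = 1;

    /// Mimics an `extern "C"` constructor: status code out, handle via out-pointer.
    pub unsafe fn buffer_create(len: usize, out: *mut *mut RawBuffer) -> i32 {
        if len == 0 || out.is_null() {
            return ERROR_INVALID_VALUE;
        }
        let raw = Box::into_raw(Box::new(RawBuffer { data: vec![0.0; len] }));
        unsafe { *out = raw };
        SUCCESS
    }

    pub unsafe fn buffer_len(buf: *const RawBuffer, out: *mut usize) -> i32 {
        if buf.is_null() || out.is_null() {
            return ERROR_INVALID_VALUE;
        }
        let buf_ref: &RawBuffer = unsafe { &*buf };
        unsafe { *out = buf_ref.data.len() };
        SUCCESS
    }

    pub unsafe fn buffer_destroy(buf: *mut RawBuffer) -> i32 {
        if buf.is_null() {
            return ERROR_INVALID_VALUE;
        }
        drop(unsafe { Box::from_raw(buf) });
        SUCCESS
    }
}

// Thin wrapper: same operations, but every function returns a Result.
mod result {
    use super::sys;

    #[derive(Debug, PartialEq, Eq)]
    pub struct Error(pub i32);

    fn check(status: i32) -> Result<(), Error> {
        if status == sys::SUCCESS { Ok(()) } else { Err(Error(status)) }
    }

    pub fn create(len: usize) -> Result<*mut sys::RawBuffer, Error> {
        let mut handle = std::ptr::null_mut();
        check(unsafe { sys::buffer_create(len, &mut handle) })?;
        Ok(handle)
    }

    /// Safety: `buf` must come from `create` and must not have been destroyed.
    pub unsafe fn len(buf: *const sys::RawBuffer) -> Result<usize, Error> {
        let mut len = 0usize;
        check(unsafe { sys::buffer_len(buf, &mut len) })?;
        Ok(len)
    }

    /// Safety: as for `len`; `buf` must not be used after this call.
    pub unsafe fn destroy(buf: *mut sys::RawBuffer) -> Result<(), Error> {
        check(unsafe { sys::buffer_destroy(buf) })
    }
}

// Safe level: owns the handle, keeps it private, and frees it on drop.
mod safe {
    use super::result;

    pub struct Buffer {
        handle: *mut super::sys::RawBuffer,
    }

    impl Buffer {
        pub fn new(len: usize) -> Result<Self, result::Error> {
            Ok(Self { handle: result::create(len)? })
        }

        pub fn len(&self) -> usize {
            let len = unsafe { result::len(self.handle) };
            len.expect("handle owned by self is valid")
        }
    }

    impl Drop for Buffer {
        fn drop(&mut self) {
            // Errors during drop are ignored in this sketch.
            let _ = unsafe { result::destroy(self.handle) };
        }
    }
}

fn main() {
    let buf = safe::Buffer::new(4).expect("non-zero length is accepted");
    assert_eq!(buf.len(), 4);
    assert!(safe::Buffer::new(0).is_err());
}
```

In this toy, the safe type keeps its raw handle private, so callers cannot pass it to the result-level functions directly. That is one way a safe layer can steer users toward the safe API.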

§Core Concepts

At the core is the driver API. It exposes many structs; the main ones are:

  1. driver::CudaContext is a handle to a specific device ordinal (e.g. 0, 1, 2, …)
  2. driver::CudaStream is how you submit work to a device
  3. driver::CudaSlice<T> represents a Vec<T> on the device and is allocated through a driver::CudaStream.

Here is a table of similar concepts between CPU and Cuda:

| Concept                | CPU                     | Cuda                         |
|------------------------|-------------------------|------------------------------|
| Memory allocator       | std::alloc::GlobalAlloc | driver::CudaContext          |
| List of values on heap | Vec<T>                  | driver::CudaSlice<T>         |
| Slice                  | &[T]                    | driver::CudaView<T>          |
| Mutable Slice          | &mut [T]                | driver::CudaViewMut<T>       |
| Function               | Fn                      | driver::CudaFunction         |
| Calling a function     | my_function(a, b, c)    | driver::LaunchArgs::launch() |
| Thread                 | std::thread::Thread     | driver::CudaStream           |
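
For readers less familiar with the CPU column, here is a minimal std-only sketch of those concepts. Each comment names the Cuda counterpart from the table. The function names (`double_in_place`, `sum`, `cpu_side_demo`) are illustrative only, and nothing here calls into cudarc.

```rust
use std::thread;

/// Stand-in for a kernel: doubles every element in place.
fn double_in_place(values: &mut [f32]) {
    for v in values.iter_mut() {
        *v *= 2.0;
    }
}

/// Sums a read-only slice.
fn sum(values: &[f32]) -> f32 {
    values.iter().sum()
}

fn cpu_side_demo() -> (Vec<f32>, f32, f32) {
    // Vec<T>: owned list of values on the heap (table: driver::CudaSlice<T>).
    let mut data: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0];

    // &mut [T]: mutable borrow of a sub-range (table: driver::CudaViewMut<T>).
    let tail: &mut [f32] = &mut data[2..];

    // Fn: a callable value (table: driver::CudaFunction).
    let op: &dyn Fn(&mut [f32]) = &double_in_place;

    // Calling a function (table: driver::LaunchArgs::launch()).
    op(tail);

    // &[T]: read-only borrow of a sub-range (table: driver::CudaView<T>).
    let head: &[f32] = &data[..2];
    let head_sum = sum(head);

    // Thread: where the work runs (table: driver::CudaStream).
    let worker = thread::spawn(move || {
        let total = sum(&data);
        (data, total)
    });
    let (data, total) = worker.join().expect("worker thread panicked");

    (data, head_sum, total)
}

fn main() {
    let (data, head_sum, total) = cpu_side_demo();
    println!("data = {data:?}, head_sum = {head_sum}, total = {total}");
}
```

The analogy is loose: on the CPU the borrow checker ties a slice to its Vec, and the table pairs those roles with the view types on the device side.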

§Combining the different APIs

All of the highest-level APIs are designed to work together.

§nvrtc

nvrtc::compile_ptx() outputs a nvrtc::Ptx, which can be loaded into a device with driver::CudaContext::load_module().

§cublas

cublas::CudaBlas can perform gemm and gemv operations through the cublas::Gemm<T> and cublas::Gemv<T> traits, respectively. Both traits generically accept memory allocated by the driver in any of these forms: driver::CudaSlice<T>, driver::CudaView<T>, and driver::CudaViewMut<T>.
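
The idea of one operation accepting owned buffers and views alike can be sketched as a CPU analogue in std-only Rust. This is not the signature of cublas::Gemv<T>. The `gemv` function below is a hypothetical illustration that uses `AsRef<[f32]>` and `AsMut<[f32]>` where cudarc uses its own device types.

```rust
/// Computes y = A * x for a row-major `rows x cols` matrix `a`.
/// Generic over anything that can be viewed as a slice, so owned
/// vectors, shared slices, and mutable sub-slices are all accepted.
fn gemv<A, X, Y>(rows: usize, cols: usize, a: A, x: X, mut y: Y)
where
    A: AsRef<[f32]>,
    X: AsRef<[f32]>,
    Y: AsMut<[f32]>,
{
    let a = a.as_ref();
    let x = x.as_ref();
    let y = y.as_mut();
    assert_eq!(a.len(), rows * cols, "matrix has the wrong size");
    assert_eq!(x.len(), cols, "input vector has the wrong size");
    assert_eq!(y.len(), rows, "output vector has the wrong size");

    for (r, out) in y.iter_mut().enumerate() {
        let row = &a[r * cols..(r + 1) * cols];
        *out = row.iter().zip(x).map(|(p, q)| p * q).sum();
    }
}

fn gemv_demo() -> Vec<f32> {
    // 2 x 3 matrix stored row-major in an owned Vec.
    let a: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let x = [1.0f32, 0.0, -1.0];
    let mut y = vec![0.0f32; 4];

    // Borrowed Vec for `a`, shared slice for `x`, and a mutable
    // sub-slice of a larger buffer for the output.
    gemv(2, 3, &a, &x[..], &mut y[1..3]);
    y
}

fn main() {
    println!("{:?}", gemv_demo());
}
```

Writing into `&mut y[1..3]` leaves the rest of `y` untouched, which is the kind of sub-range access a mutable view is meant to provide.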

§curand

curand::CudaRng can fill a driver::CudaSlice<T> with random data, based on one of its available distributions.

§Combining safe/result/sys

The result and sys levels are largely interchangeable for each API. However, the safe APIs don’t necessarily allow you to mix in the result level. This is to encourage going through the safe API when possible.

If you need functionality that isn’t present in the safe API, please open a ticket.

§Modules

cublas
Wrappers around the cublas API, in three levels. See crate documentation for description of each.
cublaslt
cudnn
curand
Wrappers around the cuRAND API in three levels. See crate documentation for description of each.
driver
Wrappers around the CUDA driver API, in three levels. See crate documentation for description of each.
nvrtc
Wrappers around the Nvidia Runtime Compilation (nvrtc) API, in three levels. See crate documentation for description of each.
runtime
Wrappers around the CUDA Runtime API, in two levels: an unsafe low-level API and a (still unsafe) thin wrapper around it.
types
Exposes CudaTypeName, which maps between Rust type names and the corresponding CUDA kernel type names.