§CUDA Standard Library
The CUDA Standard Library provides a curated set of abstractions for writing performant, reliable, and understandable GPU kernels using the Rustc NVVM backend.
This library will build on non-nvptx targets and on targets that do not use the NVVM backend. However, it will not be usable there: attempting to use most of the functions in the library results in linker errors.
The kernel attribute macro automatically cfg-gates the annotated function for nvptx64 or nvptx, so no “actual” functions from this crate should be used when compiling for a non-nvptx target.
This crate cannot be used with the LLVM PTX backend either, because it relies heavily on external functions implicitly defined by the NVVM backend, as well as on internal attributes.
§Structure
This library tries to follow the structure of the Rust standard library to some degree, where different concepts are separated into their own modules.
§The Prelude
In order to simplify imports, we provide a prelude module which contains GPU analogues to standard library structures as well as common imports such as thread.
Re-exports§
Modules§
- atomic
- Atomic Types for modification of numbers in multiple threads in a sound way.
- cfg
- Utilities for configuring code based on the specified compute capability.
- float
- Trait for float intrinsics for making floats work in no_std gpu environments.
- intrinsics
- Raw libdevice math intrinsics.
- io
- Utilities for printing to stdout from GPU threads.
- mem
- Support for allocating memory and using alloc through CUDA memory allocation system calls.
- misc
- Misc functions that do not exactly fit into other categories.
- prelude
- ptr
- CUDA-specific pointer handling logic.
- shared
- Static and Dynamic shared memory handling.
- thread
- Functions for dealing with the parallel thread execution model employed by CUDA.
- warp
- Functions that work over warps of threads.
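As a rough illustration of the execution model that the thread and warp modules deal with, the sketch below computes a thread's global index and its warp lane on the host, in plain Rust. The function names `global_index_1d` and `lane_in_warp` are hypothetical and are not taken from this crate. The sketch assumes a one-dimensional launch and a warp size of 32 threads.

```rust
/// Hypothetical helper (not part of this crate): global 1D index of a thread,
/// computed as block index times block size, plus the index inside the block.
fn global_index_1d(block_idx: u32, block_dim: u32, thread_idx: u32) -> u32 {
    block_idx * block_dim + thread_idx
}

/// Hypothetical helper (not part of this crate): lane of a thread within its
/// warp, assuming a 1D block and 32 threads per warp.
fn lane_in_warp(thread_idx: u32) -> u32 {
    thread_idx % 32
}
```

For example, thread 5 of block 2 in a launch with 256 threads per block has global index 2 * 256 + 5 = 517 and sits in lane 5 of its warp.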
Macros§
- assert_eq
- Asserts that two expressions are equal and returns an AssertionFailed error to the application that launched the kernel if they are not.
- assert_ne
- Asserts that two expressions are not equal and returns an AssertionFailed error to the application that launched the kernel if they are equal.
- print
- Alternative to print! which works on CUDA. See print for more info.
- println
- Alternative to println! which works on CUDA. See print for more info.
- shared_array
- Statically allocates a buffer large enough for len elements of array_type, yielding a *mut array_type that points to uninitialized shared memory. len must be a constant expression.
Structs§
- bf16
- A 16-bit floating point type implementing the bfloat16 format.
- f16
- A 16-bit floating point type implementing the IEEE 754-2008 standard binary16 a.k.a. half format.
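To make the bfloat16 format concrete: it keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an f32, so its bit pattern is the upper half of the f32 bit pattern. The sketch below shows that relationship in plain Rust. The helper names are hypothetical, and the conversion truncates for simplicity, which is an assumption of this sketch and not a description of how the bf16 type in this crate converts values.

```rust
/// Hypothetical helper: take the upper 16 bits of an f32 as bfloat16 bits.
/// This truncates the low mantissa bits instead of rounding them.
fn f32_to_bf16_bits_truncate(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Hypothetical helper: widen bfloat16 bits back to an f32 by placing them
/// in the upper half of the word and zero-filling the low 16 bits.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}
```

Values whose f32 mantissa fits in 7 bits, such as 1.0 or -2.5, survive the round trip unchanged. Other values lose their low mantissa bits.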
Traits§
- FloatExt
- Extension trait for f32 and f64 which provides high-level functions for low-level intrinsics for common math operations. You should generally use these functions over “manual” implementations because they are often much faster.
Attribute Macros§
- address_space
- Notifies the codegen to put a static/static mut inside of a specific memory address space. This is mostly for internal use and/or advanced users, as the codegen and cuda_std handle address space placement implicitly. Improper use of this macro could yield weird or undefined behavior.
- externally_visible
- Notifies the codegen that this function is externally visible and should not be removed if it is not used by a kernel. Usually used for linking with other PTX/cubin files.
- gpu_only
- Creates a CPU version of the function that panics, and cfg-gates the original function so it is compiled only for nvptx/nvptx64.
- kernel
- Registers a function as a gpu kernel.
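The cfg-gating described for gpu_only can be sketched by hand in plain Rust. This is an illustration of the general pattern, not the macro's actual expansion. The function name `lane_hint` and both bodies are hypothetical, and only the nvptx64 architecture is gated here to keep the sketch short.

```rust
// Hypothetical hand-written equivalent of a gpu_only-style function.
// The exact cfg predicate the real macro emits is an assumption.

/// GPU build: compiled only when targeting nvptx64.
#[cfg(target_arch = "nvptx64")]
pub fn lane_hint() -> u32 {
    // A real implementation would call a device intrinsic here.
    0
}

/// CPU build: same signature, but panics if it is ever called.
#[cfg(not(target_arch = "nvptx64"))]
pub fn lane_hint() -> u32 {
    panic!("`lane_hint` can only be called on a GPU target");
}
```

With this pattern, host code that mentions the function still compiles, and any accidental call on the CPU fails with a panic at the call instead of a linker error.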