Crate cuda_std
CUDA Standard Library
The CUDA Standard Library provides a curated set of abstractions for writing performant, reliable, and understandable GPU kernels using the Rustc NVVM backend.
This library will build on non-nvptx targets and on targets that do not use the nvvm backend. However, it will not be usable there: most of the functions in the library produce linker errors if you attempt to use them. The kernel attribute automatically cfg-gates the annotated function for nvptx64 or nvptx, so no “actual” functions from this crate should be used when compiling for a non-nvptx target.
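The cfg-gating described above can be pictured with plain `cfg` attributes. The following is only a minimal sketch of the general pattern, not the code the crate's macros actually generate: the function name `scale` is hypothetical, and only the `nvptx64` case is shown for brevity.

```rust
// Hypothetical example function; not part of cuda_std.

// GPU build: this definition exists only when compiling for nvptx64.
#[cfg(target_arch = "nvptx64")]
pub fn scale(x: f32) -> f32 {
    x * 2.0
}

// Any other target: same signature, but calling it panics.
// This keeps host code compiling without pulling in GPU-only symbols.
#[cfg(not(target_arch = "nvptx64"))]
pub fn scale(_x: f32) -> f32 {
    panic!("`scale` is only available when compiling for nvptx64");
}
```

On a host target, only the second definition is compiled, so a call to `scale` builds successfully and panics at run time instead of failing at link time.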
This crate cannot be used with the llvm ptx backend either, because it relies heavily on external functions implicitly defined by the nvvm backend, as well as on internal attributes.
Structure
This library tries to follow the structure of the Rust standard library to some degree, where different concepts are separated into their own modules.
The Prelude
To simplify imports, we provide a prelude module which contains GPU analogues to standard library structures as well as common imports such as thread.
Re-exports
Modules
Trait for float intrinsics that make floats work in no_std GPU environments.
Raw libdevice math intrinsics.
Utilities for printing to stdout from GPU threads.
Support for allocating memory and using alloc through CUDA memory allocation system calls.
Misc functions that do not exactly fit into other categories.
CUDA-specific pointer handling logic.
Static and dynamic shared memory handling.
Functions for dealing with the parallel thread execution model employed by CUDA.
Functions that work over warps of threads.
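As a rough illustration of the execution model that the thread and warp modules deal with, the sketch below shows the conventional CUDA index arithmetic in plain Rust. The function names are hypothetical and are not taken from this crate; the sketch assumes a one-dimensional launch and a warp width of 32 threads.

```rust
/// Warp width assumed by this sketch (32 threads on current NVIDIA hardware).
pub const WARP_SIZE: u32 = 32;

/// Global index of a thread in a 1D launch:
/// the block's offset into the grid plus the thread's offset into the block.
pub fn global_index_1d(block_idx: u32, block_dim: u32, thread_idx: u32) -> u32 {
    block_idx * block_dim + thread_idx
}

/// Position of a thread within its warp, for a 1D block.
pub fn lane_id(thread_idx: u32) -> u32 {
    thread_idx % WARP_SIZE
}
```

For example, thread 5 of block 2 in a launch with 256 threads per block has global index 2 * 256 + 5 = 517, and thread 37 of a block sits in lane 37 % 32 = 5 of its warp.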
Macros
Asserts that two expressions are equal and returns an AssertionFailed error to the application that launched the kernel if they are not.
Asserts that two expressions are not equal and returns an AssertionFailed error to the application that launched the kernel if they are.
Statically allocates a buffer large enough for len elements of array_type, yielding a *mut array_type that points to uninitialized shared memory. len must be a constant expression.
Structs
Traits
Attribute Macros
Notifies the codegen to put a static/static mut inside of a specific memory address space. This is mostly for internal use and/or advanced users, as the codegen and cuda_std handle address space placement implicitly. Improper use of this macro could cause unexpected or undefined behavior.
Notifies the codegen that this function is externally visible and should not be removed if it is not used by a kernel. Usually used for linking with other PTX/cubin files.
Creates a CPU version of the function which panics, and cfg-gates the original function so it is compiled only for nvptx/nvptx64.
Registers a function as a GPU kernel.