CUDA Context handling.

New vs Legacy contexts

The CUDA Driver API has two main ways of creating contexts: the “legacy” context handling (legacy in the sense that it was previously the de facto approach in cust) and the “new” primary context handling. With legacy handling, a thread can possess multiple contexts in a stack, and users explicitly create entirely new contexts and set them as the current context at the top of the stack.
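The stack behavior described above can be pictured with a small standard-library model. This is only a conceptual sketch: `LegacyContext`, `ContextStack`, and `legacy_stack_demo` are hypothetical names invented for illustration, they are not part of cust or the CUDA Driver API, and no GPU is involved.

```rust
// Conceptual model only: hypothetical types, not cust's API.

/// Stand-in for an explicitly created legacy context.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LegacyContext {
    pub id: u32,
}

/// Stand-in for the per-thread stack of contexts.
#[derive(Debug, Default)]
pub struct ContextStack {
    stack: Vec<LegacyContext>,
}

impl ContextStack {
    pub fn new() -> Self {
        Self::default()
    }

    /// Creating a context also makes it current by pushing it on top.
    pub fn create(&mut self, id: u32) -> LegacyContext {
        let ctx = LegacyContext { id };
        self.stack.push(ctx.clone());
        ctx
    }

    /// Remove the current context, exposing the one beneath it (if any).
    pub fn pop(&mut self) -> Option<LegacyContext> {
        self.stack.pop()
    }

    /// The current context is whatever sits at the top of the stack.
    pub fn current(&self) -> Option<&LegacyContext> {
        self.stack.last()
    }
}

/// Records which context id is current after each create/pop step.
pub fn legacy_stack_demo() -> Vec<Option<u32>> {
    let mut stack = ContextStack::new();
    let mut seen = Vec::new();

    stack.create(1);
    seen.push(stack.current().map(|c| c.id));
    stack.create(2);
    seen.push(stack.current().map(|c| c.id));
    stack.pop();
    seen.push(stack.current().map(|c| c.id));
    stack.pop();
    seen.push(stack.current().map(|c| c.id));

    seen
}
```

In this model, code that asks for "the current context" gets a different answer after every create or pop, which is the kind of shifting state that makes the legacy approach awkward to combine with other libraries.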

This gives fine-grained control, but it causes many issues when interoperating with Runtime API based libraries such as cuBLAS or cuFFT. The Runtime API implicitly uses whatever context the Driver API has set as current, so explicitly creating and destroying contexts interferes with it. This sometimes causes segfaults and odd behavior that is hard to manage when using other CUDA libraries.

The “new” primary context handling uses the same handling as the Runtime API. Instead of context stacks, only a single context exists for every device, and this context is reference-counted. Users can retain a handle to the primary context, increasing the reference count, and release the context once they are done using it. Because this is the same handling that the Runtime API uses, it is directly compatible with libraries such as cuBLAS.
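The retain/release behavior can be sketched with `Arc` and `Weak` from the standard library. This is a conceptual model under stated assumptions, not cust's implementation: `PrimaryContext`, `Driver`, `retain`, `is_alive`, and `primary_demo` are hypothetical names, and dropping an `Arc` stands in for releasing the context.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

// Conceptual model only: hypothetical types, not cust's API.

/// Stand-in for a device's primary context.
#[derive(Debug)]
pub struct PrimaryContext {
    pub device: u32,
}

/// Stand-in registry holding at most one live primary context per device.
#[derive(Default)]
pub struct Driver {
    primary: Mutex<HashMap<u32, Weak<PrimaryContext>>>,
}

impl Driver {
    pub fn new() -> Self {
        Self::default()
    }

    /// Retain the device's primary context, creating it only if none is live.
    /// Each returned `Arc` counts as one reference.
    pub fn retain(&self, device: u32) -> Arc<PrimaryContext> {
        let mut map = self.primary.lock().unwrap();
        if let Some(ctx) = map.get(&device).and_then(Weak::upgrade) {
            return ctx;
        }
        let ctx = Arc::new(PrimaryContext { device });
        map.insert(device, Arc::downgrade(&ctx));
        ctx
    }

    /// True while at least one handle to the device's context is retained.
    pub fn is_alive(&self, device: u32) -> bool {
        self.primary
            .lock()
            .unwrap()
            .get(&device)
            .map_or(false, |weak| weak.strong_count() > 0)
    }
}

/// Returns (both handles share one context, reference count while both
/// are held, context still alive after both are released).
pub fn primary_demo() -> (bool, usize, bool) {
    let driver = Driver::new();
    let a = driver.retain(0);
    let b = driver.retain(0);

    let same = Arc::ptr_eq(&a, &b);
    let count = Arc::strong_count(&a);

    // Dropping a handle plays the role of releasing the context.
    drop(a);
    drop(b);

    (same, count, driver.is_alive(0))
}
```

In this model the second `retain` does not create anything new; it only bumps the count on the context that already exists for the device, and the context goes away once the last handle is released.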

Primary contexts also simplify the context API greatly: making a new context on the same device simply reuses the same underlying context. This means there is no need for unowned contexts when using multithreading. Users can make a new context for every thread with no concern that the context will be prematurely destroyed.
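The multithreading point can be illustrated with the same kind of standard-library model. Again this is a sketch with hypothetical names (`PrimaryContext`, `Device`, `retain`, `threads_share_one_context`), not cust's API: each thread asks for its own handle, every handle refers to one shared context, and a thread finishing cannot destroy the context while another handle is still held.

```rust
use std::sync::{Arc, Mutex, Weak};
use std::thread;

// Conceptual model only: hypothetical types, not cust's API.

/// Stand-in for a device's primary context.
pub struct PrimaryContext {
    pub device: u32,
}

/// Stand-in for one device with a slot for its single primary context.
pub struct Device {
    id: u32,
    slot: Mutex<Weak<PrimaryContext>>,
}

impl Device {
    pub fn new(id: u32) -> Self {
        Device { id, slot: Mutex::new(Weak::new()) }
    }

    /// Hand out a handle to the one primary context, creating it if needed.
    pub fn retain(&self) -> Arc<PrimaryContext> {
        let mut slot = self.slot.lock().unwrap();
        if let Some(ctx) = slot.upgrade() {
            return ctx;
        }
        let ctx = Arc::new(PrimaryContext { device: self.id });
        *slot = Arc::downgrade(&ctx);
        ctx
    }
}

/// Spawns `n` threads that each retain their own handle and reports
/// whether every thread ended up with the same underlying context.
pub fn threads_share_one_context(n: usize) -> bool {
    let device = Device::new(0);
    let main_ctx = device.retain();

    let all_same = thread::scope(|s| {
        let handles: Vec<_> = (0..n)
            .map(|_| {
                s.spawn(|| {
                    // Each thread "makes a new context" for itself.
                    let ctx = device.retain();
                    Arc::ptr_eq(&ctx, &main_ctx)
                    // `ctx` is released here when the thread finishes.
                })
            })
            .collect();
        handles.into_iter().all(|h| h.join().unwrap())
    });

    // The main thread's handle is still valid after all threads released theirs.
    all_same && main_ctx.device == 0
}
```

No thread needs a special unowned or borrowed handle in this model, because ownership is shared through the reference count rather than tied to whichever thread created the context.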

Overall, we recommend that everyone use the new primary context handling and avoid the legacy handling. Doing so makes your use of cust more compatible with libraries like cuBLAS or cuFFT and avoids potentially confusing context-based bugs.

Primary contexts are the default in cust. The old legacy context handling remains available through the legacy module.

Modules

Legacy (old) context management, which preceded primary contexts.

Structs

Bit flags for initializing the CUDA context.

Type representing the context being currently used.

Enums

This enumeration represents configuration settings for devices which share hardware resources between L1 cache and shared memory.

This enumeration represents the limited resources which can be accessed through CurrentContext::get_resource_limit and CurrentContext::set_resource_limit.

This enumeration represents the options for configuring the shared memory bank size.

Traits