Module rustacuda::context

CUDA context management

Most CUDA functions require a context. A CUDA context is analogous to a CPU process - it's an isolated container for all runtime state, including configuration settings and all device, unified, and page-locked memory allocations. Each context has a separate memory space, and pointers from one context do not work in another. Each context is associated with a single device. Although it is possible to have multiple contexts associated with a single device, this is strongly discouraged as it can cause a significant loss of performance.

CUDA keeps a thread-local stack of contexts which the programmer can push to or pop from. The top context in that stack is known as the "current" context and it is used in most CUDA API calls. One context can be safely made current in multiple CPU threads.
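The stack semantics can be sketched in plain Rust, with no CUDA required (here `u32` handles stand in for real contexts; this is an illustration of the model, not RustaCUDA's implementation):

```rust
use std::cell::RefCell;

thread_local! {
    // Each OS thread gets its own stack of context handles.
    static CONTEXT_STACK: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

fn push_context(handle: u32) {
    CONTEXT_STACK.with(|s| s.borrow_mut().push(handle));
}

fn pop_context() -> Option<u32> {
    CONTEXT_STACK.with(|s| s.borrow_mut().pop())
}

// The "current" context is simply the top of this thread's stack.
fn current_context() -> Option<u32> {
    CONTEXT_STACK.with(|s| s.borrow().last().copied())
}

fn main() {
    push_context(1);
    push_context(2);
    // The most recently pushed context is current.
    assert_eq!(current_context(), Some(2));
    pop_context();
    // Popping exposes the previous context again.
    assert_eq!(current_context(), Some(1));
}
```

Because the stack is thread-local, pushing a context on one thread does not make it current on any other thread; each thread that wants to use a context must make it current itself.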

Safety

The CUDA context management API does not fit easily into Rust's safety guarantees.

The thread-local stack (as well as the fact that any context can be on the stack for any number of threads) means there is no clear owner for a CUDA context, but it still has to be cleaned up. Also, the fact that a context can be current to multiple threads at once means that there can be multiple implicit references to a context which are not controlled by Rust.

RustaCUDA handles ownership by providing an owning Context struct and a non-owning UnownedContext. When the Context is dropped, the backing context is destroyed, even if it is still current on other threads. In that case, subsequent attempts to use the context on those threads fail with an error. This is mostly safe, if a bit inconvenient: it is only mostly safe because another thread could be actively using the context while the destructor runs on this thread, which could result in undefined behavior.

In short, Rust's thread-safety guarantees cannot fully protect use of the context management functions. The programmer must ensure that no other OS threads are using the Context when it is dropped.
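The owned/unowned split described above can be modeled in plain Rust, with an AtomicBool standing in for the driver's internal context record (a sketch of the semantics under that assumption, not RustaCUDA's actual implementation):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Stand-in for the driver's context record: valid until destroyed.
struct RawContext {
    alive: AtomicBool,
}

// Owning handle: destroys the backing context on drop (like Context).
struct OwnedCtx {
    raw: Arc<RawContext>,
}

// Non-owning handle: may outlive the owner, but use then fails (like UnownedContext).
#[derive(Clone)]
struct UnownedCtx {
    raw: Arc<RawContext>,
}

impl OwnedCtx {
    fn new() -> Self {
        OwnedCtx {
            raw: Arc::new(RawContext { alive: AtomicBool::new(true) }),
        }
    }

    fn get_unowned(&self) -> UnownedCtx {
        UnownedCtx { raw: Arc::clone(&self.raw) }
    }
}

impl Drop for OwnedCtx {
    fn drop(&mut self) {
        // The backing context is destroyed even if unowned handles remain.
        self.raw.alive.store(false, Ordering::SeqCst);
    }
}

impl UnownedCtx {
    fn use_context(&self) -> Result<(), &'static str> {
        if self.raw.alive.load(Ordering::SeqCst) {
            Ok(())
        } else {
            Err("context has been destroyed")
        }
    }
}

fn main() {
    let owned = OwnedCtx::new();
    let unowned = owned.get_unowned();
    assert!(unowned.use_context().is_ok());
    drop(owned); // destroys the backing context
    assert!(unowned.use_context().is_err()); // later use fails with an error
}
```

Note what the model cannot capture: if `use_context` were running on another thread at the exact moment the destructor fired, the real driver call would race with destruction. That race is precisely the gap the programmer must close by joining worker threads before dropping the Context.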

Examples

For most common uses (one device, one OS thread) it should suffice to create a single context:

use rustacuda::device::Device;
use rustacuda::context::{Context, ContextFlags};

rustacuda::init(rustacuda::CudaFlags::empty())?;
let device = Device::get_device(0)?;
let context = Context::create_and_push(ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;
// call RustaCUDA functions which use the context

// The context is destroyed when it is dropped or falls out of scope.
drop(context);

If you have multiple OS threads that each submit work to the same device, you can get an unowned handle to the single context and pass it to each thread.

use rustacuda::context::CurrentContext;

// Initialize and get the device as before, then:
let context =
    Context::create_and_push(ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;
let mut join_handles = vec![];

for _ in 0..4 {
    let unowned = context.get_unowned();
    let join_handle = std::thread::spawn(move || {
        CurrentContext::set_current(&unowned).unwrap();
        // Call RustaCUDA functions which use the context
    });
    join_handles.push(join_handle);
}
// We must ensure that the other threads are not using the context when it's destroyed.
for handle in join_handles {
    handle.join().unwrap();
}
// Now it's safe to drop the context.
drop(context);

If you have multiple devices, each device needs its own context.

use rustacuda::context::{ContextStack, CurrentContext};

// Create a context for each device, popping each one off the stack
let mut contexts = vec![];
for device in Device::devices()? {
    let device = device?;
    let ctx =
        Context::create_and_push(ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;
    ContextStack::pop()?;
    contexts.push(ctx);
}
CurrentContext::set_current(&contexts[0])?;

// Call RustaCUDA functions which will use the context

Structs

Context

Owned handle to a CUDA context.

ContextFlags

Bit flags for initializing the CUDA context.

ContextStack

Type used to represent the thread-local context stack.

CurrentContext

Type representing the top context in the thread-local stack.

StreamPriorityRange

Struct representing a range of stream priorities.

UnownedContext

Non-owning handle to a CUDA context.

Enums

CacheConfig

This enumeration represents configuration settings for devices which share hardware resources between L1 cache and shared memory.

ResourceLimit

This enumeration represents the limited resources which can be accessed through CurrentContext::get_resource_limit and CurrentContext::set_resource_limit.

SharedMemoryConfig

This enumeration represents the options for configuring the shared memory bank size.

Traits

ContextHandle

Sealed trait for Context and UnownedContext. Not intended for use outside of RustaCUDA.