#[repr(C)]
pub struct DeviceBuffer<T: DeviceCopy> { /* private fields */ }

Fixed-size device-side buffer. Provides basic access to device memory.

Implementations

Allocate a new device buffer large enough to hold size T’s, but without initializing the contents.

Errors

If the allocation fails, returns the error from CUDA. If size is large enough that size * mem::size_of::<T>() overflows usize, returns InvalidMemoryAllocation.
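
The overflow condition above can be sketched on the host with a checked multiplication. This is a minimal illustration only: allocation_bytes is a hypothetical helper, not part of cust, and the crate's actual check may be written differently.

```rust
use std::mem;

/// Hypothetical helper (not part of cust): byte size of `len` elements of `T`,
/// or `None` if `len * size_of::<T>()` overflows `usize`.
fn allocation_bytes<T>(len: usize) -> Option<usize> {
    len.checked_mul(mem::size_of::<T>())
}
```

A None result here corresponds to the case in which the documented behavior is to return InvalidMemoryAllocation. A zero-sized T always yields Some(0).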

Safety

The caller must ensure that the contents of the buffer are initialized before reading from the buffer.

Examples
use cust::memory::*;
let mut buffer = unsafe { DeviceBuffer::uninitialized(5).unwrap() };
buffer.copy_from(&[0u64, 1, 2, 3, 4]).unwrap();

Allocates device memory asynchronously on a stream, without initializing it.

This doesn’t actually allocate if T is zero-sized.

Safety

The allocated memory retains all of the unsafety of DeviceBuffer::uninitialized, with the additional consideration that the memory cannot be used until it is actually allocated on the stream. This means proper stream ordering semantics must be followed, such as only enqueuing kernel launches that use the memory AFTER the allocation call.

You can synchronize the stream to ensure the memory allocation operation is complete.

Enqueues an operation to free the memory backed by this DeviceBuffer on a particular stream. The stream will free the allocation as soon as it reaches the operation in the stream. You can ensure the memory is freed by synchronizing the stream.

This function uses internal memory pool semantics. Async allocations will reserve memory in the default memory pool in the stream, and async frees will release the memory back to the pool for further use by async allocations.

The memory inside of the pool is all freed back to the OS once the stream is synchronized unless a custom pool is configured to not do so.

Examples
use cust::{memory::*, stream::*};
let stream = Stream::new(StreamFlags::DEFAULT, None)?;
let mut host_vals = [1, 2, 3];
unsafe {
    let mut allocated = DeviceBuffer::from_slice_async(&[4u8, 5, 6], &stream)?;
    allocated.async_copy_to(&mut host_vals, &stream)?;
    allocated.drop_async(&stream)?;
}
// ensure all async ops are done before trying to access the value
stream.synchronize()?;
assert_eq!(host_vals, [4, 5, 6]);

Creates a DeviceBuffer<T> directly from the raw components of another device buffer.

Safety

This is highly unsafe, due to the number of invariants that aren’t checked:

  • ptr needs to have been previously allocated via DeviceBuffer or cuda_malloc.
  • ptr’s T needs to have the same size and alignment as it was allocated with.
  • capacity needs to be the capacity that the pointer was allocated with.

Violating these may cause problems like corrupting the CUDA driver’s internal data structures.

The ownership of ptr is effectively transferred to the DeviceBuffer<T> which may then deallocate, reallocate or change the contents of memory pointed to by the pointer at will. Ensure that nothing else uses the pointer after calling this function.

Examples
use std::mem;
use cust::memory::*;

let mut buffer = DeviceBuffer::from_slice(&[0u64; 5]).unwrap();
let ptr = buffer.as_device_ptr();
let size = buffer.len();

mem::forget(buffer);

let buffer = unsafe { DeviceBuffer::from_raw_parts(ptr, size) };

Destroy a DeviceBuffer, returning an error if deallocation fails.

Deallocating device memory can return errors from previous asynchronous work. This function destroys the given buffer and returns the error and the un-destroyed buffer on failure.

Example
use cust::memory::*;
let x = DeviceBuffer::from_slice(&[10, 20, 30]).unwrap();
match DeviceBuffer::drop(x) {
    Ok(()) => println!("Successfully destroyed"),
    Err((e, buf)) => {
        println!("Failed to destroy buffer: {:?}", e);
        // Do something with buf
    },
}
This is supported on crate feature bytemuck only.

Allocate device memory and fill it with zeroes (0u8).

This doesn’t actually allocate if T is zero-sized.

Examples
use cust::memory::*;
let mut zero = DeviceBuffer::zeroed(4).unwrap();
let mut values = [1u8, 2, 3, 4];
zero.copy_to(&mut values).unwrap();
assert_eq!(values, [0; 4]);
This is supported on crate feature bytemuck only.

Allocates device memory asynchronously and asynchronously fills it with zeroes (0u8).

This doesn’t actually allocate if T is zero-sized.

Safety

This method enqueues two operations on the stream: An async allocation and an async memset. Because of this, you must ensure that:

  • The memory is not used in any way before it is actually allocated on the stream. You can ensure this happens by synchronizing the stream explicitly or using events.
Examples
use cust::{memory::*, stream::*};
let stream = Stream::new(StreamFlags::DEFAULT, None)?;
let mut values = [1u8, 2, 3, 4];
unsafe {
    let mut zero = DeviceBuffer::zeroed_async(4, &stream)?;
    zero.async_copy_to(&mut values, &stream)?;
    zero.drop_async(&stream)?;
}
stream.synchronize()?;
assert_eq!(values, [0; 4]);
This is supported on crate feature bytemuck only.

Same as DeviceBuffer::try_cast but panics if the cast fails.

Panics

See DeviceBuffer::try_cast.

This is supported on crate feature bytemuck only.

Tries to convert a DeviceBuffer of type A to a DeviceBuffer of type B, returning an error if the conversion fails.

The length of the buffer after the conversion may have changed.

Failure
  • If the target type has a greater alignment requirement.
  • If the target element type is a different size and the output buffer wouldn’t have a whole number of elements, such as 3 x u16 -> 1.5 x u32.
  • If either type is a ZST (but not both).
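
The three failure rules above can be sketched as a host-side length calculation. This is an illustration under stated assumptions: CastFailure and cast_len are hypothetical names, the sketch follows only the rules as documented here, and the real method may apply further checks (for example on the actual pointer alignment) and reports its own error type.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum CastFailure {
    AlignmentIncrease,
    ZstMismatch,
    PartialElement,
}

/// Element count that a buffer of `len` values of `A` would have when viewed
/// as values of `B`, following only the three documented failure rules.
fn cast_len<A, B>(len: usize) -> Result<usize, CastFailure> {
    // Rule 1: the target type has a greater alignment requirement.
    if align_of::<B>() > align_of::<A>() {
        return Err(CastFailure::AlignmentIncrease);
    }
    let (a, b) = (size_of::<A>(), size_of::<B>());
    // Same element size (including both zero-sized): the length is unchanged.
    if a == b {
        return Ok(len);
    }
    // Rule 3: exactly one of the two types is zero-sized.
    if a == 0 || b == 0 {
        return Err(CastFailure::ZstMismatch);
    }
    // Rule 2: the total byte size must divide evenly into `B` elements.
    let bytes = a * len;
    if bytes % b != 0 {
        return Err(CastFailure::PartialElement);
    }
    Ok(bytes / b)
}
```

Under these rules, three u32 elements become six u16 elements, which is why the length may change after the conversion.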

Allocate a new device buffer of the same size as slice, initialized with a clone of the data in slice.

Errors

If the allocation fails, returns the error from CUDA.

Examples
use cust::memory::*;
let values = [0u64; 5];
let mut buffer = DeviceBuffer::from_slice(&values).unwrap();

Asynchronously allocate a new buffer of the same size as slice, initialized with a clone of the data in slice.

Safety

For an explanation of why this function is unsafe, see AsyncCopyDestination.

Errors

If the allocation fails, returns the error from CUDA.

Examples
use cust::memory::*;
use cust::stream::{Stream, StreamFlags};

let stream = Stream::new(StreamFlags::NON_BLOCKING, None).unwrap();
let values = [0u64; 5];
unsafe {
    let mut buffer = DeviceBuffer::from_slice_async(&values, &stream).unwrap();
    // Wait for the allocation and copy to finish, and surface any error
    stream.synchronize().unwrap();
    // Perform some operation on the buffer
}

Explicitly creates a DeviceSlice from this buffer.

Methods from Deref<Target = DeviceSlice<T>>

Returns the number of elements in the slice.

Examples
use cust::memory::*;
let a = DeviceBuffer::from_slice(&[1, 2, 3]).unwrap();
assert_eq!(a.len(), 3);

Returns true if the slice has a length of 0.

Examples
use cust::memory::*;
let a: DeviceBuffer<u64> = unsafe { DeviceBuffer::uninitialized(0).unwrap() };
assert!(a.is_empty());

Return a raw device-pointer to the slice’s buffer.

The caller must ensure that the slice outlives the pointer this function returns, or else it will end up pointing to garbage. The caller must also ensure that the pointer is not dereferenced by the CPU.

Examples

use cust::memory::*;
let a = DeviceBuffer::from_slice(&[1, 2, 3]).unwrap();
println!("{:p}", a.as_ptr());
This is supported on crate feature bytemuck only.

Sets the memory range of this buffer to contiguous 8-bit values of value.

In total, it sets size_of::<T>() * len contiguous bytes to value.

This is supported on crate feature bytemuck only.

Sets the memory range of this buffer to contiguous 8-bit values of value asynchronously.

In total, it sets size_of::<T>() * len contiguous bytes to value.

Safety

This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.

This is supported on crate feature bytemuck only.

Sets the memory range of this buffer to contiguous 16-bit values of value.

In total, it sets (size_of::<T>() / 2) * len contiguous 16-bit values to value.

Panics

Panics if:

  • self.ptr % 2 != 0 (the pointer is not aligned to at least 2 bytes).
  • (size_of::<T>() * self.len) % 2 != 0 (the data size is not a multiple of 2 bytes)
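
The two panic conditions above, together with the number of values written, can be sketched as a host-side calculation. This is a minimal sketch: memset_value_count is a hypothetical helper, not part of cust, and it takes the address and sizes as plain integers instead of inspecting a real buffer. The width parameter is the value size in bytes (2 for 16-bit values, 4 for 32-bit values) and must be non-zero.

```rust
/// Hypothetical helper (not part of cust): number of `width`-byte values a
/// memset would write over `len` elements of `elem_size` bytes starting at
/// address `addr`, or `None` where the documented preconditions are violated.
fn memset_value_count(addr: usize, elem_size: usize, len: usize, width: usize) -> Option<usize> {
    let bytes = elem_size * len;
    // The pointer must be aligned to `width`, and the total byte size must be
    // a multiple of `width`; the documented behavior in either case is a panic.
    if addr % width != 0 || bytes % width != 0 {
        return None;
    }
    Some(bytes / width)
}
```

When elem_size is itself a multiple of width, the result equals (elem_size / width) * len, which matches the count stated in the description.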
This is supported on crate feature bytemuck only.

Sets the memory range of this buffer to contiguous 16-bit values of value asynchronously.

In total, it sets (size_of::<T>() / 2) * len contiguous 16-bit values to value.

Panics

Panics if:

  • self.ptr % 2 != 0 (the pointer is not aligned to at least 2 bytes).
  • (size_of::<T>() * self.len) % 2 != 0 (the data size is not a multiple of 2 bytes)
Safety

This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.

This is supported on crate feature bytemuck only.

Sets the memory range of this buffer to contiguous 32-bit values of value.

In total, it sets (size_of::<T>() / 4) * len contiguous 32-bit values to value.

Panics

Panics if:

  • self.ptr % 4 != 0 (the pointer is not aligned to at least 4 bytes).
  • (size_of::<T>() * self.len) % 4 != 0 (the data size is not a multiple of 4 bytes)
This is supported on crate feature bytemuck only.

Sets the memory range of this buffer to contiguous 32-bit values of value asynchronously.

In total, it sets (size_of::<T>() / 4) * len contiguous 32-bit values to value.

Panics

Panics if:

  • self.ptr % 4 != 0 (the pointer is not aligned to at least 4 bytes).
  • (size_of::<T>() * self.len) % 4 != 0 (the data size is not a multiple of 4 bytes)
Safety

This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.

Sets this slice’s data to zero.

Sets this slice’s data to zero asynchronously.

Safety

This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.

Trait Implementations

Asynchronously copy data from source. source must be the same size as self.

Asynchronously copy data to dest. dest must be the same size as self.

Copy data from source. source must be the same size as self.

Copy data to dest. dest must be the same size as self.

Formats the value using the given formatter.

The resulting type after dereferencing.

Dereferences the value.

Mutably dereferences the value.

Get the raw CUDA device pointer

Get the size of the memory region in bytes

Executes the destructor for this type.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Performs the conversion.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.