Struct cust::memory::DeviceBuffer
#[repr(C)]
pub struct DeviceBuffer<T: DeviceCopy> { /* private fields */ }
Fixed-size device-side buffer. Provides basic access to device memory.
Implementations
impl<T: DeviceCopy> DeviceBuffer<T>
pub unsafe fn uninitialized(size: usize) -> CudaResult<Self>
Allocate a new device buffer large enough to hold size T’s, but without initializing the contents.
Errors
If the allocation fails, returns the error from CUDA. If size is large enough that size * mem::size_of::<T>() overflows usize, then returns InvalidMemoryAllocation.
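The overflow rule can be sketched on the host with checked_mul. This only illustrates the arithmetic: the helper name allocation_size and the use of Option in place of CudaResult are assumptions, not part of cust.

```rust
use std::mem;

/// Hypothetical helper (not part of cust): the byte size of an allocation
/// holding `size` elements of `T`, or `None` if the product overflows usize.
fn allocation_size<T>(size: usize) -> Option<usize> {
    // checked_mul returns None instead of wrapping around on overflow.
    size.checked_mul(mem::size_of::<T>())
}
```

For example, allocation_size::<u64>(5) yields Some(40), while allocation_size::<u64>(usize::MAX) yields None, which is the case the real function reports as InvalidMemoryAllocation.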
Safety
The caller must ensure that the contents of the buffer are initialized before reading from the buffer.
Examples
use cust::memory::*;
let mut buffer = unsafe { DeviceBuffer::uninitialized(5).unwrap() };
buffer.copy_from(&[0u64, 1, 2, 3, 4]).unwrap();
pub unsafe fn uninitialized_async(
size: usize,
stream: &Stream
) -> CudaResult<Self>
Allocates device memory asynchronously on a stream, without initializing it.
This doesn’t actually allocate if T is zero-sized.
Safety
The allocated memory retains all of the unsafety of DeviceBuffer::uninitialized, with the additional consideration that the memory cannot be used until it is actually allocated on the stream. This means proper stream ordering semantics must be followed, such as only enqueuing kernel launches that use the memory AFTER the allocation call.
You can synchronize the stream to ensure the memory allocation operation is complete.
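The ordering requirement can be pictured with a toy, host-only model of a stream as a FIFO queue. Everything in this sketch (ToyStream, Op, the error string) is hypothetical and is not how cust or the CUDA driver is implemented; it only shows why work that uses the memory must be enqueued after the allocation.

```rust
use std::collections::VecDeque;

/// Hypothetical operations that can be enqueued on the toy stream.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Alloc,
    Kernel,
    Free,
}

/// Toy model of a stream: operations run strictly in enqueue order.
struct ToyStream {
    queue: VecDeque<Op>,
    allocated: bool,
}

impl ToyStream {
    fn new() -> Self {
        ToyStream { queue: VecDeque::new(), allocated: false }
    }

    fn enqueue(&mut self, op: Op) {
        self.queue.push_back(op);
    }

    /// Runs every queued operation in order. Fails if a kernel would
    /// touch the memory while it is not allocated.
    fn synchronize(&mut self) -> Result<(), String> {
        while let Some(op) = self.queue.pop_front() {
            match op {
                Op::Alloc => self.allocated = true,
                Op::Free => self.allocated = false,
                Op::Kernel if !self.allocated => {
                    return Err("kernel ran before the allocation".to_string());
                }
                Op::Kernel => {}
            }
        }
        Ok(())
    }
}
```

Enqueuing Alloc, Kernel, Free succeeds in this model, while enqueuing Kernel before Alloc fails. The toy model reports the mistake as an error; the real driver gives no such guarantee, which is why the function is unsafe.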
pub fn drop_async(self, stream: &Stream) -> CudaResult<()>
Enqueues an operation to free the memory backing this DeviceBuffer on a particular stream. The stream will free the allocation as soon as it reaches the operation in the stream. You can ensure the memory is freed by synchronizing the stream.
This function uses internal memory pool semantics. Async allocations will reserve memory in the default memory pool in the stream, and async frees will release the memory back to the pool for further use by async allocations.
The memory inside of the pool is all freed back to the OS once the stream is synchronized unless a custom pool is configured to not do so.
Examples
use cust::{memory::*, stream::*};
let stream = Stream::new(StreamFlags::DEFAULT, None)?;
let mut host_vals = [1, 2, 3];
unsafe {
    let mut allocated = DeviceBuffer::from_slice_async(&[4u8, 5, 6], &stream)?;
    allocated.async_copy_to(&mut host_vals, &stream)?;
    allocated.drop_async(&stream)?;
}
// ensure all async ops are done before trying to access the value
stream.synchronize()?;
assert_eq!(host_vals, [4, 5, 6]);
pub unsafe fn from_raw_parts(
ptr: DevicePointer<T>,
capacity: usize
) -> DeviceBuffer<T>
Creates a DeviceBuffer<T> directly from the raw components of another device buffer.
Safety
This is highly unsafe, due to the number of invariants that aren’t checked:
- ptr needs to have been previously allocated via DeviceBuffer or cuda_malloc.
- ptr’s T needs to have the same size and alignment as it was allocated with.
- capacity needs to be the capacity that the pointer was allocated with.
Violating these may cause problems like corrupting the CUDA driver’s internal data structures.
The ownership of ptr is effectively transferred to the DeviceBuffer<T>, which may then deallocate, reallocate or change the contents of memory pointed to by the pointer at will. Ensure that nothing else uses the pointer after calling this function.
Examples
use std::mem;
use cust::memory::*;
let mut buffer = DeviceBuffer::from_slice(&[0u64; 5]).unwrap();
let ptr = buffer.as_device_ptr();
let size = buffer.len();
mem::forget(buffer);
let buffer = unsafe { DeviceBuffer::from_raw_parts(ptr, size) };
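The same ownership hand-off exists for host memory in the standard library, which may make the invariants easier to see. The sketch below uses Vec::from_raw_parts as a host-side analogy; the function name roundtrip is hypothetical and nothing here touches the GPU.

```rust
use std::mem::ManuallyDrop;

/// Host-side analogy (not cust code): splits a Vec into its raw parts and
/// rebuilds it, mirroring the DeviceBuffer example.
fn roundtrip(v: Vec<u64>) -> Vec<u64> {
    // Stop the destructor from running, like mem::forget on the buffer.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: ptr, len and cap come from a live Vec<u64> that is never
    // used again, so ownership of the allocation moves to the new Vec.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

As with DeviceBuffer::from_raw_parts, passing a pointer from a different allocator or a wrong capacity would be undefined behavior; the element type and capacity must match the original allocation.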
pub fn drop(dev_buf: DeviceBuffer<T>) -> DropResult<DeviceBuffer<T>>
Destroy a DeviceBuffer, returning an error if deallocation fails.
Deallocating device memory can return errors from previous asynchronous work. This function destroys the given buffer and returns the error and the un-destroyed buffer on failure.
Example
use cust::memory::*;
let x = DeviceBuffer::from_slice(&[10, 20, 30]).unwrap();
match DeviceBuffer::drop(x) {
    Ok(()) => println!("Successfully destroyed"),
    Err((e, buf)) => {
        println!("Failed to destroy buffer: {:?}", e);
        // Do something with buf
    },
}
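The shape of this API, where a failed destroy hands the value back together with the error, can be sketched with plain std types. Resource, ReleaseResult and release are hypothetical names used only for illustration; they mirror the Ok(()) and Err((e, buf)) cases matched in the example.

```rust
/// Hypothetical stand-in for a resource whose release can fail.
#[derive(Debug, PartialEq)]
struct Resource {
    id: u32,
    release_fails: bool,
}

/// Same shape as the result matched in the example: on failure, the caller
/// receives both the error and the value that could not be destroyed.
type ReleaseResult<T> = Result<(), (String, T)>;

impl Resource {
    /// Consumes the resource. On failure the caller gets it back and can
    /// retry later or decide to leak it deliberately.
    fn release(res: Resource) -> ReleaseResult<Resource> {
        if res.release_fails {
            Err((format!("release of {} failed", res.id), res))
        } else {
            Ok(())
        }
    }
}
```

Returning the value on failure is what distinguishes this from the Drop implementation, which has no way to report an error to the caller.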
impl<T: DeviceCopy + Zeroable> DeviceBuffer<T>
pub fn zeroed(size: usize) -> CudaResult<Self>
This is supported on crate feature bytemuck only.
Allocate device memory and fill it with zeroes (0u8).
This doesn’t actually allocate if T is zero-sized.
Examples
use cust::memory::*;
let mut zero = DeviceBuffer::zeroed(4).unwrap();
let mut values = [1u8, 2, 3, 4];
zero.copy_to(&mut values).unwrap();
assert_eq!(values, [0; 4]);
pub unsafe fn zeroed_async(size: usize, stream: &Stream) -> CudaResult<Self>
This is supported on crate feature bytemuck only.
Allocates device memory asynchronously and asynchronously fills it with zeroes (0u8).
This doesn’t actually allocate if T is zero-sized.
Safety
This method enqueues two operations on the stream: An async allocation and an async memset. Because of this, you must ensure that:
- The memory is not used in any way before it is actually allocated on the stream. You can ensure this happens by synchronizing the stream explicitly or using events.
Examples
use cust::{memory::*, stream::*};
let stream = Stream::new(StreamFlags::DEFAULT, None)?;
let mut values = [1u8, 2, 3, 4];
unsafe {
    let mut zero = DeviceBuffer::zeroed_async(4, &stream)?;
    zero.async_copy_to(&mut values, &stream)?;
    zero.drop_async(&stream)?;
}
stream.synchronize()?;
assert_eq!(values, [0; 4]);
impl<A: DeviceCopy + Pod> DeviceBuffer<A>
pub fn cast<B: Pod + DeviceCopy>(self) -> DeviceBuffer<B>
This is supported on crate feature bytemuck only.
pub fn try_cast<B: Pod + DeviceCopy>(
self
) -> Result<DeviceBuffer<B>, PodCastError>
This is supported on crate feature bytemuck only.
Tries to convert a DeviceBuffer of type A to a DeviceBuffer of type B, returning an error if the conversion fails.
The length of the buffer after the conversion may have changed.
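How the length can change is easiest to see from the byte arithmetic. The sketch below is a hypothetical helper, not part of cust or bytemuck, and it assumes try_cast follows size rules like those of bytemuck's slice casts; alignment checks are left out entirely.

```rust
use std::mem::size_of;

/// Hypothetical helper: the element count after reinterpreting `len`
/// elements of `A` as elements of `B`, or `None` when the sizes do not
/// allow a conversion. Alignment is deliberately ignored here.
fn cast_len<A, B>(len: usize) -> Option<usize> {
    let (a, b) = (size_of::<A>(), size_of::<B>());
    if a == b {
        // Same element size: the length is unchanged.
        return Some(len);
    }
    if a == 0 || b == 0 {
        // Assumed rule: zero-sized and sized types do not convert.
        return None;
    }
    let bytes = len.checked_mul(a)?;
    // Leftover bytes would not form a whole `B`.
    if bytes % b == 0 { Some(bytes / b) } else { None }
}
```

Under these assumptions, two u32 values become eight u8 values, eight u8 values become two u32 values, and six u8 values cannot be viewed as u32 at all.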
Failure
impl<T: DeviceCopy> DeviceBuffer<T>
pub fn from_slice(slice: &[T]) -> CudaResult<Self>
pub unsafe fn from_slice_async(slice: &[T], stream: &Stream) -> CudaResult<Self>
Asynchronously allocate a new buffer of the same size as slice, initialized with a clone of the data in slice.
Safety
For why this function is unsafe, see AsyncCopyDestination.
Errors
If the allocation fails, returns the error from CUDA.
Examples
use cust::memory::*;
use cust::stream::{Stream, StreamFlags};
let stream = Stream::new(StreamFlags::NON_BLOCKING, None).unwrap();
let values = [0u64; 5];
unsafe {
    let mut buffer = DeviceBuffer::from_slice_async(&values, &stream).unwrap();
    // Wait for the allocation and copy to finish; do not ignore the result.
    stream.synchronize().unwrap();
    // Perform some operation on the buffer
}
pub fn as_slice(&self) -> &DeviceSlice<T>
Explicitly creates a DeviceSlice from this buffer.
Methods from Deref<Target = DeviceSlice<T>>
pub fn as_host_vec(&self) -> CudaResult<Vec<T>>
pub fn len(&self) -> usize
Returns the number of elements in the slice.
Examples
use cust::memory::*;
let a = DeviceBuffer::from_slice(&[1, 2, 3]).unwrap();
assert_eq!(a.len(), 3);
pub fn is_empty(&self) -> bool
Returns true if the slice has a length of 0.
Examples
use cust::memory::*;
let a: DeviceBuffer<u64> = unsafe { DeviceBuffer::uninitialized(0).unwrap() };
assert!(a.is_empty());
pub fn as_device_ptr(&self) -> DevicePointer<T>
Return a raw device-pointer to the slice’s buffer.
The caller must ensure that the slice outlives the pointer this function returns, or else it will end up pointing to garbage. The caller must also ensure that the pointer is not dereferenced by the CPU.
Examples:
use cust::memory::*;
let a = DeviceBuffer::from_slice(&[1, 2, 3]).unwrap();
println!("{:p}", a.as_device_ptr());
pub fn set_8(&mut self, value: u8) -> CudaResult<()>
This is supported on crate feature bytemuck only.
Sets the memory range of this buffer to contiguous 8-bit values of value.
In total it will set size_of::<T>() * len values of value contiguously.
pub unsafe fn set_8_async(
&mut self,
value: u8,
stream: &Stream
) -> CudaResult<()>
This is supported on crate feature bytemuck only.
Sets the memory range of this buffer to contiguous 8-bit values of value asynchronously.
In total it will set size_of::<T>() * len values of value contiguously.
Safety
This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.
pub fn set_16(&mut self, value: u16) -> CudaResult<()>
This is supported on crate feature bytemuck only.
Sets the memory range of this buffer to contiguous 16-bit values of value.
In total it will set (size_of::<T>() * len) / 2 values of value contiguously.
Panics
Panics if:
- self.ptr % 2 != 0 (the pointer is not aligned to at least 2 bytes).
- (size_of::<T>() * self.len) % 2 != 0 (the data size is not a multiple of 2 bytes).
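The two panic conditions and the resulting value count can be sketched as a plain function. fill_16_count is a hypothetical helper, not part of cust; it takes the device address as a bare integer and returns None where the documented checks would panic.

```rust
/// Hypothetical helper: how many 16-bit values a 16-bit fill would write
/// into `len` elements of `elem_size` bytes starting at address `addr`,
/// or `None` in the cases where the documented checks would panic.
fn fill_16_count(addr: u64, elem_size: usize, len: usize) -> Option<usize> {
    let bytes = elem_size.checked_mul(len)?;
    // The pointer must be 2-byte aligned and the byte size a multiple of 2.
    if addr % 2 != 0 || bytes % 2 != 0 {
        return None;
    }
    Some(bytes / 2)
}
```

Three 4-byte elements at an even address take six 16-bit values, while three 1-byte elements cannot be filled this way because three bytes are not a multiple of two.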
pub unsafe fn set_16_async(
&mut self,
value: u16,
stream: &Stream
) -> CudaResult<()>
This is supported on crate feature bytemuck only.
Sets the memory range of this buffer to contiguous 16-bit values of value asynchronously.
In total it will set (size_of::<T>() * len) / 2 values of value contiguously.
Panics
Panics if:
- self.ptr % 2 != 0 (the pointer is not aligned to at least 2 bytes).
- (size_of::<T>() * self.len) % 2 != 0 (the data size is not a multiple of 2 bytes).
Safety
This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.
pub fn set_32(&mut self, value: u32) -> CudaResult<()>
This is supported on crate feature bytemuck only.
Sets the memory range of this buffer to contiguous 32-bit values of value.
In total it will set (size_of::<T>() * len) / 4 values of value contiguously.
Panics
Panics if:
- self.ptr % 4 != 0 (the pointer is not aligned to at least 4 bytes).
- (size_of::<T>() * self.len) % 4 != 0 (the data size is not a multiple of 4 bytes).
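Because the fill value is a raw 32-bit pattern, filling a buffer of f32 with a given number means passing that number's bit representation, not the number converted to an integer. The host-only sketch below shows the conversion with the standard library; the helper name fill_pattern is hypothetical.

```rust
/// Hypothetical helper: the u32 pattern that, used as a 32-bit fill value,
/// makes every f32 in the filled range equal to `x`.
fn fill_pattern(x: f32) -> u32 {
    // to_bits reinterprets the IEEE 754 encoding; `x as u32` would
    // instead convert numerically and give a different pattern.
    x.to_bits()
}
```

For instance, fill_pattern(1.0) is 0x3f80_0000, whereas 1.0 as u32 is 1, which reads back as a tiny subnormal float. Assuming a buffer of f32, the pattern would then be passed as the value argument of set_32.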
pub unsafe fn set_32_async(
&mut self,
value: u32,
stream: &Stream
) -> CudaResult<()>
This is supported on crate feature bytemuck only.
Sets the memory range of this buffer to contiguous 32-bit values of value asynchronously.
In total it will set (size_of::<T>() * len) / 4 values of value contiguously.
Panics
Panics if:
- self.ptr % 4 != 0 (the pointer is not aligned to at least 4 bytes).
- (size_of::<T>() * self.len) % 4 != 0 (the data size is not a multiple of 4 bytes).
Safety
This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.
pub fn set_zero(&mut self) -> CudaResult<()>
Sets this slice’s data to zero.
pub unsafe fn set_zero_async(&mut self, stream: &Stream) -> CudaResult<()>
Sets this slice’s data to zero asynchronously.
Safety
This operation is asynchronous, so it does not complete immediately; it uses stream-ordering semantics. Therefore, you should not read from or write to the memory range until the operation is complete.
pub fn index<Idx: DeviceSliceIndex<T>>(&self, idx: Idx) -> DeviceSlice<T>
Trait Implementations
impl<T: DeviceCopy> AsyncCopyDestination<DeviceBuffer<T>> for DeviceSlice<T>
unsafe fn async_copy_from(
&mut self,
val: &DeviceBuffer<T>,
stream: &Stream
) -> CudaResult<()>
Asynchronously copy data from source. source must be the same size as self.
unsafe fn async_copy_to(
&self,
val: &mut DeviceBuffer<T>,
stream: &Stream
) -> CudaResult<()>
Asynchronously copy data to dest. dest must be the same size as self.
impl<T: DeviceCopy> CopyDestination<DeviceBuffer<T>> for DeviceSlice<T>
fn copy_from(&mut self, val: &DeviceBuffer<T>) -> CudaResult<()>
Copy data from source. source must be the same size as self.
fn copy_to(&self, val: &mut DeviceBuffer<T>) -> CudaResult<()>
Copy data to dest. dest must be the same size as self.
impl<T: Debug + DeviceCopy> Debug for DeviceBuffer<T>
impl<T: DeviceCopy> Deref for DeviceBuffer<T>
type Target = DeviceSlice<T>
The resulting type after dereferencing.
fn deref(&self) -> &DeviceSlice<T>
Dereferences the value.
impl<T: DeviceCopy> DerefMut for DeviceBuffer<T>
fn deref_mut(&mut self) -> &mut DeviceSlice<T>
Mutably dereferences the value.
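These two impls are what make the DeviceSlice methods callable directly on a DeviceBuffer. A minimal host-only sketch of the mechanism follows; ToySlice and ToyBuffer are hypothetical stand-ins backed by a Vec, not the real types.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical host-side stand-in for DeviceSlice.
struct ToySlice {
    data: Vec<u8>,
}

/// Hypothetical host-side stand-in for DeviceBuffer: it owns a slice.
struct ToyBuffer {
    slice: ToySlice,
}

impl ToySlice {
    fn len(&self) -> usize {
        self.data.len()
    }

    fn set_zero(&mut self) {
        self.data.iter_mut().for_each(|b| *b = 0);
    }
}

impl Deref for ToyBuffer {
    type Target = ToySlice;

    fn deref(&self) -> &ToySlice {
        &self.slice
    }
}

impl DerefMut for ToyBuffer {
    fn deref_mut(&mut self) -> &mut ToySlice {
        &mut self.slice
    }
}
```

With these impls, buf.len() and buf.set_zero() compile on a ToyBuffer even though both methods are defined on ToySlice: auto-deref finds &self methods through Deref and &mut self methods through DerefMut.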
impl<T: DeviceCopy> DeviceMemory for DeviceBuffer<T>
fn as_raw_ptr(&self) -> CUdeviceptr
Get the raw CUDA device pointer.
fn size_in_bytes(&self) -> usize
Get the size of the memory region in bytes.
impl<T: DeviceCopy> Drop for DeviceBuffer<T>
impl<T: DeviceCopy> GpuBuffer<T> for DeviceBuffer<T>
fn as_device_ptr(&self) -> DevicePointer<T>
fn len(&self) -> usize
impl<T: Send + DeviceCopy> Send for DeviceBuffer<T>
impl<T: Sync + DeviceCopy> Sync for DeviceBuffer<T>
Auto Trait Implementations
impl<T> RefUnwindSafe for DeviceBuffer<T> where
T: RefUnwindSafe,
impl<T> Unpin for DeviceBuffer<T>
impl<T> UnwindSafe for DeviceBuffer<T> where
T: RefUnwindSafe,
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.