pub struct DeviceBuffer<T: Copy> { /* private fields */ }
A contiguous buffer of T elements allocated in GPU device memory.
The buffer owns the underlying CUdeviceptr allocation and frees it on
drop. All copy operations validate that source and destination lengths
match, returning CudaError::InvalidValue on mismatch.
Implementations§
impl<T: Copy> DeviceBuffer<T>
pub fn view_as<U: Copy>(&self) -> CudaResult<BufferView<'_, U>>
Reinterprets this buffer as a different element type U (immutable).
The total byte size of the buffer must be evenly divisible by
size_of::<U>(). The resulting view has byte_size / size_of::<U>()
elements.
§Errors
Returns CudaError::InvalidValue if:
- size_of::<U>() is zero (ZST).
- The buffer’s byte size is not divisible by size_of::<U>().
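The element-count rule above can be checked on the host without a GPU. The following is a minimal sketch, not the crate's implementation: `view_len` and `ViewError` are hypothetical names that only mirror the documented checks (zero-sized target type, byte size not divisible by the target size).

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the documented `CudaError::InvalidValue`.
#[derive(Debug, PartialEq)]
enum ViewError {
    InvalidValue,
}

/// Number of `U` elements a view over `len` elements of `T` would contain,
/// following the documented rule `byte_size / size_of::<U>()`.
fn view_len<T: Copy, U: Copy>(len: usize) -> Result<usize, ViewError> {
    let byte_size = len * size_of::<T>();
    let target = size_of::<U>();
    // Reject zero-sized targets and byte sizes that do not divide evenly.
    if target == 0 || byte_size % target != 0 {
        return Err(ViewError::InvalidValue);
    }
    Ok(byte_size / target)
}

fn main() {
    // 6 u16 elements occupy 12 bytes, which is exactly 3 u32 elements.
    println!("{:?}", view_len::<u16, u32>(6));
    // 3 u16 elements occupy 6 bytes, which is not a multiple of 4.
    println!("{:?}", view_len::<u16, u32>(3));
    // A zero-sized target type is rejected.
    println!("{:?}", view_len::<u16, ()>(4));
}
```

The same arithmetic applies to the mutable variant, since both views are described with identical size rules.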
pub fn view_as_mut<U: Copy>(&mut self) -> CudaResult<BufferViewMut<'_, U>>
Reinterprets this buffer as a different element type U (mutable).
The total byte size of the buffer must be evenly divisible by
size_of::<U>(). The resulting view has byte_size / size_of::<U>()
elements.
§Errors
Returns CudaError::InvalidValue if:
- size_of::<U>() is zero (ZST).
- The buffer’s byte size is not divisible by size_of::<U>().
impl<T: Copy> DeviceBuffer<T>
pub fn alloc(n: usize) -> CudaResult<Self>
Allocates a device buffer capable of holding n elements of type T.
§Errors
- CudaError::InvalidValue if n is zero.
- CudaError::OutOfMemory if the GPU cannot satisfy the request.
- Other driver errors propagated from cuMemAlloc_v2.
pub fn zeroed(n: usize) -> CudaResult<Self>
pub fn from_host(data: &[T]) -> CudaResult<Self>
Allocates a device buffer and copies the contents of data into it.
The resulting buffer has the same length as the input slice.
§Errors
- CudaError::InvalidValue if data is empty.
- Other driver errors from allocation or the host-to-device copy.
pub fn copy_from_host(&mut self, src: &[T]) -> CudaResult<()>
Copies data from a host slice into this device buffer (synchronous).
The slice length must exactly match the buffer length.
§Errors
- CudaError::InvalidValue if src.len() != self.len().
- Other driver errors from cuMemcpyHtoD_v2.
pub fn copy_to_host(&self, dst: &mut [T]) -> CudaResult<()>
Copies this device buffer’s contents into a host slice (synchronous).
The slice length must exactly match the buffer length.
§Errors
- CudaError::InvalidValue if dst.len() != self.len().
- Other driver errors from cuMemcpyDtoH_v2.
pub fn copy_from_device(&mut self, src: &DeviceBuffer<T>) -> CudaResult<()>
Copies the entire contents of another device buffer into this one.
Both buffers must have the same length.
§Errors
- CudaError::InvalidValue if src.len() != self.len().
- Other driver errors from cuMemcpyDtoD_v2.
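The length contract shared by the synchronous copy methods can be illustrated with a host-only model. This is a sketch under stated assumptions, not the crate's code: `MockBuffer` and `MockError` are hypothetical, a `Vec<T>` stands in for device memory, and no driver call is made. Only the documented length checks are reproduced.

```rust
/// Hypothetical stand-in for the documented `CudaError::InvalidValue`.
#[derive(Debug, PartialEq)]
enum MockError {
    InvalidValue,
}

/// Host-only model of a device buffer: a `Vec<T>` plays the role of device memory.
struct MockBuffer<T: Copy> {
    data: Vec<T>,
}

impl<T: Copy> MockBuffer<T> {
    /// Mirrors `from_host`: an empty input slice is rejected.
    fn from_host(data: &[T]) -> Result<Self, MockError> {
        if data.is_empty() {
            return Err(MockError::InvalidValue);
        }
        Ok(MockBuffer { data: data.to_vec() })
    }

    fn len(&self) -> usize {
        self.data.len()
    }

    /// Mirrors `copy_from_host`: lengths must match exactly.
    fn copy_from_host(&mut self, src: &[T]) -> Result<(), MockError> {
        if src.len() != self.len() {
            return Err(MockError::InvalidValue);
        }
        self.data.copy_from_slice(src);
        Ok(())
    }

    /// Mirrors `copy_to_host`: lengths must match exactly.
    fn copy_to_host(&self, dst: &mut [T]) -> Result<(), MockError> {
        if dst.len() != self.len() {
            return Err(MockError::InvalidValue);
        }
        dst.copy_from_slice(&self.data);
        Ok(())
    }

    /// Mirrors `copy_from_device`: both buffers must have the same length.
    fn copy_from_device(&mut self, src: &MockBuffer<T>) -> Result<(), MockError> {
        if src.len() != self.len() {
            return Err(MockError::InvalidValue);
        }
        self.data.copy_from_slice(&src.data);
        Ok(())
    }
}

fn main() {
    let mut a = MockBuffer::from_host(&[1.0f32, 2.0, 3.0]).unwrap();
    let b = MockBuffer::from_host(&[4.0f32, 5.0, 6.0]).unwrap();
    a.copy_from_device(&b).unwrap();

    let mut out = [0.0f32; 3];
    a.copy_to_host(&mut out).unwrap();
    println!("{:?}", out);

    // A length mismatch is reported as an error; nothing is copied.
    println!("{:?}", a.copy_from_host(&[0.0f32; 2]));
}
```

The point of the model is that a mismatch never results in a partial copy: the check happens before any data moves.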
pub fn copy_from_host_async(
    &mut self,
    src: &[T],
    stream: &Stream,
) -> CudaResult<()>
Asynchronously copies data from a host slice into this device buffer.
The copy is enqueued on stream and may not be complete when this
function returns. The caller must ensure that src remains valid
(i.e., is not moved or dropped) until the stream has been
synchronised. For guaranteed correctness, prefer using a
PinnedBuffer as the source.
§Errors
- CudaError::InvalidValue if src.len() != self.len().
- Other driver errors from cuMemcpyHtoDAsync_v2.
pub fn copy_to_host_async(
    &self,
    dst: &mut [T],
    stream: &Stream,
) -> CudaResult<()>
Asynchronously copies this device buffer’s contents into a host slice.
The copy is enqueued on stream and may not be complete when this
function returns. The caller must ensure that dst remains valid
and is not read until the stream has been synchronised. For
guaranteed correctness, prefer using a
PinnedBuffer as the destination.
§Errors
- CudaError::InvalidValue if dst.len() != self.len().
- Other driver errors from cuMemcpyDtoHAsync_v2.
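The ordering requirement for the asynchronous copies (enqueue, synchronise, and only then touch the host slice) can be sketched with a host-only model. Everything here is hypothetical: `MockStream` is not the crate's `Stream`, and the free function `copy_to_host_async` only imitates the documented behaviour by validating lengths immediately and deferring the copy until `synchronize`. Unlike the documented signatures, this mock ties the slice borrow to the stream's lifetime so the compiler enforces the rule; with the real API that obligation rests on the caller.

```rust
/// Hypothetical stand-in for a stream: enqueued work runs only in `synchronize`.
struct MockStream<'a> {
    pending: Vec<Box<dyn FnOnce() + 'a>>,
}

impl<'a> MockStream<'a> {
    fn new() -> Self {
        MockStream { pending: Vec::new() }
    }

    /// Queues work without running it.
    fn enqueue(&mut self, work: impl FnOnce() + 'a) {
        self.pending.push(Box::new(work));
    }

    /// Runs all queued work in submission order.
    fn synchronize(&mut self) {
        for work in self.pending.drain(..) {
            work();
        }
    }
}

/// Models an asynchronous device-to-host copy: the length check happens now,
/// the copy itself happens when the stream is synchronised.
fn copy_to_host_async<'a>(
    device: &'a [u32],
    dst: &'a mut [u32],
    stream: &mut MockStream<'a>,
) -> Result<(), &'static str> {
    if dst.len() != device.len() {
        return Err("InvalidValue");
    }
    stream.enqueue(move || dst.copy_from_slice(device));
    Ok(())
}

fn run_demo() -> Vec<u32> {
    let device = vec![1u32, 2, 3, 4]; // plays the role of device memory
    let mut host = vec![0u32; 4];
    {
        let mut stream = MockStream::new();
        copy_to_host_async(&device, &mut host, &mut stream).unwrap();
        // `host` is still borrowed by the queued copy and cannot be read here.
        stream.synchronize();
    }
    // The stream is gone, so the copy has completed and `host` is safe to read.
    host
}

fn main() {
    println!("{:?}", run_demo());
}
```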
pub fn is_empty(&self) -> bool
Returns true if the buffer contains zero elements.
In practice this is always false because alloc and from_host
both reject zero-length requests.
pub fn as_device_ptr(&self) -> CUdeviceptr
Returns the raw CUdeviceptr handle for this buffer.
This is useful when passing the pointer to kernel launch parameters or other low-level driver calls.
pub fn slice(&self, offset: usize, len: usize) -> CudaResult<DeviceSlice<'_, T>>
Returns a borrowed DeviceSlice referencing a sub-range of this
buffer starting at element offset and spanning len elements.
§Errors
Returns CudaError::InvalidValue if the requested range exceeds
the buffer bounds (i.e., offset + len > self.len()).
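The bounds rule can be expressed as a small host-side check. This is a sketch only: `check_range` and `RangeError` are hypothetical names, and the use of `checked_add` is an assumption of this example (it keeps a very large `offset + len` from wrapping around), not a statement about how the crate computes the sum.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the documented `CudaError::InvalidValue`.
#[derive(Debug, PartialEq)]
enum RangeError {
    InvalidValue,
}

/// Validates a sub-range of a buffer holding `buf_len` elements and returns
/// the element range it covers.
fn check_range(offset: usize, len: usize, buf_len: usize) -> Result<Range<usize>, RangeError> {
    // Assumption of this sketch: treat arithmetic overflow as out of bounds.
    match offset.checked_add(len) {
        Some(end) if end <= buf_len => Ok(offset..end),
        _ => Err(RangeError::InvalidValue),
    }
}

fn main() {
    // Elements 2, 3 and 4 of an 8-element buffer.
    println!("{:?}", check_range(2, 3, 8));
    // 6 + 3 = 9 exceeds the 8-element buffer.
    println!("{:?}", check_range(6, 3, 8));
}
```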