pub struct GpuBuffer<T> { /* private fields */ }
A typed buffer in GPU device memory, wrapping cudarc’s CudaSlice<T>.
Created via KaioDevice::alloc_from or KaioDevice::alloc_zeros.
§Memory management
GpuBuffer does not implement Drop manually — cudarc’s
CudaSlice handles device memory deallocation automatically when
the buffer is dropped. The CudaSlice holds an Arc<CudaContext>
internally, ensuring the CUDA context outlives the allocation.
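The delegation described above can be illustrated without a GPU. The following is a minimal sketch using hypothetical stand-in types (`FakeSlice` and `FakeBuffer` are not part of KAIO or cudarc): a newtype with no `Drop` impl of its own still runs its field's `Drop` when it goes out of scope.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical stand-in for cudarc's CudaSlice<T>: records when it is dropped.
struct FakeSlice {
    freed: Rc<Cell<bool>>,
}

impl Drop for FakeSlice {
    fn drop(&mut self) {
        // The real type would release device memory here.
        self.freed.set(true);
    }
}

// Hypothetical stand-in for GpuBuffer<T>: a newtype with no manual Drop impl.
struct FakeBuffer {
    _inner: FakeSlice,
}

// Returns (freed before the wrapper is dropped, freed after it is dropped).
fn drop_is_delegated() -> (bool, bool) {
    let freed = Rc::new(Cell::new(false));
    let buf = FakeBuffer {
        _inner: FakeSlice { freed: Rc::clone(&freed) },
    };
    let before = freed.get();
    drop(buf); // Dropping the wrapper drops its field, which runs FakeSlice::drop.
    (before, freed.get())
}

fn main() {
    let (before, after) = drop_is_delegated();
    assert!(!before);
    assert!(after);
}
```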
§Representation — load-bearing
#[repr(transparent)] guarantees this newtype has identical memory
layout, size, and alignment to its sole field CudaSlice<T>. The
kaio-candle bridge crate relies on this to cast &CudaSlice<T>
(borrowed from candle’s CudaStorage) to &GpuBuffer<T> for passing
into kaio-ops kernel entry points without round-tripping through an
owned clone.
Do not remove #[repr(transparent)] or add a second field without
coordinating with kaio-candle. The soundness-assertion tests at the
bottom of this module will fail at compile time if the layout diverges.
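The pattern described above can be sketched with stand-in types. `Inner<T>` and `Wrapper<T>` below are hypothetical placeholders for `CudaSlice<T>` and `GpuBuffer<T>`, and the `const` assertions are only one possible way to write compile-time layout checks; they are not the module's actual tests or the bridge crate's actual code.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for cudarc's CudaSlice<T>.
struct Inner<T> {
    data: Vec<T>,
}

// Hypothetical stand-in for GpuBuffer<T>: a single-field transparent newtype.
#[repr(transparent)]
struct Wrapper<T> {
    inner: Inner<T>,
}

// Compile-time layout checks: the build fails if either condition is false.
const _: () = assert!(size_of::<Wrapper<f32>>() == size_of::<Inner<f32>>());
const _: () = assert!(align_of::<Wrapper<f32>>() == align_of::<Inner<f32>>());

impl<T> Wrapper<T> {
    // Reinterpret a borrowed Inner<T> as a borrowed Wrapper<T> without cloning.
    fn from_inner_ref(inner: &Inner<T>) -> &Wrapper<T> {
        // SAFETY: Wrapper<T> is #[repr(transparent)] over its only field,
        // Inner<T>, so both types have the same layout, and the returned
        // reference carries the same lifetime as the input reference.
        unsafe { &*(inner as *const Inner<T> as *const Wrapper<T>) }
    }

    fn len(&self) -> usize {
        self.inner.data.len()
    }
}

// A function that only accepts the wrapper type, called with a borrowed inner value.
fn borrowed_len(inner: &Inner<f32>) -> usize {
    Wrapper::from_inner_ref(inner).len()
}

fn main() {
    let inner = Inner { data: vec![1.0f32, 2.0, 3.0] };
    assert_eq!(borrowed_len(&inner), 3);
}
```

Removing `#[repr(transparent)]` from `Wrapper` in this sketch would leave the assertions passing on most targets but would make the pointer cast rely on unspecified layout, which is why the attribute is the real guarantee and the assertions are a secondary guard.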
Implementations§
impl<T> GpuBuffer<T>
pub fn from_cuda_slice(inner: CudaSlice<T>) -> Self
Wrap an existing cudarc CudaSlice as a GpuBuffer.
Takes ownership of the slice. When the returned GpuBuffer is dropped,
the underlying device allocation is freed through cudarc’s normal Drop
implementation.
Used by bridge crates (e.g. kaio-candle) to convert a freshly
allocated slice back into the KAIO buffer type after a kernel
produces its output.
pub fn into_cuda_slice(self) -> CudaSlice<T>
Consume the buffer and return the underlying cudarc CudaSlice.
Used by bridge crates to hand the owned output slice back to the
host framework (e.g. wrapping into candle_core::CudaStorage) after
a KAIO kernel has written into the buffer.
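The ownership flow that the two conversion methods support can be sketched on the host. Everything below is a hypothetical stand-in: `HostSlice` plays the role of `CudaSlice<T>`, `KernelBuffer` the role of `GpuBuffer<T>`, and `fake_kernel` the role of a KAIO kernel entry point. No device memory is involved.

```rust
// Hypothetical stand-in for cudarc's CudaSlice<f32>.
struct HostSlice {
    data: Vec<f32>,
}

// Hypothetical stand-in for GpuBuffer<f32>.
struct KernelBuffer {
    inner: HostSlice,
}

impl KernelBuffer {
    // Mirrors from_cuda_slice: takes ownership of the slice.
    fn from_slice(inner: HostSlice) -> Self {
        Self { inner }
    }

    // Mirrors into_cuda_slice: gives the slice back by value.
    fn into_slice(self) -> HostSlice {
        self.inner
    }
}

// Hypothetical kernel: writes 2 * index into each output element.
fn fake_kernel(out: &mut KernelBuffer) {
    for (i, x) in out.inner.data.iter_mut().enumerate() {
        *x = i as f32 * 2.0;
    }
}

// The bridge pattern: allocate, wrap, run the kernel, unwrap, return.
fn bridge_op(n: usize) -> HostSlice {
    let fresh = HostSlice { data: vec![0.0; n] };
    let mut buf = KernelBuffer::from_slice(fresh);
    fake_kernel(&mut buf);
    buf.into_slice()
}

fn main() {
    let out = bridge_op(3);
    assert_eq!(out.data, vec![0.0, 2.0, 4.0]);
}
```

Because both conversions move the value, the allocation has a single owner at every step and is never copied.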
impl<T: DeviceRepr + Default + Clone + Unpin> GpuBuffer<T>
pub fn to_host(&self, device: &KaioDevice) -> Result<Vec<T>>
Transfer buffer contents from device to host.
Requires a reference to the KaioDevice that created this buffer
(for stream access). The device is borrowed, not consumed.
§Example
let device = KaioDevice::new(0)?;
let buf = device.alloc_from(&[1.0f32, 2.0, 3.0])?;
let host_data = buf.to_host(&device)?;
assert_eq!(host_data, vec![1.0, 2.0, 3.0]);