pub struct KaioDevice { /* private fields */ }
A KAIO GPU device — wraps a CUDA context and its default stream.
Created via KaioDevice::new with a device ordinal (0 for the first GPU).
All allocation and transfer operations go through the default stream.
§Example
let device = KaioDevice::new(0)?;
let buf = device.alloc_from(&[1.0f32, 2.0, 3.0])?;
let host = buf.to_host(&device)?;
Implementations
impl KaioDevice
pub fn new(ordinal: usize) -> Result<Self>
Create a new device targeting the GPU at the given ordinal.
Ordinal 0 is the first GPU. Returns an error if no GPU exists at that ordinal or if the CUDA driver fails to initialize.
pub fn info(&self) -> Result<DeviceInfo>
Query basic information about this device.
pub fn alloc_from<T: DeviceRepr>(&self, data: &[T]) -> Result<GpuBuffer<T>>
Allocate device memory and copy data from a host slice.
pub fn alloc_zeros<T: DeviceRepr + ValidAsZeroBits>(&self, len: usize) -> Result<GpuBuffer<T>>
Allocate zero-initialized device memory.
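The ValidAsZeroBits bound restricts alloc_zeros to element types for which an all-zero bit pattern is a valid value. The following is a host-only sketch of that idea and does not touch the GPU. The trait ZeroBitsOk and the function host_zeros are hypothetical names invented for this illustration; they are not part of KAIO or cudarc.

```rust
/// Hypothetical stand-in for a `ValidAsZeroBits`-style bound (not a KAIO or
/// cudarc item). Implementing it asserts that the all-zero bit pattern is a
/// valid value of the type.
unsafe trait ZeroBitsOk: Copy {}

unsafe impl ZeroBitsOk for f32 {}
unsafe impl ZeroBitsOk for u32 {}
unsafe impl ZeroBitsOk for i64 {}

/// Host-side analogue of a zero-filled allocation of `len` elements.
fn host_zeros<T: ZeroBitsOk>(len: usize) -> Vec<T> {
    // SAFETY: `ZeroBitsOk` promises that all-zero bits form a valid `T`.
    let zero: T = unsafe { std::mem::zeroed() };
    vec![zero; len]
}

fn main() {
    let floats: Vec<f32> = host_zeros(4);
    let ints: Vec<u32> = host_zeros(3);
    println!("{:?} {:?}", floats, ints);
}
```

A type such as a reference or NonZeroU32 would not qualify, because zeroed memory is not a valid value for it; that is the kind of misuse the bound is meant to rule out at compile time.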
pub fn stream(&self) -> &Arc<CudaStream>
Access the underlying CUDA stream for kernel launch operations.
Used with cudarc’s launch_builder to launch kernels. In Phase 2,
the proc macro will generate typed wrappers that hide this.
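Launching a kernel on the stream also requires choosing how many thread blocks to start. As a rough sketch, a 1-D launch over n elements typically rounds the block count up so that every element is covered. LaunchShape and launch_shape below are hypothetical helpers written for this illustration only; they are not KAIO or cudarc types, and the block size of 256 is an arbitrary assumption.

```rust
/// Hypothetical 1-D launch shape (illustration only, not a KAIO or cudarc type).
#[derive(Debug, PartialEq)]
struct LaunchShape {
    grid_x: u32,  // number of thread blocks
    block_x: u32, // threads per block
}

/// Choose enough blocks of `block_x` threads to cover `n` elements.
fn launch_shape(n: u32, block_x: u32) -> LaunchShape {
    assert!(block_x > 0, "block size must be non-zero");
    // Ceiling division written so that it cannot overflow for large `n`.
    let grid_x = n / block_x + u32::from(n % block_x != 0);
    LaunchShape { grid_x, block_x }
}

fn main() {
    // 1000 elements with 256 threads per block need 4 blocks (1024 threads).
    let shape = launch_shape(1000, 256);
    println!("{:?}", shape);
}
```

Because the last block may contain more threads than remaining elements, the kernel itself is expected to bounds-check its global index against n.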
pub fn load_ptx(&self, ptx_text: &str) -> Result<KaioModule>
Load a PTX module from source text and return a KaioModule.
The PTX text is passed directly to the CUDA driver's cuModuleLoadData;
no NVRTC compilation occurs. The driver JIT-compiles the PTX for
the current GPU.
§Example
let module = device.load_ptx(&ptx_text)?;
let func = module.function("vector_add")?;
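For context, here is a sketch of what the ptx_text in the example above might contain for a kernel named vector_add. The PTX body, its parameter names, the .version and .target values, and the helper entry_names are all assumptions made for illustration; the original does not show the actual kernel. The program only inspects the string on the host and does not load it onto a device.

```rust
/// Illustrative PTX for an element-wise f32 add (assumed, not from KAIO).
const VECTOR_ADD_PTX: &str = r#"
.version 7.0
.target sm_70
.address_size 64

.visible .entry vector_add(
    .param .u64 a_ptr,
    .param .u64 b_ptr,
    .param .u64 out_ptr,
    .param .u32 n
)
{
    .reg .pred %p<2>;
    .reg .f32 %f<4>;
    .reg .b32 %r<6>;
    .reg .b64 %rd<11>;

    ld.param.u64 %rd1, [a_ptr];
    ld.param.u64 %rd2, [b_ptr];
    ld.param.u64 %rd3, [out_ptr];
    ld.param.u32 %r1, [n];

    mov.u32 %r2, %ctaid.x;
    mov.u32 %r3, %ntid.x;
    mov.u32 %r4, %tid.x;
    mad.lo.s32 %r5, %r2, %r3, %r4;
    setp.ge.u32 %p1, %r5, %r1;
    @%p1 bra DONE;

    cvta.to.global.u64 %rd4, %rd1;
    cvta.to.global.u64 %rd5, %rd2;
    cvta.to.global.u64 %rd6, %rd3;
    mul.wide.u32 %rd7, %r5, 4;
    add.s64 %rd8, %rd4, %rd7;
    add.s64 %rd9, %rd5, %rd7;
    add.s64 %rd10, %rd6, %rd7;
    ld.global.f32 %f1, [%rd8];
    ld.global.f32 %f2, [%rd9];
    add.f32 %f3, %f1, %f2;
    st.global.f32 [%rd10], %f3;

DONE:
    ret;
}
"#;

/// Hypothetical helper: list the kernel names declared with `.entry`.
fn entry_names(ptx: &str) -> Vec<&str> {
    ptx.lines()
        .filter_map(|line| line.split_once(".entry "))
        .map(|(_, rest)| rest.split('(').next().unwrap_or(rest).trim())
        .collect()
}

fn main() {
    // The name found here is the one a later function lookup would use.
    println!("{:?}", entry_names(VECTOR_ADD_PTX));
}
```

The string passed to module.function must match the .entry name in the PTX exactly, so a quick host-side check like this can catch a mismatched kernel name before the module is loaded.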