Struct KaioDevice 

pub struct KaioDevice { /* private fields */ }

A KAIO GPU device — wraps a CUDA context and its default stream.

Created via KaioDevice::new with a device ordinal (0 for the first GPU). All allocation and transfer operations go through the default stream.

§Example

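// Open the first GPU (ordinal 0).
let device = KaioDevice::new(0)?;
// Allocate device memory and upload three f32 values from the host.
let buf = device.alloc_from(&[1.0f32, 2.0, 3.0])?;
// Copy the buffer contents back to the host.
let host = buf.to_host(&device)?;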

Implementations§

impl KaioDevice

pub fn new(ordinal: usize) -> Result<Self>

Create a new device targeting the GPU at the given ordinal.

Ordinal 0 is the first GPU. Returns an error if no GPU exists at that ordinal or if the CUDA driver fails to initialize.
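
Because new reports a missing GPU as an error instead of panicking, a caller can probe ordinals in order and fall back gracefully. The following is only a minimal sketch of that pattern: first_available and fake_open are hypothetical names that are not part of KAIO, and the helper is generic over the constructor so that it compiles without the crate or a GPU.

```rust
/// Hypothetical helper (not part of KAIO): try ordinals 0..max and return
/// the first device that opens, together with its ordinal.
fn first_available<D, E>(
    max: usize,
    mut open: impl FnMut(usize) -> Result<D, E>,
) -> Option<(usize, D)> {
    (0..max).find_map(|ord| open(ord).ok().map(|dev| (ord, dev)))
}

/// Stand-in constructor used only for this demo: pretends that only
/// ordinal 1 holds a usable GPU.
fn fake_open(ord: usize) -> Result<&'static str, String> {
    if ord == 1 {
        Ok("gpu-1")
    } else {
        Err(format!("no GPU at ordinal {ord}"))
    }
}

fn main() {
    // With the real crate the closure would be `|ord| KaioDevice::new(ord)`.
    match first_available(4, fake_open) {
        Some((ord, dev)) => println!("opened {dev} at ordinal {ord}"),
        None => println!("no usable GPU found"),
    }
}
```

Discarding the individual errors with ok() keeps the sketch short; real code would probably want to log or collect them.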

pub fn info(&self) -> Result<DeviceInfo>

Query basic information about this device.

pub fn ordinal(&self) -> usize

Returns the 0-indexed CUDA device ordinal that this device wraps.

Used by bridge crates (e.g. kaio-candle) to cross-check that a host-framework device and a KaioDevice refer to the same GPU.
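
A bridge crate's cross-check could look roughly like the sketch below. It is an illustration under assumptions, not kaio-candle's actual code: DeviceMismatch and check_same_gpu are hypothetical names, and the ordinals are passed as plain integers so that the example runs without a GPU.

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not KAIO's error type).
#[derive(Debug, PartialEq)]
struct DeviceMismatch {
    host: usize,
    kaio: usize,
}

impl fmt::Display for DeviceMismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "host framework uses GPU {}, KaioDevice uses GPU {}",
            self.host, self.kaio
        )
    }
}

/// Compare the ordinal reported by a host framework with the value
/// returned by `KaioDevice::ordinal()` and reject mismatches early.
fn check_same_gpu(host_ordinal: usize, kaio_ordinal: usize) -> Result<(), DeviceMismatch> {
    if host_ordinal == kaio_ordinal {
        Ok(())
    } else {
        Err(DeviceMismatch { host: host_ordinal, kaio: kaio_ordinal })
    }
}

fn main() {
    // In a bridge crate, the second argument would come from `device.ordinal()`.
    println!("{:?}", check_same_gpu(0, 0));
    if let Err(e) = check_same_gpu(1, 0) {
        println!("mismatch: {e}");
    }
}
```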

pub fn alloc_from<T: DeviceRepr>(&self, data: &[T]) -> Result<GpuBuffer<T>>

Allocate device memory and copy data from a host slice.

pub fn alloc_zeros<T: DeviceRepr + ValidAsZeroBits>(&self, len: usize) -> Result<GpuBuffer<T>>

Allocate zero-initialized device memory.

pub fn stream(&self) -> &Arc<CudaStream>

Access the underlying CUDA stream for kernel launch operations.

Used with cudarc’s launch_builder to launch kernels. In Phase 2, the proc macro will generate typed wrappers that hide this.

pub fn load_ptx(&self, ptx_text: &str) -> Result<KaioModule>

Deprecated since 0.2.1: use load_module(&PtxModule), which runs PtxModule::validate() for readable SM-mismatch errors.

Load a PTX module from source text and return a crate::module::KaioModule.

The PTX text is passed to the CUDA driver’s cuModuleLoadData — no NVRTC compilation occurs. The driver JIT-compiles the PTX for the current GPU.

§Deprecated — prefer load_module

The module path runs PtxModule::validate before the driver sees the PTX, catching SM mismatches (e.g. mma.sync on sub-Ampere targets) with readable KaioError::Validation errors instead of cryptic ptxas failures deep in the driver.

This function remains public for raw-PTX use cases (external PTX files, hand-written PTX for research, bypassing validation intentionally). It is not scheduled for removal in the 0.2.x line.
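
Since the raw-PTX path skips validation, a caller may want to inspect the PTX text before loading it. The sketch below shows one way to read the `.target` directive from hand-written PTX. ptx_target and NOOP_PTX are hypothetical names that are not part of KAIO, and the PTX snippet is believed to be well-formed but has not been run through a driver here.

```rust
/// A minimal hand-written PTX module with a single empty kernel.
const NOOP_PTX: &str = "\
.version 7.0
.target sm_80
.address_size 64

.visible .entry noop()
{
    ret;
}
";

/// Hypothetical helper (not part of KAIO): return the first argument of the
/// `.target` directive in raw PTX text, e.g. "sm_80".
fn ptx_target(ptx: &str) -> Option<&str> {
    ptx.lines()
        .map(str::trim)
        // Keep only the text that follows the `.target` keyword.
        .find_map(|line| line.strip_prefix(".target"))
        // The directive may list several comma-separated options;
        // the first non-empty token is the SM architecture.
        .and_then(|rest| {
            rest.split(|c: char| c == ',' || c.is_whitespace())
                .find(|tok| !tok.is_empty())
        })
}

fn main() {
    match ptx_target(NOOP_PTX) {
        Some(sm) => println!("PTX targets {sm}"),
        None => println!("no .target directive found"),
    }
    // With the real crate, the text would then be passed to
    // `device.load_ptx(NOOP_PTX)`.
}
```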

§Migration

Before:

let ptx_text: String = build_my_ptx();
let module = device.load_ptx(&ptx_text)?;

After:

use kaio_core::ir::PtxModule;
let ptx_module: PtxModule = build_my_module("sm_80");
let module = device.load_module(&ptx_module)?;

pub fn load_module(&self, module: &PtxModule) -> Result<KaioModule>

Validate, emit, and load a kaio_core::ir::PtxModule on the device.

This is the preferred entry point when the caller has an in-memory PtxModule (as opposed to raw PTX text). Before the PTX text is handed to the driver, kaio_core::ir::PtxModule::validate checks that the module’s target SM supports every feature used by its kernels. It raises KaioError::Validation if, for example, an mma.sync op is present but the target is sm_70.

Surfacing the error at this layer gives the user a readable message (“mma.sync.m16n8k16 requires sm_80+, target is sm_70”) instead of a cryptic ptxas error from deep in the driver.
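
The idea behind this pre-flight check can be modelled in a few lines. The sketch below is a self-contained stand-in, not KAIO's implementation: Feature, Module, and validate are hypothetical types and functions, and the real PtxModule and KaioError::Validation will differ in shape. It only illustrates comparing each feature's minimum SM against the module's target before anything reaches the driver.

```rust
/// Hypothetical feature list for this sketch; KAIO's real IR differs.
enum Feature {
    MmaSyncM16N8K16,
}

impl Feature {
    /// Name of the PTX instruction, used in error messages.
    fn name(&self) -> &'static str {
        match self {
            Feature::MmaSyncM16N8K16 => "mma.sync.m16n8k16",
        }
    }

    /// Minimum SM version assumed for the feature (80 means sm_80).
    fn min_sm(&self) -> u32 {
        match self {
            Feature::MmaSyncM16N8K16 => 80,
        }
    }
}

/// Hypothetical stand-in for a module: a target SM plus the features
/// its kernels use.
struct Module {
    target_sm: u32,
    features: Vec<Feature>,
}

/// Reject the module if any feature needs a newer SM than the target.
fn validate(m: &Module) -> Result<(), String> {
    for f in &m.features {
        if m.target_sm < f.min_sm() {
            return Err(format!(
                "{} requires sm_{}+, target is sm_{}",
                f.name(),
                f.min_sm(),
                m.target_sm
            ));
        }
    }
    Ok(())
}

fn main() {
    let module = Module {
        target_sm: 70,
        features: vec![Feature::MmaSyncM16N8K16],
    };
    if let Err(msg) = validate(&module) {
        println!("validation failed: {msg}");
    }
}
```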

Trait Implementations§

impl Debug for KaioDevice

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V