Struct KernelBuilder 

pub struct KernelBuilder { /* private fields */ }

Builder for constructing complete PTX kernel modules.

KernelBuilder follows the fluent builder pattern: chain configuration methods, supply a body closure, and call build to produce the final PTX text.

Example

use oxicuda_ptx::builder::KernelBuilder;
use oxicuda_ptx::arch::SmVersion;
use oxicuda_ptx::ir::PtxType;

let ptx = KernelBuilder::new("vector_add")
    .target(SmVersion::Sm80)
    .param("a", PtxType::U64)
    .param("b", PtxType::U64)
    .param("c", PtxType::U64)
    .param("n", PtxType::U32)
    .body(|b| {
        let tid = b.global_thread_id_x();
        let n_reg = b.load_param_u32("n");
        b.if_lt_u32(tid, n_reg, |b| {
            b.comment("kernel body goes here");
        });
        b.ret();
    })
    .build()
    .expect("PTX generation failed");

assert!(ptx.contains(".entry vector_add"));
assert!(ptx.contains(".target sm_80"));

Implementations

impl KernelBuilder

pub fn new(name: &str) -> Self

Creates a new kernel builder with the given kernel name.

The default target is SmVersion::Sm80 (Ampere). Call target to override.

pub const fn target(self, sm: SmVersion) -> Self

Sets the target GPU architecture for this kernel.

This determines the .target and .version directives in the generated PTX, and also controls which instructions the BodyBuilder may emit.

pub fn param(self, name: &str, ty: PtxType) -> Self

Adds a kernel parameter with the given name and type.

Parameters are emitted in declaration order in the .entry signature. Common types: PtxType::U64 for pointers, PtxType::U32 / PtxType::F32 for scalar arguments.
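
To make declaration-order emission concrete, here is a minimal self-contained sketch. It does not use oxicuda_ptx: ParamTy and entry_signature are hypothetical stand-ins, and the exact text the crate emits may differ. Only the `.param .u64 name` form is standard PTX syntax.

```rust
/// Hypothetical stand-in for a subset of `PtxType`; not the crate's type.
#[derive(Clone, Copy)]
enum ParamTy {
    U32,
    U64,
    F32,
}

impl ParamTy {
    /// PTX spelling of the type as it appears in a `.param` declaration.
    fn ptx_name(self) -> &'static str {
        match self {
            ParamTy::U32 => ".u32",
            ParamTy::U64 => ".u64",
            ParamTy::F32 => ".f32",
        }
    }
}

/// Renders an `.entry` signature, keeping parameters in the order given.
fn entry_signature(kernel: &str, params: &[(&str, ParamTy)]) -> String {
    let decls: Vec<String> = params
        .iter()
        .map(|(name, ty)| format!("    .param {} {}", ty.ptx_name(), name))
        .collect();
    format!(".visible .entry {}(\n{}\n)", kernel, decls.join(",\n"))
}

fn main() {
    // A pointer, a scalar factor, and an element count, in that order.
    let sig = entry_signature(
        "scale",
        &[
            ("data", ParamTy::U64),
            ("factor", ParamTy::F32),
            ("n", ParamTy::U32),
        ],
    );
    println!("{sig}");
}
```

Because the host passes kernel arguments positionally at launch, the order of the param calls has to match the order of the arguments supplied by the launching code.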

pub fn shared_mem(self, name: &str, ty: PtxType, count: usize) -> Self

Declares a static shared memory allocation.

This generates a .shared .align declaration at the top of the kernel body. The total size is count * ty.size_bytes() bytes.
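
The size formula can be checked with a small self-contained sketch. ElemTy, shared_bytes, and fits_static are hypothetical stand-ins rather than crate items; the only modeled behaviour is count * size_bytes(). The 48 KiB figure is CUDA's per-block cap on statically declared shared memory, and whether the builder itself validates against it is not stated in this documentation.

```rust
/// Hypothetical stand-in for `PtxType`; only element sizes are modeled.
#[derive(Clone, Copy)]
enum ElemTy {
    F16,
    F32,
    F64,
}

impl ElemTy {
    /// Size of one element in bytes.
    fn size_bytes(self) -> usize {
        match self {
            ElemTy::F16 => 2,
            ElemTy::F32 => 4,
            ElemTy::F64 => 8,
        }
    }
}

/// Total bytes of a static shared allocation: `count * ty.size_bytes()`.
fn shared_bytes(ty: ElemTy, count: usize) -> usize {
    count * ty.size_bytes()
}

/// CUDA caps statically declared shared memory at 48 KiB per block.
const STATIC_SHARED_LIMIT: usize = 48 * 1024;

/// Returns true when the allocation fits within the static limit.
fn fits_static(ty: ElemTy, count: usize) -> bool {
    shared_bytes(ty, count) <= STATIC_SHARED_LIMIT
}

fn main() {
    // A 32x32 tile of f32 values.
    println!("{}", shared_bytes(ElemTy::F32, 32 * 32)); // prints 4096
    // A 128x64 tile of f16 values.
    println!("{}", shared_bytes(ElemTy::F16, 128 * 64)); // prints 16384
    // 8192 f64 values need 65536 bytes, which exceeds 48 KiB.
    println!("{}", fits_static(ElemTy::F64, 8192)); // prints false
}
```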

pub const fn max_threads_per_block(self, n: u32) -> Self

Sets the .maxntid directive, which tells ptxas the maximum number of threads per block this kernel will be launched with. PTX treats the value as a guarantee, so a launch with more threads per block than n fails.

This can improve register allocation and occupancy planning.

pub fn body<F>(self, f: F) -> Self
where F: FnOnce(&mut BodyBuilder<'_>) + 'static,

Supplies the body closure that generates the kernel’s instructions.

The closure receives a mutable reference to a BodyBuilder which provides the instruction emission API (loads, stores, arithmetic, control flow, tensor core ops, etc.).
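
The 'static bound on F means the closure cannot borrow local variables from the calling scope; anything it captures must be owned, typically through a move closure. The sketch below illustrates this with hypothetical stand-ins (Body for BodyBuilder, Kernel for KernelBuilder). Storing the closure in a Box until build runs is an assumption about how such a builder might work, not a description of the crate's internals.

```rust
/// Hypothetical stand-in for `BodyBuilder`: it only collects text lines.
struct Body {
    lines: Vec<String>,
}

impl Body {
    fn comment(&mut self, text: &str) {
        self.lines.push(format!("// {text}"));
    }
}

/// Hypothetical stand-in for `KernelBuilder`: keeps the closure until `build`.
struct Kernel {
    body: Option<Box<dyn FnOnce(&mut Body)>>,
}

impl Kernel {
    fn new() -> Self {
        Kernel { body: None }
    }

    /// Same bound shape as the documented `body` method.
    fn body<F>(mut self, f: F) -> Self
    where
        F: FnOnce(&mut Body) + 'static,
    {
        self.body = Some(Box::new(f));
        self
    }

    /// Runs the stored closure, if any, and returns the emitted lines.
    fn build(self) -> Vec<String> {
        let mut b = Body { lines: Vec::new() };
        if let Some(f) = self.body {
            f(&mut b);
        }
        b.lines
    }
}

/// Builds a kernel whose body emits `label` as a comment.
fn kernel_with_label(label: &str) -> Vec<String> {
    // `label` is a borrowed &str, so copy it into an owned String and
    // move that into the closure to satisfy the 'static bound.
    let owned = label.to_string();
    Kernel::new().body(move |b| b.comment(&owned)).build()
}

fn main() {
    println!("{:?}", kernel_with_label("tile loop"));
}
```

Capturing `label` directly instead of `owned` would be rejected by the compiler, because the borrowed string does not live for 'static.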

pub fn build(self) -> Result<String, PtxGenError>

Consumes the builder and generates the complete PTX module text.

Errors

Returns PtxGenError::MissingBody if no body closure was provided. Returns PtxGenError::FormatError if string formatting fails.
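
A caller will usually want to tell the two failure modes apart. The sketch below shows one way to do that with a hypothetical GenError enum that mirrors the two documented variant names; the payload of FormatError and the helper build_module are assumptions made for illustration, not the crate's definitions.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for `PtxGenError`; the payload is an assumption.
#[derive(Debug, PartialEq)]
enum GenError {
    MissingBody,
    FormatError(fmt::Error),
}

impl From<fmt::Error> for GenError {
    fn from(e: fmt::Error) -> Self {
        GenError::FormatError(e)
    }
}

/// Emits a tiny module, failing early when no body text was supplied.
fn build_module(name: &str, body: Option<&str>) -> Result<String, GenError> {
    let body = body.ok_or(GenError::MissingBody)?;
    let mut out = String::new();
    // Each `?` converts a formatting failure into `GenError::FormatError`.
    writeln!(out, ".visible .entry {name}()")?;
    writeln!(out, "{{")?;
    writeln!(out, "{body}")?;
    writeln!(out, "}}")?;
    Ok(out)
}

/// Maps a build result to a short status string.
fn describe(result: &Result<String, GenError>) -> &'static str {
    match result {
        Ok(_) => "ok",
        Err(GenError::MissingBody) => "missing body",
        Err(GenError::FormatError(_)) => "format error",
    }
}

fn main() {
    println!("{}", describe(&build_module("k", Some("    ret;")))); // prints ok
    println!("{}", describe(&build_module("k", None))); // prints missing body
}
```

A missing body is a programming error in the calling code, while a formatting failure is an unexpected internal condition, so matching on the variant lets the caller report each one appropriately.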
