Enum ControlOp 

pub enum ControlOp {
    SetP {
        dst: Register,
        cmp_op: CmpOp,
        lhs: Operand,
        rhs: Operand,
        ty: PtxType,
    },
    SetPAnd {
        dst: Register,
        cmp_op: CmpOp,
        lhs: Operand,
        rhs: Operand,
        ty: PtxType,
        src_pred: Register,
    },
    BraPred {
        pred: Register,
        target: String,
        negate: bool,
    },
    Bra {
        target: String,
    },
    Ret,
    BarSync {
        barrier_id: u32,
    },
    ShflSyncDown {
        dst: Register,
        src: Register,
        delta: Operand,
        c: u32,
        mask: u32,
    },
    ShflSyncUp {
        dst: Register,
        src: Register,
        delta: Operand,
        c: u32,
        mask: u32,
    },
    ShflSyncBfly {
        dst: Register,
        src: Register,
        lane_mask: Operand,
        c: u32,
        mask: u32,
    },
}
Control flow PTX instruction variants.

Variants§

SetP

Set predicate from comparison: setp.{cmp_op}{ty} pred, lhs, rhs;

Compares lhs and rhs and writes the result to a predicate register. Example: setp.ge.u32 %p1, %r1, %r2;

Fields

§dst: Register

Destination predicate register.

§cmp_op: CmpOp

Comparison operation.

§lhs: Operand

Left-hand operand (register or immediate).

§rhs: Operand

Right-hand operand (register or immediate).

§ty: PtxType

PTX type for the comparison.
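
As a rough illustration of how these fields map onto the instruction text, the sketch below formats a setp line from stand-in types. Cmp, Ty, and format_setp are hypothetical names invented for this example, and the operands are passed as plain strings; the crate's real CmpOp, PtxType, Register, Operand, and Emit implementation may look different.

```rust
// Hypothetical stand-ins for the crate's CmpOp and PtxType; not the real API.
#[derive(Clone, Copy)]
enum Cmp {
    Lt,
    Ge,
}

#[derive(Clone, Copy)]
enum Ty {
    U32,
    F32,
}

impl Cmp {
    fn mnemonic(self) -> &'static str {
        match self {
            Cmp::Lt => "lt",
            Cmp::Ge => "ge",
        }
    }
}

impl Ty {
    // Includes the leading dot, matching the `{ty}` slot in the template.
    fn suffix(self) -> &'static str {
        match self {
            Ty::U32 => ".u32",
            Ty::F32 => ".f32",
        }
    }
}

/// Builds `setp.{cmp_op}{ty} dst, lhs, rhs;` with operands given as strings.
fn format_setp(dst: &str, cmp: Cmp, lhs: &str, rhs: &str, ty: Ty) -> String {
    format!("setp.{}{} {}, {}, {};", cmp.mnemonic(), ty.suffix(), dst, lhs, rhs)
}
```

Calling format_setp("%p1", Cmp::Ge, "%r1", "%r2", Ty::U32) reproduces the example line above.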

SetPAnd

Set predicate from comparison ANDed with a source predicate: setp.{cmp_op}.and{ty} pred, lhs, rhs, src_pred;

Computes pred = (lhs CmpOp rhs) AND src_pred in one instruction. Used for compact edge-tile bounds checking — combines a row check with an existing col-check predicate without a separate and.pred. Sprint 6.7 (multi-warp matmul_tc edge tiles) is the first user. Example: setp.lt.and.u32 %p3, %r5, %r10, %p2;

Fields

§dst: Register

Destination predicate register.

§cmp_op: CmpOp

Comparison operation applied to lhs/rhs.

§lhs: Operand

Left-hand operand of the comparison.

§rhs: Operand

Right-hand operand of the comparison.

§ty: PtxType

PTX type for the comparison.

§src_pred: Register

Source predicate AND’d with the comparison result.
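
The fused semantics can be modelled on the host. The sketch below is only an illustration of pred = (lhs CmpOp rhs) AND src_pred for the lt/u32 case; setp_lt_and and in_bounds are hypothetical helpers, and the row/column layout is an assumed edge-tile scenario rather than the kernel's actual code.

```rust
/// Models `setp.lt.and.u32 dst, lhs, rhs, src_pred;`
/// dst = (lhs < rhs) AND src_pred.
fn setp_lt_and(lhs: u32, rhs: u32, src_pred: bool) -> bool {
    (lhs < rhs) & src_pred
}

/// Hypothetical edge-tile check: an element is in bounds only if both its
/// column and its row fall inside the matrix.
fn in_bounds(row: u32, col: u32, rows: u32, cols: u32) -> bool {
    // First instruction: a plain setp for the column check.
    let col_ok = col < cols;
    // Second instruction: the row check fused with the column predicate,
    // so no separate and.pred is needed.
    setp_lt_and(row, rows, col_ok)
}
```

The result is true only when both comparisons hold, which is what two setp instructions followed by and.pred would compute in three instructions instead of two.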

BraPred

Predicated branch: @{pred} bra {target}; or @!{pred} bra {target};

Branches to target label if pred is true (or false when negated). Uses PtxWriter::line() instead of instruction() because the @pred mnemonic target; format doesn’t fit the comma-separated operand pattern.

Examples:

  • @%p1 bra $L__BB0_2; — branch if pred is true
  • @!%p1 bra IF_END_0; — branch if pred is false (Phase 2 if/else)

Fields

§pred: Register

Predicate register to test.

§target: String

Label name to branch to.

§negate: bool

When true, negate the predicate (@!pred). Deferred from Sprint 1.4, needed for Phase 2 if/else lowering where setp matches the source comparison and @!pred bra skips the then-block when the condition is false.
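
A minimal sketch of how the negate flag changes the emitted text, assuming the register and label are already rendered as strings. format_bra_pred is a hypothetical helper for illustration and is not the crate's emit path.

```rust
/// Formats a predicated branch; `negate` selects the `@!pred` form.
fn format_bra_pred(pred: &str, target: &str, negate: bool) -> String {
    let bang = if negate { "!" } else { "" };
    format!("@{}{} bra {};", bang, pred, target)
}
```

With negate set to true, a setp that mirrors the source condition can be followed directly by a branch that skips the then-block.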

Bra

Unconditional branch: bra {target};

Not used in vector_add but included for Phase 3 loop support.

Fields

§target: String

Label name to branch to.

Ret

Return from kernel: ret;

BarSync

Block-level barrier synchronization: bar.sync {barrier_id};

All threads in the block must reach this instruction before any can proceed. Barrier 0 is the conventional default. Example: bar.sync 0;

Fields

§barrier_id: u32

Barrier identifier (0 is conventional for single-barrier use).

ShflSyncDown

Warp shuffle down: shfl.sync.down.b32 dst, src, delta, c, membermask;

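Each thread reads src from the lane delta positions above its own lane ID (lane + delta) within the warp; a thread whose source lane falls outside the allowed range keeps its own value. The c operand packs the clamp value (bits 4:0) and the segment mask (bits 12:8); see PTX ISA 8.7 §9.7.8. Example: shfl.sync.down.b32 %r2, %r1, 1, 31, 0xFFFFFFFF;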

Fields

§dst: Register

Destination register.

§src: Register

Source register (value to share).

§delta: Operand

Delta (offset) — how many lanes down.

§c: u32

Pre-packed clamp/width value (encoding is caller’s responsibility).

§mask: u32

Member mask (0xFFFFFFFF = full warp).
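
The lane arithmetic and the packing of c can be modelled on the host. The sketch below is a plain-Rust simulation under stated assumptions: every lane is active (so the member mask is ignored), the source-lane rule follows the PTX ISA description of shfl.sync, and c is packed the way CUDA's shuffle intrinsics are understood to pack it, which is worth checking against the PTX ISA. pack_c_down, shfl_down, and warp_sum are hypothetical names and not part of this crate.

```rust
const WARP: usize = 32;

/// Packs `c` for a down shuffle over `width` lanes (a power of two, 1..=32):
/// segment mask in bits 12:8, clamp value 31 in bits 4:0.
fn pack_c_down(width: u32) -> u32 {
    ((32 - width) << 8) | 0x1f
}

/// Simulates `shfl.sync.down.b32` for one warp with every lane active.
fn shfl_down(src: &[u32; WARP], delta: u32, c: u32) -> [u32; WARP] {
    let segmask = (c >> 8) & 0x1f;
    let clamp = c & 0x1f;
    let mut dst = [0u32; WARP];
    for lane in 0..WARP as u32 {
        let max_lane = (lane & segmask) | (clamp & !segmask);
        let j = lane + (delta & 0x1f);
        // A source lane past the clamp means the thread keeps its own value.
        let from = if j <= max_lane { j } else { lane };
        dst[lane as usize] = src[from as usize];
    }
    dst
}

/// Tree reduction: after five halving steps, lane 0 holds the sum of all lanes.
fn warp_sum(mut vals: [u32; WARP]) -> u32 {
    let c = pack_c_down(32);
    let mut delta = 16;
    while delta > 0 {
        let shuffled = shfl_down(&vals, delta, c);
        for lane in 0..WARP {
            vals[lane] = vals[lane].wrapping_add(shuffled[lane]);
        }
        delta /= 2;
    }
    vals[0]
}
```

For a full warp, pack_c_down(32) evaluates to 31, matching the c value in the example instruction above. Only lane 0 is meaningful after the reduction; higher lanes accumulate partial or duplicated values.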

ShflSyncUp

Warp shuffle up: shfl.sync.up.b32 dst, src, delta, c, membermask;

Each thread reads src from the lane delta positions below its own lane ID (lane - delta); a thread with no such lane in range keeps its own value.

Fields

§dst: Register

Destination register.

§src: Register

Source register.

§delta: Operand

Delta (offset) — how many lanes up.

§c: u32

Pre-packed clamp/width value.

§mask: u32

Member mask.
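
A common use of the up shuffle is a warp-level inclusive prefix sum. The sketch below simulates it on the host, assuming every lane is active and a full-warp packing of c = 0 (clamp 0, no segment mask). shfl_up and inclusive_scan are hypothetical names for illustration, and the explicit lane check stands in for the optional predicate output of the PTX instruction, which this variant does not appear to model.

```rust
const WARP: usize = 32;

/// Simulates `shfl.sync.up.b32` for one warp with every lane active.
fn shfl_up(src: &[u32; WARP], delta: u32, c: u32) -> [u32; WARP] {
    let segmask = ((c >> 8) & 0x1f) as i32;
    let clamp = (c & 0x1f) as i32;
    let mut dst = [0u32; WARP];
    for lane in 0..WARP as i32 {
        // For the up mode the bound is the lowest lane a thread may read from.
        let bound = (lane & segmask) | (clamp & !segmask);
        let j = lane - (delta & 0x1f) as i32;
        let from = if j >= bound { j } else { lane };
        dst[lane as usize] = src[from as usize];
    }
    dst
}

/// Inclusive prefix sum: lane i ends up with vals[0] + ... + vals[i].
fn inclusive_scan(mut vals: [u32; WARP]) -> [u32; WARP] {
    let mut delta = 1;
    while delta < 32 {
        let shuffled = shfl_up(&vals, delta, 0);
        for lane in 0..WARP {
            // Lanes below `delta` received their own value back, so skip them.
            if (lane as u32) >= delta {
                vals[lane] = vals[lane].wrapping_add(shuffled[lane]);
            }
        }
        delta *= 2;
    }
    vals
}
```

The scan doubles the offset each step, so 32 lanes need five shuffles.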

ShflSyncBfly

Warp shuffle butterfly (XOR): shfl.sync.bfly.b32 dst, src, lane_mask, c, membermask;

Each thread reads from the thread at lane XOR lane_mask.

Fields

§dst: Register

Destination register.

§src: Register

Source register.

§lane_mask: Operand

Lane mask for XOR operation.

§c: u32

Pre-packed clamp/width value.

§mask: u32

Member mask.
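
The XOR pattern is typically used for an all-reduce, where every lane finishes with the combined value. The sketch below simulates that on the host, assuming every lane is active and a full-warp packing of c = 31. shfl_bfly and warp_all_sum are hypothetical names for illustration only.

```rust
const WARP: usize = 32;

/// Simulates `shfl.sync.bfly.b32` for one warp with every lane active.
fn shfl_bfly(src: &[u32; WARP], lane_mask: u32, c: u32) -> [u32; WARP] {
    let segmask = (c >> 8) & 0x1f;
    let clamp = c & 0x1f;
    let mut dst = [0u32; WARP];
    for lane in 0..WARP as u32 {
        let max_lane = (lane & segmask) | (clamp & !segmask);
        // Partner lane is the caller's lane ID XOR the lane mask.
        let j = lane ^ (lane_mask & 0x1f);
        let from = if j <= max_lane { j } else { lane };
        dst[lane as usize] = src[from as usize];
    }
    dst
}

/// Butterfly all-reduce: every lane finishes with the sum of all 32 inputs.
fn warp_all_sum(mut vals: [u32; WARP]) -> [u32; WARP] {
    let mut lane_mask = 16;
    while lane_mask > 0 {
        let shuffled = shfl_bfly(&vals, lane_mask, 31);
        for lane in 0..WARP {
            vals[lane] = vals[lane].wrapping_add(shuffled[lane]);
        }
        lane_mask /= 2;
    }
    vals
}
```

Unlike the down-shuffle reduction, no broadcast step is needed afterwards because each pair of lanes exchanges values symmetrically at every step.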

Trait Implementations§

impl Clone for ControlOp

fn clone(&self) -> ControlOp

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ControlOp

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Emit for ControlOp

fn emit(&self, w: &mut PtxWriter) -> Result

Write this node’s PTX representation to the writer.
