Struct WarpState 

pub struct WarpState { /* private fields */ }

Per-warp shared state used to emulate warp-level operations.

In a real GPU each warp executes in lock-step and has hardware support for cross-lane communication. On the CPU we emulate this by having all threads in a “warp” share a WarpState and synchronise explicitly via barriers.
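
The write, barrier, read pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `MiniWarp`, its `slots` and `barrier` fields, and `broadcast_from` are hypothetical names, and the real `WarpState` keeps its fields private.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Barrier;
use std::thread;

const WARP_SIZE: usize = 32;

/// Hypothetical stand-in for shared per-warp state (assumed layout).
struct MiniWarp {
    slots: [AtomicU32; WARP_SIZE],
    barrier: Barrier,
}

impl MiniWarp {
    fn new() -> Self {
        MiniWarp {
            slots: std::array::from_fn(|_| AtomicU32::new(0)),
            barrier: Barrier::new(WARP_SIZE),
        }
    }

    /// Each lane publishes its value, waits for all lanes, then reads `src_lane`.
    fn exchange(&self, lane_id: usize, value: u32, src_lane: usize) -> u32 {
        self.slots[lane_id].store(value, Ordering::SeqCst);
        self.barrier.wait(); // every lane has written
        let out = self.slots[src_lane].load(Ordering::SeqCst);
        self.barrier.wait(); // every lane has read, so slots may be reused
        out
    }
}

/// Runs 32 CPU threads as one "warp"; every lane reads the value of `src_lane`.
fn broadcast_from(src_lane: usize) -> Vec<u32> {
    let warp = MiniWarp::new();
    let mut results = vec![0u32; WARP_SIZE];
    thread::scope(|s| {
        for (lane_id, out) in results.iter_mut().enumerate() {
            let warp = &warp;
            s.spawn(move || {
                // Lane i contributes i * 10 in this example.
                *out = warp.exchange(lane_id, lane_id as u32 * 10, src_lane);
            });
        }
    });
    results
}
```

With this sketch, `broadcast_from(5)` gives every lane the value contributed by lane 5. The second barrier is a design choice in the sketch: it keeps a fast lane from overwriting its slot before slower lanes have read it.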

Implementations

impl WarpState

pub fn new() -> Self

Create a new warp state with all lanes active.

pub fn set_lane_active(&self, lane_id: u32)

Set a lane as active.

pub fn set_lane_inactive(&self, lane_id: u32)

Set a lane as inactive.

pub fn active_mask(&self) -> u32

Get the current active mask.

pub fn is_lane_active(&self, lane_id: u32) -> bool

Returns true if the specified lane is currently active.
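
Because the lane setters take `&self`, the mask presumably relies on interior mutability. The sketch below assumes a single `AtomicU32` with one bit per lane; `ActiveMask` and its methods are hypothetical names used only for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical active-lane mask: bit i is set when lane i is active.
struct ActiveMask(AtomicU32);

impl ActiveMask {
    /// Start with all 32 lanes active.
    fn all_active() -> Self {
        ActiveMask(AtomicU32::new(u32::MAX))
    }

    fn set_active(&self, lane_id: u32) {
        if lane_id < 32 {
            self.0.fetch_or(1u32 << lane_id, Ordering::SeqCst);
        }
    }

    fn set_inactive(&self, lane_id: u32) {
        if lane_id < 32 {
            self.0.fetch_and(!(1u32 << lane_id), Ordering::SeqCst);
        }
    }

    fn mask(&self) -> u32 {
        self.0.load(Ordering::SeqCst)
    }

    fn is_active(&self, lane_id: u32) -> bool {
        lane_id < 32 && ((self.mask() >> lane_id) & 1) == 1
    }
}
```

Ignoring lane ids of 32 or more is an assumption of this sketch; the documentation above does not say how out-of-range ids are handled.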

pub fn shuffle(&self, lane_id: u32, value: u32, src_lane: u32) -> u32

Emulate __shfl_sync: read the value from src_lane.

The caller (at lane_id) first writes its own value, then after a barrier reads from src_lane. In a single-threaded emulation context, the caller can pre-populate all lanes and then read.

Arguments
  • lane_id - The calling thread’s lane within the warp (0 through 31)
  • value - The value this lane contributes
  • src_lane - The lane to read from

Returns the value from src_lane, or this lane’s own value if src_lane is out of range.
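
For the single-threaded case, where all lanes are populated before any read, the lookup and its out-of-range fallback can be modelled as below. `shuffle_read` and the plain array of lane values are hypothetical; they only mirror the behaviour described here.

```rust
const WARP_SIZE: u32 = 32;

/// Reads the value held by `src_lane` from pre-populated lane values.
/// If `src_lane` is out of range, the caller's own value is returned instead.
fn shuffle_read(lanes: &[u32; 32], lane_id: u32, src_lane: u32) -> u32 {
    if src_lane < WARP_SIZE {
        lanes[src_lane as usize]
    } else {
        lanes[lane_id as usize]
    }
}
```

The sketch assumes `lane_id` itself is in range; indexing would panic otherwise.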

pub fn shuffle_xor(&self, lane_id: u32, value: u32, lane_mask: u32) -> u32

Emulate __shfl_xor_sync: read from lane_id ^ lane_mask.

pub fn shuffle_up(&self, lane_id: u32, value: u32, delta: u32) -> u32

Emulate __shfl_up_sync: read from lane_id - delta. If delta > lane_id, so that the source lane would be negative, return the caller’s own value.

pub fn shuffle_down(&self, lane_id: u32, value: u32, delta: u32) -> u32

Emulate __shfl_down_sync: read from lane_id + delta. If the source lane would be >= WARP_SIZE, return the caller’s own value.
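
The three variants differ only in how the source lane is derived from the caller's lane. A small sketch of that index arithmetic follows; the helper names are hypothetical, and `None` stands for the "return the caller's own value" case.

```rust
const WARP_SIZE: u32 = 32;

/// Source lane for the XOR variant: lane_id ^ lane_mask.
fn xor_source(lane_id: u32, lane_mask: u32) -> u32 {
    lane_id ^ lane_mask
}

/// Source lane for the "up" variant, or None if it would be negative.
fn up_source(lane_id: u32, delta: u32) -> Option<u32> {
    lane_id.checked_sub(delta)
}

/// Source lane for the "down" variant, or None if it would be >= WARP_SIZE.
fn down_source(lane_id: u32, delta: u32) -> Option<u32> {
    lane_id.checked_add(delta).filter(|&src| src < WARP_SIZE)
}
```

Using `checked_sub` and `checked_add` avoids unsigned underflow and overflow, which is why the "negative" case can be detected on a `u32` at all.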

pub fn shuffle_f32(&self, lane_id: u32, value: f32, src_lane: u32) -> f32

Shuffle an f32 value (reinterpret bits through u32).
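
Reinterpreting through `u32` can be done with the standard `f32::to_bits` and `f32::from_bits`, which preserve the bit pattern exactly. In the sketch below, `shuffle_f32_via_u32` is a hypothetical helper and the closure stands in for whatever 32-bit shuffle is used.

```rust
/// Sends an f32 through a u32-only shuffle by reinterpreting its bits.
/// `shuffle_u32` is a placeholder for any 32-bit shuffle operation.
fn shuffle_f32_via_u32(value: f32, shuffle_u32: impl Fn(u32) -> u32) -> f32 {
    f32::from_bits(shuffle_u32(value.to_bits()))
}
```

No numeric conversion takes place, so values such as `-0.0` keep their sign bit when they pass through an identity shuffle.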

pub fn shuffle_xor_f32(&self, lane_id: u32, value: f32, lane_mask: u32) -> f32

Shuffle XOR with f32.

pub fn shuffle_up_f32(&self, lane_id: u32, value: f32, delta: u32) -> f32

Shuffle up with f32.

pub fn shuffle_down_f32(&self, lane_id: u32, value: f32, delta: u32) -> f32

Shuffle down with f32.

pub fn vote_all(&self, lane_id: u32, predicate: bool) -> bool

Emulate __all_sync: returns true if all active lanes have predicate == true.

pub fn vote_any(&self, lane_id: u32, predicate: bool) -> bool

Emulate __any_sync: returns true if any active lane has predicate == true.

pub fn ballot(&self, lane_id: u32, predicate: bool) -> u32

Emulate __ballot_sync: returns a bitmask where bit i is set if lane i is active and its predicate is true.
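
The three vote operations can all be expressed in terms of the ballot bitmask. The sketch below works on a pre-collected array of predicates rather than on per-lane calls; the function names and the array-based interface are assumptions made for illustration.

```rust
/// Bit i is set if lane i is active and its predicate is true.
fn emulated_ballot(active_mask: u32, predicates: &[bool; 32]) -> u32 {
    let mut bits = 0u32;
    for (lane, &p) in predicates.iter().enumerate() {
        if p && ((active_mask >> lane) & 1) == 1 {
            bits |= 1u32 << lane;
        }
    }
    bits
}

/// True if every active lane has a true predicate.
fn emulated_vote_all(active_mask: u32, predicates: &[bool; 32]) -> bool {
    emulated_ballot(active_mask, predicates) == active_mask
}

/// True if at least one active lane has a true predicate.
fn emulated_vote_any(active_mask: u32, predicates: &[bool; 32]) -> bool {
    emulated_ballot(active_mask, predicates) != 0
}
```

In this sketch, inactive lanes never contribute to the result, so a vote over lanes that are all inactive reports `true` for "all" and `false` for "any".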

pub fn reduce_sum_f32(&self, lane_id: u32, value: f32) -> f32

Warp-level sum reduction using shuffle_down (butterfly pattern).

Assumes all 32 lanes call this with their value. Returns the sum at lane 0; other lanes get a partial result.
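
To see why only lane 0 ends up with the full sum, the reduction can be simulated on an array of lane values. The sketch assumes offsets of 16, 8, 4, 2, 1 and the shuffle_down fallback of returning the caller's own value when the source lane is out of range; `reduce_sum_emulated` is a hypothetical name.

```rust
const WARP_SIZE: usize = 32;

/// Simulates a shuffle_down based sum reduction across all 32 lanes.
/// Returns the final value held by every lane; index 0 holds the full sum.
fn reduce_sum_emulated(values: &[f32; WARP_SIZE]) -> [f32; WARP_SIZE] {
    let mut lanes = *values;
    let mut offset = WARP_SIZE / 2;
    while offset > 0 {
        // Snapshot so every lane reads values from before this step.
        let prev = lanes;
        for lane in 0..WARP_SIZE {
            let src = lane + offset;
            // Out-of-range source: fall back to the lane's own value.
            let incoming = if src < WARP_SIZE { prev[src] } else { prev[lane] };
            lanes[lane] = prev[lane] + incoming;
        }
        offset /= 2;
    }
    lanes
}
```

For the inputs 0.0, 1.0, ..., 31.0, lane 0 finishes with 496.0, while higher lanes hold values that mix in their own fallback contributions and should not be used.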

pub fn reduce_max_f32(&self, lane_id: u32, value: f32) -> f32

Warp-level max reduction.

pub fn reduce_min_f32(&self, lane_id: u32, value: f32) -> f32

Warp-level min reduction.

pub fn popc_ballot(&self, lane_id: u32, predicate: bool) -> u32

Count the number of active lanes with a true predicate (popcount of ballot).
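
As a rough illustration, the count is the number of set bits in the ballot mask, which `u32::count_ones` provides. `popc_ballot_emulated` and its array-based interface are hypothetical.

```rust
/// Counts active lanes whose predicate is true.
fn popc_ballot_emulated(active_mask: u32, predicates: &[bool; 32]) -> u32 {
    let mut ballot = 0u32;
    for (lane, &p) in predicates.iter().enumerate() {
        if p {
            ballot |= 1u32 << lane;
        }
    }
    // Drop inactive lanes, then take the population count.
    (ballot & active_mask).count_ones()
}
```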

Trait Implementations

impl Default for WarpState

fn default() -> Self

Returns the “default value” for a type.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> Downcast<T> for T

fn downcast(&self) -> &T

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> Upcast<T> for T

fn upcast(&self) -> Option<&T>

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WasmNotSend for T
where T: Send,

impl<T> WasmNotSendSync for T

impl<T> WasmNotSync for T
where T: Sync,