Warp Types: Type-safe GPU warp programming via linear typestate.
Warp divergence bugs are prevented at compile time: a diverged warp cannot call shuffle, because the method does not exist on its type.
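The "method doesn't exist" mechanism can be illustrated with a standalone toy model. This is a minimal sketch, not the crate's implementation: `ToyWarp`, `merge_even_odd`, `round_trip`, and the 32-lane array simulation are all hypothetical names invented for illustration. It relies only on Rust's move semantics and on an inherent `impl` block that exists for one type parameter and not the others.

```rust
use std::marker::PhantomData;

// Marker types for the active lane set (toy stand-ins, not the crate's definitions).
pub struct All;
pub struct Even;
pub struct Odd;

/// Toy warp handle. It is neither `Clone` nor `Copy`, so any method that
/// takes `self` by value consumes the handle.
pub struct ToyWarp<S> {
    _active: PhantomData<S>,
}

// These methods exist only when the active set is `All`.
impl ToyWarp<All> {
    pub fn kernel_entry() -> Self {
        ToyWarp { _active: PhantomData }
    }

    /// XOR shuffle over 32 simulated lanes: lane i reads from lane i ^ mask.
    pub fn shuffle_xor(&self, data: [i32; 32], mask: usize) -> [i32; 32] {
        let mut out = [0i32; 32];
        for lane in 0..32 {
            out[lane] = data[(lane ^ mask) % 32];
        }
        out
    }

    /// Consumes the full warp and returns two sub-warp handles.
    pub fn diverge_even_odd(self) -> (ToyWarp<Even>, ToyWarp<Odd>) {
        (
            ToyWarp { _active: PhantomData },
            ToyWarp { _active: PhantomData },
        )
    }
}

/// Consumes both halves and hands back a full warp.
pub fn merge_even_odd(_evens: ToyWarp<Even>, _odds: ToyWarp<Odd>) -> ToyWarp<All> {
    ToyWarp { _active: PhantomData }
}

/// Shuffle, diverge, merge, shuffle again.
pub fn round_trip() -> [i32; 32] {
    let warp = ToyWarp::<All>::kernel_entry();
    let mut data = [0i32; 32];
    for (lane, slot) in data.iter_mut().enumerate() {
        *slot = lane as i32;
    }
    let shuffled = warp.shuffle_xor(data, 1);
    let (evens, odds) = warp.diverge_even_odd();
    // evens.shuffle_xor(data, 1); // rejected: no such method on ToyWarp<Even>
    let merged = merge_even_odd(evens, odds);
    merged.shuffle_xor(shuffled, 1)
}
```

Because `shuffle_xor` is declared only inside `impl ToyWarp<All>`, the compiler has nothing to resolve the call against on `ToyWarp<Even>` or `ToyWarp<Odd>`. No runtime check is involved.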
§Core Idea
use warp_types::*;
let warp: Warp<All> = Warp::kernel_entry();
let data = data::PerLane::new(42i32);
// OK: shuffle on full warp
let _shuffled = warp.shuffle_xor(data, 1);
// After diverge, shuffle is gone from the type:
let (evens, odds) = warp.diverge_even_odd();
// evens.shuffle_xor(data, 1); // COMPILE ERROR — method not found
let merged: Warp<All> = merge(evens, odds);
§Module Overview
- active_set — Lane subset types (All, Even, Odd, …) and complement proofs
- warp — Warp<S> type parameterized by active set
- data — Value categories: PerLane<T>, Uniform<T>, SingleLane<T, N>
- diverge — Split warps by predicate (produces complementary sub-warps)
- merge — Rejoin complementary sub-warps (compile-time verified)
- shuffle — Shuffle/ballot/reduce (restricted to Warp<All>) + permutation algebra
- fence — Fence-divergence interactions (§5.6) — type-state write tracking
- block — Block-level: shared memory ownership, inter-block sessions, reductions
- proof — Soundness proof sketch (progress + preservation)
- platform — CPU/GPU platform trait for dual-mode algorithms
- gradual — DynWarp ↔ Warp<S> bridge for gradual typing (§9.4)
- gpu — GPU intrinsics for nvptx64 and amdgpu targets
- cub — Typed CUB-equivalent warp primitives (reduce, scan, broadcast)
- sort — Typed warp-level bitonic sort
- tile — Cooperative Groups: thread block tiles with typed shuffle safety
- dynamic — Data-dependent divergence with structural complement guarantees
- simwarp — Multi-lane warp simulator with real shuffle semantics (testing)
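One way to picture the "complement proofs" and the compile-time verified merge listed above is a trait that is implemented only for pairs of lane sets that partition the warp. The sketch below is an assumption about how such an encoding could look, not the crate's actual definitions: the type and trait names mirror the re-exported identifiers, but the `MASK` constant, the bit layout (bit i stands for lane i, `LowHalf` means lanes 0 to 15), and the helper `merged_mask` are hypothetical.

```rust
/// Hypothetical encoding: each lane set carries a 32-bit mask, bit i = lane i.
pub trait ActiveSet {
    const MASK: u32;
}

pub struct All;
pub struct Even;
pub struct Odd;
pub struct LowHalf;
pub struct HighHalf;

impl ActiveSet for All { const MASK: u32 = 0xFFFF_FFFF; }
impl ActiveSet for Even { const MASK: u32 = 0x5555_5555; } // lanes 0, 2, 4, ...
impl ActiveSet for Odd { const MASK: u32 = 0xAAAA_AAAA; } // lanes 1, 3, 5, ...
impl ActiveSet for LowHalf { const MASK: u32 = 0x0000_FFFF; }
impl ActiveSet for HighHalf { const MASK: u32 = 0xFFFF_0000; }

/// Implemented only for pairs that are disjoint and together cover every lane.
/// An impl acts as the "proof" that the compiler looks up.
pub trait ComplementOf<Other: ActiveSet>: ActiveSet {}

impl ComplementOf<Odd> for Even {}
impl ComplementOf<Even> for Odd {}
impl ComplementOf<HighHalf> for LowHalf {}
impl ComplementOf<LowHalf> for HighHalf {}

/// Compiles only when `A` and `B` are declared complements.
pub fn merged_mask<A, B>() -> u32
where
    A: ComplementOf<B>,
    B: ActiveSet,
{
    // Sanity check on the hand-written impls: the two sets must not overlap.
    debug_assert_eq!(A::MASK & B::MASK, 0);
    A::MASK | B::MASK
}
```

Under this encoding, `merged_mask::<Even, Odd>()` type-checks, while `merged_mask::<Even, LowHalf>()` is rejected because no `ComplementOf<LowHalf>` impl exists for `Even`.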
Re-exports§
pub use active_set::ActiveSet;
pub use active_set::All;
pub use active_set::CanDiverge;
pub use active_set::ComplementOf;
pub use active_set::ComplementWithin;
pub use active_set::Empty;
pub use active_set::Even;
pub use active_set::EvenHigh;
pub use active_set::EvenLow;
pub use active_set::HighHalf;
pub use active_set::Lane0;
pub use active_set::LowHalf;
pub use active_set::NotLane0;
pub use active_set::Odd;
pub use active_set::OddHigh;
pub use active_set::OddLow;
pub use block::BlockId;
pub use block::ThreadId;
pub use data::LaneId;
pub use data::PerLane;
pub use data::Role;
pub use data::SingleLane;
pub use data::Uniform;
pub use data::WarpId;
pub use dynamic::DynDiverge;
pub use fence::Fenced;
pub use fence::FullWrite;
pub use fence::GlobalRegion;
pub use fence::PartialWrite;
pub use fence::Unwritten;
pub use fence::WriteState;
pub use gradual::DynWarp;
pub use merge::merge;
pub use merge::merge_within;
pub use platform::CpuSimd;
pub use platform::GpuWarp32;
pub use platform::GpuWarp64;
pub use platform::Platform;
pub use platform::SimdVector;
pub use shuffle::BallotResult;
pub use shuffle::Compose;
pub use shuffle::HasDual;
pub use shuffle::Identity;
pub use shuffle::Permutation;
pub use shuffle::RotateDown;
pub use shuffle::RotateUp;
pub use shuffle::ShuffleSafe;
pub use shuffle::Xor;
pub use tile::Tile;
pub use warp::Warp;
Modules§
- active_set
- Active set types: compile-time lane subset tracking.
- block
- Block-level types: shared memory ownership and inter-block sessions.
- cub
- Typed CUB-equivalent warp primitives.
- data
- GPU data types: uniform vs per-lane value distinction.
- diverge
- Diverge operations: splitting a warp by predicate.
- dynamic
- Data-dependent divergence with structural complement guarantees.
- fence
- Fence-divergence interaction types (§5.6).
- gpu
- GPU intrinsics for nvptx64 and amdgpu targets.
- gradual
- Gradual typing: DynWarp ↔ Warp<S> bridge.
- merge
- Merge operations: reconverging diverged warps.
- platform
- Platform abstraction for unified CPU/GPU targeting.
- prelude
- Convenience prelude — import everything needed for typical usage.
- shuffle
- Shuffle operations and permutation algebra.
- simwarp
- SimWarp: multi-lane warp simulator with real shuffle semantics.
- sort
- Typed warp-level bitonic sort.
- tile
- Cooperative Groups: thread block tiles with typed shuffle safety.
- warp
- The core Warp<S> type: a warp with compile-time tracked active lanes.
Constants§
- WARP_SIZE
- Number of lanes per warp/wavefront.
Traits§
- GpuValue
- Marker trait for types that can live in GPU registers.
Functions§
- zero_overhead_butterfly
- Zero-overhead benchmark: 5 shuffle permutations + butterfly reduction.
- zero_overhead_diverge_merge
- Diverge-merge round trip: the type system’s core mechanism.
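For readers unfamiliar with the butterfly reduction mentioned above, the following CPU-only sketch shows the general pattern: each lane repeatedly adds the value held by the lane whose index differs by one XOR bit. It is not the body of `zero_overhead_butterfly`; the helper names `shuffle_xor` and `butterfly_sum`, the fixed 32-lane width, and the array-based simulation are assumptions made for illustration. With 32 lanes the loop runs log2(32) = 5 steps.

```rust
const LANES: usize = 32;

/// Simulated XOR shuffle: lane i receives the value held by lane i ^ mask.
/// `mask` must be smaller than LANES so the partner index stays in range.
fn shuffle_xor(data: &[i32; LANES], mask: usize) -> [i32; LANES] {
    let mut out = [0i32; LANES];
    for lane in 0..LANES {
        out[lane] = data[lane ^ mask];
    }
    out
}

/// Butterfly sum reduction with strides 16, 8, 4, 2, 1.
/// After the last step every lane holds the sum of all inputs.
pub fn butterfly_sum(mut vals: [i32; LANES]) -> [i32; LANES] {
    let mut stride = LANES / 2;
    while stride > 0 {
        let partner = shuffle_xor(&vals, stride);
        for lane in 0..LANES {
            vals[lane] += partner[lane];
        }
        stride /= 2;
    }
    vals
}
```

Every lane reads from a partner at each step, which is why a reduction of this shape needs all lanes active and fits the restriction of shuffles to a full warp.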
Attribute Macros§
- warp_kernel
- Mark a function as a GPU kernel entry point.