pub enum AxisType {
Outer,
Global,
Warp,
Local,
Loop,
GroupReduce,
Reduce,
Upcast,
Unroll,
Thread,
Placeholder,
}
Axis type for loop ranges and reductions.
Variants
Outer
Outer kernel-level scheduling dimension (doesn’t go inside kernels).
Used to mark ranges that exist at the scheduling/orchestration level but don’t become part of kernel execution. These ranges are used during kernel splitting to identify boundaries.
Global
GPU grid dimension.
Warp
Warp/wavefront dimension.
Local
GPU block/workgroup dimension (local memory scope).
Loop
Regular loop.
GroupReduce
Grouped reduction.
Reduce
Reduction axis.
Upcast
Vectorization axis (upcast).
Unroll
Unrolled loop.
Thread
Thread dimension.
Placeholder
Temporary canonicalized range for RESHAPE caching (Tinygrad: AxisType.PLACEHOLDER).
Substituted in before _apply_reshape and substituted back after.
Implementations
impl AxisType
pub const fn is_kernel_boundary(&self) -> bool
Returns true if this axis type represents a kernel boundary.
Kernel boundary ranges (Outer) exist at the scheduling level and don’t go inside individual kernels. During kernel splitting, operations with outer ranges are not packaged into KERNEL ops.
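As a rough illustration of how this predicate might be used, the sketch below partitions a list of axis types into boundary axes and axes that stay inside a kernel. Axis is a local stand-in for the documented enum (only a few variants are modelled), and split_at_boundary is a hypothetical helper, not part of the crate.

```rust
// Local stand-in for AxisType; only a subset of variants is modelled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Outer,
    Global,
    Local,
    Reduce,
}

impl Axis {
    // Mirrors the documented behaviour: only Outer is a kernel boundary.
    const fn is_kernel_boundary(&self) -> bool {
        matches!(self, Axis::Outer)
    }
}

// Hypothetical helper: returns (boundary axes, axes that stay inside a kernel).
fn split_at_boundary(axes: &[Axis]) -> (Vec<Axis>, Vec<Axis>) {
    axes.iter().copied().partition(|a| a.is_kernel_boundary())
}
```

With the input [Outer, Global, Local, Reduce], the first vector holds only Outer and the second holds the remaining three axes in their original order.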
pub const fn priority(self) -> i32
Returns the priority for sorting ranges.
Lower values are outer loops, higher values are inner loops. Matches Tinygrad’s axis_to_pos ordering for kernel optimization.
Priority Order:
- Outer: -2 (kernel-level boundary)
- Loop: -1 (not yet parallelized)
- Global/Thread: 0 (outer parallelism)
- Warp: 1 (sub-group parallelism)
- Local/GroupReduce: 2 (workgroup parallelism + synchronization)
- Upcast: 3 (vectorization)
- Reduce: 4 (reduction loops)
- Unroll: 5 (unrolled loops, innermost)
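The ordering above can be sketched as a sort key. This is a minimal sketch, not the crate's source: Axis is a local stand-in for the documented enum, Placeholder is left out because the table lists no priority for it, and sort_outer_to_inner is a hypothetical helper.

```rust
// Local stand-in for AxisType. Placeholder is omitted on purpose: the
// priority table does not document a value for it.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Outer,
    Loop,
    Global,
    Thread,
    Warp,
    Local,
    GroupReduce,
    Upcast,
    Reduce,
    Unroll,
}

// Priority values copied from the documented table.
const fn priority(axis: Axis) -> i32 {
    match axis {
        Axis::Outer => -2,
        Axis::Loop => -1,
        Axis::Global | Axis::Thread => 0,
        Axis::Warp => 1,
        Axis::Local | Axis::GroupReduce => 2,
        Axis::Upcast => 3,
        Axis::Reduce => 4,
        Axis::Unroll => 5,
    }
}

// Hypothetical helper: orders axes from outermost to innermost.
// sort_by_key is stable, so axes with equal priority keep their input order.
fn sort_outer_to_inner(mut axes: Vec<Axis>) -> Vec<Axis> {
    axes.sort_by_key(|a| priority(*a));
    axes
}
```

Because the sort is stable, pairs that share a priority (Global/Thread, Local/GroupReduce) are not reordered relative to each other.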
pub const fn letter(self) -> char
Returns the single-letter code for this axis type.
Used in kernel name generation and debug output.
Letter Codes:
- O: Outer
- L: Loop
- g: Global
- t: Thread
- w: Warp
- l: Local
- G: GroupReduce
- u: Upcast
- R: Reduce
- r: Unroll
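To make the mapping concrete, the sketch below turns a sequence of axes into a compact string of letter codes. It is an assumption-laden illustration: Axis is a local stand-in for the documented enum (Placeholder is omitted because no letter is listed for it), and axis_signature is a hypothetical helper. The crate's actual kernel-name format is not shown in this page.

```rust
// Local stand-in for AxisType. Placeholder is omitted: no letter is documented for it.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Outer,
    Loop,
    Global,
    Thread,
    Warp,
    Local,
    GroupReduce,
    Upcast,
    Reduce,
    Unroll,
}

// Letter codes copied from the documented table.
const fn letter(axis: Axis) -> char {
    match axis {
        Axis::Outer => 'O',
        Axis::Loop => 'L',
        Axis::Global => 'g',
        Axis::Thread => 't',
        Axis::Warp => 'w',
        Axis::Local => 'l',
        Axis::GroupReduce => 'G',
        Axis::Upcast => 'u',
        Axis::Reduce => 'R',
        Axis::Unroll => 'r',
    }
}

// Hypothetical helper: concatenates the letter codes of a list of axes.
fn axis_signature(axes: &[Axis]) -> String {
    axes.iter().map(|a| letter(*a)).collect()
}
```

For example, the axes [Global, Local, Reduce, Unroll] map to the string "glRr". Note that case matters: R is Reduce while r is Unroll, and G is GroupReduce while g is Global.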
pub const fn is_parallel(self) -> bool
Returns true if this is a parallelizable axis type.
Parallel axes represent GPU/thread dispatch dimensions that don’t contribute to accumulator placement in reduce_to_acc.
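The page does not list which variants count as parallel. The sketch below assumes they are Global, Thread, Warp, and Local (the GPU/thread dispatch dimensions); treat that set as a guess, not the crate's definition. Axis is a local stand-in and accumulator_axes is a hypothetical helper showing how parallel axes might be filtered out before deciding accumulator placement.

```rust
// Local stand-in for AxisType; only a subset of variants is modelled here.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Loop,
    Global,
    Thread,
    Warp,
    Local,
    Reduce,
}

impl Axis {
    // ASSUMPTION: the parallel set is Global, Thread, Warp, Local.
    // The documentation only says "GPU/thread dispatch dimensions".
    const fn is_parallel(self) -> bool {
        matches!(self, Axis::Global | Axis::Thread | Axis::Warp | Axis::Local)
    }
}

// Hypothetical helper: keeps only the axes that would still matter for
// accumulator placement once parallel axes are ignored.
fn accumulator_axes(axes: &[Axis]) -> Vec<Axis> {
    axes.iter().copied().filter(|a| !a.is_parallel()).collect()
}
```

Under that assumption, [Global, Local, Loop, Reduce] reduces to [Loop, Reduce].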
Trait Implementations
impl<'de> Deserialize<'de> for AxisType
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl Ord for AxisType
impl PartialOrd for AxisType
impl Copy for AxisType
impl Eq for AxisType
impl StructuralPartialEq for AxisType
Auto Trait Implementations
impl Freeze for AxisType
impl RefUnwindSafe for AxisType
impl Send for AxisType
impl Sync for AxisType
impl Unpin for AxisType
impl UnsafeUnpin for AxisType
impl UnwindSafe for AxisType
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<Q, K> Comparable<K> for Q
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.