Executes multiple independent global matmuls with optional broadcasting.
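As an illustration of what "optional broadcasting" can mean at the batch level, here is a minimal sketch. It assumes that an operand with a single batch is reused for every output batch. The function names `operand_batch` and `batch_pairs` are hypothetical and are not part of this module's API.

```rust
/// Map an output batch index to the batch index of one operand.
/// Assumption: an operand with exactly one batch is broadcast,
/// i.e. reused for every output batch; otherwise indices match 1:1.
pub fn operand_batch(out_batch: usize, operand_batches: usize) -> usize {
    if operand_batches == 1 {
        0
    } else {
        out_batch
    }
}

/// List the (lhs, rhs) batch pair consumed by each independent matmul.
/// The output batch count is taken as the larger of the two operand counts.
pub fn batch_pairs(lhs_batches: usize, rhs_batches: usize) -> Vec<(usize, usize)> {
    let out_batches = lhs_batches.max(rhs_batches);
    (0..out_batches)
        .map(|b| (operand_batch(b, lhs_batches), operand_batch(b, rhs_batches)))
        .collect()
}
```

Under these assumptions, `batch_pairs(1, 3)` pairs the single left-hand batch with each of the three right-hand batches, while `batch_pairs(2, 2)` pairs batches one to one.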
Structs
- ColMajorGlobalPartitionMatmul - Iterates over global matmuls in column-major order.
- HypercubeConfig - Determines how to launch the hypercube, i.e. anything relevant to CubeCount and where a Cube at a given cube position should work. Similar to HypercubeSelection, but injected into the kernel as a comptime struct.
- HypercubeSelection - Determines how to launch the hypercube, i.e. anything relevant to CubeCount and where a Cube at a given cube position should work.
- PartitionedBatchMatmulFamily - Simple partitioned batch matmul family for any precision.
- RowMajorGlobalPartitionMatmul - Iterates over global matmuls in row-major order.
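The difference between row-major and column-major iteration can be sketched as follows. This is an illustrative example only, assuming a partition laid out as a `rows x cols` grid of global matmuls. The functions `row_major_order` and `col_major_order` are hypothetical names and do not reproduce the actual implementations of the structs listed above.

```rust
/// Visit every (row, col) cell of a `rows x cols` grid in row-major order:
/// the column index varies fastest.
pub fn row_major_order(rows: usize, cols: usize) -> Vec<(usize, usize)> {
    let mut order = Vec::with_capacity(rows * cols);
    for r in 0..rows {
        for c in 0..cols {
            order.push((r, c));
        }
    }
    order
}

/// Visit every (row, col) cell of a `rows x cols` grid in column-major order:
/// the row index varies fastest.
pub fn col_major_order(rows: usize, cols: usize) -> Vec<(usize, usize)> {
    let mut order = Vec::with_capacity(rows * cols);
    for c in 0..cols {
        for r in 0..rows {
            order.push((r, c));
        }
    }
    order
}
```

Both functions visit the same set of cells; only the visiting order differs, which in practice affects which operand tiles are touched consecutively.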
Enums
- CubeCountInput - CubeCountPlan stripped of non-essential runtime information.
- CubeCountInputArgs
- CubeCountPlanSelection - Front-facing configuration when crafting a MatmulSelection. Allows choosing a strategy before the actual values are known.
- GlobalOrderSelection - Used to create GlobalOrder.
- SmAllocation - Controls how Streaming Multiprocessors (SMs) are assigned cubes.
Traits
- BatchConfig - Configuration for the batch matmul level.
- BatchMatmul - Provides matrix multiplication operations at the batch level.
- BatchMatmulFamily - A family of matmuls working with any precision.