pub struct TileConfig {
    pub block_m: usize,
    pub block_n: usize,
    pub block_k: usize,
    pub thread_m: usize,
    pub thread_n: usize,
}
Tile configuration for the tiled GEMM algorithm.
These parameters control the blocking strategy at different memory levels. Optimal values depend on the target hardware’s cache hierarchy.
Fields§
block_m: usize
Block tile size for M dimension (shared memory / L1 cache level)
block_n: usize
Block tile size for N dimension
block_k: usize
Block tile size for K dimension (loop tiling)
thread_m: usize
Register tile size for M dimension (per thread/lane)
thread_n: usize
Register tile size for N dimension (per thread/lane)
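To make the two tiling levels concrete, here is a minimal CPU sketch of how the five fields could map onto a loop nest. It is not the crate's kernel: the struct is a local mirror of the documented fields so the snippet compiles on its own, and `tiled_gemm` is a hypothetical helper that assumes row-major `f32` matrices and non-zero tile sizes.

```rust
/// Local mirror of the documented struct so this sketch is self-contained.
#[derive(Clone, Copy, Debug)]
pub struct TileConfig {
    pub block_m: usize,
    pub block_n: usize,
    pub block_k: usize,
    pub thread_m: usize,
    pub thread_n: usize,
}

/// Hypothetical helper: computes C += A * B for row-major matrices,
/// where A is m×k, B is k×n and C is m×n.
/// All tile sizes in `cfg` must be non-zero (`step_by(0)` panics).
pub fn tiled_gemm(
    cfg: &TileConfig,
    m: usize,
    n: usize,
    k: usize,
    a: &[f32],
    b: &[f32],
    c: &mut [f32],
) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);

    // Block level: each (i0, j0) pair selects a block_m × block_n tile of C,
    // and p0 walks the K dimension in slices of block_k.
    for i0 in (0..m).step_by(cfg.block_m) {
        let i_end = (i0 + cfg.block_m).min(m);
        for j0 in (0..n).step_by(cfg.block_n) {
            let j_end = (j0 + cfg.block_n).min(n);
            for p0 in (0..k).step_by(cfg.block_k) {
                let p_end = (p0 + cfg.block_k).min(k);

                // Register level: each (i1, j1) pair selects a
                // thread_m × thread_n sub-tile inside the block tile.
                for i1 in (i0..i_end).step_by(cfg.thread_m) {
                    let i1_end = (i1 + cfg.thread_m).min(i_end);
                    for j1 in (j0..j_end).step_by(cfg.thread_n) {
                        let j1_end = (j1 + cfg.thread_n).min(j_end);
                        for i in i1..i1_end {
                            for j in j1..j1_end {
                                let mut acc = c[i * n + j];
                                for p in p0..p_end {
                                    acc += a[i * k + p] * b[p * n + j];
                                }
                                c[i * n + j] = acc;
                            }
                        }
                    }
                }
            }
        }
    }
}
```

On a GPU backend the block-level loops would typically become the launch grid and the register-level loops the per-thread work; the sequential version above only shows which index ranges each parameter controls.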
Implementations§
impl TileConfig
pub const CUDA: Self
CUDA-optimized configuration (Ampere/Ada architecture)
- 128×128 block tiles fit well in shared memory
- 8×8 register tiles maximize occupancy
- BLOCK_K=8 balances memory bandwidth vs compute
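The numbers in the list above can be checked with a little arithmetic. The sketch below uses a local mirror of the struct and the tile sizes quoted in the bullets. The helper names, the single-buffered staging model and the 4-byte element size are assumptions made for illustration, not details taken from the crate.

```rust
/// Local mirror of the documented struct so this sketch is self-contained.
#[derive(Clone, Copy, Debug)]
pub struct TileConfig {
    pub block_m: usize,
    pub block_n: usize,
    pub block_k: usize,
    pub thread_m: usize,
    pub thread_n: usize,
}

/// Tile sizes quoted in the bullets: 128×128 block, 8×8 register, BLOCK_K = 8.
pub const CUDA_SKETCH: TileConfig = TileConfig {
    block_m: 128,
    block_n: 128,
    block_k: 8,
    thread_m: 8,
    thread_n: 8,
};

/// Bytes needed to stage one A tile (block_m × block_k) and one B tile
/// (block_k × block_n), assuming a single buffer per operand.
pub const fn staging_bytes(cfg: &TileConfig, elem_size: usize) -> usize {
    (cfg.block_m * cfg.block_k + cfg.block_k * cfg.block_n) * elem_size
}

/// Accumulators each thread keeps in registers for its output sub-tile.
pub const fn accumulators_per_thread(cfg: &TileConfig) -> usize {
    cfg.thread_m * cfg.thread_n
}

/// Floating-point operations per staged element for one K slice:
/// 2 * block_m * block_n * block_k FLOPs over
/// (block_m + block_n) * block_k loaded elements.
pub fn flops_per_staged_element(cfg: &TileConfig) -> f64 {
    let flops = 2.0 * (cfg.block_m * cfg.block_n * cfg.block_k) as f64;
    let loads = ((cfg.block_m + cfg.block_n) * cfg.block_k) as f64;
    flops / loads
}
```

Under these assumptions the quoted sizes stage 8192 bytes per K slice for `f32` data, keep 64 accumulators per thread, and perform 128 FLOPs per staged element. Note that `block_k` cancels out of the last ratio, so in this model it affects the staging footprint and loop trip count but not the reuse factor.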
pub const WGPU: Self
WebGPU-optimized configuration
- 64×64 block tiles for broader device compatibility
- 4×4 register tiles reduce register pressure
pub const CPU_AVX: Self
CPU AVX2/AVX-512 optimized configuration
- 48×8 tiles fit in L1 cache (32KB)
- 6×8 register tiles fill 12-16 YMM/ZMM registers
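The register count in the last bullet depends on the element type, which this page does not state. As a rough illustration, the hypothetical helper below counts the vector accumulators a register tile would need, assuming each row of the tile is held in whole vector registers. A 256-bit YMM register holds 8 `f32` or 4 `f64` lanes, and a 512-bit ZMM register holds 16 `f32` or 8 `f64` lanes.

```rust
/// Hypothetical helper: number of vector registers needed to hold a
/// thread_m × thread_n accumulator tile when each row is stored in
/// ceil(thread_n / lanes) vector registers.
pub const fn accumulator_registers(thread_m: usize, thread_n: usize, lanes: usize) -> usize {
    thread_m * ((thread_n + lanes - 1) / lanes)
}

/// Lane counts for the common element/register combinations.
pub const YMM_F32_LANES: usize = 8;
pub const YMM_F64_LANES: usize = 4;
pub const ZMM_F32_LANES: usize = 16;
pub const ZMM_F64_LANES: usize = 8;
```

With this model a 6×8 tile needs 12 YMM accumulators for `f64` and 6 for `f32`, which leaves the remaining registers for loaded operands and broadcasts.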
pub const fn threads_per_block(&self) -> usize
Number of threads per block (for GPU backends)
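This page does not show how the thread count is computed. A plausible reading, given that each thread owns one `thread_m` × `thread_n` register tile, is that the block tile is divided evenly among threads. The free function below is a sketch of that assumption only; its name and formula are not taken from the crate.

```rust
/// Assumed formula: one thread per register tile inside the block tile.
/// Divisions truncate, so this presumes block sizes are multiples of the
/// register-tile sizes, and that the register-tile sizes are non-zero.
pub const fn assumed_threads_per_block(
    block_m: usize,
    block_n: usize,
    thread_m: usize,
    thread_n: usize,
) -> usize {
    (block_m / thread_m) * (block_n / thread_n)
}
```

Under this assumption both GPU presets described above come out at 256 threads: 128×128 blocks with 8×8 register tiles give 16 × 16, and 64×64 blocks with 4×4 register tiles give 16 × 16 as well.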
Trait Implementations§
impl Clone for TileConfig
fn clone(&self) -> TileConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for TileConfig
impl Default for TileConfig
impl Copy for TileConfig
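Because the type is `Copy` and `Debug`, a preset can be passed by value, tweaked with struct update syntax, and printed without extra ceremony. The sketch below uses a local mirror of the struct with the CUDA tile sizes quoted earlier on this page. `with_block_k` and `describe` are hypothetical helpers, and `Default` is left out because the page does not say which values it returns.

```rust
/// Local mirror of the documented struct so this sketch is self-contained.
#[derive(Clone, Copy, Debug)]
pub struct TileConfig {
    pub block_m: usize,
    pub block_n: usize,
    pub block_k: usize,
    pub thread_m: usize,
    pub thread_n: usize,
}

impl TileConfig {
    /// Mirror of the CUDA preset using the sizes quoted in its description.
    pub const CUDA: Self = Self {
        block_m: 128,
        block_n: 128,
        block_k: 8,
        thread_m: 8,
        thread_n: 8,
    };
}

/// Hypothetical helper: copy a preset and override only the K tile size.
pub fn with_block_k(base: TileConfig, block_k: usize) -> TileConfig {
    TileConfig { block_k, ..base }
}

/// Hypothetical helper: render a configuration through its Debug impl.
pub fn describe(cfg: TileConfig) -> String {
    format!("{:?}", cfg)
}
```

Since `with_block_k` takes its argument by value and the type is `Copy`, the caller's original preset remains usable after the call.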
Auto Trait Implementations§
impl Freeze for TileConfig
impl RefUnwindSafe for TileConfig
impl Send for TileConfig
impl Sync for TileConfig
impl Unpin for TileConfig
impl UnsafeUnpin for TileConfig
impl UnwindSafe for TileConfig
Blanket Implementations§
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
The archived version of the pointer metadata for this type.
fn pointer_metadata(_: &<T as ArchivePointee>::ArchivedMetadata) -> <T as Pointee>::Metadata
Converts some archived metadata to the pointer metadata for itself.
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
Returns the layout of the type.
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
Returns whether the given value has been niched.
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.