pub struct TilingConfig {
    pub name: String,
    pub macro_tile: TcbGeometry,
    pub midi_tile: TcbGeometry,
    pub micro_tile: TcbGeometry,
    pub backend: TilingBackend,
}
Complete tiling configuration for a kernel.
Contains geometry for all three tiling levels (macro, midi, and micro), enabling hierarchical, cache-aware execution.
Fields
name: String
Kernel name for identification
macro_tile: TcbGeometry
Macro-tile geometry (L3/Global)
midi_tile: TcbGeometry
Midi-tile geometry (L2/Shared)
micro_tile: TcbGeometry
Micro-tile geometry (Registers)
backend: TilingBackend
Target backend
Implementations
impl TilingConfig
pub fn gpu_q4k_matvec() -> Self
Create configuration for GPU Q4_K MatVec
Optimized for single-token generation where M=1.
pub fn gpu_q4k_matmul() -> Self
Create configuration for GPU Q4_K MatMul (batched)
Optimized for prefill where M > 1.
pub fn gpu_softmax() -> Self
Create configuration for GPU Softmax
pub fn cpu_avx512_matmul() -> Self
Create configuration for CPU AVX-512 MatMul
Optimized for 512-bit wide SIMD:
- 16 floats per ZMM register
- 32 ZMM registers available
- 4×16 micro-kernel uses 8 registers (4 accumulators + 4 scratch)
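The register arithmetic in the list above can be checked with a few constants. This is a minimal sketch rather than the crate's own code: the constant and function names are hypothetical, and the split into 4 accumulators plus 4 scratch registers is taken from the description above.

```rust
/// Width of one ZMM register in bits (AVX-512).
const ZMM_BITS: usize = 512;
/// Number of architectural ZMM registers in 64-bit mode.
const ZMM_REGISTERS: usize = 32;

/// Number of f32 lanes that fit in one ZMM register.
const fn f32_lanes() -> usize {
    ZMM_BITS / (8 * std::mem::size_of::<f32>())
}

/// Accumulator registers needed to hold a `rows x cols` f32 output tile.
/// Hypothetical helper: assumes `cols` is a multiple of the lane count.
fn accumulators_for_tile(rows: usize, cols: usize) -> usize {
    rows * (cols / f32_lanes())
}

/// Whether the accumulators plus `scratch` extra registers fit in the register file.
fn fits_in_register_file(rows: usize, cols: usize, scratch: usize) -> bool {
    accumulators_for_tile(rows, cols) + scratch <= ZMM_REGISTERS
}
```

For a 4×16 tile, each row occupies one register, so `accumulators_for_tile(4, 16)` is 4. With 4 scratch registers the kernel uses 8 of the 32 available, leaving room for larger tiles or deeper unrolling.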
pub fn cpu_avx512_q4k_matvec() -> Self
Create configuration for CPU AVX-512 Q4K MatVec
Optimized for Q4_K quantized inference with 512-bit SIMD. Key differences from AVX2:
- 64-byte aligned for cache line optimization
- 4×1 micro-kernel processes 4 rows simultaneously
- K=256 aligned to Q4_K superblock
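The alignment constraints listed above reduce to simple integer checks. The sketch below is illustrative only: all names are hypothetical, and it assumes that "K=256 aligned" means the reduction dimension must be a whole number of 256-element superblocks.

```rust
/// Elements per Q4_K superblock (the K=256 alignment mentioned above).
const Q4K_SUPERBLOCK: usize = 256;
/// Alignment in bytes; 64 bytes is also the width of one 512-bit register.
const ALIGN_BYTES: usize = 64;
/// Rows handled per step by the 4x1 micro-kernel.
const ROWS_PER_STEP: usize = 4;

/// Number of superblocks in a row of length `k`, or `None` if `k` is
/// zero or not a multiple of the superblock size.
fn superblocks_per_row(k: usize) -> Option<usize> {
    if k != 0 && k % Q4K_SUPERBLOCK == 0 {
        Some(k / Q4K_SUPERBLOCK)
    } else {
        None
    }
}

/// Rounds `bytes` up to the next multiple of `ALIGN_BYTES`.
fn round_up_to_alignment(bytes: usize) -> usize {
    (bytes + ALIGN_BYTES - 1) / ALIGN_BYTES * ALIGN_BYTES
}

/// Splits `rows` into (full 4-row groups, leftover rows).
fn row_groups(rows: usize) -> (usize, usize) {
    (rows / ROWS_PER_STEP, rows % ROWS_PER_STEP)
}
```

Leftover rows from `row_groups` would need a tail path in a real kernel; how the crate handles that case is not stated here.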
pub fn cpu_avx512_vnni_q4k_q8k() -> Self
Create configuration for AVX-512 VNNI Q4K×Q8K integer dot product
AVX-512 VNNI (Vector Neural Network Instructions) provides:
- VPDPBUSD: 8-bit unsigned × 8-bit signed multiply-add to i32
- VPDPWSSD: 16-bit signed × 16-bit signed multiply-add to i32
This enables pure integer Q4K×Q8K without intermediate f32 conversion.
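To make the VPDPBUSD semantics concrete, here is a scalar model of a single 32-bit lane: four unsigned-byte × signed-byte products are summed and added to an i32 accumulator without saturation. This is a reference sketch, not the crate's kernel. The function names are hypothetical, and it assumes the Q4_K nibbles have already been unpacked to `u8` values while the Q8_K values are `i8`.

```rust
/// Scalar model of one 32-bit lane of VPDPBUSD:
/// `acc` plus the sum of four (u8 x i8) products, with wrapping accumulation.
fn vpdpbusd_lane(acc: i32, a: [u8; 4], b: [i8; 4]) -> i32 {
    let mut sum = acc;
    for i in 0..4 {
        // Each product fits comfortably in i32 (at most 255 * 128 in magnitude).
        sum = sum.wrapping_add(i32::from(a[i]) * i32::from(b[i]));
    }
    sum
}

/// Integer dot product over byte slices, consuming four bytes per step.
/// Hypothetical helper: panics unless the lengths match and are a multiple of 4.
fn dot_u8_i8(a: &[u8], b: &[i8]) -> i32 {
    assert_eq!(a.len(), b.len());
    assert_eq!(a.len() % 4, 0);
    a.chunks_exact(4)
        .zip(b.chunks_exact(4))
        .fold(0, |acc, (x, y)| {
            vpdpbusd_lane(acc, [x[0], x[1], x[2], x[3]], [y[0], y[1], y[2], y[3]])
        })
}
```

The hardware instruction performs this lane operation across all 16 i32 lanes of a ZMM register at once, so a single instruction covers 64 byte pairs.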
pub fn cpu_avx2_matmul() -> Self
Create configuration for CPU AVX2 MatMul
pub fn cpu_avx2_q4k_matvec() -> Self
Create configuration for CPU Q4_K MatVec (AVX2)
pub fn cpu_rmsnorm() -> Self
Create configuration for RMSNorm (CPU)
pub fn validate(&self) -> Result<(), TilingError>
Validate that tiling configuration is internally consistent
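The exact rules that `validate` enforces are not listed on this page. As one plausible reading of "internally consistent", the sketch below checks that each tile level is non-empty and divides evenly into the level above it. `Geometry`, `CheckError`, and both functions are hypothetical stand-ins, not the crate's `TcbGeometry` or `TilingError`.

```rust
/// Hypothetical stand-in for a tile geometry with M, N, and K extents.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Geometry {
    m: u32,
    n: u32,
    k: u32,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum CheckError {
    ZeroDimension,
    NotDivisible { outer: Geometry, inner: Geometry },
}

/// Checks that `inner` tiles `outer` exactly in every dimension.
fn nests(outer: Geometry, inner: Geometry) -> Result<(), CheckError> {
    let dims = [(outer.m, inner.m), (outer.n, inner.n), (outer.k, inner.k)];
    if dims.iter().any(|&(o, i)| o == 0 || i == 0) {
        return Err(CheckError::ZeroDimension);
    }
    if dims.iter().any(|&(o, i)| o % i != 0) {
        return Err(CheckError::NotDivisible { outer, inner });
    }
    Ok(())
}

/// Checks the macro -> midi -> micro nesting of a three-level hierarchy.
fn check_hierarchy(
    macro_tile: Geometry,
    midi_tile: Geometry,
    micro_tile: Geometry,
) -> Result<(), CheckError> {
    nests(macro_tile, midi_tile)?;
    nests(midi_tile, micro_tile)
}
```

A real validator may also check backend-specific limits such as shared-memory or register capacity; those are outside what this page documents.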
pub fn num_macro_tiles(&self, m: u32, n: u32) -> u32
Calculate total number of macro-tiles for given problem size
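A common way to count tiles for an M×N problem is ceiling division along each axis, so that partial tiles at the edges are included. The sketch below assumes that convention. The free functions and the `tile_m`/`tile_n` parameters are hypothetical, since the method itself reads the tile extents from `macro_tile`.

```rust
/// Ceiling division for unsigned integers; `b` must be non-zero.
/// Written without `a + b - 1` to avoid overflow near `u32::MAX`.
fn ceil_div(a: u32, b: u32) -> u32 {
    a / b + u32::from(a % b != 0)
}

/// Number of macro-tiles needed to cover an `m x n` output,
/// counting partially filled edge tiles.
fn macro_tile_count(m: u32, n: u32, tile_m: u32, tile_n: u32) -> u32 {
    ceil_div(m, tile_m) * ceil_div(n, tile_n)
}
```

With 64×64 tiles, a 100×100 problem needs 2 tiles per axis and therefore 4 in total, even though three of them are only partly filled.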
pub fn midi_tiles_per_macro(&self) -> u32
Calculate total number of midi-tiles within a macro-tile
pub fn micro_tiles_per_midi(&self) -> u32
Calculate total number of micro-tiles within a midi-tile
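When each level divides the one above it evenly, the per-level counts are products of extent ratios. The sketch below assumes only the M and N extents are counted; whether the crate also includes K is not stated on this page. The `(u32, u32)` tuples are a hypothetical stand-in for `TcbGeometry`.

```rust
/// Number of `inner` tiles that fit inside one `outer` tile.
/// Tuples are (m, n) extents; assumes `inner` divides `outer` exactly.
fn tiles_within(outer: (u32, u32), inner: (u32, u32)) -> u32 {
    (outer.0 / inner.0) * (outer.1 / inner.1)
}

/// Total micro-tiles in one macro-tile across the three-level hierarchy.
fn micro_tiles_per_macro(
    macro_tile: (u32, u32),
    midi_tile: (u32, u32),
    micro_tile: (u32, u32),
) -> u32 {
    tiles_within(macro_tile, midi_tile) * tiles_within(midi_tile, micro_tile)
}
```

For example, a 128×128 macro-tile split into 32×32 midi-tiles gives 16 midi-tiles, and each of those holds 16 micro-tiles of 4×16.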
Trait Implementations
impl Clone for TilingConfig
fn clone(&self) -> TilingConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for TilingConfig
impl<'de> Deserialize<'de> for TilingConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations
impl Freeze for TilingConfig
impl RefUnwindSafe for TilingConfig
impl Send for TilingConfig
impl Sync for TilingConfig
impl Unpin for TilingConfig
impl UnsafeUnpin for TilingConfig
impl UnwindSafe for TilingConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> FmtForward for T
fn fmt_binary(self) -> FmtBinary<Self>
where
    Self: Binary,
Causes self to use its Binary implementation when Debug-formatted.
fn fmt_display(self) -> FmtDisplay<Self>
where
    Self: Display,
Causes self to use its Display implementation when Debug-formatted.
fn fmt_lower_exp(self) -> FmtLowerExp<Self>
where
    Self: LowerExp,
Causes self to use its LowerExp implementation when Debug-formatted.
fn fmt_lower_hex(self) -> FmtLowerHex<Self>
where
    Self: LowerHex,
Causes self to use its LowerHex implementation when Debug-formatted.
fn fmt_octal(self) -> FmtOctal<Self>
where
    Self: Octal,
Causes self to use its Octal implementation when Debug-formatted.
fn fmt_pointer(self) -> FmtPointer<Self>
where
    Self: Pointer,
Causes self to use its Pointer implementation when Debug-formatted.
fn fmt_upper_exp(self) -> FmtUpperExp<Self>
where
    Self: UpperExp,
Causes self to use its UpperExp implementation when Debug-formatted.
fn fmt_upper_hex(self) -> FmtUpperHex<Self>
where
    Self: UpperHex,
Causes self to use its UpperHex implementation when Debug-formatted.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pipe for T
where
    T: ?Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R
where
    Self: Sized,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R
where
    R: 'a,
Borrows self and passes that borrow into the pipe function.
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R
where
    R: 'a,
Mutably borrows self and passes that borrow into the pipe function.
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow_mut<'a, B, R>(
    &'a mut self,
    func: impl FnOnce(&'a mut B) -> R,
) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
Borrows self, then passes self.as_ref() into the pipe function.
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
Mutably borrows self, then passes self.as_mut() into the pipe function.
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
Borrows self, then passes self.deref() into the pipe function.
impl<T> Pointable for T
impl<T> Tap for T
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Immutable access to the Borrow<B> of a value.
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
Mutable access to the BorrowMut<B> of a value.
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
Immutable access to the AsRef<R> view of a value.
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
Mutable access to the AsMut<R> view of a value.
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Immutable access to the Deref::Target of a value.
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Mutable access to the Deref::Target of a value.
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
Calls .tap() only in debug builds, and is erased in release builds.
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
Calls .tap_mut() only in debug builds, and is erased in release builds.
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
Calls .tap_borrow() only in debug builds, and is erased in release builds.
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
Calls .tap_ref() only in debug builds, and is erased in release builds.
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
Calls .tap_deref() only in debug builds, and is erased in release builds.