pub struct KernelFusionConfig {
    pub compute_capability: (u32, u32),
    pub warp_size: usize,
    pub max_threads_per_block: usize,
    pub shared_memory_size: usize,
    pub mixed_precision: bool,
    pub use_tensor_cores: bool,
    pub coalescing_level: CoalescingLevel,
}
Configuration for GPU kernel fusion optimization.
Fields
compute_capability: (u32, u32)
Target GPU compute capability (e.g., (7, 0) for V100, (8, 0) for A100)
warp_size: usize
Warp size (typically 32 for NVIDIA GPUs)
max_threads_per_block: usize
Maximum threads per block
shared_memory_size: usize
Shared memory size per block in bytes
mixed_precision: bool
Enable mixed precision (FP16/FP32) kernels
use_tensor_cores: bool
Enable tensor core operations where possible
coalescing_level: CoalescingLevel
Memory coalescing optimization level
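As a rough illustration of how the fields fit together, the sketch below fills in a configuration for an A100-class GPU and runs a basic sanity check. It uses local stand-in types so that it compiles on its own. The `CoalescingLevel` variants, `a100_like_config`, and `check_config` are hypothetical names introduced here and are not part of the documented API; the field values are assumptions chosen for illustration.

```rust
// Stand-in types mirroring the documented fields. The real crate's
// `CoalescingLevel` variants are not shown on this page, so the ones
// below are hypothetical.
#[derive(Clone, Debug, PartialEq)]
pub enum CoalescingLevel {
    None,
    Basic,
    Aggressive,
}

#[derive(Clone, Debug)]
pub struct KernelFusionConfig {
    pub compute_capability: (u32, u32),
    pub warp_size: usize,
    pub max_threads_per_block: usize,
    pub shared_memory_size: usize,
    pub mixed_precision: bool,
    pub use_tensor_cores: bool,
    pub coalescing_level: CoalescingLevel,
}

/// Hypothetical helper: illustrative values for an A100-class GPU.
pub fn a100_like_config() -> KernelFusionConfig {
    KernelFusionConfig {
        compute_capability: (8, 0),
        warp_size: 32,
        max_threads_per_block: 1024,
        // 48 KiB: CUDA's default static shared memory limit per block.
        shared_memory_size: 48 * 1024,
        mixed_precision: true,
        use_tensor_cores: true,
        coalescing_level: CoalescingLevel::Basic,
    }
}

/// Hypothetical sanity check, not part of the documented API.
pub fn check_config(cfg: &KernelFusionConfig) -> Result<(), String> {
    if cfg.warp_size == 0
        || cfg.max_threads_per_block == 0
        || cfg.max_threads_per_block % cfg.warp_size != 0
    {
        return Err("max_threads_per_block should be a nonzero multiple of warp_size".into());
    }
    // Tensor cores were introduced with compute capability 7.0 (Volta).
    if cfg.use_tensor_cores && cfg.compute_capability < (7, 0) {
        return Err("tensor cores need compute capability 7.0 or newer".into());
    }
    Ok(())
}
```

Calling `check_config(&a100_like_config())` returns `Ok(())`, while the same configuration with `compute_capability` set to `(6, 1)` is rejected because `use_tensor_cores` is enabled.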
Implementations
impl KernelFusionConfig
pub fn optimal_block_size(&self, param_count: usize) -> usize
Returns the optimal block size for the given parameter count.
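The page does not say how the block size is chosen. One plausible heuristic, shown here purely as an assumption and not as the crate's implementation, is to round the parameter count up to a whole number of warps and clamp the result to the per-block thread limit. `block_size_sketch` is a hypothetical free function that takes the relevant fields as plain arguments.

```rust
/// Hypothetical heuristic, NOT the crate's actual implementation:
/// round `param_count` up to a multiple of `warp_size`, then clamp the
/// result to the range [warp_size, max_threads_per_block].
pub fn block_size_sketch(
    warp_size: usize,
    max_threads_per_block: usize,
    param_count: usize,
) -> usize {
    assert!(warp_size > 0, "warp_size must be nonzero");
    assert!(
        max_threads_per_block >= warp_size,
        "a block must be able to hold at least one warp"
    );

    // Round up to a whole number of warps so no warp is partially filled.
    let warps = param_count.saturating_add(warp_size - 1) / warp_size;
    let rounded = warps.saturating_mul(warp_size);

    // Largest multiple of warp_size that still fits in one block.
    let cap = (max_threads_per_block / warp_size) * warp_size;

    rounded.clamp(warp_size, cap)
}
```

Under this sketch, 100 parameters with a warp size of 32 and a limit of 1024 threads give a block of 128 threads (four warps), and very large parameter counts saturate at 1024.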
pub fn memory_alignment(&self) -> usize
Returns the memory alignment requirement for the configured coalescing level.
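A typical use of an alignment value like this is to pad buffer sizes before allocation. The sketch below assumes the returned alignment is a nonzero byte count, which the page does not state explicitly. `align_up` is a hypothetical helper and not part of the documented API.

```rust
/// Hypothetical helper: round `size` up to the next multiple of
/// `alignment`. Assumes `alignment` is a nonzero number of bytes, such
/// as a value returned by `memory_alignment()`.
pub fn align_up(size: usize, alignment: usize) -> usize {
    assert!(alignment > 0, "alignment must be nonzero");
    let remainder = size % alignment;
    if remainder == 0 {
        // Already aligned, nothing to pad.
        size
    } else {
        // Pad up to the next alignment boundary.
        size + (alignment - remainder)
    }
}
```

With a 128-byte alignment, a 100-byte buffer would be padded to 128 bytes and a 129-byte buffer to 256 bytes, while sizes that are already multiples of 128 are left unchanged.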
Trait Implementations
impl Clone for KernelFusionConfig
fn clone(&self) -> KernelFusionConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for KernelFusionConfig
Auto Trait Implementations
impl Freeze for KernelFusionConfig
impl RefUnwindSafe for KernelFusionConfig
impl Send for KernelFusionConfig
impl Sync for KernelFusionConfig
impl Unpin for KernelFusionConfig
impl UnwindSafe for KernelFusionConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.