Module kernel_tuning

Automatic kernel tuning for hardware adaptation

This module provides automatic performance tuning for kernel operations across different hardware backends. It profiles kernel execution times and adaptively selects optimal parameters (block sizes, thread counts, memory layouts) for the specific hardware being used.

§Features

Auto-tuning: Automatic parameter selection through benchmarking
Hardware Detection: Platform capability detection and profiling
Caching: Persistent tuning results for faster subsequent runs
Multi-Backend: Support for CUDA, ROCm, Metal, CPU, and more
Adaptive: Dynamic adjustment based on tensor sizes and operations
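The auto-tuning and caching features above can be illustrated with a small standalone sketch. It is not this module's implementation: `SketchTuner`, `blocked_matmul`, and the candidate block sizes are hypothetical names chosen for illustration, and the cache here lives only in memory, whereas the module describes persistent results. The idea shown is simply to time each candidate parameter on the actual problem shape, keep the fastest, and reuse that answer on later calls.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Blocked matrix multiply: C (m x n) = A (m x k) * B (k x n), all row-major.
fn blocked_matmul(a: &[f32], b: &[f32], m: usize, n: usize, k: usize, block: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; m * n];
    for i0 in (0..m).step_by(block) {
        for p0 in (0..k).step_by(block) {
            for j0 in (0..n).step_by(block) {
                for i in i0..(i0 + block).min(m) {
                    for p in p0..(p0 + block).min(k) {
                        let a_ip = a[i * k + p];
                        for j in j0..(j0 + block).min(n) {
                            c[i * n + j] += a_ip * b[p * n + j];
                        }
                    }
                }
            }
        }
    }
    c
}

/// Hypothetical tuner (not the crate's `KernelTuner`): benchmarks each
/// candidate block size and caches the winner per (m, n, k) shape.
struct SketchTuner {
    candidates: Vec<usize>,
    cache: HashMap<(usize, usize, usize), usize>,
}

impl SketchTuner {
    fn new(candidates: Vec<usize>) -> Self {
        assert!(!candidates.is_empty() && candidates.iter().all(|&b| b > 0));
        Self { candidates, cache: HashMap::new() }
    }

    fn tune_matmul(&mut self, m: usize, n: usize, k: usize) -> usize {
        // Cached shapes skip benchmarking entirely.
        if let Some(&block) = self.cache.get(&(m, n, k)) {
            return block;
        }
        let a = vec![1.0f32; m * k];
        let b = vec![1.0f32; k * n];
        let mut best = (Duration::MAX, self.candidates[0]);
        for &block in &self.candidates {
            let start = Instant::now();
            let c = blocked_matmul(&a, &b, m, n, k, block);
            let elapsed = start.elapsed();
            // Keep the result observable so the timed work is not optimized away.
            std::hint::black_box(&c);
            if elapsed < best.0 {
                best = (elapsed, block);
            }
        }
        self.cache.insert((m, n, k), best.1);
        best.1
    }
}

fn main() {
    let mut tuner = SketchTuner::new(vec![8, 16, 32, 64]);
    let block = tuner.tune_matmul(128, 96, 64);
    println!("selected block size: {block}");
    // The second call hits the cache and returns the same value.
    assert_eq!(block, tuner.tune_matmul(128, 96, 64));
}
```

A single timed run per candidate is noisy; a real tuner would typically add warm-up iterations and repeat each measurement before choosing.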

§Examples

use trustformers_core::kernel_tuning::{KernelTuner, TuningConfig, Operation};

// Create tuner with default configuration
let mut tuner = KernelTuner::new(TuningConfig::default())?;

// Auto-tune matrix multiplication for a 1024x768 by 768x512 product;
// judging by the shapes, the arguments are in (m, n, k) order
let params = tuner.tune_matmul(1024, 512, 768)?;
println!("Optimal block size: {:?}", params.block_size);

Structs§

KernelParams: Tuned kernel parameters
KernelTuner: Automatic kernel tuner
PlatformInfo: Platform characteristics for tuning decisions
TuningConfig: Tuning configuration
TuningStatistics: Statistics about tuning results

Enums§

Backend: Hardware backend types
Operation: Kernel operation types for tuning

Functions§

get_kernel_tuner: Get or initialize the global kernel tuner
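For readers unfamiliar with the "get or initialize" pattern that `get_kernel_tuner` is described as following, here is a minimal standard-library sketch. The signature and return type of `get_kernel_tuner` are not shown on this page, so everything below is an assumption for illustration: `GlobalSketch` and `global_sketch` are hypothetical stand-ins, and wrapping the value in a `Mutex` is one possible way to allow shared mutation.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for a tuner; not the crate's `KernelTuner`.
#[derive(Debug, Default)]
struct GlobalSketch {
    tuned_shapes: usize,
}

/// Returns the process-wide instance, creating it on first use.
fn global_sketch() -> &'static Mutex<GlobalSketch> {
    static INSTANCE: OnceLock<Mutex<GlobalSketch>> = OnceLock::new();
    INSTANCE.get_or_init(|| Mutex::new(GlobalSketch::default()))
}

fn main() {
    global_sketch().lock().unwrap().tuned_shapes += 1;
    // A later call sees the same instance, including the earlier update.
    println!(
        "shapes tuned so far: {}",
        global_sketch().lock().unwrap().tuned_shapes
    );
}
```

`OnceLock::get_or_init` runs the initializer at most once even when several threads race on the first call, which is why it suits a lazily created global.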
