Module kernel_tuning

Automatic kernel tuning for hardware adaptation

This module provides automatic performance tuning for kernel operations across hardware backends. It profiles kernel execution times and selects the best-performing parameters (block sizes, thread counts, memory layouts) for the hardware in use.
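The profiling-and-selection idea described above can be illustrated with a minimal, standalone sketch. It does not use this crate's API: `pick_fastest` and `blocked_matmul` are hypothetical helpers written for illustration only, and the real tuner may measure and select parameters differently. The sketch times a kernel once per candidate block size and keeps the fastest candidate.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of this crate): run `kernel` for every
/// candidate block size and return the fastest one with its total time.
/// Returns `None` when `candidates` is empty.
fn pick_fastest<F>(candidates: &[usize], repeats: u32, mut kernel: F) -> Option<(usize, Duration)>
where
    F: FnMut(usize),
{
    let mut best: Option<(usize, Duration)> = None;
    for &block in candidates {
        // One untimed warm-up call so cold caches do not skew the measurement.
        kernel(block);
        let start = Instant::now();
        for _ in 0..repeats {
            kernel(block);
        }
        let elapsed = start.elapsed();
        // Keep this candidate if it beats the best time seen so far.
        if best.map_or(true, |(_, t)| elapsed < t) {
            best = Some((block, elapsed));
        }
    }
    best
}

/// Hypothetical kernel to tune: blocked multiplication of two row-major
/// `n x n` matrices, writing `a * b` into `c`.
fn blocked_matmul(a: &[f32], b: &[f32], c: &mut [f32], n: usize, block: usize) {
    assert!(block > 0, "block size must be non-zero");
    c.fill(0.0);
    for ii in (0..n).step_by(block) {
        for kk in (0..n).step_by(block) {
            for jj in (0..n).step_by(block) {
                // Multiply one tile; `min(n)` handles tiles at the matrix edge.
                for i in ii..(ii + block).min(n) {
                    for k in kk..(kk + block).min(n) {
                        let aik = a[i * n + k];
                        for j in jj..(jj + block).min(n) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Calling `pick_fastest(&[8, 16, 32, 64], 5, |blk| blocked_matmul(&a, &b, &mut c, n, blk))` would return whichever block size ran fastest on the current machine. The result depends on the hardware and on measurement noise, which is why a real tuner typically repeats measurements and caches the outcome.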

§Features

  • Auto-tuning: Automatic parameter selection through benchmarking
  • Hardware Detection: Platform capability detection and profiling
  • Caching: Persistent tuning results for faster subsequent runs
  • Multi-Backend: Support for CUDA, ROCm, Metal, CPU, and more
  • Adaptive: Dynamic adjustment based on tensor sizes and operations
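As a rough illustration of the caching feature listed above, the following sketch memoizes tuning results by backend, operation, and shape. All names here (`CacheKey`, `CachedParams`, `TuningCache`, `get_or_tune`) are hypothetical and are not this crate's types. The sketch keeps results in memory only; writing them to disk, which persistent caching would require, is left out.

```rust
use std::collections::HashMap;

/// Hypothetical cache key: which backend, which operation, which shape.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct CacheKey {
    backend: String,
    op: String,
    shape: Vec<usize>,
}

/// Hypothetical stand-in for a set of tuned parameters.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CachedParams {
    block_size: usize,
}

/// In-memory cache of tuning results (assumption: no persistence here).
#[derive(Default)]
struct TuningCache {
    entries: HashMap<CacheKey, CachedParams>,
    /// Number of lookups that had to run the tuning closure.
    misses: usize,
}

impl TuningCache {
    /// Return the cached parameters for `key`, or run `tune` once,
    /// store its result, and return it.
    fn get_or_tune<F>(&mut self, key: CacheKey, tune: F) -> CachedParams
    where
        F: FnOnce() -> CachedParams,
    {
        if let Some(params) = self.entries.get(&key) {
            return *params;
        }
        self.misses += 1;
        let params = tune();
        self.entries.insert(key, params);
        params
    }
}
```

With this shape, a second request for the same `(backend, op, shape)` key skips benchmarking entirely and returns the stored parameters. Including the shape in the key reflects the "Adaptive" point above: different tensor sizes can map to different tuned parameters.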

§Examples

use trustformers_core::kernel_tuning::{KernelTuner, TuningConfig, Operation};

// Create tuner with default configuration
let mut tuner = KernelTuner::new(TuningConfig::default())?;

// Auto-tune matrix multiplication for a (1024x768) * (768x512) product;
// the arguments appear to be passed in (m, n, k) order
let params = tuner.tune_matmul(1024, 512, 768)?;
println!("Optimal block size: {:?}", params.block_size);

Structs§

KernelParams
Tuned kernel parameters
KernelTuner
Automatic kernel tuner
PlatformInfo
Platform characteristics for tuning decisions
TuningConfig
Tuning configuration
TuningStatistics
Statistics about tuning results

Enums§

Backend
Hardware backend types
Operation
Kernel operation types for tuning

Functions§

get_kernel_tuner
Get or initialize the global kernel tuner
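A "get or initialize" global is commonly built in Rust with `std::sync::OnceLock`. The sketch below shows that pattern under stated assumptions: `DemoTuner` and `global_tuner` are hypothetical names, and nothing here is taken from the actual implementation or signature of `get_kernel_tuner`.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for a tuner with mutable state.
#[derive(Debug, Default)]
struct DemoTuner {
    tuned_ops: usize,
}

/// Return the process-wide tuner, creating it on first use.
/// `OnceLock` guarantees the initializer runs at most once, even when
/// several threads race to call this function; the `Mutex` serializes
/// later mutation of the shared state.
fn global_tuner() -> &'static Mutex<DemoTuner> {
    static TUNER: OnceLock<Mutex<DemoTuner>> = OnceLock::new();
    TUNER.get_or_init(|| Mutex::new(DemoTuner::default()))
}
```

Every call returns a reference to the same instance, so state recorded through one call site, for example `global_tuner().lock().unwrap().tuned_ops += 1`, is visible at every other call site.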