pub struct Tuner { /* private fields */ }
Workgroup-size autotuner.
Implementations
impl Tuner
pub fn new(adapter_fp: &str, mode: Mode) -> Self
Build a new tuner for the adapter fingerprinted as adapter_fp.
pub fn cache_path_for_adapter(adapter_fp: &str) -> PathBuf
Cache file path for a given adapter fingerprint.
pub fn candidates_for(&self, max_invocations: u32) -> Vec<u32>
Candidate workgroup sizes bounded by max_invocations.
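The docs do not say which sizes candidates_for returns. A minimal sketch of one plausible scheme, assuming power-of-two sizes starting at 32 (the function name and the starting size are hypothetical, not taken from this crate):

```rust
/// Hypothetical stand-in for `Tuner::candidates_for`: power-of-two sizes
/// from 32 up to and including `max_invocations`. The real candidate set
/// may differ.
fn power_of_two_candidates(max_invocations: u32) -> Vec<u32> {
    let mut out = Vec::new();
    let mut size: u32 = 32;
    while size <= max_invocations {
        out.push(size);
        // `checked_mul` stops the loop instead of overflowing when
        // `max_invocations` is close to `u32::MAX`.
        match size.checked_mul(2) {
            Some(next) => size = next,
            None => break,
        }
    }
    out
}
```

Under this assumption, a limit below 32 yields an empty list, so a caller would fall back to the default workgroup size.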
pub const fn default_workgroup_size() -> [u32; 3]
Default workgroup size used without cache data.
pub fn resolve(&self, program_fp: &str) -> [u32; 3]
Resolve the workgroup size for a program key.
pub fn resolve_key(&self, key: &TunerProgramKey) -> [u32; 3]
Resolve the workgroup size for a typed program/static-shape key.
pub fn record_decision(&mut self, program_fp: impl Into<String>, size: [u32; 3])
Record a sweep outcome in memory.
pub fn record_key_decision(&mut self, key: TunerProgramKey, size: [u32; 3])
Record a sweep outcome for a typed key.
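The record and resolve methods read as a keyed lookup with a default fallback. A minimal sketch of that pattern with string keys only, assuming resolve returns the default size when nothing was recorded; DecisionTable and the FALLBACK_SIZE value are hypothetical stand-ins for the tuner's private state:

```rust
use std::collections::HashMap;

/// Hypothetical fallback. The value `default_workgroup_size` actually
/// returns is not stated in these docs.
const FALLBACK_SIZE: [u32; 3] = [64, 1, 1];

/// Minimal in-memory decision table standing in for the tuner's
/// private fields.
#[derive(Default)]
struct DecisionTable {
    decisions: HashMap<String, [u32; 3]>,
}

impl DecisionTable {
    /// Store (or overwrite) the chosen size for a program fingerprint.
    fn record_decision(&mut self, program_fp: impl Into<String>, size: [u32; 3]) {
        self.decisions.insert(program_fp.into(), size);
    }

    /// Look up a recorded size, falling back to the default when the
    /// fingerprint has no entry.
    fn resolve(&self, program_fp: &str) -> [u32; 3] {
        self.decisions
            .get(program_fp)
            .copied()
            .unwrap_or(FALLBACK_SIZE)
    }
}
```

The typed-key variants would follow the same shape with TunerProgramKey as the map key, provided that type implements Hash and Eq, which these docs do not confirm.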
pub fn best_of<T: BackendTimer>(
    &self,
    program: &Program,
    candidates: impl IntoIterator<Item = [u32; 3]>,
    timer: &mut T,
) -> Result<Option<TuningMeasurement>, T::Error>
Measure candidate sizes and choose the fastest one.
Errors
Returns a backend timing error from BackendTimer.
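The return type suggests a plain best-of-N loop: time each candidate, keep the fastest, propagate the first timer error, and presumably yield None for an empty candidate list. A minimal sketch under those assumptions; SketchTimer, Measurement, and FakeTimer are hypothetical, since the real BackendTimer and TuningMeasurement definitions are not shown here:

```rust
/// Hypothetical timer trait. The real `BackendTimer` methods are not
/// shown in these docs.
trait SketchTimer {
    type Error;
    /// Time one run at `size`, returning elapsed nanoseconds.
    fn time_ns(&mut self, size: [u32; 3]) -> Result<u64, Self::Error>;
}

/// Hypothetical stand-in for `TuningMeasurement`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Measurement {
    size: [u32; 3],
    elapsed_ns: u64,
}

/// Time every candidate and keep the fastest one.
fn best_of<T: SketchTimer>(
    candidates: impl IntoIterator<Item = [u32; 3]>,
    timer: &mut T,
) -> Result<Option<Measurement>, T::Error> {
    let mut best: Option<Measurement> = None;
    for size in candidates {
        // `?` returns the first timer error to the caller.
        let elapsed_ns = timer.time_ns(size)?;
        // Strict `<` keeps the earliest candidate on ties.
        if best.map_or(true, |b| elapsed_ns < b.elapsed_ns) {
            best = Some(Measurement { size, elapsed_ns });
        }
    }
    Ok(best)
}

/// Toy timer for demonstration: pretends 128-wide groups are fastest
/// and rejects a zero-width workgroup.
struct FakeTimer;

impl SketchTimer for FakeTimer {
    type Error = String;

    fn time_ns(&mut self, size: [u32; 3]) -> Result<u64, String> {
        if size[0] == 0 {
            return Err("zero-sized workgroup".to_string());
        }
        Ok(1_000 + u64::from(size[0].abs_diff(128)))
    }
}
```

Keeping the timer behind a trait with an associated error type lets each backend report failures in its own type, which matches the T::Error in the signature above.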
pub fn best_of_natural_gradient<T: BackendTimer>(
    &self,
    program: &Program,
    candidates: impl IntoIterator<Item = [u32; 3]>,
    timer: &mut T,
    fisher_inv_sqrt_q16: &[u32],
    policy: NaturalGradientPolicy,
) -> Result<Result<NaturalGradientTuningStep, NaturalGradientTuningError>, T::Error>
Measure candidates, then choose the next probe with a Fisher-preconditioned natural-gradient policy.
This is the concrete runtime handoff for VYRE_AUTOTUNER=natural.
It reuses the same backend timer as Self::best_of, records every
measured candidate, and feeds those measurements into
NaturalGradientPolicy. The returned step includes both the raw
fastest measurement and the Fisher-directed next candidate.
Errors
Returns backend timing errors from BackendTimer or policy errors
from NaturalGradientPolicy.
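The nested Result keeps the two failure sources apart: the outer layer carries timer errors, the inner layer carries policy errors. A minimal sketch of how a caller might match on that shape, using [u32; 3] and String as hypothetical stand-ins for the real step and error types:

```rust
/// Hypothetical flattened view of the three possible outcomes.
#[derive(Debug, PartialEq)]
enum Outcome {
    NextProbe([u32; 3]),
    PolicyRejected(String),
    TimerFailed(String),
}

/// Flatten a `Result<Result<Step, PolicyError>, TimerError>` into one
/// outcome. `[u32; 3]` and `String` stand in for the real types.
fn classify(result: Result<Result<[u32; 3], String>, String>) -> Outcome {
    match result {
        // Timing succeeded and the policy produced a next candidate.
        Ok(Ok(next)) => Outcome::NextProbe(next),
        // Timing succeeded but the policy refused its input.
        Ok(Err(policy_err)) => Outcome::PolicyRejected(policy_err),
        // The backend timer failed before the policy ran.
        Err(timer_err) => Outcome::TimerFailed(timer_err),
    }
}
```

With this shape, applying `?` once to the outer Result propagates only timer failures and leaves the policy result for the caller to inspect.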
pub fn natural_gradient_step(
    &self,
    measurements: &[TuningMeasurement],
    fisher_inv_sqrt_q16: &[u32],
    policy: NaturalGradientPolicy,
) -> Result<NaturalGradientTuningStep, NaturalGradientTuningError>
Convert measured candidates into a Fisher-preconditioned next probe.
This keeps the best-of-N timing hook compatible while giving CUDA and
other GPU backends a richer update rule than “pick the current fastest
sample forever.” Backends can feed fisher_inv_sqrt_q16 from the
primitive-backed natural-gradient self-substrate path.
Errors
Returns NaturalGradientTuningError when the policy input is
malformed.
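The fisher_inv_sqrt_q16 argument is a slice of u32 values. A minimal sketch of one plausible encoding, assuming the _q16 suffix means unsigned fixed point with 16 fractional bits (Q16.16); these docs do not confirm that layout, and the helper names below are hypothetical:

```rust
/// Number of fractional bits, assuming `_q16` means Q16.16 fixed point.
const FRAC_BITS: u32 = 16;

/// Encode a value as Q16.16. Float-to-int `as` casts saturate, so
/// negative inputs and NaN become 0 and oversized inputs become
/// `u32::MAX`.
fn to_q16(value: f64) -> u32 {
    (value * f64::from(1u32 << FRAC_BITS)).round() as u32
}

/// Decode a Q16.16 value back to floating point.
fn from_q16(raw: u32) -> f64 {
    f64::from(raw) / f64::from(1u32 << FRAC_BITS)
}

/// Multiply two Q16.16 values. The intermediate product is widened to
/// u64; the final cast truncates if the result exceeds the Q16.16 range.
fn mul_q16(a: u32, b: u32) -> u32 {
    ((u64::from(a) * u64::from(b)) >> FRAC_BITS) as u32
}
```

Under this assumption, a backend would scale each gradient component by the matching fisher_inv_sqrt_q16 entry with a multiply like mul_q16 before choosing the next probe.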