Struct MegakernelLaunchPolicy 

pub struct MegakernelLaunchPolicy {
    pub sizing: MegakernelSizingPolicy,
    pub min_hit_capacity: u32,
    pub hit_capacity_multiplier: u32,
    pub saturated_waves: u32,
    pub hot_opcode_threshold: u32,
    pub hot_window_threshold: u32,
    pub jit_queue_len_threshold: u32,
    pub priority_age_threshold: u32,
    pub sparse_frontier_threshold_bps: u16,
    pub dense_frontier_threshold_bps: u16,
    pub memory_pressure_threshold_bps: u16,
    pub fusion_edge_threshold: u32,
    pub scratch_bytes_per_hit: u32,
}

Single policy surface for megakernel launch sizing and telemetry-driven routing.

Fields§

§sizing: MegakernelSizingPolicy

Sizing policy for worker counts and grid geometry.

§min_hit_capacity: u32

Minimum capacity for sparse-hit results.

§hit_capacity_multiplier: u32

Multiplier applied to the expected hit count to determine the sparse-hit capacity.
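
One plausible reading of how min_hit_capacity and hit_capacity_multiplier combine is sketched below. The helper name hit_capacity and the saturating multiply are assumptions made for illustration; the crate's actual formula is not shown on this page, and it may report an error instead of saturating.

```rust
/// Hypothetical helper: scale the expected hit count by the multiplier,
/// then clamp the result from below by the minimum capacity.
/// Saturating arithmetic is an assumption; it keeps the product inside `u32`.
fn hit_capacity(expected_hits: u32, min_hit_capacity: u32, hit_capacity_multiplier: u32) -> u32 {
    expected_hits
        .saturating_mul(hit_capacity_multiplier)
        .max(min_hit_capacity)
}
```

With a minimum of 64 and a multiplier of 4, ten expected hits still reserve 64 entries, while one hundred expected hits reserve 400.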

§saturated_waves: u32

Number of waves at which a queue is considered saturated.

§hot_opcode_threshold: u32

Threshold for promoting hot opcodes to JIT.

§hot_window_threshold: u32

Threshold for promoting hot windows to JIT.

§jit_queue_len_threshold: u32

Queue length threshold to prefer JIT over interpreter.

§priority_age_threshold: u32

Priority age at which aging promotions are triggered.

§sparse_frontier_threshold_bps: u16

Frontier density at or below this value uses sparse expansion.

§dense_frontier_threshold_bps: u16

Frontier density at or above this value uses dense propagation.
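
The two frontier thresholds can be read as an inclusive three-way split. The sketch below assumes that the _bps suffix means basis points (10_000 bps = 100%) and that densities strictly between the thresholds take some intermediate route; FrontierRoute, density_bps, and classify_frontier are hypothetical names, not part of the documented API.

```rust
/// Hypothetical route labels; the crate's real topology type is not shown on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FrontierRoute {
    Sparse,
    Balanced, // assumed name for the band between the two thresholds
    Dense,
}

/// Express `active / total` in basis points (10_000 bps = 100%).
/// The u64 intermediate keeps the multiplication from overflowing.
fn density_bps(active: u32, total: u32) -> u16 {
    if total == 0 {
        return 0; // an empty frontier is treated as zero density in this sketch
    }
    let bps = u64::from(active) * 10_000 / u64::from(total);
    bps.min(10_000) as u16
}

/// "At or below" the sparse threshold is sparse; "at or above" the dense
/// threshold is dense; everything in between is the assumed middle route.
fn classify_frontier(
    density: u16,
    sparse_threshold_bps: u16,
    dense_threshold_bps: u16,
) -> FrontierRoute {
    if density <= sparse_threshold_bps {
        FrontierRoute::Sparse
    } else if density >= dense_threshold_bps {
        FrontierRoute::Dense
    } else {
        FrontierRoute::Balanced
    }
}
```

Integer basis points keep the comparison exact and hashable, which fits a policy type that derives Eq and Hash.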

§memory_pressure_threshold_bps: u16

Memory pressure at or above this value uses the memory-constrained path.

§fusion_edge_threshold: u32

Minimum graph edge count before dense hot work is eligible for fusion.

§scratch_bytes_per_hit: u32

Conservative resident scratch bytes needed per sparse-hit entry.

Implementations§

impl MegakernelLaunchPolicy

pub const fn standard() -> Self

Standard launch policy used by VYRE megakernel dispatchers.

pub fn launch_cache_stats() -> MegakernelLaunchCacheStats

Return launch recommendation cache telemetry for the current thread.

pub fn reset_launch_cache_for_thread()

Clear launch recommendation cache entries and counters for this thread.
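
A per-thread cache with hit/miss counters can be built with std's thread_local! macro, as sketched below. This only illustrates the pattern: CacheStats, LaunchCache, the u64 key, the u32 value, and all three function names are stand-ins, since the fields of MegakernelLaunchCacheStats and the real cache layout are not shown on this page.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical counters standing in for `MegakernelLaunchCacheStats`.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct CacheStats {
    hits: u64,
    misses: u64,
}

#[derive(Default)]
struct LaunchCache {
    entries: HashMap<u64, u32>, // simplified: request key -> cached recommendation
    stats: CacheStats,
}

thread_local! {
    // One cache per thread, so lookups need no locking.
    static LAUNCH_CACHE: RefCell<LaunchCache> = RefCell::new(LaunchCache::default());
}

/// Return the cached value for `key`, computing and storing it on a miss.
fn cached_recommend(key: u64, compute: impl FnOnce() -> u32) -> u32 {
    LAUNCH_CACHE.with(|cell| {
        let mut cache = cell.borrow_mut();
        let cached = cache.entries.get(&key).copied();
        if let Some(value) = cached {
            cache.stats.hits += 1;
            return value;
        }
        cache.stats.misses += 1;
        // `compute` must not call back into the cache while the borrow is held.
        let value = compute();
        cache.entries.insert(key, value);
        value
    })
}

/// Snapshot the counters for the current thread.
fn cache_stats_sketch() -> CacheStats {
    LAUNCH_CACHE.with(|cell| cell.borrow().stats)
}

/// Drop all entries and zero the counters for the current thread.
fn reset_cache_sketch() {
    LAUNCH_CACHE.with(|cell| *cell.borrow_mut() = LaunchCache::default());
}
```

Because the state is thread-local, counters read on one thread say nothing about dispatch work done on another; aggregate them per thread if a global view is needed.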

pub fn recommend(&self, request: MegakernelLaunchRequest) -> Result<MegakernelLaunchRecommendation, BackendError>

Recommend geometry, hit capacity, and interpreter/JIT route.

§Errors

Returns BackendError when required adapter limits are zero or derived launch values cannot fit the u32 ring protocol.
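
The two error conditions above could be checked along the following lines. This is a sketch under stated assumptions: LaunchError stands in for BackendError, whose variants are not shown on this page, and workgroup_count is a hypothetical example of a derived launch value.

```rust
/// Stand-in for `BackendError`; the real variants are not documented here.
#[derive(Debug, PartialEq, Eq)]
enum LaunchError {
    ZeroAdapterLimit,
    ExceedsU32,
}

/// Hypothetical derived value: how many workgroups cover `work_items`.
/// Rejects a zero adapter limit and any count that does not fit in `u32`.
fn workgroup_count(work_items: u64, max_workgroup_size: u32) -> Result<u32, LaunchError> {
    if max_workgroup_size == 0 {
        return Err(LaunchError::ZeroAdapterLimit);
    }
    let size = u64::from(max_workgroup_size);
    // Ceiling division written so that it cannot overflow near u64::MAX.
    let groups = work_items / size + u64::from(work_items % size != 0);
    u32::try_from(groups).map_err(|_| LaunchError::ExceedsU32)
}
```

Doing the arithmetic in u64 and narrowing once at the end keeps the overflow check in a single place.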

pub fn recommend_with_topology_evidence(&self, request: MegakernelLaunchRequest) -> Result<(MegakernelLaunchRecommendation, MegakernelTopologyEvidence), BackendError>

Recommend a launch and emit topology evidence for parity benches.

§Errors

Returns BackendError when the underlying recommendation cannot be built from the request or adapter limits.

pub fn recommend_with_promotion_evidence(&self, request: MegakernelLaunchRequest) -> Result<(MegakernelLaunchRecommendation, MegakernelPromotionEvidence), BackendError>

Recommend a launch and emit hot opcode/window promotion evidence.

§Errors

Returns BackendError when the underlying recommendation cannot be built from the request or adapter limits.

pub fn recommend_with_previous_topology(&self, request: MegakernelLaunchRequest, previous_topology: MegakernelDispatchTopology) -> Result<MegakernelLaunchRecommendation, BackendError>

Recommend a launch while preserving the previous topology inside a narrow hysteresis band.

CUDA resident graphs and long-running dataflow streams should use this entry point when they can track the last successful topology. It prevents borderline frontier-density or memory-pressure telemetry from repeatedly switching kernel variants, invalidating launch plans, and disturbing cache locality at scale.
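
The hysteresis idea can be illustrated with a single threshold and a symmetric band. The sketch assumes a two-variant topology and an explicit band_bps parameter; neither the variants of MegakernelDispatchTopology nor the band width used by the crate are shown on this page.

```rust
/// Hypothetical two-variant stand-in for `MegakernelDispatchTopology`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Topology {
    Sparse,
    Dense,
}

/// Switch topology only when the density leaves the band
/// `[threshold - band, threshold + band]`; inside the band, keep `previous`.
fn choose_with_hysteresis(
    density_bps: u16,
    threshold_bps: u16,
    band_bps: u16,
    previous: Topology,
) -> Topology {
    let lower = threshold_bps.saturating_sub(band_bps);
    let upper = threshold_bps.saturating_add(band_bps);
    if density_bps > upper {
        Topology::Dense
    } else if density_bps < lower {
        Topology::Sparse
    } else {
        previous // borderline telemetry: do not flip the kernel variant
    }
}
```

With a threshold of 5000 and a band of 200, a reading of 5100 keeps whichever topology ran last, so a stream hovering near the threshold does not alternate between variants on every launch.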

§Errors

Returns BackendError when required adapter limits are zero or derived launch values cannot fit the u32 ring protocol.

pub fn autotune_hit_capacity_multiplier(&self, candidate_multipliers: &[u32], costs: &[f64]) -> u32

Select the best hit_capacity_multiplier from a candidate set.

candidate_multipliers are the multipliers to try; costs[i] is the observed dispatch latency (or any other metric to minimize) when candidate_multipliers[i] was used. The candidate with the lowest observed cost is selected.

Returns the chosen multiplier. If candidate_multipliers is empty, returns the policy’s existing hit_capacity_multiplier.
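
The selection rule described above amounts to an argmin over paired slices with a fallback. The sketch below is one way to write it; pick_min_cost is a hypothetical name, and both the NaN filtering and the handling of mismatched slice lengths (extra elements are ignored) are assumptions rather than documented behavior.

```rust
/// Return the candidate paired with the lowest cost, or `fallback`
/// when no usable (candidate, cost) pair exists.
fn pick_min_cost(candidates: &[u32], costs: &[f64], fallback: u32) -> u32 {
    candidates
        .iter()
        .zip(costs.iter()) // pairs up to the shorter slice
        .filter(|(_, cost)| !cost.is_nan()) // assumed: NaN measurements are skipped
        .min_by(|a, b| a.1.total_cmp(b.1)) // first minimum wins on ties
        .map(|(&candidate, _)| candidate)
        .unwrap_or(fallback)
}
```

Passing the policy's current hit_capacity_multiplier as fallback reproduces the documented empty-input behavior.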

pub fn autotune_workgroup_size(&self, candidate_sizes: &[u32], costs: &[f64], current_size: u32) -> u32

Select the best workgroup size from a candidate set.

candidate_sizes[i] is paired with costs[i] (lower is better). Returns the chosen size or the policy’s sizing.default_workgroup_size_x() fallback.

pub fn natural_gradient_autotune_step(m_inv_sqrt: &[f64], grad: &[f64], n: u32, learning_rate: f64) -> Vec<f64>

Compute the next-step parameter delta for a continuous autotune knob using a Fisher-preconditioned natural-gradient step.

m_inv_sqrt: inverse-square-root of the Fisher block (n×n row-major). Passing an identity matrix reduces the natural gradient to plain gradient descent.

grad: plain gradient ∂latency/∂param (length n).

Returns the parameter delta -learning_rate · m_inv_sqrt · grad, a vector of length n.

P-DRIVER-8: every continuous autotune knob (workgroup size, hit-capacity, fixpoint iteration count, …) should follow the natural-gradient direction by default. Fisher-preconditioned descent converges 5-10× faster than plain gradient descent on the elongated-valley latency surfaces typical of GPU autotuning.
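
The returned delta is a scaled matrix-vector product. A minimal sketch of that computation follows; natural_gradient_step is a hypothetical stand-in that takes n as usize for indexing convenience (the documented signature takes u32) and panics on mismatched lengths, which may differ from the crate's behavior.

```rust
/// Compute `-learning_rate * (m_inv_sqrt * grad)` for a row-major n×n matrix.
fn natural_gradient_step(
    m_inv_sqrt: &[f64],
    grad: &[f64],
    n: usize,
    learning_rate: f64,
) -> Vec<f64> {
    assert_eq!(m_inv_sqrt.len(), n * n, "matrix must be n×n, row-major");
    assert_eq!(grad.len(), n, "gradient must have length n");
    (0..n)
        .map(|row| {
            // Dot product of one matrix row with the gradient.
            let dot: f64 = m_inv_sqrt[row * n..(row + 1) * n]
                .iter()
                .zip(grad)
                .map(|(m, g)| m * g)
                .sum();
            -learning_rate * dot
        })
        .collect()
}
```

With the 2×2 identity matrix, a gradient of [1.0, -2.0], and a learning rate of 0.5, the result is [-0.5, 1.0], which is the plain gradient-descent step.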

pub fn try_natural_gradient_autotune_step(m_inv_sqrt: &[f64], grad: &[f64], n: u32, learning_rate: f64) -> Result<Vec<f64>, BackendError>

Compute the next-step parameter delta with fallible output staging.

§Errors

Returns BackendError when host staging cannot be reserved for the natural-gradient vector.

pub fn natural_gradient_autotune_step_into(m_inv_sqrt: &[f64], grad: &[f64], n: u32, learning_rate: f64, out: &mut Vec<f64>)

Compute the natural-gradient autotune step into caller-owned storage.

pub fn try_natural_gradient_autotune_step_into(m_inv_sqrt: &[f64], grad: &[f64], n: u32, learning_rate: f64, out: &mut Vec<f64>) -> Result<(), BackendError>

Compute the natural-gradient autotune step into caller-owned storage with fallible host staging.

§Errors

Returns BackendError when host staging cannot be reserved for the natural-gradient vector.
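
Fallible staging into caller-owned storage can be expressed with Vec::try_reserve_exact from the standard library. The sketch below assumes that this is roughly what "host staging cannot be reserved" refers to; try_step_into is a hypothetical name, it returns std's TryReserveError instead of BackendError, and it takes n as usize.

```rust
use std::collections::TryReserveError;

/// Write `-learning_rate * (m_inv_sqrt * grad)` into `out`, reusing its
/// allocation and reporting allocation failure instead of aborting.
/// Panics if the slices are shorter than `n * n` and `n` respectively.
fn try_step_into(
    m_inv_sqrt: &[f64],
    grad: &[f64],
    n: usize,
    learning_rate: f64,
    out: &mut Vec<f64>,
) -> Result<(), TryReserveError> {
    out.clear(); // keeps the existing capacity
    out.try_reserve_exact(n)?; // no-op when capacity already suffices
    for row in 0..n {
        let dot: f64 = m_inv_sqrt[row * n..(row + 1) * n]
            .iter()
            .zip(&grad[..n])
            .map(|(m, g)| m * g)
            .sum();
        out.push(-learning_rate * dot);
    }
    Ok(())
}
```

Reusing one out buffer across autotune iterations avoids a fresh allocation per step, which is presumably the motivation for the _into variants.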

Trait Implementations§

impl Clone for MegakernelLaunchPolicy

fn clone(&self) -> MegakernelLaunchPolicy

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. (Since 1.0.0; const: unstable.)

impl Copy for MegakernelLaunchPolicy

impl Debug for MegakernelLaunchPolicy

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for MegakernelLaunchPolicy

fn default() -> Self

Returns the “default value” for a type.

impl Eq for MegakernelLaunchPolicy

impl Hash for MegakernelLaunchPolicy

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher. (Since 1.3.0.)

impl PartialEq for MegakernelLaunchPolicy

fn eq(&self, other: &MegakernelLaunchPolicy) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason. (Since 1.0.0; const: unstable.)

impl StructuralPartialEq for MegakernelLaunchPolicy

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.