Struct CompiledGraph 

pub struct CompiledGraph { /* private fields */ }

A compiled graph ready for execution.

Created by crate::Session::compile. Holds the fused, memory-planned graph and all pre-allocated execution state. Call CompiledGraph::run repeatedly with different inputs: zero allocation per call.

Implementations

impl CompiledGraph

pub fn device(&self) -> Device

Which device this graph runs on.

pub fn set_param(&mut self, name: &str, data: &[f32])

Set a named parameter (model weight). Call once per parameter after compilation.

pub fn run(&mut self, inputs: &[(&str, &[f32])]) -> Vec<Vec<f32>>

Execute the graph with named inputs. Returns one Vec<f32> per graph output (copies from arena).
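The calling convention can be sketched without the crate itself. In the minimal example below, the closure parameter stands in for `|inputs| graph.run(inputs)` and has the same shape as the signature above. The helper name `run_many` and the graph input name "x" are hypothetical.

```rust
/// Calls a `run`-shaped closure once per batch and collects every result.
///
/// `run` stands in for `|inputs| graph.run(inputs)`; it is an assumption of
/// this sketch, as is the graph input name "x".
fn run_many<F>(mut run: F, batches: &[Vec<f32>]) -> Vec<Vec<Vec<f32>>>
where
    F: FnMut(&[(&str, &[f32])]) -> Vec<Vec<f32>>,
{
    let mut results = Vec::with_capacity(batches.len());
    for batch in batches {
        // One (name, data) pair per graph input.
        let inputs: [(&str, &[f32]); 1] = [("x", batch.as_slice())];
        results.push(run(&inputs[..]));
    }
    results
}
```

Each entry of the result holds one Vec<f32> per graph output for the corresponding batch.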

pub fn run_raw(&mut self, inputs: &[(&str, &[f32])]) -> Vec<(*const f32, usize)>

Execute and return raw pointers to output data (zero-copy). Data is valid until the next run/run_raw call.

§Safety

The returned pointers point into the arena. Do not use after the next call to run/run_raw (arena data will be overwritten).
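One way to respect this lifetime rule is to copy the data out before the next run. The helper below is a minimal sketch using only the standard library. Its name is hypothetical, and it assumes the usize in each returned pair counts f32 elements.

```rust
use std::slice;

/// Copies one `(pointer, length)` pair, as returned by `run_raw`, into an owned Vec.
///
/// Assumption of this sketch: `len` counts f32 elements, not bytes.
///
/// # Safety
/// `ptr` must point to `len` initialized, properly aligned `f32` values that
/// stay valid for the duration of this call.
unsafe fn copy_raw_output(ptr: *const f32, len: usize) -> Vec<f32> {
    if len == 0 || ptr.is_null() {
        return Vec::new();
    }
    // SAFETY: the caller guarantees `ptr` covers `len` valid f32 values.
    unsafe { slice::from_raw_parts(ptr, len) }.to_vec()
}
```

Copying gives up the zero-copy benefit, so it only makes sense for outputs that must outlive the next run/run_raw call.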

pub fn run_slots(&mut self, inputs: &[&[f32]]) -> &[(usize, usize)]

Fastest execution: inputs by slot index (order matches graph input declaration). Returns output (offset, len) pairs. Read data via arena_ptr().add(offset). Zero HashMap lookup, zero Vec allocation, zero name matching.
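Reading an output through the arena pointer might look like the sketch below. It is not taken from the crate: the helper name is hypothetical, and it assumes that offset is a byte offset (because arena_ptr returns *const u8) and that len counts f32 elements. Both assumptions should be checked against the crate source.

```rust
/// Reads `len` f32 values starting `offset` bytes into the arena.
///
/// Assumptions of this sketch: `offset` is a byte offset and `len` counts
/// f32 elements. Verify both against the crate before relying on this.
///
/// # Safety
/// `arena` must be valid for reads of `offset + len * 4` bytes.
unsafe fn read_arena_f32(arena: *const u8, offset: usize, len: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(len);
    // SAFETY: the caller guarantees the range lies inside the arena.
    let base = unsafe { arena.add(offset).cast::<f32>() };
    for i in 0..len {
        // read_unaligned avoids assuming the offset is 4-byte aligned.
        out.push(unsafe { base.add(i).read_unaligned() });
    }
    out
}
```

In a hot loop the copy into a Vec would defeat the purpose of run_slots; the same pointer arithmetic applies when reading the values in place.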

pub fn arena_ptr(&self) -> *const u8

Arena pointer for reading output data after run_slots.

pub fn bind_handle(&mut self, name: &str, data: &[f32]) -> bool

Bind a persistent buffer (KV-cache, optimizer state, etc.). Stays alive across run() calls; the backend uses it as the graph input with the matching name. Returns true if the backend supports persistent handles.

pub fn read_handle(&self, name: &str) -> Option<Vec<f32>>

Read the current contents of a persistent buffer.

pub fn bind_gpu_handle(&mut self, name: &str, data: &[f32]) -> bool

GPU-resident MLX input (no-op on non-MLX backends).

pub fn has_gpu_handle(&self, name: &str) -> bool

pub fn set_gpu_handle_feed(&mut self, handle_name: &str, output_index: usize) -> bool

pub fn read_gpu_handle(&self, name: &str) -> Option<Vec<f32>>

pub fn run_feed_gpu_handle(&mut self, inputs: &[(&str, &[f32])], handle_name: &str, output_index: usize) -> Option<Vec<f32>>

Run, refresh GPU handle from output, return that output vector.

pub fn set_active_extent(&mut self, extent: Option<(usize, usize)>)

Hint subsequent run calls to process only the first actual rows along the bucket axis, out of the upper rows the graph was compiled for (the compile extent). Backends that support per-kernel active-extent dispatch honor the hint; others ignore it. Pass None to clear it.

See BucketedCompileCache::run_padded for the canonical caller.
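The padding that such a caller has to do around a bucketed graph can be sketched as follows. This is not the implementation of run_padded. The helper names are hypothetical, and the sketch assumes a row-major layout, a bucket axis that is the leading (row) axis, and zero padding.

```rust
/// Pads a row-major `[actual, cols]` buffer with zeros up to `[upper, cols]`.
/// Zero padding and the row-major layout are assumptions of this sketch.
fn pad_rows(data: &[f32], actual: usize, upper: usize, cols: usize) -> Vec<f32> {
    assert!(actual <= upper, "actual rows must not exceed the compiled extent");
    assert_eq!(data.len(), actual * cols, "data must hold actual * cols values");
    let mut padded = vec![0.0f32; upper * cols];
    padded[..actual * cols].copy_from_slice(data);
    padded
}

/// Keeps only the first `actual` rows of a row-major `[upper, cols]` output.
fn trim_rows(output: &[f32], actual: usize, cols: usize) -> &[f32] {
    &output[..actual * cols]
}
```

With the active extent set, a supporting backend can skip the padded rows; trimming the output is still needed to drop them from the result.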

pub fn set_moe_resident_experts(&mut self, mask: &[bool])

TIDE merged MoE placement: mask[expert] marks an expert as device-resident if any layer has it resident.

pub fn set_moe_resident_experts_per_layer(&mut self, masks: &[&[bool]])

Per MoE layer placement (forward order). Preferred on CPU over merged mask.
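The relationship between the two mask forms, where an expert counts as resident in the merged mask if any layer has it, amounts to a per-expert logical OR. The helper below is a minimal sketch of that reduction; its name is hypothetical and it is not part of the crate.

```rust
/// Merges per-layer residency masks with a logical OR per expert.
/// Layers with shorter masks are treated as `false` for the missing experts.
fn merge_resident_masks(per_layer: &[&[bool]]) -> Vec<bool> {
    let num_experts = per_layer.iter().map(|m| m.len()).max().unwrap_or(0);
    let mut merged = vec![false; num_experts];
    for mask in per_layer {
        for (expert, &resident) in mask.iter().enumerate() {
            merged[expert] |= resident;
        }
    }
    merged
}
```

The result has the shape expected by set_moe_resident_experts, while the unmerged slices match set_moe_resident_experts_per_layer.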

pub fn enable_moe_topk_capture(&mut self, num_experts: usize) -> bool

Capture MoE router TopK on next forward (CPU). Returns false if unsupported.

pub fn take_moe_topk_capture(&mut self) -> Option<Vec<Vec<u32>>>

Per-layer expert indices from the last forward (MoE router TopK order).
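A captured result can be summarized with a simple histogram, for example to see which experts the router selects most often. The sketch below assumes each inner Vec holds one layer's flattened expert indices; the helper name is hypothetical.

```rust
/// Counts how often each expert index appears across all captured layers.
fn expert_histogram(capture: &[Vec<u32>], num_experts: usize) -> Vec<usize> {
    let mut counts = vec![0usize; num_experts];
    for layer in capture {
        for &expert in layer {
            // Skip out-of-range indices instead of panicking.
            if let Some(slot) = counts.get_mut(expert as usize) {
                *slot += 1;
            }
        }
    }
    counts
}
```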

pub fn take_moe_residency_stats(&mut self) -> Option<MoeResidencyStats>

GroupedMatMul GPU/CPU token accounting from the last forward (CPU).

pub fn commit_no_wait(&mut self, inputs: &[(&str, &[f32])])

Encode + commit a forward pass without waiting for the device.

Outputs of intermediate calls are overwritten; use run_pipelined when you need each call’s outputs back. Pair with sync_pending to drain the queue. The CPU backend is synchronous, so there this falls back to run.

pub fn sync_pending(&mut self)

Wait for every command queued by commit_no_wait. CPU is a no-op.

pub fn run_pipelined(&mut self, input_sets: &[Vec<(&str, &[f32])>]) -> Vec<Vec<Vec<f32>>>

Pipelined batch run. Issues one commit per input set, syncs once at the end. On Metal, each commit gets its own output snapshot (allocated + blit-copied), so subsequent commits stomping the shared arena don’t corrupt earlier runs’ outputs. Returns out[run_idx][output_idx][element_idx].
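Building the input_sets argument and indexing the nested result can be sketched with plain standard-library code. Both helpers below are hypothetical and assume a graph with a single named input.

```rust
/// Builds one input set per batch for a graph with a single named input.
fn build_input_sets<'a>(
    name: &'a str,
    batches: &'a [Vec<f32>],
) -> Vec<Vec<(&'a str, &'a [f32])>> {
    batches
        .iter()
        .map(|batch| vec![(name, batch.as_slice())])
        .collect()
}

/// Picks one element out of a result laid out as
/// out[run_idx][output_idx][element_idx]; returns None when out of range.
fn element_at(
    out: &[Vec<Vec<f32>>],
    run_idx: usize,
    output_idx: usize,
    element_idx: usize,
) -> Option<f32> {
    out.get(run_idx)?.get(output_idx)?.get(element_idx).copied()
}
```

A reference to the Vec returned by build_input_sets has the type that run_pipelined takes for input_sets.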

pub fn set_param_typed(&mut self, name: &str, data: &[u8], dtype: DType)

Set a named parameter from raw bytes in the given dtype. The backend widens the data to f32 on the way in, or keeps it at its original width (zero-widen) when it supports the dtype natively. This lets callers feed F16/BF16 weights without a host-side cast.
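What widening to f32 means for BF16 data can be illustrated in isolation. BF16 is the upper 16 bits of an IEEE 754 f32, so widening is a 16-bit left shift. The sketch below is not the backend's code: the helper names are hypothetical, and the little-endian byte order and truncating encode are assumptions.

```rust
/// Encodes f32 values as BF16 bytes by keeping the upper 16 bits of each value.
///
/// This truncates the mantissa (no rounding) and writes little-endian bytes;
/// both choices are assumptions of this sketch.
fn f32_to_bf16_bytes(values: &[f32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(values.len() * 2);
    for v in values {
        let upper = (v.to_bits() >> 16) as u16;
        bytes.extend_from_slice(&upper.to_le_bytes());
    }
    bytes
}

/// Widens little-endian BF16 bytes back to f32 by zero-filling the low 16 bits.
fn bf16_bytes_to_f32(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(2)
        .map(|pair| f32::from_bits(u32::from(u16::from_le_bytes([pair[0], pair[1]])) << 16))
        .collect()
}
```

Values that are exactly representable in BF16, such as 1.0 and -2.5, survive the round trip unchanged; others lose their low mantissa bits.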

pub fn run_typed(&mut self, inputs: &[(&str, &[u8], DType)]) -> Vec<(Vec<u8>, DType)>

Execute with typed inputs and return outputs in their declared graph dtype, byte-encoded. Mirrors the wgpu / MLX zero-widen semantics on f32-arena backends (CPU + Metal) by widening at the boundary.
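For f32 data, moving between values and the byte-encoded form can be done with the standard library alone. The helpers below are a minimal sketch with hypothetical names. They assume native byte order, which should be confirmed against the crate.

```rust
/// Encodes f32 values as raw bytes in native byte order (an assumption here).
fn f32s_to_bytes(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_ne_bytes()).collect()
}

/// Decodes raw native-endian bytes back into f32 values.
/// Returns None when the byte length is not a multiple of four.
fn bytes_to_f32s(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```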

Trait Implementations

impl Clone for CompiledGraph

fn clone(&self) -> Self

Deep-clones the underlying executable via ExecutableGraph::clone_box. Backends that don’t support cloning will panic at this point.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
