pub struct MoeOffloadEstimate {
pub expert_param_bytes: usize,
pub num_moe_layers: usize,
pub num_experts: usize,
pub gpu_expert_budget_per_layer: usize,
pub all_expert_weight_bytes: usize,
pub resident_expert_weight_bytes: usize,
}
Estimates peak memory for running graph on a session bound to
registry. This is pure analysis: it runs the memory planner internally
and queries the registry for weight bytes, but doesn’t compile or
execute.
MoE offload sizing (TIDE enable_predictive_expert_offload).
Fields
expert_param_bytes: usize
Bytes for one expert FFN (gate+up+down) at runtime dtype.
num_moe_layers: usize
num_experts: usize
gpu_expert_budget_per_layer: usize
Experts pinned on device per layer after budget clamp.
all_expert_weight_bytes: usize
All experts resident on host+device (upper bound).
resident_expert_weight_bytes: usize
Only gpu_expert_budget_per_layer experts per layer on device.
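The field descriptions suggest how the two byte totals relate to the per-expert size. The sketch below is an illustration under stated assumptions, not the crate's implementation: MoeSizingSketch, size_moe_sketch, and requested_budget_per_layer are hypothetical names, and both the clamp (budget limited to num_experts) and the products (per-expert bytes × experts per layer × MoE layers) are inferred from the field docs rather than confirmed by them.

```rust
/// Hypothetical mirror of the documented fields (illustration only,
/// not the real `MoeOffloadEstimate`).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MoeSizingSketch {
    pub expert_param_bytes: usize,
    pub num_moe_layers: usize,
    pub num_experts: usize,
    pub gpu_expert_budget_per_layer: usize,
    pub all_expert_weight_bytes: usize,
    pub resident_expert_weight_bytes: usize,
}

/// Assumed derivation of the two totals from the per-expert size.
pub fn size_moe_sketch(
    expert_param_bytes: usize,
    num_moe_layers: usize,
    num_experts: usize,
    requested_budget_per_layer: usize,
) -> MoeSizingSketch {
    // Assumption: the "budget clamp" limits the budget to the expert count.
    let budget = requested_budget_per_layer.min(num_experts);

    // Bytes for `experts_per_layer` experts in every MoE layer.
    // Saturating arithmetic avoids overflow panics on very large models.
    let total = |experts_per_layer: usize| {
        expert_param_bytes
            .saturating_mul(experts_per_layer)
            .saturating_mul(num_moe_layers)
    };
    let all_bytes = total(num_experts);
    let resident_bytes = total(budget);

    MoeSizingSketch {
        expert_param_bytes,
        num_moe_layers,
        num_experts,
        gpu_expert_budget_per_layer: budget,
        all_expert_weight_bytes: all_bytes,
        resident_expert_weight_bytes: resident_bytes,
    }
}
```

Under these assumptions, 4 MoE layers of 8 experts at 1000 bytes each with a budget of 2 experts per layer would give 32000 bytes for all experts and 8000 bytes resident on device.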
Implementations
impl MoeOffloadEstimate
pub fn peak_with_offload(&self, base: &MemoryEstimate) -> usize
Resident expert weights + non-expert peak from MemoryEstimate.
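A minimal sketch of the sum described above, assuming the base estimate exposes a peak that excludes expert weights. BaseEstimateSketch, its non_expert_peak_bytes field, and OffloadSketch are hypothetical stand-ins; the real MemoryEstimate fields are not shown on this page, so the actual method may derive the non-expert peak differently.

```rust
/// Hypothetical stand-in for `MemoryEstimate`; the real type's fields
/// are not documented on this page.
pub struct BaseEstimateSketch {
    /// Assumption: peak bytes with all expert weights excluded.
    pub non_expert_peak_bytes: usize,
}

/// Hypothetical stand-in carrying only the field this sum needs.
pub struct OffloadSketch {
    pub resident_expert_weight_bytes: usize,
}

impl OffloadSketch {
    /// Resident expert weights plus the non-expert peak, as the
    /// one-line description states. Saturating add avoids overflow.
    pub fn peak_with_offload(&self, base: &BaseEstimateSketch) -> usize {
        self.resident_expert_weight_bytes
            .saturating_add(base.non_expert_peak_bytes)
    }
}
```

With 8000 resident expert bytes and a non-expert peak of 5000 bytes, this sketch returns 13000.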
Trait Implementations
impl Clone for MoeOffloadEstimate
fn clone(&self) -> MoeOffloadEstimate
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl Freeze for MoeOffloadEstimate
impl RefUnwindSafe for MoeOffloadEstimate
impl Send for MoeOffloadEstimate
impl Sync for MoeOffloadEstimate
impl Unpin for MoeOffloadEstimate
impl UnsafeUnpin for MoeOffloadEstimate
impl UnwindSafe for MoeOffloadEstimate
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.