Module expert_pool

MoE expert residency pool (TIDE-style predictive offload).

Mirrors the policy in ims-kdks/TIDE LLaDA2MoeSparseMoeBlock: rank experts by token hits, refresh placement every τ steps, and pair each promotion with a demotion to limit PCIe churn.

Only placement changes: router logits and expert indices are unaffected.
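
The policy described above can be sketched roughly as follows. This is a minimal illustration, not this module's implementation: `SketchPool`, its fields, and its methods are hypothetical names, and the tie-breaking rule (lower expert index first) and the reset of hit counters after each refresh are assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a residency pool; NOT the crate's `ExpertPool`.
struct SketchPool {
    hits: Vec<u64>,           // token hits per logical expert since the last refresh
    resident: HashSet<usize>, // experts currently assumed to be on the GPU
    budget: usize,            // maximum number of GPU-resident experts
    refresh_every: u64,       // τ: refresh placement every this many steps
    step: u64,
}

impl SketchPool {
    fn new(num_experts: usize, budget: usize, refresh_every: u64) -> Self {
        let budget = budget.min(num_experts);
        SketchPool {
            hits: vec![0; num_experts],
            resident: (0..budget).collect(), // assumption: start with the first `budget` experts
            budget,
            refresh_every,
            step: 0,
        }
    }

    /// Count one hit per routed token; the routing itself is left untouched.
    fn record(&mut self, routed: &[usize]) {
        for &e in routed {
            self.hits[e] += 1;
        }
    }

    /// Advance one step. On a refresh step, returns (promoted, demoted) pairs.
    fn end_step(&mut self) -> Option<Vec<(usize, usize)>> {
        self.step += 1;
        if self.refresh_every == 0 || self.step % self.refresh_every != 0 {
            return None;
        }
        // Rank experts by hits, descending; ties go to the lower index.
        let mut order: Vec<usize> = (0..self.hits.len()).collect();
        order.sort_by(|&a, &b| self.hits[b].cmp(&self.hits[a]).then(a.cmp(&b)));
        let want: HashSet<usize> = order.iter().copied().take(self.budget).collect();

        // Hottest non-resident experts first, coldest resident experts first.
        let promote: Vec<usize> = order
            .iter()
            .copied()
            .filter(|e| want.contains(e) && !self.resident.contains(e))
            .collect();
        let mut demote: Vec<usize> = self
            .resident
            .iter()
            .copied()
            .filter(|e| !want.contains(e))
            .collect();
        demote.sort_by(|&a, &b| self.hits[a].cmp(&self.hits[b]).then(a.cmp(&b)));

        // Pair each promotion with one demotion so the resident count stays fixed.
        let pairs: Vec<(usize, usize)> = promote.into_iter().zip(demote).collect();
        for &(p, d) in &pairs {
            self.resident.remove(&d);
            self.resident.insert(p);
        }
        self.hits.iter_mut().for_each(|h| *h = 0);
        Some(pairs)
    }
}
```

Because every swap moves exactly one expert in and one out, the number of transfers per refresh is bounded by the number of experts whose rank crossed the budget line, which is the point of pairing promotions with demotions.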

Structs§

ExpertPool
Tracks which logical experts are GPU-resident and applies TIDE placement updates.
ExpertPoolConfig
Configuration for ExpertPool.
ExpertPoolStats
Cumulative counters (TIDE offload_stats).
ExpertRefreshResult
Result of one placement refresh.

Enums§

ExpertRefreshPolicy
When to re-run hit counting and expert placement.
MoEExecMode
Per-forward hint from the runner (maps to TIDE refresh_experts).

Functions§

gpu_expert_budget_from_vram
merged_resident_mask
Union of GPU-resident experts across per-layer pools (legacy single graph mask).
per_layer_resident_masks
Per-layer resident bitmasks (TIDE placement; one row per MoE FFN in forward order).
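
To illustrate how a merged mask relates to per-layer masks, here is a small sketch. It assumes each layer's residency fits in a `u64` bitmask (at most 64 experts, bit `e` set when expert `e` is GPU-resident). The helper names `layer_mask` and `merge_masks` are hypothetical and are not the signatures of `per_layer_resident_masks` or `merged_resident_mask`.

```rust
/// Hypothetical helper: build one layer's bitmask from its resident expert indices.
fn layer_mask(resident: &[usize]) -> u64 {
    resident.iter().fold(0u64, |mask, &e| {
        assert!(e < 64, "this sketch supports at most 64 experts per layer");
        mask | (1u64 << e)
    })
}

/// Hypothetical helper: union of all per-layer masks (bitwise OR).
/// An expert index is set in the result if it is resident in at least one layer.
fn merge_masks(per_layer: &[u64]) -> u64 {
    per_layer.iter().fold(0u64, |acc, &m| acc | m)
}
```

For example, with experts 0 and 2 resident in one layer and expert 1 in another, the per-layer masks are `0b101` and `0b010`, and their union is `0b111`. The merged form loses the information about which layer holds which expert, which is consistent with the description of it as a legacy single-graph mask.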