Job placement algorithm for multi-node adapter training (GPU-SHARE Phase 3, §3.3).
Scores each node for each adapter job and assigns greedily:
score = (free_vram / adapter_budget) × gpu_flops_factor × (1 / current_load)
Where gpu_flops_factor normalizes different GPU types:
- RTX 4090: 1.0 (reference)
- Jetson Orin: 0.06 (8 SMs vs 128)
- CPU (Intel): 0.01
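The formula above can be sketched in a few lines of Rust. This is an illustration only, not the crate's actual `score_node`: the `GpuKind` enum, the `score_sketch` function, the megabyte units, and the `MIN_LOAD` floor (added so an idle node does not divide by zero) are all assumptions introduced here.

```rust
/// Hypothetical GPU classes mirroring the factor list above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum GpuKind {
    Rtx4090,
    JetsonOrin,
    CpuIntel,
}

impl GpuKind {
    /// Throughput factor relative to the RTX 4090 reference (1.0).
    pub fn flops_factor(self) -> f64 {
        match self {
            GpuKind::Rtx4090 => 1.0,
            GpuKind::JetsonOrin => 0.06,
            GpuKind::CpuIntel => 0.01,
        }
    }
}

/// Assumed lower bound on load; the source does not say how a load of zero is handled.
pub const MIN_LOAD: f64 = 0.01;

/// score = (free_vram / adapter_budget) * gpu_flops_factor * (1 / current_load)
pub fn score_sketch(
    free_vram_mb: f64,
    adapter_budget_mb: f64,
    gpu: GpuKind,
    current_load: f64,
) -> f64 {
    // A non-positive budget makes the ratio meaningless; treat it as unplaceable.
    if adapter_budget_mb <= 0.0 {
        return 0.0;
    }
    let load = current_load.max(MIN_LOAD);
    (free_vram_mb / adapter_budget_mb) * gpu.flops_factor() * (1.0 / load)
}
```

With these assumptions, a node with 24000 MB free, a 6000 MB adapter budget, an RTX 4090, and a load of 1.0 scores 4.0; doubling the load halves the score.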
Structs
- AdapterJob - A pending adapter job to place on a cluster node.
- NodeLoad - Current load state of a node for placement scoring.
- PlacementDecision - Result of placing an adapter on a node.
Functions
- place_adapters - Place adapter jobs across cluster nodes greedily.
- score_node - Score a node for a given adapter budget.
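As a rough illustration of how greedy placement could combine these two functions, the sketch below scores every node for each job in order and assigns the job to the highest-scoring node that still has enough free VRAM. It does not reproduce the real `place_adapters` signature or the fields of `AdapterJob`, `NodeLoad`, or `PlacementDecision`: `SketchNode`, `SketchJob`, `sketch_score`, and `place_greedy_sketch` are hypothetical names, and the bookkeeping after each assignment (subtract the budget, add 1.0 to the load) is an assumption.

```rust
/// Hypothetical node state; the real `NodeLoad` fields are not shown in the source.
#[derive(Debug, Clone)]
pub struct SketchNode {
    pub name: String,
    pub free_vram_mb: f64,
    pub gpu_flops_factor: f64,
    pub current_load: f64,
}

/// Hypothetical pending job; the real `AdapterJob` fields are not shown in the source.
#[derive(Debug, Clone)]
pub struct SketchJob {
    pub name: String,
    pub budget_mb: f64,
}

/// score = (free_vram / adapter_budget) * gpu_flops_factor * (1 / current_load)
/// Assumption: load is floored at 0.01 to avoid division by zero.
fn sketch_score(node: &SketchNode, budget_mb: f64) -> f64 {
    (node.free_vram_mb / budget_mb) * node.gpu_flops_factor * (1.0 / node.current_load.max(0.01))
}

/// Returns (job name, Some(node name)) for placed jobs and (job name, None) otherwise.
pub fn place_greedy_sketch(
    jobs: &[SketchJob],
    nodes: &mut [SketchNode],
) -> Vec<(String, Option<String>)> {
    let mut decisions = Vec::with_capacity(jobs.len());
    for job in jobs {
        // Track the index and score of the best node that can fit this job.
        let mut best: Option<(usize, f64)> = None;
        for (i, node) in nodes.iter().enumerate() {
            if job.budget_mb <= 0.0 || node.free_vram_mb < job.budget_mb {
                continue;
            }
            let s = sketch_score(node, job.budget_mb);
            if best.map_or(true, |(_, b)| s > b) {
                best = Some((i, s));
            }
        }
        match best {
            Some((i, _)) => {
                // Assumed bookkeeping: reserve the VRAM and count one more job as load.
                nodes[i].free_vram_mb -= job.budget_mb;
                nodes[i].current_load += 1.0;
                decisions.push((job.name.clone(), Some(nodes[i].name.clone())));
            }
            None => decisions.push((job.name.clone(), None)),
        }
    }
    decisions
}
```

Because each assignment lowers the chosen node's free VRAM and raises its load, later jobs see a lower score for that node, so work spills over to slower nodes only once the faster ones fill up. A greedy pass like this is simple and fast but is not guaranteed to find the globally best assignment.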