Cluster configuration for multi-node GPU training (GPU-SHARE Phase 3, §3.2).
Parses cluster.yaml files describing heterogeneous training clusters
whose nodes mix hardware types (RTX 4090, Jetson, CPU-only).
§Example
nodes:
  - name: desktop
    host: localhost
    gpus:
      - uuid: GPU-abcd-1234
        type: rtx-4090
        vram_mb: 24564
        memory_type: discrete
    max_adapters: 3

Structs§
- ClusterConfig - Top-level cluster configuration.
- GpuConfig - Configuration for a single GPU on a node.
- GpuCostModel - GPU dispatch cost model (PW-01: 5× PCIe Rule).
- NodeConfig - Configuration for a single training node.
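To make the relationship between the YAML example and these structs concrete, here is a minimal sketch of how the configuration types might be shaped. Only the type names ClusterConfig, NodeConfig, GpuConfig and MemoryType and the YAML keys come from this page. The field types, the `Unified` variant, the `gpu_type` rename, the treatment of `max_adapters` as a per-node limit, and the helpers `total_vram_mb` and `example_cluster` are assumptions made for illustration, not the crate's actual definitions.

```rust
// Hypothetical mirror of the YAML example. Field names follow the YAML keys;
// types, variants, and the nesting of `max_adapters` are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryType {
    /// Dedicated VRAM, as in `memory_type: discrete`.
    Discrete,
    /// Assumed variant for memory shared between CPU and GPU (e.g. a Jetson).
    Unified,
}

#[derive(Debug, Clone, PartialEq)]
pub struct GpuConfig {
    pub uuid: String,
    /// YAML key `type`; renamed here because `type` is a Rust keyword.
    pub gpu_type: String,
    pub vram_mb: u64,
    pub memory_type: MemoryType,
}

#[derive(Debug, Clone, PartialEq)]
pub struct NodeConfig {
    pub name: String,
    pub host: String,
    /// Empty for a CPU-only node.
    pub gpus: Vec<GpuConfig>,
    /// Assumed to be a per-node limit.
    pub max_adapters: usize,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ClusterConfig {
    pub nodes: Vec<NodeConfig>,
}

impl ClusterConfig {
    /// Total VRAM across every GPU in the cluster, in MB (hypothetical helper).
    pub fn total_vram_mb(&self) -> u64 {
        self.nodes
            .iter()
            .flat_map(|node| node.gpus.iter())
            .map(|gpu| gpu.vram_mb)
            .sum()
    }
}

/// Builds the single-node cluster from the YAML example by hand.
pub fn example_cluster() -> ClusterConfig {
    ClusterConfig {
        nodes: vec![NodeConfig {
            name: "desktop".to_string(),
            host: "localhost".to_string(),
            gpus: vec![GpuConfig {
                uuid: "GPU-abcd-1234".to_string(),
                gpu_type: "rtx-4090".to_string(),
                vram_mb: 24564,
                memory_type: MemoryType::Discrete,
            }],
            max_adapters: 3,
        }],
    }
}
```

The sketch builds the value by hand rather than parsing YAML so that it depends only on the standard library. In practice such structs would typically derive a deserializer instead.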
Enums§
- ClusterValidationError - Cluster configuration validation errors.
- MemoryType - GPU memory architecture.
- Transport - Transport method for connecting to a node.
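As an illustration of what a validation error enum for a cluster file could look like, the sketch below checks a list of nodes and reports the first problem found. The name ClusterValidationError is taken from this page. Its variants, the simplified `Node` struct, the `validate` function, and the specific rules (at least one node, unique node names, non-zero `max_adapters`) are hypothetical and are not confirmed by the source.

```rust
use std::collections::HashSet;
use std::fmt;

/// Simplified, hypothetical stand-in for a node entry in cluster.yaml.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Node {
    pub name: String,
    pub max_adapters: usize,
}

/// Assumed variants; the real enum may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ClusterValidationError {
    /// The `nodes` list is empty.
    NoNodes,
    /// Two nodes share the same `name`.
    DuplicateNodeName(String),
    /// A node declares `max_adapters: 0`.
    ZeroMaxAdapters(String),
}

impl fmt::Display for ClusterValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NoNodes => write!(f, "cluster has no nodes"),
            Self::DuplicateNodeName(name) => write!(f, "duplicate node name: {name}"),
            Self::ZeroMaxAdapters(name) => write!(f, "node {name} has max_adapters = 0"),
        }
    }
}

impl std::error::Error for ClusterValidationError {}

/// Returns the first validation error encountered, scanning nodes in order.
pub fn validate(nodes: &[Node]) -> Result<(), ClusterValidationError> {
    if nodes.is_empty() {
        return Err(ClusterValidationError::NoNodes);
    }
    let mut seen: HashSet<&str> = HashSet::new();
    for node in nodes {
        // `insert` returns false when the name was already present.
        if !seen.insert(node.name.as_str()) {
            return Err(ClusterValidationError::DuplicateNodeName(node.name.clone()));
        }
        if node.max_adapters == 0 {
            return Err(ClusterValidationError::ZeroMaxAdapters(node.name.clone()));
        }
    }
    Ok(())
}
```

Returning a dedicated enum rather than a string lets callers match on the failure kind, and implementing `std::error::Error` lets it compose with `?` and boxed errors.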