Module cluster

Cluster configuration for multi-node GPU training (GPU-SHARE Phase 3, §3.2).

Parses cluster.yaml files describing heterogeneous training clusters with mixed GPU types (RTX 4090, Jetson, CPU-only nodes).

§Example

nodes:
  - name: desktop
    host: localhost
    gpus:
      - uuid: GPU-abcd-1234
        type: rtx-4090
        vram_mb: 24564
        memory_type: discrete
    max_adapters: 3
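
To show how the YAML above might map onto the types listed on this page, here is a minimal std-only sketch. The type names `ClusterConfig`, `NodeConfig`, `GpuConfig`, and `MemoryType` come from this module, but every field, the `Unified` variant, the `ValidationError` enum, the validation rules, and the helpers `validate`, `total_vram_mb`, and `example_cluster` are assumptions made for illustration, not the crate's actual definitions. The real module parses YAML; this sketch builds the equivalent value by hand so it needs no external crates.

```rust
use std::collections::HashSet;

/// GPU memory architecture. `Discrete` mirrors `memory_type: discrete`;
/// `Unified` is an assumed variant (e.g. for Jetson-class devices).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryType {
    Discrete,
    Unified,
}

/// One GPU entry under `gpus:` (field names taken from the YAML keys).
#[derive(Debug, Clone, PartialEq, Eq)]
struct GpuConfig {
    uuid: String,
    gpu_type: String, // YAML key is `type`, which is a Rust keyword
    vram_mb: u64,
    memory_type: MemoryType,
}

/// One entry under `nodes:`. An empty `gpus` list models a CPU-only node.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NodeConfig {
    name: String,
    host: String,
    gpus: Vec<GpuConfig>,
    max_adapters: usize,
}

/// Top-level cluster description.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ClusterConfig {
    nodes: Vec<NodeConfig>,
}

/// Hypothetical validation failures; the real `ClusterValidationError`
/// variants are not shown on this page.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ValidationError {
    NoNodes,
    DuplicateNodeName(String),
    ZeroVram { node: String, uuid: String },
}

impl ClusterConfig {
    /// Assumed rules: at least one node, unique node names, non-zero VRAM.
    fn validate(&self) -> Result<(), ValidationError> {
        if self.nodes.is_empty() {
            return Err(ValidationError::NoNodes);
        }
        let mut seen = HashSet::new();
        for node in &self.nodes {
            if !seen.insert(node.name.as_str()) {
                return Err(ValidationError::DuplicateNodeName(node.name.clone()));
            }
            for gpu in &node.gpus {
                if gpu.vram_mb == 0 {
                    return Err(ValidationError::ZeroVram {
                        node: node.name.clone(),
                        uuid: gpu.uuid.clone(),
                    });
                }
            }
        }
        Ok(())
    }

    /// Sum of VRAM across every GPU on every node.
    fn total_vram_mb(&self) -> u64 {
        self.nodes
            .iter()
            .flat_map(|node| node.gpus.iter())
            .map(|gpu| gpu.vram_mb)
            .sum()
    }
}

/// Hand-built equivalent of the YAML example above.
fn example_cluster() -> ClusterConfig {
    ClusterConfig {
        nodes: vec![NodeConfig {
            name: "desktop".to_string(),
            host: "localhost".to_string(),
            gpus: vec![GpuConfig {
                uuid: "GPU-abcd-1234".to_string(),
                gpu_type: "rtx-4090".to_string(),
                vram_mb: 24564,
                memory_type: MemoryType::Discrete,
            }],
            max_adapters: 3,
        }],
    }
}
```

Calling `example_cluster().validate()` succeeds under these assumed rules, while a cluster with two nodes named `desktop` would return `DuplicateNodeName`. Consult the individual struct and enum pages for the actual fields and error variants.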

Structs§

ClusterConfig
Top-level cluster configuration.
GpuConfig
Configuration for a single GPU on a node.
GpuCostModel
GPU dispatch cost model (PW-01: 5× PCIe Rule).
NodeConfig
Configuration for a single training node.

Enums§

ClusterValidationError
Cluster configuration validation errors.
MemoryType
GPU memory architecture.
Transport
Transport method for connecting to a node.