Zero-config setup protocol for master and worker nodes.
Runs before the normal inference lifecycle:
- Workers advertise via mDNS, master discovers them.
- Master computes layer assignments proportional to each node’s estimated TFLOPS, capped by GPU VRAM.
- Master connects to each worker, authenticates, assigns layers, and pushes model data if the worker doesn’t have it cached.
- Workers load their assigned layers and signal readiness.
After setup, both sides proceed with normal Context::from_args() / inference.
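The ordering of the steps above can be pictured as a small state machine. The following is only an illustrative sketch: `SetupPhase`, its variant names, and `setup_sequence` are hypothetical and are not part of this module's API, and the real protocol may interleave or subdivide these steps differently.

```rust
/// Hypothetical phases of the zero-config setup, in the order the overview lists them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SetupPhase {
    /// Workers advertise via mDNS and the master discovers them.
    Discovery,
    /// The master computes layer assignments.
    Assignment,
    /// The master connects, authenticates, assigns layers, and pushes
    /// model data to workers that do not have it cached.
    Provisioning,
    /// Workers load their assigned layers.
    Loading,
    /// Workers have signalled readiness; normal inference can begin.
    Ready,
}

impl SetupPhase {
    /// Returns the phase that follows `self`, or `None` once setup is complete.
    fn next(self) -> Option<SetupPhase> {
        match self {
            SetupPhase::Discovery => Some(SetupPhase::Assignment),
            SetupPhase::Assignment => Some(SetupPhase::Provisioning),
            SetupPhase::Provisioning => Some(SetupPhase::Loading),
            SetupPhase::Loading => Some(SetupPhase::Ready),
            SetupPhase::Ready => None,
        }
    }
}

/// Walks every phase from `Discovery` to `Ready` and returns them in order.
fn setup_sequence() -> Vec<SetupPhase> {
    let mut current = SetupPhase::Discovery;
    let mut phases = vec![current];
    while let Some(next) = current.next() {
        phases.push(next);
        current = next;
    }
    phases
}
```

Calling `setup_sequence()` yields the five phases in order. Modelling the phases as an enum makes it a compile-time error to forget a transition when a phase is added.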
Functions§
- compute_layer_assignments - Compute layer assignments proportional to each worker’s estimated TFLOPS, accounting for the master’s own compute so it retains its fair share of layers. When layer_size_bytes > 0, each worker’s assignment is capped by its per-GPU VRAM to prevent out-of-memory errors on multi-GPU nodes. The master’s local layers are also capped by master_max_layers to avoid OOM.
- master_setup - Run the full zero-config master setup.
- worker_setup
- worker_setup_with_progress
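The proportional, capped split described for compute_layer_assignments can be sketched roughly as follows. This is not the crate's implementation or signature: the `Node` struct, the `assign_layers` function, and the leftover-distribution rule (fastest node first) are assumptions made for illustration. Here `max_layers` stands in for either a worker's VRAM-derived cap (for example, per-GPU VRAM divided by layer_size_bytes) or the master's master_max_layers.

```rust
/// Hypothetical description of one participant (the master or a worker).
#[derive(Debug, Clone)]
struct Node {
    /// Estimated compute throughput.
    tflops: f64,
    /// Upper bound on layers this node may hold (use usize::MAX for "no cap").
    max_layers: usize,
}

/// Splits `total_layers` across `nodes` in proportion to TFLOPS,
/// never exceeding any node's `max_layers`.
/// Layers that fit nowhere are left unassigned (the sum may be < total_layers).
fn assign_layers(total_layers: usize, nodes: &[Node]) -> Vec<usize> {
    let total_tflops: f64 = nodes.iter().map(|n| n.tflops).sum();
    if nodes.is_empty() || total_tflops <= 0.0 {
        return vec![0; nodes.len()];
    }

    // First pass: floor of each node's proportional share, clamped to its cap.
    let mut out: Vec<usize> = nodes
        .iter()
        .map(|n| {
            let share = (total_layers as f64 * n.tflops / total_tflops).floor() as usize;
            share.min(n.max_layers)
        })
        .collect();

    // Second pass: hand leftover layers, one at a time, to the fastest
    // nodes that still have spare capacity.
    let mut order: Vec<usize> = (0..nodes.len()).collect();
    order.sort_by(|&a, &b| nodes[b].tflops.total_cmp(&nodes[a].tflops));

    let mut remaining = total_layers.saturating_sub(out.iter().sum::<usize>());
    while remaining > 0 {
        let mut progressed = false;
        for &i in &order {
            if remaining == 0 {
                break;
            }
            if out[i] < nodes[i].max_layers {
                out[i] += 1;
                remaining -= 1;
                progressed = true;
            }
        }
        if !progressed {
            break; // every node is already at its cap
        }
    }
    out
}
```

With 32 layers, a master at 10 TFLOPS, one worker at 20 TFLOPS, and a second worker at 10 TFLOPS capped at 4 layers, the sketch clamps the second worker to 4 and redistributes its surplus to the other two nodes. Treating the master as just another node with its own cap is one simple way to let it keep a fair share.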
Type Aliases§
- SetupProgressFn - Run the zero-config worker setup.