GPU load balancing for distributing index-building work across multiple devices.
This module provides:
- GpuLoadBalancer: runtime tracking of per-device workloads and selection of the least-loaded device for a new task.
- WorkloadDistributor: static splitting of a large index job into per-device contiguous chunks.
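The least-loaded selection described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: `DeviceSketch`, `LeastLoadedSketch`, and the methods `assign` and `complete` are hypothetical names, and the unit of "load" (an abstract `u64` cost) is an assumption.

```rust
/// Hypothetical stand-in for a device descriptor; the field names are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct DeviceSketch {
    pub id: usize,
    pub memory_bytes: u64,
}

/// Hypothetical balancer that keeps one pending-work counter per device.
#[derive(Debug)]
pub struct LeastLoadedSketch {
    devices: Vec<DeviceSketch>,
    load: Vec<u64>,
}

impl LeastLoadedSketch {
    pub fn new(devices: Vec<DeviceSketch>) -> Self {
        let load = vec![0; devices.len()];
        Self { devices, load }
    }

    /// Picks the device with the smallest tracked load (ties go to the
    /// earliest device), records `cost` against it, and returns its id.
    /// Returns `None` when no devices are registered.
    pub fn assign(&mut self, cost: u64) -> Option<usize> {
        let idx = self
            .load
            .iter()
            .enumerate()
            .min_by_key(|&(_, l)| *l)
            .map(|(i, _)| i)?;
        self.load[idx] = self.load[idx].saturating_add(cost);
        Some(self.devices[idx].id)
    }

    /// Releases `cost` from a device once its task has finished.
    /// Unknown device ids are ignored.
    pub fn complete(&mut self, device_id: usize, cost: u64) {
        if let Some(idx) = self.devices.iter().position(|d| d.id == device_id) {
            self.load[idx] = self.load[idx].saturating_sub(cost);
        }
    }
}
```

In this sketch a caller would invoke `assign` when a task is queued and `complete` when it finishes, so the counters reflect outstanding work only. No GPU API is touched, which matches the module's pure-Rust approach.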
§Pure Rust Policy
No CUDA runtime calls are made here. All load-balancing logic is Pure Rust and
operates on abstract device descriptors (SimpleGpuDevice).
Structs§
- GpuLoadBalancer - Distributes GPU work across multiple devices using a least-loaded strategy.
- SimpleGpuDevice - Lightweight descriptor of a GPU device used for load balancing decisions.
- WorkloadChunk - A contiguous slice of a vector dataset assigned to a specific GPU.
- WorkloadDistributor - Splits a large vector index job across multiple GPU devices proportionally to their memory capacity.
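A memory-proportional split into contiguous chunks might look like the sketch below. It is an illustration under stated assumptions rather than the crate's API: `ChunkSketch` and `split_by_memory` are hypothetical names, devices are passed as `(device_id, memory_bytes)` pairs, and chunk boundaries are computed from cumulative memory shares so that the chunks cover the dataset exactly with no gaps or overlap.

```rust
/// Hypothetical contiguous range of vector indices assigned to one device.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChunkSketch {
    pub device_id: usize,
    pub start: usize, // inclusive
    pub end: usize,   // exclusive
}

/// Splits the index range `0..num_vectors` into contiguous chunks whose sizes
/// are proportional to each device's memory. `devices` holds assumed
/// `(device_id, memory_bytes)` pairs. Devices that would receive zero vectors
/// get no chunk.
pub fn split_by_memory(devices: &[(usize, u64)], num_vectors: usize) -> Vec<ChunkSketch> {
    // u128 arithmetic avoids overflow in the multiplication for realistic sizes.
    let total: u128 = devices.iter().map(|&(_, m)| m as u128).sum();
    if total == 0 {
        return Vec::new();
    }

    let mut chunks = Vec::with_capacity(devices.len());
    let mut cumulative: u128 = 0;
    let mut start = 0usize;

    for &(device_id, mem) in devices {
        cumulative += mem as u128;
        // Boundary from the cumulative share; for the last device the share
        // is total/total, so `end` is exactly `num_vectors`.
        let end = (num_vectors as u128 * cumulative / total) as usize;
        if end > start {
            chunks.push(ChunkSketch { device_id, start, end });
        }
        start = end;
    }
    chunks
}
```

Computing each boundary from the running total, instead of rounding each device's share independently, keeps rounding errors from accumulating and leaves any remainder with the later devices.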