Multi-GPU load balancing for distributed vector index operations
This module provides round-robin and workload-aware distribution of vector search and index building tasks across multiple GPU devices.
§Architecture
The multi-GPU system consists of:
- MultiGpuManager: Central coordinator managing all GPU workers
- GpuWorker: Per-device worker with its own queue and metrics
- LoadBalancer: Strategy-based dispatcher (round-robin or workload-aware)
- MultiGpuTask: Task type enum for different GPU operations
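The relationship between these types can be sketched in pure Rust. This is a minimal illustration, not the crate's actual API: the field names, the task variants, and the round-robin dispatch method shown here are assumptions based on the descriptions above.

```rust
use std::collections::VecDeque;

/// Task type enum for different GPU operations (variants assumed).
enum MultiGpuTask {
    Search { query_len: usize },
    BuildIndex { num_vectors: usize },
}

/// Per-device worker with its own queue and metrics (sketch).
struct GpuWorker {
    device_id: usize,
    queue: VecDeque<MultiGpuTask>,
    tasks_completed: u64,
}

/// Central coordinator managing all GPU workers (sketch).
struct MultiGpuManager {
    workers: Vec<GpuWorker>,
    next: usize, // round-robin cursor
}

impl MultiGpuManager {
    fn new(num_devices: usize) -> Self {
        let workers = (0..num_devices)
            .map(|id| GpuWorker {
                device_id: id,
                queue: VecDeque::new(),
                tasks_completed: 0,
            })
            .collect();
        Self { workers, next: 0 }
    }

    /// Round-robin dispatch: enqueue the task on the next device in turn
    /// and return the chosen device id.
    fn dispatch(&mut self, task: MultiGpuTask) -> usize {
        let id = self.next;
        self.workers[id].queue.push_back(task);
        self.next = (self.next + 1) % self.workers.len();
        id
    }
}

fn main() {
    let mut mgr = MultiGpuManager::new(2);
    let a = mgr.dispatch(MultiGpuTask::Search { query_len: 128 });
    let b = mgr.dispatch(MultiGpuTask::BuildIndex { num_vectors: 10_000 });
    let c = mgr.dispatch(MultiGpuTask::Search { query_len: 64 });
    println!("{} {} {}", a, b, c); // prints "0 1 0"
}
```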
§Feature Gating
All CUDA runtime interactions are gated with #[cfg(feature = "cuda")]; the load-balancing logic itself is pure Rust.
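A sketch of the gating pattern described above: the CUDA path only compiles when the "cuda" feature is enabled, while a pure-Rust path is always available. The function names (`cuda_search`, `cpu_search`) are hypothetical and not part of the module's API.

```rust
// CUDA-backed path: only compiled when the "cuda" feature is enabled.
#[cfg(feature = "cuda")]
fn search(query: &[f32]) -> Vec<u32> {
    cuda_search(query) // hypothetical GPU kernel launch behind the feature gate
}

#[cfg(feature = "cuda")]
fn cuda_search(_query: &[f32]) -> Vec<u32> {
    unimplemented!("real CUDA call would live here")
}

// Pure-Rust path: compiled whenever the "cuda" feature is absent.
#[cfg(not(feature = "cuda"))]
fn search(query: &[f32]) -> Vec<u32> {
    cpu_search(query)
}

#[cfg(not(feature = "cuda"))]
fn cpu_search(_query: &[f32]) -> Vec<u32> {
    vec![0] // placeholder result for illustration
}

fn main() {
    // In a build without the "cuda" feature, this takes the pure-Rust path.
    let ids = search(&[0.1, 0.2]);
    println!("{:?}", ids);
}
```

This keeps the crate buildable and testable on machines without a CUDA toolchain, since only the feature-gated functions reference the GPU runtime.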
Structs§
- GpuDeviceMetrics - Real-time metrics for a single GPU device
- GpuTaskResult - Result of a GPU task execution
- MultiGpuConfig - Configuration for the multi-GPU manager
- MultiGpuConfigFactory - Factory for creating multi-GPU configurations for common scenarios
- MultiGpuManager - Central multi-GPU manager
- MultiGpuStats - Statistics for the multi-GPU manager
Enums§
- GpuTaskOutput - Output data for different task types
- LoadBalancingStrategy - Load balancing strategy for multi-GPU distribution
- MultiGpuTask - A task that can be dispatched to a GPU device
- TaskPriority - Task priority level
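To illustrate the difference between the two strategies behind LoadBalancingStrategy, here is a small self-contained sketch. The variant names and the queue-depth heuristic for the workload-aware case are assumptions; the real enum may use different variants and metrics.

```rust
#[derive(Clone, Copy)]
enum LoadBalancingStrategy {
    RoundRobin,    // assumed variant name
    WorkloadAware, // assumed variant name
}

/// Pick a target device given per-device queue depths.
/// `next` is the round-robin cursor, advanced on each round-robin pick.
fn pick_device(
    strategy: LoadBalancingStrategy,
    queue_depths: &[usize],
    next: &mut usize,
) -> usize {
    match strategy {
        // Round-robin: cycle through devices regardless of load.
        LoadBalancingStrategy::RoundRobin => {
            let id = *next;
            *next = (*next + 1) % queue_depths.len();
            id
        }
        // Workload-aware (assumed heuristic): pick the device with the
        // shallowest pending queue.
        LoadBalancingStrategy::WorkloadAware => queue_depths
            .iter()
            .enumerate()
            .min_by_key(|(_, depth)| **depth)
            .map(|(id, _)| id)
            .unwrap(),
    }
}

fn main() {
    let depths = [4, 1, 3];
    let mut cursor = 0;
    let rr = pick_device(LoadBalancingStrategy::RoundRobin, &depths, &mut cursor);
    let wa = pick_device(LoadBalancingStrategy::WorkloadAware, &depths, &mut cursor);
    println!("round-robin -> {}, workload-aware -> {}", rr, wa); // 0 and 1
}
```

Round-robin is cheap and fair when tasks are uniform; a workload-aware policy helps when task costs vary, such as mixing short searches with long index builds.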