Multi-GPU load balancing for distributed vector index operations
This module provides round-robin and workload-aware distribution of vector search and index building tasks across multiple GPU devices.
§Architecture
The multi-GPU system consists of:
- MultiGpuManager: Central coordinator managing all GPU workers
- GpuWorker: Per-device worker with its own queue and metrics
- LoadBalancer: Strategy-based dispatcher (round-robin or workload-aware)
- MultiGpuTask: Task type enum for different GPU operations
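The relationship between these types can be sketched in pure Rust. This is a minimal illustration, not the crate's actual API: the field names, the task variants, and the round-robin dispatch method shown here are assumptions based on the descriptions above.

```rust
use std::collections::VecDeque;

/// Task type enum for different GPU operations (variants assumed).
enum MultiGpuTask {
    Search { query_len: usize },
    BuildIndex { num_vectors: usize },
}

/// Per-device worker with its own queue and metrics (sketch).
struct GpuWorker {
    device_id: usize,
    queue: VecDeque<MultiGpuTask>,
    tasks_completed: u64,
}

/// Central coordinator managing all GPU workers (sketch).
struct MultiGpuManager {
    workers: Vec<GpuWorker>,
    next: usize, // round-robin cursor
}

impl MultiGpuManager {
    fn new(num_devices: usize) -> Self {
        let workers = (0..num_devices)
            .map(|id| GpuWorker {
                device_id: id,
                queue: VecDeque::new(),
                tasks_completed: 0,
            })
            .collect();
        Self { workers, next: 0 }
    }

    /// Round-robin dispatch: enqueue the task on the next device in turn
    /// and return the chosen device id.
    fn dispatch(&mut self, task: MultiGpuTask) -> usize {
        let id = self.next;
        self.workers[id].queue.push_back(task);
        self.next = (self.next + 1) % self.workers.len();
        id
    }
}

fn main() {
    let mut mgr = MultiGpuManager::new(2);
    let a = mgr.dispatch(MultiGpuTask::Search { query_len: 128 });
    let b = mgr.dispatch(MultiGpuTask::BuildIndex { num_vectors: 10_000 });
    let c = mgr.dispatch(MultiGpuTask::Search { query_len: 64 });
    println!("{} {} {}", a, b, c); // prints "0 1 0"
}
```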
§Feature Gating
All CUDA runtime interactions are gated with #[cfg(feature = "cuda")]; the load-balancing logic itself is pure Rust.
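A sketch of the gating pattern described above: the CUDA path only compiles when the "cuda" feature is enabled, while a pure-Rust path is always available. The function names (`cuda_search`, `cpu_search`) are hypothetical and not part of the module's API.

```rust
// CUDA-backed path: only compiled when the "cuda" feature is enabled.
#[cfg(feature = "cuda")]
fn search(query: &[f32]) -> Vec<u32> {
    cuda_search(query) // hypothetical GPU kernel launch behind the feature gate
}

#[cfg(feature = "cuda")]
fn cuda_search(_query: &[f32]) -> Vec<u32> {
    unimplemented!("real CUDA call would live here")
}

// Pure-Rust path: compiled whenever the "cuda" feature is absent.
#[cfg(not(feature = "cuda"))]
fn search(query: &[f32]) -> Vec<u32> {
    cpu_search(query)
}

#[cfg(not(feature = "cuda"))]
fn cpu_search(_query: &[f32]) -> Vec<u32> {
    vec![0] // placeholder result for illustration
}

fn main() {
    // In a build without the "cuda" feature, this takes the pure-Rust path.
    let ids = search(&[0.1, 0.2]);
    println!("{:?}", ids);
}
```

This keeps the crate buildable and testable on machines without a CUDA toolchain, since only the feature-gated functions reference the GPU runtime.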
Structs§
- GpuDeviceMetrics - Real-time metrics for a single GPU device
- GpuTaskResult - Result of a GPU task execution
- MultiGpuConfig - Configuration for the multi-GPU manager
- MultiGpuConfigFactory - Factory for creating multi-GPU configurations for common scenarios
- MultiGpuManager - Central multi-GPU manager
- MultiGpuStats - Statistics for the multi-GPU manager
Enums§
- GpuTaskOutput - Output data for different task types
- LoadBalancingStrategy - Load balancing strategy for multi-GPU distribution
- MultiGpuTask - A task that can be dispatched to a GPU device
- TaskPriority - Task priority level
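To illustrate the difference between the two strategies behind LoadBalancingStrategy, here is a small self-contained sketch. The variant names and the queue-depth heuristic for the workload-aware case are assumptions; the real enum may use different variants and metrics.

```rust
#[derive(Clone, Copy)]
enum LoadBalancingStrategy {
    RoundRobin,    // assumed variant name
    WorkloadAware, // assumed variant name
}

/// Pick a target device given per-device queue depths.
/// `next` is the round-robin cursor, advanced on each round-robin pick.
fn pick_device(
    strategy: LoadBalancingStrategy,
    queue_depths: &[usize],
    next: &mut usize,
) -> usize {
    match strategy {
        // Round-robin: cycle through devices regardless of load.
        LoadBalancingStrategy::RoundRobin => {
            let id = *next;
            *next = (*next + 1) % queue_depths.len();
            id
        }
        // Workload-aware (assumed heuristic): pick the device with the
        // shallowest pending queue.
        LoadBalancingStrategy::WorkloadAware => queue_depths
            .iter()
            .enumerate()
            .min_by_key(|(_, depth)| **depth)
            .map(|(id, _)| id)
            .unwrap(),
    }
}

fn main() {
    let depths = [4, 1, 3];
    let mut cursor = 0;
    let rr = pick_device(LoadBalancingStrategy::RoundRobin, &depths, &mut cursor);
    let wa = pick_device(LoadBalancingStrategy::WorkloadAware, &depths, &mut cursor);
    println!("round-robin -> {}, workload-aware -> {}", rr, wa); // 0 and 1
}
```

Round-robin is cheap and fair when tasks are uniform; a workload-aware policy helps when task costs vary, such as mixing short searches with long index builds.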