Module multi_gpu


Multi-GPU load balancing for distributed vector index operations

This module provides round-robin and workload-aware distribution of vector search and index building tasks across multiple GPU devices.

§Architecture

The multi-GPU system consists of:

  • MultiGpuManager: Central coordinator managing all GPU workers
  • GpuWorker: Per-device worker with its own queue and metrics
  • LoadBalancer: Strategy-based dispatcher (round-robin or workload-aware)
  • MultiGpuTask: Task type enum for different GPU operations
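The round-robin half of the dispatcher can be sketched as follows. This is a simplified standalone illustration, not the crate's actual `LoadBalancer` type or API; the struct and method names here are assumptions for the example.

```rust
// Simplified sketch (not the crate's actual LoadBalancer): a round-robin
// dispatcher that cycles task assignments across GPU device indices.
struct RoundRobinBalancer {
    num_devices: usize,
    next: usize,
}

impl RoundRobinBalancer {
    fn new(num_devices: usize) -> Self {
        Self { num_devices, next: 0 }
    }

    /// Returns the device index the next task should be dispatched to.
    fn pick_device(&mut self) -> usize {
        let device = self.next;
        self.next = (self.next + 1) % self.num_devices;
        device
    }
}

fn main() {
    let mut balancer = RoundRobinBalancer::new(3);
    let assignments: Vec<usize> = (0..6).map(|_| balancer.pick_device()).collect();
    println!("{:?}", assignments); // [0, 1, 2, 0, 1, 2]
}
```

Round-robin ignores per-device load entirely, which keeps dispatch O(1) but can starve fast devices behind slow ones; the workload-aware strategy addresses that case.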

§Feature Gating

All CUDA runtime interactions are gated behind #[cfg(feature = "cuda")]. The load-balancing logic itself is pure Rust.
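The gating pattern looks roughly like this. The function name and the single-device fallback are assumptions for illustration, not the module's actual code; the point is that the CUDA-dependent path only compiles when the `cuda` feature is enabled.

```rust
// Hypothetical sketch of the #[cfg(feature = "cuda")] gating pattern:
// CUDA-dependent code compiles only with the `cuda` feature enabled,
// while the scheduling logic stays pure Rust and always compiles.

#[cfg(feature = "cuda")]
fn device_count() -> usize {
    // Would query the CUDA runtime here (e.g. via cudaGetDeviceCount).
    unimplemented!()
}

#[cfg(not(feature = "cuda"))]
fn device_count() -> usize {
    // Without the CUDA runtime, fall back to one logical device so the
    // pure-Rust scheduling paths remain testable on any machine.
    1
}

fn main() {
    println!("devices: {}", device_count());
}
```

Keeping the fallback path compilable means the balancer's unit tests can run in CI without GPUs attached.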

Structs§

GpuDeviceMetrics
Real-time metrics for a single GPU device
GpuTaskResult
Result of a GPU task execution
MultiGpuConfig
Configuration for multi-GPU manager
MultiGpuConfigFactory
Factory for creating multi-GPU configurations for common scenarios
MultiGpuManager
Central multi-GPU manager
MultiGpuStats
Statistics for the multi-GPU manager
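A workload-aware strategy would consult per-device metrics like `GpuDeviceMetrics` when choosing a target. The sketch below is a minimal standalone illustration; the field names are assumptions and not the crate's actual struct layout.

```rust
// Simplified illustration (field names are assumptions, not the actual
// GpuDeviceMetrics): workload-aware selection picks the device with the
// fewest queued tasks at dispatch time.
#[derive(Debug)]
struct DeviceMetrics {
    device_id: usize,
    queued_tasks: usize,
}

/// Returns the id of the least-loaded device, or None if no devices exist.
fn least_loaded(metrics: &[DeviceMetrics]) -> Option<usize> {
    metrics
        .iter()
        .min_by_key(|m| m.queued_tasks)
        .map(|m| m.device_id)
}

fn main() {
    let metrics = vec![
        DeviceMetrics { device_id: 0, queued_tasks: 4 },
        DeviceMetrics { device_id: 1, queued_tasks: 1 },
        DeviceMetrics { device_id: 2, queued_tasks: 3 },
    ];
    println!("{:?}", least_loaded(&metrics)); // Some(1)
}
```

Queue depth is only one possible signal; a real implementation might also weigh device memory headroom or recent task latency.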

Enums§

GpuTaskOutput
Output data for different task types
LoadBalancingStrategy
Load balancing strategy for multi-GPU distribution
MultiGpuTask
A task that can be dispatched to a GPU device
TaskPriority
Task priority level
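A `TaskPriority` enum pairs naturally with a per-worker priority queue. The variant names below are assumptions for the example; the sketch shows how deriving `Ord` with variants declared lowest-first makes Rust's max-heap `BinaryHeap` pop the highest-priority task first.

```rust
use std::collections::BinaryHeap;

// Illustrative sketch (variant names assumed, not the crate's actual enum):
// declaring variants Low < Normal < High and deriving Ord means a
// BinaryHeap, which is a max-heap, pops High-priority tasks first.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum TaskPriority {
    Low,
    Normal,
    High,
}

fn main() {
    // Tuples compare by priority first, so the heap orders tasks by it.
    let mut queue = BinaryHeap::new();
    queue.push((TaskPriority::Normal, "search"));
    queue.push((TaskPriority::Low, "compact"));
    queue.push((TaskPriority::High, "build"));

    println!("{:?}", queue.pop()); // Some((High, "build"))
}
```

Note that ties on priority fall back to the tuple's second field, so a production queue would typically add a sequence number to preserve FIFO order among equal-priority tasks.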