Multi-GPU coordination, topology discovery, and cross-GPU messaging.
This module provides infrastructure for coordinating work across multiple GPUs, including:
- Device Selection - Load balancing strategies for kernel placement
- Topology Discovery - NVLink/PCIe detection and bandwidth estimation
- Cross-GPU K2K Router - Kernel-to-kernel messaging across GPUs
- Kernel Migration - Move kernels between GPUs with state transfer
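As a rough illustration of the device-selection idea in the list above, the sketch below picks the device with the fewest resident kernels. `DeviceLoad` and `pick_least_loaded` are hypothetical names invented for this example and are not part of `ringkernel_core`; the crate's own `LoadBalancingStrategy::LeastLoaded` may weigh other factors, such as memory use.

```rust
/// Hypothetical per-device load record (not a ringkernel_core type).
#[derive(Debug, Clone, PartialEq)]
pub struct DeviceLoad {
    pub device_index: usize,
    pub active_kernels: usize,
}

/// Returns the index of the device with the fewest active kernels.
/// Ties go to the lowest device index; an empty slice yields `None`.
pub fn pick_least_loaded(devices: &[DeviceLoad]) -> Option<usize> {
    devices
        .iter()
        .min_by_key(|d| (d.active_kernels, d.device_index))
        .map(|d| d.device_index)
}
```

Keying on the pair `(active_kernels, device_index)` makes the tie-break deterministic, which helps when reproducing placement decisions in tests.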
§Example
use ringkernel_core::multi_gpu::{
    CrossGpuK2KRouter, GpuTopology, LoadBalancingStrategy, MultiGpuBuilder,
};

// Build the coordinator
let coordinator = MultiGpuBuilder::new()
    .load_balancing(LoadBalancingStrategy::LeastLoaded)
    .enable_p2p(true)
    .build();

// Discover topology
let topology = coordinator.discover_topology();

// Create cross-GPU router
let router = CrossGpuK2KRouter::new(coordinator.clone());

// Runs inside an async context; `source_kernel`, `dest_kernel`, and
// `envelope` are assumed to be defined by the caller.
router.route_message(source_kernel, dest_kernel, envelope).await?;
Structs§
- CrossDeviceTransfer - Helper for cross-device data transfer.
- CrossGpuK2KRouter - Routes K2K messages across GPU boundaries.
- CrossGpuRouterStats - Statistics for cross-GPU K2K routing.
- CrossGpuRouterStatsSnapshot - Snapshot of router statistics.
- DeviceInfo - Information about a GPU device.
- DeviceStatus - Status of a device in the multi-GPU coordinator.
- DeviceUnregisterResult - Result of unregistering a device from the coordinator.
- GpuConnection - Connection between two GPUs.
- GpuTopology - GPU topology graph describing all device interconnections.
- HotReloadConfig - Configuration for kernel hot reload operations.
- HotReloadManager - Manager for kernel hot reload operations.
- HotReloadRequest - Request to hot reload a kernel.
- HotReloadResult - Result of a completed hot reload.
- HotReloadStatsSnapshot - Snapshot of hot reload statistics.
- KernelCodeSource - Kernel code source for hot reload.
- KernelMigrationPlan - Plan for migrating a single kernel during device unregister.
- KernelMigrator - Migrator that uses checkpoints for kernel state transfer between GPUs.
- MigrationRequest - Request to migrate a kernel between devices.
- MigrationResult - Result of a completed migration.
- MigrationStats - Statistics for kernel migrations.
- MigrationStatsSnapshot - Snapshot of migration statistics.
- MultiGpuBuilder - Builder for multi-GPU coordinator.
- MultiGpuConfig - Configuration for multi-GPU coordination.
- MultiGpuCoordinator - Multi-GPU coordinator for managing kernels across devices.
- MultiGpuStats - Multi-GPU coordinator statistics.
- PendingK2KMessage - A pending cross-GPU K2K message.
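The connection and topology entries above describe a graph of device interconnections. The sketch below shows one way such a graph could be modeled and queried for the fastest direct link between two devices. `Link`, `SketchConnection`, and `SketchTopology` are hypothetical stand-ins, not the crate's `InterconnectType`, `GpuConnection`, or `GpuTopology`, and the bandwidth figures are caller-supplied estimates rather than measured values.

```rust
/// Hypothetical interconnect kinds; the crate's `InterconnectType` may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Link {
    NvLink,
    Pcie,
}

/// Hypothetical edge between two devices with an estimated bandwidth.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchConnection {
    pub a: usize,
    pub b: usize,
    pub link: Link,
    pub bandwidth_gbps: f64,
}

/// Hypothetical topology graph stored as a flat edge list.
#[derive(Debug, Default)]
pub struct SketchTopology {
    connections: Vec<SketchConnection>,
}

impl SketchTopology {
    /// Records one connection between two devices.
    pub fn add(&mut self, conn: SketchConnection) {
        self.connections.push(conn);
    }

    /// Returns the highest-bandwidth direct connection between `x` and `y`,
    /// treating connections as undirected.
    pub fn best_link(&self, x: usize, y: usize) -> Option<&SketchConnection> {
        self.connections
            .iter()
            .filter(|c| (c.a == x && c.b == y) || (c.a == y && c.b == x))
            .max_by(|l, r| l.bandwidth_gbps.total_cmp(&r.bandwidth_gbps))
    }
}
```

An edge list keeps the sketch short; a real topology with many devices would more likely use an adjacency matrix or map for constant-time lookups.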
Enums§
- HotReloadState - State of a hot reload operation.
- InterconnectType - Type of interconnect between GPUs.
- KernelCodeFormat - Kernel code format.
- LoadBalancingStrategy - Strategy for balancing load across devices.
- MigrationPriority - Priority for kernel migration.
- MigrationState - State of a kernel migration.
- RoutingDecision - Decision for how to route a K2K message.
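To make the idea of a routing decision concrete, here is a minimal sketch of how a router might choose a path for a kernel-to-kernel message. The `Route` enum and its three variants are assumptions made for this example; the crate's `RoutingDecision` variants are not shown on this page and may be different. The host-staged fallback is a common pattern when direct peer access is unavailable, not something this module is confirmed to do.

```rust
/// Hypothetical routing outcomes (not the crate's `RoutingDecision`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    /// Both kernels live on the same device; no cross-GPU transfer needed.
    SameDevice,
    /// Devices differ and direct peer-to-peer access is available.
    PeerToPeer,
    /// Devices differ without peer access; copy through host memory.
    HostStaged,
}

/// Chooses a route from the two device indices and a peer-access flag.
pub fn decide_route(src_device: usize, dst_device: usize, p2p_available: bool) -> Route {
    if src_device == dst_device {
        Route::SameDevice
    } else if p2p_available {
        Route::PeerToPeer
    } else {
        Route::HostStaged
    }
}
```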
Traits§
- HotReloadableKernel - Trait for kernels that support hot reload.
- MigratableKernel - Trait for kernels that support live migration.
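The struct list describes `KernelMigrator` as using checkpoints to transfer kernel state between GPUs. The sketch below illustrates that general checkpoint-and-restore pattern with a toy kernel. The `Checkpointable` trait, `CounterKernel`, and `migrate` are hypothetical and do not reflect the actual method signatures of `MigratableKernel`, which this page does not show.

```rust
/// Hypothetical checkpoint interface (not the crate's `MigratableKernel`).
pub trait Checkpointable: Sized {
    /// Serializes the kernel's state into bytes.
    fn checkpoint(&self) -> Vec<u8>;
    /// Rebuilds a kernel on `device` from a checkpoint, or `None` if invalid.
    fn restore(bytes: &[u8], device: usize) -> Option<Self>;
}

/// Toy kernel whose entire state is a single counter.
#[derive(Debug, PartialEq)]
pub struct CounterKernel {
    pub device: usize,
    pub count: u64,
}

impl Checkpointable for CounterKernel {
    fn checkpoint(&self) -> Vec<u8> {
        self.count.to_le_bytes().to_vec()
    }

    fn restore(bytes: &[u8], device: usize) -> Option<Self> {
        // A valid checkpoint is exactly the 8 little-endian bytes of the counter.
        if bytes.len() != 8 {
            return None;
        }
        let mut raw = [0u8; 8];
        raw.copy_from_slice(bytes);
        Some(CounterKernel { device, count: u64::from_le_bytes(raw) })
    }
}

/// Moves a kernel to `target_device`: checkpoint, drop the source, restore.
pub fn migrate<K: Checkpointable>(kernel: K, target_device: usize) -> Option<K> {
    let snapshot = kernel.checkpoint();
    drop(kernel); // the source instance is released before the restore
    K::restore(&snapshot, target_device)
}
```

Taking the kernel by value in `migrate` means the caller cannot keep using the source instance after the move, which mirrors the intent of a migration as opposed to a copy.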