Module multi_gpu


Multi-GPU coordination, topology discovery, and cross-GPU messaging.

This module provides infrastructure for coordinating work across multiple GPUs, including:

  • Device Selection - Load balancing strategies for kernel placement
  • Topology Discovery - NVLink/PCIe detection and bandwidth estimation
  • Cross-GPU K2K Router - Kernel-to-kernel messaging across GPUs
  • Kernel Migration - Move kernels between GPUs with state transfer
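To make the device-selection item concrete, here is a minimal, standalone sketch of what a "least loaded" placement policy could look like. It does not use the crate's API: `DeviceLoad` and `select_least_loaded` are hypothetical names introduced only for illustration, and the tie-breaking rule (lowest index wins) is an assumption, not documented behavior of `LoadBalancingStrategy::LeastLoaded`.

```rust
// Hypothetical stand-in type; this is NOT the ringkernel_core `DeviceStatus`.
#[derive(Debug, Clone, PartialEq)]
struct DeviceLoad {
    /// Index of the device.
    index: usize,
    /// Number of kernels currently placed on the device.
    kernel_count: usize,
    /// Whether the device can accept new kernels.
    available: bool,
}

/// Returns the index of the available device with the fewest kernels,
/// or `None` if no device is available.
/// Assumption: ties are broken in favor of the lowest device index.
fn select_least_loaded(devices: &[DeviceLoad]) -> Option<usize> {
    devices
        .iter()
        .filter(|d| d.available)
        // Compare by kernel count first, then by index to make ties deterministic.
        .min_by_key(|d| (d.kernel_count, d.index))
        .map(|d| d.index)
}
```

Given four devices where device 1 is unavailable and devices 2 and 3 both hold two kernels, this sketch would pick device 2. A real coordinator would likely weigh memory pressure or queue depth as well; kernel count alone is used here to keep the example short.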

Example

use ringkernel_core::multi_gpu::{
    CrossGpuK2KRouter, GpuTopology, LoadBalancingStrategy, MultiGpuBuilder,
};

let coordinator = MultiGpuBuilder::new()
    .load_balancing(LoadBalancingStrategy::LeastLoaded)
    .enable_p2p(true)
    .build();

// Discover topology
let topology = coordinator.discover_topology();

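// Create cross-GPU router
let router = CrossGpuK2KRouter::new(coordinator.clone());

// `source_kernel`, `dest_kernel`, and `envelope` are assumed to be defined by the caller.
// The `.await?` requires an async context whose return type is a `Result`.
router.route_message(source_kernel, dest_kernel, envelope).await?;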
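The routing call above hides a decision about how a message actually travels between devices. The following standalone sketch shows one plausible shape for that decision, under stated assumptions: `Link`, `Route`, and `decide_route` are hypothetical names, not the crate's `InterconnectType` or `RoutingDecision`, and the fallback to host-staged copies when peer-to-peer access is disabled is an assumed policy.

```rust
/// Hypothetical link kinds between two GPUs (not the crate's `InterconnectType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Link {
    NvLink,
    Pcie,
    Disconnected,
}

/// Hypothetical routing outcomes (not the crate's `RoutingDecision`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Route {
    /// Source and destination kernels live on the same device.
    SameDevice,
    /// Direct GPU-to-GPU transfer over the interconnect.
    PeerToPeer,
    /// Copy through host memory as a fallback.
    HostStaged,
}

/// Chooses a route for a message from device `src` to device `dst`.
/// Assumption: peer-to-peer is used only when it is enabled and a direct link exists.
fn decide_route(src: usize, dst: usize, link: Link, p2p_enabled: bool) -> Route {
    if src == dst {
        return Route::SameDevice;
    }
    match link {
        Link::NvLink | Link::Pcie if p2p_enabled => Route::PeerToPeer,
        _ => Route::HostStaged,
    }
}
```

Under these assumptions, two kernels on different devices joined by NVLink would use `Route::PeerToPeer` when P2P is enabled, and `Route::HostStaged` when it is disabled or no direct link exists.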

Structs

CrossDeviceTransfer
Helper for cross-device data transfer.
CrossGpuK2KRouter
Routes K2K messages across GPU boundaries.
CrossGpuRouterStats
Statistics for cross-GPU K2K routing.
CrossGpuRouterStatsSnapshot
Snapshot of router statistics.
DeviceInfo
Information about a GPU device.
DeviceStatus
Status of a device in the multi-GPU coordinator.
DeviceUnregisterResult
Result of unregistering a device from the coordinator.
GpuConnection
Connection between two GPUs.
GpuTopology
GPU topology graph describing all device interconnections.
HotReloadConfig
Configuration for kernel hot reload operations.
HotReloadManager
Manager for kernel hot reload operations.
HotReloadRequest
Request to hot reload a kernel.
HotReloadResult
Result of a completed hot reload.
HotReloadStatsSnapshot
Snapshot of hot reload statistics.
KernelCodeSource
Kernel code source for hot reload.
KernelMigrationPlan
Plan for migrating a single kernel during device unregister.
KernelMigrator
Migrator that uses checkpoints for kernel state transfer between GPUs.
MigrationRequest
Request to migrate a kernel between devices.
MigrationResult
Result of a completed migration.
MigrationStats
Statistics for kernel migrations.
MigrationStatsSnapshot
Snapshot of migration statistics.
MultiGpuBuilder
Builder for the multi-GPU coordinator.
MultiGpuConfig
Configuration for multi-GPU coordination.
MultiGpuCoordinator
Multi-GPU coordinator for managing kernels across devices.
MultiGpuStats
Multi-GPU coordinator statistics.
PendingK2KMessage
A pending cross-GPU K2K message.

Enums

HotReloadState
State of a hot reload operation.
InterconnectType
Type of interconnect between GPUs.
KernelCodeFormat
Kernel code format.
LoadBalancingStrategy
Strategy for balancing load across devices.
MigrationPriority
Priority for kernel migration.
MigrationState
State of a kernel migration.
RoutingDecision
Decision for how to route a K2K message.

Traits

HotReloadableKernel
Trait for kernels that support hot reload.
MigratableKernel
Trait for kernels that support live migration.