Peer-to-peer (P2P) memory copy operations for multi-GPU workloads.
This module provides functions to check, enable, and disable peer access between CUDA devices, as well as copy data between device buffers on different GPUs.
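Peer access enables direct GPU-to-GPU memory transfers over PCIe or NVLink without staging through host memory. This removes the intermediate device-to-host and host-to-device copies that a staged transfer needs, which typically improves transfer bandwidth in multi-GPU configurations.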
§Platform note
The underlying cuDeviceCanAccessPeer, cuCtxEnablePeerAccess,
cuCtxDisablePeerAccess, and cuMemcpyPeer driver functions are not
yet loaded by oxicuda-driver. All functions currently return
CudaError::NotSupported as placeholders. The API surface is
established here so that downstream crates can program against it.
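Because every function currently returns CudaError::NotSupported, callers may want to treat that error as "no peer path available" and fall back to a host-staged copy. The following is a minimal sketch of that decision, not the crate's own logic. It is self-contained and uses stand-in types: the CudaError enum below models only the NotSupported variant named above plus a hypothetical Other variant, and CopyPath and choose_copy_path are illustrative names that do not exist in oxicuda.

```rust
// Stand-in for the crate's error type. Only `NotSupported` is taken from the
// platform note; `Other` is a hypothetical catch-all for illustration.
#[derive(Debug, PartialEq)]
enum CudaError {
    NotSupported,
    Other(i32),
}

// Which transfer path the caller ends up using (hypothetical type).
#[derive(Debug, PartialEq)]
enum CopyPath {
    /// Direct GPU-to-GPU copy over PCIe or NVLink.
    Peer,
    /// Device -> host -> device fallback.
    HostStaged,
}

// Hypothetical helper: map the result of a peer-access query to a copy path.
fn choose_copy_path(query: Result<bool, CudaError>) -> Result<CopyPath, CudaError> {
    match query {
        Ok(true) => Ok(CopyPath::Peer),
        // Either the devices cannot reach each other, or the driver entry
        // points are not loaded yet. Both cases stage through host memory.
        Ok(false) | Err(CudaError::NotSupported) => Ok(CopyPath::HostStaged),
        // Any other failure is a real error and is passed back to the caller.
        Err(other) => Err(other),
    }
}
```

In real code the argument would be the result of peer_copy::can_access_peer(&dev0, &dev1). Keeping the fallback in one place means downstream code should not need to change once the driver functions are loaded.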
§Example
use oxicuda_driver::device::Device;
use oxicuda_memory::peer_copy;
oxicuda_driver::init()?;
let dev0 = Device::get(0)?;
let dev1 = Device::get(1)?;
if peer_copy::can_access_peer(&dev0, &dev1)? {
peer_copy::enable_peer_access(&dev0, &dev1)?;
// Now D2D copies between dev0 and dev1 can go over NVLink/PCIe
// peer_copy::copy_peer(&mut dst_buf, &dev1, &src_buf, &dev0)?;
}
Functions§
- can_access_peer - Checks whether device can directly access memory on peer.
- copy_peer - Copies data between device buffers on different GPUs (synchronous).
- copy_peer_async - Copies data between device buffers on different GPUs (asynchronous).
- disable_peer_access - Disables peer access from the current context’s device to peer.
- enable_peer_access - Enables peer access from the current context’s device to peer.
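Since enable_peer_access and disable_peer_access come as a pair, one way to keep them balanced is a scope guard that enables access on construction and disables it on drop. The sketch below shows that pattern only; it is not part of this module. PeerControl, PeerAccessGuard, and Recorder are hypothetical names, devices are represented by plain ordinals, and the error type is a String purely to keep the example self-contained.

```rust
use std::cell::RefCell;

// Hypothetical abstraction over the enable/disable pair. Real code would call
// the module's enable_peer_access / disable_peer_access functions instead.
trait PeerControl {
    fn enable(&self, from: u32, to: u32) -> Result<(), String>;
    fn disable(&self, from: u32, to: u32);
}

// Scope guard: peer access stays enabled for as long as the guard is alive.
struct PeerAccessGuard<'a, C: PeerControl> {
    ctl: &'a C,
    from: u32,
    to: u32,
}

impl<'a, C: PeerControl> PeerAccessGuard<'a, C> {
    // Enables access from `from` to `to`; no guard is created on failure,
    // so nothing is disabled later for an access that was never enabled.
    fn new(ctl: &'a C, from: u32, to: u32) -> Result<Self, String> {
        ctl.enable(from, to)?;
        Ok(Self { ctl, from, to })
    }
}

impl<'a, C: PeerControl> Drop for PeerAccessGuard<'a, C> {
    fn drop(&mut self) {
        // Runs on every exit path from the scope, including early returns.
        self.ctl.disable(self.from, self.to);
    }
}

// Mock controller that records calls instead of touching a GPU.
#[derive(Default)]
struct Recorder {
    log: RefCell<Vec<String>>,
}

impl PeerControl for Recorder {
    fn enable(&self, from: u32, to: u32) -> Result<(), String> {
        self.log.borrow_mut().push(format!("enable {from}->{to}"));
        Ok(())
    }
    fn disable(&self, from: u32, to: u32) {
        self.log.borrow_mut().push(format!("disable {from}->{to}"));
    }
}
```

A guard covers one direction only. In the CUDA driver API, peer access is unidirectional, so copies in both directions need access enabled from each side.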