pub struct CudaBackend {
    pub device_info: CudaDeviceInfo,
    /* private fields */
}
CUDA compute backend.
Without the cuda-backend feature: a no-op stub. Self::try_new always returns
CudaInitError::FeatureNotEnabled, and all buffer and kernel methods operate
on CPU shadows so unit tests compile and run on any platform.
With the cuda-backend feature: a real cudarc device context. Self::try_new
calls CudaContext::new(ordinal) and returns an error if the CUDA driver is
absent (e.g. on macOS or a Linux machine without an NVIDIA driver). Buffer
methods perform actual host↔device memcpy via the default stream.
Fields§
§device_info: CudaDeviceInfo
Device information (filled from driver attributes when a real context is active).
Implementations§
impl CudaBackend
pub fn try_new(ordinal: u32) -> Result<Self, CudaInitError>
Attempt to create a CUDA backend on device ordinal.
- Without cuda-backend feature: always returns Err(CudaInitError::FeatureNotEnabled).
- With cuda-backend feature: calls CudaContext::new(ordinal). Returns Err(CudaInitError::DeviceError(...)) if the CUDA driver is absent or the ordinal is invalid.
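A caller will usually want to fall back to a CPU path when no device can be opened. The sketch below shows one way to branch on the two documented outcomes. The enum is a local stand-in that models only the two variants named above; the String payload of DeviceError is an assumption, since the real payload type is not shown here, and ExecPath and choose_path are hypothetical names.

```rust
// Local stand-in for the crate's error type. Only the two variants named in
// the docs are modelled; the `String` payload is an assumption.
#[derive(Debug, PartialEq)]
enum CudaInitError {
    FeatureNotEnabled,
    DeviceError(String),
}

// Hypothetical label for the execution path the caller ends up on.
#[derive(Debug, PartialEq)]
enum ExecPath {
    Gpu,
    CpuFallback(&'static str),
}

// Decide which path to use from the outcome of a `try_new`-style constructor.
fn choose_path<B>(result: Result<B, CudaInitError>) -> ExecPath {
    match result {
        Ok(_) => ExecPath::Gpu,
        Err(CudaInitError::FeatureNotEnabled) => {
            ExecPath::CpuFallback("built without cuda-backend")
        }
        Err(CudaInitError::DeviceError(_)) => {
            ExecPath::CpuFallback("driver absent or invalid ordinal")
        }
    }
}
```

With the real types, the equivalent call would be along the lines of choose_path(CudaBackend::try_new(0)), with the match adapted to the crate's own error enum.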
pub fn is_available(&self) -> bool
True if a real CUDA device context is active.
pub fn device_info(&self) -> &CudaDeviceInfo
Device information.
pub fn create_buffer(&mut self, len: usize) -> CudaBufferHandle
Allocate a device buffer that can hold len f64 values.
Real path: calls CudaStream::alloc_zeros::<u8>(len * 8) and stores
the returned CudaSlice<u8>. Falls back to a CPU-shadow buffer when
no real context is active.
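The len * 8 figure is the byte size of len f64 values. A minimal sketch of that arithmetic follows; the helper name f64_buffer_bytes is hypothetical, and the overflow check is an addition of this sketch rather than documented backend behaviour.

```rust
use std::mem::size_of;

// Number of bytes needed to hold `len` f64 values, or None on overflow.
// The docs describe the allocation as `len * 8`; `checked_mul` is an
// addition of this sketch, not something the backend is documented to do.
fn f64_buffer_bytes(len: usize) -> Option<usize> {
    len.checked_mul(size_of::<f64>())
}
```

For example, a buffer for 4 values needs 32 bytes.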
pub fn alloc_unified(&mut self, len: usize) -> CudaBufferHandle
Allocate a unified memory buffer (accessible from both CPU and GPU).
In the current implementation unified memory is backed by the same
CudaSlice<u8> path as a regular buffer; true UM page migration would
require UnifiedSlice from cudarc which is gated on additional CUDA
driver capabilities. Falls back to a CPU-shadow buffer in the stub.
pub fn write_buffer(&mut self, handle: CudaBufferHandle, data: &[f64])
Upload data to the device buffer at handle.
Real path: CudaStream::memcpy_htod — synchronous on the default stream.
Stub path: copies into the CPU shadow.
pub fn read_buffer(&self, handle: CudaBufferHandle) -> Vec<f64>
Download data from the device buffer at handle.
Real path: CudaStream::clone_dtoh — synchronous copy to a new Vec<f64>.
Stub path: returns a clone of the CPU shadow.
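Because the device storage is byte-typed while the public API works in f64, some conversion between the two has to happen around each copy. How the backend does this internally is not documented here; the sketch below shows one plausible host-side scheme using native-endian bytes. Both function names are hypothetical.

```rust
// Pack f64 values into native-endian bytes, as a host-side staging step
// before a byte-oriented upload.
fn f64s_to_bytes(data: &[f64]) -> Vec<u8> {
    data.iter().flat_map(|v| v.to_ne_bytes()).collect()
}

// Reinterpret a byte buffer as f64 values; trailing bytes that do not
// fill a whole f64 are ignored.
fn bytes_to_f64s(bytes: &[u8]) -> Vec<f64> {
    bytes
        .chunks_exact(8)
        .map(|chunk| {
            let mut raw = [0u8; 8];
            raw.copy_from_slice(chunk);
            f64::from_ne_bytes(raw)
        })
        .collect()
}
```

A write followed by a read should round-trip exactly under this scheme, since no arithmetic is performed on the values.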
pub fn register_kernel(&mut self, name: &str, ptx_source: &str)
Register a PTX kernel source and associate it with name.
Real path: loads the module via CudaContext::load_module and retrieves
the named function. Stub path: records the name only.
pub fn compile_and_register(
    &mut self,
    name: &str,
    cuda_c_source: &str,
) -> Result<(), CudaInitError>
Compile a CUDA C kernel at runtime via NVRTC and register it.
Real path: calls cudarc::nvrtc::compile_ptx then loads the module.
Stub path: records the name and returns Ok(()).
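As an illustration of what the cuda_c_source argument might contain, the sketch below holds a small kernel as a Rust string constant. The kernel name scale_copy and its body are hypothetical. It takes only two buffer pointers and no scalars, so it performs no bounds check and assumes both buffers hold at least grid_x * block_x elements.

```rust
// Hypothetical kernel name and source, for illustration only.
const KERNEL_NAME: &str = "scale_copy";

// `extern "C"` keeps the symbol unmangled so it can be looked up by name.
// There is no bounds check: both buffers are assumed to hold at least
// grid_x * block_x elements.
const KERNEL_SRC: &str = r#"
extern "C" __global__ void scale_copy(const double* src, double* dst) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    dst[i] = 2.0 * src[i];
}
"#;
```

These constants would be passed as compile_and_register(KERNEL_NAME, KERNEL_SRC), with the returned Result checked by the caller.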
pub fn launch(
    &mut self,
    name: &str,
    buffers: &[CudaBufferHandle],
    grid_x: u32,
    block_x: u32,
)
Launch a registered kernel with buffer arguments only.
§Parameters
- name: kernel name as passed to Self::register_kernel or Self::compile_and_register
- buffers: buffer handles bound as kernel arguments (in order)
- grid_x: number of thread blocks in X dimension
- block_x: number of threads per block in X dimension
For kernels that take scalar arguments (e.g. an integer particle count
or a floating-point smoothing length), use Self::launch_with_scalars
instead. Calling launch against a kernel whose signature includes
scalar parameters passes uninitialised registers to those slots, which
is undefined behaviour.
Real path: retrieves the stored CudaFunction and dispatches via
CudaStream::launch_builder. Currently up to two buffer arguments
are forwarded; extend as needed for higher-arity kernels.
Stub path: no-op.
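Choosing grid_x for a given problem size is a ceiling division by block_x. The helper below is a hypothetical sketch of that calculation and is not part of the backend.

```rust
use std::convert::TryFrom;

// Smallest number of blocks of `block_x` threads that covers `n` elements.
// Returns None if `block_x` is zero or the result does not fit in u32.
fn grid_for(n: usize, block_x: u32) -> Option<u32> {
    if block_x == 0 {
        return None;
    }
    let block = block_x as usize;
    // Ceiling division without risking overflow in `n + block - 1`.
    let blocks = n / block + usize::from(n % block != 0);
    u32::try_from(blocks).ok()
}
```

When n is not a multiple of block_x, the last block contains surplus threads. A kernel launched without scalars has no element count to check against, so the buffers would need to be padded to grid_x * block_x elements, or the count passed through launch_with_scalars.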
pub fn launch_with_scalars(
    &mut self,
    name: &str,
    buffers: &[CudaBufferHandle],
    scalars_i32: &[i32],
    scalars_f64: &[f64],
    grid_x: u32,
    block_x: u32,
)
Launch a registered kernel passing buffer and scalar arguments.
Scalars are appended to the kernel argument list after the buffer
arguments in the order i32 scalars then f64 scalars; the kernel
signature must match that ordering exactly.
§Parameters
- name: kernel name as passed to Self::register_kernel or Self::compile_and_register
- buffers: buffer handles bound as the leading kernel arguments
- scalars_i32: i32 scalars appended after the buffers
- scalars_f64: f64 scalars appended after the i32 scalars
- grid_x: number of thread blocks in X dimension
- block_x: number of threads per block in X dimension
Stub path: no-op.
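The argument ordering can be made concrete with a small model. The sketch below flattens the three groups into the documented order; KernelArg and ordered_args are hypothetical names, and a plain usize stands in for a buffer handle.

```rust
// Hypothetical model of one kernel argument slot; not a type from the crate.
#[derive(Debug, PartialEq)]
enum KernelArg {
    Buffer(usize), // stand-in for a buffer handle
    I32(i32),
    F64(f64),
}

// Flatten the three argument groups into the documented order:
// buffers first, then i32 scalars, then f64 scalars.
fn ordered_args(buffers: &[usize], scalars_i32: &[i32], scalars_f64: &[f64]) -> Vec<KernelArg> {
    buffers
        .iter()
        .map(|&b| KernelArg::Buffer(b))
        .chain(scalars_i32.iter().map(|&i| KernelArg::I32(i)))
        .chain(scalars_f64.iter().map(|&f| KernelArg::F64(f)))
        .collect()
}
```

Two buffers, one i32 and one f64 would therefore correspond to a CUDA C signature of the form (double* a, double* b, int n, double h). A kernel declared as (double* a, int n, double* b, double h) would not match, even though it takes the same set of arguments.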
pub fn synchronize(&mut self)
Synchronise the device (blocks until all submitted work completes).
Real path: CudaStream::synchronize().
Stub path: immediate return.
pub fn device_count() -> u32
Return the number of CUDA devices available on this system.
Real path: calls cudarc::driver::result::device::get_count().
Stub path: always returns 0.
pub fn query_device_info(ordinal: u32) -> Result<CudaDeviceInfo, CudaInitError>
Query device attributes for device ordinal without creating a backend.
Stub path: always returns Err(CudaInitError::NotAvailable).
Real path: returns basic info derived from the driver (name, total mem, CC).
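Together with device_count, this allows every device to be inspected before a backend is created. The sketch below shows the enumeration pattern in a form that runs without the crate: the two closures stand in for device_count and query_device_info, and enumerate_devices is a hypothetical helper.

```rust
// Collect per-device query results for ordinals 0..count.
// `count` and `query` are passed in as closures so the sketch runs without
// the crate; they stand in for `device_count` and `query_device_info`.
fn enumerate_devices<T, E>(
    count: impl Fn() -> u32,
    query: impl Fn(u32) -> Result<T, E>,
) -> Vec<(u32, Result<T, E>)> {
    (0..count())
        .map(|ordinal| (ordinal, query(ordinal)))
        .collect()
}
```

With the real type this would be called as enumerate_devices(CudaBackend::device_count, CudaBackend::query_device_info). In a stub build the result is empty, because device_count returns 0 there.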
Trait Implementations§
Auto Trait Implementations§
impl Freeze for CudaBackend
impl RefUnwindSafe for CudaBackend
impl Send for CudaBackend
impl Sync for CudaBackend
impl Unpin for CudaBackend
impl UnsafeUnpin for CudaBackend
impl UnwindSafe for CudaBackend
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.