Struct CudaBackend 

pub struct CudaBackend {
    pub device_info: CudaDeviceInfo,
    /* private fields */
}

CUDA compute backend.

Without the cuda-backend feature: a no-op stub. Self::try_new always returns Err(CudaInitError::FeatureNotEnabled), and all buffer and kernel methods operate on CPU shadows so that unit tests compile and run on any platform.

With the cuda-backend feature: a real cudarc device context. Self::try_new calls CudaContext::new(ordinal) and returns an error if the CUDA driver is absent (e.g. on macOS, or on a Linux machine without an NVIDIA driver). Buffer methods perform actual host↔device copies on the default stream.

Fields

device_info: CudaDeviceInfo

Device information (filled from driver attributes when a real context is active).

Implementations

impl CudaBackend

pub fn try_new(ordinal: u32) -> Result<Self, CudaInitError>

Attempt to create a CUDA backend on device ordinal.

  • Without cuda-backend feature: always returns Err(CudaInitError::FeatureNotEnabled).
  • With cuda-backend feature: calls CudaContext::new(ordinal). Returns Err(CudaInitError::DeviceError(...)) if the CUDA driver is absent or the ordinal is invalid.
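
A common way to use the two constructors together is to try the real device first and fall back to the stub on any error. The sketch below is only an illustration: the CudaInitError, CudaBackend and backend_or_stub definitions are stand-ins written for this example (they model the documented no-feature behaviour), not the crate's implementation.

```rust
// Stand-in types: they mirror only the signatures documented on this page
// and are NOT the crate's real implementation.
#[derive(Debug, PartialEq)]
pub enum CudaInitError {
    FeatureNotEnabled,
}

pub struct CudaBackend {
    real_context: bool,
}

impl CudaBackend {
    /// Models the documented behaviour without the cuda-backend feature:
    /// construction always fails with FeatureNotEnabled.
    pub fn try_new(_ordinal: u32) -> Result<Self, CudaInitError> {
        Err(CudaInitError::FeatureNotEnabled)
    }

    /// CPU-fallback stub with no device context.
    pub fn new_stub() -> Self {
        CudaBackend { real_context: false }
    }

    /// True only when a real device context is active.
    pub fn is_available(&self) -> bool {
        self.real_context
    }
}

/// Hypothetical helper: prefer a real device, otherwise fall back to the stub.
pub fn backend_or_stub(ordinal: u32) -> CudaBackend {
    match CudaBackend::try_new(ordinal) {
        Ok(backend) => backend,
        Err(err) => {
            eprintln!("CUDA unavailable ({err:?}); using CPU stub");
            CudaBackend::new_stub()
        }
    }
}
```

Callers can then branch on is_available() when a code path only makes sense on a real device.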

pub fn new_stub() -> Self

Create a CPU-fallback stub (useful for unit testing without a GPU).

pub fn is_available(&self) -> bool

True if a real CUDA device context is active.

pub fn device_info(&self) -> &CudaDeviceInfo

Returns a reference to the device information.

pub fn create_buffer(&mut self, len: usize) -> CudaBufferHandle

Allocate a device buffer that can hold len f64 values.

Real path: calls CudaStream::alloc_zeros::<u8>(len * 8) and stores the returned CudaSlice<u8>. Falls back to a CPU-shadow buffer when no real context is active.
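
The len * 8 sizing follows from storing f64 values in a byte buffer. The sketch below shows that arithmetic and a byte-level round trip in plain Rust. The helper names are hypothetical, and the overflow check and native byte order are assumptions; this page does not say how the crate handles either.

```rust
use std::mem::size_of;

/// Hypothetical helper: bytes needed for `len` f64 values, or None if the
/// multiplication would overflow usize.
pub fn f64_buffer_bytes(len: usize) -> Option<usize> {
    len.checked_mul(size_of::<f64>())
}

/// View a slice of f64 values as raw bytes (native byte order assumed).
pub fn f64s_to_bytes(values: &[f64]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_ne_bytes()).collect()
}

/// Rebuild f64 values from raw bytes; trailing bytes that do not fill a
/// whole f64 are ignored.
pub fn bytes_to_f64s(bytes: &[u8]) -> Vec<f64> {
    bytes
        .chunks_exact(size_of::<f64>())
        .map(|chunk| {
            let mut raw = [0u8; 8];
            raw.copy_from_slice(chunk);
            f64::from_ne_bytes(raw)
        })
        .collect()
}
```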

pub fn alloc_unified(&mut self, len: usize) -> CudaBufferHandle

Allocate a unified memory buffer (accessible from both CPU and GPU).

In the current implementation, unified memory is backed by the same CudaSlice<u8> path as a regular buffer. True unified-memory page migration would require cudarc's UnifiedSlice, which is gated on additional CUDA driver capabilities. Falls back to a CPU-shadow buffer in the stub.

pub fn write_buffer(&mut self, handle: CudaBufferHandle, data: &[f64])

Upload data to the device buffer at handle.

Real path: CudaStream::memcpy_htod — synchronous on the default stream. Stub path: copies into the CPU shadow.

pub fn read_buffer(&self, handle: CudaBufferHandle) -> Vec<f64>

Download data from the device buffer at handle.

Real path: CudaStream::clone_dtoh — synchronous copy to a new Vec<f64>. Stub path: returns a clone of the CPU shadow.
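
The stub path for create_buffer, write_buffer and read_buffer can be pictured as a table of host-side vectors indexed by handle. The following is a minimal model of that idea, not the crate's code: ShadowBackend and BufferHandle are hypothetical names, and zero-initialisation is assumed by analogy with alloc_zeros on the real path.

```rust
/// Stand-in handle type; the real CudaBufferHandle is opaque on this page.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BufferHandle(usize);

/// Hypothetical model of the stub path: each buffer is a Vec<f64> "shadow".
#[derive(Default)]
pub struct ShadowBackend {
    shadows: Vec<Vec<f64>>,
}

impl ShadowBackend {
    /// Allocate a zero-filled shadow holding `len` f64 values.
    pub fn create_buffer(&mut self, len: usize) -> BufferHandle {
        self.shadows.push(vec![0.0; len]);
        BufferHandle(self.shadows.len() - 1)
    }

    /// Copy `data` into the start of the shadow.
    /// Panics if `data` is longer than the buffer (an assumption of this model).
    pub fn write_buffer(&mut self, handle: BufferHandle, data: &[f64]) {
        self.shadows[handle.0][..data.len()].copy_from_slice(data);
    }

    /// Return a copy of the shadow contents.
    pub fn read_buffer(&self, handle: BufferHandle) -> Vec<f64> {
        self.shadows[handle.0].clone()
    }
}
```

A unit test written against this kind of model exercises the same upload and download calls that would reach the device on the real path.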

pub fn register_kernel(&mut self, name: &str, ptx_source: &str)

Register a PTX kernel source and associate it with name.

Real path: loads the module via CudaContext::load_module and retrieves the named function. Stub path: records the name only.

pub fn compile_and_register( &mut self, name: &str, cuda_c_source: &str, ) -> Result<(), CudaInitError>

Compile a CUDA C kernel at runtime via NVRTC and register it.

Real path: calls cudarc::nvrtc::compile_ptx then loads the module. Stub path: records the name and returns Ok(()).

pub fn launch( &mut self, name: &str, buffers: &[CudaBufferHandle], grid_x: u32, block_x: u32, )

Launch a registered kernel with buffer arguments only.

Parameters
  • name — kernel name as passed to Self::register_kernel or Self::compile_and_register
  • buffers — buffer handles bound as kernel arguments (in order)
  • grid_x — number of thread blocks in X dimension
  • block_x — number of threads per block in X dimension

For kernels that take scalar arguments (e.g. an integer particle count or floating-point smoothing length), use Self::launch_with_scalars instead — calling launch against a kernel whose signature includes scalar parameters will pass uninitialised registers to those slots and is undefined behaviour.

Real path: retrieves the stored CudaFunction and dispatches via CudaStream::launch_builder. Currently up to two buffer arguments are forwarded; extend as needed for higher-arity kernels.

Stub path: no-op.
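
Choosing grid_x for a given element count is usually a ceiling division by block_x, so that grid_x * block_x covers every element. The helper below is a hypothetical convenience, not part of the documented API.

```rust
/// Hypothetical helper: smallest number of blocks such that
/// grid_x * block_x >= n_elements.
pub fn grid_for(n_elements: u32, block_x: u32) -> u32 {
    assert!(block_x > 0, "block_x must be non-zero");
    // Ceiling division written so that it cannot overflow u32.
    n_elements / block_x + u32::from(n_elements % block_x != 0)
}
```

For example, grid_for(1000, 256) gives 4 blocks, which is 1024 threads for 1000 elements. Because launch forwards buffers only, a kernel launched this way cannot receive the element count for a bounds check, so it is safest when the buffer length is a multiple of block_x. A result of 0 means there is nothing to launch.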

pub fn launch_with_scalars( &mut self, name: &str, buffers: &[CudaBufferHandle], scalars_i32: &[i32], scalars_f64: &[f64], grid_x: u32, block_x: u32, )

Launch a registered kernel passing buffer and scalar arguments.

Scalars are appended to the kernel argument list after the buffer arguments in the order i32 scalars then f64 scalars; the kernel signature must match that ordering exactly.

Parameters
  • name — kernel name as passed to Self::register_kernel or Self::compile_and_register
  • buffers — buffer handles bound as the leading kernel arguments
  • scalars_i32 — i32 scalars appended after the buffers
  • scalars_f64 — f64 scalars appended after the i32 scalars
  • grid_x — number of thread blocks in X dimension
  • block_x — number of threads per block in X dimension

Stub path: no-op.
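
To make the argument ordering concrete, the sketch below renders the CUDA C parameter list that a kernel would need for given argument counts: buffers first, then i32 scalars, then f64 scalars. The expected_params helper and the generated parameter names are hypothetical, and typing buffer parameters as double* is an assumption based on the buffers holding f64 values.

```rust
/// Hypothetical helper: renders the CUDA C parameter list implied by the
/// documented ordering (buffers, then i32 scalars, then f64 scalars).
/// Buffer parameters are assumed to be `double*`; Rust i32 maps to `int`
/// and f64 maps to `double`.
pub fn expected_params(n_buffers: usize, n_i32: usize, n_f64: usize) -> String {
    let buffers = (0..n_buffers).map(|i| format!("double* buf{i}"));
    let ints = (0..n_i32).map(|i| format!("int i{i}"));
    let doubles = (0..n_f64).map(|i| format!("double f{i}"));
    buffers
        .chain(ints)
        .chain(doubles)
        .collect::<Vec<_>>()
        .join(", ")
}
```

With two buffers, one i32 and one f64, this yields "double* buf0, double* buf1, int i0, double f0", which is the shape the kernel signature has to match.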

pub fn synchronize(&mut self)

Synchronise the device (blocks until all submitted work completes).

Real path: CudaStream::synchronize(). Stub path: immediate return.

pub fn device_count() -> u32

Return the number of CUDA devices available on this system.

Real path: calls cudarc::driver::result::device::get_count(). Stub path: always returns 0.

pub fn query_device_info(ordinal: u32) -> Result<CudaDeviceInfo, CudaInitError>

Query device attributes for device ordinal without creating a backend.

Real path: returns basic information derived from the driver (device name, total memory, compute capability). Stub path: always returns Err(CudaInitError::NotAvailable).

Trait Implementations

impl Debug for CudaBackend

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> Downcast<T> for T

fn downcast(&self) -> &T

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> Upcast<T> for T

fn upcast(&self) -> Option<&T>

impl<T> WasmNotSend for T
where T: Send,

impl<T> WasmNotSendSync for T

impl<T> WasmNotSync for T
where T: Sync,