Expand description
Provides a safe and convenient wrapper for the CUDA cuDNN API.
This crate (1.0.0) was developed against cuDNN v3.
§Architecture
This crate provides three levels of entry.
FFI
The ffi module exposes the foreign function interface and the cuDNN-specific types. Usually there is no need to touch it if you only want to use cuDNN in your application. The ffi is provided by the rust-cudnn-sys crate and gets reexported here.
Low-Level
The api module already exposes a complete and safe wrapper for the cuDNN API, including proper Rust errors. Usually there is no need to use the API directly, though, as the Cudnn module, described in the next block, provides all of the API functionality through a more convenient interface. A rough sketch of direct low-level use follows below.
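As a rough, hypothetical sketch of what direct use of the low-level wrapper could look like (the names API, API::init and API::destroy are assumptions made for illustration; consult the api module documentation for the actual items and signatures):
extern crate rcudnn as cudnn;
fn main() {
    // Assumed path and names of the low-level wrapper; check the `api` module.
    use cudnn::API;
    // Create a raw cuDNN handle, use it with further low-level wrappers, then free it.
    let handle = API::init().unwrap();
    // ... call other `API::*` wrappers with `handle` here ...
    let _ = API::destroy(handle);
}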
High-Level
The cudnn module exposes the Cudnn struct, which provides a very convenient, easy-to-understand interface to the cuDNN API. There should be little need to obtain and read the cuDNN manual: initialize the Cudnn struct and you can call the available methods, which represent all of the available cuDNN operations.
§Examples
extern crate rcudnn as cudnn;
extern crate libc;
use cudnn::{Cudnn, TensorDescriptor};
use cudnn::utils::{ScalParams, DataType};
fn main() {
// Initialize a new cuDNN context and allocate resources.
let cudnn = Cudnn::new().unwrap();
// Create a cuDNN Tensor Descriptor for `src` and `dest` memory.
let src_desc = TensorDescriptor::new(&[2, 2, 2], &[4, 2, 1], DataType::Float).unwrap();
let dest_desc = TensorDescriptor::new(&[2, 2, 2], &[4, 2, 1], DataType::Float).unwrap();
let acti = cudnn.init_activation().unwrap();
// Obtain the `src` and `dest` memory pointers on the GPU.
// NOTE: You wouldn't do it like this in a real program; you would actually allocate memory on the GPU, e.g. with CUDA or Collenchyma.
let src_data: *const ::libc::c_void = ::std::ptr::null();
let dest_data: *mut ::libc::c_void = ::std::ptr::null_mut();
// Now you can compute the forward sigmoid activation on your GPU.
cudnn.sigmoid_forward::<f32>(&acti, &src_desc, src_data, &dest_desc, dest_data, ScalParams::default());
}
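For completeness, here is a rough sketch of how the `src` and `dest` pointers could be backed by real device memory through the reexported CUDA runtime bindings (see cudaMalloc and cudaFree in the Functions section below). The exact pointer and size types of the generated bindings are assumptions here and may require small adjustments; error codes are ignored for brevity and no input data is copied to the device.
extern crate rcudnn as cudnn;
extern crate libc;
use cudnn::{Cudnn, TensorDescriptor};
use cudnn::utils::{ScalParams, DataType};
// Assumed reexports of the CUDA runtime bindings (listed under Functions below).
use cudnn::{cudaMalloc, cudaFree};
fn main() {
    let cudnn = Cudnn::new().unwrap();
    let src_desc = TensorDescriptor::new(&[2, 2, 2], &[4, 2, 1], DataType::Float).unwrap();
    let dest_desc = TensorDescriptor::new(&[2, 2, 2], &[4, 2, 1], DataType::Float).unwrap();
    let acti = cudnn.init_activation().unwrap();
    // Allocate 2 * 2 * 2 f32 elements on the device for `src` and `dest`.
    // In real code, check the returned error codes and copy input data into `src` first.
    let bytes = 2 * 2 * 2 * ::std::mem::size_of::<f32>();
    let mut src_data: *mut ::std::os::raw::c_void = ::std::ptr::null_mut();
    let mut dest_data: *mut ::std::os::raw::c_void = ::std::ptr::null_mut();
    unsafe {
        let _ = cudaMalloc(&mut src_data, bytes);
        let _ = cudaMalloc(&mut dest_data, bytes);
    }
    // Compute the forward sigmoid activation over the device buffers.
    cudnn.sigmoid_forward::<f32>(&acti, &src_desc, src_data as *const ::libc::c_void, &dest_desc, dest_data as *mut ::libc::c_void, ScalParams::default());
    // Release the device memory again.
    unsafe {
        let _ = cudaFree(src_data);
        let _ = cudaFree(dest_data);
    }
}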
§Notes
rust-cudnn was developed at Autumn for the Rust Machine Intelligence Framework Leaf.
rust-cudnn is part of the High-Performance Computation Framework Collenchyma, where it backs the Neural Network Plugin. rust-cudnn is now maintained by Juice.
Modules§
- Defines CUDA Device Memory.
- Describes utility functionality for CUDA cuDNN.
Structs§
- Defines the CUDA cuDNN API.
- Describes an ActivationDescriptor.
- Describes a Convolution Descriptor.
- Provides the high-level interface to CUDA’s cuDNN.
- Describes a DropoutDescriptor.
- Describes a Filter Descriptor.
- Describes an LRN Descriptor.
- Describes a Pooling Descriptor.
- Describes a Recurrent Descriptor.
- Describes a TensorDescriptor.
- Specifies an access policy for a window, a contiguous extent of memory beginning at base_ptr and ending at base_ptr + num_bytes. The window is partitioned into many segments, assigned such that the sum of the “hit segments” divided by the window size approximates ratio and the sum of the “miss segments” divided by the window size approximates 1 - ratio. Segments and ratio specifications are fitted to the capabilities of the architecture. Accesses in a hit segment apply the hitProp access policy; accesses in a miss segment apply the missProp access policy.
- Sparse CUDA array and CUDA mipmapped array properties
- CUDA Channel format descriptor
- CUDA device properties
- CUDA extent
- External memory buffer descriptor
- External memory handle descriptor
- Win32 handle referencing the semaphore object. Valid when type is one of the following:
- External memory mipmap descriptor
- External semaphore handle descriptor
- Win32 handle referencing the semaphore object. Valid when type is one of the following:
- External semaphore signal node parameters
- External semaphore signal parameters, compatible with driver type
- Parameters for fence objects
- Parameters for keyed mutex objects
- External semaphore wait node parameters
- External semaphore wait parameters, compatible with driver type
- Parameters for fence objects
- Parameters for keyed mutex objects
- CUDA function attributes
- CUDA graphics interop resource
- CUDA host node parameters
- CUDA IPC event handle
- CUDA IPC memory handle
- CUDA GPU kernel node parameters
- CUDA launch parameters
- Memory access descriptor
- Memory allocation node parameters
- Specifies a memory location.
- Specifies the properties of allocations made from the pool.
- Opaque data for exporting a pool allocation
- CUDA 3D memory copying parameters
- CUDA 3D cross-device memory copying parameters
- CUDA Memset node parameters
- CUDA Pitched memory pointer
- CUDA pointer attributes
- CUDA 3D position
- CUDA resource descriptor
- CUDA resource view descriptor
- CUDA texture descriptor
- CUDA Surface reference
- CUDA texture reference
Enums§
- Defines CUDA’s cuDNN errors.
- Specifies performance hint with ::cudaAccessPolicyWindow for hitProp and missProp members.
- CUDA cooperative group scope
- Channel format kind
- CUDA device compute modes
- CUDA device attributes
- CUDA device P2P attributes
- CUDA error types
- CUDA error types
- External memory handle types
- External semaphore handle types
- CUDA GPUDirect RDMA flush writes APIs supported on the device
- CUDA GPUDirect RDMA flush writes scopes
- CUDA GPUDirect RDMA flush writes targets
- CUDA function attributes that can be set using ::cudaFuncSetAttribute
- CUDA function cache configurations
- CUDA GPUDirect RDMA flush writes ordering features of the device
- Flags to specify search options to be used with ::cudaGetDriverEntryPoint For more details see ::cuGetProcAddress
- CUDA Graph debug write options
- CUDA Graph Update error types
- Flags for instantiating a graph
- Graph memory attributes
- CUDA Graph node types
- CUDA graphics interop array indices for cube maps
- CUDA graphics interop map flags
- CUDA graphics interop register flags
- Graph kernel node Attributes
- CUDA Limits
- Specifies the memory protection flags for mapping.
- Flags for specifying particular handle types
- Defines the allocation types available
- Specifies the type of location
- CUDA memory pool attributes
- CUDA range attributes
- CUDA memory copy types
- CUDA Memory Advise values
- CUDA memory types
- CUDA Profiler Output modes
- CUDA Profiler Output modes
- CUDA resource types
- CUDA texture resource view formats
- Shared memory carveout configurations. These may be passed to cudaFuncSetAttribute
- CUDA shared memory configuration
- Stream Attributes
- Possible modes for stream capture thread interactions. For more details see ::cudaStreamBeginCapture and ::cudaThreadExchangeStreamCaptureMode
- Possible stream capture statuses returned by ::cudaStreamIsCapturing
- Flags for ::cudaStreamUpdateCaptureDependencies
- CUDA Surface boundary modes
- CUDA Surface format modes
- CUDA texture address modes
- CUDA texture filter modes
- CUDA texture read modes
- Flags for user objects for graphs
- Flags for retaining user object references for graphs
Constants§
Functions§
- \brief Gets info about the specified cudaArray
- \brief Gets a CUDA array plane from a CUDA array
- \brief Binds an array to a surface
- \brief Binds a memory area to a texture
- \brief Binds a 2D memory area to a texture
- \brief Binds an array to a texture
- \brief Binds a mipmapped array to a texture
- \brief Select compute-device which best matches criteria
- \brief Returns a channel descriptor using the specified format
- \brief Creates a surface object
- \brief Creates a texture object
- \brief Resets all persisting lines in cache to normal status.
- \brief Destroys an external memory object.
- \brief Destroys an external semaphore
- \brief Destroys a surface object
- \brief Destroys a texture object
- \brief Queries if a device may directly access a peer device’s memory.
- \brief Disables direct access to memory allocations on a peer device.
- \brief Enables direct access to memory allocations on a peer device.
- \brief Returns information about the device
- \brief Returns a handle to a compute device
- \brief Returns the preferred cache configuration for the current device.
- \brief Returns the default mempool of a device
- \brief Returns resource limits
- \brief Gets the current mempool for a device
- \brief Return NvSciSync attributes that this device can support.
- \brief Queries attributes of the link between two devices.
- \brief Returns a PCI Bus Id string for the device
- \brief Returns the shared memory configuration for the current device.
- \brief Returns numerical values that correspond to the least and greatest stream priorities.
- \brief Destroy all allocations and reset all state on the current device in the current process.
- \brief Sets the preferred cache configuration for the current device.
- \brief Set resource limits
- \brief Sets the current memory pool of a device
- \brief Sets the shared memory configuration for the current device.
- \brief Wait for compute device to finish
- \brief Returns the latest version of CUDA supported by the driver
- \brief Creates an event object
- \brief Creates an event object with the specified flags
- \brief Destroys an event object
- \brief Computes the elapsed time between events
- \brief Queries an event’s status
- \brief Records an event
- \brief Waits for an event to complete
- \brief Maps a buffer onto an imported memory object
- \brief Maps a CUDA mipmapped array onto an external memory object
- \brief Frees memory on the device
- \brief Frees an array on the device
- \brief Frees memory with stream ordered semantics
- \brief Frees page-locked memory
- \brief Frees a mipmapped array on the device
- \brief Find out attributes for a given function
- \brief Set attributes for a given function
- \brief Sets the preferred cache configuration for a device function
- \brief Sets the shared memory configuration for a device function
- \brief Get the channel descriptor of an array
- \brief Returns which device is currently being used
- \brief Returns the number of compute-capable devices
- \brief Gets the flags for the current device
- \brief Returns information about the compute-device
- \brief Returns the requested driver API function pointer
- \brief Returns the string representation of an error code enum name
- \brief Returns the description string for an error code
- \brief Get pointer to device entry function that matches entry function \p symbolPtr
- \brief Returns the last error from a runtime call
- \brief Gets a mipmap level of a CUDA mipmapped array
- \brief Returns the resource descriptor for the surface object specified by \p surfObject
- \brief Get the surface reference associated with a symbol
- \brief Finds the address associated with a CUDA symbol
- \brief Finds the size of the object associated with a CUDA symbol
- \brief Get the alignment offset of a texture
- \brief Returns a texture object’s resource descriptor
- \brief Returns a texture object’s resource view descriptor
- \brief Returns a texture object’s texture descriptor
- \brief Get the texture reference associated with a symbol
- \brief Creates a child graph node and adds it to a graph
- \brief Adds dependency edges to a graph.
- \brief Creates an empty node and adds it to a graph
- \brief Creates a host execution node and adds it to a graph
- \brief Creates a kernel execution node and adds it to a graph
- \brief Creates a memcpy node and adds it to a graph
- \brief Creates a memset node and adds it to a graph
- \brief Gets a handle to the embedded graph of a child graph node
- \brief Clones a graph
- \brief Creates a graph
- \brief Write a DOT file describing graph structure
- \brief Destroys a graph
- \brief Remove a node from the graph
- \brief Destroys an executable graph
- \brief Sets the parameters for a host node in the given graphExec.
- \brief Sets the parameters for a kernel node in the given graphExec
- \brief Sets the parameters for a memcpy node in the given graphExec.
- \brief Sets the parameters for a memset node in the given graphExec.
- \brief Check whether an executable graph can be updated with a graph and perform the update if possible
- \brief Returns a graph’s dependency edges
- \brief Returns a graph’s nodes
- \brief Returns a graph’s root nodes
- \brief Returns a host node’s parameters
- \brief Sets a host node’s parameters
- \brief Creates an executable graph from a graph
- \brief Copies attributes from source node to destination node.
- \brief Queries node attribute.
- \brief Returns a kernel node’s parameters
- \brief Sets node attribute.
- \brief Sets a kernel node’s parameters
- \brief Launches an executable graph in a stream
- \brief Returns a memcpy node’s parameters
- \brief Sets a memcpy node’s parameters
- \brief Returns a memset node’s parameters
- \brief Sets a memset node’s parameters
- \brief Finds a cloned version of a node
- \brief Returns a node’s dependencies
- \brief Returns a node’s dependent nodes
- \brief Returns a node’s type
- \brief Release a user object reference from a graph
- \brief Removes dependency edges from a graph.
- \brief Retain a reference to a user object from a graph
- \brief Map graphics resources for access by CUDA
- \brief Get a mipmapped array through which to access a mapped graphics resource.
- \brief Get a device pointer through which to access a mapped graphics resource.
- \brief Set usage flags for mapping a graphics resource
- \brief Get an array through which to access a subresource of a mapped graphics resource.
- \brief Unmap graphics resources.
- \brief Unregisters a graphics resource for access by CUDA
- \brief Allocates page-locked memory on the host
- \brief Passes back device pointer of mapped host memory allocated by cudaHostAlloc or registered by cudaHostRegister
- \brief Passes back flags used to allocate pinned host memory allocated by cudaHostAlloc
- \brief Registers an existing host memory range for use by CUDA
- \brief Unregisters a memory range that was registered with cudaHostRegister
- \brief Imports an external memory object
- \brief Imports an external semaphore
- \brief Attempts to close memory mapped with cudaIpcOpenMemHandle
- \brief Gets an interprocess handle for a previously allocated event
- \brief Gets an interprocess memory handle for an existing device memory allocation
- \brief Opens an interprocess event handle for use in the current process
- \brief Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.
- \brief Launches a device function where thread blocks can cooperate and synchronize as they execute
- \brief Launches device functions on multiple devices where thread blocks can cooperate and synchronize as they execute
- \brief Enqueues a host function call in a stream
- \brief Launches a device function
- \brief Allocate memory on the device
- \brief Allocates logical 1D, 2D, or 3D memory objects on the device
- \brief Allocate an array on the device
- \brief Allocate an array on the device
- \brief Allocates memory with stream ordered semantics
- \brief Allocates memory from a specified pool with stream ordered semantics.
- \brief Allocates page-locked memory on the host
- \brief Allocate a mipmapped array on the device
- \brief Allocates pitched memory on the device
- \brief Advise about the usage of a given memory range
- \brief Gets free and total device memory
- \brief Creates a memory pool
- \brief Destroys the specified memory pool
- \brief Export data to share a memory pool allocation between processes.
- \brief Exports a memory pool to the requested handle type.
- \brief Returns the accessibility of a pool from a device
- \brief Gets attributes of a memory pool
- \brief Imports a memory pool from a shared handle.
- \brief Import a memory pool allocation from another process.
- \brief Controls visibility of pools between devices
- \brief Sets attributes of a memory pool
- \brief Tries to release memory back to the OS
- \brief Prefetches memory to the specified destination device
- \brief Query an attribute of a given memory range
- \brief Query attributes of a given memory range.
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between 3D objects
- \brief Copies data between 3D objects
- \brief Copies memory between devices
- \brief Copies memory between devices asynchronously.
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data from the given symbol on the device
- \brief Copies data from the given symbol on the device
- \brief Copies memory between two devices
- \brief Copies memory between two devices asynchronously.
- \brief Copies data between host and device
- \brief Copies data between host and device
- \brief Copies data to the given symbol on the device
- \brief Copies data to the given symbol on the device
- \brief Initializes or sets device memory to a value
- \brief Initializes or sets device memory to a value
- \brief Initializes or sets device memory to a value
- \brief Initializes or sets device memory to a value
- \brief Initializes or sets device memory to a value
- \brief Initializes or sets device memory to a value
- \brief Returns dynamic shared memory available per block when launching \p numBlocks blocks on SM.
- \brief Returns occupancy for a device function
- \brief Returns occupancy for a device function with the specified flags
- \brief Returns the last error from a runtime call
- \brief Returns attributes about a specified pointer
- \brief Returns the CUDA Runtime version
- \brief Set device to be used for GPU executions
- \brief Sets flags to be used for device executions
- \brief Converts a double argument to be executed on a device
- \brief Converts a double argument after execution on a device
- \brief Set a list of devices that can be used for CUDA
- \brief Add a callback to a compute stream
- \brief Begins graph capture on a stream
- \brief Copies attributes from source stream to destination stream.
- \brief Create an asynchronous stream
- \brief Create an asynchronous stream
- \brief Create an asynchronous stream with the specified priority
- \brief Destroys and cleans up an asynchronous stream
- \brief Ends capture on a stream, returning the captured graph
- \brief Queries stream attribute.
- \brief Query capture status of a stream
- \brief Query a stream’s capture state (11.3+)
- \brief Query the flags of a stream
- \brief Query the priority of a stream
- \brief Returns a stream’s capture status
- \brief Queries an asynchronous stream for completion status
- \brief Sets stream attribute.
- \brief Waits for stream tasks to complete
- \brief Update the set of dependencies in a capturing stream (11.3+)
- \brief Make a compute stream wait on an event
- \brief Swaps the stream capture interaction mode for a thread
- \brief Exit and clean up from CUDA launches
- \brief Returns the preferred cache configuration for the current device.
- \brief Returns resource limits
- \brief Sets the preferred cache configuration for the current device.
- \brief Set resource limits
- \brief Wait for compute device to finish
- \brief Unbinds a texture
- \brief Create a user object
- \brief Release a reference to a user object
- \brief Retain a reference to a user object
- Create an empty tensor transform descriptor
- Destroys a previously created tensor transform descriptor.
- Retrieves the values stored in a previously initialized tensor transform descriptor.
- Create a destination descriptor for cudnnTransformTensor
- Initialize a previously created tensor transform descriptor.
- Return C Handle for a Vector of Tensor Descriptors
Type Aliases§
- CUDA array (as source copy argument)
- CUDA array
- CUDA event types
- CUDA external memory
- CUDA external semaphore
- CUDA function
- CUDA executable (launchable) graph
- CUDA graph node.
- CUDA graph
- CUDA graphics resource types
- CUDA host function. userData is the argument value passed to the function.
- CUDA IPC event handle
- CUDA IPC memory handle
- CUDA memory pool
- CUDA mipmapped array (as source argument)
- CUDA mipmapped array
- Type of stream callback functions. stream is the stream as passed to ::cudaStreamAddCallback and may be NULL; status is ::cudaSuccess or any persistent error on the stream; userData is the user parameter provided at registration.
- CUDA stream
- An opaque value that represents a CUDA Surface object
- An opaque value that represents a CUDA texture object
- CUDA user object for graphs
Unions§
- Graph kernel node attributes union, used with ::cudaGraphKernelNodeSetAttribute/::cudaGraphKernelNodeGetAttribute
- Stream attributes union used with ::cudaStreamSetAttribute/::cudaStreamGetAttribute