pub struct Graph { /* private fields */ }
§Implementations
impl Graph
pub unsafe fn from_raw_in_context(
    handle: cudaGraph_t,
    ctx: Arc<Context>,
) -> Result<Self>
pub fn create_buffer<T>(&mut self, length: usize) -> Result<GraphBuffer<T>>
Allocates graph-retained device memory.
The returned buffer can be used with graph-buffer node APIs. Any graph or executable graph that records the buffer retains the underlying device allocation for replay.
§Errors
Returns an error if CUDA cannot allocate device memory, the requested byte count overflows, or CUDA reports runtime initialization diagnostics.
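§Examples
The byte-count overflow case can be illustrated with a minimal standalone sketch. It does not use this crate: checked_byte_count is a hypothetical helper, and the check the crate actually performs may differ.

```rust
use std::mem::size_of;

/// Hypothetical helper (not part of this crate): computes the byte size of
/// `length` elements of `T`, returning `None` if the product overflows `usize`.
fn checked_byte_count<T>(length: usize) -> Option<usize> {
    length.checked_mul(size_of::<T>())
}

fn main() {
    // 1024 elements of a 4-byte type fit comfortably.
    assert_eq!(checked_byte_count::<f32>(1024), Some(4096));
    // A length close to usize::MAX cannot be expressed in bytes for f32.
    assert_eq!(checked_byte_count::<f32>(usize::MAX), None);
}
```

A caller that computes lengths from untrusted input could run a check like this before requesting a buffer, so that an overflow is reported early.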
pub fn zeroes_buffer<T>(&mut self, length: usize) -> Result<GraphBuffer<T>>
Allocates graph-retained device memory initialized to zero bytes.
§Errors
Returns an error if CUDA cannot allocate or initialize device memory, the requested byte count overflows, or CUDA reports runtime initialization diagnostics.
pub fn buffer_from_slice<T>(&mut self, values: &[T]) -> Result<GraphBuffer<T>>
Allocates graph-retained device memory initialized from a host slice.
§Errors
Returns an error if CUDA cannot allocate or copy device memory, the requested byte count overflows, or CUDA reports runtime initialization diagnostics.
pub fn instantiate(&self) -> Result<ExecutableGraph>
pub fn instantiate_with_flags(
    &self,
    flags: GraphInstantiateFlags,
) -> Result<ExecutableGraph>
Instantiates this graph as an executable graph. The graph is validated against any structural or intra-node constraints that were not previously validated. If instantiation succeeds, the executable graph is returned.
flags controls the behavior of instantiation and subsequent graph launches.
Valid flags are:
- GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH, which configures a graph containing memory allocation nodes to automatically free any unfreed memory allocations before the graph is relaunched.
- GraphInstantiateFlags::DEVICE_LAUNCH, which configures the graph for launch from the device. If this flag is passed, the executable graph handle returned can be used to launch the graph from both the host and device. This flag can only be used on platforms which support unified addressing. This flag cannot be used in conjunction with GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH.
- GraphInstantiateFlags::USE_NODE_PRIORITY, which causes the graph to use the priorities from the per-node attributes rather than the priority of the launch stream during execution. Priorities are only available on kernel nodes and are copied from stream priority during stream capture.
If the graph contains any allocation or free nodes, there can be at most one executable graph in existence for that graph at a time. An attempt to instantiate a second executable graph before dropping the first results in an error. The same also applies if the graph contains any device-updatable kernel nodes.
If the graph contains kernels which call device-side ExecutableGraph::launch from multiple devices, this results in an error.
Graphs instantiated for launch on the device have additional restrictions which do not apply to host graphs:
- The graph’s nodes must reside on a single device.
- The graph can only contain kernel nodes, memcpy nodes, memset nodes, and child graph nodes.
- The graph cannot be empty and must contain at least one kernel, memcpy, or memset node.
Operation-specific restrictions are outlined below.
- Kernel nodes:
  - Use of CUDA Dynamic Parallelism is not permitted.
  - Cooperative launches are permitted as long as MPS is not in use.
- Memcpy nodes:
  - Only copies involving device memory and/or pinned device-mapped host memory are permitted.
  - Copies involving CUDA arrays are not permitted.
  - Both operands must be accessible from the current device, and the current device must match the device of other nodes in the graph.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
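§Examples
The rule that DEVICE_LAUNCH cannot be combined with AUTO_FREE_ON_LAUNCH can be checked before instantiation. The sketch below is a standalone model, not this crate's API: Flags and check_flag_combination are hypothetical, and the bit values are illustrative rather than the real CUDA values.

```rust
/// Hypothetical stand-in for GraphInstantiateFlags. Bit values are illustrative.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Flags(u32);

impl Flags {
    const AUTO_FREE_ON_LAUNCH: Flags = Flags(1 << 0);
    const DEVICE_LAUNCH: Flags = Flags(1 << 1);
    const USE_NODE_PRIORITY: Flags = Flags(1 << 2);

    /// Combines two flag sets.
    fn union(self, other: Flags) -> Flags {
        Flags(self.0 | other.0)
    }

    /// Returns true if every bit of `other` is set in `self`.
    fn contains(self, other: Flags) -> bool {
        self.0 & other.0 == other.0
    }
}

/// Rejects the one combination the documentation above forbids.
fn check_flag_combination(flags: Flags) -> Result<(), &'static str> {
    if flags.contains(Flags::DEVICE_LAUNCH) && flags.contains(Flags::AUTO_FREE_ON_LAUNCH) {
        return Err("DEVICE_LAUNCH cannot be combined with AUTO_FREE_ON_LAUNCH");
    }
    Ok(())
}

fn main() {
    let allowed = Flags::DEVICE_LAUNCH.union(Flags::USE_NODE_PRIORITY);
    assert!(check_flag_combination(allowed).is_ok());

    let forbidden = Flags::DEVICE_LAUNCH.union(Flags::AUTO_FREE_ON_LAUNCH);
    assert!(check_flag_combination(forbidden).is_err());
}
```

CUDA performs its own validation during instantiation; a local check like this only gives an earlier and more specific error message.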
pub fn try_clone(&self) -> Result<Self>
Creates a copy of this graph.
All parameters are copied into the cloned graph.
The original graph may be modified after this call without affecting the clone.
Child graph nodes in the original graph are recursively copied into the clone.
Cloning is not supported for graphs that contain memory allocation nodes, memory free nodes, or conditional nodes.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn add_dependency(&mut self, from: GraphNode, to: GraphNode) -> Result<()>
pub fn add_dependencies(
    &mut self,
    from: &[GraphNode],
    to: &[GraphNode],
) -> Result<()>
pub fn add_dependencies_with_data(
    &mut self,
    from: &[GraphNode],
    to: &[GraphNode],
    edge_data: &[GraphEdgeData],
) -> Result<()>
Elements in from and to at corresponding indices define each dependency to add.
Each node in from and to must belong to this graph.
If from and to are empty, the call returns without modifying the graph.
Specifying an existing dependency returns an error.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
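§Examples
The index-wise pairing of from and to can be modelled without CUDA. In this minimal sketch, NodeId and pair_dependencies are hypothetical stand-ins. Rejecting slices of different lengths, and treating a repeat within one call as an existing dependency, are assumptions of the model rather than documented behavior.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an opaque GraphNode handle.
type NodeId = u32;

/// Pairs `from[i]` with `to[i]` and rejects dependencies that already exist.
fn pair_dependencies(
    existing: &HashSet<(NodeId, NodeId)>,
    from: &[NodeId],
    to: &[NodeId],
) -> Result<Vec<(NodeId, NodeId)>, String> {
    // Assumption of this model: the two slices must have the same length.
    if from.len() != to.len() {
        return Err(format!("length mismatch: {} vs {}", from.len(), to.len()));
    }
    let mut seen = existing.clone();
    let mut edges = Vec::with_capacity(from.len());
    for (&f, &t) in from.iter().zip(to) {
        // `insert` returns false when the pair is already present.
        if !seen.insert((f, t)) {
            return Err(format!("dependency {} -> {} already exists", f, t));
        }
        edges.push((f, t));
    }
    Ok(edges)
}

fn main() {
    let existing: HashSet<(NodeId, NodeId)> = vec![(0, 1)].into_iter().collect();

    // Node 1 must finish before nodes 2 and 3.
    let added = pair_dependencies(&existing, &[1, 1], &[2, 3]);
    assert_eq!(added, Ok(vec![(1, 2), (1, 3)]));

    // Re-adding 0 -> 1 is an error, matching the rule above.
    assert!(pair_dependencies(&existing, &[0], &[1]).is_err());
}
```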
pub fn remove_dependency(
    &mut self,
    from: GraphNode,
    to: GraphNode,
) -> Result<()>
pub fn remove_dependencies(
    &mut self,
    from: &[GraphNode],
    to: &[GraphNode],
) -> Result<()>
pub fn remove_dependencies_with_data(
    &mut self,
    from: &[GraphNode],
    to: &[GraphNode],
    edge_data: &[GraphEdgeData],
) -> Result<()>
Elements in from and to at corresponding indices define each dependency to remove.
Each node in from and to must belong to this graph.
If from and to are empty, the call returns without modifying the graph.
An error is returned if the graph has no edge between the given nodes whose data matches edge_data.
Passing an empty edge_data slice is equivalent to passing default edge data for each edge.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
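§Examples
The interaction between edge_data and edge matching can be sketched with a small standalone model. EdgeData, Edge, normalize_edge_data, and remove_edge are all hypothetical. The real GraphEdgeData fields are not shown in this documentation, and rejecting a non-empty slice of the wrong length is an assumption of the model.

```rust
/// Hypothetical stand-in for GraphEdgeData; the field names are illustrative.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct EdgeData {
    from_port: u8,
    to_port: u8,
}

/// An edge in the model graph: (from node, to node, edge data).
type Edge = (u32, u32, EdgeData);

/// Expands an empty `edge_data` slice to one default entry per edge.
/// Returns `None` when a non-empty slice has the wrong length.
fn normalize_edge_data(edge_count: usize, edge_data: &[EdgeData]) -> Option<Vec<EdgeData>> {
    match edge_data.len() {
        0 => Some(vec![EdgeData::default(); edge_count]),
        n if n == edge_count => Some(edge_data.to_vec()),
        _ => None,
    }
}

/// Removes an edge only if both its endpoints and its data match.
fn remove_edge(edges: &mut Vec<Edge>, from: u32, to: u32, data: EdgeData) -> Result<(), String> {
    match edges.iter().position(|&e| e == (from, to, data)) {
        Some(index) => {
            edges.remove(index);
            Ok(())
        }
        None => Err(format!("no edge {} -> {} with matching data", from, to)),
    }
}

fn main() {
    let custom = EdgeData { from_port: 1, to_port: 0 };
    let mut edges: Vec<Edge> = vec![(0, 1, EdgeData::default()), (1, 2, custom)];

    // An empty slice stands for default data on every edge.
    let data = normalize_edge_data(1, &[]).expect("an empty slice is always accepted");
    assert!(remove_edge(&mut edges, 0, 1, data[0]).is_ok());

    // The 1 -> 2 edge exists, but not with default data, so removal fails.
    assert!(remove_edge(&mut edges, 1, 2, EdgeData::default()).is_err());
    assert!(remove_edge(&mut edges, 1, 2, custom).is_ok());
}
```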
pub fn add_edges(&mut self, edges: &[GraphEdge]) -> Result<()>
pub fn remove_edges(&mut self, edges: &[GraphEdge]) -> Result<()>
pub fn add_empty_node(
    &mut self,
    dependencies: &[GraphNode],
) -> Result<GraphNode>
Creates a node that performs no operation and adds it to the graph with the given dependencies. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
An empty node performs no operation during execution, but can be used for transitive ordering. For example, a phased execution graph with 2 groups of n nodes with a barrier between them can be represented using an empty node and 2*n dependency edges, rather than no empty node and n^2 dependency edges.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation or reports runtime initialization
diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.
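§Examples
The edge-count argument above can be worked through in plain Rust. This is a standalone sketch using integer node indices, not this crate's GraphNode type; direct_edge_count and barrier_edge_list are hypothetical helpers.

```rust
/// Edges needed to order every node of group A before every node of group B
/// without a barrier: one edge per (a, b) pair.
fn direct_edge_count(n: usize) -> usize {
    n * n
}

/// Edge list for the barrier layout. Nodes `0..n` form group A, node `n` is
/// the empty barrier node, and nodes `n + 1..=2 * n` form group B.
fn barrier_edge_list(n: usize) -> Vec<(usize, usize)> {
    let barrier = n;
    let mut edges = Vec::with_capacity(2 * n);
    for a in 0..n {
        edges.push((a, barrier)); // every A node precedes the barrier
    }
    for b in (n + 1)..=(2 * n) {
        edges.push((barrier, b)); // the barrier precedes every B node
    }
    edges
}

fn main() {
    let n = 8;
    assert_eq!(direct_edge_count(n), 64);
    assert_eq!(barrier_edge_list(n).len(), 16);
}
```

The barrier layout only saves edges once n exceeds 2; for n = 2 both layouts need four edges.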
pub fn add_event_record_node(
    &mut self,
    dependencies: &[GraphNode],
    event: &Event,
) -> Result<GraphNode>
Creates an event record node and adds it to the graph with the given dependencies and event. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
Each graph launch records event to capture execution of the node’s dependencies.
These nodes may not be used in loops or conditionals.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn add_event_wait_node(
    &mut self,
    dependencies: &[GraphNode],
    event: &Event,
) -> Result<GraphNode>
Creates an event wait node and adds it to the graph with the given dependencies and event. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
The graph node waits for all work captured in event.
See sys::cuEventRecord for details on what is captured by an event.
Synchronization is performed efficiently on the device when applicable.
event may come from a different context or device than the launch stream.
These nodes may not be used in loops or conditionals.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn add_host_node(
    &mut self,
    dependencies: &[GraphNode],
    params: &HostNodeParams,
) -> Result<GraphNode>
Creates a CPU execution node and adds it to the graph with the given dependencies and host-node parameters. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
When the graph is launched, the node invokes the specified CPU function. Host nodes are not supported under MPS with pre-Volta GPUs.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw callback function and user-data pointer in the graph
node for later replay. The caller must ensure params remains valid
according to HostNodeParams::new for every graph instantiation and
launch that can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn add_kernel_node<'a, P>(
    &mut self,
    dependencies: &[GraphNode],
    function: DeviceFunction,
    config: &LaunchConfig,
    params: P,
) -> Result<GraphNode>
where
    P: KernelLaunchArgs<'a>,
Creates a kernel execution node and adds it to the graph with the given dependencies, launch configuration, and kernel parameters. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
When the graph is launched, the node invokes the kernel on the grid and blocks specified by LaunchConfig.
LaunchConfig::shared_memory_bytes sets the amount of dynamic shared memory available to each thread block.
Kernel parameters are passed with KernelParameters or tuples of shared or mutable references.
Kernels launched using graphs must not use texture and surface references. Reading or writing through any texture or surface reference is undefined behavior. This restriction does not apply to texture and surface objects.
Runtime kernel handles queried via sys::cudaLibraryGetKernel or sys::cudaGetKernel may be used.
The symbol passed to sys::cudaGetKernel must be registered with the same CUDA Runtime instance.
Passing a symbol that belongs to a different runtime instance results in undefined behavior.
Graph objects are not threadsafe.
§Safety
CUDA copies the kernel argument values during this call and stores those copied values in the graph node for later replay. If an argument value is itself a pointer, only the pointer address is copied. The caller must ensure every copied pointer value remains valid for every graph instantiation, update, and launch that can execute this node. Mutable pointer arguments must also remain exclusive for the work ordered by those launches.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
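§Examples
The safety requirement above hinges on the difference between copying a value and copying a pointer. The following host-only sketch models that difference with a heap allocation standing in for device memory. RecordedArgs, replay, and record_and_replay are hypothetical and do not reflect how this crate stores kernel arguments.

```rust
/// Hypothetical model of arguments recorded in a kernel node: the scalar is
/// copied by value, the pointer is copied as an address only.
struct RecordedArgs {
    scale: u32,
    data: *const u32,
}

/// Models a replay that reads through the recorded pointer.
///
/// # Safety
/// `args.data` must still point to a live, initialized `u32`.
unsafe fn replay(args: &RecordedArgs) -> u32 {
    args.scale * unsafe { *args.data }
}

fn record_and_replay() -> u32 {
    let mut scale = 2u32;
    // Heap allocation standing in for device memory.
    let data: *mut u32 = Box::into_raw(Box::new(10u32));

    // "Recording": the scalar value and the pointer address are both copied.
    let args = RecordedArgs { scale, data: data as *const u32 };

    // A later change to the local scalar is not seen by the recorded node...
    scale += 98;
    assert_eq!(scale, 100);
    // ...but a change to the pointee is, because only the address was copied.
    unsafe { *data = 7 };

    let result = unsafe { replay(&args) };

    // The allocation may be released only after the last replay.
    unsafe { drop(Box::from_raw(data)) };
    result
}

fn main() {
    // The replay multiplies the recorded scalar (2) by the current pointee (7).
    assert_eq!(record_and_replay(), 14);
}
```

Freeing the allocation before the final replay would make the recorded address dangle, which is the situation the safety contract is meant to rule out.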
pub unsafe fn add_memory_copy_node_1d(
    &mut self,
    dependencies: &[GraphNode],
    params: &MemoryCopy1DNodeParams,
) -> Result<GraphNode>
Creates a new 1D memcpy node and adds it to the graph with the given dependencies. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
When the graph is launched, the node copies count bytes from src to dst.
The transfer direction is described by MemoryCopyKind.
MemoryCopyKind::Default is recommended when unified virtual addressing is available, in which case the transfer direction is inferred from the pointer values.
Launching a memcpy node with dst and src pointers that do not match the direction of the copy results in undefined behavior.
Memcpy nodes have additional restrictions for managed memory if any device in the system does not support concurrent managed access.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw source and destination addresses in the graph node
for later replay. The caller must ensure params remains valid
according to MemoryCopy1DNodeParams::new for every graph instantiation
and launch that can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn add_memory_copy_node_1d_device_to_device<D, S>(
    &mut self,
    dependencies: &[GraphNode],
    dst: &mut D,
    src: &S,
) -> Result<GraphNode>
Creates a device-to-device memcpy node from typed byte buffers.
The node copies src.byte_len() bytes. dst must have at least that
many bytes.
§Safety
CUDA stores the raw source and destination addresses in the graph node
for later replay. The caller must ensure dst and src remain valid
for every graph instantiation and launch that can execute this node.
dst must not be accessed through another mutable path while graph
launches using this node can write it.
§Errors
Returns an error if dst is smaller than src, if CUDA rejects the graph
operation, if a previous asynchronous launch reported an error, or if CUDA
reports runtime initialization diagnostics.
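§Examples
The size rule (copy exactly the source's byte length, and reject a destination that is too small) can be modelled on host slices. This is a minimal sketch with a hypothetical copy_prefix helper; it performs an ordinary host copy and involves no CUDA calls.

```rust
/// Copies all of `src` into the front of `dst` and returns the number of bytes
/// copied. Fails if `dst` is smaller than `src`, mirroring the rule above.
fn copy_prefix(dst: &mut [u8], src: &[u8]) -> Result<usize, String> {
    if dst.len() < src.len() {
        return Err(format!(
            "dst has {} bytes but src needs {}",
            dst.len(),
            src.len()
        ));
    }
    // Only the first src.len() bytes of dst are written; the tail is untouched.
    dst[..src.len()].copy_from_slice(src);
    Ok(src.len())
}

fn main() {
    let mut dst = [9u8; 4];
    assert_eq!(copy_prefix(&mut dst, &[1, 2]), Ok(2));
    assert_eq!(dst, [1, 2, 9, 9]);

    let mut too_small = [0u8; 1];
    assert!(copy_prefix(&mut too_small, &[1, 2]).is_err());
}
```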
pub fn add_buffer_memory_copy_node_1d_device_to_device<T>(
    &mut self,
    dependencies: &[GraphNode],
    dst: &mut GraphBuffer<T>,
    src: &GraphBuffer<T>,
) -> Result<GraphNode>
Creates a device-to-device memcpy node between graph-retained buffers.
The node copies src.byte_len() bytes. dst must have at least that
many bytes. The graph retains both allocations so the baked CUDA graph
pointers remain live for future instantiation and replay.
§Errors
Returns an error if dst is smaller than src, if CUDA rejects the graph
operation, if a previous asynchronous launch reported an error, or if CUDA
reports runtime initialization diagnostics.
pub unsafe fn add_memory_copy_node(
    &mut self,
    dependencies: &[GraphNode],
    params: &MemoryCopy3DNodeParams,
) -> Result<GraphNode>
Creates a memcpy node and adds it to the graph with the given dependencies. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
When the graph is launched, the node performs the memcpy described by params.
See sys::cudaMemcpy3D for a description of the structure and its restrictions.
Memcpy nodes have additional restrictions for managed memory if any device in the system does not support concurrent managed access.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw source and destination addresses in the graph node
for later replay. The caller must ensure params remains valid
according to MemoryCopy3DNodeParams for every graph instantiation and
launch that can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn add_memory_copy_node_to_symbol(
    &mut self,
    dependencies: &[GraphNode],
    params: &MemoryCopyToSymbolNodeParams,
) -> Result<GraphNode>
§Safety
CUDA stores the raw symbol and source pointer in the graph node for
later replay. The caller must ensure params remains valid according to
MemoryCopyToSymbolNodeParams::new for every graph instantiation and
launch that can execute this node.
pub unsafe fn add_memory_copy_node_from_symbol(
    &mut self,
    dependencies: &[GraphNode],
    params: &MemoryCopyFromSymbolNodeParams,
) -> Result<GraphNode>
§Safety
CUDA stores the raw destination and symbol pointer in the graph node for
later replay. The caller must ensure params remains valid according to
MemoryCopyFromSymbolNodeParams::new for every graph instantiation and
launch that can execute this node.
pub unsafe fn add_memory_set_node(
    &mut self,
    dependencies: &[GraphNode],
    params: &MemorySetNodeParams,
) -> Result<GraphNode>
Creates a new memset node and adds it to the graph with the given dependencies. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
The element size must be 1, 2, or 4 bytes.
When the graph is launched, the node performs the memset described by params.
Graph objects are not threadsafe.
§Safety
CUDA stores the destination address in the graph node for later replay.
The caller must ensure params remains valid according to
MemorySetNodeParams::new for every graph instantiation and launch that
can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
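§Examples
The element-size rule can be illustrated with a host-side model of a memset. memset_model is a hypothetical helper: it fills a byte slice rather than device memory, assumes little-endian element layout, and additionally requires the destination length to be a multiple of the element size, which is an assumption of the model.

```rust
/// Fills `dst` with the low `element_size` bytes of `value`, repeated.
/// Only element sizes of 1, 2, or 4 bytes are accepted, as stated above.
fn memset_model(dst: &mut [u8], value: u32, element_size: usize) -> Result<(), String> {
    if !matches!(element_size, 1 | 2 | 4) {
        return Err(format!("unsupported element size: {}", element_size));
    }
    // Assumption of this model: the destination holds whole elements only.
    if dst.len() % element_size != 0 {
        return Err(format!(
            "destination length {} is not a multiple of {}",
            dst.len(),
            element_size
        ));
    }
    // Little-endian layout: the least significant byte comes first.
    let bytes = value.to_le_bytes();
    for chunk in dst.chunks_exact_mut(element_size) {
        chunk.copy_from_slice(&bytes[..element_size]);
    }
    Ok(())
}

fn main() {
    let mut buffer = [0u8; 4];
    memset_model(&mut buffer, 0xABCD, 2).expect("2-byte elements are supported");
    assert_eq!(buffer, [0xCD, 0xAB, 0xCD, 0xAB]);

    // A 3-byte element size is rejected.
    assert!(memset_model(&mut buffer, 0, 3).is_err());
}
```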
pub fn add_child_graph_node(
    &mut self,
    dependencies: &[GraphNode],
    child_graph: &Self,
) -> Result<GraphNode>
Creates a new node which executes an embedded graph, and adds it to the graph with the given dependencies. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
If child_graph contains allocation nodes, free nodes, or conditional nodes, this call returns an error.
The node executes an embedded child graph. The child graph is cloned in this call.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn add_memory_free_node(
    &mut self,
    dependencies: &[GraphNode],
    allocation: &MemoryAllocationNodeInfo,
) -> Result<GraphNode>
Creates a new memory free node for a graph allocation and adds it to the graph. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
Graph::add_memory_free_node returns crate::error::Status::InvalidValue if the caller attempts to free:
- an allocation twice in the same graph.
- an address that was not returned by an allocation node.
- an invalid address.
The following restrictions apply to graphs which contain allocation and/or memory free nodes:
- Nodes and edges of the graph cannot be deleted.
- The graph can only be used in a child node if the ownership is moved to the parent.
- Only one instantiation of the graph may exist at any point in time.
- The graph cannot be cloned.
Graph objects are not threadsafe.
§Errors
Returns Error::GraphNodeMismatch if allocation did not come from this
graph. Returns an error if CUDA rejects the graph operation or if a
previous asynchronous launch reported an error.
pub unsafe fn add_memory_free_node_raw(
    &mut self,
    dependencies: &[GraphNode],
    ptr: DevicePtr,
) -> Result<GraphNode>
Creates a new memory free node from a raw device address.
§Safety
CUDA stores the raw address in the graph. The caller must ensure ptr
is a graph allocation that may be freed by this graph, is ordered after
the allocation node, and is not freed more than once or by another graph
in a way that violates CUDA graph allocation ownership rules.
pub fn add_memory_allocation_node(
    &mut self,
    dependencies: &[GraphNode],
    params: &MemoryAllocationNodeParams<'_>,
) -> Result<(GraphNode, MemoryAllocationNodeInfo)>
Creates a new allocation node and adds it to the graph with the given dependencies and allocation parameters. The dependency list may be empty, in which case the node is placed at the graph root. It may not contain duplicate entries.
When Graph::add_memory_allocation_node creates an allocation node, it returns the allocation metadata in MemoryAllocationNodeInfo.
The allocation’s address remains fixed across instantiations and launches.
If the allocation is freed in the same graph by creating a free node with Graph::add_memory_free_node, the allocation can be accessed by nodes ordered after the allocation node but before the free node.
These allocations cannot be freed outside the owning graph, and they can only be freed once in the owning graph.
If the allocation is not freed in the same graph, then it can be accessed not only by nodes in the graph which are ordered after the allocation node, but also by stream operations ordered after the graph’s execution but before the allocation is freed.
Allocations which are not freed in the same graph can be freed by:
- passing the allocation to DeviceMemory::free_async or DeviceMemory::free;
- launching a graph with a free node for that allocation; or
- specifying GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH during instantiation, which makes each launch behave as though it called DeviceMemory::free_async for every unfreed allocation.
It is not possible to free an allocation in both the owning graph and another graph. If the allocation is freed in the same graph, a free node cannot be added to another graph. If the allocation is freed in another graph, a free node can no longer be added to the owning graph.
The following restrictions apply to graphs which contain allocation and/or memory free nodes:
- Nodes and edges of the graph cannot be deleted.
- The graph can only be used in a child node if the ownership is moved to the parent.
- Only one instantiation of the graph may exist at any point in time.
- The graph cannot be cloned.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation or if a previous asynchronous launch reported an error.
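§Examples
The ownership rules for freeing a graph allocation amount to a small state machine: an allocation is freed at most once, either inside the owning graph or outside it. The sketch below is a simplified standalone model with hypothetical names (FreedBy, GraphAllocation). It ignores relaunches and is not how this crate or CUDA tracks allocations.

```rust
/// Where the modelled allocation has been freed, if anywhere.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FreedBy {
    Nobody,
    OwningGraph,
    Outside,
}

/// Hypothetical model of one graph allocation.
#[derive(Debug)]
struct GraphAllocation {
    freed_by: FreedBy,
}

impl GraphAllocation {
    fn new() -> Self {
        GraphAllocation { freed_by: FreedBy::Nobody }
    }

    /// Models adding a free node to the owning graph.
    fn add_free_node_in_owning_graph(&mut self) -> Result<(), &'static str> {
        match self.freed_by {
            FreedBy::Nobody => {
                self.freed_by = FreedBy::OwningGraph;
                Ok(())
            }
            FreedBy::OwningGraph => Err("already freed in the owning graph"),
            FreedBy::Outside => Err("already freed outside the owning graph"),
        }
    }

    /// Models freeing from another graph or from a stream operation.
    fn free_outside(&mut self) -> Result<(), &'static str> {
        match self.freed_by {
            FreedBy::Nobody => {
                self.freed_by = FreedBy::Outside;
                Ok(())
            }
            FreedBy::OwningGraph => Err("already freed in the owning graph"),
            FreedBy::Outside => Err("already freed outside the owning graph"),
        }
    }
}

fn main() {
    let mut in_graph = GraphAllocation::new();
    assert!(in_graph.add_free_node_in_owning_graph().is_ok());
    // A second free in the same graph, or a free elsewhere, is rejected.
    assert!(in_graph.add_free_node_in_owning_graph().is_err());
    assert!(in_graph.free_outside().is_err());

    let mut elsewhere = GraphAllocation::new();
    assert!(elsewhere.free_outside().is_ok());
    assert!(elsewhere.add_free_node_in_owning_graph().is_err());
}
```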
pub fn nodes(&self) -> Result<Vec<GraphNode>>
Returns this graph’s nodes.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn root_nodes(&self) -> Result<Vec<GraphNode>>
Returns this graph’s root nodes.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn edges(&self) -> Result<Vec<GraphEdge>>
Returns this graph’s dependency edges.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn topology_summary(&self) -> Result<GraphTopologySummary>
Returns a compact summary of this graph’s native CUDA topology.
The summary is computed from CUDA graph introspection APIs and counts
node kinds, root nodes, and dependency edges in this graph. Child graph
nodes are counted as child nodes here; callers that need recursive
details can query the child graph returned by GraphNode::child_graph.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects a topology query, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics.
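§Examples
Counting root nodes and edges from an edge list can be sketched in plain Rust. Summary and summarize are hypothetical: the fields of GraphTopologySummary are not shown in this documentation, and the real implementation queries CUDA rather than a slice of index pairs.

```rust
use std::collections::HashSet;

/// Hypothetical summary type; the real GraphTopologySummary may differ.
#[derive(Debug, PartialEq, Eq)]
struct Summary {
    nodes: usize,
    roots: usize,
    edges: usize,
}

/// Summarizes a graph whose nodes are numbered `0..node_count` and whose
/// dependencies are given as `(from, to)` pairs.
fn summarize(node_count: usize, edges: &[(usize, usize)]) -> Summary {
    // A root node is a node that no edge points to.
    let has_incoming: HashSet<usize> = edges.iter().map(|&(_, to)| to).collect();
    let roots = (0..node_count)
        .filter(|node| !has_incoming.contains(node))
        .count();
    Summary {
        nodes: node_count,
        roots,
        edges: edges.len(),
    }
}

fn main() {
    // Nodes 0 and 1 feed node 2, which feeds node 3; node 4 is isolated.
    let edges = [(0, 2), (1, 2), (2, 3)];
    let summary = summarize(5, &edges);
    assert_eq!(summary, Summary { nodes: 5, roots: 3, edges: 3 });
}
```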
pub fn write_dot(&self, path: &str, flags: GraphDebugDotFlags) -> Result<()>
Writes a DOT-formatted description of the graph to path.
By default this includes the graph topology, node types, node IDs, kernel names, and memcpy directions.
flags can request more detailed information about each node type, such as parameter values, kernel attributes, node handles, and function handles.
§Errors
Returns an error if path contains an interior NUL byte or if CUDA
Runtime cannot write the DOT file.
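§Examples
The interior NUL restriction comes from converting the path to a C string. A minimal sketch of that conversion using the standard library follows. path_to_c_string is a hypothetical helper, and the error this crate actually returns may be shaped differently.

```rust
use std::ffi::CString;

/// Converts a path to a C string, reporting where an interior NUL byte occurs.
fn path_to_c_string(path: &str) -> Result<CString, String> {
    CString::new(path).map_err(|e| format!("interior NUL at byte {}", e.nul_position()))
}

fn main() {
    assert!(path_to_c_string("graph.dot").is_ok());

    // The NUL byte sits at index 3, after "bad".
    let err = path_to_c_string("bad\0name.dot").unwrap_err();
    assert_eq!(err, "interior NUL at byte 3");
}
```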
pub fn as_raw(&self) -> cudaGraph_t
pub fn context(&self) -> Option<&Context>
pub fn into_raw(self) -> cudaGraph_t
Consumes the graph and returns the raw CUDA graph handle without destroying it.
The caller becomes responsible for eventually destroying the returned handle with CUDA.
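§Examples
The ownership transfer described above follows a common Rust pattern: wrap self in ManuallyDrop so the destructor does not run, then hand back the raw value. The sketch below models that pattern with a hypothetical OwnedHandle whose Drop only increments a counter. It is not this crate's implementation and destroys no CUDA resources.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Hypothetical owning wrapper; `destroy_calls` counts destructor runs.
struct OwnedHandle<'a> {
    raw: u64,
    destroy_calls: &'a Cell<u32>,
}

impl<'a> Drop for OwnedHandle<'a> {
    fn drop(&mut self) {
        // Stands in for destroying the underlying native handle.
        self.destroy_calls.set(self.destroy_calls.get() + 1);
    }
}

impl<'a> OwnedHandle<'a> {
    /// Returns the raw value without running the destructor.
    fn into_raw(self) -> u64 {
        let this = ManuallyDrop::new(self);
        this.raw
    }
}

/// Returns how many times the destructor ran for one handle.
fn destroy_calls(use_into_raw: bool) -> u32 {
    let calls = Cell::new(0);
    let handle = OwnedHandle { raw: 42, destroy_calls: &calls };
    if use_into_raw {
        // Ownership of the raw value moves to the caller; nothing is destroyed.
        let _raw = handle.into_raw();
    } else {
        drop(handle);
    }
    calls.get()
}

fn main() {
    assert_eq!(destroy_calls(false), 1);
    assert_eq!(destroy_calls(true), 0);
}
```

After into_raw, nothing in Rust releases the handle, so the caller has to pass it to the appropriate CUDA destroy function eventually.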