Struct BorrowedGraph

pub struct BorrowedGraph<'node> { /* private fields */ }

Implementations

impl<'graph> BorrowedGraph<'graph>

pub unsafe fn from_raw(handle: cudaGraph_t) -> Result<Self>

Wraps an existing CUDA graph handle without taking ownership.

§Safety

handle must remain a valid CUDA graph handle for the entire returned lifetime. Dropping the returned graph view does not destroy handle.
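
The non-owning behaviour described above can be illustrated with a small, self-contained sketch. It does not use this crate or CUDA: RawHandle, OwnedGraph, BorrowedView, and the destroy counter are hypothetical stand-ins, and wrapping the owning type in ManuallyDrop is only one plausible way to build such a view, not necessarily how BorrowedGraph is implemented.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ops::Deref;
use std::rc::Rc;

/// Hypothetical raw handle type standing in for `cudaGraph_t`.
type RawHandle = usize;

/// Hypothetical owning wrapper: "destroys" its handle on drop by bumping a
/// shared counter instead of calling into CUDA.
struct OwnedGraph {
    handle: RawHandle,
    destroyed: Rc<Cell<u32>>,
}

impl Drop for OwnedGraph {
    fn drop(&mut self) {
        self.destroyed.set(self.destroyed.get() + 1);
    }
}

/// Non-owning view: `ManuallyDrop` suppresses the inner destructor, and the
/// lifetime parameter ties the view to whoever really owns the handle.
struct BorrowedView<'a> {
    inner: ManuallyDrop<OwnedGraph>,
    _owner: PhantomData<&'a ()>,
}

impl<'a> BorrowedView<'a> {
    /// Safety: `handle` must stay valid for `'a`.
    unsafe fn from_raw(handle: RawHandle, destroyed: Rc<Cell<u32>>) -> Self {
        BorrowedView {
            inner: ManuallyDrop::new(OwnedGraph { handle, destroyed }),
            _owner: PhantomData,
        }
    }
}

impl<'a> Deref for BorrowedView<'a> {
    type Target = OwnedGraph;

    fn deref(&self) -> &OwnedGraph {
        &self.inner
    }
}

/// Returns the destroy count after dropping the view, then after dropping
/// the real owner.
fn demo_drop_counts() -> (u32, u32) {
    let destroyed = Rc::new(Cell::new(0));
    let owner = OwnedGraph { handle: 7, destroyed: Rc::clone(&destroyed) };
    let view = unsafe { BorrowedView::from_raw(owner.handle, Rc::clone(&destroyed)) };
    assert_eq!(view.handle, 7); // fields of the target are reachable through Deref
    drop(view); // does not run OwnedGraph's destructor
    let after_view = destroyed.get();
    drop(owner); // the real owner destroys the handle exactly once
    (after_view, destroyed.get())
}
```

In this sketch, dropping the view leaves the counter untouched and only the real owner's drop increments it, which mirrors the guarantee stated in the Safety section.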

pub unsafe fn from_raw_in_context( handle: cudaGraph_t, ctx: Option<Arc<Context>>, ) -> Result<Self>

Wraps an existing CUDA graph handle without taking ownership, and records the associated context so that safe graph operations can be performed through the borrowed view.

§Safety

handle must remain a valid CUDA graph handle for the entire returned lifetime, and it must be associated with ctx when ctx is present. Dropping the returned graph view does not destroy handle.

pub const fn as_graph(&self) -> &Graph

pub fn as_raw(&self) -> cudaGraph_t

Methods from Deref<Target = Graph>

pub fn instantiate(&self) -> Result<ExecutableGraph>

pub fn instantiate_with_flags( &self, flags: GraphInstantiateFlags, ) -> Result<ExecutableGraph>

Instantiates this graph as an executable graph. Any structural or intra-node constraints that were not previously validated are checked during instantiation. On success, returns the instantiated executable graph.

flags controls the behavior of instantiation and subsequent graph launches. Valid flags are:

- GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH, which configures a graph containing memory allocation nodes to automatically free any unfreed memory allocations before the graph is relaunched.
- GraphInstantiateFlags::DEVICE_LAUNCH, which configures the graph for launch from the device. If this flag is passed, the executable graph handle returned can be used to launch the graph from both the host and device. This flag can only be used on platforms which support unified addressing. This flag cannot be used in conjunction with GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH.
- GraphInstantiateFlags::USE_NODE_PRIORITY, which causes the graph to use the priorities from the per-node attributes rather than the priority of the launch stream during execution. Priorities are only available on kernel nodes and are copied from stream priority during stream capture.
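
The flag rules above can be sketched as a standalone pre-check. Flags here is a hypothetical stand-in for GraphInstantiateFlags, its bit values are placeholders, and check_flags and the unified_addressing parameter are invented for illustration. The real validation is performed by CUDA during instantiation.

```rust
use std::ops::BitOr;

/// Hypothetical stand-in for `GraphInstantiateFlags`; bit values are placeholders.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Flags(u32);

impl Flags {
    const AUTO_FREE_ON_LAUNCH: Flags = Flags(1 << 0);
    const DEVICE_LAUNCH: Flags = Flags(1 << 1);
    const USE_NODE_PRIORITY: Flags = Flags(1 << 2);

    /// True if every bit set in `other` is also set in `self`.
    fn contains(self, other: Flags) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}

/// Mirrors the two documented restrictions on DEVICE_LAUNCH.
fn check_flags(flags: Flags, unified_addressing: bool) -> Result<(), &'static str> {
    if flags.contains(Flags::DEVICE_LAUNCH) {
        if flags.contains(Flags::AUTO_FREE_ON_LAUNCH) {
            return Err("DEVICE_LAUNCH cannot be combined with AUTO_FREE_ON_LAUNCH");
        }
        if !unified_addressing {
            return Err("DEVICE_LAUNCH requires a platform with unified addressing");
        }
    }
    Ok(())
}
```

A check like this can give an earlier and more specific message than the error returned by instantiation, but it does not replace it.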

If the graph contains any allocation or free nodes, there can be at most one executable graph in existence for that graph at a time. An attempt to instantiate a second executable graph before dropping the first results in an error. The same also applies if the graph contains any device-updatable kernel nodes.

An error is returned if the graph contains kernels that call device-side ExecutableGraph::launch from multiple devices.

Graphs instantiated for launch on the device have additional restrictions which do not apply to host graphs:

- The graph’s nodes must reside on a single device.
- The graph can only contain kernel nodes, memcpy nodes, memset nodes, and child graph nodes.
- The graph cannot be empty and must contain at least one kernel, memcpy, or memset node.

Operation-specific restrictions are outlined below.

Kernel nodes:
- Use of CUDA Dynamic Parallelism is not permitted.
- Cooperative launches are permitted as long as MPS is not in use.

Memcpy nodes:
- Only copies involving device memory and/or pinned device-mapped host memory are permitted.
- Copies involving CUDA arrays are not permitted.
- Both operands must be accessible from the current device, and the current device must match the device of other nodes in the graph.
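
The graph-level restrictions listed above can be expressed as a small checker. This is a sketch over hypothetical types (Kind, Node, and check_device_launchable are not part of this crate), and it covers only the three graph-level rules, not the per-operation kernel and memcpy rules.

```rust
/// Hypothetical node kinds; `Host` and `MemAlloc` stand for kinds that are
/// not allowed in a device-launchable graph.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Kernel,
    Memcpy,
    Memset,
    ChildGraph,
    Host,
    MemAlloc,
}

/// Hypothetical node description: its kind and the device it resides on.
#[derive(Clone, Copy, Debug)]
struct Node {
    kind: Kind,
    device: u32,
}

fn check_device_launchable(nodes: &[Node]) -> Result<(), String> {
    // Only kernel, memcpy, memset, and child graph nodes may appear.
    if let Some(bad) = nodes.iter().find(|n| {
        !matches!(n.kind, Kind::Kernel | Kind::Memcpy | Kind::Memset | Kind::ChildGraph)
    }) {
        return Err(format!("node kind {:?} is not allowed", bad.kind));
    }
    // The graph must contain at least one kernel, memcpy, or memset node.
    // This also rejects the empty graph.
    if !nodes
        .iter()
        .any(|n| matches!(n.kind, Kind::Kernel | Kind::Memcpy | Kind::Memset))
    {
        return Err("graph needs at least one kernel, memcpy, or memset node".to_string());
    }
    // All nodes must reside on a single device. Indexing is safe because the
    // previous check guarantees at least one node.
    let first = nodes[0].device;
    if nodes.iter().any(|n| n.device != first) {
        return Err("nodes reside on more than one device".to_string());
    }
    Ok(())
}
```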

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.

pub fn try_clone(&self) -> Result<Self>

Creates a copy of this graph. All parameters are copied into the cloned graph. The original graph may be modified after this call without affecting the clone.

Child graph nodes in the original graph are recursively copied into the clone.

Cloning is not supported for graphs that contain memory allocation nodes, memory free nodes, or conditional nodes.
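
The cloning semantics above (deep copy, recursive copy of child graphs, and rejection of unsupported node kinds) can be modelled with plain Rust data. The Node and Graph types below are hypothetical and unrelated to this crate's types. Applying the rejection inside child graphs as well is an assumption of this sketch.

```rust
/// Hypothetical node model; `Child` embeds a nested graph.
#[derive(Clone, Debug, PartialEq)]
enum Node {
    Kernel { grid: u32 },
    MemAlloc,
    MemFree,
    Conditional,
    Child(Graph),
}

/// Hypothetical graph model: just a list of nodes.
#[derive(Clone, Debug, PartialEq)]
struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    /// Deep-copies the graph, refusing node kinds that cannot be cloned.
    fn try_clone(&self) -> Result<Graph, &'static str> {
        let mut nodes = Vec::with_capacity(self.nodes.len());
        for node in &self.nodes {
            nodes.push(match node {
                Node::MemAlloc | Node::MemFree | Node::Conditional => {
                    return Err("graph contains a node kind that cannot be cloned");
                }
                // Child graphs are copied recursively (assumed to apply the same checks).
                Node::Child(child) => Node::Child(child.try_clone()?),
                other => other.clone(),
            });
        }
        Ok(Graph { nodes })
    }
}

/// Modifies the original after cloning and reports whether the copy kept
/// its old parameters.
fn clone_is_independent() -> bool {
    let mut original = Graph {
        nodes: vec![
            Node::Kernel { grid: 1 },
            Node::Child(Graph { nodes: vec![Node::Kernel { grid: 2 }] }),
        ],
    };
    let copy = original.try_clone().unwrap();
    original.nodes[0] = Node::Kernel { grid: 99 };
    copy.nodes[0] == Node::Kernel { grid: 1 }
}
```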

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.

pub fn nodes(&self) -> Result<Vec<GraphNode>>

Returns this graph’s nodes.

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.

pub fn root_nodes(&self) -> Result<Vec<GraphNode>>

Returns this graph’s root nodes.

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.

pub fn edges(&self) -> Result<Vec<GraphEdge>>

Returns this graph’s dependency edges.

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.

pub fn topology_summary(&self) -> Result<GraphTopologySummary>

Returns a compact summary of this graph’s native CUDA topology.

The summary is computed from CUDA graph introspection APIs and counts node kinds, root nodes, and dependency edges in this graph. Child graph nodes are counted as single nodes here, without descending into them; callers that need recursive details can query the child graph returned by GraphNode::child_graph.
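
As a rough illustration of what such a summary contains, the sketch below counts node kinds, root nodes, and edges over plain index-based data. Kind, Summary, and summarize are hypothetical and do not reflect the fields of GraphTopologySummary. A root is taken here to be a node with no incoming dependency edge.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical node kinds used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Kind {
    Kernel,
    Memcpy,
    ChildGraph,
}

/// Hypothetical summary shape: per-kind counts plus root and edge totals.
#[derive(Debug, PartialEq)]
struct Summary {
    per_kind: BTreeMap<Kind, usize>,
    roots: usize,
    edges: usize,
}

/// `nodes[i]` is the kind of node `i`; `edges` holds `(from, to)` index pairs.
fn summarize(nodes: &[Kind], edges: &[(usize, usize)]) -> Summary {
    // Count nodes per kind; a child graph node counts once, non-recursively.
    let mut per_kind = BTreeMap::new();
    for kind in nodes {
        *per_kind.entry(*kind).or_insert(0) += 1;
    }
    // A root is a node that never appears as the target of an edge.
    let has_incoming: HashSet<usize> = edges.iter().map(|&(_, to)| to).collect();
    let roots = (0..nodes.len())
        .filter(|i| !has_incoming.contains(i))
        .count();
    Summary { per_kind, roots, edges: edges.len() }
}
```

For two independent nodes that both feed a kernel, which in turn feeds a child graph node, this reports two roots and three edges.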

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects a topology query, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics.

pub fn write_dot(&self, path: &str, flags: GraphDebugDotFlags) -> Result<()>

Writes a DOT-formatted description of the graph to path. By default this includes the graph topology, node types, node IDs, kernel names, and memcpy directions. flags can request more detailed information about each node type, such as parameter values, kernel attributes, node handles, and function handles.
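
For readers unfamiliar with DOT, the sketch below shows the general shape of such a file: a digraph with one labelled statement per node and one statement per dependency edge. The to_dot helper and the node labels are hypothetical, and the layout CUDA actually emits is richer and differs from this. Labels are written verbatim, so the sketch assumes they contain no double quotes.

```rust
use std::fmt::Write;

/// Renders a tiny DOT digraph: `labels[i]` names node `i`, and `edges`
/// holds `(from, to)` index pairs.
fn to_dot(labels: &[&str], edges: &[(usize, usize)]) -> String {
    let mut out = String::from("digraph G {\n");
    for (i, label) in labels.iter().enumerate() {
        // Writing into a String cannot fail, so unwrap is safe here.
        writeln!(out, "    n{} [label=\"{}\"];", i, label).unwrap();
    }
    for (from, to) in edges {
        writeln!(out, "    n{} -> n{};", from, to).unwrap();
    }
    out.push_str("}\n");
    out
}
```

A file in this format can be rendered with Graphviz tools such as dot.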