pub struct GraphExec { /* private fields */ }
An instantiated, executable graph.
Created by Graph::instantiate, a GraphExec holds a snapshot of the
graph and a pre-computed execution order.
§Driver backing
When a CUDA driver is available, instantiate builds a genuine
CUgraph (cuGraphCreate + one cuGraphAdd*Node per in-memory node,
with the dependency DAG wired through real CUgraphNode edges) and
finalises it into a CUgraphExec via cuGraphInstantiate. In that
case launch issues a real cuGraphLaunch.
The in-memory GraphNode representation stores only an operation
specification (kernel name, copy direction/size, memset size/value);
it carries no resolved CUfunction or device pointers. Every node is
therefore currently translated to a cuGraphAddEmptyNode: the resulting
driver graph reproduces the node count and dependency topology exactly,
but it executes on the GPU as a DAG of synchronisation barriers rather
than performing the specified kernel, copy, or memset work. The
per-node dispatch in Graph::build_driver_graph is structured so that
kernel / memcpy / memset nodes that gain concrete device operands can be
promoted to cuGraphAddKernelNode / cuGraphAddMemcpyNode /
cuGraphAddMemsetNode without further restructuring.
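The promotion path described above can be pictured with a small sketch. Everything here is hypothetical: `NodeSpec`, `DriverNodeKind`, `driver_node_kind`, and the `resolved` flag are illustrative stand-ins, not the crate's real types or the actual body of Graph::build_driver_graph. The sketch only shows how a single match can fall back to an empty node today and select a concrete node kind once device operands exist.

```rust
/// Hypothetical stand-in for an in-memory node specification.
/// It holds only a description of the operation, no device handles.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum NodeSpec {
    Kernel { name: String },
    Memcpy { bytes: usize },
    Memset { bytes: usize, value: u8 },
}

/// Hypothetical label for the driver call a node would be translated to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DriverNodeKind {
    Empty,  // stands for cuGraphAddEmptyNode
    Kernel, // stands for cuGraphAddKernelNode
    Memcpy, // stands for cuGraphAddMemcpyNode
    Memset, // stands for cuGraphAddMemsetNode
}

/// Chooses the driver node kind for one node.
/// `resolved` is an assumption of this sketch: it models whether the node
/// has concrete device operands (a function handle, device pointers).
fn driver_node_kind(spec: &NodeSpec, resolved: bool) -> DriverNodeKind {
    if !resolved {
        // Without operands, only the topology can be reproduced.
        return DriverNodeKind::Empty;
    }
    match spec {
        NodeSpec::Kernel { .. } => DriverNodeKind::Kernel,
        NodeSpec::Memcpy { .. } => DriverNodeKind::Memcpy,
        NodeSpec::Memset { .. } => DriverNodeKind::Memset,
    }
}
```

Because the fallback and the promoted cases live in one function, supporting a new operand-carrying node kind in a design like this would touch only the match arms, not the graph traversal.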
On macOS (or any host without a CUDA driver), no driver handles are
created; the graph is still validated (topological sort) and
launch returns CudaError::NotInitialized.
Implementations§
impl GraphExec
pub fn launch(&self, stream: &Stream) -> CudaResult<()>
Launches the executable graph on the given stream.
When this GraphExec is backed by a real CUgraphExec, this issues
cuGraphLaunch(hGraphExec, hStream), submitting the entire graph to
the stream with minimal CPU overhead. Otherwise it surfaces the
driver-load error.
§Errors
- CudaError::NotInitialized if the CUDA driver is not available (e.g. on macOS, or a host without an NVIDIA GPU).
- Any CudaError mapped from cuGraphLaunch.
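A caller will usually want to treat the missing-driver case differently from other launch failures. The sketch below illustrates that pattern with mock types only: `MockCudaError`, `MockGraphExec`, and `launch_or_describe` are hypothetical names invented for this example, and the mock `launch` takes no stream argument. It is not the crate's implementation, just a model of the documented behaviour (success when driver-backed, NotInitialized otherwise).

```rust
/// Mock error type; the real CudaError has more variants.
#[derive(Debug, PartialEq, Eq)]
enum MockCudaError {
    NotInitialized,
}

/// Mock executable graph: a driver-backed flag plus an execution order.
struct MockGraphExec {
    driver_backed: bool,
    order: Vec<usize>,
}

impl MockGraphExec {
    fn node_count(&self) -> usize {
        self.order.len()
    }

    /// Models the documented contract: launch succeeds only when a
    /// driver handle exists, otherwise it reports NotInitialized.
    fn launch(&self) -> Result<(), MockCudaError> {
        if self.driver_backed {
            Ok(())
        } else {
            Err(MockCudaError::NotInitialized)
        }
    }
}

/// Launches the graph and turns the outcome into a human-readable message.
fn launch_or_describe(exec: &MockGraphExec) -> String {
    match exec.launch() {
        Ok(()) => format!("launched {} nodes", exec.node_count()),
        Err(MockCudaError::NotInitialized) => {
            format!("no CUDA driver; validated {} nodes only", exec.node_count())
        }
    }
}
```

With the real type, the same match would be written against CudaResult<()> and would need an additional arm for errors mapped from cuGraphLaunch.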
pub fn execution_order(&self) -> &[usize]
Returns the pre-computed execution order (topological sort).
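For readers unfamiliar with topological sorting, here is a minimal sketch of how such an order can be computed with Kahn's algorithm. The source does not say which algorithm the crate uses, so treat this as one possible approach; `topological_order` and its `(from, to)` edge representation are assumptions made for the example.

```rust
use std::collections::VecDeque;

/// Returns a topological order of nodes `0..node_count`, or `None` if the
/// edges contain a cycle or reference a node index out of range.
/// Each `(from, to)` pair means `from` must execute before `to`.
fn topological_order(node_count: usize, edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); node_count];
    let mut in_degree = vec![0usize; node_count];

    for &(from, to) in edges {
        if from >= node_count || to >= node_count {
            return None;
        }
        successors[from].push(to);
        in_degree[to] += 1;
    }

    // Start with every node that has no unmet dependencies.
    let mut ready: VecDeque<usize> =
        (0..node_count).filter(|&n| in_degree[n] == 0).collect();
    let mut order = Vec::with_capacity(node_count);

    while let Some(node) = ready.pop_front() {
        order.push(node);
        // Emitting `node` satisfies one dependency of each successor.
        for &next in &successors[node] {
            in_degree[next] -= 1;
            if in_degree[next] == 0 {
                ready.push_back(next);
            }
        }
    }

    // If some nodes were never emitted, they sit on a cycle.
    if order.len() == node_count {
        Some(order)
    } else {
        None
    }
}
```

For a diamond-shaped graph with edges 0→1, 0→2, 1→3, 2→3, this function yields `[0, 1, 2, 3]`; a graph with edges 0→1 and 1→0 yields `None`, which corresponds to a failed validation.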
pub fn node_count(&self) -> usize
Returns the total number of nodes that would be executed.
pub fn is_driver_backed(&self) -> bool
Returns true if this GraphExec is backed by a real, live
CUgraphExec driver handle (as opposed to a CPU-side-only graph).