pub struct ExecutableGraph { /* private fields */ }
§Implementations
impl ExecutableGraph
pub fn launch_async(&self, stream: &Stream) -> Result<CheckedStreamFuture>
pub async fn launch_and_wait(&self, stream: &Stream) -> Result<()>
pub fn upload_async(&self, stream: &Stream) -> Result<CheckedStreamFuture>
pub async fn upload_and_wait(&self, stream: &Stream) -> Result<()>
impl ExecutableGraph
pub fn flags(&self) -> Result<GraphInstantiateFlags>
Returns the flags that were passed to instantiation for the given executable graph.
GraphInstantiateFlags::UPLOAD is not returned because it does not affect the resulting executable graph.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn launch(&self, stream: &Stream) -> Result<()>
Executes this executable graph in stream.
Only one instance of this executable graph may be executing at a time.
Each launch is ordered behind both any previous work in stream and any previous launches of this executable graph.
To execute a graph concurrently, it must be instantiated multiple times into multiple executable graphs.
If any allocations created by this executable graph remain unfreed from a previous launch and the graph was not instantiated with GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH, the launch fails with crate::error::Status::InvalidValue.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub const fn launch_operation(&self) -> ExecutableGraphLaunchOperation<'_>
Returns a reusable operation object that launches this executable graph.
pub fn upload(&self, stream: &Stream) -> Result<()>
Uploads this executable graph to the device in stream without executing it.
Uploads of the same executable graph are serialized.
Each upload is ordered behind both any previous work in stream and any previous launches of this executable graph.
Uses memory cached by stream to back the allocations owned by this executable graph.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics.
pub fn update(&mut self, graph: &Graph) -> Result<ExecutableGraphUpdate>
Updates this executable graph with the node parameters in a topologically identical graph.
Limitations:
- Kernel nodes:
  - The owning context of the kernel function cannot change.
  - A node whose kernel function originally did not use CUDA dynamic parallelism cannot be updated to a kernel function that uses CDP.
  - A node whose kernel function originally did not make device-side update calls cannot be updated to a kernel function that makes device-side update calls.
  - A cooperative node cannot be updated to a non-cooperative node, and vice-versa.
  - If the graph was instantiated with GraphInstantiateFlags::USE_NODE_PRIORITY, the priority attribute cannot change. Equality is checked on the originally requested priority values, before they are clamped to the device’s supported range.
  - If this executable graph was not instantiated for device launch, a node whose kernel function originally did not use device-side ExecutableGraph::launch cannot be updated to a kernel function that uses device-side ExecutableGraph::launch unless the node resides on the same device as nodes which contained such calls at instantiate-time. If no such calls were present at instantiation, these updates cannot be performed at all.
  - Neither the source graph nor this executable graph may contain device-updatable kernel nodes.
- Memset and memcpy nodes:
  - The CUDA device(s) to which the operand(s) was allocated/mapped cannot change.
  - The source/destination memory must be allocated from the same contexts as the original source/destination memory.
  - For 2D memsets, only address and assigned value may be updated.
  - For 1D memsets, updating dimensions is also allowed, but may fail if the resulting operation does not map onto the work resources already allocated for the node.
- Additional memcpy node restrictions:
  - Changing either the source or destination memory type, such as MemoryType::Device or MemoryType::Array, is not supported.
- Conditional nodes:
  - Changing node parameters is not supported.
  - Changing parameters of nodes within the conditional body graph is subject to the rules above.
  - Conditional handle flags and default values are updated as part of the graph update.
CUDA may add further restrictions in future releases.
ExecutableGraph::update sets the update result to GraphExecUpdateResult::ErrorTopologyChanged under the following conditions:
- The count of nodes directly in the executable graph and the source graph differ.
- The source graph has more exit nodes.
- A node in the source graph has a different number of dependencies than the paired node from the executable graph.
- A node in the source graph has a dependency that does not match the corresponding dependency of the paired node from the executable graph. The dependencies are paired based on edge order and a dependency does not match when the nodes are already paired based on other edges examined in the graph.
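The four topology conditions above can be illustrated with a small, self-contained model. This is only a sketch: ToyGraph and topology_changed are hypothetical names that are not part of this crate, and the model assumes that nodes are paired by insertion index, which is a simplification of how CUDA pairs nodes.

```rust
/// Toy graph: `deps[i]` lists the nodes that node `i` depends on, in edge order.
/// Hypothetical stand-in for illustration; not a type from this crate.
pub struct ToyGraph {
    pub deps: Vec<Vec<usize>>,
}

impl ToyGraph {
    /// Counts exit nodes, i.e. nodes that no other node depends on.
    pub fn exit_node_count(&self) -> usize {
        let mut has_dependent = vec![false; self.deps.len()];
        for node_deps in &self.deps {
            for &dep in node_deps {
                // Ignore out-of-range indices instead of panicking.
                if let Some(slot) = has_dependent.get_mut(dep) {
                    *slot = true;
                }
            }
        }
        has_dependent.iter().filter(|&&h| !h).count()
    }
}

/// Returns true if `source` differs from `exec` in one of the four ways listed above.
/// Assumption: node `i` in `exec` is paired with node `i` in `source`.
pub fn topology_changed(exec: &ToyGraph, source: &ToyGraph) -> bool {
    // 1. The node counts differ.
    if exec.deps.len() != source.deps.len() {
        return true;
    }
    // 2. The source graph has more exit nodes.
    if source.exit_node_count() > exec.exit_node_count() {
        return true;
    }
    for (exec_deps, source_deps) in exec.deps.iter().zip(&source.deps) {
        // 3. A paired node has a different number of dependencies.
        if exec_deps.len() != source_deps.len() {
            return true;
        }
        // 4. A dependency, compared in edge order, does not match its counterpart.
        if exec_deps != source_deps {
            return true;
        }
    }
    false
}
```

In this model a three-node chain compared against a three-node fan-out reports a topology change, because the fan-out has two exit nodes where the chain has one.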
ExecutableGraph::update sets the update result to:
- GraphExecUpdateResult::Error if passed an invalid value.
- GraphExecUpdateResult::ErrorTopologyChanged if the graph topology changed.
- GraphExecUpdateResult::ErrorNodeTypeChanged if the type of a node changed.
- GraphExecUpdateResult::ErrorFunctionChanged if the kernel function of a node changed (CUDA driver before 11.2).
- GraphExecUpdateResult::ErrorUnsupportedFunctionChange if the kernel function changed in an unsupported way.
- GraphExecUpdateResult::ErrorParametersChanged if any parameters to a node changed in a way that is not supported.
- GraphExecUpdateResult::ErrorAttributesChanged if any attributes of a node changed in a way that is not supported.
- GraphExecUpdateResult::ErrorNotSupported if something about a node is unsupported, like the node’s type or configuration.
If the update fails for a reason not listed above, the result is GraphExecUpdateResult::Error.
If the update succeeds, the result is GraphExecUpdateResult::Success.
ExecutableGraph::update succeeds when the update was performed successfully.
It returns crate::error::Status::GraphExecUpdateFailure if the graph update was not performed because it included changes which violated constraints specific to instantiated graph update.
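A caller typically branches on the update result to decide whether the existing executable graph can still be used. The sketch below uses a local stand-in enum named UpdateResult whose variants mirror the names listed above; it is not the crate's GraphExecUpdateResult, and the helper functions explain and needs_reinstantiation are hypothetical. Falling back to a fresh instantiation after a rejected update is a common pattern, assumed here rather than prescribed by this API.

```rust
/// Local stand-in mirroring the variant names documented above.
/// Hypothetical; the crate's own `GraphExecUpdateResult` may differ in shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpdateResult {
    Success,
    Error,
    ErrorTopologyChanged,
    ErrorNodeTypeChanged,
    ErrorFunctionChanged,
    ErrorUnsupportedFunctionChange,
    ErrorParametersChanged,
    ErrorAttributesChanged,
    ErrorNotSupported,
}

/// Maps each result to the condition described in the list above.
pub fn explain(result: UpdateResult) -> &'static str {
    match result {
        UpdateResult::Success => "update applied",
        UpdateResult::Error => "invalid value or unlisted failure",
        UpdateResult::ErrorTopologyChanged => "graph topology changed",
        UpdateResult::ErrorNodeTypeChanged => "type of a node changed",
        UpdateResult::ErrorFunctionChanged => {
            "kernel function of a node changed (CUDA driver before 11.2)"
        }
        UpdateResult::ErrorUnsupportedFunctionChange => {
            "kernel function changed in an unsupported way"
        }
        UpdateResult::ErrorParametersChanged => "node parameters changed in an unsupported way",
        UpdateResult::ErrorAttributesChanged => "node attributes changed in an unsupported way",
        UpdateResult::ErrorNotSupported => "node type or configuration is unsupported",
    }
}

/// Assumed fallback policy: any non-success result means the caller
/// should instantiate a new executable graph from the source graph.
pub fn needs_reinstantiation(result: UpdateResult) -> bool {
    result != UpdateResult::Success
}
```

The exhaustive match means that adding a variant to the stand-in enum forces the caller to decide how to handle it.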
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph update, if the update violates instantiated graph
update constraints, or if a previous asynchronous launch reported an error. CUDA may also
return initialization-related errors such as crate::error::Status::NotInitialized,
crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes
internal runtime state. Callbacks must not call CUDA functions; see
Stream::add_callback.
pub unsafe fn set_kernel_node_params<'a, P>(
    &mut self,
    node: GraphNode,
    function: DeviceFunction,
    config: &LaunchConfig,
    params: P,
) -> Result<()>
where
    P: KernelLaunchArgs<'a>,
Sets the parameters of a kernel node in this executable graph.
The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.
node must not have been removed from the original graph.
All node parameters may change, but the following restrictions apply to function updates:
- The owning device of the kernel function cannot change.
- A node whose kernel function originally did not use CUDA dynamic parallelism cannot be updated to a kernel function that uses CDP.
- A node whose kernel function originally did not make device-side update calls cannot be updated to a kernel function that makes device-side update calls.
- If this executable graph was not instantiated for device launch, a node whose kernel function originally did not use device-side ExecutableGraph::launch cannot be updated to a kernel function that uses device-side ExecutableGraph::launch unless the node resides on the same device as nodes which contained such calls at instantiate-time. If no such calls were present at instantiation, these updates cannot be performed at all.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
If node is a device-updatable kernel node, the next upload or launch of this executable graph will overwrite any previous device-side updates.
Additionally, applying host updates to a device-updatable kernel node while it is being updated from the device results in undefined behavior.
This can also be used with a runtime kernel handle queried through sys::cudaLibraryGetKernel or sys::cudaGetKernel and then passed as a raw pointer.
The symbol passed to sys::cudaGetKernel must be registered with the same CUDA Runtime instance.
Passing a symbol that belongs to a different runtime instance results in undefined behavior.
The only type that can be reliably passed to a different runtime instance is the runtime kernel handle type itself.
Graph objects are not threadsafe.
§Safety
CUDA copies the kernel argument values during this call and stores those copied values in the executable graph for future launches. If an argument value is itself a pointer, only the pointer address is copied. The caller must ensure every copied pointer value remains valid for every future launch that can execute this node. Mutable pointer arguments must also remain exclusive for the work ordered by those launches.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn set_memory_copy_node_1d_params(
    &mut self,
    node: GraphNode,
    params: &MemoryCopy1DNodeParams,
) -> Result<()>
Updates the work represented by node in this executable graph as though node had contained the given params at instantiation.
node must remain in the graph which was used to instantiate this executable graph.
Changed edges to and from node are ignored.
The source and destination must be allocated from the same contexts as the original source and destination memory. The instantiation-time memory operands must be 1-dimensional. Zero-length operations are not supported.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
Returns crate::error::Status::InvalidValue if the memory operands’ mappings changed or the original memory operands are multidimensional.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw source and destination addresses in the executable
graph for future launches. The caller must ensure params remains
valid according to MemoryCopy1DNodeParams::new for every future launch
that can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn set_memory_copy_node_1d_device_to_device<D, S>(
    &mut self,
    node: GraphNode,
    dst: &mut D,
    src: &S,
) -> Result<()>
Updates a memcpy node to copy between typed device byte buffers.
The node copies src.byte_len() bytes. dst must have at least that
many bytes.
§Safety
CUDA stores the raw source and destination addresses in the executable
graph for future launches. The caller must ensure dst and src
remain valid for every future launch that can execute this node. dst
must not be accessed through another mutable path while graph launches
using this node can write it.
§Errors
Returns an error if dst is smaller than src, if CUDA rejects the graph
operation, if a previous asynchronous launch reported an error, or if CUDA
reports runtime initialization diagnostics.
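The size rule above (the node copies src.byte_len() bytes and dst must hold at least that many) can be checked on the host before the update is attempted. The sketch below is a minimal stand-alone illustration: CopyTooLarge, check_copy_fits, and byte_len_of are hypothetical helpers, not items from this crate, and the crate's actual error type for this case is not shown in this documentation.

```rust
/// Hypothetical error describing a destination that is too small.
#[derive(Debug, PartialEq, Eq)]
pub struct CopyTooLarge {
    pub src_bytes: usize,
    pub dst_bytes: usize,
}

/// Returns the number of bytes the node would copy, or an error if the
/// destination cannot hold the whole source.
pub fn check_copy_fits(src_bytes: usize, dst_bytes: usize) -> Result<usize, CopyTooLarge> {
    if dst_bytes < src_bytes {
        Err(CopyTooLarge { src_bytes, dst_bytes })
    } else {
        // The copy length is driven by the source, not the destination.
        Ok(src_bytes)
    }
}

/// Computes a byte length from an element count without overflowing.
pub fn byte_len_of<T>(elements: usize) -> Option<usize> {
    elements.checked_mul(std::mem::size_of::<T>())
}
```

A destination larger than the source is accepted; only the leading src_bytes bytes would be written in that case.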
pub fn set_buffer_memory_copy_node_1d_device_to_device<T>(
    &mut self,
    node: GraphNode,
    dst: &mut GraphBuffer<T>,
    src: &GraphBuffer<T>,
) -> Result<()>
Updates a memcpy node to copy between graph-retained buffers.
The node copies src.byte_len() bytes. dst must have at least that
many bytes. The executable graph retains both allocations so future
launches cannot outlive the baked CUDA pointer values.
§Errors
Returns an error if dst is smaller than src, if node does not
belong to the graph used to instantiate this executable graph, if CUDA
rejects the graph update, if a previous asynchronous launch reported an
error, or if CUDA reports runtime initialization diagnostics.
pub unsafe fn set_memory_copy_node_params(
    &mut self,
    node: GraphNode,
    params: &MemoryCopy3DNodeParams,
) -> Result<()>
Updates the work represented by node in this executable graph as though node had contained the given params at instantiation.
node must remain in the graph which was used to instantiate this executable graph.
Changed edges to and from node are ignored.
The source and destination memory in params must be allocated from the same contexts as the original source and destination memory.
Both the instantiation-time memory operands and the memory operands in params must be 1-dimensional.
Zero-length operations are not supported.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
Returns crate::error::Status::InvalidValue if the memory operands’ mappings changed or either the original or new memory operands are multidimensional.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw source and destination addresses in the executable
graph for future launches. The caller must ensure params remains
valid according to MemoryCopy3DNodeParams for every future launch that
can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn set_memory_copy_node_to_symbol_params(
    &mut self,
    node: GraphNode,
    params: &MemoryCopyToSymbolNodeParams,
) -> Result<()>
§Safety
CUDA stores the raw symbol and source pointer in the executable graph
for future launches. The caller must ensure params remains valid
according to MemoryCopyToSymbolNodeParams::new for every future launch
that can execute this node.
pub unsafe fn set_memory_copy_node_from_symbol_params(
    &mut self,
    node: GraphNode,
    params: &MemoryCopyFromSymbolNodeParams,
) -> Result<()>
§Safety
CUDA stores the raw destination and symbol pointer in the executable
graph for future launches. The caller must ensure params remains
valid according to MemoryCopyFromSymbolNodeParams::new for every future
launch that can execute this node.
pub unsafe fn set_memory_set_node_params(
    &mut self,
    node: GraphNode,
    params: &MemorySetNodeParams,
) -> Result<()>
Updates the work represented by node in this executable graph as though node had contained the given params at instantiation.
node must remain in the graph which was used to instantiate this executable graph.
Changed edges to and from node are ignored.
Zero-sized operations are not supported.
The new destination pointer in params must be to the same kind of allocation as the original destination pointer and have the same context association and device mapping as the original destination pointer.
Both the value and pointer address may be updated. Changing other aspects of the memset (width, height, element size or pitch) may cause the update to be rejected. Specifically, for 2D memsets, all dimension changes are rejected. For 1D memsets, changes in height are explicitly rejected and other changes are opportunistically allowed if the resulting work maps onto the work resources already allocated for the node.
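The dimension rules in the paragraph above can be summarized as a small decision function. This is a sketch under stated assumptions: ToyMemset, Verdict, and classify are hypothetical names, a memset with height greater than 1 is treated as 2D, and the allocation-kind, context, and device-mapping requirements on the destination pointer are not modeled.

```rust
/// Hypothetical description of a memset; not the crate's `MemorySetNodeParams`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ToyMemset {
    pub dst: usize,
    pub value: u32,
    pub element_size: u32,
    pub width: usize,
    pub height: usize,
    pub pitch: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    /// Only the address and/or value differ; dimensions are untouched.
    AddressOrValueOnly,
    /// Dimensions differ on a 1D memset; accepted only if the new work
    /// maps onto the resources already allocated for the node.
    Opportunistic,
    /// The change is rejected outright.
    Rejected,
}

/// Classifies an update from `original` to `updated`.
/// Assumption: `height > 1` marks a 2D memset.
pub fn classify(original: &ToyMemset, updated: &ToyMemset) -> Verdict {
    let dims_changed = original.width != updated.width
        || original.height != updated.height
        || original.element_size != updated.element_size
        || original.pitch != updated.pitch;
    if !dims_changed {
        return Verdict::AddressOrValueOnly;
    }
    // 2D memsets reject every dimension change; 1D memsets reject height changes.
    if original.height > 1 || original.height != updated.height {
        return Verdict::Rejected;
    }
    Verdict::Opportunistic
}
```

An Opportunistic verdict is not a guarantee: in that case the real call may still fail, so callers should be prepared to handle an error.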
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw destination address in the executable graph for
future launches. The caller must ensure params remains valid according
to MemorySetNodeParams::new for every future launch that can execute
this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub unsafe fn set_host_node_params(
    &mut self,
    node: GraphNode,
    params: &HostNodeParams,
) -> Result<()>
Updates the work represented by node in this executable graph as though node had contained the given params at instantiation.
node must remain in the graph which was used to instantiate this executable graph.
Changed edges to and from node are ignored.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
Graph objects are not threadsafe.
§Safety
CUDA stores the raw callback function and user-data pointer in the
executable graph for future launches. The caller must ensure params
remains valid according to HostNodeParams::new for every future
launch that can execute this node.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn set_event_record_node_event(
    &mut self,
    node: GraphNode,
    event: &Event,
) -> Result<()>
Sets the event of an event record node in this executable graph.
The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn set_child_graph_node(
    &mut self,
    node: GraphNode,
    child_graph: &Graph,
) -> Result<()>
Updates the work represented by node in this executable graph as though the nodes contained in node’s graph had the parameters contained in child_graph’s nodes at instantiation.
node must remain in the graph which was used to instantiate this executable graph.
Changed edges to and from node are ignored.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
The topology of child_graph, as well as the node insertion order, must match that of the graph contained in node.
See ExecutableGraph::update for a list of restrictions on what can be updated in an instantiated graph.
The update is recursive, so child graph nodes contained within the top-level child graph are also updated.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn set_event_wait_node_event(
    &mut self,
    node: GraphNode,
    event: &Event,
) -> Result<()>
Sets the event of an event wait node in this executable graph.
The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.
The modifications only affect future launches of this executable graph.
Already enqueued or running launches of this executable graph are not affected by this call.
The original node is also not modified by this call.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub fn enable_node(&mut self, node: GraphNode) -> Result<()>
pub fn disable_node(&mut self, node: GraphNode) -> Result<()>
pub fn is_node_enabled(&self, node: GraphNode) -> Result<bool>
Returns whether node is enabled.
The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.
node must not have been removed from the original graph.
Currently only kernel, memset and memcpy nodes are supported.
Graph objects are not threadsafe.
§Errors
Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch
reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must
not call CUDA functions; see Stream::add_callback.
pub const fn as_raw(&self) -> cudaGraphExec_t
pub fn context(&self) -> Option<&Context>
pub fn into_raw(self) -> cudaGraphExec_t
Consumes the executable graph and returns the raw CUDA executable graph handle without destroying it.
The caller becomes responsible for eventually destroying the returned handle with CUDA.
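The ownership handoff described above follows a common Rust pattern: a wrapper whose Drop destroys the handle, and an into_raw that suppresses that destructor. The sketch below models the pattern with stand-ins only. ToyExec, RawHandle, destroy_raw, and the DESTROYED counter are hypothetical, and how this crate implements into_raw internally is an assumption, not something this documentation states.

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many handles have been destroyed, so the effect is observable.
pub static DESTROYED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical raw handle type standing in for `cudaGraphExec_t`.
pub type RawHandle = usize;

/// Stand-in for the CUDA call that destroys an executable graph.
pub fn destroy_raw(_handle: RawHandle) {
    DESTROYED.fetch_add(1, Ordering::SeqCst);
}

/// Toy owning wrapper: destroys its handle when dropped.
pub struct ToyExec {
    raw: RawHandle,
}

impl ToyExec {
    pub fn from_raw(raw: RawHandle) -> Self {
        Self { raw }
    }

    /// Gives up ownership without running `Drop`, so the handle stays alive.
    pub fn into_raw(self) -> RawHandle {
        let this = ManuallyDrop::new(self);
        this.raw
    }
}

impl Drop for ToyExec {
    fn drop(&mut self) {
        destroy_raw(self.raw);
    }
}
```

After into_raw, nothing destroys the handle automatically; in this model the caller must call destroy_raw exactly once, otherwise the handle leaks.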