Struct ExecutableGraph


pub struct ExecutableGraph { /* private fields */ }

Implementations§


impl ExecutableGraph


pub fn flags(&self) -> Result<GraphInstantiateFlags>

Returns the flags that were passed to instantiation for the given executable graph. GraphInstantiateFlags::UPLOAD is not returned because it does not affect the resulting executable graph.
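As a rough illustration of the UPLOAD exclusion, the sketch below models instantiation flags as plain bits. The constant names and bit values are assumptions made for this example; they are not this crate's GraphInstantiateFlags type, and `reported_flags` is a hypothetical helper.

```rust
// Toy bit flags. Names and values are illustrative assumptions,
// not the crate's real GraphInstantiateFlags definition.
const AUTO_FREE_ON_LAUNCH: u32 = 1 << 0;
const UPLOAD: u32 = 1 << 1;
const DEVICE_LAUNCH: u32 = 1 << 2;
const USE_NODE_PRIORITY: u32 = 1 << 3;

/// Models what `flags()` reports: the flags requested at instantiation,
/// minus UPLOAD, which does not change the resulting executable graph.
fn reported_flags(requested: u32) -> u32 {
    requested & !UPLOAD
}
```

Under this model, a graph instantiated with both the auto-free and upload bits would report only the auto-free bit.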

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub fn launch(&self, stream: &Stream) -> Result<()>

Executes this executable graph in stream. Only one instance of this executable graph may be executing at a time. Each launch is ordered behind both any previous work in stream and any previous launches of this executable graph. To execute a graph concurrently, it must be instantiated multiple times into multiple executable graphs.

If any allocations created by this executable graph remain unfreed from a previous launch and the graph was not instantiated with GraphInstantiateFlags::AUTO_FREE_ON_LAUNCH, the launch fails with crate::error::Status::InvalidValue.
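The allocation rule above can be sketched with a small host-only model. Everything here (`ToyStatus`, `ToyExecGraph`, its fields and methods) is hypothetical and only mirrors the documented behavior; it does not call CUDA or this crate.

```rust
/// Hypothetical stand-in for the crate's status codes.
#[derive(Debug, PartialEq)]
enum ToyStatus {
    InvalidValue,
}

/// Toy model of an executable graph that owns allocations.
struct ToyExecGraph {
    auto_free_on_launch: bool,
    unfreed_allocations: usize,
    allocations_per_launch: usize,
}

impl ToyExecGraph {
    /// Mirrors the documented rule: a launch fails with InvalidValue when
    /// allocations from a previous launch are still live, unless the graph
    /// was instantiated with the auto-free flag.
    fn launch(&mut self) -> Result<(), ToyStatus> {
        if self.unfreed_allocations > 0 {
            if !self.auto_free_on_launch {
                return Err(ToyStatus::InvalidValue);
            }
            // Auto-free: leftovers are released before the new launch runs.
            self.unfreed_allocations = 0;
        }
        self.unfreed_allocations += self.allocations_per_launch;
        Ok(())
    }

    /// Models the caller freeing everything the last launch allocated.
    fn free_all(&mut self) {
        self.unfreed_allocations = 0;
    }
}
```

In this model, a second launch without an intervening `free_all` fails unless `auto_free_on_launch` is set.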

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub const fn launch_operation(&self) -> ExecutableGraphLaunchOperation<'_>

Returns a reusable operation object that launches this executable graph.


pub fn upload(&self, stream: &Stream) -> Result<()>

Uploads this executable graph to the device in stream without executing it. Uploads of the same executable graph are serialized. Each upload is ordered behind both any previous work in stream and any previous launches of this executable graph. Uses memory cached by stream to back the allocations owned by this executable graph.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics.


pub fn update(&mut self, graph: &Graph) -> Result<ExecutableGraphUpdate>

Updates this executable graph with the node parameters in a topologically identical graph.

Limitations:

Kernel nodes:
- The owning context of the kernel function cannot change.
- A node whose kernel function originally did not use CUDA dynamic parallelism cannot be updated to a kernel function that uses CDP.
- A node whose kernel function originally did not make device-side update calls cannot be updated to a kernel function that makes device-side update calls.
- A cooperative node cannot be updated to a non-cooperative node, and vice-versa.
- If the graph was instantiated with GraphInstantiateFlags::USE_NODE_PRIORITY, the priority attribute cannot change. Equality is checked on the originally requested priority values, before they are clamped to the device’s supported range.
- If this executable graph was not instantiated for device launch, a node whose kernel function originally did not use device-side ExecutableGraph::launch cannot be updated to a kernel function that uses device-side ExecutableGraph::launch unless the node resides on the same device as nodes which contained such calls at instantiate-time. If no such calls were present at instantiation, these updates cannot be performed at all.
- Neither the source graph nor this executable graph may contain device-updatable kernel nodes.

Memset and memcpy nodes:
- The CUDA device(s) to which the operand(s) were allocated or mapped cannot change.
- The source/destination memory must be allocated from the same contexts as the original source/destination memory.
- For 2D memsets, only address and assigned value may be updated.
- For 1D memsets, updating dimensions is also allowed, but may fail if the resulting operation does not map onto the work resources already allocated for the node.

Additional memcpy node restrictions:
- Changing either the source or destination memory type, such as MemoryType::Device or MemoryType::Array, is not supported.

Conditional nodes:
- Changing node parameters is not supported.
- Changing parameters of nodes within the conditional body graph is subject to the rules above.
- Conditional handle flags and default values are updated as part of the graph update.

CUDA may add further restrictions in future releases.

ExecutableGraph::update sets the update result to GraphExecUpdateResult::ErrorTopologyChanged under the following conditions:

- The number of nodes directly in the executable graph differs from the number directly in the source graph.
- The source graph has more exit nodes than the executable graph.
- A node in the source graph has a different number of dependencies than the paired node from the executable graph.
- A node in the source graph has a dependency that does not match the corresponding dependency of the paired node from the executable graph. The dependencies are paired based on edge order, and a dependency does not match when the nodes are already paired based on other edges examined in the graph.
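A simplified sketch of these checks on a toy dependency list follows. It assumes nodes are paired by index and that all dependency indices are in range, which is simpler than the pairing CUDA performs; `ToyGraph`, `exit_node_count`, and `topology_changed` are hypothetical names for this example only.

```rust
/// Toy graph: `deps[i]` lists the nodes that node `i` depends on, in edge order.
type ToyGraph = Vec<Vec<usize>>;

/// Counts nodes that no other node depends on.
fn exit_node_count(graph: &ToyGraph) -> usize {
    let mut has_dependent = vec![false; graph.len()];
    for deps in graph {
        for &d in deps {
            has_dependent[d] = true;
        }
    }
    has_dependent.iter().filter(|&&h| !h).count()
}

/// Returns true when the simplified checks would flag a topology change.
/// Assumption: nodes are paired by index, unlike CUDA's edge-based pairing.
fn topology_changed(exec: &ToyGraph, source: &ToyGraph) -> bool {
    // Different node counts.
    if exec.len() != source.len() {
        return true;
    }
    // The source graph has more exit nodes.
    if exit_node_count(source) > exit_node_count(exec) {
        return true;
    }
    // A paired node has a different number of dependencies, or a
    // dependency that does not match in edge order.
    exec.iter().zip(source).any(|(e, s)| e != s)
}
```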

ExecutableGraph::update sets the update result to:

- GraphExecUpdateResult::Error if passed an invalid value.
- GraphExecUpdateResult::ErrorTopologyChanged if the graph topology changed.
- GraphExecUpdateResult::ErrorNodeTypeChanged if the type of a node changed.
- GraphExecUpdateResult::ErrorFunctionChanged if the kernel function of a node changed (CUDA driver before 11.2).
- GraphExecUpdateResult::ErrorUnsupportedFunctionChange if the kernel function changed in an unsupported way.
- GraphExecUpdateResult::ErrorParametersChanged if any parameters to a node changed in a way that is not supported.
- GraphExecUpdateResult::ErrorAttributesChanged if any attributes of a node changed in a way that is not supported.
- GraphExecUpdateResult::ErrorNotSupported if something about a node is unsupported, like the node’s type or configuration.

If the update fails for a reason not listed above, the result is GraphExecUpdateResult::Error. If the update succeeds, the result is GraphExecUpdateResult::Success.

ExecutableGraph::update succeeds when the update was performed successfully. It returns crate::error::Status::GraphExecUpdateFailure if the graph update was not performed because it included changes which violated constraints specific to instantiated graph update.
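A common way to handle a rejected update is to fall back to instantiating a fresh executable graph. The sketch below shows that control flow with closures standing in for the real calls; `ToyUpdateResult`, `Outcome`, and `update_or_reinstantiate` are hypothetical and do not reflect this crate's types or return values.

```rust
/// Hypothetical mirror of a few update results named in this section.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyUpdateResult {
    Success,
    ErrorTopologyChanged,
    ErrorParametersChanged,
}

/// What the caller ended up doing.
#[derive(Debug, PartialEq)]
enum Outcome {
    UpdatedInPlace,
    Reinstantiated(ToyUpdateResult),
}

/// Tries the cheaper in-place update first and falls back to a full
/// re-instantiation when the update is rejected for any reason.
fn update_or_reinstantiate(
    try_update: impl FnOnce() -> ToyUpdateResult,
    reinstantiate: impl FnOnce(),
) -> Outcome {
    match try_update() {
        ToyUpdateResult::Success => Outcome::UpdatedInPlace,
        reason => {
            reinstantiate();
            Outcome::Reinstantiated(reason)
        }
    }
}
```

Keeping the rejection reason in the outcome lets the caller log why the in-place path was not taken.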

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph update, if the update violates instantiated graph update constraints, or if a previous asynchronous launch reported an error. CUDA may also return initialization-related errors such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes internal runtime state. Callbacks must not call CUDA functions; see Stream::add_callback.


pub unsafe fn set_kernel_node_params<'a, P>( &mut self, node: GraphNode, function: DeviceFunction, config: &LaunchConfig, params: P, ) -> Result<()>
where P: KernelLaunchArgs<'a>,

Sets the parameters of a kernel node in this executable graph. The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.

node must not have been removed from the original graph. All node parameters may change, but the following restrictions apply to function updates:

- The owning device of the kernel function cannot change.
- A node whose kernel function originally did not use CUDA dynamic parallelism cannot be updated to a kernel function that uses CDP.
- A node whose kernel function originally did not make device-side update calls cannot be updated to a kernel function that makes device-side update calls.
- If this executable graph was not instantiated for device launch, a node whose kernel function originally did not use device-side ExecutableGraph::launch cannot be updated to a kernel function that uses device-side ExecutableGraph::launch unless the node resides on the same device as nodes which contained such calls at instantiate-time. If no such calls were present at instantiation, these updates cannot be performed at all.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.
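The "future launches only" rule can be pictured with a host-only model in which each launch records the parameters that were current when it was enqueued. `ToyParams` and `ToyExec` are hypothetical types for this sketch and are unrelated to the crate's API.

```rust
/// Toy node parameters held by the original (non-executable) graph.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyParams {
    value: u32,
}

/// Toy executable graph: instantiation copies the parameters, and each
/// launch records the parameters that were current when it was enqueued.
struct ToyExec {
    params: ToyParams,
    enqueued: Vec<ToyParams>,
}

impl ToyExec {
    fn instantiate(original: &ToyParams) -> Self {
        ToyExec { params: *original, enqueued: Vec::new() }
    }

    fn launch(&mut self) {
        self.enqueued.push(self.params);
    }

    fn set_params(&mut self, params: ToyParams) {
        // Affects later launches only; `enqueued` entries are untouched,
        // and so is the original the graph was instantiated from.
        self.params = params;
    }
}
```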

If node is a device-updatable kernel node, the next upload or launch of this executable graph will overwrite any previous device-side updates. Additionally, applying host updates to a device-updatable kernel node while it is being updated from the device results in undefined behavior.

This function can also be used with a runtime kernel handle queried through sys::cudaLibraryGetKernel or sys::cudaGetKernel and then passed as a raw pointer. The symbol passed to sys::cudaGetKernel must be registered with the same CUDA Runtime instance; passing a symbol that belongs to a different runtime instance results in undefined behavior. The only type that can be reliably passed to a different runtime instance is the runtime kernel handle type itself.

Graph objects are not threadsafe.

§Safety

CUDA copies the kernel argument values during this call and stores those copied values in the executable graph for future launches. If an argument value is itself a pointer, only the pointer address is copied. The caller must ensure every copied pointer value remains valid for every future launch that can execute this node. Mutable pointer arguments must also remain exclusive for the work ordered by those launches.
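To make the pointer rule concrete, here is a host-only sketch in which a toy node stores just an address and a lifetime stands in for the caller's obligation. `ToyNode` is hypothetical; the real API cannot express this requirement in a lifetime, which is why the function is unsafe.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

/// Toy stand-in for a node that stores a pointer-valued kernel argument.
struct ToyNode<'a> {
    // Only the address is copied into the node; the pointee is not.
    arg: *const Cell<u32>,
    // The lifetime plays the role of the caller's obligation: the pointee
    // must outlive every launch that can execute this node.
    _pointee: PhantomData<&'a Cell<u32>>,
}

impl<'a> ToyNode<'a> {
    fn new(arg: &'a Cell<u32>) -> Self {
        ToyNode { arg: arg as *const Cell<u32>, _pointee: PhantomData }
    }

    /// Models a later launch dereferencing the stored address.
    fn launch(&self) -> u32 {
        // Sound here because `'a` guarantees the pointee is still alive.
        unsafe { (*self.arg).get() }
    }
}
```

Because only the address is stored, a launch in this model observes whatever the pointee holds at launch time, not the value it held when the node was set up.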

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub unsafe fn set_memory_copy_node_1d_params( &mut self, node: GraphNode, params: &MemoryCopy1DNodeParams, ) -> Result<()>

Updates the work represented by node in this executable graph as though node had contained the given params at instantiation. node must remain in the graph which was used to instantiate this executable graph. Changed edges to and from node are ignored.

The source and destination must be allocated from the same contexts as the original source and destination memory. The instantiation-time memory operands must be 1-dimensional. Zero-length operations are not supported.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

Returns crate::error::Status::InvalidValue if the memory operands’ mappings changed or the original memory operands are multidimensional.

Graph objects are not threadsafe.

§Safety

CUDA stores the raw source and destination addresses in the executable graph for future launches. The caller must ensure params remains valid according to MemoryCopy1DNodeParams::new for every future launch that can execute this node.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub unsafe fn set_memory_copy_node_1d_device_to_device<D, S>( &mut self, node: GraphNode, dst: &mut D, src: &S, ) -> Result<()>
where D: ByteBufferMut + ?Sized, S: ByteBuffer + ?Sized,

Updates a memcpy node to copy between typed device byte buffers.

The node copies src.byte_len() bytes. dst must have at least that many bytes.

§Safety

CUDA stores the raw source and destination addresses in the executable graph for future launches. The caller must ensure dst and src remain valid for every future launch that can execute this node. dst must not be accessed through another mutable path while graph launches using this node can write it.

§Errors

Returns an error if dst is smaller than src, if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics.
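The size precondition amounts to a comparison of byte lengths. A minimal sketch, assuming plain `usize` lengths and a hypothetical error type (`DestinationTooSmall` and `checked_copy_len` are not part of this crate):

```rust
/// Hypothetical error for this sketch; the crate's own error type is not shown here.
#[derive(Debug, PartialEq)]
struct DestinationTooSmall {
    needed: usize,
    available: usize,
}

/// Returns the number of bytes the node would copy, or an error when
/// the destination cannot hold the whole source.
fn checked_copy_len(dst_bytes: usize, src_bytes: usize) -> Result<usize, DestinationTooSmall> {
    if dst_bytes < src_bytes {
        return Err(DestinationTooSmall { needed: src_bytes, available: dst_bytes });
    }
    // The copy length is driven by the source, even when dst is larger.
    Ok(src_bytes)
}
```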


pub fn set_buffer_memory_copy_node_1d_device_to_device<T>( &mut self, node: GraphNode, dst: &mut GraphBuffer<T>, src: &GraphBuffer<T>, ) -> Result<()>
where T: DeviceRepr + Send + Sync,

Updates a memcpy node to copy between graph-retained buffers.

The node copies src.byte_len() bytes. dst must have at least that many bytes. The executable graph retains both allocations, so the CUDA pointer values baked into the node remain valid for every future launch.

§Errors

Returns an error if dst is smaller than src, if node does not belong to the graph used to instantiate this executable graph, if CUDA rejects the graph update, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics.


pub unsafe fn set_memory_copy_node_params( &mut self, node: GraphNode, params: &MemoryCopy3DNodeParams, ) -> Result<()>

Updates the work represented by node in this executable graph as though node had contained the given params at instantiation. node must remain in the graph which was used to instantiate this executable graph. Changed edges to and from node are ignored.

The source and destination memory in params must be allocated from the same contexts as the original source and destination memory. Both the instantiation-time memory operands and the memory operands in params must be 1-dimensional. Zero-length operations are not supported.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

Returns crate::error::Status::InvalidValue if the memory operands’ mappings changed or either the original or new memory operands are multidimensional.

Graph objects are not threadsafe.

§Safety

CUDA stores the raw source and destination addresses in the executable graph for future launches. The caller must ensure params remains valid according to MemoryCopy3DNodeParams for every future launch that can execute this node.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub unsafe fn set_memory_copy_node_to_symbol_params( &mut self, node: GraphNode, params: &MemoryCopyToSymbolNodeParams, ) -> Result<()>

§Safety

CUDA stores the raw symbol and source pointer in the executable graph for future launches. The caller must ensure params remains valid according to MemoryCopyToSymbolNodeParams::new for every future launch that can execute this node.


pub unsafe fn set_memory_copy_node_from_symbol_params( &mut self, node: GraphNode, params: &MemoryCopyFromSymbolNodeParams, ) -> Result<()>

§Safety

CUDA stores the raw destination and symbol pointer in the executable graph for future launches. The caller must ensure params remains valid according to MemoryCopyFromSymbolNodeParams::new for every future launch that can execute this node.


pub unsafe fn set_memory_set_node_params( &mut self, node: GraphNode, params: &MemorySetNodeParams, ) -> Result<()>

Updates the work represented by node in this executable graph as though node had contained the given params at instantiation. node must remain in the graph which was used to instantiate this executable graph. Changed edges to and from node are ignored.

Zero-sized operations are not supported.

The new destination pointer in params must be to the same kind of allocation as the original destination pointer and have the same context association and device mapping as the original destination pointer.

Both the value and pointer address may be updated. Changing other aspects of the memset (width, height, element size or pitch) may cause the update to be rejected. Specifically, for 2D memsets, all dimension changes are rejected. For 1D memsets, changes in height are explicitly rejected and other changes are opportunistically allowed if the resulting work maps onto the work resources already allocated for the node.
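These rules can be summarized as a small decision function. The sketch below is a host-only model: `ToyMemset`, its field names, the `Verdict` enum, and the choice to treat `height > 1` as "2D" are all assumptions for illustration, and zero-sized operations are not modeled.

```rust
/// Toy description of a memset node; field names are assumptions for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyMemset {
    dst: usize,
    value: u32,
    element_size: usize,
    width: usize,
    height: usize,
    pitch: usize,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Accepted,
    Rejected,
    /// Depends on whether the new work fits the node's existing resources.
    MaybeAccepted,
}

fn classify_update(old: &ToyMemset, new: &ToyMemset) -> Verdict {
    let same_shape = old.element_size == new.element_size
        && old.width == new.width
        && old.height == new.height
        && old.pitch == new.pitch;
    if same_shape {
        // Only the address and/or value changed.
        return Verdict::Accepted;
    }
    // Assumption: a memset with more than one row counts as 2D.
    let is_2d = old.height > 1;
    if is_2d || old.height != new.height {
        return Verdict::Rejected;
    }
    Verdict::MaybeAccepted
}
```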

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

Graph objects are not threadsafe.

§Safety

CUDA stores the raw destination address in the executable graph for future launches. The caller must ensure params remains valid according to MemorySetNodeParams::new for every future launch that can execute this node.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub unsafe fn set_host_node_params( &mut self, node: GraphNode, params: &HostNodeParams, ) -> Result<()>

Updates the work represented by node in this executable graph as though node had contained the given params at instantiation. node must remain in the graph which was used to instantiate this executable graph. Changed edges to and from node are ignored.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

Graph objects are not threadsafe.

§Safety

CUDA stores the raw callback function and user-data pointer in the executable graph for future launches. The caller must ensure params remains valid according to HostNodeParams::new for every future launch that can execute this node.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub fn set_event_record_node_event( &mut self, node: GraphNode, event: &Event, ) -> Result<()>

Sets the event of an event record node in this executable graph. The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub fn set_child_graph_node( &mut self, node: GraphNode, child_graph: &Graph, ) -> Result<()>

Updates the work represented by node in this executable graph as though the nodes contained in node’s graph had the parameters contained in child_graph’s nodes at instantiation. node must remain in the graph which was used to instantiate this executable graph. Changed edges to and from node are ignored.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

The topology of child_graph, as well as the node insertion order, must match that of the graph contained in node. See ExecutableGraph::update for a list of restrictions on what can be updated in an instantiated graph. The update is recursive, so child graph nodes contained within the top-level child graph are also updated.

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub fn set_event_wait_node_event( &mut self, node: GraphNode, event: &Event, ) -> Result<()>

Sets the event of an event wait node in this executable graph. The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.

The modifications only affect future launches of this executable graph. Already enqueued or running launches of this executable graph are not affected by this call. The original node is also not modified by this call.

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.


pub fn enable_node(&mut self, node: GraphNode) -> Result<()>


pub fn disable_node(&mut self, node: GraphNode) -> Result<()>


pub fn is_node_enabled(&self, node: GraphNode) -> Result<bool>

Returns whether node is enabled.

The node is identified by the corresponding node in the non-executable graph from which this executable graph was instantiated.

node must not have been removed from the original graph.

Currently only kernel, memset and memcpy nodes are supported.
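A host-only sketch of per-executable-graph enabled state, restricted to the supported node kinds, might look like this. `ToyKind`, `ToyError`, and `ToyNodeStates` are hypothetical; the sketch also assumes that every node starts out enabled and that an unsupported kind is reported as an error.

```rust
use std::collections::HashMap;

/// Toy node kinds; `Host` stands in for any kind outside the supported set.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyKind {
    Kernel,
    Memset,
    Memcpy,
    Host,
}

#[derive(Debug, PartialEq)]
enum ToyError {
    UnknownNode,
    UnsupportedKind,
}

/// Enabled state tracked per executable graph, keyed by a toy node id.
struct ToyNodeStates {
    nodes: HashMap<u32, (ToyKind, bool)>,
}

impl ToyNodeStates {
    fn new(nodes: &[(u32, ToyKind)]) -> Self {
        // Assumption: every node starts enabled.
        let nodes = nodes.iter().map(|&(id, kind)| (id, (kind, true))).collect();
        ToyNodeStates { nodes }
    }

    fn set_enabled(&mut self, id: u32, enabled: bool) -> Result<(), ToyError> {
        let entry = self.nodes.get_mut(&id).ok_or(ToyError::UnknownNode)?;
        match entry.0 {
            ToyKind::Kernel | ToyKind::Memset | ToyKind::Memcpy => {
                entry.1 = enabled;
                Ok(())
            }
            ToyKind::Host => Err(ToyError::UnsupportedKind),
        }
    }

    fn is_enabled(&self, id: u32) -> Result<bool, ToyError> {
        match self.nodes.get(&id) {
            Some(&(ToyKind::Host, _)) => Err(ToyError::UnsupportedKind),
            Some(&(_, enabled)) => Ok(enabled),
            None => Err(ToyError::UnknownNode),
        }
    }
}
```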

Graph objects are not threadsafe.

§Errors

Returns an error if CUDA rejects the graph operation, if a previous asynchronous launch reported an error, or if CUDA reports runtime initialization diagnostics. Callbacks must not call CUDA functions; see Stream::add_callback.
