pub struct Stream { /* private fields */ }
§Implementations
impl Stream
pub fn completion_future(&self) -> Result<StreamFuture>
Returns a future that resolves when a host function enqueued at the current end of this stream runs.
This is a notification primitive.
It does not report asynchronous CUDA errors after registration.
Use Stream::checked_completion_future or Stream::synchronize_async when the result must include CUDA status.
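The notification-only behavior can be modelled without CUDA. The following is a minimal sketch in which a worker thread stands in for the stream and a channel stands in for the future; `run_with_notification` is a hypothetical helper and is not part of this crate or how it is implemented.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs `work` items in order on a worker thread (a stand-in for a stream),
/// then runs a final "host function" that signals completion.
/// Hypothetical model for illustration only.
fn run_with_notification(work: Vec<Box<dyn FnOnce() + Send>>) -> mpsc::Receiver<()> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for item in work {
            item();
        }
        // The notification carries no status: it only says that the queue
        // reached this point, not whether earlier work succeeded.
        let _ = tx.send(());
    });
    rx
}
```

The receiver yields `()` rather than a `Result`, which mirrors why a status-carrying variant is needed when errors matter.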
pub fn checked_completion_future(&self) -> Result<CheckedStreamFuture>
Returns a future that resolves with CUDA’s asynchronous stream status.
This uses CUDA’s stream callback status path and is therefore rejected while stream capture is active.
pub async fn synchronize_async(&self) -> Result<()>
pub fn enqueue_async<T, F>(&self, f: F) -> Result<CudaFuture<T>>
impl Stream
pub unsafe fn from_raw(handle: cudaStream_t, ctx: Arc<Context>) -> Result<Self>
Wraps an existing CUDA stream handle and takes ownership of it.
Dropping the returned stream may block while synchronizing the stream
before destruction. Use Stream::shutdown to surface synchronization
or destruction errors explicitly.
§Safety
handle must be a valid CUDA stream owned by ctx, and ownership of
the handle is transferred to the returned Stream. The handle must
not be destroyed elsewhere after calling this function.
pub fn to_borrowed(&self) -> BorrowedStream
pub fn sync_scope<'env, F, R>(&self, f: F) -> Result<R>
Runs f with a stream scope and synchronizes this stream before returning.
Use this for scoped asynchronous operations that borrow host or device
memory until stream completion. For CUDA graph capture, use
Stream::capture or Stream::capture_executable.
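The reason a scope must synchronize before returning is the same reason `std::thread::scope` joins its threads: borrowed memory has to outlive the asynchronous work that uses it. The sketch below is only an analogy using the standard library; `sum_in_scope` is a hypothetical helper and does not touch CUDA or this crate.

```rust
use std::thread;

/// Sums `data` on a background thread while only borrowing it.
/// `thread::scope` joins the thread before returning, which is what makes
/// the borrow sound; a stream scope plays the same role by synchronizing.
fn sum_in_scope(data: &[u64]) -> u64 {
    thread::scope(|s| {
        let handle = s.spawn(|| data.iter().sum::<u64>());
        handle.join().expect("worker thread panicked")
    })
}
```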
pub fn synchronize(&self) -> Result<()>
Blocks until the stream has completed all operations.
If ContextFlags::SCHEDULE_BLOCKING_SYNC was set for this device, the host thread will block until the stream is finished with all of its tasks.
Uses standard default stream semantics.
§Errors
Returns an error if stream synchronization fails or if a previous asynchronous launch
reported an error. CUDA may also return initialization-related errors such as
crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or
crate::error::Status::NoDevice if this call initializes internal runtime state. Callbacks must not
call CUDA functions; see Stream::add_callback.
pub fn shutdown(self) -> Result<()>
Synchronizes this stream, destroys it, and returns any CUDA error.
This is the explicit version of the cleanup normally performed by
Drop. It may block while waiting for stream work and callbacks to
complete. If synchronization fails, destruction is still attempted and
the synchronization error is returned. If synchronization succeeds but
destruction fails, the destruction error is returned.
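The error precedence described above can be expressed in a few lines. This is a minimal sketch, assuming both cleanup steps have already run and produced a result each; `shutdown_outcome` is a hypothetical helper, not a function of this crate.

```rust
/// Combines the results of the two cleanup steps the way `shutdown` is
/// documented to: both steps always run, and the synchronization error wins.
/// Hypothetical helper for illustration; not part of this crate.
fn shutdown_outcome<E>(sync: Result<(), E>, destroy: Result<(), E>) -> Result<(), E> {
    // Both arguments are already evaluated, so "destruction" has been
    // attempted even when `sync` is an error.
    sync.and(destroy)
}
```

`Result::and` returns the first error if there is one and otherwise the second result, which matches the documented ordering.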
pub fn query(&self) -> Result<bool>
Returns true if all operations in the stream have completed, or false if not.
For the purposes of Unified Memory, a return value of true is equivalent to having called Stream::synchronize.
Uses standard default stream semantics.
§Errors
Returns an error if querying the stream fails or if a previous asynchronous launch reported
an error. CUDA may also return initialization-related errors such as
crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or
crate::error::Status::NoDevice if this call initializes internal runtime state. Callbacks must not
call CUDA functions; see Stream::add_callback.
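A non-blocking completion check is typically used in a polling loop. The sketch below assumes nothing about this crate beyond the `Result<bool>` shape of the return value: the `query` closure stands in for a call such as `stream.query()`, and `poll_until_done` and `max_polls` are hypothetical names.

```rust
/// Polls `query` until it reports completion or `max_polls` is exhausted.
/// Returns `Ok(true)` on completion, `Ok(false)` if the budget ran out,
/// and forwards the first error from `query`.
fn poll_until_done<E>(
    mut query: impl FnMut() -> Result<bool, E>,
    max_polls: usize,
) -> Result<bool, E> {
    for _ in 0..max_polls {
        if query()? {
            return Ok(true);
        }
        // Real code would do other host work here instead of just yielding.
        std::thread::yield_now();
    }
    Ok(false)
}
```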
pub fn wait_event(&self, event: &Event) -> Result<()>
Makes all future work submitted to stream wait for all work captured in event.
See sys::cudaEventRecord for details on what is captured by an event.
Synchronization is performed efficiently on the device when applicable.
event may be from a different device than stream.
Uses standard default stream semantics.
§Errors
Returns an error if the stream cannot wait on the event or if a previous asynchronous launch
reported an error. CUDA may also return initialization-related errors such as
crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or
crate::error::Status::NoDevice if this call initializes internal runtime state. Callbacks must not
call CUDA functions; see Stream::add_callback.
pub fn wait_event_with_flags(&self, event: &Event, flags: u32) -> Result<()>
Makes all future work submitted to stream wait for all work captured in event.
See sys::cudaEventRecord for details on what is captured by an event.
flags controls how strictly the wait is enforced.
Synchronization is performed efficiently on the device when applicable.
event may be from a different device than stream.
Uses standard default stream semantics.
§Errors
Returns an error if the stream cannot wait on the event or if a previous asynchronous launch
reported an error. CUDA may also return initialization-related errors such as
crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or
crate::error::Status::NoDevice if this call initializes internal runtime state. Callbacks must not
call CUDA functions; see Stream::add_callback.
pub fn begin_capture(&self, mode: StreamCaptureMode) -> Result<()>
Begins graph capture on this stream.
When a stream is in capture mode, operations pushed into the stream are captured
into a graph instead of executed. Stream::end_capture returns the graph.
Capture may not be initiated on the legacy default stream.
Capture must be ended on the same stream in which it was initiated, and it may only be initiated if the stream is not already in capture mode.
The capture mode may be queried via Stream::capture_status.
A unique id representing the capture sequence may be queried via Stream::capture_info.
If mode is not StreamCaptureMode::Relaxed, Stream::end_capture must be called on this stream from the same thread.
Kernels captured using this API must not use texture and surface references. Reading or writing through any texture or surface reference is undefined behavior. This restriction does not apply to texture and surface objects.
§Errors
Returns an error if the context cannot be bound, capture cannot begin on this stream, the capture mode is invalid for the current thread state, or a previous asynchronous launch reports an error.
pub unsafe fn begin_capture_to_graph(
    &self,
    graph: &Graph,
    dependencies: &[GraphNode],
    mode: StreamCaptureMode,
) -> Result<()>
Begins stream capture into an existing graph.
§Safety
This low-level API captures into graph’s existing CUDA handle. Calling
Stream::end_capture after this may return that same raw handle; the
caller must not wrap it as a second owned Graph. Prefer
Stream::capture unless manually managing capture into an existing
graph is required.
pub unsafe fn begin_capture_to_graph_with_data(
    &self,
    graph: &Graph,
    dependencies: &[GraphNode],
    edge_data: &[GraphEdgeData],
    mode: StreamCaptureMode,
) -> Result<()>
Begins stream capture into an existing graph with annotated dependency edges.
§Safety
This has the same ownership restrictions as
Stream::begin_capture_to_graph.
pub unsafe fn begin_capture_to_graph_with_dependencies(
    &self,
    graph: &Graph,
    dependencies: &[GraphDependency],
    mode: StreamCaptureMode,
) -> Result<()>
Begins graph capture on this stream, recording into an existing graph with the supplied dependencies.
When a stream is in capture mode, operations pushed into the stream are captured
into a graph instead of executed. Stream::end_capture returns the graph.
Capture may not be initiated on the legacy default stream.
Capture must be ended on the same stream in which it was initiated, and it may only be initiated if the stream is not already in capture mode.
The capture mode may be queried via Stream::capture_status.
A unique id representing the capture sequence may be queried via Stream::capture_info.
If mode is not StreamCaptureMode::Relaxed, Stream::end_capture must be called on this stream from the same thread.
Kernels captured using this API must not use texture and surface references. Reading or writing through any texture or surface reference is undefined behavior. This restriction does not apply to texture and surface objects.
§Errors
Returns an error if the context cannot be bound, capture cannot begin on this stream, the graph dependencies are invalid, the capture mode is invalid for the current thread state, or a previous asynchronous launch reports an error.
§Safety
This captures into graph’s existing CUDA handle. Calling
Stream::end_capture after this may return that same raw handle; the
caller must not wrap it as a second owned Graph.
pub fn end_capture(&self) -> Result<Graph>
Ends capture on this stream, returning the captured graph.
Capture must have been initiated on stream via a call to Stream::begin_capture.
If capture was invalidated due to a violation of the rules of stream capture, an error is returned.
If the mode argument to Stream::begin_capture was not StreamCaptureMode::Relaxed, this call must be from the same thread as Stream::begin_capture.
§Errors
Returns an error if the context cannot be bound, capture is not active on this stream, the capture has been invalidated, or a previous asynchronous launch reports an error.
pub fn capture<F>(&self, mode: StreamCaptureMode, f: F) -> Result<Graph>
Captures stream work recorded by f into a CUDA graph.
This is the scoped form of Stream::begin_capture and
Stream::end_capture. The capture is always ended before this method
returns or resumes a panic. If f returns an error, this method attempts
to end capture to restore stream usability, destroys any graph returned
by CUDA, and returns the closure error.
The scope is intentionally !Send, so it cannot be moved to another
thread while capture is active. Future graph-safe recording helpers can
be added to StreamCaptureScope without changing this API shape.
§Errors
Returns an error if capture cannot begin, if f returns an error, or if
capture cannot be ended successfully.
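The guarantee that capture is always ended, even when the closure fails or panics, is commonly implemented with a drop guard. The following is a minimal sketch of that pattern on a mock type; `MockStream` and `CaptureGuard` are hypothetical stand-ins, and the real method may be implemented differently.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a stream; it only tracks whether capture is active.
struct MockStream {
    capturing: Cell<bool>,
}

/// Ends capture when dropped, so capture is closed on every exit path,
/// including an early error return or an unwinding panic.
struct CaptureGuard<'a>(&'a MockStream);

impl Drop for CaptureGuard<'_> {
    fn drop(&mut self) {
        self.0.capturing.set(false);
    }
}

impl MockStream {
    fn new() -> Self {
        MockStream { capturing: Cell::new(false) }
    }

    fn is_capturing(&self) -> bool {
        self.capturing.get()
    }

    /// Scoped capture: begin, run `f`, always end, then forward `f`'s result.
    fn capture<T, E>(&self, f: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        self.capturing.set(true);
        let _guard = CaptureGuard(self);
        f()
    }
}
```

Because the guard is a local variable, the stream leaves capture mode whether `f` returns `Ok`, returns `Err`, or unwinds.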
pub fn capture_executable<F>(&self, mode: StreamCaptureMode, f: F) -> Result<ExecutableGraph>
pub fn capture_executable_with_flags<F>(&self, mode: StreamCaptureMode, flags: GraphInstantiateFlags, f: F) -> Result<ExecutableGraph>
pub fn capture_status(&self) -> Result<StreamCaptureStatus>
Returns the capture status of this stream. After a successful call, the status is one of the following:
- StreamCaptureStatus::None: The stream is not capturing.
- StreamCaptureStatus::Active: The stream is capturing.
- StreamCaptureStatus::Invalidated: The stream was capturing but an error has invalidated the capture sequence. The capture sequence must be terminated with Stream::end_capture on the stream where it was initiated to continue using the stream.
If this is called on the legacy default stream while a blocking stream on the same device is capturing, it returns crate::error::Status::StreamCaptureImplicit.
The blocking stream capture is not invalidated.
When a blocking stream is capturing, the legacy stream is in an unusable state until the blocking stream capture is terminated. The legacy stream is not supported for stream capture, but attempted use would have an implicit dependency on the capturing stream(s).
§Errors
Returns an error if the context cannot be bound, CUDA cannot query the capture status, or a previous asynchronous launch reports an error.
pub fn capture_info(&self) -> Result<StreamCaptureInfo>
Queries stream state related to stream capture.
If called on the legacy default stream while a stream not created with StreamFlags::NON_BLOCKING is capturing, returns crate::error::Status::StreamCaptureImplicit.
Valid data (other than capture status) is returned only if both of the following are true:
- the call succeeds
- the returned capture status is StreamCaptureStatus::Active
If there is non-zero edge data for one or more current stream dependencies and the query cannot return that data, the call returns crate::error::Status::LossyQuery.
§Errors
Returns an error if the context cannot be bound, CUDA cannot query the capture info, the query would lose non-zero edge data, or a previous asynchronous launch reports an error.
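The validity rule above (data is meaningful only when the call succeeds and the status is Active) maps naturally onto `Result<Option<_>>`. This is a sketch on local mirror types: `CaptureStatus`, `CaptureInfo`, its `id` field, and `capture_id` are all hypothetical and do not describe the actual layout of StreamCaptureInfo.

```rust
/// Hypothetical mirror of the three capture states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CaptureStatus {
    None,
    Active,
    Invalidated,
}

/// Hypothetical capture info; the real struct's fields may differ.
struct CaptureInfo {
    status: CaptureStatus,
    id: u64,
}

/// Returns the capture id only when it is meaningful: the query succeeded
/// and the stream is actively capturing. Errors are forwarded unchanged.
fn capture_id<E>(query: Result<CaptureInfo, E>) -> Result<Option<u64>, E> {
    let info = query?;
    if info.status == CaptureStatus::Active {
        Ok(Some(info.id))
    } else {
        Ok(None)
    }
}
```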
pub fn update_capture_dependencies(&self, dependencies: &[GraphNode]) -> Result<()>
pub fn update_capture_dependencies_with_data(&self, dependencies: &[GraphNode], edge_data: &[GraphEdgeData]) -> Result<()>
pub fn update_capture_dependencies_with_mode(&self, dependencies: &[GraphNode], edge_data: &[GraphEdgeData], mode: StreamCaptureDependencyUpdate) -> Result<()>
pub fn update_capture_dependencies_with_dependencies(
    &self,
    dependencies: &[GraphDependency],
    mode: StreamCaptureDependencyUpdate,
) -> Result<()>
Modifies the dependency set of a capturing stream. The dependency set is the set of nodes that the next captured node in the stream will depend on.
Valid values for mode are StreamCaptureDependencyUpdate::Add and StreamCaptureDependencyUpdate::Set.
These control whether the supplied set is added to the existing set or replaces it.
In the underlying CUDA API, a flags value of 0 defaults to the Add behavior.
Nodes that are removed from the dependency set by this call do not result in crate::error::Status::StreamCaptureUnjoined if they are unreachable from the stream at Stream::end_capture.
Returns crate::error::Status::IllegalState if the stream is not capturing.
§Errors
Returns an error if the context cannot be bound, the stream is not capturing, the supplied dependencies are invalid, or a previous asynchronous launch reports an error.
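The difference between the two update modes can be shown on a plain set of node ids. This is a minimal model, not the crate's implementation: the `Update` enum and `update_dependencies` are hypothetical, and nodes are represented as integers.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of the two update modes; node ids are plain integers here.
#[derive(Clone, Copy)]
enum Update {
    Add,
    Set,
}

/// Applies `supplied` to the `current` dependency set according to `mode`.
fn update_dependencies(current: &mut BTreeSet<u32>, supplied: &[u32], mode: Update) {
    if let Update::Set = mode {
        // `Set` replaces the existing set outright.
        current.clear();
    }
    // In both modes the supplied nodes end up in the set.
    current.extend(supplied.iter().copied());
}
```

Starting from `{1, 2}`, adding `[3]` gives `{1, 2, 3}`, while a subsequent set with `[4, 5]` leaves only `{4, 5}`.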
pub fn add_callback<F>(&self, callback: F) -> Result<()>
This callback API is slated for eventual deprecation and removal.
If you do not require the callback to execute after a device error, consider using sys::cudaLaunchHostFunc.
Additionally, this callback mechanism is not supported with Stream::begin_capture and Stream::end_capture, unlike sys::cudaLaunchHostFunc.
Adds a callback to be called on the host after all currently enqueued items in the stream have completed.
For each Stream::add_callback call, a callback is executed exactly once.
The callback blocks later work in the stream until it is finished.
The callback may be passed a successful status or an error code.
In the event of a device error, all subsequently executed callbacks receive an appropriate Status.
Callbacks must not call CUDA functions.
Attempting to do so may result in crate::error::Status::NotPermitted.
Callbacks must not perform any synchronization that may depend on outstanding device work or other callbacks that are not mandated to run earlier.
Callbacks without a mandated order (in independent streams) execute in undefined order and may be serialized.
For the purposes of Unified Memory, callback execution makes a number of guarantees:
- The callback stream is considered idle for the duration of the callback. Thus, for example, a callback may always use memory attached to the callback stream.
- The start of execution of a callback has the same effect as synchronizing an event recorded in the same stream immediately before the callback. It thus synchronizes streams which have been “joined” before the callback.
- Adding device work to any stream does not have the effect of making the stream active until all preceding callbacks have executed. Thus, for example, a callback might use global attached memory even if work has been added to another stream, if it has been properly ordered with an event.
- Completion of a callback does not cause a stream to become active except as described above. The callback stream will remain idle if no device work follows the callback, and will remain idle across consecutive callbacks without device work in between. Thus, for example, stream synchronization can be done by signaling from a callback at the end of the stream.
§Errors
Returns an error if the context cannot be bound, CUDA rejects the
callback registration, a previous asynchronous launch reports an error,
or CUDA reports runtime initialization diagnostics such as
crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver,
or crate::error::Status::NoDevice.
pub fn launch_host_func<F>(&self, function: F) -> Result<()>
Enqueues a host function to run after all currently enqueued work in this stream completes.
Unlike Stream::add_callback, CUDA does not call this function if the CUDA context is already in an error state.
This API is supported during stream capture by CUDA, but the host function still must not call CUDA
APIs or perform synchronization that depends on outstanding device work.
§Errors
Returns an error if the context cannot be bound, CUDA rejects the host function registration, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.
pub fn flags(&self) -> Result<StreamFlags>
Returns the flags of this stream.
See Context::create_stream_with_flags for a list of valid flags.
Uses standard default stream semantics.
§Errors
Returns an error if CUDA cannot query the flags or if a previous asynchronous launch
reported an error. CUDA may also return initialization-related errors such as
crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or
crate::error::Status::NoDevice if this call initializes internal runtime state. Callbacks must not
call CUDA functions; see Stream::add_callback.
pub fn priority(&self) -> Result<i32>
Returns the priority of this stream.
If the stream was created with a priority outside the meaningful numerical range returned by Device::stream_priority_range, this returns the clamped priority.
See Context::create_stream_with_priority for details about priority clamping.
§Errors
Returns an error if the context cannot be bound, CUDA cannot query the priority, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.
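Priority clamping is easy to get backwards because, in CUDA, numerically lower values mean higher priority. The sketch below illustrates the arithmetic only; `clamp_priority` is a hypothetical helper (CUDA performs the clamping itself), and `least` and `greatest` stand for the two bounds of the device's priority range.

```rust
/// Clamps a requested priority into the meaningful range.
/// Numerically lower values mean higher priority, so `greatest` is the
/// numerically smallest bound and `least` the numerically largest.
/// Hypothetical helper for illustration; assumes `greatest <= least`.
fn clamp_priority(requested: i32, least: i32, greatest: i32) -> i32 {
    requested.clamp(greatest, least)
}
```

With a range of `least = 0` and `greatest = -5`, a request of `-10` is clamped to `-5` and a request of `3` is clamped to `0`.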
pub fn id(&self) -> Result<u64>
Returns a stream identifier that remains unique for the life of the program.
The stream handle may refer to any of the following:
- a stream created via any of the CUDA runtime APIs such as sys::cudaStreamCreate, Context::create_stream_with_flags and Context::create_stream_with_priority, or their driver API equivalents such as sys::cuStreamCreate or sys::cuStreamCreateWithPriority. Passing an invalid handle results in undefined behavior.
- the special legacy default stream and per-thread default stream. The driver API equivalents of these are also accepted.
§Errors
Returns an error if the context cannot be bound, CUDA cannot query the stream identifier, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.
pub fn device(&self) -> Result<Device>
Returns the device of the stream.
§Errors
Returns an error if the context cannot be bound, CUDA cannot query the stream device, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.
pub fn context(&self) -> &Context
pub fn as_raw(&self) -> cudaStream_t
pub fn into_raw(self) -> cudaStream_t
Consumes the stream and returns the raw CUDA stream handle without destroying it.
The caller becomes responsible for eventually destroying the returned handle with CUDA.