Struct Stream 

Source
pub struct Stream { /* private fields */ }

Implementations§

Source§

impl Stream

Source

pub fn completion_future(&self) -> Result<StreamFuture>

Returns a future that resolves when a host function enqueued at the current end of this stream runs.

This is a notification primitive. It does not report asynchronous CUDA errors after registration. Use Stream::checked_completion_future or Stream::synchronize_async when the result must include CUDA status.

Source

pub fn checked_completion_future(&self) -> Result<CheckedStreamFuture>

Returns a future that resolves with CUDA’s asynchronous stream status.

This uses CUDA’s stream callback status path and is therefore rejected while stream capture is active.

Source

pub async fn synchronize_async(&self) -> Result<()>

Source

pub fn enqueue_async<T, F>(&self, f: F) -> Result<CudaFuture<T>>
where F: FnOnce(&Stream) -> Result<T>,

Source§

impl Stream

Source

pub unsafe fn from_raw(handle: cudaStream_t, ctx: Arc<Context>) -> Result<Self>

Wraps an existing CUDA stream handle and takes ownership of it.

Dropping the returned stream may block while synchronizing the stream before destruction. Use Stream::shutdown to surface synchronization or destruction errors explicitly.

§Safety

handle must be a valid CUDA stream owned by ctx, and ownership of the handle is transferred to the returned Stream. The handle must not be destroyed elsewhere after calling this function.

Source

pub fn to_borrowed(&self) -> BorrowedStream

Source

pub fn sync_scope<'env, F, R>(&self, f: F) -> Result<R>
where F: for<'scope> FnOnce(&'scope StreamScope<'scope, 'env>) -> Result<R>,

Runs f with a stream scope and synchronizes this stream before returning.

Use this for scoped asynchronous operations that borrow host or device memory until stream completion. For CUDA graph capture, use Stream::capture or Stream::capture_executable.
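The `for<'scope> FnOnce(&'scope StreamScope<'scope, 'env>)` bound has the same shape as the closure accepted by `std::thread::scope`. As an analogy only, the sketch below uses the standard library's scoped threads to show why such a scope can safely lend out borrowed data: everything started inside the scope has finished by the time the scope returns. The function name `sum_in_scope` is hypothetical, and nothing here touches CUDA or this crate.

```rust
use std::thread;

/// Analogy only: sums `data` on a scoped thread. The borrows of `data` and
/// `total` are sound because `thread::scope` joins the spawned thread before
/// it returns, much as `sync_scope` is documented to synchronize the stream
/// before returning.
pub fn sum_in_scope(data: &[u32]) -> u32 {
    let mut total = 0;
    thread::scope(|scope| {
        scope.spawn(|| {
            // Borrows `data` and `total` from the enclosing stack frame.
            total = data.iter().sum();
        });
    });
    // The scope has ended, so the borrow is released and `total` is final.
    total
}
```

Calling `sum_in_scope(&[1, 2, 3])` returns the sum once the scoped thread has been joined.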

Source

pub fn synchronize(&self) -> Result<()>

Blocks until this stream has completed all operations. If ContextFlags::SCHEDULE_BLOCKING_SYNC was set for this device, the host thread blocks until the stream has finished all of its tasks.

Uses standard default stream semantics.

§Errors

Returns an error if stream synchronization fails or if a previous asynchronous launch reported an error. CUDA may also return initialization-related errors such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes internal runtime state. Do not call this method from a stream callback: callbacks must not call CUDA functions (see Stream::add_callback).

Source

pub fn shutdown(self) -> Result<()>

Synchronizes this stream, destroys it, and returns any CUDA error.

This is the explicit version of the cleanup normally performed by Drop. It may block while waiting for stream work and callbacks to complete. If synchronization fails, destruction is still attempted and the synchronization error is returned. If synchronization succeeds but destruction fails, the destruction error is returned.
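The error precedence described above can be sketched as a small pure function. This is a minimal model, not the crate's implementation: `combine_shutdown_results` is a hypothetical helper, and it assumes both cleanup steps have already been attempted and their results collected.

```rust
/// Hypothetical helper, not part of the crate: combines the results of the
/// synchronization step and the destruction step using the precedence that
/// the `shutdown` documentation describes.
pub fn combine_shutdown_results<E>(
    sync: Result<(), E>,
    destroy: Result<(), E>,
) -> Result<(), E> {
    match sync {
        // A synchronization error wins, even if destruction also failed.
        Err(sync_err) => Err(sync_err),
        // Otherwise the destruction result (success or error) is returned.
        Ok(()) => destroy,
    }
}
```

Because both results are passed in as arguments, the destruction step has always run by the time they are combined, which matches the "destruction is still attempted" wording.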

Source

pub fn query(&self) -> Result<bool>

Returns true if all operations in this stream have completed, or false if not.

For the purposes of Unified Memory, a return value of true is equivalent to having called Stream::synchronize.

Uses standard default stream semantics.

§Errors

Returns an error if querying the stream fails or if a previous asynchronous launch reported an error. CUDA may also return initialization-related errors such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes internal runtime state. Do not call this method from a stream callback: callbacks must not call CUDA functions (see Stream::add_callback).
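Since query returns `Result<bool>` without blocking, one possible use is a bounded polling loop. The sketch below is generic over any closure with that shape, so it runs without a GPU. `poll_until_complete` and `max_polls` are hypothetical names, and in real code the closure would be assumed to wrap a call such as `stream.query()`.

```rust
use std::thread;

/// Hypothetical helper: polls `query` up to `max_polls` times.
/// Returns `Ok(true)` as soon as the query reports completion, `Ok(false)`
/// if the budget runs out first, and forwards the first error unchanged.
pub fn poll_until_complete<E>(
    mut query: impl FnMut() -> Result<bool, E>,
    max_polls: usize,
) -> Result<bool, E> {
    for _ in 0..max_polls {
        if query()? {
            return Ok(true);
        }
        // Let other host threads run between polls instead of spinning hard.
        thread::yield_now();
    }
    Ok(false)
}
```

When the caller has nothing else to do on the host, a blocking call such as Stream::synchronize is usually the simpler choice; polling mainly helps when the host thread can overlap other work between checks.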

Source

pub fn wait_event(&self, event: &Event) -> Result<()>

Makes all future work submitted to this stream wait for all work captured in event. See sys::cudaEventRecord for details on what is captured by an event. Synchronization is performed efficiently on the device when applicable. event may be from a different device than this stream.

Uses standard default stream semantics.

§Errors

Returns an error if the stream cannot wait on the event or if a previous asynchronous launch reported an error. CUDA may also return initialization-related errors such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes internal runtime state. Do not call this method from a stream callback: callbacks must not call CUDA functions (see Stream::add_callback).

Source

pub fn wait_event_with_flags(&self, event: &Event, flags: u32) -> Result<()>

Makes all future work submitted to this stream wait for all work captured in event. See sys::cudaEventRecord for details on what is captured by an event. flags controls how strictly the wait is enforced. Synchronization is performed efficiently on the device when applicable. event may be from a different device than this stream.

Uses standard default stream semantics.

§Errors

Returns an error if the stream cannot wait on the event or if a previous asynchronous launch reported an error. CUDA may also return initialization-related errors such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes internal runtime state. Do not call this method from a stream callback: callbacks must not call CUDA functions (see Stream::add_callback).

Source

pub fn begin_capture(&self, mode: StreamCaptureMode) -> Result<()>

Begins graph capture on this stream. While a stream is in capture mode, operations pushed into the stream are captured into a graph instead of being executed. Stream::end_capture returns the graph.

Capture may not be initiated on the legacy default stream. Capture must be ended on the same stream on which it was initiated, and it may only be initiated if the stream is not already in capture mode. The capture mode may be queried via Stream::capture_status. A unique id representing the capture sequence may be queried via Stream::capture_info.

If mode is not StreamCaptureMode::Relaxed, Stream::end_capture must be called on this stream from the same thread.

Kernels captured using this API must not use texture and surface references. Reading or writing through any texture or surface reference is undefined behavior. This restriction does not apply to texture and surface objects.

§Errors

Returns an error if the context cannot be bound, capture cannot begin on this stream, the capture mode is invalid for the current thread state, or a previous asynchronous launch reports an error.

Source

pub unsafe fn begin_capture_to_graph( &self, graph: &Graph, dependencies: &[GraphNode], mode: StreamCaptureMode, ) -> Result<()>

Begins stream capture into an existing graph.

§Safety

This low-level API captures into graph’s existing CUDA handle. Calling Stream::end_capture after this may return that same raw handle; the caller must not wrap it as a second owned Graph. Prefer Stream::capture unless manually managing capture into an existing graph is required.

Source

pub unsafe fn begin_capture_to_graph_with_data( &self, graph: &Graph, dependencies: &[GraphNode], edge_data: &[GraphEdgeData], mode: StreamCaptureMode, ) -> Result<()>

Begins stream capture into an existing graph with annotated dependency edges.

§Safety

This has the same ownership restrictions as Stream::begin_capture_to_graph.

Source

pub unsafe fn begin_capture_to_graph_with_dependencies( &self, graph: &Graph, dependencies: &[GraphDependency], mode: StreamCaptureMode, ) -> Result<()>

Begins graph capture on this stream. While a stream is in capture mode, operations pushed into the stream are captured into a graph instead of being executed. Stream::end_capture returns the graph.

Capture may not be initiated on the legacy default stream. Capture must be ended on the same stream in which it was initiated, and it may only be initiated if the stream is not already in capture mode. The capture mode may be queried via Stream::capture_status. A unique id representing the capture sequence may be queried via Stream::capture_info.

If mode is not StreamCaptureMode::Relaxed, Stream::end_capture must be called on this stream from the same thread.

Kernels captured using this API must not use texture and surface references. Reading or writing through any texture or surface reference is undefined behavior. This restriction does not apply to texture and surface objects.

§Errors

Returns an error if the context cannot be bound, capture cannot begin on this stream, the graph dependencies are invalid, the capture mode is invalid for the current thread state, or a previous asynchronous launch reports an error.

§Safety

This captures into graph’s existing CUDA handle. Calling Stream::end_capture after this may return that same raw handle; the caller must not wrap it as a second owned Graph.

Source

pub fn end_capture(&self) -> Result<Graph>

Ends capture on this stream and returns the captured graph. Capture must have been initiated on this stream via a call to Stream::begin_capture. If capture was invalidated by a violation of the stream capture rules, an error is returned.

If the mode argument to Stream::begin_capture was not StreamCaptureMode::Relaxed, this call must be from the same thread as Stream::begin_capture.

§Errors

Returns an error if the context cannot be bound, capture is not active on this stream, the capture has been invalidated, or a previous asynchronous launch reports an error.

Source

pub fn capture<F>(&self, mode: StreamCaptureMode, f: F) -> Result<Graph>
where F: FnOnce(&StreamCaptureScope<'_>) -> Result<()>,

Captures stream work recorded by f into a CUDA graph.

This is the scoped form of Stream::begin_capture and Stream::end_capture. The capture is always ended before this method returns or resumes a panic. If f returns an error, this method attempts to end capture to restore stream usability, destroys any graph returned by CUDA, and returns the closure error.

The scope is intentionally !Send, so it cannot be moved to another thread while capture is active. Future graph-safe recording helpers can be added to StreamCaptureScope without changing this API shape.

§Errors

Returns an error if capture cannot begin, if f returns an error, or if capture cannot be ended successfully.
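The cleanup guarantees described for capture (capture always ends, a closure error takes priority, panics are resumed after cleanup) can be modeled without a GPU. The sketch below is a stand-in under stated assumptions: `MockStream` and `MockGraph` are hypothetical types that only track whether capture is active, errors are plain strings, and the capture mode argument is omitted. It is not the crate's implementation.

```rust
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};

/// Stand-in for a captured graph; hypothetical, not the crate's `Graph`.
#[derive(Debug, PartialEq)]
pub struct MockGraph;

/// Stand-in stream that only tracks whether capture is active.
#[derive(Default)]
pub struct MockStream {
    capturing: Cell<bool>,
}

impl MockStream {
    pub fn is_capturing(&self) -> bool {
        self.capturing.get()
    }

    fn begin_capture(&self) -> Result<(), String> {
        // `replace` returns the previous state: true means already capturing.
        if self.capturing.replace(true) {
            return Err("already capturing".to_string());
        }
        Ok(())
    }

    fn end_capture(&self) -> Result<MockGraph, String> {
        if !self.capturing.replace(false) {
            return Err("not capturing".to_string());
        }
        Ok(MockGraph)
    }

    /// Scoped capture: begin, run `f`, and always end before leaving.
    pub fn capture<F>(&self, f: F) -> Result<MockGraph, String>
    where
        F: FnOnce(&MockStream) -> Result<(), String>,
    {
        self.begin_capture()?;
        // Catch a panic so capture can be ended before unwinding continues.
        let outcome = panic::catch_unwind(AssertUnwindSafe(|| f(self)));
        let ended = self.end_capture();
        match outcome {
            Err(payload) => panic::resume_unwind(payload),
            // Closure error wins; any graph that was produced is dropped.
            Ok(Err(closure_err)) => {
                drop(ended);
                Err(closure_err)
            }
            Ok(Ok(())) => ended,
        }
    }
}
```

In this model the stream is never left in capture mode, whichever way the closure exits, which is the property the scoped form is meant to provide over calling the begin and end methods by hand.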

Source

pub fn capture_executable<F>( &self, mode: StreamCaptureMode, f: F, ) -> Result<ExecutableGraph>
where F: FnOnce(&StreamCaptureScope<'_>) -> Result<()>,

Source

pub fn capture_executable_with_flags<F>( &self, mode: StreamCaptureMode, flags: GraphInstantiateFlags, f: F, ) -> Result<ExecutableGraph>
where F: FnOnce(&StreamCaptureScope<'_>) -> Result<()>,

Source

pub fn capture_status(&self) -> Result<StreamCaptureStatus>

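Returns the capture status of this stream. After a successful call, the status indicates one of three states: the stream is not capturing, the stream is actively capturing, or the stream is part of a capture sequence that has been invalidated but not yet terminated.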

If this is called on the legacy default stream while a blocking stream on the same device is capturing, it returns crate::error::Status::StreamCaptureImplicit. The blocking stream capture is not invalidated.

When a blocking stream is capturing, the legacy stream is in an unusable state until the blocking stream capture is terminated. The legacy stream is not supported for stream capture, but attempted use would have an implicit dependency on the capturing stream(s).

§Errors

Returns an error if the context cannot be bound, CUDA cannot query the capture status, or a previous asynchronous launch reports an error.

Source

pub fn capture_info(&self) -> Result<StreamCaptureInfo>

Queries stream state related to stream capture.

If called on the legacy default stream while a stream not created with StreamFlags::NON_BLOCKING is capturing, returns crate::error::Status::StreamCaptureImplicit.

Valid data (other than capture status) is returned only if the call succeeds and the returned capture status is active.

If there is non-zero edge data for one or more current stream dependencies and the query cannot return that data, the call returns crate::error::Status::LossyQuery.

§Errors

Returns an error if the context cannot be bound, CUDA cannot query the capture info, the query would lose non-zero edge data, or a previous asynchronous launch reports an error.

Source

pub fn update_capture_dependencies( &self, dependencies: &[GraphNode], ) -> Result<()>

Source

pub fn update_capture_dependencies_with_data( &self, dependencies: &[GraphNode], edge_data: &[GraphEdgeData], ) -> Result<()>

Source

pub fn update_capture_dependencies_with_mode( &self, dependencies: &[GraphNode], edge_data: &[GraphEdgeData], mode: StreamCaptureDependencyUpdate, ) -> Result<()>

Source

pub fn update_capture_dependencies_with_dependencies( &self, dependencies: &[GraphDependency], mode: StreamCaptureDependencyUpdate, ) -> Result<()>

Modifies the dependency set of a capturing stream. The dependency set is the set of nodes that the next captured node in the stream will depend on.

Valid values for mode are StreamCaptureDependencyUpdate::Add and StreamCaptureDependencyUpdate::Set. These control whether the supplied set is added to the existing set or replaces it. In the underlying CUDA API, a flags value of 0 defaults to the StreamCaptureDependencyUpdate::Add behavior.

Nodes that are removed from the dependency set by this call do not result in crate::error::Status::StreamCaptureUnjoined if they are unreachable from the stream at Stream::end_capture.

Returns crate::error::Status::IllegalState if the stream is not capturing.

§Errors

Returns an error if the context cannot be bound, the stream is not capturing, the supplied dependencies are invalid, or a previous asynchronous launch reports an error.
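The difference between the two update modes can be illustrated with plain sets. This is a model only: `UpdateMode` and `apply_update` are hypothetical names, graph nodes are represented by `u32` ids, and no CUDA state is involved.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of the two documented update modes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UpdateMode {
    /// Merge the supplied nodes into the existing dependency set.
    Add,
    /// Replace the existing dependency set with the supplied nodes.
    Set,
}

/// Returns the dependency set that the next captured node would depend on
/// after applying `supplied` to `current` under `mode`.
pub fn apply_update(
    current: &BTreeSet<u32>,
    supplied: &[u32],
    mode: UpdateMode,
) -> BTreeSet<u32> {
    let supplied: BTreeSet<u32> = supplied.iter().copied().collect();
    match mode {
        UpdateMode::Add => current.union(&supplied).copied().collect(),
        UpdateMode::Set => supplied,
    }
}
```

With a current set of {1, 2} and supplied nodes [2, 3], the add mode yields {1, 2, 3} while the set mode yields {2, 3}. Node 1 is dropped in the second case, which is the situation the note about crate::error::Status::StreamCaptureUnjoined addresses.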

Source

pub fn add_callback<F>(&self, callback: F) -> Result<()>
where F: FnOnce(Result<()>) + Send + 'static,

This callback API is slated for eventual deprecation and removal. If you do not require the callback to execute after a device error, consider using sys::cudaLaunchHostFunc. Additionally, this callback mechanism is not supported with Stream::begin_capture and Stream::end_capture, unlike sys::cudaLaunchHostFunc.

Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each Stream::add_callback call, a callback is executed exactly once. The callback blocks later work in the stream until it is finished.

The callback may be passed a successful status or an error code. In the event of a device error, all subsequently executed callbacks receive an appropriate Status.

Callbacks must not call CUDA functions. Attempting to do so may result in crate::error::Status::NotPermitted. Callbacks must not perform any synchronization that may depend on outstanding device work or other callbacks that are not mandated to run earlier. Callbacks without a mandated order (in independent streams) execute in undefined order and may be serialized.

For the purposes of Unified Memory, callback execution makes a number of guarantees:

  • The callback stream is considered idle for the duration of the callback. Thus, for example, a callback may always use memory attached to the callback stream.
  • The start of execution of a callback has the same effect as synchronizing an event recorded in the same stream immediately before the callback. It thus synchronizes streams which have been “joined” before the callback.
  • Adding device work to any stream does not have the effect of making the stream active until all preceding callbacks have executed. Thus, for example, a callback might use global attached memory even if work has been added to another stream, if it has been properly ordered with an event.
  • Completion of a callback does not cause a stream to become active except as described above. The callback stream will remain idle if no device work follows the callback, and will remain idle across consecutive callbacks without device work in between. Thus, for example, stream synchronization can be done by signaling from a callback at the end of the stream.

§Errors

Returns an error if the context cannot be bound, CUDA rejects the callback registration, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice.

Source

pub fn launch_host_func<F>(&self, function: F) -> Result<()>
where F: FnOnce() + Send + 'static,

Enqueues a host function to run after all currently enqueued work in this stream completes.

Unlike Stream::add_callback, CUDA does not call this function if the CUDA context is already in an error state. This API is supported during stream capture by CUDA, but the host function still must not call CUDA APIs or perform synchronization that depends on outstanding device work.

§Errors

Returns an error if the context cannot be bound, CUDA rejects the host function registration, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.
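Because the host function must not call CUDA APIs, a common pattern is to have it do nothing except signal another thread. The sketch below shows that pattern with a standard channel. `launch_host_func_stand_in` is a hypothetical substitute that runs the closure on a plain thread so the example works without a GPU; it mirrors only the `FnOnce() + Send + 'static` bound from the signature above, not the stream ordering.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical stand-in for enqueueing a host function. It runs the closure
/// on a separate thread and mirrors only the documented closure bound.
pub fn launch_host_func_stand_in<F>(function: F) -> thread::JoinHandle<()>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(function)
}

/// Enqueues a host function that only sends a message, and returns the
/// receiving end so the caller can wait for the notification.
pub fn notify_when_reached() -> Receiver<&'static str> {
    let (tx, rx) = mpsc::channel();
    // Dropping the join handle detaches the thread; the channel is the signal.
    let _worker = launch_host_func_stand_in(move || {
        // No CUDA calls here: just hand a message to the waiting thread.
        let _ = tx.send("host function ran");
    });
    rx
}
```

The caller can block on `recv()` or poll with `try_recv()`. Any follow-up CUDA work then happens on the receiving thread rather than inside the host function.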

Source

pub fn flags(&self) -> Result<StreamFlags>

Returns the flags of this stream. See Context::create_stream_with_flags for a list of valid flags.

Uses standard default stream semantics.

§Errors

Returns an error if CUDA cannot query the flags or if a previous asynchronous launch reported an error. CUDA may also return initialization-related errors such as crate::error::Status::NotInitialized, crate::error::Status::CallRequiresNewerDriver, or crate::error::Status::NoDevice if this call initializes internal runtime state. Do not call this method from a stream callback: callbacks must not call CUDA functions (see Stream::add_callback).

Source

pub fn priority(&self) -> Result<i32>

Returns the priority of this stream. If the stream was created with a priority outside the meaningful numerical range returned by Device::stream_priority_range, this returns the clamped priority. See Context::create_stream_with_priority for details about priority clamping.

§Errors

Returns an error if the context cannot be bound, CUDA cannot query the priority, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.
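Priority clamping can be sketched as a pure function. This assumes the usual CUDA convention that numerically lower values mean higher priority, so the meaningful range runs from the greatest priority (the smaller number) to the least priority (the larger number). `clamp_priority` and its parameter names are hypothetical, and the order in which Device::stream_priority_range reports the two bounds is not assumed here.

```rust
/// Hypothetical model of priority clamping. `greatest` is the numerically
/// smallest bound (highest priority) and `least` the numerically largest.
///
/// Assumes `greatest <= least`; `i32::clamp` panics otherwise.
pub fn clamp_priority(requested: i32, greatest: i32, least: i32) -> i32 {
    requested.clamp(greatest, least)
}
```

For a device whose range is -5 (greatest) to 0 (least), a request of -10 clamps to -5, a request of 3 clamps to 0, and a request of -2 is returned unchanged.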

Source

pub fn id(&self) -> Result<u64>

Returns a stream identifier that remains unique for the life of the program.

The stream handle may refer to any of the following:

§Errors

Returns an error if the context cannot be bound, CUDA cannot query the stream identifier, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.

Source

pub fn device(&self) -> Result<Device>

Returns the device of the stream.

§Errors

Returns an error if the context cannot be bound, CUDA cannot query the stream device, a previous asynchronous launch reports an error, or CUDA reports runtime initialization diagnostics.

Source

pub fn context(&self) -> &Context

Source

pub fn as_raw(&self) -> cudaStream_t

Source

pub fn into_raw(self) -> cudaStream_t

Consumes the stream and returns the raw CUDA stream handle without destroying it.

The caller becomes responsible for eventually destroying the returned handle with CUDA.

Trait Implementations§

Source§

impl Clone for Stream

Source§

fn clone(&self) -> Stream

Returns a duplicate of the value.
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for Stream

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl Eq for Stream

Source§

impl PartialEq for Stream

Source§

fn eq(&self, other: &Self) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 (const: unstable) · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl Send for Stream

Source§

impl Sync for Stream

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
where ST: ?Sized, DT: ?Sized,

Source§

impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
where ST: ?Sized, DT: ?Sized,

Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> Read<Exclusive, BecauseExclusive> for T
where T: ?Sized,

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.