Module runtime


Safe wrapper around nvinfer1::IExecutionContext and the IRuntime deserialiser. This is the inference-time hot path.

The runtime actor (TrtActor, see actor.rs) drives an ExecutionContext per inference, calling enqueueV3 on a CUDA stream provided by atomr-accel-cuda::DeviceActor. The actor never blocks on the GPU; completion is signalled via the same host-fn-completion mechanism the BLAS/cuDNN actors use.
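The submit-then-signal pattern described above can be sketched without any GPU. This is a minimal CPU-only illustration, not the crate's implementation: `MockStream` and `enqueue_with_completion` are hypothetical names, a worker thread stands in for the CUDA stream, and a channel send stands in for the host-function completion callback.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical CPU-only stand-in for a CUDA stream: jobs run in FIFO order
/// on a worker thread, much as work items run in order on a real stream.
pub struct MockStream {
    jobs: Sender<Box<dyn FnOnce() + Send>>,
}

impl MockStream {
    pub fn new() -> Self {
        let (jobs, rx) = channel::<Box<dyn FnOnce() + Send>>();
        // The worker drains jobs until every sender has been dropped.
        thread::spawn(move || {
            for job in rx {
                job();
            }
        });
        MockStream { jobs }
    }

    /// Queue a job and return immediately (the analogue of an async launch).
    pub fn submit(&self, job: impl FnOnce() + Send + 'static) {
        self.jobs.send(Box::new(job)).expect("stream worker exited");
    }
}

/// Submit stand-in "inference" work, then queue a host callback behind it
/// that signals completion. The submission itself never waits for the work.
pub fn enqueue_with_completion(stream: &MockStream, request_id: u64) -> Receiver<u64> {
    let (done_tx, done_rx) = channel();
    // Stand-in for the enqueueV3 call; a real actor would launch GPU work here.
    stream.submit(|| {});
    // Stand-in for the host-fn completion: it runs only after the work above.
    stream.submit(move || {
        let _ = done_tx.send(request_id);
    });
    done_rx
}
```

The caller holds the returned receiver and waits on it only when it needs the result, so the submitting side stays free to accept further requests in the meantime.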

Structs§

EnqueueRequest
Standalone enqueue request type — embedded into the TrtActor’s message enum but exposed here so the message dispatcher in actor.rs and tests can construct it without crossing module boundaries.
ExecutionBindings
Per-call inputs/outputs: tensor name → device pointer. Pointers are raw u64s (CUDA device addresses) so the message is Send + Sync without lifetimes from Arc<CudaSlice<T>>.
ExecutionContext
Wrapper around an owned IExecutionContext*. It is held inside the TrtActor and is not Send to outside callers; the actor owns it for its entire lifetime and serialises all access to it.
TensorShape
Shape of a dynamic tensor input. Exactly mirrors nvinfer1::Dims (max 8 dims).
TrtRuntime
Owned wrapper for IRuntime, used to deserialise plan blobs.
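As an illustration of the two data-carrying structs above, here is one possible layout. The field names, the `from_slice` and `dims` helpers, the split into `inputs` and `outputs` maps, and the `i64` extent type are all assumptions made for this sketch; the page only states that bindings map tensor names to raw `u64` device addresses and that shapes mirror `nvinfer1::Dims` with at most 8 dimensions.

```rust
use std::collections::HashMap;

/// Upper bound on rank, matching the 8-dimension limit of nvinfer1::Dims.
pub const MAX_DIMS: usize = 8;

/// Hypothetical layout: a rank plus a fixed-size extent array.
/// The i64 element type is an assumption; it depends on the TensorRT version.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TensorShape {
    pub nb_dims: i32,
    pub d: [i64; MAX_DIMS],
}

impl TensorShape {
    /// Build a shape from a slice; returns None for more than MAX_DIMS dims.
    pub fn from_slice(dims: &[i64]) -> Option<Self> {
        if dims.len() > MAX_DIMS {
            return None;
        }
        let mut d = [0i64; MAX_DIMS];
        d[..dims.len()].copy_from_slice(dims);
        Some(TensorShape { nb_dims: dims.len() as i32, d })
    }

    /// The extents that are actually in use.
    pub fn dims(&self) -> &[i64] {
        &self.d[..self.nb_dims as usize]
    }
}

/// Hypothetical layout: tensor name -> raw CUDA device address.
/// Plain u64 values keep the struct free of lifetimes and smart pointers.
#[derive(Clone, Debug, Default)]
pub struct ExecutionBindings {
    pub inputs: HashMap<String, u64>,
    pub outputs: HashMap<String, u64>,
}

/// Compile-time check that a type can cross thread boundaries.
fn assert_send_sync<T: Send + Sync>() {}

/// Compiles only if ExecutionBindings is Send + Sync.
pub fn bindings_are_send_sync() {
    assert_send_sync::<ExecutionBindings>();
}
```

Storing addresses as plain integers means the caller remains responsible for keeping the underlying device buffers alive until the enqueued work has completed.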

Type Aliases§

EnqueueReply
Reply payload for an enqueue request. Ok means only that stream submission succeeded (the kernel may still be running on the GPU); the caller awaits actual completion via the shared CUDA stream completion sentinel.
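To make the "Ok is not completion" distinction concrete, the sketch below assumes the alias is a `Result` with a unit success value. `EnqueueError`, its variants, `NextStep`, and `next_step` are hypothetical names introduced for illustration; the page does not show the real error type.

```rust
/// Hypothetical error type; the crate's real error type is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub enum EnqueueError {
    MissingBinding(String),
    SubmitFailed,
}

/// Assumed shape of the alias: Ok carries no data because nothing has
/// finished yet, only the submission to the stream.
pub type EnqueueReply = Result<(), EnqueueError>;

/// What a caller does after receiving the reply.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Submission succeeded; outputs are not yet safe to read.
    AwaitCompletion,
    /// Submission failed; no GPU work was queued for this request.
    Abort(String),
}

pub fn next_step(reply: EnqueueReply) -> NextStep {
    match reply {
        Ok(()) => NextStep::AwaitCompletion,
        Err(EnqueueError::MissingBinding(name)) => {
            NextStep::Abort(format!("no device pointer bound for tensor '{}'", name))
        }
        Err(EnqueueError::SubmitFailed) => {
            NextStep::Abort("stream submission failed".to_string())
        }
    }
}
```

The point of the sketch is that the `Ok` arm never reads output buffers; it only moves the caller on to waiting for the separate completion signal.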