pub enum TrtMsg {
    Build {
        source: NetworkSource,
        config: Box<IBuilderConfig>,
        reply: BuildReply,
    },
    Deserialize {
        plan: EnginePlan,
        reply: DeserializeReply,
    },
    CreateContext {
        engine: Arc<TrtEngine>,
        reply: CreateContextReply,
    },
    EnqueueOnStream {
        stream: Arc<CudaStream>,
        context: ExecutionContext,
        bindings: ExecutionBindings,
        reply: EnqueueReply,
    },
    Refit {
        engine: Arc<TrtEngine>,
        weights: Vec<RefitWeights>,
        reply: RefitReply,
    },
    Execute {
        engine: Arc<TrtEngine>,
        bindings: Vec<(String, u64)>,
        input_shapes: Vec<(String, Vec<i32>)>,
        stream: Arc<CudaStream>,
        reply: ExecuteReply,
    },
    BuildFromOnnx {
        onnx_bytes: Vec<u8>,
        config: Box<IBuilderConfig>,
        reply: BuildFromOnnxReply,
    },
}
Public message surface for TrtActor.
The variant EnqueueOnStream accepts the Arc<CudaStream> from
atomr-accel-cuda::DeviceActor so the TensorRT runtime shares
the device’s stream timeline (no cross-stream synchronisation,
no extra event hops).
Variants
Build
Build a TensorRT engine from a network source + config. Returns the serialised plan on success.
Deserialize
Deserialise a plan blob into a shared engine handle.
CreateContext
Create a fresh IExecutionContext from an existing engine.
Returns the new context (caller owns it).
EnqueueOnStream
Submit enqueueV3 on the supplied CUDA stream. The actor
returns immediately after submission; real GPU completion is
observed by atomr-accel-cuda’s completion strategy on the
shared stream.
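The submit-then-reply flow described above can be sketched with standard-library channels. This is only an illustration under stated assumptions: `SketchMsg`, `SketchReply`, `spawn_sketch_actor` and `enqueue_once` are hypothetical stand-ins, a plain `u32` replaces the `Arc<CudaStream>`, and the real reply types (such as `EnqueueReply`) are not shown on this page. The point is that the reply acknowledges submission, not completion of the GPU work.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a reply handle such as `EnqueueReply`.
type SketchReply = Sender<Result<(), String>>;

/// Hypothetical, trimmed-down message shaped like `EnqueueOnStream`.
/// `stream_id` replaces the real `Arc<CudaStream>` for illustration.
enum SketchMsg {
    EnqueueOnStream { stream_id: u32, reply: SketchReply },
    Shutdown,
}

/// Spawns a toy actor loop. Returns its mailbox and a join handle that
/// yields the stream ids it received, in submission order.
fn spawn_sketch_actor() -> (Sender<SketchMsg>, JoinHandle<Vec<u32>>) {
    let (mailbox, inbox) = channel::<SketchMsg>();
    let handle = thread::spawn(move || {
        let mut submitted = Vec::new();
        for msg in inbox {
            match msg {
                SketchMsg::EnqueueOnStream { stream_id, reply } => {
                    // A real handler would submit work on the stream here.
                    // The reply is sent right after submission; it says
                    // nothing about whether the GPU has finished.
                    submitted.push(stream_id);
                    let _ = reply.send(Ok(()));
                }
                SketchMsg::Shutdown => break,
            }
        }
        submitted
    });
    (mailbox, handle)
}

/// Sends one enqueue request and waits for the submission acknowledgement.
fn enqueue_once(mailbox: &Sender<SketchMsg>, stream_id: u32) -> Result<(), String> {
    let (reply_tx, reply_rx) = channel();
    mailbox
        .send(SketchMsg::EnqueueOnStream { stream_id, reply: reply_tx })
        .map_err(|_| "actor stopped".to_string())?;
    reply_rx.recv().map_err(|_| "reply dropped".to_string())?
}
```

A caller that needs the results would still have to wait on the stream itself (here, whatever completion mechanism atomr-accel-cuda provides) after `enqueue_once` returns `Ok(())`.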
Fields
stream: Arc<CudaStream>
context: ExecutionContext
bindings: ExecutionBindings
reply: EnqueueReply
Refit
Refit a built engine in-place with new weights. Requires the
engine to have been built with RefitPolicy::OnDemand or
WeightsStreaming.
Execute
Phase 4.5++: run inference on a previously loaded engine.
bindings holds one (tensor_name, CUdeviceptr) pair for every I/O
tensor on the engine; stream is the Arc<CudaStream> on which
enqueueV3 is issued (typically the device’s primary stream
from DeviceMsg::SnapshotStream).
The handler creates a fresh IExecutionContext, binds every
tensor address, then calls enqueueV3. Returns Ok(()) on
successful submission (kernel still running on the GPU);
real completion is observed by atomr-accel-cuda’s
completion strategy on the shared stream.
On builds without tensorrt-link the variant compiles but
the handler returns TrtError::NotLinked.
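Because `bindings` and `input_shapes` are plain vectors of pairs, a caller may want to sanity-check them before sending `Execute`. The sketch below is not part of the crate: `element_count` and `check_execute_args` are hypothetical helpers, and the rules they enforce (non-null addresses, every shaped tensor has a binding, no negative dimensions) are assumptions about what a reasonable pre-flight check might look like.

```rust
/// Hypothetical helper: number of elements in a fully specified shape.
/// Returns `None` if any dimension is negative (for example a -1
/// placeholder for a dynamic dimension) or if the product overflows.
fn element_count(shape: &[i32]) -> Option<usize> {
    shape.iter().try_fold(1usize, |acc, &dim| {
        if dim < 0 {
            return None;
        }
        acc.checked_mul(dim as usize)
    })
}

/// Hypothetical pre-flight check for the `Execute` payload.
/// `bindings` is (tensor_name, device address); `input_shapes` is
/// (tensor_name, dimensions), matching the field types of the variant.
fn check_execute_args(
    bindings: &[(String, u64)],
    input_shapes: &[(String, Vec<i32>)],
) -> Result<(), String> {
    for (name, ptr) in bindings {
        if *ptr == 0 {
            return Err(format!("tensor `{}` has a null device address", name));
        }
    }
    for (name, shape) in input_shapes {
        if !bindings.iter().any(|(bound, _)| bound == name) {
            return Err(format!("input `{}` has a shape but no binding", name));
        }
        if element_count(shape).is_none() {
            return Err(format!("input `{}` has an unresolved dimension", name));
        }
    }
    Ok(())
}
```

For a `[1, 3, 224, 224]` input, `element_count` returns `Some(150528)`, which a caller could multiply by the element size to confirm the device buffer is large enough.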
BuildFromOnnx
Phase 4.5++ — Parse an ONNX model and build a serialised
engine plan. Gated on the upstream tensorrt-onnx feature
(and transitively on tensorrt-link). Without those the
handler returns TrtError::NotLinked.
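The "compiles everywhere, fails at run time when not linked" behaviour described for `Execute` and `BuildFromOnnx` is commonly achieved with paired `cfg` attributes. The following is a minimal sketch of that pattern, not the crate's actual handler: `SketchError` and `build_from_onnx` are hypothetical, and the body of the feature-enabled branch is a placeholder.

```rust
/// Hypothetical error type mirroring the `TrtError::NotLinked` case.
#[derive(Debug, PartialEq)]
enum SketchError {
    NotLinked,
}

/// Compiled only when the cargo feature is enabled. A real handler would
/// parse the model and return a serialised plan; this placeholder just
/// echoes the input bytes so the sketch stays self-contained.
#[cfg(feature = "tensorrt-onnx")]
fn build_from_onnx(onnx_bytes: &[u8]) -> Result<Vec<u8>, SketchError> {
    Ok(onnx_bytes.to_vec())
}

/// Fallback compiled when the feature is off. The function still exists,
/// so callers build unchanged, but it reports that the native library
/// is absent.
#[cfg(not(feature = "tensorrt-onnx"))]
fn build_from_onnx(_onnx_bytes: &[u8]) -> Result<Vec<u8>, SketchError> {
    Err(SketchError::NotLinked)
}
```

Keeping the signature identical in both branches means message senders need no conditional compilation of their own; they only have to handle the `NotLinked` error.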