Enum TrtMsg

pub enum TrtMsg {
    Build {
        source: NetworkSource,
        config: Box<IBuilderConfig>,
        reply: BuildReply,
    },
    Deserialize {
        plan: EnginePlan,
        reply: DeserializeReply,
    },
    CreateContext {
        engine: Arc<TrtEngine>,
        reply: CreateContextReply,
    },
    EnqueueOnStream {
        stream: Arc<CudaStream>,
        context: ExecutionContext,
        bindings: ExecutionBindings,
        reply: EnqueueReply,
    },
    Refit {
        engine: Arc<TrtEngine>,
        weights: Vec<RefitWeights>,
        reply: RefitReply,
    },
    Execute {
        engine: Arc<TrtEngine>,
        bindings: Vec<(String, u64)>,
        input_shapes: Vec<(String, Vec<i32>)>,
        stream: Arc<CudaStream>,
        reply: ExecuteReply,
    },
    BuildFromOnnx {
        onnx_bytes: Vec<u8>,
        config: Box<IBuilderConfig>,
        reply: BuildFromOnnxReply,
    },
}

Public message surface for TrtActor.

The EnqueueOnStream variant accepts the Arc<CudaStream> from atomr-accel-cuda::DeviceActor, so the TensorRT runtime shares the device’s stream timeline (no cross-stream synchronisation, no extra event hops).
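
The stream sharing described above comes down to cloning one Arc rather than creating a second stream. The following is a minimal sketch that runs without a GPU; the CudaStream struct, the trimmed-down EnqueueOnStream struct, and the enqueue_on helper are hypothetical stand-ins, not the crate's real types.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real CudaStream type; it only carries an id
// so the example runs without a GPU.
#[derive(Debug)]
struct CudaStream {
    id: u32,
}

// Hypothetical, trimmed-down message: only the field that matters here.
struct EnqueueOnStream {
    stream: Arc<CudaStream>,
}

/// Builds a message that reuses the caller's stream instead of creating a new one.
fn enqueue_on(stream: &Arc<CudaStream>) -> EnqueueOnStream {
    EnqueueOnStream {
        stream: Arc::clone(stream),
    }
}

fn main() {
    let device_stream = Arc::new(CudaStream { id: 0 });
    let msg = enqueue_on(&device_stream);

    // Both handles point at the same stream object, so work submitted through
    // either one lands on a single ordered timeline.
    assert!(Arc::ptr_eq(&device_stream, &msg.stream));
    assert_eq!(Arc::strong_count(&device_stream), 2);
    println!("shared stream id = {}", msg.stream.id);
}
```

Cloning an Arc only bumps a reference count, so passing the stream in every message is cheap.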

Variants

Build

Build a TensorRT engine from a network source and a builder config. Returns the serialised plan on success.

Fields

source: NetworkSource

config: Box<IBuilderConfig>

reply: BuildReply
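
Every variant carries a reply field, which suggests a request/reply exchange with the actor. A rough sketch of that pattern follows, using std::sync::mpsc channels and a plain thread in place of the actor runtime. The NetworkSource, BuilderConfig, and Msg definitions, the assumption that BuildReply behaves like a channel sender, and the fake "plan" bytes are all hypothetical; the real types are not shown on this page.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-ins; the real NetworkSource, IBuilderConfig and
// BuildReply types are not shown on this page.
struct NetworkSource(String);
struct BuilderConfig {
    fp16: bool,
}
type BuildReply = mpsc::Sender<Result<Vec<u8>, String>>;

enum Msg {
    Build {
        source: NetworkSource,
        config: Box<BuilderConfig>,
        reply: BuildReply,
    },
}

/// Pretend build: returns a fake "plan" whose bytes encode the inputs.
fn handle(msg: Msg) {
    match msg {
        Msg::Build { source, config, reply } => {
            let result = if source.0.is_empty() {
                Err("empty network source".to_string())
            } else {
                let mut plan = source.0.into_bytes();
                plan.push(config.fp16 as u8);
                Ok(plan)
            };
            // The requester may have gone away; ignore a closed reply channel.
            let _ = reply.send(result);
        }
    }
}

/// Sends a Build request to an actor thread and waits for the reply.
fn build(name: &str, fp16: bool) -> Result<Vec<u8>, String> {
    let (inbox, mailbox) = mpsc::channel::<Msg>();
    let actor = thread::spawn(move || {
        for msg in mailbox {
            handle(msg);
        }
    });

    let (reply, rx) = mpsc::channel();
    inbox
        .send(Msg::Build {
            source: NetworkSource(name.to_string()),
            config: Box::new(BuilderConfig { fp16 }),
            reply,
        })
        .expect("actor is running");

    let out = rx.recv().expect("actor replied");
    drop(inbox); // closing the inbox lets the actor loop end
    actor.join().expect("actor thread panicked");
    out
}

fn main() {
    let plan = build("net", true).expect("build succeeds");
    println!("plan is {} bytes", plan.len());
}
```

Carrying the reply handle inside the message keeps the actor free of per-caller state: it answers whoever sent the request and moves on.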

Deserialize

Deserialise a plan blob into a shared engine handle.

Fields

plan: EnginePlan

reply: DeserializeReply

CreateContext

Create a fresh IExecutionContext from an existing engine. Returns the new context (caller owns it).

Fields

engine: Arc<TrtEngine>

reply: CreateContextReply

EnqueueOnStream

Submit enqueueV3 on the supplied CUDA stream. The actor returns immediately after submission; real GPU completion is observed by atomr-accel-cuda’s completion strategy on the shared stream.

Fields

stream: Arc<CudaStream>

context: ExecutionContext

bindings: ExecutionBindings

reply: EnqueueReply
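
The difference between "submitted" and "completed" is easy to miss, so here is a small simulation of it. Nothing in it touches CUDA or TensorRT: a sleeping thread plays the GPU, and the returned receiver is a hypothetical stand-in for whatever completion strategy atomr-accel-cuda provides. The enqueue function and its signature are assumptions made for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Simulates an enqueue: acknowledges submission right away and finishes the
/// "GPU work" later on a separate thread. Returns the completion receiver,
/// a hypothetical stand-in for the CUDA crate's completion strategy.
fn enqueue(submitted: mpsc::Sender<Result<(), String>>) -> mpsc::Receiver<u32> {
    let (done_tx, done_rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20)); // pretend kernel runtime
        let _ = done_tx.send(42); // pretend result
    });
    // The reply means "queued on the stream", not "finished".
    let _ = submitted.send(Ok(()));
    done_rx
}

fn main() {
    let (reply_tx, reply_rx) = mpsc::channel();
    let completion = enqueue(reply_tx);

    // The submission acknowledgement arrives without waiting for the work.
    assert_eq!(reply_rx.recv().unwrap(), Ok(()));

    // Output buffers must not be read until completion is observed.
    let value = completion.recv().unwrap();
    println!("completed with {value}");
}
```

A caller that treats the reply as completion would read output buffers too early; waiting on the separate completion signal avoids that.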

Refit

Refit a built engine in place with new weights. Requires the engine to have been built with RefitPolicy::OnDemand or WeightsStreaming.

Fields

engine: Arc<TrtEngine>

weights: Vec<RefitWeights>

reply: RefitReply
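
A caller can check the build-time requirement before sending Refit. The sketch below shows one way to do that. The enum is a hypothetical mirror of RefitPolicy: OnDemand comes from the description above, while treating WeightsStreaming as a sibling variant, the Disabled variant, and the RefitError type are assumptions made for this example.

```rust
// Hypothetical mirror of the build-time policy. OnDemand comes from the
// description above; WeightsStreaming as a sibling variant and the Disabled
// variant are assumptions made for this example.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RefitPolicy {
    Disabled,
    OnDemand,
    WeightsStreaming,
}

// Hypothetical error type for the pre-flight check.
#[derive(Debug, PartialEq)]
enum RefitError {
    NotRefittable(RefitPolicy),
}

/// Checks whether an engine built with `policy` can accept a Refit message.
fn check_refit(policy: RefitPolicy) -> Result<(), RefitError> {
    match policy {
        RefitPolicy::OnDemand | RefitPolicy::WeightsStreaming => Ok(()),
        other => Err(RefitError::NotRefittable(other)),
    }
}

fn main() {
    assert_eq!(check_refit(RefitPolicy::OnDemand), Ok(()));
    assert_eq!(check_refit(RefitPolicy::WeightsStreaming), Ok(()));

    let err = check_refit(RefitPolicy::Disabled);
    println!("{err:?}");
}
```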

Execute

Phase 4.5++: Run inference on a previously loaded engine. bindings holds a (tensor_name, CUdeviceptr) pair for every I/O tensor on the engine; stream is the Arc<CudaStream> on which enqueueV3 is submitted (typically the device’s primary stream from DeviceMsg::SnapshotStream).

The handler creates a fresh IExecutionContext, binds every tensor address, then calls enqueueV3. It returns Ok(()) once the work is submitted, while the kernels may still be running on the GPU; real completion is observed by atomr-accel-cuda’s completion strategy on the shared stream.

On builds without the tensorrt-link feature, the variant compiles but the handler returns TrtError::NotLinked.

Fields

engine: Arc<TrtEngine>

bindings: Vec<(String, u64)>

input_shapes: Vec<(String, Vec<i32>)>

stream: Arc<CudaStream>

reply: ExecuteReply
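
Because bindings and input_shapes are plain standard-library types, they can be assembled and sanity-checked before the message is sent. The following sketch does that under the assumption that input_shapes carries concrete dimensions for the engine's inputs. The helpers unbound_inputs and element_count are hypothetical and not part of the crate, and the device addresses are made up.

```rust
/// Returns the names in `input_shapes` that have no entry in `bindings`.
fn unbound_inputs(
    bindings: &[(String, u64)],
    input_shapes: &[(String, Vec<i32>)],
) -> Vec<String> {
    input_shapes
        .iter()
        .filter(|(name, _)| !bindings.iter().any(|(bound, _)| bound == name))
        .map(|(name, _)| name.clone())
        .collect()
}

/// Number of elements for a fully specified shape; None if any dimension is
/// negative (such as a -1 placeholder for a dynamic dimension) or the
/// product overflows.
fn element_count(shape: &[i32]) -> Option<usize> {
    shape.iter().try_fold(1usize, |acc, &dim| {
        if dim < 0 {
            None
        } else {
            acc.checked_mul(dim as usize)
        }
    })
}

fn main() {
    // Device addresses are made up; real ones come from a CUDA allocation.
    let bindings: Vec<(String, u64)> = vec![
        ("input".to_string(), 0x7000_0000),
        ("output".to_string(), 0x7000_4000),
    ];
    let input_shapes: Vec<(String, Vec<i32>)> =
        vec![("input".to_string(), vec![1, 3, 224, 224])];

    assert!(unbound_inputs(&bindings, &input_shapes).is_empty());
    println!("input elements: {:?}", element_count(&input_shapes[0].1));
}
```

Catching a missing binding on the host side gives a clearer failure than an error surfacing from the handler after the message has been sent.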

BuildFromOnnx

Phase 4.5++: Parse an ONNX model and build a serialised engine plan. Gated on the upstream tensorrt-onnx feature (and transitively on tensorrt-link). Without those features, the handler returns TrtError::NotLinked.
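
Since NotLinked reflects how the binary was built rather than a transient failure, callers may want to treat it differently from other errors. A minimal sketch of that distinction follows. The TrtError enum here is a hypothetical stand-in in which only NotLinked comes from the text above; the Other variant, the Outcome type, the classify helper, and the assumption that the reply carries a Result<Vec<u8>, TrtError> are all illustrative.

```rust
// Hypothetical stand-in for the crate's error type; only NotLinked is named
// in the text above, the Other variant is an assumption.
#[derive(Debug, PartialEq)]
enum TrtError {
    NotLinked,
    Other(String),
}

// Hypothetical summary of what the caller should do next.
#[derive(Debug, PartialEq)]
enum Outcome {
    Plan(Vec<u8>),
    Unavailable,
    Failed(String),
}

/// Maps a BuildFromOnnx-style reply onto the caller's next step.
fn classify(reply: Result<Vec<u8>, TrtError>) -> Outcome {
    match reply {
        Ok(plan) => Outcome::Plan(plan),
        // The feature is not compiled in, so retrying cannot help.
        Err(TrtError::NotLinked) => Outcome::Unavailable,
        Err(TrtError::Other(msg)) => Outcome::Failed(msg),
    }
}

fn main() {
    assert_eq!(classify(Err(TrtError::NotLinked)), Outcome::Unavailable);
    let failed = classify(Err(TrtError::Other("parse error".to_string())));
    println!("{failed:?}");
    println!("{:?}", classify(Ok(vec![1, 2, 3])));
}
```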