Safe Rust bindings to NVIDIA TensorRT-RTX
⚠️ EXPERIMENTAL - NOT FOR PRODUCTION USE
This crate is in early experimental development. The API is unstable and will change. This is NOT production-ready software. Use at your own risk.
This crate provides safe, ergonomic Rust bindings to the TensorRT-RTX library for high-performance deep learning inference on NVIDIA GPUs.
§Overview
TensorRT-RTX enables efficient inference by:
- Optimizing neural network graphs
- Fusing layers and operations
- Selecting optimal kernels for your hardware
- Supporting dynamic shapes and batching
§Workflow
Using TensorRT-RTX typically follows two phases:
§Build Phase (Ahead-of-Time)
- Create a Logger to capture TensorRT messages
- Create a Builder to construct an optimized engine
- Define your network using NetworkDefinition
- Configure optimization with BuilderConfig
- Build and serialize the engine to disk
§Inference Phase (Runtime)
- Create a Runtime with a logger
- Deserialize the engine using Runtime::deserialize_cuda_engine
- Create an ExecutionContext from the engine
- Bind input/output tensors
- Execute inference with ExecutionContext::enqueue_v3
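The hand-off between the two phases is the serialized engine on disk. The sketch below uses only the standard library to show that hand-off in isolation. The helper names `save_engine` and `load_engine` are hypothetical and not part of this crate, and treating the engine as a plain byte slice is an assumption based on the example passing `&engine_data` to `std::fs::write`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Build phase (hypothetical helper): persist the serialized engine bytes.
fn save_engine(path: &Path, engine_data: &[u8]) -> io::Result<()> {
    fs::write(path, engine_data)
}

/// Inference phase (hypothetical helper): load the bytes back, typically
/// in a later run or a different process than the one that built them.
fn load_engine(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}
```

The bytes returned by `load_engine` would then presumably be handed to Runtime::deserialize_cuda_engine, in place of the in-memory `engine_data` used in the example.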
§Example
use trtx::{Logger, Builder, Runtime};
use trtx::builder::{BuilderConfig, MemoryPoolType, network_flags};
// Create logger
let logger = Logger::stderr()?;
// Build phase
let builder = Builder::new(&logger)?;
let mut network = builder.create_network(network_flags::EXPLICIT_BATCH)?;
let mut config = builder.create_config()?;
// Configure memory
config.set_memory_pool_limit(MemoryPoolType::Workspace, 1 << 30)?;
// Build and serialize
let engine_data = builder.build_serialized_network(&mut network, &mut config)?;
std::fs::write("model.engine", &engine_data)?;
// Inference phase
let runtime = Runtime::new(&logger)?;
let engine = runtime.deserialize_cuda_engine(&engine_data)?;
let context = engine.create_execution_context()?;
// List I/O tensors
let num_tensors = engine.get_nb_io_tensors()?;
for i in 0..num_tensors {
    let name = engine.get_tensor_name(i)?;
    println!("Tensor {}: {}", i, name);
}
§Safety
This crate provides safe abstractions over the underlying C++ API. However,
some operations (like setting tensor addresses and enqueueing inference)
require careful management of CUDA memory and are marked as unsafe.
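To illustrate the kind of contract an unsafe call carries, here is a minimal host-memory sketch of a safe wrapper around a raw-pointer function. Everything in it is hypothetical: `enqueue_mock` stands in for a GPU enqueue call and is not this crate's API, and it works on ordinary `Vec` memory rather than CUDA device memory.

```rust
/// Hypothetical stand-in for an enqueue call: reads `len` floats from
/// `input` and writes each value doubled to `output`.
///
/// # Safety
/// `input` must be valid for reads of `len` elements, `output` must be
/// valid for writes of `len` elements, and the two ranges must not overlap.
unsafe fn enqueue_mock(input: *const f32, output: *mut f32, len: usize) {
    for i in 0..len {
        // SAFETY: the caller guarantees both pointers cover `len` elements.
        unsafe { *output.add(i) = *input.add(i) * 2.0 };
    }
}

/// Safe wrapper: it owns the output buffer and borrows the input, so both
/// stay alive and correctly sized for the whole call.
fn run_doubling(input: &[f32]) -> Vec<f32> {
    let mut output = vec![0.0_f32; input.len()];
    // SAFETY: `input` and `output` are distinct live allocations that each
    // hold exactly `input.len()` elements.
    unsafe { enqueue_mock(input.as_ptr(), output.as_mut_ptr(), input.len()) };
    output
}
```

With real device memory the same obligations apply, plus whatever the underlying API requires about buffers remaining valid until the work on the stream has completed.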
§Prerequisites
- NVIDIA TensorRT-RTX library installed
- CUDA Runtime
- Compatible NVIDIA GPU
Set the TENSORRT_RTX_DIR environment variable to the installation path
if TensorRT-RTX is not in a standard location.
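How the crate's build step consumes TENSORRT_RTX_DIR is not shown here. As a minimal sketch, a program could check the variable up front like this. The function names are hypothetical, and treating an empty or whitespace-only value as unset is an assumption of this sketch.

```rust
use std::env;
use std::path::PathBuf;

/// Turns a raw variable value into a path, treating an empty or
/// whitespace-only value as "not set" (an assumption of this sketch).
fn tensorrt_rtx_dir_from(value: Option<String>) -> Option<PathBuf> {
    value.filter(|v| !v.trim().is_empty()).map(PathBuf::from)
}

/// Reads TENSORRT_RTX_DIR from the process environment.
fn tensorrt_rtx_dir() -> Option<PathBuf> {
    tensorrt_rtx_dir_from(env::var("TENSORRT_RTX_DIR").ok())
}
```

A `None` result means the variable is unset or blank, in which case the library has to be discoverable in a standard location.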
Re-exports§
pub use builder::Builder;
pub use builder::BuilderConfig;
pub use cuda::get_default_stream;
pub use cuda::synchronize;
pub use cuda::DeviceBuffer;
pub use enum_helpers::activation_type_name;
pub use enum_helpers::datatype_name;
pub use enum_helpers::elementwise_op_name;
pub use enum_helpers::pooling_type_name;
pub use enum_helpers::reduce_op_name;
pub use enum_helpers::unary_op_name;
pub use error::Error;
pub use error::Result;
pub use executor::run_onnx_with_tensorrt;
pub use executor::run_onnx_zeroed;
pub use executor::TensorInput;
pub use executor::TensorOutput;
pub use logger::LogHandler;
pub use logger::Logger;
pub use logger::Severity;
pub use logger::StderrLogger;
pub use network::NetworkDefinition;
pub use network::Tensor;
pub use onnx_parser::OnnxParser;
pub use runtime::CudaEngine;
pub use runtime::ExecutionContext;
pub use runtime::Runtime;
Modules§
- autocxx_helpers: Helper utilities for working with autocxx-generated TensorRT bindings
- builder: Builder for creating TensorRT engines
- cuda: CUDA memory management utilities
- enum_helpers: Helper functions for converting TensorRT enums to strings
- error: Error types for TensorRT-RTX operations
- executor: Executor module providing a rustnn-compatible interface
- logger: Logger interface for TensorRT-RTX
- network: Network definition for building TensorRT engines
- onnx_parser: ONNX model parser for TensorRT
- runtime: Runtime for deserializing and managing TensorRT engines
Macros§
- autocxx_call: Helper macro to reduce boilerplate for autocxx method calls
Enums§
- ActivationType: Enumerates the types of activation to perform in an activation layer.
- CumulativeOperation: Enumerates the cumulative operations that may be performed by a Cumulative layer. The table shows the initial value of each Cumulative operation.

  Operation | kFLOAT, kHALF, kBF16 | kINT32, kINT64
  --------- | -------------------- | --------------
  kSUM      | +0.0                 | 0

- DataType: The type of weights and tensors. The datatypes other than kBOOL, kINT32, and kINT64 are “activation datatypes,” as they often represent values corresponding to inference results.
- ElementWiseOperation: Enumerates the binary operations that may be performed by an ElementWise layer. Operations kAND, kOR, and kXOR must have inputs of DataType::kBOOL. All other operations must have inputs of floating-point type, DataType::kINT8, DataType::kINT32, or DataType::kINT64. See IElementWiseLayer.
- GatherMode: Controls the form of IGatherLayer. See IGatherLayer.
- InterpolationMode: Enumerates various modes of interpolation.
- MatrixOperation: Enumerates the operations that may be performed on a tensor by IMatrixMultiplyLayer before multiplication.
- PoolingType: The type of pooling to perform in a pooling layer.
- ReduceOperation: Enumerates the reduce operations that may be performed by a Reduce layer. The table shows the result of reducing across an empty volume of a given type.

  Operation | kFLOAT and kHALF  | kINT32  | kINT8
  --------- | ----------------- | ------- | -----
  kSUM      | 0                 | 0       | 0
  kPROD     | 1                 | 1       | 1
  kMAX      | negative infinity | INT_MIN | -128
  kMIN      | positive infinity | INT_MAX | 127
  kAVG      | NaN               | 0       | -128

  The current version of TensorRT usually performs reduction for kINT8 via kFLOAT or kHALF. The kINT8 values show the quantized representations of the floating-point values.
- ResizeCoordinateTransformation: The resize coordinate transformation function. See IResizeLayer::setCoordinateTransformation().
- ResizeRoundMode: The rounding mode for nearest-neighbor resize. See IResizeLayer::setNearestRounding().
- ResizeSelector: The coordinate selector used when resizing to a single-pixel output. See IResizeLayer::setSelectorForSinglePixel().
- ScaleMode: Controls how shift, scale, and power are applied in a Scale layer. See IScaleLayer.
- ScatterMode: Controls the form of IScatterLayer. See IScatterLayer.
- UnaryOperation: Enumerates the unary operations that may be performed by a Unary layer. Operation kNOT must have inputs of DataType::kBOOL. Operations kSIGN and kABS must have inputs of floating-point type, DataType::kINT8, DataType::kINT32, or DataType::kINT64. Operation kISINF must have inputs of floating-point type. All other operations must have inputs of floating-point type. See IUnaryLayer.
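Most of the empty-volume results listed for ReduceOperation are what a fold returns when it starts from the operation's identity element and sees no input. The sketch below reproduces the floating-point column and the sum, product, max, and min entries of the kINT32 column in plain host-side Rust. It is an illustration of the arithmetic only, not TensorRT code, and the function names are hypothetical.

```rust
/// Folds a slice of f32 from each operation's identity element.
/// Returns (sum, prod, max, min, avg).
fn reduce_f32(values: &[f32]) -> (f32, f32, f32, f32, f32) {
    let sum = values.iter().fold(0.0_f32, |acc, &v| acc + v);
    let prod = values.iter().fold(1.0_f32, |acc, &v| acc * v);
    let max = values.iter().fold(f32::NEG_INFINITY, |acc, &v| acc.max(v));
    let min = values.iter().fold(f32::INFINITY, |acc, &v| acc.min(v));
    // For empty input this is 0.0 / 0.0, which is NaN.
    let avg = sum / values.len() as f32;
    (sum, prod, max, min, avg)
}

/// Same idea for i32. Returns (sum, prod, max, min).
/// The average is omitted: integer 0 / 0 panics in Rust, whereas the
/// table lists 0 for kAVG on kINT32.
fn reduce_i32(values: &[i32]) -> (i32, i32, i32, i32) {
    let sum = values.iter().fold(0_i32, |acc, &v| acc + v);
    let prod = values.iter().fold(1_i32, |acc, &v| acc * v);
    let max = values.iter().fold(i32::MIN, |acc, &v| acc.max(v));
    let min = values.iter().fold(i32::MAX, |acc, &v| acc.min(v));
    (sum, prod, max, min)
}
```

Calling `reduce_f32(&[])` yields 0, 1, negative infinity, positive infinity, and NaN, and `reduce_i32(&[])` yields 0, 1, `i32::MIN`, and `i32::MAX`, matching the corresponding table rows.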