Crate trtx

Safe Rust bindings to NVIDIA TensorRT-RTX

⚠️ EXPERIMENTAL - NOT FOR PRODUCTION USE

This crate is in early experimental development. The API is unstable and will change. This is NOT production-ready software. Use at your own risk.

This crate provides safe, ergonomic Rust bindings to the TensorRT-RTX library for high-performance deep learning inference on NVIDIA GPUs.

§Overview

TensorRT-RTX enables efficient inference by:

  • Optimizing neural network graphs
  • Fusing layers and operations
  • Selecting optimal kernels for your hardware
  • Supporting dynamic shapes and batching

§Workflow

Using TensorRT-RTX typically follows two phases:

§Build Phase (Ahead-of-Time)

  1. Create a Logger to capture TensorRT messages
  2. Create a Builder to construct an optimized engine
  3. Define your network using NetworkDefinition
  4. Configure optimization with BuilderConfig
  5. Build and serialize the engine to disk

§Inference Phase (Runtime)

  1. Create a Runtime with a logger
  2. Deserialize the engine using Runtime::deserialize_cuda_engine
  3. Create an ExecutionContext from the engine
  4. Bind input/output tensors
  5. Execute inference with ExecutionContext::enqueue_v3

§Example

use trtx::{Logger, Builder, Runtime};
use trtx::builder::{BuilderConfig, MemoryPoolType, network_flags};

// Create logger
let logger = Logger::stderr()?;

// Build phase
let builder = Builder::new(&logger)?;
let mut network = builder.create_network(network_flags::EXPLICIT_BATCH)?;
// ... define network inputs, layers, and outputs here ...
let mut config = builder.create_config()?;

// Configure memory
config.set_memory_pool_limit(MemoryPoolType::Workspace, 1 << 30)?;

// Build and serialize
let engine_data = builder.build_serialized_network(&mut network, &mut config)?;
std::fs::write("model.engine", &engine_data)?;

// Inference phase
let runtime = Runtime::new(&logger)?;
let engine = runtime.deserialize_cuda_engine(&engine_data)?;
let context = engine.create_execution_context()?;

// List I/O tensors
let num_tensors = engine.get_nb_io_tensors()?;
for i in 0..num_tensors {
    let name = engine.get_tensor_name(i)?;
    println!("Tensor {}: {}", i, name);
}

§Safety

This crate provides safe abstractions over the underlying C++ API. However, some operations (like setting tensor addresses and enqueueing inference) require careful management of CUDA memory and are marked as unsafe.
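To make the unsafe boundary concrete, here is a minimal sketch of the tensor-binding and enqueue step. Only `Logger`, `Runtime`, `DeviceBuffer`, `get_default_stream`, `synchronize`, and `ExecutionContext::enqueue_v3` are named in this documentation; the `DeviceBuffer::new`, `device_ptr`, and `set_tensor_address` signatures are assumptions for illustration, not confirmed API.

```rust
use trtx::{DeviceBuffer, Logger, Runtime};
use trtx::cuda::{get_default_stream, synchronize};

let logger = Logger::stderr()?;
let runtime = Runtime::new(&logger)?;
let engine = runtime.deserialize_cuda_engine(&std::fs::read("model.engine")?)?;
let mut context = engine.create_execution_context()?;

// Allocate device memory for one input and one output. Sizes here are
// illustrative; query the engine for the real tensor shapes.
// NOTE: `DeviceBuffer::new` is an assumed constructor signature.
let input = DeviceBuffer::new(3 * 224 * 224 * 4)?;
let output = DeviceBuffer::new(1000 * 4)?;

// Binding raw device pointers and launching work on a CUDA stream cannot
// be checked by the borrow checker, so these operations are `unsafe`:
// the caller must guarantee the buffers outlive the enqueued work.
// NOTE: `set_tensor_address` and `device_ptr` are assumed names.
unsafe {
    context.set_tensor_address("input", input.device_ptr())?;
    context.set_tensor_address("output", output.device_ptr())?;
    context.enqueue_v3(get_default_stream())?;
}

// Wait for the GPU to finish before reading results back to the host.
synchronize()?;
```

The pattern mirrors the underlying C++ API: binding and enqueueing are fast, non-blocking calls, and correctness depends on the device buffers staying alive until synchronization.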

§Prerequisites

  • NVIDIA TensorRT-RTX library installed
  • CUDA Runtime
  • Compatible NVIDIA GPU

Set the TENSORRT_RTX_DIR environment variable to the installation path if TensorRT-RTX is not in a standard location.
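For example, before building the crate (the `/opt/tensorrt-rtx` prefix below is only a placeholder; substitute your actual installation path):

```shell
# Point the build at a non-standard TensorRT-RTX install location.
export TENSORRT_RTX_DIR=/opt/tensorrt-rtx
```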

Re-exports§

pub use builder::Builder;
pub use builder::BuilderConfig;
pub use cuda::get_default_stream;
pub use cuda::synchronize;
pub use cuda::DeviceBuffer;
pub use enum_helpers::activation_type_name;
pub use enum_helpers::datatype_name;
pub use enum_helpers::elementwise_op_name;
pub use enum_helpers::pooling_type_name;
pub use enum_helpers::reduce_op_name;
pub use enum_helpers::unary_op_name;
pub use error::Error;
pub use error::Result;
pub use executor::run_onnx_with_tensorrt;
pub use executor::run_onnx_zeroed;
pub use executor::TensorInput;
pub use executor::TensorOutput;
pub use logger::LogHandler;
pub use logger::Logger;
pub use logger::Severity;
pub use logger::StderrLogger;
pub use network::NetworkDefinition;
pub use network::Tensor;
pub use onnx_parser::OnnxParser;
pub use runtime::CudaEngine;
pub use runtime::ExecutionContext;
pub use runtime::Runtime;

Modules§

autocxx_helpers
Helper utilities for working with autocxx-generated TensorRT bindings
builder
Builder for creating TensorRT engines
cuda
CUDA memory management utilities
enum_helpers
Helper functions for converting TensorRT enums to strings
error
Error types for TensorRT-RTX operations
executor
Executor module providing rustnn-compatible interface
logger
Logger interface for TensorRT-RTX
network
Network definition for building TensorRT engines
onnx_parser
ONNX model parser for TensorRT
runtime
Runtime for deserializing and managing TensorRT engines

Macros§

autocxx_call
Helper macro to reduce boilerplate for autocxx method calls

Enums§

ActivationType
Enumerates the types of activation to perform in an activation layer.
CumulativeOperation
Enumerates the cumulative operations that may be performed by a Cumulative layer. The table shows the initial value of each operation:
Operation | kFLOAT, kHALF, kBF16 | kINT32, kINT64
kSUM      | +0.0                 | 0
DataType
The type of weights and tensors. The datatypes other than kBOOL, kINT32, and kINT64 are "activation datatypes," as they often represent values corresponding to inference results.
ElementWiseOperation
Enumerates the binary operations that may be performed by an ElementWise layer. Operations kAND, kOR, and kXOR must have inputs of DataType::kBOOL. All other operations must have inputs of floating-point type, DataType::kINT8, DataType::kINT32, or DataType::kINT64. See IElementWiseLayer.
GatherMode
Controls the form of IGatherLayer. See IGatherLayer.
InterpolationMode
Enumerates various modes of interpolation.
MatrixOperation
Enumerates the operations that may be performed on a tensor by IMatrixMultiplyLayer before multiplication.
PoolingType
The type of pooling to perform in a pooling layer.
ReduceOperation
Enumerates the reduce operations that may be performed by a Reduce layer. The table shows the result of reducing across an empty volume of a given type:
Operation | kFLOAT and kHALF  | kINT32  | kINT8
kSUM      | 0                 | 0       | 0
kPROD     | 1                 | 1       | 1
kMAX      | negative infinity | INT_MIN | -128
kMIN      | positive infinity | INT_MAX | 127
kAVG      | NaN               | 0       | -128
The current version of TensorRT usually performs reduction for kINT8 via kFLOAT or kHALF. The kINT8 values show the quantized representations of the floating-point values.
ResizeCoordinateTransformation
The resize coordinate transformation function. See IResizeLayer::setCoordinateTransformation().
ResizeRoundMode
The rounding mode for nearest-neighbor resize. See IResizeLayer::setNearestRounding().
ResizeSelector
The coordinate selector when resizing to a single-pixel output. See IResizeLayer::setSelectorForSinglePixel().
ScaleMode
Controls how shift, scale, and power are applied in a Scale layer. See IScaleLayer.
ScatterMode
Controls the form of IScatterLayer. See IScatterLayer.
UnaryOperation
Enumerates the unary operations that may be performed by a Unary layer. Operation kNOT must have inputs of DataType::kBOOL. Operations kSIGN and kABS must have inputs of floating-point type, DataType::kINT8, DataType::kINT32, or DataType::kINT64. Operation kISINF must have inputs of floating-point type. All other operations must have inputs of floating-point type. See IUnaryLayer.

Functions§

dynamically_load_tensorrt
dynamically_load_tensorrt_onnxparser

Type Aliases§

ResizeMode