trtx-rs
⚠️ EXPERIMENTAL - NOT FOR PRODUCTION USE
This project is in early experimental development. The API is unstable and will change. This is NOT production-ready software. Use at your own risk.
Published on crates.io to reserve the crate names.
Safe Rust bindings to NVIDIA TensorRT-RTX for high-performance deep learning inference.
Overview
This project provides ergonomic Rust bindings to TensorRT-RTX, enabling efficient inference of deep learning models on NVIDIA GPUs with minimal overhead.
Features
- Safe API: RAII-based memory management and type-safe abstractions
- Two-phase workflow: Separate build (AOT) and inference (runtime) phases
- Zero-cost abstractions: Minimal overhead over C++ API
- Comprehensive error handling: Proper Rust error types for all operations
- Flexible logging: Customizable log handlers for TensorRT messages
Project Structure
trtx-rs/
├── trtx-sys/ # Raw FFI bindings (unsafe)
└── trtx/ # Safe Rust wrapper (use this!)
Prerequisites
Required (building)
- NVIDIA TensorRT-RTX 1.3: Download and install from NVIDIA Developer
- CUDA Runtime: Version compatible with your TensorRT-RTX installation
- Clang: Required for autocxx. On Windows:
  winget install LLVM.LLVM
- NVIDIA GPU: Compatible with TensorRT-RTX requirements
TensorRT-RTX is dynamically loaded by default, so the TensorRT SDK is only required when building with the Cargo features link_tensorrt_rtx / link_tensorrt_onnxparser, which link the TensorRT libraries at build time.
Use TENSORRT_RTX_DIR to point to the TensorRT SDK root directory (the path that contains the lib folder with the shared libraries).
Required (GPU execution)
- NVIDIA TensorRT-RTX: Download and install from NVIDIA Developer
  - The TensorRT libraries must be in a location where they can be dynamically loaded (e.g. by setting PATH on Windows or LD_LIBRARY_PATH on Linux)
  - This crate currently requires TensorRT-RTX version 1.3 (see the Cargo feature v_1_3). Other versions may become available in the future.
- NVIDIA GPU: Compatible with TensorRT-RTX requirements
Development Without TensorRT-RTX (Mock Mode)
If you're developing on a machine without TensorRT-RTX (e.g., macOS, or for testing), you can use the mock feature. This enables the trtx mock layer (safe Rust stubs in trtx that mirror the real API), not the low-level FFI:
# Build with mock mode (assuming `mock` replaces the default `real` feature)
cargo build --no-default-features --features mock
# Run examples with mock mode
cargo run --example basic_workflow --no-default-features --features mock
# Run tests with mock mode
cargo test --no-default-features --features mock
Mock mode provides stub implementations that allow you to:
- Verify the API compiles correctly
- Test your application structure
- Develop without needing an NVIDIA GPU
- Run CI/CD pipelines on any platform
Note: Mock mode only validates structure and API usage. For actual inference, you need real TensorRT-RTX.
Cargo features
The trtx crate has the following Cargo features:
default: "real", "dlopen_tensorrt_onnxparser", "dlopen_tensorrt_rtx", "onnxparser", "v_1_3"mock: use this library in mock mode. TensorRT libraries and a Nvidia are no longer necessary for executionreal: opposite ofmockmode. TensorRT and Nvidia GPU are required for executiondlopen_tensorrt_rtx: enables dynamic loading of the TensorRT library viatrtx::dynamically_load_tensorrtdlopen_tensorrt_onnxparser: enables dynamic loading of the TensorRT ONNX parser library viatrtx::dynamically_load_tensorrt_onnxparserlinks_tensorrt_rtx: links the TensorRT library,trtx::dynamically_load_tensorrtis now optionallinks_tensorrt_onnxparser: links the TensorRT ONNX parser library,trtx::dynamically_load_tensorrt_onnxparseris now optionalonnxparser: Enables the ONNX parser functionality of this crate. Optional if not using ONNX as the input format for TensorRT, but using the builder library insteadv_1_3: Needs to be always enabled. Future TensorRT versions might be selectable by higher version numbers in future
Installation
Add to your Cargo.toml:
[dependencies]
trtx = "0.3"
Usage
Build Phase (Creating an Engine)
The build phase turns a network definition (created from ONNX or with the Network API) into a serialized engine plan that can be written to disk.
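The sketch below is illustrative only: the type names and create_network come from this README, but the constructors and the create_builder_config / build_serialized_network calls are assumed names; see trtx/examples/basic_workflow.rs for working code.

```rust
use trtx::{Builder, Logger};

fn build_engine() -> Result<(), trtx::Error> {
    let logger = Logger::new()?;           // assumed constructor
    let builder = Builder::new(&logger)?;  // assumed constructor
    let network = builder.create_network()?;
    // ... populate `network` via the ONNX parser or the Network API ...

    let config = builder.create_builder_config()?;                   // assumed name
    let plan = builder.build_serialized_network(&network, &config)?; // assumed name

    // Assumes the serialized plan can be viewed as a byte slice.
    std::fs::write("model.engine", &plan).expect("failed to write engine plan");
    Ok(())
}
```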
Inference Phase (Running Inference)
The inference phase deserializes a previously built engine and executes it on the GPU.
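Again a hedged sketch: Runtime, CudaEngine, and ExecutionContext are listed under Core Types below, but the constructors and the deserialize_cuda_engine / create_execution_context names are assumptions; see trtx/examples/ for the exact calls.

```rust
use std::fs;
use trtx::{Logger, Runtime};

fn run_inference() -> Result<(), trtx::Error> {
    let plan = fs::read("model.engine").expect("failed to read engine plan");

    let logger = Logger::new()?;                           // assumed constructor
    let runtime = Runtime::new(&logger)?;                  // assumed constructor
    let engine = runtime.deserialize_cuda_engine(&plan)?;  // assumed name
    let _context = engine.create_execution_context()?;     // assumed name

    // Binding tensor addresses and launching inference (set_tensor_address,
    // enqueue_v3) involve raw CUDA pointers and streams, so they are `unsafe`;
    // see the Safety section below.
    Ok(())
}
```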
Custom Logging
TensorRT log messages can be routed to a custom handler instead of the default output.
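Custom handlers are supported per the Features list; the with_handler constructor and the handler's (severity, message) signature below are purely hypothetical:

```rust
use trtx::Logger;

fn make_logger() -> Result<Logger, trtx::Error> {
    // Hypothetical handler-based constructor: forward TensorRT messages
    // to the application's own logging sink.
    Logger::with_handler(|severity, message: &str| {
        eprintln!("[TensorRT {severity:?}] {message}");
    })
}
```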
API Overview
Core Types
- Logger: Captures TensorRT messages with custom handlers
- Builder: Creates optimized inference engines
- NetworkDefinition: Defines the computational graph
- BuilderConfig: Configures optimization parameters
- Runtime: Deserializes engines for inference
- CudaEngine: Optimized inference engine
- ExecutionContext: Manages inference execution
Error Handling
All fallible operations return Result<T, Error>:
use trtx::Error;
match builder.create_network() {
    Ok(network) => { /* continue building with `network` */ }
    Err(e) => eprintln!("failed to create network: {e}"),
}
Safety
Safe Operations
Most operations are safe and use RAII for resource management:
- Creating loggers, builders, runtimes
- Building and serializing engines
- Deserializing engines
- Creating execution contexts
Unsafe Operations
CUDA-related operations require unsafe:
- set_tensor_address: must point to valid CUDA device memory
- enqueue_v3: requires a valid CUDA stream and properly bound tensors (see the sketch after this list)
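A sketch of the unsafe calls, assuming context is an ExecutionContext, input_ptr / output_ptr are valid CUDA device pointers (for example allocated with cudarc), and stream is a valid CUDA stream; the tensor names and exact signatures are illustrative:

```rust
// SAFETY: both pointers must reference live device allocations at least as
// large as the tensors they are bound to, and `stream` must stay valid until
// the enqueued work has completed.
unsafe {
    context.set_tensor_address("input", input_ptr)?;   // signature assumed
    context.set_tensor_address("output", output_ptr)?;
    context.enqueue_v3(stream)?;                        // signature assumed
}
```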
Building from Source
# Clone the repository and enter it
git clone <repository-url> && cd trtx-rs
# Option 1: Build with TensorRT-RTX (requires NVIDIA GPU)
cargo build
# Option 2: Build in mock mode (no GPU required)
cargo build --no-default-features --features mock
Examples
See the trtx/examples/ directory for complete examples:
- basic_workflow.rs: Build and serialize an engine (optionally from ONNX), then run inference
- tiny_network.rs: Build a small ReLU-based network from scratch using the Network API (no ONNX)
- rustnn_executor.rs: rustnn-compatible executor integration
Architecture
trtx-sys (FFI Layer)
- autocxx-generated bindings for the TensorRT-RTX C++ API
- Slim C++ logger bridge for virtual method handling (e.g., log callbacks)
- Optional mock FFI (when the mock feature is enabled) so the crate can build without TensorRT installed
- No safety guarantees; internal use only
trtx (Safe Wrapper)
- Mock layer: when the mock feature is enabled, the trtx crate uses a Rust mock layer (trtx/src/mock/) that mirrors the real API; this is the "mock mode" used for development without a GPU. The real implementation lives in trtx/src/real/.
- RAII-based resource management
- Type-safe API
- Lifetime tracking
- Comprehensive error handling
- User-facing API
Troubleshooting
Build Errors
Cannot find TensorRT headers: Ensure TENSORRT_RTX_DIR points to the TensorRT-RTX SDK root directory (the path that contains the lib folder).
Linking errors: When using the link features, verify that TENSORRT_RTX_DIR points to a matching SDK and that the shared libraries are discoverable at runtime (PATH on Windows, LD_LIBRARY_PATH on Linux).
Runtime Errors
CUDA not initialized: Ensure CUDA runtime is properly initialized before creating engines or contexts.
Invalid tensor addresses: Verify that all tensor addresses point to valid CUDA device memory with correct sizes.
Development
Pre-commit Hooks
To ensure code quality, set up the pre-commit hook:
The hook will automatically run cargo fmt and cargo clippy before each commit.
Manual Checks
You can also run checks manually using the Makefile:
GPU Testing
The project includes CI workflows for testing with real NVIDIA GPUs:
- Mock mode CI: runs on every push (Ubuntu, macOS) and tests the API without a GPU
- GPU tests: run on a self-hosted Windows runner with a T4 GPU and test real TensorRT-RTX
To set up a GPU runner for real hardware testing, see GPU Runner Setup Guide.
The GPU tests workflow:
- Builds without mock mode (uses real TensorRT-RTX)
- Verifies CUDA and GPU availability
- Runs tests and examples with actual GPU acceleration
- Can be triggered manually or runs automatically on code changes
Contributing
Contributions are welcome! Please see docs/DESIGN.md for architecture details.
License
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.
Acknowledgments
- NVIDIA for TensorRT-RTX
- The Rust community for excellent FFI tools
Status
This project is in early development. APIs may change before 1.0 release.
Implemented
- ✅ Core FFI layer (autocxx); trtx mock layer for development without TensorRT (no GPU)
- ✅ Logger interface with custom handlers
- ✅ Builder API for engine creation
- ✅ Runtime and engine deserialization
- ✅ Execution context
- ✅ Error handling with detailed messages
- ✅ Network API: TensorRT-RTX INetworkDefinition supported; build networks in Rust without ONNX
- ✅ ONNX parser bindings (nvonnxparser integration)
- ✅ CUDA: cudarc integration for memory management and device sync
- ✅ rustnn-compatible executor API (ready for integration)
- ✅ RAII-based resource management
Planned
- ⬜ Dynamic shape support
- ⬜ Optimization profiles
- ⬜ Weight refitting
- ⬜ INT8 quantization support
- ⬜ Comprehensive examples with real models
- ⬜ Performance benchmarking
- ⬜ Documentation improvements