libinfer
Rust interface to TensorRT engines via cxx. Caller provides device memory and CUDA streams.
Installation
Requirements:
- CUDA and TensorRT installed
- Environment variables:
  - `TENSORRT_LIBRARIES`: path to TensorRT libraries
  - `CUDA_LIBRARIES`: path to CUDA libraries
  - `CUDA_INCLUDE_DIRS`: path to CUDA include directories
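For a non-standard install location, these variables can be exported before building. The paths below are placeholders for illustration only; adjust them to your local TensorRT and CUDA installs:

```sh
# Example paths only -- point these at your actual TensorRT / CUDA install.
export TENSORRT_LIBRARIES=/opt/tensorrt/lib
export CUDA_LIBRARIES=/usr/local/cuda/lib64
export CUDA_INCLUDE_DIRS=/usr/local/cuda/include
cargo build
```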
Add `libinfer` to your `Cargo.toml`:

```toml
[dependencies]
libinfer = "0.1.0"
```
A Nix flake is provided for development. `nix develop` sets up all dependencies.
Usage
The API operates on raw CUDA device pointers and streams. The caller is responsible for device selection, memory allocation, and stream management.
```rust
use cudarc::driver::CudaContext;
use libinfer::{Engine, Options};

// Set the CUDA device before loading the engine.
let ctx = CudaContext::new(0).expect("failed to initialize CUDA device 0");

let options = Options {
    // ... engine path and other settings ...
};
let mut engine = Engine::new(options).unwrap();

// Query tensor metadata.
let inputs = engine.get_input_dims();
let outputs = engine.get_output_dims();
let batch = engine.get_batch_dims();

// Allocate device memory, run inference.
let stream = ctx.new_stream().unwrap();
// ... allocate input_bufs, output_bufs on the device ...
engine.infer(&input_bufs, &output_bufs, &stream).unwrap();
```
Input and output pointer arrays must match the order returned by `get_input_dims()` / `get_output_dims()`.
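Each device buffer must be sized to hold the tensor its dims describe. A hypothetical helper (not part of the libinfer API) that computes the element count for one dims vector might look like this:

```rust
/// Number of elements implied by a tensor's dims, used to size one device
/// buffer per tensor in the order returned by get_input_dims() /
/// get_output_dims(). Hypothetical helper for illustration.
fn tensor_len(dims: &[usize]) -> usize {
    dims.iter().product()
}

fn main() {
    // e.g. a [batch = 8, 3, 224, 224] image input
    assert_eq!(tensor_len(&[8, 3, 224, 224]), 1_204_224);
    println!("{}", tensor_len(&[8, 3, 224, 224]));
}
```

Multiply the element count by the element size (e.g. 4 bytes for `f32`) to get the allocation size in bytes.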
Examples
```sh
cargo run --example basic -- --path /path/to/model.engine
cargo run --example benchmark -- --path /path/to/model.engine
cargo run --example dynamic -- --path /path/to/model.engine
```
Testing
Tests require a CUDA-capable GPU. Generate test models and build TensorRT engines:
Then run:
```sh
cargo test
```
Caveats
- `Engine` is `Send` but not `Sync`, and `infer` takes `&mut self`. For concurrent inference on the same model, create separate `Engine` instances.
- The caller must ensure the CUDA context outlives the engine, particularly when cudarc's event tracking is disabled.
- Only the batch dimension is dynamic. Non-batch dynamic shapes are not yet supported.
- Engine files are not portable across TensorRT versions or GPU architectures. Rebuild from ONNX for each target.
- CUDA graphs are not yet supported.
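The Send-but-not-Sync constraint means each worker thread owns its own engine rather than sharing one behind a lock. The sketch below illustrates that pattern with a placeholder type standing in for libinfer's `Engine` (the real one wraps a TensorRT engine and needs a GPU):

```rust
use std::cell::Cell;
use std::thread;

// Placeholder standing in for libinfer's Engine: Cell makes this type
// Send (movable into a thread) but not Sync (not shareable by reference
// across threads), matching the caveat above.
struct Engine {
    calls: Cell<u64>,
}

impl Engine {
    fn new() -> Self {
        Engine { calls: Cell::new(0) }
    }

    // Like the real API, infer takes &mut self: one engine, one thread.
    fn infer(&mut self) -> u64 {
        self.calls.set(self.calls.get() + 1);
        self.calls.get()
    }
}

fn main() {
    // One Engine per thread instead of one Engine behind a Mutex.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let mut engine = Engine::new(); // this worker's own engine
            thread::spawn(move || engine.infer() + engine.infer())
        })
        .collect();
    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 12); // each thread returns 1 + 2 = 3; four threads
    println!("{}", total);
}
```

Per-thread instances trade some GPU memory (each engine holds its own execution context) for lock-free concurrent inference.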
Credits
C++ code originally based on tensorrt-cpp-api.