rave-tensorrt-0.3.0 has been yanked.
# rave-tensorrt

TensorRT-backed inference implementation for RAVE.
rave-tensorrt provides an UpscaleBackend implementation powered by
ONNX Runtime with TensorRT/CUDA execution providers and device-memory I/O
binding.
## Scope

- Backend implementation: `TensorRtBackend`
- Precision policy controls: `PrecisionPolicy`
- Batch settings surface: `BatchConfig`
- Inference/output-ring metrics and telemetry
- Provider bridge handling for Linux/WSL runtime compatibility
## Public API Highlights

- `TensorRtBackend::new(model_path, ctx, device_id, ring_size, downstream_capacity)`
- `TensorRtBackend::with_precision(...)`
- `BatchConfig { max_batch, latency_deadline_us }`
- `TensorRtBackend` implements `rave_core::backend::UpscaleBackend`
## Typical Usage

```rust
use std::path::PathBuf;
use std::sync::Arc;

use rave_core::backend::UpscaleBackend;
use rave_core::GpuContext; // module path assumed; check rave_core's docs
use rave_core::Result;     // module path assumed; check rave_core's docs
use rave_tensorrt::TensorRtBackend;

async fn build_backend(model_path: PathBuf, ctx: Arc<GpuContext>) -> Result<()> {
    // Arguments per the API highlights above: device 0, an output ring of
    // 4 slots, and a downstream channel capacity of 8. The fallible
    // constructor (`?`) is an assumption.
    let _backend = TensorRtBackend::new(model_path, ctx, 0, 4, 8)?;
    // Drive frames through the `UpscaleBackend` trait methods here.
    Ok(())
}
```
## Runtime Requirements

- ONNX Runtime shared libraries and provider libraries discoverable at runtime
- A CUDA/TensorRT/cuDNN stack compatible with the ORT build
- NVIDIA driver libraries discoverable (libcuda, etc.)
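On Linux, "discoverable" typically means the dynamic loader can resolve these libraries. A minimal sketch, assuming ONNX Runtime lives under `/opt/onnxruntime` and CUDA under `/usr/local/cuda` (both paths are placeholders, adjust for your install):

```shell
# Placeholder install locations; adjust to your system.
export LD_LIBRARY_PATH=/opt/onnxruntime/lib:/usr/local/cuda/lib64:${LD_LIBRARY_PATH}

# Sanity check: list driver libraries known to the loader cache.
ldconfig -p | grep libcuda
```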
## Notes

- The data path is GPU-resident via ORT I/O binding.
- `BatchConfig` is part of the API surface; the current processing path is single-frame.
- Provider selection can be controlled externally via `RAVE_ORT_TENSORRT`.