§Ultralytics YOLO Inference Library
High-performance YOLO model inference library written in Rust, providing a safe and efficient interface for running Ultralytics YOLO models on images, videos, and streams.
§Features
- High Performance - Pure Rust with zero-cost abstractions and SIMD-optimized preprocessing
- ONNX Runtime - Leverages ONNX Runtime for cross-platform hardware acceleration
- Supported YOLO Versions - YOLO11 and YOLO26 (including YOLO26 end-to-end NMS-free exports)
- All Tasks - Detection, segmentation, pose estimation, classification, and OBB
- Ultralytics API - Results API matches the Python package for easy migration
- Multiple Backends - CPU, CUDA, TensorRT, CoreML, OpenVINO, and more
- Multiple Sources - Images, directories, glob patterns, video, webcam, streams
§Installation
Add to your Cargo.toml:
[dependencies]
ultralytics-inference = "0.0.13"

Or install the CLI tool:

cargo install ultralytics-inference

§Quick Start (Library)
use ultralytics_inference::{YOLOModel, InferenceConfig};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load model - metadata (classes, task, imgsz) is read automatically
let mut model = YOLOModel::load("yolo26n.onnx")?;
// Run inference
let results = model.predict("image.jpg")?;
// Process results
for result in &results {
if let Some(ref boxes) = result.boxes {
println!("Found {} detections", boxes.len());
for i in 0..boxes.len() {
let cls = boxes.cls()[i] as usize;
let conf = boxes.conf()[i];
let name = result.names.get(&cls).map(|s| s.as_str()).unwrap_or("unknown");
println!(" {} {:.2}", name, conf);
}
}
}
Ok(())
}

§CLI Usage
The ultralytics-inference CLI provides a command-line interface for running YOLO inference:
# Install the CLI
cargo install ultralytics-inference
# Run with defaults (auto-downloads model and sample images)
ultralytics-inference predict
# Select task — auto-downloads the matching nano model
ultralytics-inference predict --task segment
ultralytics-inference predict --task pose
ultralytics-inference predict --task obb
ultralytics-inference predict --task classify
# Run on a specific image
ultralytics-inference predict --model yolo26n.onnx --source image.jpg
# Run on a directory of images
ultralytics-inference predict --model yolo26n.onnx --source images/
# With custom thresholds
ultralytics-inference predict -m yolo26n.onnx -s image.jpg --conf 0.5 --iou 0.7
# Filter by class IDs
ultralytics-inference predict --source image.jpg --classes "0,1,2"
# With visualization window
ultralytics-inference predict --model yolo26n.onnx --source video.mp4 --show
# Save annotated results
ultralytics-inference predict --model yolo26n.onnx --source image.jpg --save
# Save individual frames for video input
ultralytics-inference predict --source video.mp4 --save-frames
# Show help
ultralytics-inference help
# Show version
ultralytics-inference version

CLI Options:
| Option | Short | Description | Default |
|---|---|---|---|
| --model | -m | Path to ONNX model file; auto-downloaded if a known YOLO11/YOLO26 name | yolo26n.onnx |
| --task | | Task type (detect, segment, pose, obb, classify); selects nano model when --model is omitted | detect |
| --source | -s | Input source (image, directory, glob, video, webcam index, or URL) | Task-dependent sample assets |
| --conf | | Confidence threshold | 0.25 |
| --iou | | IoU threshold for NMS | 0.7 |
| --max-det | | Maximum number of detections | 300 |
| --imgsz | | Inference image size | Model metadata |
| --rect | | Enable rectangular inference (minimal padding) | true |
| --batch | | Batch size for inference | 1 |
| --half | | Use FP16 half-precision inference | false |
| --save | | Save annotated results to runs/&lt;task&gt;/predict | true |
| --save-frames | | Save individual frames for video input | false |
| --show | | Display results in a window | false |
| --device | | Device (cpu, cuda:0, mps, coreml, directml:0, openvino, tensorrt:0, xnnpack) | cpu |
| --verbose | | Show verbose output | true |
| --classes | | Filter by class IDs, e.g. 0 or "0,1,2" or "[0, 1, 2]" | all classes |
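The --imgsz and --rect options govern how input images are letterboxed before inference: the image is scaled to fit the target size while preserving aspect ratio, and the remainder is padded. The following is a self-contained sketch of that geometry (illustrative only, not the crate's actual preprocessing code; the stride value of 32 is a common YOLO default, assumed here):

```rust
// Compute the letterboxed canvas size for an input image.
// With rect-style minimal padding, each dimension is only padded
// up to the next multiple of the model stride; otherwise the
// canvas is a full square of the target size.
fn letterbox_dims(w: u32, h: u32, target: u32, rect: bool, stride: u32) -> (u32, u32) {
    // Scale factor that fits the image inside target x target.
    let scale = (target as f64 / w as f64).min(target as f64 / h as f64);
    let sw = (w as f64 * scale).round() as u32;
    let sh = (h as f64 * scale).round() as u32;
    if rect {
        // Round each dimension up to the nearest stride multiple.
        let pad = |x: u32| ((x + stride - 1) / stride) * stride;
        (pad(sw), pad(sh))
    } else {
        (target, target)
    }
}

fn main() {
    // A 1280x720 image at imgsz 640 scales to 640x360, then pads
    // to 640x384 with --rect, or to a full 640x640 square without.
    println!("{:?}", letterbox_dims(1280, 720, 640, true, 32));
    println!("{:?}", letterbox_dims(1280, 720, 640, false, 32));
}
```

Rectangular inference reduces the padded area that the model must process, which is why it is the default.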
§Task-Specific Examples
The library supports all YOLO tasks. Export models from Python:
# Detection (default)
yolo export model=yolo26n.pt format=onnx
# Segmentation
yolo export model=yolo26n-seg.pt format=onnx
# Pose Estimation
yolo export model=yolo26n-pose.pt format=onnx
# Classification
yolo export model=yolo26n-cls.pt format=onnx
# Oriented Bounding Boxes
yolo export model=yolo26n-obb.pt format=onnx

The task is auto-detected from ONNX metadata:
use ultralytics_inference::YOLOModel;
// Segmentation model - returns masks
let mut model = YOLOModel::load("yolo26n-seg.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref masks) = results[0].masks {
println!("Found {} instance masks", masks.len());
}
// Pose model - returns keypoints
let mut model = YOLOModel::load("yolo26n-pose.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref keypoints) = results[0].keypoints {
println!("Found {} poses", keypoints.len());
}
// Classification model - returns probabilities
let mut model = YOLOModel::load("yolo26n-cls.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref probs) = results[0].probs {
println!("Top-1: class {} ({:.1}%)", probs.top1(), probs.top1conf() * 100.0);
}

§Custom Configuration
Use the builder pattern to customize inference settings:
use ultralytics_inference::InferenceConfig;
let config = InferenceConfig::new()
.with_confidence(0.5) // Confidence threshold
.with_iou(0.45) // NMS IoU threshold
.with_max_det(300) // Max detections per image
.with_imgsz(640, 640); // Input image size

§Hardware Acceleration
Enable hardware acceleration with Cargo features:
# NVIDIA CUDA
cargo build --release --features cuda
# NVIDIA TensorRT
cargo build --release --features tensorrt
# Apple CoreML
cargo build --release --features coreml
# Intel OpenVINO
cargo build --release --features openvino

§Results API
The Results struct provides access to inference outputs:
- Boxes - Bounding boxes with xyxy(), xywh(), xyxyn(), xywhn(), conf(), cls() methods
- Masks - Segmentation masks with data, orig_shape fields
- Keypoints - Pose keypoints with xy(), xyn(), conf() methods
- Probs - Classification probabilities with top1(), top5(), top1conf(), top5conf() methods
- Obb - Oriented bounding boxes with xyxyxyxy(), xywhr(), conf(), cls() methods
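The accessor names encode the coordinate convention: xyxy is corner form, xywh is center plus size, and the n-suffixed variants are normalized by the original image dimensions. A self-contained sketch of those conversions (illustrative only, not the crate's internals):

```rust
// Convert a corner-form box (x1, y1, x2, y2) to center form
// (cx, cy, w, h), as the xywh() accessor convention implies.
fn xyxy_to_xywh(b: [f32; 4]) -> [f32; 4] {
    let [x1, y1, x2, y2] = b;
    [(x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1]
}

// Normalize corner-form coordinates by the original image size,
// as the n-suffixed accessors (xyxyn, xywhn) imply.
fn normalize_xyxy(b: [f32; 4], img_w: f32, img_h: f32) -> [f32; 4] {
    [b[0] / img_w, b[1] / img_h, b[2] / img_w, b[3] / img_h]
}

fn main() {
    let xyxy = [100.0, 50.0, 300.0, 250.0];
    println!("{:?}", xyxy_to_xywh(xyxy));             // center form
    println!("{:?}", normalize_xyxy(xyxy, 640.0, 480.0)); // normalized corners
}
```

Normalized coordinates are resolution-independent, which makes them convenient for serialization and for comparing results across differently sized inputs.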
§Module Overview
| Module | Description |
|---|---|
| model | Core YOLOModel for loading models and running inference |
| results | Output types (Results, Boxes, Masks, etc.) |
| inference | InferenceConfig for customizing inference settings |
| source | Input source handling (Source, SourceIterator) |
| task | YOLO task types (Task: Detect, Segment, Pose, etc.) |
| error | Error types (InferenceError, Result) |
| preprocessing | Image preprocessing utilities |
| postprocessing | Detection post-processing (NMS, decode) |
| metadata | ONNX model metadata parsing |
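For non-end-to-end exports (YOLO26 end-to-end exports are NMS-free), the post-processing stage applies a confidence filter followed by non-maximum suppression, governed by the conf and IoU thresholds above. A minimal greedy NMS sketch to illustrate the idea (illustrative only, not the crate's implementation; boxes here are (x1, y1, x2, y2, score)):

```rust
// Intersection-over-union of two corner-form boxes with scores.
fn iou(a: &[f32; 5], b: &[f32; 5]) -> f32 {
    let ix = (a[2].min(b[2]) - a[0].max(b[0])).max(0.0);
    let iy = (a[3].min(b[3]) - a[1].max(b[1])).max(0.0);
    let inter = ix * iy;
    let area = |r: &[f32; 5]| (r[2] - r[0]) * (r[3] - r[1]);
    inter / (area(a) + area(b) - inter)
}

// Greedy NMS: drop low-confidence boxes, then keep each remaining
// box only if it does not overlap an already-kept box above iou_thr.
fn nms(mut boxes: Vec<[f32; 5]>, conf: f32, iou_thr: f32) -> Vec<[f32; 5]> {
    boxes.retain(|b| b[4] >= conf);
    boxes.sort_by(|a, b| b[4].partial_cmp(&a[4]).unwrap()); // best first
    let mut keep: Vec<[f32; 5]> = Vec::new();
    for b in boxes {
        if keep.iter().all(|k| iou(k, &b) < iou_thr) {
            keep.push(b);
        }
    }
    keep
}

fn main() {
    let dets = vec![
        [0.0, 0.0, 10.0, 10.0, 0.9],
        [0.5, 0.5, 10.5, 10.5, 0.8], // heavy overlap with the first: suppressed
        [50.0, 50.0, 60.0, 60.0, 0.7], // disjoint: kept
        [0.0, 0.0, 5.0, 5.0, 0.1],   // below the confidence threshold: dropped
    ];
    println!("{} boxes kept", nms(dets, 0.25, 0.7).len());
}
```

With the default thresholds (conf 0.25, IoU 0.7), two of the four candidate boxes above survive.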
§Feature Flags
| Feature | Description |
|---|---|
| annotate | Image annotation support (default) |
| visualize | Real-time window display (default) |
| video | Video file support (FFmpeg) |
| cuda | NVIDIA CUDA acceleration |
| tensorrt | NVIDIA TensorRT optimization |
| coreml | Apple CoreML (macOS/iOS) |
| openvino | Intel OpenVINO |
§License
This project is dual-licensed under AGPL-3.0 for open-source use or Ultralytics Enterprise License for commercial applications.
§Re-exports
pub use device::Device;
pub use error::InferenceError;
pub use error::Result;
pub use inference::InferenceConfig;
pub use model::YOLOModel;
pub use results::Boxes;
pub use results::Keypoints;
pub use results::Masks;
pub use results::Obb;
pub use results::Probs;
pub use results::Results;
pub use results::Speed;
pub use source::Source;
pub use source::SourceIterator;
pub use source::SourceMeta;
pub use task::Task;
pub use metadata::ModelMetadata;
pub use preprocessing::PreprocessResult;
pub use preprocessing::preprocess_image;
pub use preprocessing::preprocess_image_with_precision;
§Modules
- annotate
- Image annotation utilities.
- batch
- Batch processing module for YOLO inference.
- cli
- CLI module for running inference.
- device
- Hardware device support and abstraction.
- download
- Model downloading utilities.
- error
- Error types for the inference library.
- inference
- Inference configuration and common types.
- io
- I/O utilities for saving results including video encoding.
- logging
- Logging utilities.
- metadata
- ONNX model metadata parsing.
- model
- YOLO model loading and inference.
- postprocessing
- Post-processing for YOLO model outputs.
- preprocessing
- Image preprocessing for YOLO inference.
- results
- Results classes for YOLO inference output.
- source
- Input source handling for YOLO inference.
- task
- Task definitions for YOLO models.
- utils
- Utility functions for the inference library.
- visualizer
- Visualization tools for inference results.
§Macros
- error
- Macro for error messages.
- info
- Macro for standard info messages.
- section
- Macro for section headers.
- success
- Macro for success messages.
- verbose
- Macro for verbose messages.
- warn
- Macro for warning messages.