Crate ultralytics_inference

§Ultralytics YOLO Inference Library

High-performance YOLO model inference library written in Rust, providing a safe and efficient interface for running Ultralytics YOLO models on images, videos, and streams.

§Features

  • High Performance - Pure Rust with zero-cost abstractions and SIMD-optimized preprocessing
  • ONNX Runtime - Leverages ONNX Runtime for cross-platform hardware acceleration
  • Supported YOLO Versions - YOLO26, YOLO11, and YOLOv8 (including YOLO26 end-to-end NMS-free exports)
  • All Tasks - Detection, segmentation, pose estimation, classification, OBB, and semantic segmentation (YOLO26 only)
  • Ultralytics API - Results API matches the Python package for easy migration
  • Multiple Backends - CPU, CUDA, TensorRT, CoreML, OpenVINO, and more
  • Multiple Sources - Images, directories, glob patterns, video, webcam, streams

§Installation

Add to your Cargo.toml:

[dependencies]
ultralytics-inference = "0.0.18"

Or install the CLI tool:

cargo install ultralytics-inference

§Quick Start (Library)

use ultralytics_inference::YOLOModel;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load model - metadata (classes, task, imgsz) is read automatically
    let mut model = YOLOModel::load("yolo26n.onnx")?;

    // Run inference
    let results = model.predict("image.jpg")?;

    // Process results
    for result in &results {
        if let Some(ref boxes) = result.boxes {
            println!("Found {} detections", boxes.len());
            for i in 0..boxes.len() {
                let cls = boxes.cls()[i] as usize;
                let conf = boxes.conf()[i];
                let name = result.names.get(&cls).map(|s| s.as_str()).unwrap_or("unknown");
                println!("  {} {:.2}", name, conf);
            }
        }
    }

    Ok(())
}

§CLI Usage

The ultralytics-inference CLI runs YOLO inference from the command line:

# Install the CLI
cargo install ultralytics-inference

# Run with defaults (auto-downloads model and sample images)
ultralytics-inference predict

# Select task: auto-downloads the matching nano model
ultralytics-inference predict --task segment
ultralytics-inference predict --task pose
ultralytics-inference predict --task obb
ultralytics-inference predict --task classify

# Run on a specific image
ultralytics-inference predict --model yolo26n.onnx --source image.jpg

# Run on a directory of images
ultralytics-inference predict --model yolo26n.onnx --source images/

# With custom thresholds
ultralytics-inference predict -m yolo26n.onnx -s image.jpg --conf 0.5 --iou 0.7

# Filter by class IDs
ultralytics-inference predict --source image.jpg --classes "0,1,2"

# With visualization window
ultralytics-inference predict --model yolo26n.onnx --source video.mp4 --show

# Save annotated results
ultralytics-inference predict --model yolo26n.onnx --source image.jpg --save

# Save individual frames for video input
ultralytics-inference predict --source video.mp4 --save-frames

# Show help
ultralytics-inference help

# Show version
ultralytics-inference version

CLI Options:

| Option | Short | Description | Default |
|---|---|---|---|
| --model | -m | Path to ONNX model file; auto-downloaded if a known YOLO26/YOLO11/YOLOv8 name | yolo26n.onnx |
| --task | | Task type (detect, segment, pose, obb, classify, semantic*); selects nano model when --model is omitted | detect |
| --source | -s | Input source (image, directory, glob, video, webcam index, or URL) | Task-dependent sample assets |
| --conf | | Confidence threshold | 0.25 |
| --iou | | IoU threshold for NMS | 0.7 |
| --max-det | | Maximum number of detections | 300 |
| --imgsz | | Inference image size | Model metadata |
| --rect | | Enable rectangular inference (minimal padding) | true |
| --batch | | Batch size for inference | 1 |
| --half | | Use FP16 half-precision inference | false |
| --save | | Save annotated results to runs/<task>/predict | true |
| --save-frames | | Save individual frames for video input | false |
| --save-json | | Save semantic segmentation class-map PNGs for external evaluation | false |
| --show | | Display results in a window | false |
| --device | | Device (cpu, cuda:0, coreml, directml:0, openvino, tensorrt:0, xnnpack) | cpu |
| --verbose | | Show verbose output | true |
| --classes | | Filter by class IDs, e.g. 0 or "0,1,2" or "[0, 1, 2]" | all classes |

* semantic (semantic segmentation) is YOLO26-only.
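
The --classes option accepts three spellings of the same filter (0, "0,1,2", and "[0, 1, 2]"). As a rough illustration of how such a value could be normalized, here is a standard-library sketch. The function name parse_classes and its duplicate handling are assumptions made for this example and are not taken from the crate's CLI code.

```rust
/// Hypothetical helper: parse a class filter such as `0`, `0,1,2`, or `[0, 1, 2]`.
fn parse_classes(input: &str) -> Result<Vec<usize>, String> {
    // Strip surrounding whitespace and an optional pair of brackets.
    let trimmed = input.trim();
    let inner = trimmed
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .unwrap_or(trimmed);

    let mut ids = Vec::new();
    for part in inner.split(',') {
        let part = part.trim();
        if part.is_empty() {
            continue; // tolerate trailing commas and empty input
        }
        let id: usize = part
            .parse()
            .map_err(|_| format!("invalid class id: {part:?}"))?;
        if !ids.contains(&id) {
            ids.push(id); // keep the first occurrence, drop duplicates
        }
    }
    Ok(ids)
}

fn main() {
    for arg in ["0", "0,1,2", "[0, 1, 2]"] {
        println!("{arg:?} -> {:?}", parse_classes(arg));
    }
}
```

All three spellings from the table parse to a list of class IDs, while a value such as "0,person" is rejected with an error instead of being silently ignored.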

§Task-Specific Examples

The library supports all YOLO tasks. Export models to ONNX with the yolo command from the Ultralytics Python package:

# Detection (default)
yolo export model=yolo26n.pt format=onnx

# Segmentation
yolo export model=yolo26n-seg.pt format=onnx

# Pose Estimation
yolo export model=yolo26n-pose.pt format=onnx

# Classification
yolo export model=yolo26n-cls.pt format=onnx

# Oriented Bounding Boxes
yolo export model=yolo26n-obb.pt format=onnx

# Semantic Segmentation (YOLO26 only)
yolo export model=yolo26n-sem.pt format=onnx

The task is auto-detected from ONNX metadata:

use ultralytics_inference::YOLOModel;

// Detection model - returns bounding boxes
let mut model = YOLOModel::load("yolo26n.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref boxes) = results[0].boxes {
    println!("Found {} detections", boxes.len());
    for i in 0..boxes.len() {
        let cls = boxes.cls()[i] as usize;
        let conf = boxes.conf()[i];
        let name = results[0].names.get(&cls).map(|s| s.as_str()).unwrap_or("unknown");
        println!("  {} {:.2}", name, conf);
    }
}

// Segmentation model - returns instance masks
let mut model = YOLOModel::load("yolo26n-seg.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref masks) = results[0].masks {
    println!("Found {} instance masks", masks.len());
}

// Pose model - returns keypoints
let mut model = YOLOModel::load("yolo26n-pose.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref keypoints) = results[0].keypoints {
    println!("Found {} poses", keypoints.len());
}

// OBB model - returns oriented bounding boxes
let mut model = YOLOModel::load("yolo26n-obb.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref obb) = results[0].obb {
    println!("Found {} oriented boxes", obb.len());
    for i in 0..obb.len() {
        let conf = obb.conf()[i];
        let cls = obb.cls()[i] as usize;
        let name = results[0].names.get(&cls).map(|s| s.as_str()).unwrap_or("unknown");
        println!("  {} {:.2}", name, conf);
    }
}

// Classification model - returns probabilities
let mut model = YOLOModel::load("yolo26n-cls.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref probs) = results[0].probs {
    println!("Top-1: class {} ({:.1}%)", probs.top1(), probs.top1conf() * 100.0);
}

// Semantic segmentation model (YOLO26 only) - returns a per-pixel class map
let mut model = YOLOModel::load("yolo26n-sem.onnx")?;
let results = model.predict("image.jpg")?;
if let Some(ref sem) = results[0].semantic_mask {
    println!("Semantic mask shape: {:?}", sem.data.shape());
}
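
The semantic segmentation result is described as a per-pixel class map, and the module overview mentions argmax as its post-processing step. The sketch below shows that idea on a plain flat buffer. The [classes, height, width] layout, the u8 class IDs, and the function name argmax_class_map are assumptions for illustration; the crate's SemanticMask stores its data in its own array type.

```rust
/// Per-pixel argmax over class scores stored as a flat `[classes, height, width]` buffer.
fn argmax_class_map(scores: &[f32], classes: usize, height: usize, width: usize) -> Vec<u8> {
    assert_eq!(scores.len(), classes * height * width, "unexpected buffer length");
    assert!(classes > 0 && classes <= 256, "class ids must fit in a u8");

    let pixels = height * width;
    let mut class_map = vec![0u8; pixels];
    for p in 0..pixels {
        // Start with class 0 and replace it whenever a later class scores higher.
        let mut best = 0usize;
        for c in 1..classes {
            if scores[c * pixels + p] > scores[best * pixels + p] {
                best = c;
            }
        }
        class_map[p] = best as u8;
    }
    class_map
}

fn main() {
    // Three classes over a 1x2 image: pixel 0 favors class 2, pixel 1 favors class 0.
    let scores: [f32; 6] = [0.1, 0.9, 0.2, 0.3, 0.7, 0.1];
    let map = argmax_class_map(&scores, 3, 1, 2);
    println!("{map:?}");
}
```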

§Custom Configuration

Use the builder pattern to customize inference settings:

use ultralytics_inference::InferenceConfig;

let config = InferenceConfig::new()
    .with_confidence(0.5)    // Confidence threshold
    .with_iou(0.45)          // NMS IoU threshold
    .with_max_det(300)       // Max detections per image
    .with_imgsz(640, 640);   // Input image size
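
To make the three detection settings concrete, the following standard-library sketch shows what a confidence threshold, an NMS IoU threshold, and a detection cap typically do in a greedy, class-aware NMS pass. It is not the crate's post-processing code: the Detection struct, the iou and nms functions, and the class-aware behavior are all assumptions for illustration. The sketch would apply only to exports that need NMS, since YOLO26 end-to-end exports are NMS-free.

```rust
/// One candidate detection; the field names here are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Detection {
    xyxy: [f32; 4], // x1, y1, x2, y2 in pixels
    conf: f32,
    cls: usize,
}

/// Intersection over union of two corner-format boxes.
fn iou(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    let w = (a[2].min(b[2]) - a[0].max(b[0])).max(0.0);
    let h = (a[3].min(b[3]) - a[1].max(b[1])).max(0.0);
    let inter = w * h;
    let union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// Greedy class-aware NMS: drop low-confidence boxes, then keep the highest-scoring
/// box of every group of same-class boxes that overlap by more than `iou_thres`.
fn nms(mut dets: Vec<Detection>, conf_thres: f32, iou_thres: f32, max_det: usize) -> Vec<Detection> {
    dets.retain(|d| d.conf >= conf_thres);
    dets.sort_by(|a, b| b.conf.total_cmp(&a.conf)); // highest confidence first
    let mut kept: Vec<Detection> = Vec::new();
    for d in dets {
        if kept.len() == max_det {
            break;
        }
        let suppressed = kept
            .iter()
            .any(|k| k.cls == d.cls && iou(&k.xyxy, &d.xyxy) > iou_thres);
        if !suppressed {
            kept.push(d);
        }
    }
    kept
}

/// Runs the sketch with the thresholds used in the configuration example above.
fn demo() -> Vec<Detection> {
    let dets = vec![
        Detection { xyxy: [0.0, 0.0, 10.0, 10.0], conf: 0.9, cls: 0 },
        Detection { xyxy: [1.0, 1.0, 10.0, 10.0], conf: 0.8, cls: 0 }, // IoU 0.81 with the first box
        Detection { xyxy: [20.0, 20.0, 30.0, 30.0], conf: 0.6, cls: 0 },
        Detection { xyxy: [0.0, 0.0, 10.0, 10.0], conf: 0.3, cls: 0 }, // below the confidence threshold
    ];
    nms(dets, 0.5, 0.45, 300)
}

fn main() {
    for d in demo() {
        println!("cls {} conf {:.2} box {:?}", d.cls, d.conf, d.xyxy);
    }
}
```

Raising the confidence threshold removes weak candidates before NMS runs, while lowering the IoU threshold makes suppression more aggressive for neighboring boxes of the same class.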

§Hardware Acceleration

See the CUDA / TensorRT acceleration guide (the cuda_guide module, rendered from docs/CUDA.md) for setup, requirements, and the zero-copy GPU preprocess fast path.

Enable hardware acceleration with Cargo features:

# NVIDIA CUDA
cargo build --release --features cuda

# NVIDIA TensorRT
cargo build --release --features tensorrt

# NVIDIA GPU preprocess + zero-copy device input
# (requires CUDA toolkit; see docs/CUDA.md)
cargo build --release --features cuda-preprocess

# Apple CoreML
cargo build --release --features coreml

# Intel OpenVINO
cargo build --release --features openvino

§Results API

The Results struct provides access to inference outputs:

  • Boxes - Bounding boxes with xyxy(), xywh(), xyxyn(), xywhn(), conf(), cls() methods
  • Masks - Segmentation masks with data, orig_shape fields
  • Keypoints - Pose keypoints with xy(), xyn(), conf() methods
  • Probs - Classification probabilities with top1(), top5(), top1conf(), top5conf() methods
  • Obb - Oriented bounding boxes with xyxyxyxy(), xywhr(), conf(), cls() methods
  • SemanticMask - Per-pixel class map (YOLO26 semantic segmentation) with a data field
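
The box accessors differ only in coordinate convention. In the Ultralytics Python package, xyxy is corner format, xywh is center format, and the n-suffixed variants are divided by the original image size. The sketch below assumes the Rust accessors follow the same conventions, since the Results API is stated to match the Python package. The helper names are hypothetical and operate on plain arrays, not on the crate's Boxes type.

```rust
/// Corner format (x1, y1, x2, y2) to center format (cx, cy, w, h).
fn xyxy_to_xywh(b: [f32; 4]) -> [f32; 4] {
    [(b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0, b[2] - b[0], b[3] - b[1]]
}

/// Divide x-like values by the image width and y-like values by the image height.
fn normalize(b: [f32; 4], width: f32, height: f32) -> [f32; 4] {
    [b[0] / width, b[1] / height, b[2] / width, b[3] / height]
}

fn main() {
    let (img_w, img_h) = (640.0, 480.0);
    let xyxy = [100.0, 50.0, 300.0, 250.0];

    let xywh = xyxy_to_xywh(xyxy);
    let xyxyn = normalize(xyxy, img_w, img_h);
    let xywhn = normalize(xywh, img_w, img_h);

    println!("xyxy  = {xyxy:?}");
    println!("xywh  = {xywh:?}");
    println!("xyxyn = {xyxyn:?}");
    println!("xywhn = {xywhn:?}");
}
```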

§Module Overview

| Module | Description |
|---|---|
| model | Core YOLOModel for loading models and running inference |
| results | Output types (Results, Boxes, Masks, etc.) |
| inference | InferenceConfig for customizing inference settings |
| source | Input source handling (Source, SourceIterator) |
| task | YOLO task types (Task: Detect, Segment, Pose, Classify, Obb, Semantic) |
| error | Error types (InferenceError, Result) |
| preprocessing | Image preprocessing utilities |
| postprocessing | Post-processing for all tasks (NMS/decode for detection; argmax for semantic segmentation) |
| metadata | ONNX model metadata parsing |
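
The preprocessing entry is brief, so here is a sketch of the geometry that YOLO preprocessing usually involves: an aspect-preserving resize followed by padding. With rectangular inference the padding is kept minimal, while square inference pads to the full target size. The Letterbox struct, the letterbox function, and the stride of 32 are assumptions for illustration and do not describe the crate's preprocess_image implementation.

```rust
/// Resize-and-pad geometry for one image; all names here are illustrative.
#[derive(Debug, PartialEq)]
struct Letterbox {
    scale: f32, // factor applied to both image dimensions
    new_w: u32, // resized width before padding
    new_h: u32, // resized height before padding
    pad_w: u32, // total horizontal padding
    pad_h: u32, // total vertical padding
}

fn letterbox(src_w: u32, src_h: u32, target: u32, rect: bool, stride: u32) -> Letterbox {
    // Scale so the longer side fits the target while preserving aspect ratio.
    let scale = (target as f32 / src_w as f32).min(target as f32 / src_h as f32);
    let new_w = (src_w as f32 * scale).round() as u32;
    let new_h = (src_h as f32 * scale).round() as u32;

    let (pad_w, pad_h) = if rect {
        // Minimal padding: round each side up to the next multiple of the stride.
        ((stride - new_w % stride) % stride, (stride - new_h % stride) % stride)
    } else {
        // Square input: pad both sides up to the full target size.
        (target.saturating_sub(new_w), target.saturating_sub(new_h))
    };
    Letterbox { scale, new_w, new_h, pad_w, pad_h }
}

fn main() {
    // A 1280x720 frame at imgsz 640, with and without rectangular inference.
    println!("{:?}", letterbox(1280, 720, 640, true, 32));
    println!("{:?}", letterbox(1280, 720, 640, false, 32));
}
```

For the 1280x720 frame, the rectangular variant pads the 640x360 resize to 640x384, whereas the square variant pads it to 640x640, which is why rectangular inference processes fewer padded pixels.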

§Feature Flags

Default features (enabled unless --no-default-features is passed): annotate, visualize.

| Feature | Description |
|---|---|
| annotate | Image annotation for --save (default) |
| visualize | Real-time window display for --show (default) |
| video | Video file decoding/encoding (requires FFmpeg) |
| cuda | NVIDIA CUDA acceleration |
| tensorrt | NVIDIA TensorRT optimization |
| coreml | Apple CoreML (macOS/iOS) |
| openvino | Intel OpenVINO |
| onednn | Intel oneDNN |
| rocm | AMD ROCm |
| migraphx | AMD MIGraphX |
| directml | DirectML (Windows) |
| nnapi | Android Neural Networks API |
| qnn | Qualcomm Neural Networks |
| xnnpack | XNNPACK (cross-platform) |
| acl | ARM Compute Library |
| armnn | ARM NN |
| tvm | Apache TVM |
| rknpu | Rockchip NPU |
| cann | Huawei CANN |
| webgpu | WebGPU |
| azure | Azure |
| nvidia | Convenience: cuda + tensorrt |
| amd | Convenience: rocm + migraphx |
| intel | Convenience: openvino + onednn |
| mobile | Convenience: nnapi + coreml + qnn |
| all | Convenience: annotate + visualize + video |
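
When the crate is used as a dependency and not built from its own checkout, the same features can be selected in Cargo.toml using standard Cargo syntax. The fragment below is a sketch: the version number is the one from the installation section, and the chosen features are only examples.

```toml
[dependencies]
# Default features (annotate, visualize) plus CUDA acceleration
ultralytics-inference = { version = "0.0.18", features = ["cuda"] }

# Alternative: drop the default features and enable only video support
# ultralytics-inference = { version = "0.0.18", default-features = false, features = ["video"] }
```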

§License

This project is dual-licensed: AGPL-3.0 for open-source use, or the Ultralytics Enterprise License for commercial applications.

§Re-exports

pub use device::Device;
pub use error::InferenceError;
pub use error::Result;
pub use inference::InferenceConfig;
pub use model::YOLOModel;
pub use results::Boxes;
pub use results::Keypoints;
pub use results::Masks;
pub use results::Obb;
pub use results::Probs;
pub use results::Results;
pub use results::SemanticMask;
pub use results::Speed;
pub use source::Source;
pub use source::SourceIterator;
pub use source::SourceMeta;
pub use task::Task;
pub use metadata::ModelMetadata;
pub use preprocessing::PreprocessResult;
pub use preprocessing::preprocess_image;
pub use preprocessing::preprocess_image_with_precision;

§Modules

annotate (feature: annotate)
Image annotation utilities.
batch
Batch processing module for YOLO inference.
cli
CLI module for running inference.
cuda_guide
CUDA / TensorRT acceleration guide rendered from docs/CUDA.md.
device
Hardware device support and abstraction.
download
Model downloading utilities.
error
Error types for the inference library.
inference
Inference configuration and common types.
io
I/O utilities for saving results including video encoding.
logging
Logging utilities.
metadata
ONNX model metadata parsing.
model
YOLO model loading and inference.
postprocessing
Post-processing for YOLO model outputs.
preprocessing
Image preprocessing for YOLO inference.
results
Results classes for YOLO inference output.
source
Input source handling for YOLO inference.
task
Task definitions for YOLO models.
utils
Utility functions for the inference library.
visualizer
Visualization tools for inference results.

§Macros

error
Macro for error messages.
info
Macro for standard info messages.
section
Macro for section headers.
success
Macro for success messages.
verbose
Macro for verbose messages.
warn
Macro for warning messages.

§Constants

DISPLAY_NAME
Application display name.
NAME
Library name.
VERSION
Library version.