# usls

A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models, including YOLOv5, YOLOv8, YOLOv9, YOLOv10, RTDETR, CLIP, DINOv2, FastSAM, YOLO-World, BLIP, PaddleOCR, Depth-Anything, MODNet, and others.
Demo previews: Monocular Depth Estimation · Panoptic Driving Perception · Text Detection & Recognition · Portrait Matting
## Supported Models

| Model | Task / Type | Example | CUDA f32 | CUDA f16 | TensorRT f32 | TensorRT f16 |
|---|---|---|---|---|---|---|
| YOLOv5 | Classification, Object Detection, Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv6 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv7 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv8 | Object Detection, Instance Segmentation, Classification, Oriented Object Detection, Keypoint Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv9 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv10 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| RTDETR | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| FastSAM | Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ |
| YOLO-World | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| DINOv2 | Vision-Self-Supervised | demo | ✅ | ✅ | ✅ | ✅ |
| CLIP | Vision-Language | demo | ✅ | ✅ | visual ✅, textual ❌ | visual ✅, textual ❌ |
| BLIP | Vision-Language | demo | ✅ | ✅ | visual ✅, textual ❌ | visual ✅, textual ❌ |
| DB | Text Detection | demo | ✅ | ✅ | ✅ | ✅ |
| SVTR | Text Recognition | demo | ✅ | ✅ | ✅ | ✅ |
| RTMO | Keypoint Detection | demo | ✅ | ✅ | ❌ | ❌ |
| YOLOPv2 | Panoptic Driving Perception | demo | ✅ | ✅ | ✅ | ✅ |
| Depth-Anything(v1, v2) | Monocular Depth Estimation | demo | ✅ | ✅ | ❌ | ❌ |
| MODNet | Image Matting | demo | ✅ | ✅ | ✅ | ✅ |
## Installation

Refer to the ort docs.

- Download the prebuilt library from the ONNXRuntime Releases page.
- Then link against it:

```shell
export ORT_DYLIB_PATH=/Users/qweasd/Desktop/onnxruntime-osx-arm64-1.17.1/lib/libonnxruntime.1.17.1.dylib
```
## Quick Start

```shell
cargo run -r --example yolo   # blip, clip, yolop, svtr, db, ...
```
## Integrate into your own project
1. Add `usls` as a dependency to your project's `Cargo.toml`:

```shell
cargo add usls
```

Or use a specific commit:

```toml
usls = { git = "https://github.com/jamjamjon/usls", rev = "???sha???" }
```
2. Build a model:

```rust
let options = Options::default()
    .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
    .with_yolo_task(YOLOTask::Detect)   // YOLOTask: Classify, Detect, Pose, Segment, Obb
    .with_model("path/to/model.onnx")?;
let mut model = YOLO::new(options)?;
```
- If you want to run your model with TensorRT or CoreML:

  ```rust
  let options = Options::default()
      .with_trt(0); // CUDA is used by default
  // let options = Options::default().with_coreml(0);
  ```

- If your model has dynamic shapes:

  ```rust
  let options = Options::default()
      .with_i00((1, 2, 4).into())        // dynamic batch
      .with_i02((416, 640, 800).into())  // dynamic height
      .with_i03((416, 640, 800).into()); // dynamic width
  ```

- If you want to set a confidence threshold for each category:

  ```rust
  let options = Options::default()
      .with_confs(&[0.4, 0.15]); // class_0: 0.4, others: 0.15
  ```

- Go check Options for more model options.
3. Load images:

- Build a `DataLoader` to load images:

  ```rust
  let dl = DataLoader::default()
      .with_batch(1)
      .load("./assets")?;

  for (xs, _paths) in dl {
      let _ys = model.run(&xs)?;
  }
  ```

- Or simply read one image:

  ```rust
  let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
  let y = model.run(&x)?;
  ```
4. Annotate and save:

```rust
let annotator = Annotator::default().with_saveout("YOLO");
annotator.annotate(&x, &y);
```
5. Get results:

The inference outputs of the provided models are saved in a `Vec<Y>`.

- You can get detection bboxes with `y.bboxes()`:

  ```rust
  let ys = model.run(&xs)?;
  for y in ys {
      if let Some(bboxes) = y.bboxes() {
          for bbox in bboxes {
              // process each bbox here
          }
      }
  }
  ```
- Other results: refer to the Docs.