usls is a cross-platform Rust library powered by ONNX Runtime for efficient inference of SOTA vision and vision-language models (typically under 1B parameters).
## 📚 Documentation
## 🚀 Quick Start
Run the YOLO demo to explore the YOLO series across different tasks, precisions, and execution providers:

- **Tasks:** `detect`, `segment`, `pose`, `classify`, `obb`
- **Versions:** YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLO11, YOLOv12, YOLOv13
- **Scales:** `n`, `s`, `m`, `l`, `x`
- **Precision:** `fp32`, `fp16`, `q8`, `q4`, `q4f16`, `bnb4`
- **Execution Providers:** CPU, CUDA, TensorRT, CoreML, OpenVINO, and more
```shell
# CPU: Object detection, YOLOv8n, FP16
cargo run -r --example yolo -- --task detect --ver 8 --scale n --dtype fp16

# NVIDIA CUDA: Instance segmentation, YOLO11m
cargo run -r -F cuda --example yolo -- --task segment --ver 11 --scale m --device cuda:0

# NVIDIA TensorRT
cargo run -r -F tensorrt --example yolo -- --device tensorrt:0

# Apple Silicon CoreML
cargo run -r -F coreml --example yolo -- --device coreml

# Intel OpenVINO: CPU/GPU/VPU acceleration
cargo run -r -F openvino -F ort-load-dynamic --example yolo -- --device openvino:CPU

# Show all available options
cargo run -r --example yolo -- --help
```
See YOLO Examples for more details and use cases.
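Beyond the prebuilt examples, the library can be driven directly from Rust. The sketch below shows the general shape of an inference loop; the builder methods, type names, and file paths (`Options::yolo_detect`, `with_model_file`, `DataLoader::try_read_n`, `yolo11n.onnx`) are illustrative assumptions modeled on the examples directory, not a verbatim API reference — consult the documentation for the current signatures.

```rust
// Hypothetical sketch of a usls inference loop. Names such as
// `Options::yolo_detect`, `with_model_file`, and `DataLoader::try_read_n`
// are assumptions based on the bundled examples, not guaranteed API.
use usls::{models::YOLO, Annotator, DataLoader, Options};

fn main() -> anyhow::Result<()> {
    // Configure the model (the ONNX file path is a placeholder).
    let options = Options::yolo_detect().with_model_file("yolo11n.onnx");

    // Build the model; execution provider and precision come from the options.
    let mut model = YOLO::new(options.commit()?)?;

    // Read a batch of images and run inference.
    let xs = DataLoader::try_read_n(&["./assets/bus.jpg"])?;
    let ys = model.forward(&xs)?;

    // Draw the predictions onto the inputs.
    let annotator = Annotator::default();
    for (x, y) in xs.iter().zip(ys.iter()) {
        annotator.annotate(x, y)?;
    }
    Ok(())
}
```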
## ⚙️ Installation
Add the following to your `Cargo.toml`:

```toml
[dependencies]
# Use the GitHub version
usls = { git = "https://github.com/jamjamjon/usls", features = ["cuda"] }

# Alternative: use the crates.io version
usls = { version = "latest-version", features = ["cuda"] }
```
## 📦 Cargo Features
❕ Features in italics are enabled by default.
- **Runtime & Utilities**

  - `ort-download-binaries`: Auto-download ONNX Runtime binaries from pyke.
  - `ort-load-dynamic`: Link ONNX Runtime yourself. Use this if pyke doesn't provide prebuilt binaries for your platform or you want to link your local ONNX Runtime library. See the Linking Guide for more details.
  - `viewer`: Image/video visualization (minifb), similar to OpenCV's `imshow()`. See example.
  - `video`: Video I/O support (video-rs). Enable this to read/write video streams. See example.
  - `hf-hub`: Hugging Face Hub support for downloading models from Hugging Face repositories.
  - `tokenizers`: Tokenizer support for vision-language models. Automatically enabled by the vision-language model features (`blip`, `clip`, `florence2`, `grounding-dino`, `fastvlm`, `moondream2`, `owl`, `smolvlm`, `trocr`, `yoloe`).
  - `slsl`: SLSL tensor library support. Automatically enabled by the `yolo` and `clip` features.
- **Execution Providers**

  Hardware acceleration for inference.

  - `cuda`, `tensorrt`: NVIDIA GPU acceleration
  - `coreml`: Apple Silicon acceleration
  - `openvino`: Intel CPU/GPU/VPU acceleration
  - `onednn`, `directml`, `xnnpack`, `rocm`, `cann`, `rknpu`, `acl`, `nnapi`, `armnn`, `tvm`, `qnn`, `migraphx`, `vitis`, `azure`: Various hardware/platform support

  See the ONNX Runtime docs and the ORT performance guide for details.
- **Model Selection**

  Almost every model has its own feature flag. Enable only what you need to reduce compile time and binary size.

  - Individual models: `yolo`, `sam`, `clip`, `image-classifier`, `dino`, `rtmpose`, `rtdetr`, `db`, ...
  - All models: `all-models` (enables all model features)

  See Supported Models for the complete list of feature names.
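As a concrete illustration, a build that runs YOLO on CUDA with the image viewer might combine features like this (feature names are taken from the lists above; the version string is a placeholder):

```toml
[dependencies]
# Enable only the model and execution-provider features you need.
usls = { version = "latest-version", features = ["yolo", "cuda", "viewer"] }
```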
## ⚡ Supported Models
| Model | Task / Description | Feature | Example |
|---|---|---|---|
| BEiT | Image Classification | image-classifier | demo |
| ConvNeXt | Image Classification | image-classifier | demo |
| FastViT | Image Classification | image-classifier | demo |
| MobileOne | Image Classification | image-classifier | demo |
| DeiT | Image Classification | image-classifier | demo |
| DINOv2 | Vision Embedding | dino | demo |
| DINOv3 | Vision Embedding | dino | demo |
| YOLOv5 | Image Classification<br />Object Detection<br />Instance Segmentation | yolo | demo |
| YOLOv6 | Object Detection | yolo | demo |
| YOLOv7 | Object Detection | yolo | demo |
| YOLOv8<br />YOLO11 | Object Detection<br />Instance Segmentation<br />Image Classification<br />Oriented Object Detection<br />Keypoint Detection | yolo | demo |
| YOLOv9 | Object Detection | yolo | demo |
| YOLOv10 | Object Detection | yolo | demo |
| YOLOv12 | Image Classification<br />Object Detection<br />Instance Segmentation | yolo | demo |
| YOLOv13 | Object Detection | yolo | demo |
| RT-DETR | Object Detection | rtdetr | demo |
| RF-DETR | Object Detection | rfdetr | demo |
| PP-PicoDet | Object Detection | picodet | demo |
| DocLayout-YOLO | Object Detection | picodet | demo |
| D-FINE | Object Detection | rtdetr | demo |
| DEIM | Object Detection | rtdetr | demo |
| DEIMv2 | Object Detection | rtdetr | demo |
| RTMPose | Keypoint Detection | rtmpose | demo |
| DWPose | Keypoint Detection | rtmpose | demo |
| RTMW | Keypoint Detection | rtmpose | demo |
| RTMO | Keypoint Detection | rtmo | demo |
| SAM | Segment Anything | sam | demo |
| SAM2 | Segment Anything | sam | demo |
| MobileSAM | Segment Anything | sam | demo |
| EdgeSAM | Segment Anything | sam | demo |
| SAM-HQ | Segment Anything | sam | demo |
| FastSAM | Instance Segmentation | yolo | demo |
| YOLO-World | Open-Set Detection With Language | yolo | demo |
| YOLOE | Open-Set Detection And Segmentation | yoloe | demo-prompt-free<br />demo-prompt (visual & textual) |
| GroundingDINO | Open-Set Detection With Language | grounding-dino | demo |
| CLIP | Vision-Language Embedding | clip | demo |
| jina-clip-v1 | Vision-Language Embedding | clip | demo |
| jina-clip-v2 | Vision-Language Embedding | clip | demo |
| mobileclip & mobileclip2 | Vision-Language Embedding | clip | demo |
| BLIP | Image Captioning | blip | demo |
| DB (PaddleOCR-Det) | Text Detection | db | demo |
| FAST | Text Detection | db | demo |
| LinkNet | Text Detection | db | demo |
| SVTR (PaddleOCR-Rec) | Text Recognition | svtr | demo |
| SLANet | Table Recognition | slanet | demo |
| TrOCR | Text Recognition | trocr | demo |
| YOLOPv2 | Panoptic Driving Perception | yolop | demo |
| DepthAnything v1<br />DepthAnything v2 | Monocular Depth Estimation | depth-anything | demo |
| DepthPro | Monocular Depth Estimation | depth-pro | demo |
| MODNet | Image Matting | modnet | demo |
| Sapiens | Foundation for Human Vision Models | sapiens | demo |
| Florence2 | A Variety of Vision Tasks | florence2 | demo |
| Moondream2 | Open-Set Object Detection<br />Open-Set Keypoint Detection<br />Image Captioning<br />Visual Question Answering | moondream2 | demo |
| OWLv2 | Open-Set Object Detection | owl | demo |
| SmolVLM (256M, 500M) | Visual Question Answering | smolvlm | demo |
| FastVLM (0.5B) | Vision-Language Model | fastvlm | demo |
| RMBG (1.4, 2.0) | Image Segmentation<br />Background Removal | rmbg | demo |
| BEN2 | Image Segmentation<br />Background Removal | ben2 | demo |
| MediaPipe: Selfie-segmentation | Image Segmentation | mediapipe-segmenter | demo |
| Swin2SR | Image Super-Resolution and Restoration | swin2sr | demo |
| APISR | Real-World Anime Super-Resolution | apisr | demo |
## ❓ FAQ
See issues or open a new discussion.
## 🤝 Contributing
Contributions are welcome! If you have suggestions, bug reports, or want to add new features or models, feel free to open an issue or submit a pull request.
## 🙏 Acknowledgments
This project is built on top of ort (ONNX Runtime for Rust), which provides seamless Rust bindings for ONNX Runtime. Special thanks to the ort maintainers.
Thanks to all the open-source libraries and their maintainers that make this project possible. See Cargo.toml for a complete list of dependencies.
## 📜 License
This project is licensed under the terms of the LICENSE file.