usls is a cross-platform Rust library powered by ONNX Runtime for efficient inference of SOTA vision and vision-language models (typically under 1B parameters).
## 📚 Documentation
## 🚀 Quick Start
Run the YOLO demo to explore the YOLO series across different tasks, precisions, and execution providers:

- **Tasks:** `detect`, `segment`, `pose`, `classify`, `obb`
- **Versions:** YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLO11, YOLOv12, YOLOv13
- **Scales:** `n`, `s`, `m`, `l`, `x`
- **Precision:** `fp32`, `fp16`, `q8`, `q4`, `q4f16`, `bnb4`
- **Execution Providers:** CPU, CUDA, TensorRT, CoreML, OpenVINO, and more
```shell
# CPU: Object detection, YOLOv8n, FP16
cargo run -r --example yolo -- --task detect --ver 8 --scale n --dtype fp16

# NVIDIA CUDA: Instance segmentation, YOLO11m
cargo run -r -F cuda --example yolo -- --task segment --ver 11 --scale m --device cuda:0

# NVIDIA TensorRT
cargo run -r -F tensorrt --example yolo -- --device tensorrt:0

# Apple Silicon CoreML
cargo run -r -F coreml --example yolo -- --device coreml

# Intel OpenVINO: CPU/GPU/VPU acceleration
cargo run -r -F openvino -F ort-load-dynamic --example yolo -- --device openvino:CPU

# Show all available options
cargo run -r --example yolo -- --help
```
See YOLO Examples for more details and use cases.
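Beyond the prebuilt examples, the library can be driven directly from Rust. The sketch below shows the general shape of an inference loop; the builder methods, type names, and file paths (`Options::yolo_detect`, `with_model_file`, `DataLoader::try_read_n`, `yolo11n.onnx`) are illustrative assumptions modeled on the examples directory, not a verbatim API reference — consult the documentation for the current signatures.

```rust
// Hypothetical sketch of a usls inference loop. Names such as
// `Options::yolo_detect`, `with_model_file`, and `DataLoader::try_read_n`
// are assumptions based on the bundled examples, not guaranteed API.
use usls::{models::YOLO, Annotator, DataLoader, Options};

fn main() -> anyhow::Result<()> {
    // Configure the model (the ONNX file path is a placeholder).
    let options = Options::yolo_detect().with_model_file("yolo11n.onnx");

    // Build the model; execution provider and precision come from the options.
    let mut model = YOLO::new(options.commit()?)?;

    // Read a batch of images and run inference.
    let xs = DataLoader::try_read_n(&["./assets/bus.jpg"])?;
    let ys = model.forward(&xs)?;

    // Draw the predictions onto the inputs.
    let annotator = Annotator::default();
    for (x, y) in xs.iter().zip(ys.iter()) {
        annotator.annotate(x, y)?;
    }
    Ok(())
}
```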
## ⚙️ Installation
Add the following to your `Cargo.toml`:

```toml
[dependencies]
# Use the GitHub version
usls = { git = "https://github.com/jamjamjon/usls", features = ["cuda"] }

# Alternative: use the crates.io version
usls = { version = "latest-version", features = ["cuda"] }
```
## 📦 Cargo Features
❕ Features in italics are enabled by default.
- **Runtime & Utilities**

  - `ort-download-binaries`: Auto-download ONNX Runtime binaries from pyke.
  - `ort-load-dynamic`: Link ONNX Runtime yourself. Use this if pyke doesn't provide prebuilt binaries for your platform or you want to link your local ONNX Runtime library. See the Linking Guide for more details.
  - `viewer`: Image/video visualization (minifb), similar to OpenCV's `imshow()`. See example.
  - `video`: Video I/O support (video-rs). Enable this to read/write video streams. See example.
  - `hf-hub`: Hugging Face Hub support for downloading models from Hugging Face repositories.
  - `tokenizers`: Tokenizer support for vision-language models. Automatically enabled by the vision-language model features (`blip`, `clip`, `florence2`, `grounding-dino`, `fastvlm`, `moondream2`, `owl`, `smolvlm`, `trocr`, `yoloe`).
  - `slsl`: SLSL tensor library support. Automatically enabled by the `yolo` and `clip` features.
- **Execution Providers**

  Hardware acceleration for inference.

  - `cuda`, `tensorrt`: NVIDIA GPU acceleration
  - `coreml`: Apple Silicon acceleration
  - `openvino`: Intel CPU/GPU/VPU acceleration
  - `onednn`, `directml`, `xnnpack`, `rocm`, `cann`, `rknpu`, `acl`, `nnapi`, `armnn`, `tvm`, `qnn`, `migraphx`, `vitis`, `azure`: Various hardware/platform support

  See the ONNX Runtime docs and the ORT performance guide for details.
- **Model Selection**

  Almost every model has its own feature flag. Enable only what you need to reduce compile time and binary size.

  - Individual models: `yolo`, `sam`, `clip`, `image-classifier`, `dino`, `rtmpose`, `rtdetr`, `db`, ...
  - All models: `all-models` (enables all model features)

  See Supported Models for the complete list of feature names.
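As a concrete illustration, a build that runs YOLO on CUDA with the image viewer might combine features like this (feature names are taken from the lists above; the version string is a placeholder):

```toml
[dependencies]
# Enable only the model and execution-provider features you need.
usls = { version = "latest-version", features = ["yolo", "cuda", "viewer"] }
```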
## ⚡ Supported Models
| Model | Task / Description | Feature | Example |
|---|---|---|---|
| BEiT | Image Classification | image-classifier | demo |
| ConvNeXt | Image Classification | image-classifier | demo |
| FastViT | Image Classification | image-classifier | demo |
| MobileOne | Image Classification | image-classifier | demo |
| DeiT | Image Classification | image-classifier | demo |
| DINOv2 | Vision Embedding | dino | demo |
| DINOv3 | Vision Embedding | dino | demo |
| YOLOv5 | Image Classification<br />Object Detection<br />Instance Segmentation | yolo | demo |
| YOLOv6 | Object Detection | yolo | demo |
| YOLOv7 | Object Detection | yolo | demo |
| YOLOv8<br />YOLO11 | Object Detection<br />Instance Segmentation<br />Image Classification<br />Oriented Object Detection<br />Keypoint Detection | yolo | demo |
| YOLOv9 | Object Detection | yolo | demo |
| YOLOv10 | Object Detection | yolo | demo |
| YOLOv12 | Image Classification<br />Object Detection<br />Instance Segmentation | yolo | demo |
| YOLOv13 | Object Detection | yolo | demo |
| RT-DETR | Object Detection | rtdetr | demo |
| RF-DETR | Object Detection | rfdetr | demo |
| PP-PicoDet | Object Detection | picodet | demo |
| DocLayout-YOLO | Object Detection | picodet | demo |
| D-FINE | Object Detection | rtdetr | demo |
| DEIM | Object Detection | rtdetr | demo |
| DEIMv2 | Object Detection | rtdetr | demo |
| RTMPose | Keypoint Detection | rtmpose | demo |
| DWPose | Keypoint Detection | rtmpose | demo |
| RTMW | Keypoint Detection | rtmpose | demo |
| RTMO | Keypoint Detection | rtmo | demo |
| SAM | Segment Anything | sam | demo |
| SAM2 | Segment Anything | sam | demo |
| MobileSAM | Segment Anything | sam | demo |
| EdgeSAM | Segment Anything | sam | demo |
| SAM-HQ | Segment Anything | sam | demo |
| FastSAM | Instance Segmentation | yolo | demo |
| YOLO-World | Open-Set Detection With Language | yolo | demo |
| YOLOE | Open-Set Detection And Segmentation | yoloe | demo-prompt-free<br />demo-prompt (visual & textual) |
| GroundingDINO | Open-Set Detection With Language | grounding-dino | demo |
| CLIP | Vision-Language Embedding | clip | demo |
| jina-clip-v1 | Vision-Language Embedding | clip | demo |
| jina-clip-v2 | Vision-Language Embedding | clip | demo |
| mobileclip & mobileclip2 | Vision-Language Embedding | clip | demo |
| BLIP | Image Captioning | blip | demo |
| DB (PaddleOCR-Det) | Text Detection | db | demo |
| FAST | Text Detection | db | demo |
| LinkNet | Text Detection | db | demo |
| SVTR (PaddleOCR-Rec) | Text Recognition | svtr | demo |
| SLANet | Table Recognition | slanet | demo |
| TrOCR | Text Recognition | trocr | demo |
| YOLOPv2 | Panoptic Driving Perception | yolop | demo |
| DepthAnything v1<br />DepthAnything v2 | Monocular Depth Estimation | depth-anything | demo |
| DepthPro | Monocular Depth Estimation | depth-pro | demo |
| MODNet | Image Matting | modnet | demo |
| Sapiens | Foundation for Human Vision Models | sapiens | demo |
| Florence2 | A Variety of Vision Tasks | florence2 | demo |
| Moondream2 | Open-Set Object Detection<br />Open-Set Keypoint Detection<br />Image Captioning<br />Visual Question Answering | moondream2 | demo |
| OWLv2 | Open-Set Object Detection | owl | demo |
| SmolVLM (256M, 500M) | Visual Question Answering | smolvlm | demo |
| FastVLM (0.5B) | Vision-Language Model | fastvlm | demo |
| RMBG (1.4, 2.0) | Image Segmentation<br />Background Removal | rmbg | demo |
| BEN2 | Image Segmentation<br />Background Removal | ben2 | demo |
| MediaPipe: Selfie-segmentation | Image Segmentation | mediapipe-segmenter | demo |
| Swin2SR | Image Super-Resolution and Restoration | swin2sr | demo |
| APISR | Real-World Anime Super-Resolution | apisr | demo |
## ❓ FAQ
See issues or open a new discussion.
## 🤝 Contributing
Contributions are welcome! If you have suggestions, bug reports, or want to add new features or models, feel free to open an issue or submit a pull request.
## 🙏 Acknowledgments
This project is built on top of ort (ONNX Runtime for Rust), which provides seamless Rust bindings for ONNX Runtime. Special thanks to the ort maintainers.
Thanks to all the open-source libraries and their maintainers that make this project possible. See Cargo.toml for a complete list of dependencies.
## 📜 License
This project is licensed under the terms of the LICENSE file.