usls 0.0.15

A Rust library integrated with ONNXRuntime, providing a collection of ML models.
Documentation
<p align="center">
    <h2 align="center">usls</h2>
</p>

<p align="center">
    | <a href="https://docs.rs/usls"><strong>Documentation</strong></a> |
    <br>
    <br>
    <a href='https://github.com/microsoft/onnxruntime/releases'>
      <img src='https://img.shields.io/badge/ONNXRuntime-v1.19.x-239DFF?style=for-the-badge&logo=onnx' alt='ONNXRuntime Release Page'>
    </a>
    <a href='https://developer.nvidia.com/cuda-toolkit-archive'>
      <img src='https://img.shields.io/badge/CUDA-12.x-76B900?style=for-the-badge&logo=nvidia' alt='CUDA Toolkit Page'>
    </a>
    <a href='https://developer.nvidia.com/tensorrt'>
      <img src='https://img.shields.io/badge/TensorRT-10.x.x.x-76B900?style=for-the-badge&logo=nvidia' alt='TensorRT Page'>
    </a>
</p>

<p align="center">
   <a href='https://crates.io/crates/usls'>
      <img src='https://img.shields.io/crates/v/usls.svg?style=for-the-badge&logo=rust' alt='Crates Page'>
   </a>
   <!-- Documentation Badge -->
<!--    <a href="https://docs.rs/usls">
      <img src='https://img.shields.io/badge/Documents-usls-000000?style=for-the-badge&logo=docs.rs' alt='Documentation'>
   </a> -->
   <!-- Downloads Badge -->
   <a href="">
       <img alt="Crates.io Total Downloads" src="https://img.shields.io/crates/d/usls?style=for-the-badge&color=3ECC5F">
   </a>
    
</p>

**`usls`** is a Rust library integrated with **ONNXRuntime** that provides a collection of state-of-the-art models for **Computer Vision** and **Vision-Language** tasks, including:

- **YOLO Models**: [YOLOv5]https://github.com/ultralytics/yolov5, [YOLOv6]https://github.com/meituan/YOLOv6, [YOLOv7]https://github.com/WongKinYiu/yolov7, [YOLOv8]https://github.com/ultralytics/ultralytics, [YOLOv9]https://github.com/WongKinYiu/yolov9, [YOLOv10]https://github.com/THU-MIG/yolov10
- **SAM Models**: [SAM]https://github.com/facebookresearch/segment-anything, [SAM2]https://github.com/facebookresearch/segment-anything-2, [MobileSAM]https://github.com/ChaoningZhang/MobileSAM, [EdgeSAM]https://github.com/chongzhou96/EdgeSAM, [SAM-HQ]https://github.com/SysCV/sam-hq, [FastSAM]https://github.com/CASIA-IVA-Lab/FastSAM
- **Vision Models**: [RTDETR]https://arxiv.org/abs/2304.08069, [RTMO]https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo, [DB]https://arxiv.org/abs/1911.08947, [SVTR]https://arxiv.org/abs/2205.00159, [Depth-Anything-v1-v2]https://github.com/LiheYoung/Depth-Anything, [DINOv2]https://github.com/facebookresearch/dinov2, [MODNet]https://github.com/ZHKKKe/MODNet, [Sapiens]https://arxiv.org/abs/2408.12569
- **Vision-Language Models**: [CLIP]https://github.com/openai/CLIP, [BLIP]https://arxiv.org/abs/2201.12086, [GroundingDINO]https://github.com/IDEA-Research/GroundingDINO, [YOLO-World]https://github.com/AILab-CVC/YOLO-World, [Florence2]https://arxiv.org/abs/2311.06242

<details>
<summary>Click to expand Supported Models</summary>

## Supported Models

| Model                                                               | Task / Type                                                                                   | Example                    | CUDA f32 | CUDA f16 | TensorRT f32 | TensorRT f16 |
|---------------------------------------------------------------------|----------------------------------------------------------------------------------------------|----------------------------|----------|----------|--------------|--------------|
| [YOLOv5]https://github.com/ultralytics/yolov5                    | Classification<br>Object Detection<br>Instance Segmentation                                       | [demo]examples/yolo      |||||
| [YOLOv6]https://github.com/meituan/YOLOv6                        | Object Detection                                                                             | [demo]examples/yolo      |||||
| [YOLOv7]https://github.com/WongKinYiu/yolov7                     | Object Detection                                                                             | [demo]examples/yolo      |||||
| [YOLOv8]https://github.com/ultralytics/ultralytics                | Object Detection<br>Instance Segmentation<br>Classification<br>Oriented Object Detection<br>Keypoint Detection | [demo]examples/yolo      |||||
| [YOLOv9]https://github.com/WongKinYiu/yolov9                     | Object Detection                                                                             | [demo]examples/yolo      |||||
| [YOLOv10]https://github.com/THU-MIG/yolov10                      | Object Detection                                                                             | [demo]examples/yolo      |||||
| [RTDETR]https://arxiv.org/abs/2304.08069                         | Object Detection                                                                             | [demo]examples/yolo      |||||
| [FastSAM]https://github.com/CASIA-IVA-Lab/FastSAM                 | Instance Segmentation                                                                         | [demo]examples/yolo      |||||
| [SAM]https://github.com/facebookresearch/segment-anything         | Segment Anything                                                                             | [demo]examples/sam       |||              |              |
| [SAM2]https://github.com/facebookresearch/segment-anything-2      | Segment Anything                                                                             | [demo]examples/sam       |||              |              |
| [MobileSAM]https://github.com/ChaoningZhang/MobileSAM             | Segment Anything                                                                             | [demo]examples/sam       |||              |              |
| [EdgeSAM]https://github.com/chongzhou96/EdgeSAM                  | Segment Anything                                                                             | [demo]examples/sam       |||              |              |
| [SAM-HQ]https://github.com/SysCV/sam-hq                          | Segment Anything                                                                             | [demo]examples/sam       |||              |              |
| [YOLO-World]https://github.com/AILab-CVC/YOLO-World               | Object Detection                                                                             | [demo]examples/yolo      |||||
| [DINOv2]https://github.com/facebookresearch/dinov2               | Vision-Self-Supervised                                                                        | [demo]examples/dinov2    |||||
| [CLIP]https://github.com/openai/CLIP                             | Vision-Language                                                                             | [demo]examples/clip      ||| ✅ Visual<br>❌ Textual | ✅ Visual<br>❌ Textual |
| [BLIP]https://github.com/salesforce/BLIP                         | Vision-Language                                                                             | [demo]examples/blip      ||| ✅ Visual<br>❌ Textual | ✅ Visual<br>❌ Textual |
| [DB]https://arxiv.org/abs/1911.08947                             | Text Detection                                                                               | [demo]examples/db        |||||
| [SVTR]https://arxiv.org/abs/2205.00159                           | Text Recognition                                                                            | [demo]examples/svtr      |||||
| [RTMO]https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo | Keypoint Detection                                                                          | [demo]examples/rtmo      |||||
| [YOLOPv2]https://arxiv.org/abs/2208.11434                        | Panoptic Driving Perception                                                                   | [demo]examples/yolop     |||||
| [Depth-Anything]https://github.com/LiheYoung/Depth-Anything      | Monocular Depth Estimation                                                                    | [demo]examples/depth-anything |||||
| [MODNet]https://github.com/ZHKKKe/MODNet                         | Image Matting                                                                               | [demo]examples/modnet    |||||
| [GroundingDINO]https://github.com/IDEA-Research/GroundingDINO   | Open-Set Detection With Language                                                             | [demo]examples/grounding-dino |||              |              |
| [Sapiens]https://github.com/facebookresearch/sapiens/tree/main   | Body Part Segmentation                                   | [demo]examples/sapiens |||              |              |
| [Florence2]https://arxiv.org/abs/2311.06242   | a Variety of Vision Tasks | [demo]examples/florence2 |||              |              |



</details>


## ⛳️ ONNXRuntime Linking 

You have two options to link the ONNXRuntime library

- ### Option 1: Manual Linking

    - #### For detailed setup instructions, refer to the [ORT documentation]https://ort.pyke.io/setup/linking.

    - #### For Linux or macOS Users:
        - Download the ONNX Runtime package from the [Releases page]https://github.com/microsoft/onnxruntime/releases.
        - Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable:
           ```shell
           export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0
           ```
       
- ### Option 2: Automatic Download
  Just use `--features auto`
  ```shell
  cargo run -r --example yolo --features auto
  ```


## 🎈 Demo

```Shell
cargo run -r --example yolo   # blip, clip, yolop, svtr, db, ...
```

## 🥂 Integrate Into Your Own Project

- #### Add `usls` as a dependency to your project's `Cargo.toml`
    ```Shell
    cargo add usls
    ```
    
    Or use a specific commit:
    ```Toml
    [dependencies]
    usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }
    ```
    
- #### Follow the pipeline
    - Build model with the provided `models` and `Options`
    - Load images, video and stream with `DataLoader`
    - Do inference
    - Annotate inference results with `Annotator`
    - Retrieve inference results from `Vec<Y>`
           
      ```rust
        use usls::{models::YOLO, Annotator, DataLoader, Nms, Options, Vision, YOLOTask, YOLOVersion};
    
        fn main() -> anyhow::Result<()> {
            // Build model with Options
            let options = Options::new()
                .with_trt(0)
                .with_model("yolo/v8-m-dyn.onnx")?
                .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
                .with_yolo_task(YOLOTask::Detect) // YOLOTask: Classify, Detect, Pose, Segment, Obb
                .with_i00((1, 2, 4).into())
                .with_i02((0, 640, 640).into())
                .with_i03((0, 640, 640).into())
                .with_confs(&[0.2]);
            let mut model = YOLO::new(options)?;
        
            // Build DataLoader to load image(s), video, stream
            let dl = DataLoader::new(
                // "./assets/bus.jpg", // local image
                // "images/bus.jpg",  // remote image
                // "../images-folder",  // local images (from folder)
                // "../demo.mp4",  // local video
                // "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4",  // online video
                "rtsp://admin:kkasd1234@192.168.2.217:554/h264/ch1/",  // stream
            )?
            .with_batch(2)  // iterate with batch_size = 2
            .build()?;
        
            // Build annotator
            let annotator = Annotator::new()
                .with_bboxes_thickness(4)
                .with_saveout("YOLO-DataLoader");
        
            // Run and annotate results
            for (xs, _) in dl {
                let ys = model.forward(&xs, false)?;
                annotator.annotate(&xs, &ys);
      
                // Retrieve inference results
                for y in ys {
                    // bboxes
                    if let Some(bboxes) = y.bboxes() {
                        for bbox in bboxes {
                            println!(
                                "Bbox: {}, {}, {}, {}, {}, {}",
                                bbox.xmin(),
                                bbox.ymin(),
                                bbox.xmax(),
                                bbox.ymax(),
                                bbox.confidence(),
                                bbox.id(),
                            );
                        }
                    }
                }
            }
        
            Ok(())
        }
      ```


## 📌 License
This project is licensed under [LICENSE](LICENSE).