ai-dataloader 0.6.1

Rust implementation of the PyTorch DataLoader
[![CI](https://github.com/Tudyx/ai-dataloader/actions/workflows/ci.yml/badge.svg)](https://github.com/Tudyx/ai-dataloader/actions/workflows/ci.yml) 
[![Crates.io](https://img.shields.io/crates/v/ai-dataloader.svg)](https://crates.io/crates/ai-dataloader)
[![Documentation](https://docs.rs/ai-dataloader/badge.svg)](https://docs.rs/ai-dataloader/)

# ai-dataloader

A Rust port of the [PyTorch](https://pytorch.org/) `DataLoader` library.

> Note: This project is still heavily in development and is at an early stage.

## Highlights

- Iterable or indexable (Map style) `DataLoader`.
- Customizable `Sampler`, `BatchSampler` and `collate_fn`.
- Parallel data loading with [`rayon`] for the indexable `DataLoader` (experimental).
- Integration with [`ndarray`](https://docs.rs/ndarray/latest/ndarray/) and [`tch-rs`](https://github.com/LaurentMazare/tch-rs), with CPU and GPU support.
- Default collate function that automatically collates most of your types (including nested ones).
- Shuffling for iterable and indexable `DataLoader`.

More info in the [documentation](https://docs.rs/ai-dataloader/).
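
To make the `Sampler` and `BatchSampler` roles above more concrete, here is a minimal sketch using only the standard library. It assumes the usual PyTorch-style split: a sampler yields dataset indices in some order, and a batch sampler groups those indices into batches. The `batch_indices` function is a hypothetical helper written for illustration; it is not part of the `ai-dataloader` API.

```rust
/// Hypothetical helper, not part of `ai-dataloader`: groups the indices
/// produced by a sampler into batches of at most `batch_size` indices.
/// `batch_size` must be non-zero (`chunks` panics on a chunk size of 0).
fn batch_indices(sampler: impl Iterator<Item = usize>, batch_size: usize) -> Vec<Vec<usize>> {
    let indices: Vec<usize> = sampler.collect();
    indices
        .chunks(batch_size)
        .map(|chunk| chunk.to_vec())
        .collect()
}

fn main() {
    // A sequential sampler over a dataset of five samples: indices 0, 1, 2, 3, 4.
    let sequential = batch_indices(0..5, 2);
    println!("{sequential:?}");

    // A custom sampler only changes the order in which indices are visited.
    let reversed = batch_indices((0..5).rev(), 2);
    println!("{reversed:?}");
}
```

Swapping the sampler changes which samples end up together in a batch without touching the dataset itself, which is why the two pieces are customizable separately.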

## Examples

Examples can be found in the [examples](examples/) folder. Here is a simple one:

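```rust
use ai_dataloader::DataLoader;

// Build a shuffled loader that groups the samples into batches of two.
let loader = DataLoader::builder(vec![(0, "hola"), (1, "hello"), (2, "hallo"), (3, "bonjour")])
    .batch_size(2)
    .shuffle()
    .build();

// Each iteration yields one batch: the collated labels and the collated texts.
for (label, text) in &loader {
    println!("Label {label:?}");
    println!("Text {text:?}");
}
```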
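
To illustrate what "collating" a batch means in the example above, here is a rough standard-library sketch. It assumes no shuffling and plain `Vec` outputs; the `collate` function below is a hypothetical stand-in and not the crate's default collate function, which may return other container types (for example `ndarray` arrays).

```rust
/// Hypothetical stand-in for a collate function: turns a batch of
/// `(label, text)` samples into one collection of labels and one of texts.
fn collate(batch: &[(i32, &str)]) -> (Vec<i32>, Vec<String>) {
    batch
        .iter()
        .map(|(label, text)| (*label, text.to_string()))
        .unzip()
}

fn main() {
    let dataset = vec![(0, "hola"), (1, "hello"), (2, "hallo"), (3, "bonjour")];

    // Without shuffling, batches of two are consecutive chunks of the dataset.
    for batch in dataset.chunks(2) {
        let (labels, texts) = collate(batch);
        println!("Label {labels:?}");
        println!("Text {texts:?}");
    }
}
```

The point of the sketch is the shape change: a list of per-sample tuples becomes one tuple of per-field collections, which is what lets the loop destructure each batch as `(label, text)`.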

## [`tch-rs`](https://github.com/LaurentMazare/tch-rs) integration

To collate your data into torch tensors that can run on the GPU, you must enable the `tch` feature.
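
As a sketch, enabling the feature in your `Cargo.toml` could look like the fragment below. The version number is an assumption taken from the release this README describes; adjust it to the version you actually depend on.

```toml
[dependencies]
# Version is an assumption; `tch` is the feature name mentioned above.
ai-dataloader = { version = "0.6.1", features = ["tch"] }
```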

This feature relies on the [`tch`](https://github.com/LaurentMazare/tch-rs) crate for bindings to the C++ `libtorch` API. The `libtorch` library is required and can be downloaded either automatically or manually. For instructions on setting up your environment to use these bindings, and for detailed information or support, please refer to the [tch-rs](https://github.com/LaurentMazare/tch-rs) repository.

### Next Features

These features could be added in the future:

- `RandomSampler` with replacement
- parallel `dataloader` for iterable dataset
- distributed `dataloader`


### MSRV

The current MSRV is 1.60.

[`rayon`]: https://docs.rs/rayon/latest/rayon/