ai-dataloader 0.3.0

Rust implementation of the PyTorch DataLoader


ai-dataloader

A Rust port of the PyTorch DataLoader library.

Note: This project is still heavily in development and is at an early stage.

Highlights

  • Iterable or indexable (Map style) DataLoader.
  • Customizable Sampler, BatchSampler and collate_fn.
  • Default collate function that covers most use cases and supports nested types.
  • Shuffling for iterable and indexable DataLoader.
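To make the roles of the sampler, batch sampler and collate function concrete, here is a minimal, dependency-free sketch of a map-style pipeline. It does not use ai-dataloader's API: the function names (`sequential_sampler`, `batch_indices`, `collate`, `load_batches`) are hypothetical, plain `Vec`s stand in for whatever batch type the library produces, and `batch_size` is assumed to be greater than zero.

```rust
/// Sampler: decides the order in which dataset indices are visited.
/// This one is sequential; a shuffling sampler would permute the indices.
fn sequential_sampler(len: usize) -> Vec<usize> {
    (0..len).collect()
}

/// Batch sampler: groups sampled indices into batches of at most `batch_size`.
/// The last batch may be smaller. Panics if `batch_size` is 0.
fn batch_indices(indices: &[usize], batch_size: usize) -> Vec<Vec<usize>> {
    indices.chunks(batch_size).map(|chunk| chunk.to_vec()).collect()
}

/// Collate: turns a batch of (label, text) samples into (labels, texts).
fn collate<'a>(samples: &[(i32, &'a str)]) -> (Vec<i32>, Vec<&'a str>) {
    samples.iter().copied().unzip()
}

/// Runs the whole pipeline over an indexable (map-style) dataset.
fn load_batches<'a>(
    dataset: &[(i32, &'a str)],
    batch_size: usize,
) -> Vec<(Vec<i32>, Vec<&'a str>)> {
    let indices = sequential_sampler(dataset.len());
    batch_indices(&indices, batch_size)
        .iter()
        .map(|batch| {
            // Fetch each sample by index, then collate the batch.
            let samples: Vec<(i32, &str)> = batch.iter().map(|&i| dataset[i]).collect();
            collate(&samples)
        })
        .collect()
}
```

Calling `load_batches` on a four-element dataset with a batch size of 2 gives two batches, each holding a vector of labels and a vector of texts. Swapping any of the three stages for another implementation is what "customizable" means in the list above.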

The documentation includes tutorials aimed at PyTorch users.

Examples

More examples can be found in the examples folder, but here is a simple one:


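// Build a shuffled loader that yields batches of two samples.
let loader = DataLoader::builder(vec![(0, "hola"), (1, "hello"), (2, "hallo"), (3, "bonjour")])
    .batch_size(2)
    .shuffle()
    .build();

// Each iteration yields one batch: the labels and the texts, collated separately.
for (label, text) in &loader {
    println!("Label {label:?}");
    println!("Text {text:?}");
}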

GPU

ndarray can't currently run on the GPU.

But if your tensor library can create tensors from an ndarray, it can be integrated easily.

I plan to integrate different tensor libraries using feature flags; feel free to open an issue if you want to suggest one.

Next Features

These features could be added in the future:

  • customizable BatchSampler (by using a trait)
  • collate function as a closure
  • RandomSampler with replacement
  • parallel dataloader (using rayon?)
  • distributed dataloader
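As an illustration of the "RandomSampler with replacement" item, the sketch below draws each index independently, so the same sample can appear several times and more indices can be drawn than the dataset holds. It is not part of ai-dataloader: `XorShift64` and `sample_with_replacement` are hypothetical names, and the tiny xorshift generator is only there to keep the example dependency-free (a real implementation would more likely use a crate such as rand, and the modulo reduction here has a slight bias).

```rust
/// Tiny xorshift64 generator, used only to avoid external dependencies.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Draws `num_samples` indices in `0..len`, each independently of the
/// others, so repeated indices are allowed.
fn sample_with_replacement(len: usize, num_samples: usize, seed: u64) -> Vec<usize> {
    assert!(len > 0, "cannot sample from an empty dataset");
    // A xorshift generator must not start from a zero state.
    let mut rng = XorShift64(seed.max(1));
    (0..num_samples)
        .map(|_| (rng.next_u64() % len as u64) as usize)
        .collect()
}
```

By contrast, shuffling without replacement visits every index exactly once per pass, which is what the `shuffle()` option in the example corresponds to conceptually.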