pub trait Dataset: Len + GetSample { }

A dataset is just something that has a length and is indexable. A Vec of the samples it returns must also be collatable.

We use a custom GetSample trait instead of std::ops::Index because it provides more flexibility. We could instead have defined the trait like this:

use ai_dataloader::collate::Collate;
use ai_dataloader::Len;

pub trait Dataset<T>: Len + std::ops::Index<usize>
where
    T: Collate<Vec<Self::Output>>,
    Self::Output: Sized,
{
}

But Index::index returns a reference (&Self::Output), so Index::Output must refer to something that already exists inside the dataset. That does not cover most of our use cases. For instance, if the dataset looks like this:

struct Dataset {
    labels: Vec<i32>,
    texts: Vec<String>,
}

and we want to return a (label, text) tuple when indexing, this is not possible with std::ops::Index: no such tuple is stored in the struct, so there is nothing to return a reference to.
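For illustration, here is a minimal sketch of how returning an owned sample sidesteps that limitation. The Len and GetSample traits below are local stand-ins written so that the snippet compiles on its own; their exact shape (an associated Sample type and a get_sample method) is an assumption, not a copy of the crate's definitions. TextDataset and sample_dataset are hypothetical names.

```rust
// Local stand-ins for the crate's `Len` and `GetSample` traits.
// Their exact shape is an assumption made for this sketch.
pub trait Len {
    fn len(&self) -> usize;
}

pub trait GetSample {
    type Sample: Sized;
    fn get_sample(&self, index: usize) -> Self::Sample;
}

// Hypothetical dataset storing labels and texts in separate vectors.
pub struct TextDataset {
    labels: Vec<i32>,
    texts: Vec<String>,
}

impl Len for TextDataset {
    fn len(&self) -> usize {
        self.labels.len()
    }
}

impl GetSample for TextDataset {
    // An owned tuple built on demand: no `(i32, String)` value is stored
    // in `self`, which is why `std::ops::Index` could not express this.
    type Sample = (i32, String);

    fn get_sample(&self, index: usize) -> Self::Sample {
        (self.labels[index], self.texts[index].clone())
    }
}

// Hypothetical helper that builds a two-element dataset for illustration.
pub fn sample_dataset() -> TextDataset {
    TextDataset {
        labels: vec![0, 1],
        texts: vec!["negative".to_string(), "positive".to_string()],
    }
}
```

Because get_sample returns by value, the implementation is free to assemble the sample from several fields, at the cost of cloning the String on each call.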
