pub trait Dataset: Len + GetSample { }
A dataset is just something that has a length and is indexable.
A Vec of the dataset's samples must also be collatable.
We use a custom GetSample trait instead of std::ops::Index because it provides more flexibility.
Indeed, we could have defined the trait like this:
use ai_dataloader::collate::Collate;
use ai_dataloader::Len;
pub trait Dataset<T>: Len + std::ops::Index<usize>
where
    T: Collate<Vec<Self::Output>>,
    Self::Output: Sized,
{
}
But Index::index returns &Self::Output, a reference to a value that must already exist inside the indexed type, so this definition would not cover most of our use cases.
For instance, suppose the dataset looks like this:
struct Dataset {
    labels: Vec<i32>,
    texts: Vec<String>,
}
If we want to return a tuple (label, text) when indexing, this is not possible with std::ops::Index: the tuple is built on the fly, so the dataset holds no stored value to return a reference to.
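By contrast, a trait whose method returns an owned value can build the tuple on demand. The sketch below illustrates the idea. It assumes simplified local stand-ins for Len and GetSample (the crate's actual signatures may differ), and the names TextDataset and example_dataset are hypothetical.

```rust
// Hypothetical stand-ins for the crate's traits, defined locally so the
// sketch compiles on its own. The real definitions may differ.
trait Len {
    fn len(&self) -> usize;
}

trait GetSample {
    /// Type returned for one sample. It is returned by value,
    /// so it does not have to be stored anywhere in the dataset.
    type Sample;
    fn get_sample(&self, index: usize) -> Self::Sample;
}

// Renamed from `Dataset` to avoid clashing with the trait name.
struct TextDataset {
    labels: Vec<i32>,
    texts: Vec<String>,
}

impl Len for TextDataset {
    fn len(&self) -> usize {
        self.labels.len()
    }
}

impl GetSample for TextDataset {
    // An owned tuple built on demand, which Index cannot return.
    type Sample = (i32, String);

    fn get_sample(&self, index: usize) -> Self::Sample {
        // Panics on an out-of-range index, like slice indexing.
        (self.labels[index], self.texts[index].clone())
    }
}

// Small helper that builds a two-sample dataset for demonstration.
fn example_dataset() -> TextDataset {
    TextDataset {
        labels: vec![0, 1],
        texts: vec!["negative".to_string(), "positive".to_string()],
    }
}
```

Calling example_dataset().get_sample(1) yields the label and a clone of the text as one owned value. Cloning the String is the price of returning by value; a dataset could instead choose a cheaper sample type if copying matters.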