Struct burn_dataset::HuggingfaceDatasetLoader
pub struct HuggingfaceDatasetLoader { /* private fields */ }
Load a dataset from Hugging Face Datasets.
The dataset, including all of its splits, is stored in a single SQLite database (see SqliteDataset).
§Example
use burn_dataset::HuggingfaceDatasetLoader;
use burn_dataset::SqliteDataset;
use serde::Deserialize;

// Item type that each row of the dataset is deserialized into.
#[derive(Deserialize, Debug, Clone)]
struct MNISTItemRaw {
    pub image_bytes: Vec<u8>,
    pub label: usize,
}

// Load the "train" split as a SqliteDataset.
let train_ds: SqliteDataset<MNISTItemRaw> = HuggingfaceDatasetLoader::new("mnist")
    .dataset("train")
    .unwrap();
§Implementations
impl HuggingfaceDatasetLoader
pub fn with_subset(self, subset: &str) -> Self
Create a huggingface dataset loader for a subset of the dataset.
The subset name must be one of the subsets listed on the dataset page.
If no subset names are listed, then do not use this method.
pub fn with_base_dir(self, base_dir: &str) -> Self
Specify a base directory to store the dataset.
If not specified, the dataset will be stored in ~/.cache/burn-dataset.
pub fn with_huggingface_token(self, huggingface_token: &str) -> Self
Specify a huggingface token to download datasets behind authentication.
You can get a token from the tokens settings page of your Hugging Face account.
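Because a token is a credential, it is usually better to read it from the environment than to hard-code it in source. The following is a minimal sketch under that assumption. The helpers `normalize_token` and `token_from_env` and the variable name `HF_TOKEN` are illustrative and not part of burn_dataset; the loader call is shown only as a comment.

```rust
use std::env;

/// Hypothetical helper (not part of burn_dataset): treat a missing, empty,
/// or whitespace-only value as "no token", and trim surrounding whitespace.
fn normalize_token(raw: Option<String>) -> Option<String> {
    raw.map(|t| t.trim().to_string()).filter(|t| !t.is_empty())
}

/// Hypothetical helper: read a token from the named environment variable.
fn token_from_env(var: &str) -> Option<String> {
    normalize_token(env::var(var).ok())
}

fn main() {
    // `HF_TOKEN` is an assumed variable name; use whatever your setup exports.
    match token_from_env("HF_TOKEN") {
        Some(token) => {
            // With burn_dataset in scope, the token would be passed on like this:
            // let loader = HuggingfaceDatasetLoader::new("mnist")
            //     .with_huggingface_token(&token);
            println!("token found ({} characters)", token.len());
        }
        None => println!("no token set"),
    }
}
```

Since `with_huggingface_token` takes a `&str`, an owned `String` read from the environment can be passed by reference as `&token`.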
pub fn with_huggingface_cache_dir(self, huggingface_cache_dir: &str) -> Self
Specify a huggingface cache directory to store the downloaded datasets.
If not specified, the dataset will be stored in ~/.cache/huggingface/datasets.
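Both `with_base_dir` and `with_huggingface_cache_dir` take a `&str`, while Rust code typically builds paths as `Path` or `PathBuf`. The sketch below shows one way to bridge the two. The helper `dir_under` and the directory names `data`, `burn-dataset`, and `huggingface` are assumptions made for illustration, not part of burn_dataset; the loader calls appear only as comments.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of burn_dataset): build `<root>/<name>` and
/// return it as a String, or None if the resulting path is not valid UTF-8.
fn dir_under(root: &Path, name: &str) -> Option<String> {
    root.join(name).to_str().map(str::to_owned)
}

fn main() {
    // Assumed project-local root; any writable directory would do.
    let root = PathBuf::from("data");

    let base_dir = dir_under(&root, "burn-dataset").expect("path is not valid UTF-8");
    let hf_cache = dir_under(&root, "huggingface").expect("path is not valid UTF-8");

    // With burn_dataset in scope, the directories would be passed on like this:
    // let loader = HuggingfaceDatasetLoader::new("mnist")
    //     .with_base_dir(&base_dir)
    //     .with_huggingface_cache_dir(&hf_cache);
    println!("base dir: {}, Hugging Face cache: {}", base_dir, hf_cache);
}
```

Returning `Option<String>` rather than calling `to_string_lossy` makes a non-UTF-8 path an explicit error instead of silently altering the directory name.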
pub fn dataset<I: DeserializeOwned + Clone>( self, split: &str ) -> Result<SqliteDataset<I>, ImporterError>
Load the given split of the dataset as a SqliteDataset.
pub fn db_file(self) -> Result<PathBuf, ImporterError>
Get the path to the sqlite database file.
If the database file does not exist, it will be downloaded and imported.