axonml-data
Overview
axonml-data provides data loading infrastructure for training neural networks in the AxonML framework. It includes the Dataset trait, a DataLoader with batching and shuffling, several sampling strategies, and composable data transforms.
Features
- Dataset Trait - Core abstraction for indexed data access with TensorDataset, MapDataset, ConcatDataset, and SubsetDataset implementations
- DataLoader - Efficient batched iteration with configurable batch size, shuffling, and drop-last behavior
- Samplers - Flexible sampling strategies including SequentialSampler, RandomSampler, SubsetRandomSampler, WeightedRandomSampler, and BatchSampler
- Transforms - Composable data augmentation with Normalize, RandomNoise, RandomCrop, RandomFlip, Scale, Clamp, and more
- Collate Functions - Batch assembly with DefaultCollate and StackCollate for tensor stacking
- Generic DataLoader - Flexible loader that works with any Dataset and Collate combination
Modules
| Module | Description |
|---|---|
| dataset | Core Dataset trait and implementations (TensorDataset, MapDataset, ConcatDataset, SubsetDataset, InMemoryDataset) |
| dataloader | DataLoader for batched iteration with shuffling support |
| sampler | Sampling strategies for controlling data access patterns |
| transforms | Composable data transformations for preprocessing and augmentation |
| collate | Batch assembly functions for combining samples into tensors |
Usage
Add to your Cargo.toml:
[dependencies]
axonml-data = "0.1.0"
Creating a Dataset
use *;
// From tensors
let x = from_vec.unwrap;
let y = from_vec.unwrap;
let dataset = new;
assert_eq!;
let = dataset.get.unwrap;
Using the DataLoader
use ;
let dataset = new;
// Create loader with batch size 32
let loader = new
.shuffle
.drop_last;
// Iterate over batches
for batch in loader.iter
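To make the drop-last option concrete, here is a minimal sketch of how a loader might cut sample indices into batches. It is not the axonml-data implementation: `batch_indices` is a hypothetical helper written against the standard library only.

```rust
/// Hypothetical helper (not part of axonml-data): split `len` sample indices
/// into consecutive batches of `batch_size`.
/// When `drop_last` is true, a trailing batch shorter than `batch_size` is discarded.
fn batch_indices(len: usize, batch_size: usize, drop_last: bool) -> Vec<Vec<usize>> {
    assert!(batch_size > 0, "batch_size must be positive");
    let indices: Vec<usize> = (0..len).collect();
    indices
        .chunks(batch_size)
        // Keep every chunk unless drop_last is set and the chunk is incomplete.
        .filter(|chunk| !drop_last || chunk.len() == batch_size)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With 10 samples and a batch size of 4, this yields three batches when the short final batch is kept and two when it is dropped. Dropping the last batch is useful when downstream code assumes a fixed batch dimension.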
Implementing Custom Datasets
use Dataset;
use Tensor;
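As an illustration of what a custom dataset involves, the sketch below defines a stand-in `Dataset` trait with `len` and `get` methods, matching the calls used in the examples above, and implements it for a toy type. The trait definition, the associated `Item` type, and `SquaresDataset` are assumptions made for this example; the real trait in axonml-data may differ in its signatures and item types.

```rust
/// Hypothetical stand-in for an indexed dataset trait.
/// This is NOT the axonml-data definition; it only mirrors the `len`/`get` shape.
trait Dataset {
    type Item;

    /// Number of samples in the dataset.
    fn len(&self) -> usize;

    /// Sample at `index`, or `None` when the index is out of range.
    fn get(&self, index: usize) -> Option<Self::Item>;

    /// True when the dataset holds no samples.
    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}

/// Toy dataset yielding (x, x^2) pairs for x = 0, 1, ..., size - 1.
struct SquaresDataset {
    size: usize,
}

impl Dataset for SquaresDataset {
    type Item = (f32, f32);

    fn len(&self) -> usize {
        self.size
    }

    fn get(&self, index: usize) -> Option<Self::Item> {
        if index >= self.size {
            return None;
        }
        let x = index as f32;
        Some((x, x * x))
    }
}
```

Because samples are computed on demand in `get`, a dataset of this kind needs no storage beyond its size.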
Data Transforms
use ;
// Compose multiple transforms
let transform = empty
.add
.add
.add;
let output = transform.apply;
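The composition pattern above (start empty, `add` steps, then `apply`) can be sketched in plain Rust as follows. Everything here is an assumption for illustration: the `Transform` trait, the `Compose` type, and the use of a flat `f32` slice in place of a tensor are not taken from axonml-data, and only the names Scale and Clamp are borrowed from the feature list.

```rust
/// Hypothetical transform interface over a flat f32 buffer (a stand-in for a tensor).
trait Transform {
    fn apply(&self, input: &[f32]) -> Vec<f32>;
}

/// Multiplies every element by a constant factor.
struct Scale(f32);

/// Limits every element to the closed range [min, max].
struct Clamp {
    min: f32,
    max: f32,
}

impl Transform for Scale {
    fn apply(&self, input: &[f32]) -> Vec<f32> {
        input.iter().map(|&v| v * self.0).collect()
    }
}

impl Transform for Clamp {
    fn apply(&self, input: &[f32]) -> Vec<f32> {
        input.iter().map(|&v| v.clamp(self.min, self.max)).collect()
    }
}

/// Applies its steps in the order they were added.
struct Compose {
    steps: Vec<Box<dyn Transform>>,
}

impl Compose {
    fn empty() -> Self {
        Compose { steps: Vec::new() }
    }

    /// Builder-style append: consumes and returns the pipeline.
    fn add<T: Transform + 'static>(mut self, step: T) -> Self {
        self.steps.push(Box::new(step));
        self
    }
}

impl Transform for Compose {
    fn apply(&self, input: &[f32]) -> Vec<f32> {
        // Feed the output of each step into the next one.
        self.steps
            .iter()
            .fold(input.to_vec(), |acc, step| step.apply(&acc))
    }
}
```

Order matters in such a pipeline: scaling and then clamping generally gives a different result from clamping and then scaling.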
Using Samplers
use ;
// Random sampling without replacement
let sampler = new;
for idx in sampler.iter
// Weighted sampling for imbalanced datasets
let weights = vec!;
let sampler = new;
// Batch sampling
let base_sampler = new;
let batch_sampler = new;
for batch_indices in batch_sampler.iter
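For readers unfamiliar with weighted sampling, the sketch below shows one common way to draw indices in proportion to per-sample weights, which is why it helps with imbalanced datasets. It is a standalone illustration, not the WeightedRandomSampler implementation: `Lcg` and `weighted_sample` are hypothetical names, weights are assumed non-negative, and the small linear congruential generator is used only to avoid an external random-number crate.

```rust
/// Tiny linear congruential generator, used here only to keep the example
/// free of external crates. Not suitable for serious statistical work.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the top 53 bits so the result fits an f64 mantissa exactly.
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical helper: draws `n` indices with replacement, choosing index i
/// with probability weights[i] / sum(weights). Weights must be non-negative.
fn weighted_sample(weights: &[f64], n: usize, rng: &mut Lcg) -> Vec<usize> {
    let total: f64 = weights.iter().sum();
    assert!(total > 0.0, "weights must have a positive sum");
    (0..n)
        .map(|_| {
            // Walk the weights until the running remainder falls inside one.
            let mut target = rng.next_f64() * total;
            for (i, &w) in weights.iter().enumerate() {
                if target < w {
                    return i;
                }
                target -= w;
            }
            // Floating-point round-off fallback: last index with positive weight.
            weights.iter().rposition(|&w| w > 0.0).unwrap_or(0)
        })
        .collect()
}
```

Giving rare classes larger weights makes their samples appear more often per epoch. An index with weight zero is never selected.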
Dataset Splitting
use ;
let dataset = new;
// Random split: 80% train, 20% validation
let splits = random_split;
let train_dataset = &splits;
let val_dataset = &splits;
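A random split of this kind can be thought of as shuffling the sample indices and cutting the permutation at the requested fractions. The sketch below shows that idea on bare indices. It is an assumption about the general technique, not the library's random_split: `random_split_indices`, its `seed` parameter, and the `Lcg` generator are hypothetical.

```rust
/// Tiny linear congruential generator, used only to avoid external crates.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in 0..bound (bound must be non-zero).
    /// The modulo introduces a slight bias that is negligible for a sketch.
    fn next_below(&mut self, bound: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) % bound as u64) as usize
    }
}

/// Hypothetical helper: shuffles 0..len with Fisher-Yates, then cuts the
/// permutation into consecutive parts sized by `fractions`.
/// The last part absorbs any rounding remainder.
fn random_split_indices(len: usize, fractions: &[f64], seed: u64) -> Vec<Vec<usize>> {
    let mut indices: Vec<usize> = (0..len).collect();
    let mut rng = Lcg(seed);
    for i in (1..len).rev() {
        let j = rng.next_below(i + 1);
        indices.swap(i, j);
    }

    let mut parts = Vec::with_capacity(fractions.len());
    let mut start = 0;
    for (k, &frac) in fractions.iter().enumerate() {
        let end = if k + 1 == fractions.len() {
            len
        } else {
            (start + (frac * len as f64).round() as usize).min(len)
        };
        parts.push(indices[start..end].to_vec());
        start = end;
    }
    parts
}
```

For 10 samples and fractions 0.8 and 0.2, this produces one part of 8 indices and one of 2, with every index appearing in exactly one part. Fixing the seed keeps the train and validation sets reproducible across runs.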
Combining Datasets
use ;
// Concatenate datasets
let combined = new;
// Apply transform to dataset
let mapped = new;
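Concatenating datasets comes down to translating a global index into a dataset number plus a local index. The sketch below shows that lookup on a list of dataset lengths. `locate` is a hypothetical helper for illustration and makes no claim about how ConcatDataset is implemented.

```rust
/// Hypothetical helper: maps a global index to (dataset number, local index)
/// for datasets with the given lengths laid end to end.
/// Returns `None` when the index is past the combined length.
fn locate(lengths: &[usize], index: usize) -> Option<(usize, usize)> {
    let mut offset = 0;
    for (which, &len) in lengths.iter().enumerate() {
        if index < offset + len {
            return Some((which, index - offset));
        }
        // Skip past this dataset and try the next one.
        offset += len;
    }
    None
}
```

With lengths 3 and 2, global index 3 resolves to the first sample of the second dataset. A linear scan is fine for a handful of datasets; with many, a binary search over cumulative lengths would be the usual refinement.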
Tests
Run the test suite:
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.