# axonml-vision

## Overview
axonml-vision provides computer vision functionality for the AxonML framework. It includes image-specific transforms, loaders for common vision datasets (MNIST, CIFAR), pre-defined neural network architectures, and a model hub for pretrained weights.
## Features

- Image Transforms - Comprehensive augmentation including `Resize`, `CenterCrop`, `RandomHorizontalFlip`, `RandomVerticalFlip`, `RandomRotation`, `ColorJitter`, `Grayscale`, `ImageNormalize`, and `Pad`
- Vision Datasets - Loaders for MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and synthetic variants for testing
- Neural Network Models - Pre-defined architectures including LeNet, SimpleCNN, MLP, ResNet (18/34), VGG (11/13/16/19), and Vision Transformer (ViT)
- Model Hub - Download, cache, and load pretrained weights with checksum verification
- Bilinear Interpolation - High-quality image resizing for 2D, 3D, and 4D tensors (sketched after this list)
- ImageNet Normalization - Built-in presets for ImageNet, MNIST, and CIFAR normalization
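
For example, a minimal bilinear-resize sketch. It assumes the `Resize` transform takes a `(height, width)` target, exposes the same `apply` method as a pipeline, and that the core crate provides a `Tensor` constructor at the path shown; none of these are confirmed signatures.

```rust
use axonml_vision::transforms::Resize;
use axonml::tensor::Tensor; // core-crate path is an assumption

// Bilinear resize of a batched 4D tensor [N, C, H, W]; 2D and 3D inputs are also supported.
// The (224, 224) target and `Tensor::rand` constructor are illustrative.
let resize = Resize::new((224, 224));
let batch = Tensor::rand(&[8, 3, 96, 96]);
let resized = resize.apply(&batch); // -> [8, 3, 224, 224]
```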
## Modules

| Module | Description |
|---|---|
| `transforms` | Image-specific data augmentation and preprocessing transforms |
| `datasets` | Loaders for MNIST, CIFAR, and synthetic vision datasets |
| `models` | Pre-defined neural network architectures (LeNet, ResNet, VGG, ViT) |
| `hub` | Pretrained model weights management (download, cache, load) |
## Usage

Add to your `Cargo.toml`:

```toml
[dependencies]
axonml-vision = "0.1.0"
```
### Loading Datasets

```rust
use axonml_vision::datasets::*;

// Synthetic MNIST for testing
// (dataset type names and the `shape` accessor below are illustrative)
let train_data = SyntheticMnist::train();
let test_data = SyntheticMnist::test();

// Synthetic CIFAR-10
let cifar = SyntheticCifar10::small();

// Get a sample
let (image, label) = train_data.get(0).unwrap();
assert_eq!(image.shape(), &[1, 28, 28]); // MNIST: 1 channel, 28x28
assert_eq!(label.shape(), &[10]);        // One-hot encoded
```
### Image Transforms

```rust
use axonml_vision::transforms::{Compose, Resize, CenterCrop, RandomHorizontalFlip, ImageNormalize};
use axonml::tensor::Tensor; // core-crate path is an assumption

// Build transform pipeline
// (the `Compose` builder name and constructor arguments are illustrative)
let transform = Compose::empty()
    .add(Resize::new((256, 256)))
    .add(CenterCrop::new(224))
    .add(RandomHorizontalFlip::new(0.5))
    .add(ImageNormalize::imagenet());

let input = Tensor::rand(&[3, 480, 640]);
let output = transform.apply(&input);
assert_eq!(output.shape(), &[3, 224, 224]);
```
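
The remaining transforms compose the same way. The sketch below chains several of them using the same builder as above; every constructor argument is illustrative rather than a confirmed signature.

```rust
use axonml_vision::transforms::{Compose, Pad, RandomRotation, RandomVerticalFlip, ColorJitter, Grayscale};

// Heavier augmentation pipeline; constructor arguments are illustrative.
let augment = Compose::empty()
    .add(Pad::new(4))                     // pad 4 pixels on each side
    .add(RandomRotation::new(15.0))       // rotate within +/- 15 degrees
    .add(RandomVerticalFlip::new(0.5))    // flip with probability 0.5
    .add(ColorJitter::new(0.4, 0.4, 0.4)) // brightness / contrast / saturation
    .add(Grayscale::new());               // collapse to a single channel
```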
### Normalization Presets

```rust
use axonml_vision::transforms::ImageNormalize;

// ImageNet normalization
let normalize = ImageNormalize::imagenet();
// mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]

// MNIST normalization
let normalize = ImageNormalize::mnist();
// mean=[0.1307], std=[0.3081]

// CIFAR-10 normalization
let normalize = ImageNormalize::cifar10();
// mean=[0.4914, 0.4822, 0.4465], std=[0.2470, 0.2435, 0.2616]
```
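
A preset behaves like any other transform, so it can be applied directly to an image tensor. The sketch below assumes transforms expose the same `apply` method as a pipeline and uses an illustrative `Tensor` constructor.

```rust
use axonml_vision::transforms::ImageNormalize;
use axonml::tensor::Tensor; // path is an assumption

// Per-channel normalization: output = (input - mean) / std
let image = Tensor::rand(&[3, 224, 224]); // illustrative constructor
let normalized = ImageNormalize::imagenet().apply(&image);
```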
### Using Vision Models

```rust
use axonml_vision::models::{LeNet, MLP, resnet18, vgg16};
use axonml::tensor::Tensor; // core-crate paths and constructors here are assumptions
use axonml::nn::Module;
use axonml::autograd::Variable;

// LeNet for MNIST
let model = LeNet::new();
let input = Variable::new(Tensor::rand(&[4, 1, 28, 28])); // [N, 1, 28, 28]
let output = model.forward(&input); // [N, 10]

// MLP for flattened images
let model = MLP::for_mnist(); // 784 -> 256 -> 128 -> 10

// ResNet18 for ImageNet (constructor arguments are illustrative)
let model = resnet18(1000);
let images = Variable::new(Tensor::rand(&[4, 3, 224, 224]));
let output = model.forward(&images); // [N, 1000]

// VGG16
let model = vgg16(true); // with batch normalization
```
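
Because `MLP::for_mnist` works on flattened 784-dimensional inputs, batched images need to be reshaped first. The `reshape` call and crate paths below are assumptions used only to show the shape change.

```rust
use axonml_vision::models::MLP;
use axonml::{autograd::Variable, nn::Module, tensor::Tensor}; // paths are assumptions

// Flatten [N, 1, 28, 28] images to [N, 784] before the MLP
// (`reshape` is an assumed tensor-method name).
let batch = Variable::new(Tensor::rand(&[32, 1, 28, 28]));
let flat = batch.reshape(&[32, 784]);
let logits = MLP::for_mnist().forward(&flat); // [32, 10]
```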
### Full Training Pipeline

```rust
use axonml_vision::datasets::*;
use axonml::data::DataLoader;
use axonml_vision::models::LeNet;
use axonml::{nn::{CrossEntropyLoss, Module}, optim::SGD};
// (core-crate paths, the optimizer/loss types, and loop-body method names are illustrative)

// Create dataset and dataloader
let dataset = SyntheticMnist::train();
let loader = DataLoader::new(dataset, 32).shuffle(true);

// Create model and optimizer
let model = LeNet::new();
let mut optimizer = SGD::new(model.parameters(), 0.01);
let loss_fn = CrossEntropyLoss::new();

// Training loop
for batch in loader.iter() {
    let (images, labels) = batch;
    let output = model.forward(&images);
    let loss = loss_fn.forward(&output, &labels);

    optimizer.zero_grad();
    loss.backward();
    optimizer.step();
}
```
### Model Hub for Pretrained Weights

```rust
use axonml_vision::hub::{list_models, model_info, download_weights, load_state_dict};

// List available models
let models = list_models();
for model in models {
    println!("{:?}", model);
}

// Get model info ("resnet18" is an illustrative model name)
if let Some(info) = model_info("resnet18") {
    println!("{:?}", info);
}

// Download and load weights
let path = download_weights("resnet18")?;
let state_dict = load_state_dict(&path)?;

// Load into model
// model.load_state_dict(state_dict);
```
## Tests
Run the test suite:
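
```bash
# from the crate directory (or `cargo test -p axonml-vision` from the workspace root)
cargo test
```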
## License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.