axonml-vision
Overview
axonml-vision provides the computer-vision stack for AxonML: image-specific transforms, loaders for classical vision datasets (MNIST, Fashion-MNIST, CIFAR-10/100) plus synthetic variants, and a wide catalog of neural-network architectures covering classification, detection, dense prediction, anomaly detection, VQA, 3D reconstruction, and biometrics. A pretrained-weights hub with on-disk caching rounds it out.
Features
- Image transforms: `Resize`, `CenterCrop`, `RandomHorizontalFlip`, `RandomVerticalFlip`, `RandomRotation`, `ColorJitter`, `Grayscale`, `ImageNormalize` (presets: `imagenet`, `mnist`, `cifar10`), `Pad`, `ToTensorImage`.
- Datasets: real-file loaders for `MNIST`, `FashionMNIST`, `CIFAR10`, `CIFAR100`, and synthetic variants `SyntheticMNIST` / `SyntheticCIFAR` for fast tests.
- Classification: `LeNet`, `MLP`, `SimpleCNN`, `ResNet` (`resnet18`, `resnet34`, `BasicBlock`, `Bottleneck`), `VGG` (`vgg11`, `vgg13`, `vgg16`, `vgg19` with optional batch-norm), `VisionTransformer` (`vit_base`, `vit_large`).
- Detection: `BlazeFace` (dual-scale 128×128 face detector, 896 anchors), `RetinaFace` (ResNet34 backbone + multi-level FPN head), `DETR` (transformer-based, `small` preset), `NanoDet` (mobile-class detector), `Helios` (YOLO-family detector with 5 sizes Nano/Small/Medium/Large/XLarge and loss utilities `HeliosLoss`, `CIoULoss`, `TaskAlignedAssigner`).
- Novel detection architectures: `Nexus` (predictive dual-pathway detector with multi-scale fusion, object-memory bank, and predictive-coding surprise gating), `Phantom` (temporal event-driven face detection with pseudo-event encoder and GRU-based face-state tracker), and `NightVision` (multi-domain infrared detector with thermal stem, CSP backbone, thermal FPN, YOLOX-style decoupled heads, and `ThermalDomain` domain tagging).
- Dense prediction: `DPT` (depth transformer, `small` / `base` presets) and `FastDepth` (mobile depth estimator).
- Anomaly detection: `PatchCore` and `StudentTeacher`, both with `default_rgb()` constructors.
- Visual Question Answering: `VQAModel` (`small` preset).
- 3D reconstruction: `Aegis3D`, combining Fourier-feature SDF networks (`LocalSDF` + `GlobalSDF`), adaptive octree spatial indexing, a differentiable sphere-tracing renderer, and marching-cubes mesh extraction.
- FPN infrastructure: shared `FPN` (feature pyramid network) used by multiple detectors.
- Aegis biometric identity suite: `AegisIdentity` orchestrator with `full` / `face_only` / `edge_minimal` constructors; modality models `MnemosyneIdentity` (face), `AriadneFingerprint`, `EchoSpeaker` (voice), `ArgusIris`, plus `ThemisFusion` (uncertainty-weighted fusion). Supported operations: enrollment, verification, forensic verification, liveness, secure verification, identification. Companion losses: `AngularMarginLoss`, `CenterLoss`, `ContrastiveLoss`, `CrystallizationLoss`, `DiversityRegularization`, `EchoLoss`, `ArgusLoss`, `LivenessLoss`, `ThemisLoss`.
- Model Hub: `download_weights`, `load_state_dict`, `list_models`, `model_info`, `is_cached`, `model_registry`, with on-disk caching.
- CUDA feature: optional `cuda` cargo feature propagates to core/tensor/autograd/nn.
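The 896-anchor figure quoted for `BlazeFace` can be checked with a little arithmetic. The sketch below assumes the layout of the publicly described BlazeFace design (a 16×16 grid with 2 anchors per cell and an 8×8 grid with 6 anchors per cell); the strides and per-cell counts are assumptions and are not taken from the axonml-vision source. `anchor_count` is a hypothetical helper written for this illustration.

```rust
/// Count anchors for a detector that places a fixed number of anchors
/// per cell on each square feature map.
/// `levels` holds (stride, anchors_per_cell) pairs. The values used in
/// `main` are assumptions based on the public BlazeFace design.
fn anchor_count(input_size: usize, levels: &[(usize, usize)]) -> usize {
    levels
        .iter()
        .map(|&(stride, per_cell)| {
            let grid = input_size / stride; // cells along one side
            grid * grid * per_cell
        })
        .sum()
}

fn main() {
    // Stride 8:  128 / 8  = 16 cells per side, 16 * 16 * 2 = 512 anchors.
    // Stride 16: 128 / 16 = 8 cells per side,   8 *  8 * 6 = 384 anchors.
    let total = anchor_count(128, &[(8, 2), (16, 6)]);
    println!("total anchors: {total}");
}
```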
Modules
| Module | Description |
|---|---|
| `transforms` | Image data-augmentation and preprocessing transforms |
| `datasets` | MNIST, Fashion-MNIST, CIFAR-10/100 loaders plus synthetic variants |
| `models` | All neural network architectures (see below) |
| `models::biometric` | Aegis biometric suite (Mnemosyne, Ariadne, Echo, Argus, Themis + identity orchestrator) |
| `models::helios` | YOLO-style object detector with 5 size variants |
| `models::nexus` | Predictive dual-pathway detector with object memory |
| `models::phantom` | Temporal event-driven face detection |
| `models::nightvision` | Multi-domain infrared detection |
| `models::aegis3d` | Octree-adaptive neural implicit surface reconstruction |
| `camera` | Camera I/O utilities |
| `edge` | Edge-deployment helpers |
| `hub` | Pretrained model weights management |
| `image_io` | Image load/save helpers |
| `losses` | Vision-specific loss functions |
| `ops` | Low-level vision ops |
| `training` | Training utilities |
Usage
Add to your Cargo.toml:
[dependencies]
axonml-vision = "0.6.1"
Loading Datasets
use *;
// Synthetic MNIST for fast tests
let train_data = train;
let test_data = test;
// Synthetic CIFAR-10
let cifar = small;
let = train_data.get.unwrap;
assert_eq!; // MNIST: 1 channel, 28x28
assert_eq!; // One-hot encoded
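The comment above notes that labels are one-hot encoded. As a minimal, library-independent sketch of what that means for a 10-class dataset, the hypothetical helpers `one_hot` and `argmax` below use plain `Vec<f32>` rather than any AxonML tensor type.

```rust
/// Encode a class index as a one-hot vector of length `num_classes`.
/// Returns `None` when the index is out of range.
fn one_hot(class: usize, num_classes: usize) -> Option<Vec<f32>> {
    if class >= num_classes {
        return None;
    }
    let mut v = vec![0.0; num_classes];
    v[class] = 1.0;
    Some(v)
}

/// Recover the class index as the position of the largest entry.
fn argmax(v: &[f32]) -> Option<usize> {
    v.iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

fn main() {
    let label = one_hot(3, 10).expect("class index in range");
    println!("one-hot: {label:?}, decoded: {:?}", argmax(&label));
}
```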
Image Transforms
use ;
use ;
let transform = empty
.add
.add
.add
.add;
let output = transform.apply;
assert_eq!;
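To make the effect of a transform chain concrete, here is a small standalone sketch of two of the listed operations, a centre crop followed by a horizontal flip, on a channel-major (CHW) buffer. The `Image` struct and both functions are hypothetical stand-ins written for this illustration; they do not reproduce the crate's `CenterCrop` or `RandomHorizontalFlip` types.

```rust
/// Minimal CHW image: `data.len() == channels * height * width`.
#[derive(Debug, Clone, PartialEq)]
struct Image {
    channels: usize,
    height: usize,
    width: usize,
    data: Vec<f32>,
}

impl Image {
    fn at(&self, c: usize, y: usize, x: usize) -> f32 {
        self.data[(c * self.height + y) * self.width + x]
    }
}

/// Crop a `size` x `size` window from the centre.
/// Assumes `size` is non-zero and no larger than the height or width.
fn center_crop(img: &Image, size: usize) -> Image {
    let top = (img.height - size) / 2;
    let left = (img.width - size) / 2;
    let mut data = Vec::with_capacity(img.channels * size * size);
    for c in 0..img.channels {
        for y in 0..size {
            for x in 0..size {
                data.push(img.at(c, top + y, left + x));
            }
        }
    }
    Image { channels: img.channels, height: size, width: size, data }
}

/// Mirror every row left to right (assumes a non-zero width).
fn horizontal_flip(img: &Image) -> Image {
    let mut out = img.clone();
    for row in out.data.chunks_mut(img.width) {
        row.reverse();
    }
    out
}

fn main() {
    // 1 x 4 x 4 image holding the values 0..16 in row-major order.
    let img = Image {
        channels: 1,
        height: 4,
        width: 4,
        data: (0..16).map(|v| v as f32).collect(),
    };
    let out = horizontal_flip(&center_crop(&img, 2));
    println!("{:?}", out.data);
}
```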
Normalization Presets
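use axonml_vision::transforms::ImageNormalize; // import path assumed from the module table
let imagenet = ImageNormalize::imagenet(); // mean=[0.485,0.456,0.406] std=[0.229,0.224,0.225]
let mnist = ImageNormalize::mnist(); // mean=[0.1307] std=[0.3081]
let cifar10 = ImageNormalize::cifar10(); // mean=[0.4914,0.4822,0.4465] std=[0.2470,0.2435,0.2616]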
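The presets above are just per-channel mean and standard-deviation pairs. A minimal sketch of the arithmetic they imply, `(x - mean[c]) / std[c]` applied to each channel plane of a CHW buffer, is shown below. The `normalize` function is a hypothetical helper over a plain slice, not the crate's implementation.

```rust
/// Per-channel normalisation of a CHW buffer: (x - mean[c]) / std[c].
fn normalize(data: &mut [f32], channels: usize, mean: &[f32], std: &[f32]) {
    assert!(channels > 0, "need at least one channel");
    assert_eq!(mean.len(), channels);
    assert_eq!(std.len(), channels);
    assert_eq!(data.len() % channels, 0, "buffer must split evenly into channels");
    if data.is_empty() {
        return;
    }
    let plane = data.len() / channels; // pixels per channel
    for (c, chunk) in data.chunks_mut(plane).enumerate() {
        for x in chunk {
            *x = (*x - mean[c]) / std[c];
        }
    }
}

fn main() {
    // One RGB pixel equal to the ImageNet mean maps to zero in every channel.
    let mut px = [0.485_f32, 0.456, 0.406];
    normalize(&mut px, 3, &[0.485, 0.456, 0.406], &[0.229, 0.224, 0.225]);
    println!("{px:?}");
}
```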
Classification Models
use ;
use ;
use Module;
use Variable;
let lenet = new; // [N, 1, 28, 28] -> [N, 10]
let mlp = MLPfor_mnist; // 784 -> 256 -> 128 -> 10
let rn18 = resnet18; // ImageNet classes
let vgg = vgg16;
let vit = vit_base;
Detection Models
use ;
use ;
let blaze = new; // dual-scale 128x128 face detector
let retina = new; // ResNet34 backbone
let nanodet = new;
let detr = DETRsmall;
let helios = small; // also: new(config), large(num_classes)
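The feature list names `CIoULoss` among the Helios loss utilities. For readers unfamiliar with it, the following is a minimal sketch of the standard Complete-IoU formulation (IoU, a normalised centre-distance penalty, and an aspect-ratio term). It is written from the published definition and is an assumption about what the crate computes; `BBox`, `iou`, and `ciou_loss` are hypothetical names.

```rust
use std::f32::consts::PI;

/// Axis-aligned box as corner coordinates, with x2 > x1 and y2 > y1.
#[derive(Clone, Copy)]
struct BBox {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
}

impl BBox {
    fn w(&self) -> f32 { self.x2 - self.x1 }
    fn h(&self) -> f32 { self.y2 - self.y1 }
    fn area(&self) -> f32 { self.w() * self.h() }
}

/// Intersection over union of two boxes.
fn iou(a: BBox, b: BBox) -> f32 {
    let iw = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let ih = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = iw * ih;
    inter / (a.area() + b.area() - inter + 1e-7)
}

/// CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v.
fn ciou_loss(pred: BBox, target: BBox) -> f32 {
    let eps = 1e-7;
    let overlap = iou(pred, target);
    // Squared distance between the two box centres.
    let dx = (pred.x1 + pred.x2 - target.x1 - target.x2) / 2.0;
    let dy = (pred.y1 + pred.y2 - target.y1 - target.y2) / 2.0;
    let rho2 = dx * dx + dy * dy;
    // Squared diagonal of the smallest box enclosing both.
    let cw = pred.x2.max(target.x2) - pred.x1.min(target.x1);
    let ch = pred.y2.max(target.y2) - pred.y1.min(target.y1);
    let c2 = cw * cw + ch * ch + eps;
    // Aspect-ratio consistency term and its trade-off weight.
    let v = (4.0 / (PI * PI))
        * ((target.w() / target.h()).atan() - (pred.w() / pred.h()).atan()).powi(2);
    let alpha = v / (1.0 - overlap + v + eps);
    1.0 - overlap + rho2 / c2 + alpha * v
}

fn main() {
    let a = BBox { x1: 0.0, y1: 0.0, x2: 2.0, y2: 2.0 };
    let b = BBox { x1: 1.0, y1: 1.0, x2: 3.0, y2: 3.0 };
    println!("iou = {:.4}, ciou loss = {:.4}", iou(a, b), ciou_loss(a, b));
}
```

Unlike plain IoU, the centre-distance term keeps the loss informative when the boxes do not overlap at all.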
Novel Detection Architectures
use ;
let nexus = default; // predictive dual-pathway + object memory
let phantom = default; // event-driven temporal face detector
let night = new;
Dense Prediction & Anomaly / VQA
use ;
let dpt = DPTsmall; // transformer depth
let fast = new; // mobile depth
let patch = default_rgb; // anomaly detection, 256-d features
let st = default_rgb; // student-teacher anomaly
let vqa = small; // vocab=100, answers=50
Aegis3D — 3D Reconstruction
use ;
let aegis3d = new; // Fourier-feature SDF + adaptive octree + sphere tracing + marching cubes
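Sphere tracing, one of the components named above, renders a signed distance field by stepping along a ray by the distance the field reports at each point. The sketch below shows the non-differentiable core of that loop against an analytic sphere SDF. It is a generic illustration, not Aegis3D's renderer; all names and the step and tolerance values are hypothetical.

```rust
type Vec3 = [f32; 3];

/// Point at parameter `t` along the ray `origin + t * dir`.
fn point_on_ray(origin: Vec3, dir: Vec3, t: f32) -> Vec3 {
    [origin[0] + dir[0] * t, origin[1] + dir[1] * t, origin[2] + dir[2] * t]
}

fn length(v: Vec3) -> f32 {
    (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt()
}

/// Signed distance to a sphere centred at the origin.
fn sphere_sdf(p: Vec3, radius: f32) -> f32 {
    length(p) - radius
}

/// March along the ray, advancing by the SDF value each step.
/// Returns the hit distance, or `None` if the ray escapes or runs out of steps.
/// `dir` is assumed to be unit length.
fn sphere_trace(
    sdf: impl Fn(Vec3) -> f32,
    origin: Vec3,
    dir: Vec3,
    max_steps: usize,
    max_dist: f32,
    eps: f32,
) -> Option<f32> {
    let mut t = 0.0_f32;
    for _ in 0..max_steps {
        let d = sdf(point_on_ray(origin, dir, t));
        if d < eps {
            return Some(t); // close enough to the surface
        }
        t += d; // safe step: nothing is nearer than `d`
        if t > max_dist {
            return None;
        }
    }
    None
}

fn main() {
    let sdf = |p: Vec3| sphere_sdf(p, 1.0);
    let hit = sphere_trace(sdf, [0.0, 0.0, -3.0], [0.0, 0.0, 1.0], 64, 20.0, 1e-4);
    println!("hit distance: {hit:?}");
}
```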
Full Training Pipeline
use *;
use DataLoader;
use ;
use ;
let dataset = train;
let loader = new.shuffle;
let model = new;
let mut optim = new;
let loss_fn = new;
for batch in loader.iter
Model Hub for Pretrained Weights
use ;
for model in list_models
if let Some = model_info
if !is_cached
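The hub functions above rely on on-disk caching. As a rough sketch of that general pattern (check the cache, otherwise fetch once and store), the code below uses only the standard library. The `<cache_dir>/<model>.bin` layout, the function names, and the fetch closure are all assumptions made for this illustration; the crate's actual cache layout and download logic are not shown in this README.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical cache layout: `<cache_dir>/<model>.bin`.
fn cache_path(cache_dir: &Path, model: &str) -> PathBuf {
    cache_dir.join(format!("{model}.bin"))
}

/// True when a weights file for `model` already exists in the cache.
fn is_cached_at(cache_dir: &Path, model: &str) -> bool {
    cache_path(cache_dir, model).is_file()
}

/// Return cached bytes, or call `fetch` once and store its result.
fn load_or_fetch(
    cache_dir: &Path,
    model: &str,
    fetch: impl FnOnce() -> io::Result<Vec<u8>>,
) -> io::Result<Vec<u8>> {
    let path = cache_path(cache_dir, model);
    if path.is_file() {
        return fs::read(&path);
    }
    let bytes = fetch()?;
    fs::create_dir_all(cache_dir)?;
    fs::write(&path, &bytes)?;
    Ok(bytes)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("hub_cache_sketch_demo");
    // A stand-in for a real download; returns fixed bytes.
    let weights = load_or_fetch(&dir, "demo_model", || Ok(vec![1, 2, 3]))?;
    println!("{} bytes, cached: {}", weights.len(), is_cached_at(&dir, "demo_model"));
    fs::remove_dir_all(&dir)
}
```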
Aegis Identity — Biometric Framework
Unified biometric identity system with four modality-specific architectures (face, fingerprint, voice, iris) plus ThemisFusion for uncertainty-weighted evidence fusion. Designed for edge deployment (under 2 MB total in the edge_minimal configuration).
use ;
use Variable;
use Tensor;
// Full multimodal system — face + fingerprint + voice + iris
let mut aegis = full;
// Or smaller deployments:
let face_only = face_only;
let edge = edge_minimal;
// Enroll
let face = new;
let evidence = new.with_face;
let enrolled = aegis.enroll;
// Verify
let probe = new
.with_face;
let verification = aegis.verify;
println!;
// Forensic verification with per-modality scores and cross-modal consistency
let = aegis.verify_forensic;
// Anti-spoofing liveness
let liveness = aegis.assess_liveness;
// Quality -> liveness -> verification secure pipeline
let secure = aegis.secure_verify;
// 1:N identification
let ident = aegis.identify;
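Verification (1:1) and identification (1:N) are commonly built on embedding similarity. The sketch below illustrates that general idea with cosine similarity and a fixed threshold. It is a simplification for orientation only and makes no claim about how `AegisIdentity` scores matches; the function names, the gallery type, and the threshold values are hypothetical.

```rust
/// Cosine similarity between two embeddings; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// 1:1 verification: accept when similarity reaches the threshold.
fn verify(enrolled: &[f32], probe: &[f32], threshold: f32) -> bool {
    cosine_similarity(enrolled, probe) >= threshold
}

/// 1:N identification: best gallery match at or above the threshold.
fn identify<'a>(
    gallery: &'a [(String, Vec<f32>)],
    probe: &[f32],
    threshold: f32,
) -> Option<(&'a str, f32)> {
    gallery
        .iter()
        .map(|(id, emb)| (id.as_str(), cosine_similarity(emb, probe)))
        .filter(|&(_, score)| score >= threshold)
        .max_by(|a, b| a.1.total_cmp(&b.1))
}

fn main() {
    let gallery = vec![
        ("alice".to_string(), vec![1.0, 0.0]),
        ("bob".to_string(), vec![0.0, 1.0]),
    ];
    let probe = [0.9, 0.1];
    println!("verified as alice: {}", verify(&gallery[0].1, &probe, 0.8));
    println!("identified: {:?}", identify(&gallery, &probe, 0.8));
}
```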
Modality architectures:
| Model | Modality | Novel idea |
|---|---|---|
| `MnemosyneIdentity` | Face | Identity crystallizes via GRU attractor convergence |
| `AriadneFingerprint` | Fingerprint | Ridge event fields with Gabor wavelets |
| `EchoSpeaker` | Voice | Identity = unpredictable speech residuals |
| `ArgusIris` | Iris | Polar-native radial / angular Conv1d encoding (backed by `polar::polar_unwrap`) |
| `ThemisFusion` | Fusion | Belief propagation with uncertainty gating |
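To give a feel for uncertainty-weighted fusion, the sketch below combines per-modality scores by inverse-variance weighting, so that less certain modalities contribute less. This is a deliberately simpler technique than the belief propagation with uncertainty gating attributed to `ThemisFusion` in the table; it is swapped in purely for illustration, and `Evidence` and `fuse` are hypothetical names.

```rust
/// One modality's match score with an uncertainty (variance) estimate.
struct Evidence {
    score: f32,
    variance: f32,
}

/// Inverse-variance weighting:
///   fused = sum(score_i / var_i) / sum(1 / var_i)
/// Returns (fused score, fused variance), or `None` for empty input.
fn fuse(evidence: &[Evidence]) -> Option<(f32, f32)> {
    if evidence.is_empty() {
        return None;
    }
    let eps = 1e-6; // guards against a zero variance
    let (num, den) = evidence.iter().fold((0.0_f32, 0.0_f32), |(n, d), e| {
        let w = 1.0 / (e.variance + eps);
        (n + w * e.score, d + w)
    });
    Some((num / den, 1.0 / den))
}

fn main() {
    let evidence = [
        Evidence { score: 0.9, variance: 0.01 }, // confident modality
        Evidence { score: 0.5, variance: 0.09 }, // noisy modality
    ];
    if let Some((score, variance)) = fuse(&evidence) {
        println!("fused score {score:.3}, variance {variance:.4}");
    }
}
```

The fused variance is never larger than the smallest input variance, which is the usual motivation for weighting by confidence rather than averaging.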
Feature flags
- `default = ["download"]`: enables `reqwest` for hub downloads.
- `cuda`: propagates CUDA support to `axonml-tensor`, `axonml-nn`, `axonml-autograd`, `axonml-core`.
Tests
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Last updated: 2026-04-16 (v0.6.1)