medrs
High-performance medical imaging I/O and processing library for Rust and Python.
Overview
medrs is designed for throughput-critical medical imaging workflows, particularly deep learning pipelines that process large 3D volumes. It provides:
- Fast NIfTI I/O: Memory-mapped reading, crop-first loading (read sub-volumes without loading entire files)
- Transform Pipeline: Lazy evaluation with automatic operation fusion and SIMD acceleration
- Mixed Precision: Native f16/bf16 support for 50% storage savings
- Random Augmentation: Reproducible, GPU-friendly augmentations for ML training
- Python Bindings: Zero-copy numpy views, direct PyTorch/JAX tensor creation
- MONAI Integration: Drop-in replacements for MONAI transforms
Why medrs?
Performance vs MONAI & TorchIO (128³ volume)
| Operation | medrs | MONAI | TorchIO | vs MONAI |
|---|---|---|---|---|
| Load | 0.13ms | 4.55ms | 4.71ms | 35x |
| Load Cropped (64³) | 0.41ms | 4.68ms | 9.86ms | 11x |
| Load Resampled | 0.40ms | 6.88ms | 27.65ms | 17x |
| To PyTorch | 0.49ms | 5.14ms | 10.22ms | 10x |
| Load + Normalize | 0.60ms | 5.36ms | 12.26ms | 9x |
At larger volumes (512³), basic load speedups increase dramatically: up to 38,000x vs MONAI and 6,600x vs TorchIO.
Storage Efficiency (128³ volume, compressed)
| Format | Size | vs f32 |
|---|---|---|
| float32 | 8.3 MB | 100% |
| bfloat16 | 3.4 MB | 41% |
| float16 | 4.1 MB | 50% |
| int16 | 1.2 MB | 15% |
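To make the precision trade-off behind these formats concrete, here is a minimal standard-library sketch that packs the same voxels as float32, float16, and bfloat16 and reads them back. It does not use medrs; the helper names `f32_to_bf16_bits` and `bf16_bits_to_f32` are hypothetical, and the bfloat16 conversion shown is plain truncation, which may differ from whatever rounding medrs applies. Raw half-precision data is exactly half the size of float32; the compressed sizes in the table above also depend on the data and the compressor.

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Truncate a float32 to bfloat16 by keeping its top 16 bits (no rounding)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(b: int) -> float:
    """Expand bfloat16 bits to float32 by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x

voxels = [0.0, 1.0, -2.5, 3.14159, 1000.123]
n = len(voxels)

raw_f32 = struct.pack(f"<{n}f", *voxels)
raw_f16 = struct.pack(f"<{n}e", *voxels)  # "e" is IEEE 754 half precision
raw_bf16 = struct.pack(f"<{n}H", *(f32_to_bf16_bits(v) for v in voxels))

# Round-trip both formats to inspect the precision that survives.
f16_roundtrip = struct.unpack(f"<{n}e", raw_f16)
bf16_roundtrip = [bf16_bits_to_f32(b) for b in struct.unpack(f"<{n}H", raw_bf16)]

print(len(raw_f32), len(raw_f16), len(raw_bf16))
print(f16_roundtrip)
print(bf16_roundtrip)
```

float16 keeps more mantissa bits (finer steps, smaller range), while bfloat16 keeps the float32 exponent range at coarser precision.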
Key Advantages
- Crop-First Loading: Load a 64³ patch from a 512³ volume without reading the entire file, about 6,600x faster than MONAI
- Mixed Precision: Save in bf16/f16 for files 41-50% the size of float32 (50-59% smaller) with minimal precision loss
- MONAI Drop-in: Replace MONAI I/O transforms with one import change
- Zero-Copy: Direct tensor creation without intermediate numpy allocations
Comprehensive Benchmark Results
Benchmark results comparing medrs, MONAI, and TorchIO across multiple volume sizes and operations.

Load Performance (Basic I/O)
| Size | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.13ms | 1.34ms | 2.35ms | 10x | 18x |
| 128³ | 0.13ms | 4.55ms | 4.71ms | 35x | 36x |
| 256³ | 0.14ms | 159.11ms | 95.18ms | 1,136x | 680x |
| 512³ | 0.13ms | 5,006.76ms | 866.54ms | 38,513x | 6,665x |
Crop-First Loading (64³ patch)
| Source | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.27ms | 1.75ms | 6.00ms | 6x | 22x |
| 128³ | 0.41ms | 4.68ms | 9.86ms | 11x | 24x |
| 256³ | 0.55ms | 154.86ms | 104.48ms | 282x | 190x |
| 512³ | 0.76ms | 5,041.42ms | 1,076.89ms | 6,633x | 1,417x |
Load Resampled (Half resolution)
| Source | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ → 32³ | 0.18ms | 1.93ms | 5.45ms | 11x | 30x |
| 128³ → 64³ | 0.40ms | 6.88ms | 27.65ms | 17x | 69x |
| 256³ → 128³ | 2.02ms | 178.87ms | 363.85ms | 89x | 180x |
| 512³ → 256³ | 6.67ms | 5,960.93ms | 4,039.05ms | 894x | 605x |
Direct PyTorch Loading
| Source | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.34ms | 1.58ms | 5.37ms | 5x | 16x |
| 128³ | 0.49ms | 5.14ms | 10.22ms | 10x | 21x |
| 256³ | 0.60ms | 162.78ms | 53.70ms | 271x | 90x |
| 512³ | 0.84ms | 5,864.85ms | 1,223.24ms | 6,982x | 1,456x |
Load with Z-Normalization
| Source | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.49ms | 2.15ms | 7.04ms | 4x | 14x |
| 128³ | 0.60ms | 5.36ms | 12.26ms | 9x | 20x |
| 256³ | 0.73ms | 163.38ms | 53.59ms | 224x | 73x |
| 512³ | 1.01ms | 3,735.31ms | 1,092.25ms | 3,698x | 1,081x |
Benchmarks were run on an Apple M1 Pro with 20 timed iterations after 3 warmup runs. Run your own: python benchmarks/bench_medrs.py
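For readers who want to check the speedup columns or time their own loaders, the measurement protocol described above (3 warmup runs, 20 timed iterations) can be sketched as follows. This is not the contents of benchmarks/bench_medrs.py; the `bench` and `speedup` helpers are hypothetical, and reporting the median is an assumption, since the README does not say which statistic is used.

```python
import statistics
import time

def bench(fn, iterations=20, warmup=3):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warmup runs are executed but not timed
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms

# Stand-in workload; replace the lambda with the loader call you want to time.
calls = []
median_ms = bench(lambda: calls.append(sum(range(1000))))

print(f"median: {median_ms:.4f} ms over {len(calls)} calls")
print(f"128^3 load vs MONAI: {speedup(4.55, 0.13):.0f}x")
```

The last line recomputes one table entry from its published timings: 4.55 ms divided by 0.13 ms.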
Installation
Python
Rust
[dependencies]
medrs = "0.1"
Development
Quick Start
Python:
# Load a NIfTI image
=
# Method chaining for transforms
=
# Load directly to PyTorch tensor (most efficient)
=
Rust:
use nifti;
use ;
Transform Pipeline
Build composable transform pipelines with lazy evaluation and automatic optimization:
Python:
# Create a reusable pipeline
=
# Apply to multiple images
=
=
Rust:
use TransformPipeline;
let pipeline = new
.z_normalize
.clamp
.resample_to_shape;
let processed = pipeline.apply;
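The idea of lazy evaluation with operation fusion can be illustrated without medrs. The sketch below is a hypothetical `LazyPipeline` class in plain Python, not the library's implementation: calls only record operations, consecutive affine steps (scale, shift) are folded into a single multiply-add, and `apply` walks the data once for all recorded operations.

```python
class LazyPipeline:
    """Records elementwise operations and applies them in one pass."""

    def __init__(self):
        # Recorded ops: ("affine", a, b) means a*x + b; ("clamp", lo, hi).
        self._ops = []

    def _affine(self, a, b):
        # Fuse with a preceding affine op: a*(a1*x + b1) + b.
        if self._ops and self._ops[-1][0] == "affine":
            _, a1, b1 = self._ops.pop()
            a, b = a * a1, a * b1 + b
        self._ops.append(("affine", a, b))
        return self

    def scale(self, factor):
        return self._affine(factor, 0.0)

    def shift(self, offset):
        return self._affine(1.0, offset)

    def clamp(self, lo, hi):
        self._ops.append(("clamp", lo, hi))
        return self

    def apply(self, voxels):
        out = []
        for x in voxels:  # single pass over the data for all ops
            for kind, p, q in self._ops:
                x = p * x + q if kind == "affine" else min(max(x, p), q)
            out.append(x)
        return out

# Nothing is computed until apply() is called.
pipeline = LazyPipeline().shift(-100.0).scale(0.01).clamp(-1.0, 1.0)
result = pipeline.apply([0.0, 100.0, 150.0, 400.0])
print(pipeline._ops)
print(result)
```

Here the shift and scale collapse into one affine step, so three requested operations cost two per voxel. A step such as z-normalization fits the same scheme once the mean and standard deviation are known, because it is also an affine map.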
Random Augmentation
Reproducible augmentations for ML training with optional seeding:
Python:
=
# Individual augmentations
=
=
=
=
=
=
# Combined augmentation (flip + noise + scale + shift)
=
Rust:
use ;
// Individual augmentations
let flipped = random_flip?;
let noisy = random_gaussian_noise?;
// Combined augmentation
let augmented = random_augment?;
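Reproducibility through seeding comes down to driving every random decision from one generator created from the seed. The following standard-library sketch shows the pattern on a 1-D line of voxels; `flip_with_rng`, `add_gaussian_noise`, and `augment` are hypothetical helpers for illustration and are not the medrs functions listed in this README.

```python
import random

def flip_with_rng(voxels, rng, prob=0.5):
    """Reverse a 1-D line of voxels with probability prob."""
    return list(voxels[::-1]) if rng.random() < prob else list(voxels)

def add_gaussian_noise(voxels, rng, std=0.1):
    """Add zero-mean Gaussian noise with the given standard deviation."""
    return [v + rng.gauss(0.0, std) for v in voxels]

def augment(voxels, seed):
    # A private generator keeps results independent of global random state.
    rng = random.Random(seed)
    return add_gaussian_noise(flip_with_rng(voxels, rng), rng)

line = [0.0, 1.0, 2.0, 3.0]
a = augment(line, seed=42)
b = augment(line, seed=42)  # same seed: identical output
c = augment(line, seed=7)   # different seed: different output

print(a == b, a == c)
```

Passing the generator explicitly, rather than calling the module-level `random` functions, is what keeps two runs with the same seed identical even when other code also draws random numbers.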
Crop-First Loading
Load only the data you need - essential for training pipelines:
# Load a 64^3 patch starting at position (32, 32, 32)
=
# Load with resampling and reorientation in one step
=
# Load directly to GPU tensor
=
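To show why crop-first loading avoids reading the whole file, here is a minimal sketch that memory-maps an uncompressed float32 volume and touches only the bytes belonging to the requested patch. It does not call medrs. The `read_patch` helper is hypothetical, the file is a synthetic raw volume rather than a real NIfTI image, and the fixed 352-byte offset is an assumption that matches the usual data offset of single-file NIfTI-1 images; a real reader would take the offset from the header's vox_offset field. Gzip-compressed files cannot be addressed this way.

```python
import mmap
import os
import struct
import tempfile

HEADER_BYTES = 352  # assumed data offset (typical for single-file NIfTI-1)
ITEM = 4            # bytes per float32 voxel

def read_patch(path, shape, start, size):
    """Read a sub-volume from a float32 file stored with x varying fastest."""
    nx, ny, nz = shape
    (x0, y0, z0), (sx, sy, sz) = start, size
    patch = []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for z in range(z0, z0 + sz):
            for y in range(y0, y0 + sy):
                # Each run along x is contiguous on disk, so only these
                # bytes are paged in.
                offset = HEADER_BYTES + ((z * ny + y) * nx + x0) * ITEM
                patch.extend(struct.unpack_from(f"<{sx}f", mm, offset))
    return patch

# Build a tiny 4x4x4 volume whose voxel value equals its linear index.
shape = (4, 4, 4)
with tempfile.NamedTemporaryFile(delete=False, suffix=".raw") as tmp:
    tmp.write(b"\0" * HEADER_BYTES)
    tmp.write(struct.pack("<64f", *range(64)))

patch = read_patch(tmp.name, shape, start=(1, 1, 1), size=(2, 2, 2))
os.remove(tmp.name)
print(patch)
```

The amount of data read scales with the patch size instead of the volume size, which is consistent with the nearly flat medrs timings in the crop-first benchmark table.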
Training Data Loader
High-performance patch extraction for training:
=
# Training loop
=
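One building block of patch-based training is choosing patch positions that stay inside the volume. The sketch below shows a seeded sampler for patch origins; `sample_patch_origins` is a hypothetical helper, not part of the medrs loader, and the origins it yields could be passed to any crop-first loading call.

```python
import random

def sample_patch_origins(volume_shape, patch_size, n, seed=None):
    """Yield n random origins that keep the whole patch inside the volume."""
    rng = random.Random(seed)
    for _ in range(n):
        yield tuple(
            rng.randint(0, dim - p)  # randint's upper bound is inclusive
            for dim, p in zip(volume_shape, patch_size)
        )

origins = list(sample_patch_origins((512, 512, 512), (64, 64, 64), n=8, seed=0))
repeat = list(sample_patch_origins((512, 512, 512), (64, 64, 64), n=8, seed=0))

for origin in origins:
    print(origin)
```

For a 512³ volume and 64³ patches, every coordinate falls in the range 0 to 448, so no patch runs past the volume boundary.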
Available Transforms
Intensity Transforms
- z_normalize() / z_normalization() - Zero mean, unit variance
- rescale() / rescale_intensity() - Scale to [min, max] range
- clamp() - Clamp values to range
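As a reference for what these three transforms compute, here is a plain-Python sketch on flat lists of voxel values. The `_ref` functions are hypothetical stand-ins written for this explanation, not the medrs API, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev

def z_normalize_ref(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean, std = fmean(values), pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant input: avoid dividing by zero
    return [(v - mean) / std for v in values]

def rescale_ref(values, out_min, out_max):
    """Linearly map the observed [min, max] of values onto [out_min, out_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [out_min for _ in values]
    return [(v - lo) / (hi - lo) * (out_max - out_min) + out_min for v in values]

def clamp_ref(values, lo, hi):
    """Limit every value to the closed range [lo, hi]."""
    return [min(max(v, lo), hi) for v in values]

z = z_normalize_ref([1.0, 2.0, 3.0, 4.0])
print(z)
print(rescale_ref([2.0, 4.0, 6.0], 0.0, 1.0))
print(clamp_ref([-5.0, 0.5, 9.0], 0.0, 1.0))
```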
Spatial Transforms
- resample() / resample_to_spacing() - Resample to target spacing
- resample_to_shape() - Resample to target shape
- reorient() - Reorient to standard orientation (RAS, LPS, etc.)
- crop_or_pad() - Crop or pad to target shape
- flip() - Flip along specified axes
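Crop-or-pad is the least self-explanatory of these transforms, so here is a sketch of the logic along a single axis; a 3D version applies the same rule to each axis in turn. `crop_or_pad_1d` is a hypothetical helper, and centering the crop and padding with a constant fill value are assumptions about the behavior rather than documented medrs semantics.

```python
def crop_or_pad_1d(values, target, fill=0.0):
    """Center-crop or symmetrically pad a 1-D sequence to length target."""
    n = len(values)
    if n >= target:
        start = (n - target) // 2
        return list(values[start:start + target])
    before = (target - n) // 2       # any odd leftover goes to the end
    after = target - n - before
    return [fill] * before + list(values) + [fill] * after

cropped = crop_or_pad_1d([1, 2, 3, 4, 5], 3)
padded = crop_or_pad_1d([1, 2], 5)
print(cropped)
print(padded)
```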
Random Augmentation
- random_flip() - Random axis flipping
- random_gaussian_noise() - Additive Gaussian noise
- random_intensity_scale() - Random intensity scaling
- random_intensity_shift() - Random intensity offset
- random_rotate_90() - Random 90-degree rotations
- random_gamma() - Random gamma correction
- random_augment() - Combined augmentation pipeline
Performance
medrs uses several optimization strategies:
- SIMD: Trilinear interpolation uses AVX2/SSE, processing up to 8 f32 values per instruction with AVX2
- Parallel Processing: Rayon-based parallelism for large volumes
- Lazy Evaluation: Transform pipelines compose operations before execution
- Memory Mapping: Large files are memory-mapped to avoid full loads
- Buffer Pooling: Reusable buffers reduce allocation overhead
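For reference, the trilinear interpolation that the SIMD path accelerates can be written in scalar form as seven linear interpolations per output voxel. The sketch below is a plain-Python illustration with a hypothetical `trilinear` function, not the medrs kernel; a vectorized version would evaluate the same arithmetic for several output voxels at once.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t

def trilinear(volume, x, y, z):
    """Sample volume[z][y][x] at a fractional coordinate inside the grid."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    x0, y0, z0 = math.floor(x), math.floor(y), math.floor(z)
    # Clamp the upper neighbor so coordinates on the last voxel stay in bounds.
    x1, y1, z1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1), min(z0 + 1, nz - 1)
    fx, fy, fz = x - x0, y - y0, z - z0

    # Four interpolations along x, two along y, one along z.
    c00 = lerp(volume[z0][y0][x0], volume[z0][y0][x1], fx)
    c01 = lerp(volume[z0][y1][x0], volume[z0][y1][x1], fx)
    c10 = lerp(volume[z1][y0][x0], volume[z1][y0][x1], fx)
    c11 = lerp(volume[z1][y1][x0], volume[z1][y1][x1], fx)
    return lerp(lerp(c00, c01, fy), lerp(c10, c11, fy), fz)

# 2x2x2 cube with value x + 2*y + 4*z, which trilinear interpolation
# reproduces exactly because it is linear in each coordinate.
cube = [[[x + 2 * y + 4 * z for x in range(2)] for y in range(2)]
        for z in range(2)]
center = trilinear(cube, 0.5, 0.5, 0.5)
corner = trilinear(cube, 1, 1, 1)
print(center, corner)
```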
Examples
See the examples/ directory for:
- basic/ - Loading, transforms, and saving
- integrations/ - PyTorch, MONAI, JAX integration
- advanced/ - Async pipelines, custom transforms
Testing
# Rust tests
# Python tests
# Benchmarks (requires torch, monai, torchio)
# Generate benchmark plots
License
medrs is dual-licensed under MIT and Apache-2.0. See LICENSE for details.
Contributing
See CONTRIBUTING.md for guidelines.
Maintainer
Liam Chalcroft (liam.chalcroft.20@ucl.ac.uk)