oxigdal-ml 0.1.1

Machine learning capabilities for OxiGDAL - ONNX Runtime integration for geospatial ML workflows

OxiGDAL ML

Production-ready Machine Learning infrastructure for geospatial data processing in Pure Rust


Overview

OxiGDAL ML is a comprehensive machine learning framework for geospatial raster data, built on top of the OxiGDAL ecosystem. It provides production-ready ML capabilities including model inference, training, optimization, and deployment across multiple platforms and backends.

Key Benefits:

  • Pure Rust Implementation: No C/Fortran dependencies for core functionality
  • Multi-Backend Support: ONNX Runtime, CoreML, TensorFlow Lite
  • GPU Acceleration: 7 GPU backends (CUDA, Metal, Vulkan, OpenCL, ROCm, DirectML, WebGPU)
  • Production Features: Model serving, health checks, batch processing, monitoring
  • Comprehensive Documentation: 10,000+ lines of guides and examples

Key Features

🚀 Model Inference

  • ONNX Runtime 2.0 - Full integration with GPU acceleration
  • CoreML - Native Apple acceleration (CPU, GPU, Neural Engine)
  • TensorFlow Lite - Mobile and edge deployment
  • Tiled Inference - Process large images efficiently
  • Batch Processing - Auto-tuning and progress tracking
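Tiled inference works by splitting a large raster into fixed-size windows with a small overlap, running the model per tile, and stitching the results back together. A minimal sketch of the window computation along one axis (illustrative only; `tile_windows` is not the oxigdal-ml API):

```rust
// Hypothetical sketch: compute overlapping tile windows along one raster
// axis. `tile` is the tile edge length, `overlap` the border shared by
// neighbouring tiles so that seams can be blended when stitching.
fn tile_windows(extent: usize, tile: usize, overlap: usize) -> Vec<(usize, usize)> {
    let stride = tile - overlap;
    let mut windows = Vec::new();
    let mut start = 0;
    loop {
        // Clamp the last window so it ends exactly at the raster edge.
        if start + tile >= extent {
            windows.push((extent.saturating_sub(tile), extent));
            break;
        }
        windows.push((start, start + tile));
        start += stride;
    }
    windows
}

fn main() {
    // A 1000-pixel axis covered by 256-pixel tiles with 32 px of overlap.
    for (lo, hi) in tile_windows(1000, 256, 32) {
        println!("tile rows {lo}..{hi}");
    }
}
```

The same computation is applied to both axes, giving a grid of windows; the overlap region is typically cross-faded to hide tile seams in the model output.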

🎓 Training Infrastructure

  • Optimizers: Adam (with bias correction), SGD (with momentum)
  • Schedulers: Step decay, exponential, polynomial
  • Early Stopping: Patience-based with proper validation
  • Checkpointing: Save/restore training state
  • Loss Functions: MSE, CrossEntropy, Focal, Dice, IoU
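The patience-based early stopping above can be sketched in a few lines: training halts once the validation loss has failed to improve by at least `min_delta` for `patience` consecutive epochs (names are illustrative, not the crate's API):

```rust
// Hypothetical sketch of patience-based early stopping.
struct EarlyStopping {
    patience: usize,
    min_delta: f64,
    best: f64,
    bad_epochs: usize,
}

impl EarlyStopping {
    fn new(patience: usize, min_delta: f64) -> Self {
        Self { patience, min_delta, best: f64::INFINITY, bad_epochs: 0 }
    }

    /// Feed one validation loss per epoch; returns true when training should stop.
    fn step(&mut self, val_loss: f64) -> bool {
        if val_loss < self.best - self.min_delta {
            self.best = val_loss;
            self.bad_epochs = 0; // improvement resets the counter
        } else {
            self.bad_epochs += 1;
        }
        self.bad_epochs >= self.patience
    }
}

fn main() {
    let mut stop = EarlyStopping::new(3, 0.001);
    for (epoch, &loss) in [0.9, 0.5, 0.4, 0.4, 0.4, 0.4].iter().enumerate() {
        if stop.step(loss) {
            println!("stopping at epoch {epoch}");
            break;
        }
    }
}
```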

🏗️ Model Architectures

  • ResNet (18, 34, 50, 101, 152) - Classification backbone
  • UNet - Semantic segmentation
  • Transformer - Multi-head attention for time series
  • LSTM - Sequential data processing

🔧 Data Processing

  • GeoTIFF Loading - Integration with oxigdal-geotiff
  • Data Augmentation - 11 techniques (flip, rotate, blur, crop, noise, etc.)
  • LRU Caching - Efficient dataset and model caching
  • Normalization - Per-channel statistics
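Per-channel normalization computes a mean and standard deviation for each band independently and rescales pixels to zero mean / unit variance, so bands with very different value ranges (e.g. visible vs. thermal) contribute comparably. A minimal sketch, not the oxigdal-ml data API:

```rust
// Hypothetical sketch of per-channel normalization for one band.
fn channel_stats(band: &[f32]) -> (f32, f32) {
    let n = band.len() as f32;
    let mean = band.iter().sum::<f32>() / n;
    let var = band.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    (mean, var.sqrt())
}

fn normalize_channel(band: &mut [f32]) {
    let (mean, std) = channel_stats(band);
    let std = if std > 0.0 { std } else { 1.0 }; // guard against constant bands
    for v in band.iter_mut() {
        *v = (*v - mean) / std;
    }
}

fn main() {
    let mut red = vec![10.0, 20.0, 30.0, 40.0];
    normalize_channel(&mut red);
    println!("{red:?}"); // zero mean, unit variance
}
```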

⚡ Model Optimization

  • Quantization - INT8, UINT8, FP16, INT4 with calibration
  • Pruning - Structured, magnitude-based, gradient-based
  • Knowledge Distillation - Teacher-student model compression
  • Performance Benchmarking - Speedup and accuracy metrics
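Calibrated INT8 quantization boils down to deriving a scale and zero-point from the observed min/max of the calibration samples, then mapping floats onto the signed 8-bit range. A sketch of the idea only, not the crate's `QuantizationConfig` pipeline:

```rust
// Hypothetical sketch of asymmetric INT8 quantization parameters derived
// from a calibration range [min, max].
fn int8_params(min: f32, max: f32) -> (f32, i32) {
    // Map [min, max] onto the signed INT8 range [-128, 127].
    let scale = (max - min) / 255.0;
    let zero_point = (-128.0 - min / scale).round() as i32;
    (scale, zero_point)
}

fn quantize(x: f32, scale: f32, zero_point: i32) -> i8 {
    ((x / scale).round() as i32 + zero_point).clamp(-128, 127) as i8
}

fn dequantize(q: i8, scale: f32, zero_point: i32) -> f32 {
    (q as i32 - zero_point) as f32 * scale
}

fn main() {
    let (scale, zp) = int8_params(0.0, 6.0); // e.g. ReLU6 activation range
    let x = 3.14_f32;
    let q = quantize(x, scale, zp);
    println!("{x} -> {q} -> {}", dequantize(q, scale, zp));
}
```

The round trip is lossy by at most half a quantization step (`scale / 2`), which is why calibration on representative data matters: a too-wide range inflates `scale` and the per-value error with it.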

🎮 GPU Acceleration

  • CUDA (NVIDIA) - Dynamic detection, device enumeration
  • Metal (Apple) - Native macOS/iOS support
  • Vulkan - Cross-platform compute
  • OpenCL - Industry standard compute
  • ROCm (AMD) - AMD GPU support
  • DirectML (Windows) - Microsoft's ML acceleration
  • WebGPU - Browser-based compute

🌐 Production Features

  • Model Zoo - 6 pretrained models with automatic download
  • Health Checks - Memory monitoring and status reporting
  • Model Serving - REST API integration patterns
  • Monitoring - Performance tracking and drift detection
  • Batch Inference - Memory-aware parallel processing

Installation

Add to your Cargo.toml:

[dependencies]
oxigdal-ml = "0.1"
oxigdal-ml-foundation = "0.1"

# Or, to enable specific features (a TOML table may not repeat a key,
# so use this form instead of the plain version string above):
# oxigdal-ml = { version = "0.1", features = ["gpu", "cuda", "temporal", "cloud-removal"] }

System Requirements

  • Rust: 1.75 or later
  • Platform: Linux, macOS (x86_64/ARM64), Windows
  • Optional: ONNX Runtime, CUDA, Metal framework

Quick Start

Running Inference with ONNX

use oxigdal_ml::models::{OnnxModel, Model};
use oxigdal_ml::models::onnx::OnnxConfig;
use oxigdal_core::raster::RasterBuffer;

// Load ONNX model
let config = OnnxConfig::default();
let model = OnnxModel::from_file("model.onnx", config)?;

// Run inference
let input: RasterBuffer = load_geotiff("input.tif")?;
let output = model.predict(&input)?;

// Process results
save_geotiff("output.tif", &output)?;

Training a Model

use oxigdal_ml_foundation::training::{Trainer, TrainingConfig};
use oxigdal_ml_foundation::data::dataset::GeoTiffDataset;
use oxigdal_ml_foundation::models::unet::UNet;

// Create dataset
let dataset = GeoTiffDataset::new(image_paths, (256, 256))?
    .with_labels(label_paths)?;

// Configure training
let config = TrainingConfig::default()
    .with_batch_size(16)
    .with_epochs(100)
    .with_early_stopping(10, 0.001)?;

// Train model
let model = UNet::new(unet_config)?;
let trainer = Trainer::new(model, dataset, config)?;
let trained_model = trainer.train()?;

Model Optimization

use oxigdal_ml::optimization::quantization::{quantize_model, QuantizationConfig, QuantizationType};

// Quantize model to INT8
let quant_config = QuantizationConfig::builder()
    .quantization_type(QuantizationType::Int8)
    .calibration_samples(100)
    .build();

let result = quantize_model("model.onnx", "quantized_model.onnx", quant_config)?;

println!("Size reduction: {:.1}%", result.size_reduction_percent());
println!("Compression ratio: {:.1}x", result.compression_ratio());

Batch Processing with Progress

use oxigdal_ml::batch::{BatchProcessor, BatchConfig};

// Configure batch processing
let config = BatchConfig::default()
    .with_auto_tuning(true)
    .with_num_workers(4);

let processor = BatchProcessor::new(model, config);

// Process with progress bar
let results = processor.infer_batch_with_progress(inputs, true)?;

Feature Flags

Feature        Description                           Default
std            Standard library support              ✅ Yes
gpu            GPU acceleration via CUDA/TensorRT    ❌ No
cuda           NVIDIA CUDA backend                   ❌ No
metal          Apple Metal backend                   ❌ No
vulkan         Vulkan compute backend                ❌ No
opencl         OpenCL backend                        ❌ No
rocm           AMD ROCm backend                      ❌ No
directml       DirectML backend (Windows)            ❌ No
coreml         CoreML backend (macOS/iOS)            ❌ No
tflite         TensorFlow Lite backend               ❌ No
quantization   Model quantization                    ❌ No
pruning        Model pruning                         ❌ No
distillation   Knowledge distillation                ❌ No
temporal       Temporal forecasting                  ❌ No
cloud-removal  Cloud detection/removal               ❌ No

Platform Support

Platform         Build   GPU Backends
Linux x86_64     ✅      CUDA, Vulkan, OpenCL, ROCm
macOS ARM64      ✅      Metal
macOS x86_64     ✅      Metal
Windows x86_64   ✅      CUDA, DirectML, Vulkan
iOS              ✅      Metal
Android          ✅      Vulkan, OpenCL

Examples

Transfer Learning

use oxigdal_ml_foundation::transfer::{FeatureExtractor, FeatureExtractorConfig};

let config = FeatureExtractorConfig::default()
    .with_freeze_until("layer4")?;

let extractor = FeatureExtractor::new(pretrained_model, config)?;
let features = extractor.extract(&input)?;

// Train custom classifier on extracted features

GPU Acceleration

use oxigdal_ml::gpu::{select_best_device, GpuBackend};

// Automatically select best GPU
let device = select_best_device()?;
println!("Using {} GPU: {}", device.backend, device.name);

// Configure model for GPU
let config = OnnxConfig::default()
    .with_gpu(device.backend)
    .with_device_index(device.index);

Cloud Detection and Removal

use oxigdal_ml::cloud::{CloudDetector, CloudRemover};

// Detect clouds
let detector = CloudDetector::new(cloud_config)?;
let cloud_mask = detector.detect(&satellite_image)?;

// Remove clouds via temporal interpolation
let remover = CloudRemover::new(removal_config)?;
let clean_image = remover.remove(&image_sequence)?;

Temporal Forecasting

use oxigdal_ml::temporal::{TemporalForecaster, ForecastConfig};

let config = ForecastConfig::default()
    .with_horizon(7)  // 7-day forecast
    .with_model("transformer");

let forecaster = TemporalForecaster::new(config)?;
let forecast = forecaster.predict(&time_series)?;

Testing

Run the complete test suite:

# All tests
cargo test --all-features

# Specific package
cargo test -p oxigdal-ml --lib --features temporal,cloud-removal
cargo test -p oxigdal-ml-foundation --lib --all-features

# With output
cargo test -- --nocapture --test-threads=1

Test Results: 316/317 tests passing (99.68%)

Performance

  • Quantization: 2-8x model compression (INT8: 4x, INT4: 8x)
  • Pruning: Up to 80% sparsity supported
  • GPU Acceleration: 10-100x speedup on supported hardware
  • Batch Processing: Auto-tuned for available memory
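The compression figures above follow directly from the bit widths: quantizing FP32 weights (32 bits) to INT8 (8 bits) gives 4x, INT4 (4 bits) gives 8x, FP16 gives 2x. A back-of-envelope check, mirroring the percentages the quantization example prints (function names are illustrative):

```rust
// Back-of-envelope compression arithmetic for weight quantization.
fn compression_ratio(bits_before: f64, bits_after: f64) -> f64 {
    bits_before / bits_after
}

fn size_reduction_percent(ratio: f64) -> f64 {
    (1.0 - 1.0 / ratio) * 100.0
}

fn main() {
    for (name, bits) in [("FP16", 16.0), ("INT8", 8.0), ("INT4", 4.0)] {
        let r = compression_ratio(32.0, bits);
        println!("{name}: {r:.0}x ({:.1}% smaller)", size_reduction_percent(r));
    }
}
```

Real model files shave off slightly less than these ideal ratios, since graph structure, biases, and quantization parameters (scales, zero-points) are stored alongside the weights.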

See the Optimization Guide for detailed performance tuning.

Project Status

  • Version: 0.1.1
  • Status: Production Ready
  • Tests: 316/317 passing (99.68%)
  • Documentation: Comprehensive (10,000+ lines)
  • COOLJAPAN Compliance: 100%

Contributing

We welcome contributions! Please see our Integration Guide for:

  • Adding new model architectures
  • Implementing new backends
  • Extending data loaders
  • Custom optimizers and loss functions

Development

# Build with all features
cargo build --all-features

# Run tests
cargo test --all-features

# Check code quality
cargo clippy --all-features
cargo fmt --check

License

Licensed under the Apache License, Version 2.0 (LICENSE-APACHE).

Acknowledgments

Part of the COOLJAPAN ecosystem of Pure Rust libraries.


OxiGDAL ML - Production-ready ML for geospatial data in Pure Rust 🦀