Ahuvista-NN: Modular Multi-Modal Neural Network for Healthcare
A lightweight, Rust-based neural network library optimized for multi-modal data processing in low-compute environments, with a focus on maternal health outcome predictions.
🌟 Key Features
- 🧩 Modular Multi-Modal Architecture: Selectively enable/disable tabular, temporal, text, and image modalities based on available data
- 🔗 Late Fusion Strategy: Extracts specialized features from each modality before combining them for classification
- ⚖️ Population-Informed Weighting: Addresses class imbalance with demographic stratification and cause-specific weights
- ⚡ Optimized for Efficiency: Designed to run on low-compute resources (edge devices, embedded systems, limited infrastructure)
- 🔄 Complete Data Pipeline: Includes preprocessors and transformers for all supported data types
- ⚙️ Flexible Configuration: Configure via JSON files, programmatic API, or command-line arguments
- 🚀 Production-Ready Inference: Lightweight prediction binary optimized for deployment
- 📊 Bayesian Calibration: Probability calibration based on population prevalence for reliable risk assessment
📋 Table of Contents
- Installation
- Quick Start
- Architecture Overview
- Training Models
- Making Predictions
- Modality Selection Guide
- Data Format Specifications
- Configuration Reference
- Advanced Usage
- Performance & Benchmarks
- Examples
- Troubleshooting
- Contributing
- License
🚀 Installation
Prerequisites
- Rust: 1.70 or higher (Install Rust)
- Cargo: Comes bundled with Rust
- Operating Systems: Linux, macOS, Windows
Build from Source
# Clone the repository (substitute the actual repository URL)
git clone https://github.com/dandychux/ahuvista-nn.git
cd ahuvista-nn

# Build the project (debug mode)
cargo build

# Build optimized release version
cargo build --release

# Run tests to verify installation
cargo test
Compiled Binaries
After building, you'll find two binaries in target/release/:
- `ahuvista-train`: Training binary with full configuration options
- `ahuvista-predict`: Lightweight inference binary for production deployment
Verify Installation
# Check training binary (a standard --help flag is assumed)
./target/release/ahuvista-train --help

# Check prediction binary
./target/release/ahuvista-predict --help
🏃 Quick Start
1. Train a Simple Model (Tabular Data Only)
# Train with minimal configuration
./target/release/ahuvista-train --modalities tabular --epochs 50 --output-dir ./models
2. Train with Multiple Modalities
# Train with tabular and temporal data
./target/release/ahuvista-train --modalities tabular,temporal --epochs 100 --batch-size 32 --output-dir ./models
3. Make a Prediction
First, create an input file patient_data.json:
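A minimal illustrative input file. The exact schema is defined by the library, so treat the field names and array layouts below as assumptions based on the modality names and CSV columns used elsewhere in this README:

```json
{
  "patient_id": "PAT-001",
  "tabular": [28, 24.5, 120, 75, 98.6, 12.5, 95, 0.8, 0.3],
  "temporal": [
    [120, 80, 98.6, 72, 16],
    [125, 82, 98.8, 75, 18]
  ]
}
```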
Then run prediction (the input/output flags shown here are illustrative):

./target/release/ahuvista-predict --input patient_data.json --output prediction_result.json
🏗️ Architecture Overview
System Design
Ahuvista-NN uses a late fusion architecture where each modality is processed by a specialized neural network before features are combined:
┌─────────────────────────────────────────────────────────────────────┐
│ Modular Late Fusion Network │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌────────┐│
│ │ Tabular │ │ Temporal │ │ Text │ │ Image ││
│ │ Network │ │ Network │ │ Network │ │Network ││
│ │ (Feed-Fwd) │ │ (RNN) │ │ (RNN + │ │ (CNN) ││
│ │ │ │ │ │ Embedding) │ │ ││
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └───┬────┘│
│ │ │ │ │ │
│ │ Feature │ Feature │ Feature │ │
│ │ Vector │ Vector │ Vector │ │
│ │ (16-32d) │ (16-32d) │ (32-64d) │ │
│ │ │ │ │ │
│ └─────────────────┴─────────────────┴──────────────┘ │
│ │ │
│ ┌───────▼────────┐ │
│ │ Concatenation │ │
│ │ Fusion Layer │ │
│ └───────┬────────┘ │
│ │ │
│ ┌───────▼────────┐ │
│ │ Classifier │ │
│ │ Network │ │
│ │ (64→32→1) │ │
│ └───────┬────────┘ │
│ │ │
│ ┌───────▼────────┐ │
│ │ Prediction │ │
│ │ Risk Score │ │
│ │ (0.0-1.0) │ │
│ └────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Supported Modalities
| Modality | Description | Network Type | Use Cases |
|---|---|---|---|
| Tabular | Structured clinical data | Feed-forward MLP | Demographics, lab results, vital signs |
| Temporal | Time-series sequences | Recurrent (RNN) | Continuous monitoring, longitudinal data |
| Text | Unstructured text | RNN + Embeddings | Clinical notes, reports, documentation |
| Image | Visual data | Convolutional (CNN) | Medical imaging, ultrasounds, photographs |
Why Late Fusion?
- Modality-Specific Learning: Each network learns patterns specific to its data type
- Flexible Architecture: Easy to add/remove modalities without retraining everything
- Computational Efficiency: Only processes enabled modalities
- Interpretability: Can analyze contribution of each modality independently
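Conceptually, concatenation fusion just joins the per-modality feature vectors end to end. A self-contained sketch (not the actual library code):

```rust
/// Concatenate per-modality feature vectors into one fused vector.
/// Only enabled modalities contribute, so disabling a modality simply
/// shortens the fused vector fed to the classifier network.
fn fuse_features(modality_features: &[Vec<f32>]) -> Vec<f32> {
    modality_features.iter().flatten().copied().collect()
}
```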
🎓 Training Models
Ahuvista-NN offers three methods for training models, each suited for different use cases.
1. Configuration File Approach (Recommended)
Best for: Production deployments, reproducible experiments, complex configurations
Step 1: Create Configuration File
Create config.json:
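A plausible config.json assembled from the parameter names documented in the Configuration Reference below. The section nesting and exact schema are assumptions:

```json
{
  "modalities": {
    "use_tabular": true,
    "use_temporal": true,
    "use_text": false,
    "use_image": false
  },
  "training": {
    "epochs": 100,
    "batch_size": 32,
    "learning_rate": 0.001
  },
  "model": {
    "tabular_hidden_sizes": [128, 64],
    "tabular_output_size": 32,
    "temporal_hidden_size": 64,
    "temporal_output_size": 32,
    "classifier_hidden_sizes": [64, 32]
  }
}
```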
Step 2: Train with Configuration

./target/release/ahuvista-train --config config.json
Step 3: Override Specific Parameters
# Use config but override training parameters
./target/release/ahuvista-train --config config.json --epochs 200 --learning-rate 0.0005
2. Command-Line Interface
Best for: Quick experiments, testing, simple configurations
Basic Training

./target/release/ahuvista-train --modalities tabular --epochs 50
All CLI Options
| Option | Short | Description | Example |
|---|---|---|---|
| `--config` | `-c` | Path to JSON config file | `--config config.json` |
| `--modalities` | `-m` | Comma-separated modalities | `--modalities tabular,text` |
| `--epochs` | `-e` | Number of training epochs | `--epochs 100` |
| `--batch-size` | `-b` | Training batch size | `--batch-size 64` |
| `--learning-rate` | `-l` | Learning rate | `--learning-rate 0.001` |
| `--output-dir` | `-o` | Model output directory | `--output-dir ./models` |
| `--verbose` | `-v` | Enable verbose logging | `--verbose` |
Examples
# Quick test with verbose output
./target/release/ahuvista-train --modalities tabular --epochs 10 --verbose

# Production training with all options
./target/release/ahuvista-train --config config.json --epochs 100 --batch-size 64 --learning-rate 0.001 --output-dir ./models --verbose

# Multi-modal with custom output
./target/release/ahuvista-train --modalities tabular,temporal,text --output-dir ./models/multimodal
3. Programmatic API
Best for: Custom training pipelines, research, integration with other systems
use ahuvista_nn::*;              // illustrative import; exact module paths may differ
use std::collections::HashMap;
Advanced: Dynamic Modality Selection
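The pattern can be sketched without the library. Below is a hypothetical `ModalityConfig` (not the actual ahuvista-nn API) that enables flags from whatever data sources are present at runtime and enforces the at-least-one-modality rule:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
struct ModalityConfig {
    use_tabular: bool,
    use_temporal: bool,
    use_text: bool,
    use_image: bool,
}

impl ModalityConfig {
    /// Enable modalities based on which data sources exist at runtime,
    /// rejecting configurations where every modality is disabled.
    fn from_available_data(available: &HashMap<&str, bool>) -> Result<Self, String> {
        let has = |k: &str| *available.get(k).unwrap_or(&false);
        let cfg = Self {
            use_tabular: has("tabular"),
            use_temporal: has("temporal"),
            use_text: has("text"),
            use_image: has("image"),
        };
        if !(cfg.use_tabular || cfg.use_temporal || cfg.use_text || cfg.use_image) {
            return Err("At least one modality must be enabled".to_string());
        }
        Ok(cfg)
    }
}
```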
🔮 Making Predictions
Single Prediction
Step 1: Prepare Input Data
Create patient_input.json:
Step 2: Run Prediction (the input/output flags shown here are illustrative)

./target/release/ahuvista-predict --config config.json --input patient_input.json --output prediction_result.json

Or using modality specification:

./target/release/ahuvista-predict --modalities tabular,temporal --input patient_input.json --output prediction_result.json
Step 3: View Results
Output file prediction_result.json:
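A sketch of what the output might contain; the field names here are assumptions based on the risk score and threshold behavior described in this README:

```json
{
  "patient_id": "PAT-001",
  "risk_score": 0.23,
  "risk_level": "low",
  "modalities_used": ["tabular", "temporal"]
}
```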
Batch Predictions
Process multiple patients efficiently:
Step 1: Organize Input Files
mkdir -p batch_inputs
# Add patient_001.json, patient_002.json, patient_003.json, etc.
Step 2: Run Batch Prediction

# A batch-directory flag is assumed here
./target/release/ahuvista-predict --batch batch_inputs/ --output batch_results.json
Step 3: Review Batch Results
Output batch_results.json contains an array of all predictions:
Calibrated Predictions
Apply Bayesian calibration for more accurate probability estimates:

./target/release/ahuvista-predict --input patient_input.json --calibrate 0.001
The --calibrate parameter specifies the population prevalence (e.g., 0.001 = 0.1% prevalence).
Why Use Calibration?
- Adjusts predictions based on known population prevalence
- Provides more reliable probability estimates
- Crucial for imbalanced datasets
- Recommended for clinical decision support
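The underlying idea can be sketched as a prior-odds correction. This is one standard Bayesian recalibration formula; the library's exact method may differ:

```rust
/// Re-calibrate a model probability `p` from the prevalence seen in the
/// training data (`p_train`) to the true population prevalence (`p_pop`)
/// by rescaling the odds with the ratio of the two prior odds.
fn calibrate(p: f64, p_train: f64, p_pop: f64) -> f64 {
    let beta = (p_pop / (1.0 - p_pop)) / (p_train / (1.0 - p_train));
    (p * beta) / (p * beta + (1.0 - p))
}
```

Note that a score equal to the training prevalence maps exactly to the population prevalence, which is the sanity check to run after wiring up any calibration step.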
🎛️ Modality Selection Guide
Understanding Modalities
Each modality processes a different type of healthcare data:
| Modality | Data Type | Examples | When to Use |
|---|---|---|---|
| Tabular | Structured | Demographics, lab results, vitals | Always include if available |
| Temporal | Time-series | Continuous monitoring, trends | When tracking changes over time |
| Text | Unstructured | Clinical notes, reports | Rich qualitative information available |
| Image | Visual | Ultrasounds, X-rays, photos | Visual diagnosis required |
Common Configuration Patterns
Pattern 1: Clinical Data Only (Tabular + Temporal)
Use Case: EMR/EHR systems with structured data
Configuration:
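A sketch of the modality flags for this pattern, assuming a JSON config with the boolean `use_*` fields listed in the Configuration Reference:

```json
{
  "modalities": {
    "use_tabular": true,
    "use_temporal": true,
    "use_text": false,
    "use_image": false
  }
}
```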
Pattern 2: Medical Imaging Focus
Use Case: Radiology, ultrasound analysis
Pattern 3: Documentation-Rich System
Use Case: Systems with extensive clinical notes
Pattern 4: Full Multi-Modal (All Data Types)
Use Case: Comprehensive healthcare systems with all data types
Modality Selection Decision Tree
Do you have structured patient data?
├─ YES → Enable TABULAR
│ │
│ └─ Do you have time-series data?
│ ├─ YES → Enable TEMPORAL
│ └─ NO → Continue
│
└─ Do you have clinical notes?
├─ YES → Enable TEXT
└─ NO → Continue
Do you have medical images?
├─ YES → Enable IMAGE
└─ NO → Complete configuration
Performance Considerations
| Configuration | Memory | Speed | Accuracy Potential |
|---|---|---|---|
| Tabular only | ~50MB | Very Fast | Good |
| Tabular + Temporal | ~75MB | Fast | Better |
| Tabular + Text | ~100MB | Moderate | Better |
| Tabular + Image | ~150MB | Moderate | Better |
| All modalities | ~200MB | Slower | Best |
📊 Data Format Specifications
Tabular Data
Format: CSV files with headers
Requirements:
- Must include a `PatientID` column (or configured identifier)
- Numeric features only
- Missing values should be imputed before training
Example (patient_data.csv):
PatientID,Age,BMI,BloodPressure,HeartRate,Temperature,Hemoglobin,Glucose,RiskFactor1,RiskFactor2
PAT-001,28,24.5,120,75,98.6,12.5,95,0.8,0.3
PAT-002,35,28.1,130,82,99.1,11.8,110,0.6,0.5
PAT-003,22,22.3,115,70,98.4,13.2,88,0.2,0.1
Temporal Data
Format: CSV with timestamp column
Requirements:
- Must include a `PatientID` column
- Must include a `Timestamp` column or time-ordered rows
- Each patient can have multiple time points
Example (vitals_timeseries.csv):
PatientID,Timestamp,SystolicBP,DiastolicBP,Temperature,HeartRate,RespiratoryRate
PAT-001,2024-01-01 08:00,120,80,98.6,72,16
PAT-001,2024-01-01 12:00,125,82,98.8,75,18
PAT-001,2024-01-01 16:00,130,85,99.0,78,20
PAT-002,2024-01-01 08:00,135,88,99.2,85,22
Text Data
Format: Plain text files or CSV with text column
Requirements:
- One document per patient
- UTF-8 encoding
- Vocabulary will be built automatically
Example (clinical_notes.txt):
PAT-001: Patient presents with mild hypertension. Blood pressure elevated but stable...
PAT-002: Routine prenatal visit. Patient reports feeling well. No complications noted...
Image Data
Format: Standard image formats (JPG, PNG, TIFF)
Requirements:
- Images will be resized to configured dimensions (default 224x224)
- RGB or grayscale
- One image per patient (or multiple images in directory structure)
Directory Structure:
images/
├── PAT-001.jpg
├── PAT-002.png
├── PAT-003.jpg
Prediction Input Format
JSON Structure:
Complete Example:
⚙️ Configuration Reference
Complete Configuration Schema
Configuration Parameter Guide
Modalities Section
| Parameter | Type | Default | Description |
|---|---|---|---|
| `use_tabular` | boolean | `true` | Process structured tabular data |
| `use_temporal` | boolean | `true` | Process time-series sequences |
| `use_text` | boolean | `false` | Process unstructured text |
| `use_image` | boolean | `false` | Process image data |
Note: At least one modality must be enabled.
Training Section
| Parameter | Type | Range | Recommended | Description |
|---|---|---|---|---|
| `epochs` | integer | 1-1000 | 50-100 | Full training passes |
| `batch_size` | integer | 1-256 | 16-64 | Samples per batch |
| `learning_rate` | float | 0.00001-0.1 | 0.001-0.01 | Gradient step size |
Tuning Tips:
- Small datasets (< 1000 samples): Lower batch size (8-16), more epochs (100-200)
- Medium datasets (1000-10000): Standard settings (batch=32, epochs=50-100)
- Large datasets (> 10000): Larger batch size (64-128), fewer epochs (20-50)
Model Architecture Section
Tabular Network:
- `tabular_hidden_sizes`: List of hidden layer sizes. Example: `[128, 64]` creates two layers
- `tabular_output_size`: Feature vector dimension (typically 16-32)
Temporal Network:
- `temporal_hidden_size`: RNN memory size (typically 32-64)
- `temporal_output_size`: Feature vector dimension (typically 16-32)
Text Network:
- `text_embedding_dim`: Word vector size (typically 50-300)
- `text_hidden_size`: RNN memory size (typically 32-128)
- `text_max_vocab`: Vocabulary limit (typically 5000-20000)
Image Network:
- `image_channels`: 3 for RGB, 1 for grayscale
- `image_height` / `image_width`: Target dimensions (commonly 224x224 or 256x256)
Classifier:
- `classifier_hidden_sizes`: Final layers that combine modality features
🎯 Advanced Usage
Custom Training with Population Weighting
use ahuvista_nn::*;              // illustrative import; exact module paths may differ
use std::collections::HashMap;

// Define population-level cause frequencies
let mut cause_frequencies = HashMap::new();
cause_frequencies.insert("hemorrhage".to_string(), 0.25);        // Hemorrhage: 25%
cause_frequencies.insert("hypertensive".to_string(), 0.20);      // Hypertensive disorders: 20%
cause_frequencies.insert("infection".to_string(), 0.15);         // Infection: 15%
cause_frequencies.insert("thromboembolism".to_string(), 0.10);   // Thromboembolism: 10%
cause_frequencies.insert("other".to_string(), 0.30);             // Other: 30%

// Type and function names below are reconstructed and may differ from the crate
let cause_weights = CauseWeights::from_population_frequencies(&cause_frequencies);

// Define demographic strata (keys and multipliers are illustrative)
let mut stratum_multipliers = HashMap::new();
stratum_multipliers.insert("age_under_20".to_string(), 1.5);
stratum_multipliers.insert("age_20_34".to_string(), 1.0);
stratum_multipliers.insert("age_35_plus".to_string(), 1.3);
stratum_multipliers.insert("high_risk_region".to_string(), 1.4);

// Compute per-sample weights (labels and strata come from your dataset)
let sample_weights = compute_sample_weights(&labels, &strata, &cause_weights, &stratum_multipliers);
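The weighting idea itself is easy to state: rarer causes get larger weights (inverse frequency), scaled by a stratum multiplier. A self-contained sketch of that computation, independent of the actual crate API:

```rust
use std::collections::HashMap;

/// Inverse-frequency cause weight scaled by a demographic stratum multiplier.
/// Rare causes and high-risk strata both increase a sample's training weight;
/// unknown causes or strata fall back to a neutral weight of 1.0.
fn sample_weight(
    cause: &str,
    stratum: &str,
    cause_frequencies: &HashMap<&str, f64>,
    stratum_multipliers: &HashMap<&str, f64>,
) -> f64 {
    let freq = cause_frequencies.get(cause).copied().unwrap_or(1.0);
    let mult = stratum_multipliers.get(stratum).copied().unwrap_or(1.0);
    (1.0 / freq) * mult
}
```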
Model Inspection and Debugging
// Check enabled modalities
let modalities = model.enabled_modalities();
println!("Enabled modalities: {:?}", modalities);

// Verify a specific modality (enum name is illustrative)
if model.is_enabled(Modality::Tabular) {
    println!("Tabular network is active");
}

// Get model configuration
let config = model.config();
println!("Config: {:?}", config);
Save and Load Models
// Save trained model (path is illustrative)
model.save("models/ahuvista_model.bin")?;

// Load model for inference (type name and constructor are illustrative)
let mut loaded_model = ModularMultiModalNetwork::new(config)?;
loaded_model.load("models/ahuvista_model.bin")?;
Custom Prediction Pipeline
use ahuvista_nn::MultiModalInput; // illustrative module path

// Build input programmatically (setter names are illustrative)
let mut input = MultiModalInput::new();
if has_tabular_data {
    input.set_tabular(tabular_features);
}
if has_temporal_data {
    input.set_temporal(temporal_sequence);
}

// Make prediction
let risk_score = model.predict(&input)?;

// Apply custom thresholds
let risk_level = if risk_score > 0.8 {
    "high"
} else if risk_score > 0.5 {
    "moderate"
} else {
    "low"
};
📈 Performance & Benchmarks
Computational Requirements
| Configuration | RAM Usage | Training Time* | Inference Time** |
|---|---|---|---|
| Tabular only | ~50 MB | 2-5 min | < 10 ms |
| Tabular + Temporal | ~75 MB | 5-10 min | < 20 ms |
| Tabular + Text | ~100 MB | 10-15 min | < 30 ms |
| Tabular + Image | ~150 MB | 15-25 min | < 50 ms |
| All modalities | ~200 MB | 30-45 min | < 100 ms |
\* Training time for 10,000 samples, 50 epochs on CPU (Intel i7)
\*\* Single prediction on CPU
Optimization Tips
For Faster Training:
- Use release mode: `cargo build --release`
- Increase batch size on machines with more RAM
- Disable unused modalities
- Use smaller network architectures for prototyping
For Lower Memory:
- Reduce batch size
- Use smaller hidden layer sizes
- Limit vocabulary size for text
- Reduce image dimensions
For Faster Inference:
- Compile with optimizations: `--release`
- Use minimal modalities needed
- Batch predictions when possible
- Consider model quantization (future feature)
Scalability
- Small datasets (< 1,000 samples): Runs on laptops, embedded systems
- Medium datasets (1,000-50,000 samples): Standard workstations
- Large datasets (> 50,000 samples): Server-grade hardware recommended
📚 Examples
Example 1: Quick Prototype with Tabular Data
# Step 1: Create minimal config (see the Configuration Reference)
# Step 2: Train
./target/release/ahuvista-train --config config.json --epochs 20 --verbose
# Step 3: Test prediction (input flag is illustrative)
./target/release/ahuvista-predict --config config.json --input patient_input.json
Example 2: Production Multi-Modal System
# Step 1: Create production configuration (see the Configuration Reference)
# Step 2: Train with production settings
./target/release/ahuvista-train --config production_config.json --epochs 100 --batch-size 64 --output-dir ./models/production --verbose
# Step 3: Deploy for inference with calibration
./target/release/ahuvista-predict --config production_config.json --input patient_input.json --calibrate 0.001
Example 3: Batch Processing for Research
# Process 1000 patients
mkdir -p research_inputs
# Generate input files (your data preparation script)
# ...
# Run batch prediction (a batch-directory flag is assumed here)
./target/release/ahuvista-predict --batch research_inputs/ --output batch_results.json
Example 4: Experimentation with Different Architectures
# Test different modality combinations
for; do
if [; then
modalities="tabular,temporal,text,image"
else
modalities=""
fi
done
🔧 Troubleshooting
Common Issues and Solutions
Issue: "At least one modality must be enabled"
Cause: No modalities selected in configuration
Solution:
# Ensure at least one modality is enabled, e.g. in config.json:
#   "modalities": { "use_tabular": true }
Issue: "Tabular data required but not provided"
Cause: Model expects tabular data but input doesn't include it
Solution: Ensure the input JSON includes data for every modality the model was trained with (see Data Format Specifications).
Issue: Model training is too slow
Solutions:
- Reduce batch size: `--batch-size 16`
- Use fewer epochs: `--epochs 20`
- Disable unnecessary modalities
- Ensure using release mode: `--release`
Issue: Out of memory during training
Solutions:
- Reduce batch size: `--batch-size 8`
- Use smaller network architectures
- Process fewer modalities simultaneously
- Reduce image dimensions in config
Issue: Predictions seem uncalibrated
Solution: Use Bayesian calibration:

./target/release/ahuvista-predict --input patient_input.json --calibrate 0.001
Issue: "Failed to load config file"
Causes & Solutions:
- Invalid JSON syntax → Validate JSON with `jsonlint`
- File not found → Check that the path is correct
- Wrong permissions → Ensure read permissions on the file
Getting Help
- Check Logs: Use the `--verbose` flag for detailed output
- Review Examples: See the `examples/` directory
- Read Documentation: Check inline code documentation
- Open an Issue: GitHub Issues
🤝 Contributing
Contributions are welcome! We appreciate:
- 🐛 Bug reports and fixes
- 📚 Documentation improvements
- ✨ New features and enhancements
- 🧪 Additional tests and benchmarks
Development Setup
# Clone repository (substitute the actual repository URL)
git clone https://github.com/dandychux/ahuvista-nn.git
cd ahuvista-nn

# Create feature branch
git checkout -b feature/my-feature

# Make changes and test
cargo test

# Submit pull request
Code Style
- Follow Rust standard formatting (`cargo fmt`)
- Pass all clippy lints (`cargo clippy`)
- Add tests for new features
- Update documentation as needed
📄 License
This project is dual-licensed under:
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
- Apache License 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
You may choose either license for your use.
📧 Contact & Support
- Author: Chukwuma Okoroji
- Email: dandychux@gmail.com
- GitHub: @dandychux
- Issues: GitHub Issues
🙏 Acknowledgments
This project was developed to improve maternal health outcomes through accessible, efficient AI systems that can run in resource-constrained environments.
Special thanks to the open-source Rust community and healthcare professionals who provided domain expertise.
⚠️ Important Notices
Clinical Use Disclaimer
This is a research and development tool. It is NOT approved for clinical use.
- Always consult qualified healthcare professionals for medical decisions
- Do not use as the sole basis for diagnosis or treatment
- Predictions should be validated in clinical context
- Follow all applicable regulations and guidelines
Data Privacy
When using this system with patient data:
- Comply with HIPAA, GDPR, and other relevant regulations
- Implement appropriate data security measures
- De-identify data when possible
- Document all data handling procedures
- Obtain necessary approvals and consents
Research Use
If using for research:
- Cite this project appropriately
- Follow ethical research guidelines
- Obtain IRB approval when required
- Report limitations and biases honestly
🚀 Quick Reference Card
Training
# Basic
ahuvista-train --modalities tabular
# With config
ahuvista-train --config config.json
# Full options
ahuvista-train --config config.json --epochs 100 --batch-size 64 --learning-rate 0.001 --output-dir ./models --verbose
Prediction
# Single (input flag is illustrative)
ahuvista-predict --input patient.json
# Batch (a batch-directory flag is assumed)
ahuvista-predict --batch inputs/
# Calibrated
ahuvista-predict --input patient.json --calibrate 0.001
Modalities
- `tabular` - Structured data
- `temporal` - Time-series
- `text` - Clinical notes
- `image` - Medical images
Combine with commas: --modalities tabular,temporal,text
Version: 2.0.0 | Last Updated: 2025 | Status: Active Development