docs.rs failed to build quantize-rs-0.1.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build:
quantize-rs-0.6.0
quantize-rs
Simple, fast neural network quantization in pure Rust
quantize-rs makes your neural networks smaller and faster by converting float32 weights to INT8, achieving up to 4x compression with minimal accuracy loss.
Features
- 4x smaller models - Reduce model size by 75%
- Fast - Pure Rust, no Python required
- Simple - One command does everything
- ONNX format - Works with models exported from PyTorch, TensorFlow, etc.
- CLI & Library - Use as a tool or integrate into your code
- Zero dependencies (runtime) - Single binary
Quick Start
Installation
Or build from source:
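The installation commands appear to have been stripped along with their code fences; a plausible reconstruction, assuming the crate is published on crates.io under the name `quantize-rs`:

```shell
# Install the released binary from crates.io (crate name assumed)
cargo install quantize-rs

# Or build from source inside a checkout of the repository
cargo build --release
```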
Usage
# Quantize a model
# Compare original vs quantized
# Show model info
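The comment lines above likely wrapped concrete commands; a plausible set, using the subcommands documented below (file names and argument order are assumptions, not documented options):

```shell
# Quantize a model
quantize-rs quantize model.onnx

# Compare original vs quantized
quantize-rs benchmark model.onnx model_quantized.onnx

# Show model info
quantize-rs info model.onnx
```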
Results
Real-World Example: ResNet-18
Result: 75% size reduction with less than 0.001% quantization error!
Documentation
Commands
quantize-rs quantize
Quantize a neural network model to INT8.
Example:
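A plausible invocation (the input path is an assumption, not a documented argument):

```shell
quantize-rs quantize resnet18.onnx
```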
quantize-rs benchmark
Compare original and quantized models.
Example:
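A plausible invocation matching the output below (file names assumed):

```shell
quantize-rs benchmark resnet18.onnx resnet18_quantized.onnx
```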
Output:
Model Structure:
Nodes: 69 vs 69 ✓
Tensors: 102 vs 102 ✓
File Size:
Original: 44.65 MB
Quantized: 11.18 MB
Reduction: 75.0%
Summary:
✓ Structure preserved
✓ Excellent compression
✓ All weights quantized
quantize-rs info
Display model information.
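A plausible invocation (file name assumed):

```shell
quantize-rs info model.onnx
```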
Library Usage
Use quantize-rs in your Rust code:
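The code snippet here lost its body in extraction, and the crate's public API is not visible on this page. As a stand-in, below is a self-contained sketch of the int8 affine quantization the README describes; all names (`QuantParams`, `calc_params`, `quantize`) are illustrative, not the crate's actual API:

```rust
/// Quantization parameters mapping float32 values onto the int8 range.
/// Illustrative sketch, not the crate's real types.
pub struct QuantParams {
    pub scale: f32,
    pub zero_point: i32,
}

/// Compute scale/zero-point from the observed min/max of the weights,
/// always including 0.0 in the range so zero stays representable.
pub fn calc_params(weights: &[f32]) -> QuantParams {
    let min = weights.iter().cloned().fold(f32::INFINITY, f32::min).min(0.0);
    let max = weights.iter().cloned().fold(f32::NEG_INFINITY, f32::max).max(0.0);
    let raw = (max - min) / 255.0;
    let scale = if raw > 0.0 { raw } else { 1.0 }; // guard all-zero weights
    let zero_point = (-128.0 - min / scale).round() as i32;
    QuantParams { scale, zero_point }
}

/// q = round(f / scale) + zero_point, clamped to the int8 range.
pub fn quantize(weights: &[f32], p: &QuantParams) -> Vec<i8> {
    weights
        .iter()
        .map(|&f| ((f / p.scale).round() as i32 + p.zero_point).clamp(-128, 127) as i8)
        .collect()
}

fn main() {
    let w = [-1.0f32, -0.5, 0.0, 0.25, 1.0];
    let p = calc_params(&w);
    let q = quantize(&w, &p);
    println!("scale = {}, zero_point = {}, q = {:?}", p.scale, p.zero_point, q);
}
```

The same formula applied per-tensor is what the "How It Works" section below describes.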
How It Works
Quantization Process
- Load ONNX model - Parse protobuf format
- Extract weights - Find all trainable parameters (from `graph.initializer`)
- Calculate scale/zero-point - Determine quantization parameters
- Quantize - Convert float32 → int8 using `q = round(f / scale) + zero_point`
- Save - Write quantized model back to ONNX format
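The formula in the quantize step can be sanity-checked with a quick round trip. This is a sketch with a made-up weight range; the dequantization `f ≈ (q - zero_point) * scale` is the textbook inverse, not something stated in this README:

```rust
// Round-trip the affine formula q = round(f / scale) + zero_point
// for an assumed weight range of [-1.0, 1.0].
fn main() {
    let (min, max) = (-1.0f32, 1.0f32);
    let scale = (max - min) / 255.0; // scale from the range
    let zero_point = (-128.0 - min / scale).round() as i32; // zero-point

    let f = 0.3f32;
    let q = ((f / scale).round() as i32 + zero_point).clamp(-128, 127) as i8;
    let restored = (q as i32 - zero_point) as f32 * scale; // textbook inverse

    // The round-trip error is bounded by one quantization step.
    assert!((f - restored).abs() <= scale);
    println!("q = {q}, restored = {restored}");
}
```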
Why 4x Compression?
- Float32: 4 bytes per number
- INT8: 1 byte per number
- Result: 4 bytes / 1 byte = 4x smaller
Quality Preservation
Quantization introduces minimal error:
- Average MSE: < 0.00001
- Typical accuracy loss: < 1% on most models
- Production-ready technique: the same INT8 scheme is used by TensorFlow Lite and PyTorch Mobile
Testing
Tested on real-world models:
| Model | Original | Quantized | Compression | Accuracy Loss |
|---|---|---|---|---|
| ResNet-18 | 44.65 MB | 11.18 MB | 4.00x | ~0.5% |
| MNIST CNN | 0.03 MB | 0.01 MB | 4.00x | ~0.1% |
| MobileNet | 13.4 MB | 3.4 MB | 3.94x | ~1.0% |
Development
Build from source
Run tests
Run examples
Examples
See the examples/ directory for:
- Basic quantization
- Batch processing multiple models
- Custom quantization configurations
- Integration with ML pipelines
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new features
- Submit a pull request
License
MIT License - see LICENSE for details.
Acknowledgments
- Built with tract for ONNX parsing
- Inspired by TensorFlow Lite quantization
- Thanks to the Rust ML community
Contact
- Issues: GitHub Issues
- Discussions: GitHub Discussions