Struct OnnxModel

pub struct OnnxModel { /* private fields */ }

An ONNX model loaded from a protobuf file.

Provides methods for inspecting the model, extracting weights, saving quantized models, and validating graph connectivity.

Implementations§

impl OnnxModel

pub fn load(path: impl AsRef<Path>) -> Result<Self>

Load an ONNX model from a file path.

§Errors

Returns QuantizeError::ModelLoad if the file cannot be opened, is too large (>10 GB), or contains invalid protobuf data.
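
The size condition above can be pictured as a guard that runs before any bytes are read. The following is a minimal sketch using only the standard library, not the crate's actual implementation: `read_with_size_limit` and `MAX_MODEL_BYTES` are hypothetical names, the error type is plain `io::Error` rather than `QuantizeError::ModelLoad`, and interpreting "10 GB" as 10 × 1024³ bytes is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical limit mirroring the ">10 GB" rule (assumed to mean 10 * 1024^3 bytes).
const MAX_MODEL_BYTES: u64 = 10 * 1024 * 1024 * 1024;

/// Reads a file only if its size on disk does not exceed `limit` bytes.
/// The size is checked from metadata first, so an oversized file is never loaded.
fn read_with_size_limit(path: impl AsRef<Path>, limit: u64) -> io::Result<Vec<u8>> {
    let path = path.as_ref();
    // Fails here if the file cannot be opened or does not exist.
    let len = fs::metadata(path)?.len();
    if len > limit {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("{} is {} bytes, limit is {}", path.display(), len, limit),
        ));
    }
    fs::read(path)
}
```

Protobuf decoding of the returned bytes would be the third failure point and is left out of the sketch.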

pub fn info(&self) -> ModelInfo

Return a summary of the model’s structure.

pub fn input_shapes(&self) -> Vec<Vec<i64>>

Return the shapes of each graph input from the protobuf type info.

Each inner Vec<i64> contains the dimension values. Dynamic dims (symbolic or missing) are returned as -1. Returns one entry per graph.input that has tensor type information.
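
To illustrate the -1 convention, here is a small sketch of how a shape with mixed static and dynamic dimensions could be flattened. The `Dim` enum is a hypothetical stand-in for the ONNX dimension message (which carries either a numeric value or a symbolic name); it is not a type exported by this crate.

```rust
/// Hypothetical stand-in for one ONNX tensor dimension.
#[derive(Debug, Clone, PartialEq)]
enum Dim {
    /// A concrete size, e.g. 224.
    Value(i64),
    /// A symbolic name, e.g. "batch".
    Param(String),
    /// Neither field was set in the protobuf.
    Unset,
}

/// Maps one dimension to an i64, using -1 for anything dynamic.
fn dim_to_i64(dim: &Dim) -> i64 {
    match dim {
        Dim::Value(v) => *v,
        Dim::Param(_) | Dim::Unset => -1,
    }
}

/// Flattens a whole shape into the Vec<i64> form described above.
fn shape_to_vec(dims: &[Dim]) -> Vec<i64> {
    dims.iter().map(dim_to_i64).collect()
}
```

Under this sketch, an input declared as `[batch, 3, 224, 224]` would come back as `[-1, 3, 224, 224]`.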

pub fn extract_weights(&self) -> Vec<WeightTensor>

Extract all FP32 weight tensors from the model’s initializers.

pub fn total_size_bytes(&self) -> usize

Total size of all weight tensors in bytes (float32).

If the weights have already been extracted, prefer computing the total from them: weights.iter().map(|w| w.size_bytes()).sum::<usize>() avoids reparsing.
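
The arithmetic behind the total is element count times four bytes, summed over tensors. A minimal sketch follows, assuming a hypothetical `Weight` struct that stands in for `WeightTensor` and carries only a shape; the real type's fields are not shown on this page.

```rust
/// Hypothetical stand-in for WeightTensor, holding only the shape.
struct Weight {
    shape: Vec<usize>,
}

impl Weight {
    /// Number of elements: the product of all dimensions (1 for a scalar).
    fn numel(&self) -> usize {
        self.shape.iter().product()
    }

    /// Size in bytes when every element is stored as float32.
    fn size_bytes(&self) -> usize {
        self.numel() * std::mem::size_of::<f32>()
    }
}

/// Sums the float32 size of already-extracted weights.
fn total_size_bytes(weights: &[Weight]) -> usize {
    weights.iter().map(|w| w.size_bytes()).sum()
}
```

For example, a `[64, 3, 7, 7]` convolution kernel has 9,408 elements and occupies 37,632 bytes as float32.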

impl OnnxModel

pub fn save_quantized( &mut self, quantized_data: &[QdqWeightInput], path: impl AsRef<Path>, ) -> Result<()>

Save a quantized model using the QDQ (DequantizeLinear) pattern.

The signature is identical to v0.2.0, so existing callers (CLI, calibration pipeline, examples) compile without changes.

§What changed internally

v0.2.0 appended metadata to initializer names (e.g. conv1.weight → conv1.weight__qINT8_s0.001_z-3_len9408) without updating the nodes that reference them. ONNX Runtime rejected these models on load.

v0.3.0 inserts a DequantizeLinear node per weight. The node’s output carries the original name, so every downstream node is unchanged. Graph connectivity is preserved by construction, and the resulting model loads and runs in ONNX Runtime.
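
The rewrite described above can be sketched on a toy graph. This is an illustration under stated assumptions, not the crate's code: `Node`, `Graph`, and `insert_dequantize` are hypothetical, tensors are represented by name only, and the `_quantized` / `_scale` / `_zp` suffixes follow the naming that `load_quantized_info` is documented to look for.

```rust
/// Hypothetical, name-only view of a graph node.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    op_type: String,
    inputs: Vec<String>,
    output: String,
}

/// Hypothetical, name-only view of a graph.
#[derive(Debug)]
struct Graph {
    initializers: Vec<String>,
    nodes: Vec<Node>,
}

/// Replaces the FP32 initializer `name` with a quantized triple and a
/// DequantizeLinear node whose output reuses the original name.
fn insert_dequantize(graph: &mut Graph, name: &str) {
    // The FP32 tensor is no longer stored directly.
    graph.initializers.retain(|n| n != name);

    let q = format!("{name}_quantized");
    let s = format!("{name}_scale");
    let z = format!("{name}_zp");
    graph.initializers.extend([q.clone(), s.clone(), z.clone()]);

    // Inserted at the front so it precedes every consumer in node order.
    graph.nodes.insert(
        0,
        Node {
            op_type: "DequantizeLinear".to_string(),
            inputs: vec![q, s, z],
            output: name.to_string(),
        },
    );
}
```

Because the new node produces a tensor called `conv1.weight`, a `Conv` node that already lists `conv1.weight` as an input needs no edit, which is the property the v0.2.0 rename lacked.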

§INT4 storage note

Before opset 21, DequantizeLinear does not accept INT4 tensors, so INT4-quantized values (range [-8, 7]) are stored one per INT8 byte. Quantization accuracy is still INT4-level; only the on-disk compression relative to FP32 is 4× instead of the 8× that bit-packing would give. True INT4 packing is a v0.4.0 target.
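
The 4× versus 8× difference comes down to how many quantized values share a byte. The sketch below is illustrative only: `quantize_int4` and `pack_nibbles` are hypothetical helpers, the rounding rule is an assumption, and the packing routine (low nibble first) shows what bit-packing could look like rather than what v0.4.0 will ship.

```rust
/// Quantizes to the INT4 range [-8, 7] but keeps one value per i8,
/// which is the storage layout described above.
fn quantize_int4(values: &[f32], scale: f32, zero_point: i8) -> Vec<i8> {
    values
        .iter()
        .map(|&v| {
            let q = (v / scale).round() as i32 + zero_point as i32;
            q.clamp(-8, 7) as i8
        })
        .collect()
}

/// Packs two 4-bit values per byte, first value in the low nibble.
/// An odd trailing value is paired with a zero nibble.
fn pack_nibbles(q: &[i8]) -> Vec<u8> {
    q.chunks(2)
        .map(|pair| {
            let lo = (pair[0] as u8) & 0x0F;
            let hi = (pair.get(1).copied().unwrap_or(0) as u8) & 0x0F;
            lo | (hi << 4)
        })
        .collect()
}
```

For a 9,408-element tensor, FP32 takes 37,632 bytes, one value per INT8 byte takes 9,408 bytes (4× smaller), and nibble packing would take 4,704 bytes (8× smaller).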

impl OnnxModel

pub fn validate_connectivity(&self) -> ConnectivityReport

Check that every node input in the graph resolves to a known tensor.

A “known tensor” is one of:

  • a declared graph input
  • an initializer
  • the output of a node appearing earlier in the node list

This is the exact check ONNX Runtime performs on load. It’s the check that v0.2.0’s validate command skipped, which is why the rename bug went undetected. Integrate report.summary() into the CLI validate output alongside the existing structure / weight checks.
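
The three "known tensor" rules translate into a single pass over the node list with a growing set of names. The following is a minimal sketch, not the crate's implementation: `Node` and `find_dangling_inputs` are hypothetical, and the result is a plain list of (node name, missing input) pairs rather than a `ConnectivityReport`. Skipping empty input names is an added assumption, based on ONNX using an empty string for an omitted optional input.

```rust
use std::collections::HashSet;

/// Hypothetical, name-only view of a graph node.
struct Node {
    name: String,
    inputs: Vec<String>,
    outputs: Vec<String>,
}

/// Returns every (node, input) pair whose input is not a graph input,
/// an initializer, or the output of an earlier node.
fn find_dangling_inputs(
    graph_inputs: &[&str],
    initializers: &[&str],
    nodes: &[Node],
) -> Vec<(String, String)> {
    let mut known: HashSet<&str> = HashSet::new();
    known.extend(graph_inputs.iter().copied());
    known.extend(initializers.iter().copied());

    let mut dangling = Vec::new();
    for node in nodes {
        for input in &node.inputs {
            // Assumption: an empty name marks an omitted optional input.
            if !input.is_empty() && !known.contains(input.as_str()) {
                dangling.push((node.name.clone(), input.clone()));
            }
        }
        // Outputs become visible only to nodes that come later in the list.
        known.extend(node.outputs.iter().map(String::as_str));
    }
    dangling
}
```

Run against a graph where the initializer was renamed to `conv1.weight__qINT8_s0.001_z-3_len9408` but the `Conv` node still asks for `conv1.weight`, this reports the `Conv` node's input as unresolved, which is the v0.2.0 failure mode.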

impl OnnxModel

pub fn load_quantized_info(&self) -> Vec<QuantizedWeightInfo>

Extract metadata about quantized weights from a QDQ-format model.

Looks for initializer triples named {base}_quantized, {base}_scale, and {base}_zp.

Scale and zero-point values are read directly from the tensors. Bit-width comes from metadata_props (written by save_quantized); defaults to 8 if the metadata entry is missing.
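
The triple lookup and the bit-width fallback can be sketched over plain name lists. This is a hedged illustration: `find_quantized_bases` and `bits_for` are hypothetical helpers, and the `{base}.bits` metadata key is invented for the example because the key format written by save_quantized is not documented on this page.

```rust
use std::collections::{BTreeSet, HashMap};

/// Returns every base name for which all three initializers exist:
/// {base}_quantized, {base}_scale, and {base}_zp.
fn find_quantized_bases(initializer_names: &[&str]) -> Vec<String> {
    let names: BTreeSet<&str> = initializer_names.iter().copied().collect();
    names
        .iter()
        .copied()
        .filter_map(|n| n.strip_suffix("_quantized"))
        .filter(|base| {
            names.contains(format!("{base}_scale").as_str())
                && names.contains(format!("{base}_zp").as_str())
        })
        .map(str::to_string)
        .collect()
}

/// Looks up the bit-width for `base`, falling back to 8 when the entry
/// is missing or unparsable. The "{base}.bits" key is a made-up format.
fn bits_for(base: &str, metadata: &HashMap<String, String>) -> u8 {
    metadata
        .get(&format!("{base}.bits"))
        .and_then(|v| v.parse::<u8>().ok())
        .unwrap_or(8)
}
```

An initializer such as `orphan_quantized` with no matching scale or zero-point is ignored by this sketch, since it does not form a complete triple.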

Trait Implementations§

impl Debug for OnnxModel

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> Downcast for T
where T: Any,

fn into_any(self: Box<T>) -> Box<dyn Any>

Convert Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>. Box<dyn Any> can then be further downcast into Box<ConcreteType> where ConcreteType implements Trait.

fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>

Convert Rc<Trait> (where Trait: Downcast) to Rc<Any>. Rc<Any> can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.

fn as_any(&self) -> &(dyn Any + 'static)

Convert &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.

fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)

Convert &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.

impl<T> DowncastSync for T
where T: Any + Send + Sync,

fn into_any_arc(self: Arc<T>) -> Arc<dyn Any + Send + Sync>

Convert Arc<Trait> (where Trait: Downcast) to Arc<Any>. Arc<Any> can then be further downcast into Arc<ConcreteType> where ConcreteType implements Trait.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V