Struct rten::Model

pub struct Model { /* private fields */ }

The central type used to execute RTen machine learning models.

Models are loaded from .rten format model files using Model::load and executed using Model::run or one of the other run_* methods. They take a list of tensor views as inputs, perform a series of computations and return one or more output tensors. .rten models use FlatBuffers and are conceptually similar to the .ort format used by ONNX Runtime and .tflite used by TensorFlow Lite.

RTen models are logically graphs consisting of three types of nodes:

  • Values which are supplied or generated at runtime
  • Constants which are the weights, biases and other parameters of the model. Their values are determined when the model is trained.
  • Operators which combine the values and constants using operations such as matrix multiplication, convolution etc.

Some of these nodes are designated as inputs and outputs. The IDs of these nodes can be obtained using Model::input_ids and Model::output_ids. These IDs are then used when calling Model::run. Model execution consists of generating a plan which starts with the input nodes, and executes the necessary operators to generate the requested outputs.
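The graph structure described above can be sketched with a toy model. This is a minimal illustration only: `Node`, `Op`, `Graph` and `example_graph` are hypothetical stand-ins, not rten's internal types, and a recursive walk back from the requested outputs stands in for the real planner.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration; these are not rten types.
#[derive(Clone, Copy)]
enum Op {
    Add,
    Mul,
}

enum Node {
    /// Supplied at runtime (an input) or produced by an operator.
    Value,
    /// Fixed when the model was trained.
    Constant(f32),
    /// Combines two nodes and writes the result to the `output` value node.
    Operator { op: Op, inputs: [usize; 2], output: usize },
}

struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    /// Compute node `target`, running only the operators it depends on.
    fn eval(&self, target: usize, values: &mut HashMap<usize, f32>) -> Result<f32, String> {
        if let Some(&v) = values.get(&target) {
            return Ok(v);
        }
        if let Node::Constant(c) = &self.nodes[target] {
            return Ok(*c);
        }
        // Find the operator that produces this value, if any.
        let producer = self.nodes.iter().find_map(|n| match n {
            Node::Operator { op, inputs, output } if *output == target => Some((*op, *inputs)),
            _ => None,
        });
        let (op, [a, b]) = producer.ok_or_else(|| format!("missing input for node {target}"))?;
        let x = self.eval(a, values)?;
        let y = self.eval(b, values)?;
        let out = match op {
            Op::Add => x + y,
            Op::Mul => x * y,
        };
        values.insert(target, out);
        Ok(out)
    }

    /// Supply values for input nodes and request values for output nodes.
    fn run(&self, inputs: &[(usize, f32)], outputs: &[usize]) -> Result<Vec<f32>, String> {
        let mut values: HashMap<usize, f32> = inputs.iter().copied().collect();
        outputs.iter().map(|&id| self.eval(id, &mut values)).collect()
    }
}

/// Builds `y = x * 2.0 + 1.0`: node 0 is the input `x`, node 6 the output `y`.
fn example_graph() -> Graph {
    Graph {
        nodes: vec![
            Node::Value,                                               // 0: x (input)
            Node::Constant(2.0),                                       // 1: weight
            Node::Constant(1.0),                                       // 2: bias
            Node::Operator { op: Op::Mul, inputs: [0, 1], output: 4 }, // 3
            Node::Value,                                               // 4: x * weight
            Node::Operator { op: Op::Add, inputs: [4, 2], output: 6 }, // 5
            Node::Value,                                               // 6: y (output)
        ],
    }
}
```

In this sketch, `example_graph().run(&[(0, 3.0)], &[6])` supplies the input by ID and requests the output by ID, mirroring the shape of a Model::run call. Omitting the input yields an error because the value needed by the first operator is never provided.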

§Partial evaluation

Some models, such as transformer decoders, are evaluated repeatedly in a loop. If such models have inputs which are constant in each iteration of the loop, execution can be sped up by using partial evaluation. This involves evaluating the part of the graph that depends only on the constant inputs once, outside the loop. To do this use Model::partial_run.

§Custom operator registries

By default all supported ONNX operators are available for use by the model. You can reduce binary size and compilation time by loading a model with only a subset of operators enabled. See Model::load_with_ops.

Implementations§

impl Model

pub fn load(data: &[u8]) -> Result<Model, ModelLoadError>

Load a serialized model.

The model will have all of the built-in operators available to it (see OpRegistry::with_all_ops).

pub fn load_with_ops( data: &[u8], registry: &OpRegistry ) -> Result<Model, ModelLoadError>

Load a serialized model with a custom operator registry.

pub fn find_node(&self, id: &str) -> Option<NodeId>

Find a node in the model’s graph given its string name.

pub fn node_id(&self, id: &str) -> Result<NodeId, RunError>

Find a node in the model’s graph given its string name.

This is a convenience method which is like Model::find_node but returns an error that includes the node’s name if the node is not found.
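The difference between the two lookups can be sketched as follows. This is an illustration under stated assumptions: `NodeId`, `LookupError` and `Names` here are hypothetical local types standing in for the real ones, and the error message format is invented.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real NodeId and RunError live in rten.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(u32);

#[derive(Debug, PartialEq)]
struct LookupError(String);

/// Toy name table playing the role of the model's graph.
struct Names(HashMap<String, NodeId>);

impl Names {
    /// Returns `None` when the name is unknown; the caller decides what to do.
    fn find_node(&self, name: &str) -> Option<NodeId> {
        self.0.get(name).copied()
    }

    /// Same lookup, but the failure carries the name that was not found,
    /// so it can be propagated with `?`.
    fn node_id(&self, name: &str) -> Result<NodeId, LookupError> {
        self.find_node(name)
            .ok_or_else(|| LookupError(format!("node \"{name}\" not found")))
    }
}
```

The `Option` form suits probing for optional nodes, while the `Result` form suits code that treats a missing node as an error and wants the name in the message.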

pub fn node_info(&self, id: NodeId) -> Option<NodeInfo<'_>>

Return metadata about a node in the model’s graph.

pub fn metadata(&self) -> &ModelMetadata

Return metadata about the model.

pub fn input_ids(&self) -> &[NodeId]

Return the IDs of input nodes.

pub fn output_ids(&self) -> &[NodeId]

Return the IDs of output nodes.

pub fn total_params(&self) -> usize

Return the total number of parameters in the model’s weights.

pub fn input_shape(&self, index: usize) -> Option<Vec<Dimension>>

Convenience method that returns the expected input shape for the index’th input.

The shape may contain a mix of fixed and symbolic dimensions.
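A shape that mixes fixed and symbolic dimensions can be turned into a concrete shape once the symbolic sizes are known. The sketch below assumes a local `Dimension` enum with `Fixed` and `Symbolic` variants modelled on the description above; `resolve_shape` and the dimension name `batch` are hypothetical helpers for illustration, not part of the API documented here.

```rust
// Local stand-in modelled on the "fixed and symbolic" description above.
#[derive(Debug, Clone, PartialEq)]
enum Dimension {
    /// A size known ahead of time, such as the number of channels.
    Fixed(usize),
    /// A named size chosen at runtime, such as the batch size.
    Symbolic(String),
}

/// Replace each symbolic dimension with the size bound to its name.
/// Fails if a symbolic dimension has no binding.
fn resolve_shape(shape: &[Dimension], bindings: &[(&str, usize)]) -> Result<Vec<usize>, String> {
    shape
        .iter()
        .map(|dim| match dim {
            Dimension::Fixed(size) => Ok(*size),
            Dimension::Symbolic(name) => bindings
                .iter()
                .find(|(n, _)| *n == name.as_str())
                .map(|(_, size)| *size)
                .ok_or_else(|| format!("no size given for dimension \"{name}\"")),
        })
        .collect()
}
```

For example, a shape of `[Symbolic("batch"), Fixed(3), Fixed(224), Fixed(224)]` with `batch` bound to 1 resolves to `[1, 3, 224, 224]`, which could then be used to allocate an input tensor.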

pub fn run( &self, inputs: &[(NodeId, Input<'_>)], outputs: &[NodeId], opts: Option<RunOptions> ) -> Result<Vec<Output>, RunError>

Execute the model and return the outputs specified by outputs.

This method allows for running a model with a variable number of inputs and outputs of different types. See Model::run_one or Model::run_n for the common case of running a model with a single or statically known number of inputs and outputs.

The input and output nodes are specified by IDs, which can be looked up by name using Model::find_node or Model::node_id.

pub fn run_n<const N: usize>( &self, inputs: &[(NodeId, Input<'_>)], outputs: [NodeId; N], opts: Option<RunOptions> ) -> Result<[Output; N], RunError>

Run a model and retrieve N outputs.

This is a simplified version of Model::run for the common case of executing a model with a statically known number of outputs.

pub fn run_one( &self, input: Input<'_>, opts: Option<RunOptions> ) -> Result<Output, RunError>

Run a model with a single input and output.

This is a simplified version of Model::run for the common case of executing a model with a single input and output.

pub fn partial_run( &self, inputs: &[(NodeId, Input<'_>)], outputs: &[NodeId], opts: Option<RunOptions> ) -> Result<Vec<(NodeId, Output)>, RunError>

Run the model using an incomplete set of inputs.

Unlike Model::run, this will not fail if some values required to compute outputs are missing. Instead it will compute as many intermediate values as possible using the provided inputs and return the leaf values of the subgraph that was executed. These intermediate outputs can then be passed to future calls to run when the other inputs are available.

This method can speed up autoregressive / recurrent models where the model is run in a loop during inference, but some inputs are constant across each iteration of the loop. In such cases, execution time can be reduced by calling partial_run once outside the loop with the constant inputs, then passing the results together with the remaining inputs to the run calls inside the loop.
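The idea can be sketched with a toy evaluator. Everything here is a hypothetical stand-in rather than rten's implementation: values are plain `f32`s keyed by name, `Op`, `run_available` and `decoder_ops` are invented for illustration, and operators are assumed to be listed in dependency order.

```rust
use std::collections::BTreeMap;

/// Hypothetical toy operator: reads named values, writes one named value.
struct Op {
    inputs: &'static [&'static str],
    output: &'static str,
    apply: fn(&[f32]) -> f32,
}

/// Run every operator whose inputs are all known. Returns the known values
/// and the indices of operators that had to be skipped.
fn run_available(
    ops: &[Op],
    supplied: &[(&'static str, f32)],
) -> (BTreeMap<&'static str, f32>, Vec<usize>) {
    let mut known: BTreeMap<&'static str, f32> = supplied.iter().copied().collect();
    let mut skipped = Vec::new();
    for (i, op) in ops.iter().enumerate() {
        let args: Option<Vec<f32>> = op.inputs.iter().map(|name| known.get(name).copied()).collect();
        match args {
            Some(args) => {
                known.insert(op.output, (op.apply)(&args));
            }
            None => skipped.push(i),
        }
    }
    (known, skipped)
}

/// Partial run: return the computed values that skipped operators still need.
fn partial_run(ops: &[Op], supplied: &[(&'static str, f32)]) -> Vec<(&'static str, f32)> {
    let (known, skipped) = run_available(ops, supplied);
    let is_supplied = |name: &str| supplied.iter().any(|(n, _)| *n == name);
    let mut leaves = Vec::new();
    for &i in &skipped {
        for &name in ops[i].inputs {
            if let Some(&v) = known.get(name) {
                if !is_supplied(name) && !leaves.contains(&(name, v)) {
                    leaves.push((name, v));
                }
            }
        }
    }
    leaves
}

/// Full run: fails if the requested output could not be computed.
fn run(ops: &[Op], supplied: &[(&'static str, f32)], output: &str) -> Result<f32, String> {
    let (known, _) = run_available(ops, supplied);
    known.get(output).copied().ok_or_else(|| format!("missing inputs for {output}"))
}

fn encode(x: &[f32]) -> f32 {
    x[0] * 10.0
}

fn combine(x: &[f32]) -> f32 {
    x[0] + x[1]
}

/// `encoded` depends only on `prompt`; `logit` also needs the per-step `token`.
fn decoder_ops() -> Vec<Op> {
    vec![
        Op { inputs: &["prompt"], output: "encoded", apply: encode },
        Op { inputs: &["encoded", "token"], output: "logit", apply: combine },
    ]
}
```

In this sketch, calling `partial_run` with only `prompt` returns the cached `encoded` value. Each loop iteration then calls `run` with that cached value plus the current `token`, so the `encode` step is not repeated.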

Auto Trait Implementations§

impl Freeze for Model

impl !RefUnwindSafe for Model

impl Send for Model

impl Sync for Model

impl Unpin for Model

impl !UnwindSafe for Model

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize = _

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.