pub struct Model { /* private fields */ }
The central type used to execute RTen machine learning models.
Models are loaded from .rten format model files using Model::load and
executed using Model::run or one of the other run_* methods. They
take a list of tensor views as inputs, perform a series of computations and
return one or more output tensors. .rten models use
FlatBuffers and are conceptually
similar to the .ort format used by ONNX Runtime and .tflite used by
TensorFlow Lite.
RTen models are logically graphs consisting of three types of nodes:
- Values which are supplied or generated at runtime
- Constants which are the weights, biases and other parameters of the model. Their values are determined when the model is trained.
- Operators which combine the values and constants using operations such as matrix multiplication, convolution etc.
Some of these nodes are designated as inputs and outputs. The IDs of these nodes can be obtained using Model::input_ids and Model::output_ids. These IDs are then used when calling Model::run. Model execution consists of generating a plan which starts with the input nodes, and executes the necessary operators to generate the requested outputs.
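The graph structure described above can be sketched with a toy evaluator. This is an illustration only, not RTen's implementation: `Node`, `Op` and `eval` are hypothetical names, each operator node stands in for its own output value, values are plain `f32` scalars rather than tensors, and evaluation is a simple recursion rather than a precomputed plan.

```rust
use std::collections::HashMap;

/// Hypothetical operator kinds for this sketch.
#[derive(Clone, Copy)]
enum Op {
    Add,
    Mul,
}

/// Hypothetical node kinds mirroring the three categories above.
enum Node {
    /// Supplied by the caller at runtime.
    Value,
    /// Fixed when the model was trained.
    Constant(f32),
    /// Combines two other nodes, identified by index into the node list.
    Operator { op: Op, inputs: [usize; 2] },
}

/// Compute the node at index `output`, visiting only the nodes it depends on.
/// Returns `None` if a required runtime value was not supplied.
fn eval(nodes: &[Node], inputs: &HashMap<usize, f32>, output: usize) -> Option<f32> {
    if let Some(v) = inputs.get(&output) {
        return Some(*v);
    }
    match &nodes[output] {
        Node::Value => None,
        Node::Constant(c) => Some(*c),
        Node::Operator { op, inputs: [a, b] } => {
            let x = eval(nodes, inputs, *a)?;
            let y = eval(nodes, inputs, *b)?;
            Some(match op {
                Op::Add => x + y,
                Op::Mul => x * y,
            })
        }
    }
}
```

Requesting an output whose dependencies include an unsupplied `Value` node yields `None` in this sketch, which loosely mirrors a run failing when a required input is missing.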
§Partial evaluation
Some models, such as transformer decoders, are evaluated repeatedly in a loop. If such models have inputs which are constant in each iteration of the loop, execution can be sped up by using partial evaluation. This involves evaluating the part of the graph that depends only on the constant inputs once, outside the loop. To do this use Model::partial_run.
§Custom operator registries
By default all supported ONNX operators are available for use by the model. You can reduce binary size and compilation time by loading a model with only a subset of operators enabled. See Model::load_with_ops.
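The idea of a restricted operator set can be sketched as follows. This is a toy model under stated assumptions, not the `OpRegistry` API: `ToyRegistry`, `OpFn`, `register` and `check` are hypothetical names, and operators are reduced to binary functions on scalars.

```rust
use std::collections::HashMap;

/// Hypothetical operator signature for this sketch.
type OpFn = fn(f32, f32) -> f32;

/// Toy registry mapping operator names to implementations.
struct ToyRegistry {
    ops: HashMap<&'static str, OpFn>,
}

impl ToyRegistry {
    /// Create a registry with no operators enabled.
    fn new() -> Self {
        ToyRegistry { ops: HashMap::new() }
    }

    /// Enable a single operator.
    fn register(&mut self, name: &'static str, f: OpFn) {
        self.ops.insert(name, f);
    }

    /// Check that every operator a model needs is registered.
    /// On failure, returns the name of the first missing operator.
    fn check<'a>(&self, required: &[&'a str]) -> Result<(), &'a str> {
        match required.iter().find(|name| !self.ops.contains_key(**name)) {
            Some(missing) => Err(*missing),
            None => Ok(()),
        }
    }
}
```

In this sketch, a model that needs an unregistered operator is rejected up front, which is the trade-off of enabling only a subset: less code is compiled in, but only models that stay within that subset can be loaded.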
Implementations§
impl Model
pub fn load(data: &[u8]) -> Result<Model, ModelLoadError>
Load a serialized model.
The model will have all of the built-in operators available to it (see OpRegistry::with_all_ops).
pub fn load_with_ops( data: &[u8], registry: &OpRegistry ) -> Result<Model, ModelLoadError>
Load a serialized model with a custom operator registry.
pub fn find_node(&self, id: &str) -> Option<NodeId>
Find a node in the model’s graph given its string name.
pub fn node_id(&self, id: &str) -> Result<NodeId, RunError>
Find a node in the model’s graph given its string name.
This is a convenience method which is like Model::find_node but returns an error that includes the node’s name if the node is not found.
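The relationship between the two lookups can be sketched as an `Option` converted into a `Result` whose error carries the name. The types here are hypothetical stand-ins: `UnknownNode`, `find` and `lookup` are illustrative names, and node IDs are modeled as plain indices rather than `NodeId`.

```rust
use std::fmt;

/// Hypothetical error type carrying the name that failed to resolve.
#[derive(Debug, PartialEq)]
struct UnknownNode(String);

impl fmt::Display for UnknownNode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "node \"{}\" not found", self.0)
    }
}

/// Option-returning lookup, analogous in shape to `find_node`.
fn find(names: &[&str], name: &str) -> Option<usize> {
    names.iter().position(|n| *n == name)
}

/// Result-returning wrapper, analogous in shape to `node_id`.
/// The error includes the requested name, so it can be propagated with `?`
/// and still produce a useful message.
fn lookup(names: &[&str], name: &str) -> Result<usize, UnknownNode> {
    find(names, name).ok_or_else(|| UnknownNode(name.to_string()))
}
```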
pub fn node_info(&self, id: NodeId) -> Option<NodeInfo<'_>>
Return metadata about a node in the model’s graph.
pub fn metadata(&self) -> &ModelMetadata
Return metadata about the model.
pub fn output_ids(&self) -> &[NodeId]
Return the IDs of output nodes.
pub fn total_params(&self) -> usize
Return the total number of parameters in the model’s weights.
pub fn input_shape(&self, index: usize) -> Option<Vec<Dimension>>
Convenience method that returns the expected input shape for the index’th input.
The shape may contain a mix of fixed and symbolic dimensions.
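One way a caller might use such a shape is to validate a concrete tensor shape before running the model. The sketch below assumes a local `Dim` enum as a stand-in for the crate's `Dimension` type, and `shape_matches` is a hypothetical helper. For simplicity it lets a symbolic dimension match any size and does not check that a symbol repeated in the shape takes the same value each time.

```rust
/// Hypothetical stand-in for a dimension that is either fixed or symbolic.
enum Dim {
    /// A size that is known ahead of time.
    Fixed(usize),
    /// A named size that is only known at runtime, such as a batch size.
    Symbolic(&'static str),
}

/// Return true if the concrete `shape` is compatible with `expected`.
/// Ranks must be equal and every fixed dimension must match exactly.
fn shape_matches(expected: &[Dim], shape: &[usize]) -> bool {
    expected.len() == shape.len()
        && expected.iter().zip(shape).all(|(dim, &size)| match dim {
            Dim::Fixed(n) => *n == size,
            Dim::Symbolic(_) => true,
        })
}
```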
pub fn run( &self, inputs: &[(NodeId, Input<'_>)], outputs: &[NodeId], opts: Option<RunOptions> ) -> Result<Vec<Output>, RunError>
Execute the model and return the outputs specified by outputs.
This method allows for running a model with a variable number of inputs and outputs of different types. See Model::run_one or Model::run_n for the common case of running a model with a single or statically known number of inputs and outputs.
Input and output nodes are specified by ID. IDs can be looked up by name using find_node, or obtained from Model::input_ids and Model::output_ids.
pub fn run_n<const N: usize>( &self, inputs: &[(NodeId, Input<'_>)], outputs: [NodeId; N], opts: Option<RunOptions> ) -> Result<[Output; N], RunError>
Run a model and retrieve N outputs.
This is a simplified version of Model::run for the common case of executing a model with a statically known number of outputs.
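The benefit of a statically known output count can be sketched with a const generic wrapper around a dynamically sized call. This is an illustration of the pattern only: `run_dynamic` and `run_fixed` are hypothetical functions, outputs are modeled as strings, and nothing here is claimed to match how `run_n` is implemented.

```rust
use std::convert::TryInto;

/// Hypothetical dynamic runner: returns one value per requested output ID.
fn run_dynamic(outputs: &[usize]) -> Vec<String> {
    outputs.iter().map(|id| format!("output-{}", id)).collect()
}

/// Fixed-arity wrapper: the number of outputs is part of the type, so the
/// caller can destructure the result without checking its length.
fn run_fixed<const N: usize>(outputs: [usize; N]) -> [String; N] {
    run_dynamic(&outputs)
        .try_into()
        .expect("run_dynamic returns one value per requested output")
}
```

A caller can then write `let [a, b] = run_fixed([3, 7]);` and get two owned values directly, instead of indexing into or popping from a `Vec`.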
pub fn run_one( &self, input: Input<'_>, opts: Option<RunOptions> ) -> Result<Output, RunError>
Run a model with a single input and output.
This is a simplified version of Model::run for the common case of executing a model with a single input and output.
pub fn partial_run( &self, inputs: &[(NodeId, Input<'_>)], outputs: &[NodeId], opts: Option<RunOptions> ) -> Result<Vec<(NodeId, Output)>, RunError>
Run the model using an incomplete set of inputs.
Unlike run this will not fail if some values required to
compute outputs are missing. Instead it will compute as many
intermediate values as possible using the provided inputs and return the
leaf values of the subgraph that was executed. These intermediate
outputs can then be passed to future calls to run when
the other inputs are available.
This method can speed up autoregressive / recurrent models where the
model is run in a loop during inference, but some inputs are constant
across each iteration of the loop. In such cases, execution times can be
reduced by calling partial_run once outside the loop with the constant
inputs, then passing the results together with the remaining inputs to
run calls inside the loop.
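The effect of hoisting the constant part of a computation out of a loop can be sketched without any tensor types. Everything in this example is hypothetical: `encode` stands in for the subgraph that depends only on the constant input, `step` for the part that also needs a per-iteration input, and scalars replace tensors.

```rust
/// Stand-in for the subgraph that depends only on the constant input.
fn encode(prompt: &[f32]) -> f32 {
    prompt.iter().map(|x| x * x).sum()
}

/// Stand-in for the rest of the graph, which also needs a per-iteration input.
fn step(encoded: f32, token: f32) -> f32 {
    encoded + token
}

/// Naive loop: re-evaluates the constant subgraph on every iteration.
fn run_naive(prompt: &[f32], tokens: &[f32]) -> Vec<f32> {
    tokens.iter().map(|&t| step(encode(prompt), t)).collect()
}

/// Partially evaluated loop: the constant subgraph is evaluated once,
/// outside the loop, and its result is reused in every iteration.
fn run_hoisted(prompt: &[f32], tokens: &[f32]) -> Vec<f32> {
    let encoded = encode(prompt); // analogous to a single partial run
    tokens.iter().map(|&t| step(encoded, t)).collect()
}
```

Both functions produce the same outputs. The hoisted version calls `encode` once rather than once per token, which is where the saving comes from when the constant subgraph is expensive.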
Auto Trait Implementations§
impl Freeze for Model
impl !RefUnwindSafe for Model
impl Send for Model
impl Sync for Model
impl Unpin for Model
impl !UnwindSafe for Model
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.