Struct RknnModel 

pub struct RknnModel { /* private fields */ }

A loaded RKNN model ready for inference.

This is the main type you interact with. It holds the model and pre-allocated zero-copy memory buffers for the input and outputs, and handles all communication with the NPU.

§Lifecycle

  1. Load a model with load or load_with_lib. This initializes the NPU context and allocates memory buffers.
  2. Inspect tensor metadata via input_attr and output_attrs to learn expected shapes and formats.
  3. Run inference with run - pass raw RGB bytes (NHWC, u8, no normalization).
  4. Read results with output_raw (zero-copy &[i8]) or output_f32 (dequantized Vec<f32>).

§Example

use rknn_runtime::RknnModel;

let model = RknnModel::load("model.rknn")?;

// Check what the model expects
let input = model.input_attr();
// e.g. [1, 320, 320, 3]
println!("Input shape: {:?}", input.shape);

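// Run inference on raw RGB bytes (placeholder: an all-black frame,
// assuming the [1, 320, 320, 3] shape above)
let rgb_bytes = vec![0u8; 320 * 320 * 3];
model.run(&rgb_bytes)?;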

// Get raw INT8 output (zero-copy - no allocation, just a slice into NPU memory)
let raw = model.output_raw(0)?;

// Or get dequantized f32 output (allocates a new Vec)
let floats = model.output_f32(0)?;

§Drop order

Internally, memory buffers are dropped before the RKNN context. This is handled automatically - you don’t need to worry about it.
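
As a rough illustration of how such an ordering can be obtained in Rust: struct fields are dropped in declaration order, so declaring the buffers before the context is enough. The sketch below uses hypothetical stand-in types (`Buffers`, `Context`, `Model`) and a shared log to record the order; it is not the crate's actual implementation, which may instead use a manual `Drop`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log that records the order in which values are dropped.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical stand-ins for the memory buffers and the RKNN context.
struct Buffers(Log);
struct Context(Log);

impl Drop for Buffers {
    fn drop(&mut self) {
        self.0.borrow_mut().push("buffers");
    }
}

impl Drop for Context {
    fn drop(&mut self) {
        self.0.borrow_mut().push("context");
    }
}

// Fields are dropped in declaration order: buffers first, then context.
struct Model {
    _buffers: Buffers,
    _context: Context,
}

// Builds a Model, drops it, and returns the recorded drop order.
fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let model = Model {
        _buffers: Buffers(log.clone()),
        _context: Context(log.clone()),
    };
    drop(model);
    let order = log.borrow().clone();
    order
}
```

Swapping the two field declarations in `Model` would reverse the recorded order, which is why the declaration order matters when buffers must be released before the context that owns them.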

Implementations§

impl RknnModel

pub fn load(model_path: &str) -> Result<Self, Error>

Load a .rknn model from a file.

Uses the default library path (/usr/lib/librknnmrt.so). If your librknnmrt.so is elsewhere, use load_with_lib.

§Errors

pub fn load_with_lib(model_path: &str, lib_path: &str) -> Result<Self, Error>

Load a .rknn model from a file, using a custom library path.

let model = RknnModel::load_with_lib(
    "model.rknn",
    "/opt/rknn/lib/librknnmrt.so",
)?;

pub fn load_from_bytes(model_data: &[u8], lib_path: &str) -> Result<Self, Error>

Load a model from raw bytes already in memory.

Useful when the .rknn file is embedded in your binary or received over the network.

pub fn input_attr(&self) -> &TensorAttr

Input tensor metadata (shape, format, data type).

The shape is typically [1, H, W, 3] (NHWC). Use this to know what image size the model expects:

let input = model.input_attr();
let (h, w) = (input.shape[1], input.shape[2]);
println!("Model expects {}x{} RGB image", h, w);

pub fn output_attrs(&self) -> &[TensorAttr]

Output tensor metadata for all outputs.

Most models have a single output, but some have several. Each TensorAttr contains the shape, format, and quantization parameters (zero-point and scale): everything you need to decode the output.

pub fn run(&self, input: &[u8]) -> Result<(), Error>

Run inference on the NPU.

input must be raw RGB bytes in NHWC format ([1, H, W, 3]). No normalization, no channel reordering - just plain u8 pixel values.

After this returns, read results with output_raw or output_f32.
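
Since the input is a flat byte buffer, it can help to verify its length against the tensor shape before calling run. The following is a minimal sketch with hypothetical helpers (`expected_input_len`, `check_input`); it assumes the shape is available as a slice of `u32` dimensions, which may not match the actual field type of TensorAttr, and it says nothing about how run itself reacts to a wrong-sized buffer.

```rust
/// Number of bytes a u8 tensor with the given shape occupies
/// (product of all dimensions, e.g. 1 * H * W * 3 for NHWC RGB).
fn expected_input_len(shape: &[u32]) -> usize {
    shape.iter().map(|&d| d as usize).product()
}

/// Hypothetical pre-flight check: does the buffer match the shape?
fn check_input(shape: &[u32], input: &[u8]) -> Result<(), String> {
    let expected = expected_input_len(shape);
    if input.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "input has {} bytes, but shape {:?} needs {}",
            input.len(),
            shape,
            expected
        ))
    }
}
```

For a [1, 320, 320, 3] input this works out to 307,200 bytes, one byte per channel per pixel.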

§What happens inside
  1. Copies input bytes into the pre-allocated NPU input buffer.
  2. Calls rknn_run() - the NPU executes the model.
  3. Calls rknn_mem_sync() on each output buffer (syncs NPU cache to CPU). This step is critical on RV1106: in the author's testing, skipping it produced stale output data.

pub fn output_raw(&self, index: usize) -> Result<&[i8], Error>

Raw INT8 output data for the given output index.

Returns a slice pointing directly into the NPU’s zero-copy buffer. No allocation, no copying - this is as fast as it gets.

The data is in whatever layout the NPU uses (often NC1HWC2). Use nc1hwc2_to_flat to convert it to standard NCHW if needed.
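
To make the layout concrete, here is a minimal sketch of an NC1HWC2-to-NCHW conversion for a single batch. It assumes the usual meaning of NC1HWC2: channels are split into C1 blocks of C2 channels each, stored as [N, C1, H, W, C2], with the last block zero-padded. The function `nc1hwc2_to_nchw` and its parameters are hypothetical and only illustrate the index arithmetic; the crate's own nc1hwc2_to_flat may have a different signature.

```rust
/// Convert a single-batch NC1HWC2 buffer to NCHW, dropping channel padding.
/// `c` is the real channel count, `c2` the channel block size.
fn nc1hwc2_to_nchw(src: &[i8], c: usize, h: usize, w: usize, c2: usize) -> Vec<i8> {
    // Number of channel blocks, rounding up so padded channels are covered.
    let c1 = (c + c2 - 1) / c2;
    assert_eq!(src.len(), c1 * h * w * c2, "unexpected buffer length");

    let mut dst = vec![0i8; c * h * w];
    for ch in 0..c {
        // Which block the channel lives in, and its position inside the block.
        let (block, lane) = (ch / c2, ch % c2);
        for y in 0..h {
            for x in 0..w {
                let src_idx = ((block * h + y) * w + x) * c2 + lane;
                let dst_idx = (ch * h + y) * w + x;
                dst[dst_idx] = src[src_idx];
            }
        }
    }
    dst
}
```

With 3 channels and a block size of 4, each pixel occupies 4 bytes in the source (3 values plus 1 padding byte), and the conversion regroups the values so that each channel's plane is contiguous.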

§Errors

Returns Error::InvalidIndex if index is out of range.

pub fn output_f32(&self, index: usize) -> Result<Vec<f32>, Error>

Dequantized f32 output for the given output index.

Converts each raw INT8 value to f32 using affine dequantization:

value = (raw_i8 - zero_point) * scale

Zero-point and scale are read from the tensor’s quantization parameters (set during model conversion).
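
The formula above can be sketched as a standalone function. The name `dequantize` and its parameter types (`i32` zero-point, `f32` scale) are assumptions for illustration; the crate's dequantize_affine may be declared differently.

```rust
/// Affine dequantization: value = (raw_i8 - zero_point) * scale.
/// The subtraction is done in i32 so that it cannot overflow i8.
fn dequantize(raw: &[i8], zero_point: i32, scale: f32) -> Vec<f32> {
    raw.iter()
        .map(|&q| (q as i32 - zero_point) as f32 * scale)
        .collect()
}
```

For example, with zero_point = -128 and scale = 0.5, the raw values -128, 0 and 127 map to 0.0, 64.0 and 127.5.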

Note: This allocates a new Vec<f32>. If you need to dequantize only part of the output (e.g. after NC1HWC2 conversion), use dequantize_affine directly.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.