Struct RknnModel

pub struct RknnModel { /* private fields */ }

A loaded RKNN model ready for inference.

This is the main type you interact with. It holds the model, pre-allocated zero-copy memory buffers for input and outputs, and handles all communication with the NPU.

§Lifecycle

1. Load a model with load or load_with_lib. This initializes the NPU context and allocates memory buffers.
2. Inspect tensor metadata via input_attr and output_attrs to learn expected shapes and formats.
3. Run inference with run - pass raw RGB bytes (NHWC, u8, no normalization).
4. Read results with output_raw (zero-copy &[i8]) or output_f32 (dequantized Vec<f32>).

§Example

use rknn_runtime::RknnModel;

let model = RknnModel::load("model.rknn")?;

// Check what the model expects
let input = model.input_attr();
// e.g. [1, 320, 320, 3]
println!("Input shape: {:?}", input.shape);

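// Run inference. `rgb_bytes` is your raw RGB image data (NHWC, u8),
// sized to match the input shape printed above.
model.run(&rgb_bytes)?;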

// Get raw INT8 output (zero-copy - no allocation, just a slice into NPU memory)
let raw = model.output_raw(0)?;

// Or get dequantized f32 output (allocates a new Vec)
let floats = model.output_f32(0)?;

§Drop order

Internally, memory buffers are dropped before the RKNN context. This is handled automatically - you don’t need to worry about it.

Implementations§

impl RknnModel

pub fn load(model_path: &str) -> Result<Self, Error>

Load a .rknn model from a file.

Uses the default library path (/usr/lib/librknnmrt.so). If your librknnmrt.so is elsewhere, use load_with_lib.

§Errors

Error::IoError if the file cannot be read.
Error::LibraryNotFound if librknnmrt.so is not found.
Error::InitFailed if the NPU rejects the model.

pub fn load_with_lib(model_path: &str, lib_path: &str) -> Result<Self, Error>

Load a .rknn model from a file, using a custom library path.

let model = RknnModel::load_with_lib(
    "model.rknn",
    "/opt/rknn/lib/librknnmrt.so",
)?;

pub fn load_from_bytes(model_data: &[u8], lib_path: &str) -> Result<Self, Error>

Load a model from raw bytes already in memory.

Useful when the .rknn file is embedded in your binary or received over the network.

pub fn input_attr(&self) -> &TensorAttr

Input tensor metadata (shape, format, data type).

The shape is typically [1, H, W, 3] (NHWC). Use this to know what image size the model expects:

let input = model.input_attr();
let (h, w) = (input.shape[1], input.shape[2]);
println!("Model expects {}x{} RGB image", h, w);
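
As a rough sketch, the shape can also be used to work out how many bytes run expects. The helper name nhwc_input_len is hypothetical, and treating the shape as a slice of u32 is an assumption; the actual field type in TensorAttr may differ.

```rust
/// Number of bytes a u8 NHWC input of the given shape occupies.
/// Hypothetical helper: the shape is assumed to be [N, H, W, C] as u32 values.
/// Returns None if the shape is not 4-dimensional or the product overflows.
fn nhwc_input_len(shape: &[u32]) -> Option<usize> {
    if shape.len() != 4 {
        return None;
    }
    // Multiply all dimensions, bailing out on overflow.
    shape
        .iter()
        .try_fold(1usize, |acc, &dim| acc.checked_mul(dim as usize))
}
```

For a [1, 320, 320, 3] input this gives 307200 bytes, which is a cheap check to run on an image buffer before passing it to run.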

pub fn output_attrs(&self) -> &[TensorAttr]

Output tensor metadata for all outputs.

Most models have a single output, but some have several. Each TensorAttr contains the shape, format, and quantization zero-point and scale: everything you need to decode the output.

pub fn run(&self, input: &[u8]) -> Result<(), Error>

Run inference on the NPU.

input must be raw RGB bytes in NHWC format ([1, H, W, 3]). No normalization, no channel reordering - just plain u8 pixel values.
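
To illustrate the byte order this implies: in NHWC with N = 1, channel c of the pixel at (y, x) sits at index (y * W + x) * 3 + c. If your image source produces planar data (all R, then all G, then all B), it has to be interleaved first. The function below is a minimal sketch with a hypothetical name, not part of this crate.

```rust
/// Convert planar RGB (CHW: all R, then all G, then all B) into
/// interleaved RGB (HWC: R, G, B per pixel), the order `run` expects.
/// Hypothetical helper. Returns None if the buffer length is not H * W * 3.
fn planar_to_interleaved_rgb(planar: &[u8], height: usize, width: usize) -> Option<Vec<u8>> {
    let plane = height.checked_mul(width)?;
    if planar.len() != plane.checked_mul(3)? {
        return None;
    }
    let mut out = vec![0u8; planar.len()];
    // Each 3-byte chunk of the output is one pixel: [R, G, B].
    for (pixel, rgb) in out.chunks_exact_mut(3).enumerate() {
        for (c, value) in rgb.iter_mut().enumerate() {
            *value = planar[c * plane + pixel];
        }
    }
    Some(out)
}
```

Image sources that already deliver interleaved 8-bit RGB need no conversion at all.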

After this returns, read results with output_raw or output_f32.

§What happens inside

1. Copies input bytes into the pre-allocated NPU input buffer.
2. Calls rknn_run() - the NPU executes the model.
3. Calls rknn_mem_sync() on each output buffer (syncs NPU cache to CPU). In the author's testing on RV1106 this step is critical: without it, reads of the output buffer return stale data.

pub fn output_raw(&self, index: usize) -> Result<&[i8], Error>

Raw INT8 output data for the given output index.

Returns a slice pointing directly into the NPU’s zero-copy buffer. No allocation, no copying - this is as fast as it gets.

The data is in whatever layout the NPU uses (often NC1HWC2). Use nc1hwc2_to_flat to convert it to standard NCHW if needed.
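
For orientation, NC1HWC2 splits the channel axis into C1 blocks of C2 channels each, so channel c lives in block c / C2 at lane c % C2, and the last block is padded when C is not a multiple of C2. The sketch below shows that index mapping for N = 1. It is an illustration with a hypothetical name and parameters, not the implementation or signature of nc1hwc2_to_flat.

```rust
/// Illustrative NC1HWC2 -> NCHW conversion for a single batch (N = 1).
/// Hypothetical sketch; `c2` is the number of channels packed per block.
fn nc1hwc2_to_nchw_sketch(
    src: &[i8],
    channels: usize,
    height: usize,
    width: usize,
    c2: usize,
) -> Vec<i8> {
    assert!(c2 > 0, "c2 must be non-zero");
    // Number of channel blocks, rounding up so padded lanes are counted.
    let c1 = (channels + c2 - 1) / c2;
    assert_eq!(src.len(), c1 * height * width * c2, "unexpected source length");

    let mut dst = vec![0i8; channels * height * width];
    for c in 0..channels {
        let (block, lane) = (c / c2, c % c2);
        for h in 0..height {
            for w in 0..width {
                let src_idx = ((block * height + h) * width + w) * c2 + lane;
                dst[(c * height + h) * width + w] = src[src_idx];
            }
        }
    }
    dst
}
```

With 3 channels, a 1x2 spatial size, and C2 = 2, the source holds two blocks of four values, and the padded lanes of the second block are simply skipped.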

§Errors

Returns Error::InvalidIndex if index is out of range.

pub fn output_nc1hwc2_layout(&self, index: usize) -> Result<Nc1hwc2Layout, Error>

Precomputed NC1HWC2 layout for the given output index.

Returns an Nc1hwc2Layout with all shape and quantization parameters precomputed. Use this at model load time to prepare channel offset tables, then use those tables in the per-image hot loop (in most cases, once per video frame) so that the loop performs no division at all.
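
One way such an offset table might look is sketched below. The fields of Nc1hwc2Layout are not shown here, so the function names and the plain usize parameters are hypothetical stand-ins; the point is only that the division and modulo happen once per channel at load time, not once per element.

```rust
/// Build a per-channel base offset table for an NC1HWC2 buffer (N = 1).
/// Hypothetical sketch: this is the only place that divides by `c2`.
fn channel_offsets(channels: usize, height: usize, width: usize, c2: usize) -> Vec<usize> {
    assert!(c2 > 0, "c2 must be non-zero");
    (0..channels)
        .map(|c| (c / c2) * height * width * c2 + (c % c2))
        .collect()
}

/// Hot-loop lookup: one table read, one multiply, one add, no division.
/// `pixel` is the flattened spatial index `h * width + w`.
fn read_channel(src: &[i8], offsets: &[usize], c2: usize, c: usize, pixel: usize) -> i8 {
    src[offsets[c] + pixel * c2]
}
```

The table depends only on the model's output shape, so it can be built once after loading and reused for every frame.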

§Errors

Error::InvalidIndex if index is out of range.
Error::InvalidFormat if the output is not NC1HWC2.

pub fn output_f32(&self, index: usize) -> Result<Vec<f32>, Error>

Dequantized f32 output for the given output index.

Converts each raw INT8 value to f32 using affine dequantization:

value = (raw_i8 - zero_point) * scale

Zero-point and scale are read from the tensor’s quantization parameters (set during model conversion).
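
As a minimal sketch of the formula above, assuming an i32 zero-point and an f32 scale (the function name is hypothetical, and this is not the signature of dequantize_affine):

```rust
/// Affine dequantization of INT8 values: (raw - zero_point) * scale.
/// Hypothetical sketch; the subtraction is done in i32 so it cannot overflow i8.
fn dequantize_sketch(raw: &[i8], zero_point: i32, scale: f32) -> Vec<f32> {
    raw.iter()
        .map(|&q| (i32::from(q) - zero_point) as f32 * scale)
        .collect()
}
```

For example, with zero_point = -128 and scale = 0.5, a raw value of 0 becomes (0 - (-128)) * 0.5 = 64.0.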

Note: This allocates a new Vec<f32>. If you need to dequantize only part of the output (e.g. after NC1HWC2 conversion), use dequantize_affine directly.