pub struct RknnModel { /* private fields */ }
A loaded RKNN model ready for inference.
This is the main type you interact with. It holds the model, pre-allocated zero-copy memory buffers for input and outputs, and handles all communication with the NPU.
§Lifecycle
- Load a model with load or load_with_lib. This initializes the NPU context and allocates memory buffers.
- Inspect tensor metadata via input_attr and output_attrs to learn expected shapes and formats.
- Run inference with run - pass raw RGB bytes (NHWC, u8, no normalization).
- Read results with output_raw (zero-copy &[i8]) or output_f32 (dequantized Vec<f32>).
§Example
use rknn_runtime::RknnModel;
let model = RknnModel::load("model.rknn")?;
// Check what the model expects
let input = model.input_attr();
// e.g. [1, 320, 320, 3]
println!("Input shape: {:?}", input.shape);
// Run inference
model.run(&rgb_bytes)?;
// Get raw INT8 output (zero-copy - no allocation, just a slice into NPU memory)
let raw = model.output_raw(0)?;
// Or get dequantized f32 output (allocates a new Vec)
let floats = model.output_f32(0)?;
§Drop order
Internally, memory buffers are dropped before the RKNN context. This is handled automatically - you don’t need to worry about it.
Implementations§
impl RknnModel
pub fn load(model_path: &str) -> Result<Self, Error>
Load a .rknn model from a file.
Uses the default library path (/usr/lib/librknnmrt.so).
If your librknnmrt.so is elsewhere, use load_with_lib.
§Errors
- Error::IoError if the file cannot be read.
- Error::LibraryNotFound if librknnmrt.so is not found.
- Error::InitFailed if the NPU rejects the model.
pub fn load_with_lib(model_path: &str, lib_path: &str) -> Result<Self, Error>
Load a .rknn model from a file, using a custom library path.
let model = RknnModel::load_with_lib(
"model.rknn",
"/opt/rknn/lib/librknnmrt.so",
)?;
pub fn load_from_bytes(model_data: &[u8], lib_path: &str) -> Result<Self, Error>
Load a model from raw bytes already in memory.
Useful when the .rknn file is embedded in your binary or received
over the network.
pub fn input_attr(&self) -> &TensorAttr
Input tensor metadata (shape, format, data type).
The shape is typically [1, H, W, 3] (NHWC).
Use this to know what image size the model expects:
let input = model.input_attr();
let (h, w) = (input.shape[1], input.shape[2]);
println!("Model expects {}x{} RGB image", h, w);
pub fn output_attrs(&self) -> &[TensorAttr]
Output tensor metadata for all outputs.
Most models have a single output, but some have several.
Each TensorAttr contains the shape, format, quantization zero-point
and scale - everything you need to decode the output.
pub fn run(&self, input: &[u8]) -> Result<(), Error>
Run inference on the NPU.
input must be raw RGB bytes in NHWC format ([1, H, W, 3]).
No normalization, no channel reordering - just plain u8 pixel values.
After this returns, read results with output_raw
or output_f32.
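Because run takes a plain byte slice, a wrongly sized buffer is easy to pass by accident. Below is a minimal sketch of a pre-flight length check. The helper name expected_input_len and the u32 dimension type are assumptions made for illustration, not part of this crate; adapt them to whatever TensorAttr actually exposes.

```rust
/// Hypothetical helper (not part of rknn_runtime): number of bytes a u8
/// tensor with the given shape occupies, or None if the product overflows.
fn expected_input_len(shape: &[u32]) -> Option<usize> {
    shape
        .iter()
        .try_fold(1usize, |acc, &d| acc.checked_mul(d as usize))
}

fn main() {
    // Shape taken from the example on this page: [1, 320, 320, 3] (NHWC).
    let shape = [1u32, 320, 320, 3];
    // Stand-in for real pixel data: one u8 per channel per pixel.
    let rgb_bytes = vec![0u8; 320 * 320 * 3];

    let expected = expected_input_len(&shape).expect("shape overflows usize");
    assert_eq!(rgb_bytes.len(), expected);
    println!("input buffer holds {} bytes", expected);
}
```

Checking the length before calling run turns a silent shape mismatch into an explicit failure at the call site.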
§What happens inside
- Copies input bytes into the pre-allocated NPU input buffer.
- Calls rknn_run() - the NPU executes the model.
- Calls rknn_mem_sync() on each output buffer (syncs NPU cache to CPU). In my testing this step is critical on RV1106 - without it, the CPU reads stale data.
pub fn output_raw(&self, index: usize) -> Result<&[i8], Error>
Raw INT8 output data for the given output index.
Returns a slice pointing directly into the NPU’s zero-copy buffer. No allocation, no copying - this is as fast as it gets.
The data is in whatever layout the NPU uses (often NC1HWC2).
Use nc1hwc2_to_flat to convert it
to standard NCHW if needed.
§Errors
Returns Error::InvalidIndex if index is out of range.
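To illustrate what an NC1HWC2-to-NCHW conversion involves, here is a rough sketch. It is not the crate's nc1hwc2_to_flat: the function name, parameters, and the assumption that the buffer is indexed as [C1][H][W][C2] with batch size 1 (logical channel c stored in block c / C2, lane c % C2) are all illustrative.

```rust
/// Hypothetical helper, not the crate's `nc1hwc2_to_flat`.
/// Assumes N = 1 and a buffer indexed as [C1][H][W][C2], where logical
/// channel c sits in block c / c2 at lane c % c2. Padding lanes beyond
/// `channels` are skipped.
fn nc1hwc2_to_nchw(src: &[i8], channels: usize, h: usize, w: usize, c2: usize) -> Vec<i8> {
    assert!(c2 > 0, "c2 must be non-zero");
    let c1 = (channels + c2 - 1) / c2; // number of channel blocks, rounded up
    assert_eq!(src.len(), c1 * h * w * c2, "unexpected buffer size");

    let mut dst = vec![0i8; channels * h * w];
    for c in 0..channels {
        let (block, lane) = (c / c2, c % c2);
        for y in 0..h {
            for x in 0..w {
                let s = ((block * h + y) * w + x) * c2 + lane;
                dst[(c * h + y) * w + x] = src[s];
            }
        }
    }
    dst
}

fn main() {
    // 3 logical channels, 1x2 spatial, C2 = 2, so C1 = 2 with one padding lane.
    // Block 0: (x0: c0, c1) (x1: c0, c1); block 1: (x0: c2, pad) (x1: c2, pad).
    let src: [i8; 8] = [10, 20, 11, 21, 30, 0, 31, 0];
    let out = nc1hwc2_to_nchw(&src, 3, 1, 2, 2);
    assert_eq!(out, vec![10, 11, 20, 21, 30, 31]);
}
```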
pub fn output_nc1hwc2_layout(&self, index: usize) -> Result<Nc1hwc2Layout, Error>
Precomputed NC1HWC2 layout for the given output index.
Returns an Nc1hwc2Layout with all shape and quantization parameters
precomputed. Use this at model load time to prepare channel offset tables,
then use them in the per-image hot loop (usually once per video frame) without any division.
§Errors
- Error::InvalidIndex if index is out of range.
- Error::InvalidFormat if the output is not NC1HWC2.
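The idea of moving division out of the hot loop can be sketched as follows. The fields of Nc1hwc2Layout are not shown on this page, so this example uses its own hypothetical functions (channel_offsets, at) and assumes batch size 1 with a buffer indexed as [C1][H][W][C2].

```rust
/// Hypothetical sketch: per-channel base offsets into an NC1HWC2 buffer
/// (N = 1). Built once at load time; all divisions happen here.
fn channel_offsets(channels: usize, h: usize, w: usize, c2: usize) -> Vec<usize> {
    assert!(c2 > 0, "c2 must be non-zero");
    (0..channels)
        .map(|c| (c / c2) * h * w * c2 + (c % c2))
        .collect()
}

/// Hot-loop lookup: one table read plus multiplications and additions.
#[inline]
fn at(buf: &[i8], offsets: &[usize], c: usize, y: usize, x: usize, w: usize, c2: usize) -> i8 {
    buf[offsets[c] + (y * w + x) * c2]
}

fn main() {
    let (channels, h, w, c2) = (3, 1, 2, 2);
    let offsets = channel_offsets(channels, h, w, c2);

    // Same toy buffer layout: block 0 holds channels 0 and 1, block 1 holds
    // channel 2 plus one padding lane.
    let buf: [i8; 8] = [10, 20, 11, 21, 30, 0, 31, 0];
    assert_eq!(offsets, vec![0, 1, 4]);
    assert_eq!(at(&buf, &offsets, 1, 0, 1, w, c2), 21);
    assert_eq!(at(&buf, &offsets, 2, 0, 0, w, c2), 30);
}
```

The table costs one usize per channel and is reused for every frame, which is the trade the description above is pointing at.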
pub fn output_f32(&self, index: usize) -> Result<Vec<f32>, Error>
Dequantized f32 output for the given output index.
Converts each raw INT8 value to f32 using affine dequantization:
value = (raw_i8 - zero_point) * scale
Zero-point and scale are read from the tensor’s quantization parameters (set during model conversion).
Note: This allocates a new Vec<f32>. If you need to dequantize
only part of the output (e.g. after NC1HWC2 conversion), use
dequantize_affine directly.
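The affine formula above can be sketched as a standalone function. This is a hypothetical stand-in for illustration only; the crate's own dequantize_affine may have a different signature, and the i32 zero-point type is an assumption. The value is widened to i32 before subtracting so that the difference cannot overflow i8.

```rust
/// Hypothetical stand-in for illustration; not the crate's
/// `dequantize_affine`. Applies value = (raw - zero_point) * scale.
fn dequantize(raw: &[i8], zero_point: i32, scale: f32) -> Vec<f32> {
    raw.iter()
        // Widen to i32 first: (127 - (-128)) does not fit in i8.
        .map(|&q| (q as i32 - zero_point) as f32 * scale)
        .collect()
}

fn main() {
    // Quantization parameters chosen for illustration only.
    let raw: [i8; 3] = [-128, 0, 127];
    let out = dequantize(&raw, -128, 0.5);
    assert_eq!(out, vec![0.0, 64.0, 127.5]);

    // Dequantizing only part of a buffer is just a matter of slicing.
    let tail = dequantize(&raw[1..], -128, 0.5);
    assert_eq!(tail, vec![64.0, 127.5]);
}
```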