ara2 0.2.0


ARA-2 Client Library


Rust client library for the Kinara ARA-2 neural network accelerator. Provides session management, model loading, and inference on NXP i.MX platforms equipped with ARA-2 PCIe hardware.

Supported Platforms

Platform               SoC           Status
NXP FRDM i.MX 8M Plus  i.MX 8M Plus  Tested
NXP FRDM i.MX 95       i.MX 95       Tested

Requires EdgeFirst Yocto Images with ARA-2 SDK support.

Workspace

Crate     Description
ara2      Core client library — session, endpoint, model, and DVM metadata APIs
ara2-sys  FFI bindings to libaraclient.so via libloading

Integration with edgefirst-hal

The ara2 crate integrates with edgefirst-hal (enabled by default via the hal feature) for:

  • Tensor memory management — DMA-backed tensors for zero-copy NPU transfers
  • Image preprocessing — Hardware-accelerated format conversion and scaling
  • Post-processing — YOLO decoding, overlay rendering, segmentation masks

Disable the hal feature for a minimal FFI-only build:

cargo build --no-default-features
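Downstream crates can express the same thing in their Cargo.toml (the version shown is illustrative; match it to the release you depend on):

```toml
[dependencies]
ara2 = { version = "0.2", default-features = false }
```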

Python Bindings

Python bindings are available as a separate package via PyPI:

pip install edgefirst-ara2

See crates/ara2-py/README.md for the Python API reference.

Quick Start

use ara2::{Session, DEFAULT_SOCKET};
use edgefirst_hal::tensor::{TensorMemory, TensorTrait as _};

fn main() -> Result<(), ara2::Error> {
    // Connect to the ARA-2 proxy service
    let session = Session::create_via_unix_socket(DEFAULT_SOCKET)?;

    // Enumerate NPU endpoints and check status
    let endpoints = session.list_endpoints()?;
    let endpoint = &endpoints[0];
    println!("Endpoint state: {:?}", endpoint.check_status()?);

    // Load a compiled model (.dvm) and allocate DMA tensors
    let mut model = endpoint.load_model_from_file("model.dvm".as_ref())?;
    model.allocate_tensors(Some(TensorMemory::Dma))?;

    // Run inference
    let timing = model.run()?;
    println!("NPU inference: {:?}", timing.run_time);
    Ok(())
}

Runtime Requirements

The following must be present on the target system:

  • libaraclient.so.1 — Kinara client library (from the ARA-2 SDK)
  • ara2-proxy — System service providing NPU access, must be running
  • ARA-2 hardware — PCIe accelerator card visible via lspci
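A quick way to sanity-check these prerequisites on the target. The vendor string reported by lspci and the systemd unit name ara2-proxy are assumptions; adjust them to match your SDK release and image:

```shell
# ARA-2 PCIe card enumerated? (vendor string may differ across SDK releases)
lspci | grep -i kinara

# Client library present in the loader cache?
ldconfig -p | grep libaraclient

# Proxy service running? (assumes a systemd unit named ara2-proxy)
systemctl is-active ara2-proxy
```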

Building

Native

cargo build --release

Cross-compile for aarch64 (NXP i.MX)

cargo zigbuild --release --target aarch64-unknown-linux-gnu
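If cargo-zigbuild and the target are not yet installed, a one-time setup looks like this (assumes Zig itself is already on the PATH):

```shell
# Install the zigbuild cargo subcommand and the aarch64 glibc target
cargo install cargo-zigbuild
rustup target add aarch64-unknown-linux-gnu

# Then cross-compile as above
cargo zigbuild --release --target aarch64-unknown-linux-gnu
```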

Performance

Benchmarked on an NXP i.MX 8M Plus + ARA-2 with YOLOv8n (640x640). The Python API adds minimal overhead over native Rust thanks to DMA-BUF zero-copy tensor sharing: the GPU and NPU operate on the same physical buffers, with no CPU copies in the data path.

Stage                          Rust       Python     Overhead
GPU preprocess (RGBA → CHW)    6.35 ms    6.37 ms    +0.02 ms
NPU inference (wall clock)     8.95 ms    9.13 ms    +0.18 ms
  NPU execution                3.33 ms    3.33 ms
  DMA input upload             2.21 ms    2.20 ms
  DMA output download          1.96 ms    1.96 ms
Postprocess (decode + NMS)     1.41 ms    2.53 ms    +1.12 ms
Total pipeline                 16.71 ms   18.03 ms   +1.32 ms
Throughput                     59.9 FPS   55.5 FPS

Steady-state mean over 20 iterations after warmup. The Python overhead is entirely in postprocessing (numpy array marshalling); GPU preprocessing and NPU inference are identical since both use the same DMA-BUF tensors.
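The totals can be cross-checked from the per-stage numbers (values copied from the table above):

```python
# Per-stage latencies in milliseconds, copied from the benchmark table.
rust = {"gpu_preprocess": 6.35, "npu_inference": 8.95, "postprocess": 1.41}
py = {"gpu_preprocess": 6.37, "npu_inference": 9.13, "postprocess": 2.53}

rust_total = round(sum(rust.values()), 2)       # 16.71 ms
py_total = round(sum(py.values()), 2)           # 18.03 ms
overhead = round(py_total - rust_total, 2)      # 1.32 ms, all from postprocessing

# Throughput is the reciprocal of the per-frame latency.
rust_fps = round(1000 / rust_total, 1)          # 59.8 FPS (table reports 59.9,
                                                # presumably from unrounded timings)
py_fps = round(1000 / py_total, 1)              # 55.5 FPS
```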

Examples

Example               Description
yolov8.rs             Rust — YOLOv8 detection/segmentation with HAL pre/post-processing
yolov8.py             Python — YOLOv8 detection with DMA-BUF pipeline and HAL decoder
endpoints.py          Python — Connect, list endpoints, check status
test_dvm_metadata.rs  Rust — Read and display DVM model metadata

Run examples:

# Rust
cargo run --release --example yolov8 -- model.dvm image.jpg --benchmark 20

# Python
python examples/yolov8.py model.dvm image.jpg --benchmark 20

Testing

Tests require an NXP i.MX + ARA-2 system with the proxy running:

# All tests (on-target with hardware)
cargo test -p ara2

# Metadata tests only (no hardware needed)
cargo test -p ara2 dvm_metadata

# Model tests (needs a .dvm file)
ARA2_TEST_MODEL=/path/to/model.dvm cargo test -p ara2 model

License

Licensed under the Apache License 2.0. See LICENSE for details.

Copyright 2025 Au-Zone Technologies. All Rights Reserved.