# ARA-2 Client Library
Rust client library for the Kinara ARA-2 neural network accelerator. Provides session management, model loading, and inference on NXP i.MX platforms equipped with ARA-2 PCIe hardware.
## Supported Platforms
| Platform | SoC | Status |
|---|---|---|
| NXP FRDM i.MX 8M Plus | i.MX 8M Plus | Tested |
| NXP FRDM i.MX 95 | i.MX 95 | Tested |
Requires EdgeFirst Yocto Images with ARA-2 SDK support.
## Workspace

| Crate | Description |
|---|---|
| `ara2` | Core client library — session, endpoint, model, and DVM metadata APIs |
| `ara2-sys` | FFI bindings to `libaraclient.so` via `libloading` |
## Integration with edgefirst-hal

The `ara2` crate integrates with `edgefirst-hal` (enabled by default via the `hal` feature) for:
- Tensor memory management — DMA-backed tensors for zero-copy NPU transfers
- Image preprocessing — Hardware-accelerated format conversion and scaling
- Post-processing — YOLO decoding, overlay rendering, segmentation masks
Disable the hal feature for a minimal FFI-only build:
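For example, in `Cargo.toml` (the version number shown is illustrative):

```toml
[dependencies]
ara2 = { version = "0.1", default-features = false }
```

Equivalently, pass `--no-default-features` to `cargo build` when building the crate directly.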
## Python Bindings

Python bindings are available as a separate package on PyPI.
See crates/ara2-py/README.md for the Python API reference.
## Quick Start

```rust
use ara2::Session;
// (Import path, socket/file paths, and argument lists below are illustrative
// reconstructions; see the crate documentation for exact signatures.)

// Connect to the ARA-2 proxy service
let session = Session::create_via_unix_socket("/run/ara2-proxy.sock")?;

// Enumerate NPU endpoints and check status
let endpoints = session.list_endpoints()?;
let endpoint = &endpoints[0];
println!("endpoint status: {endpoint:?}");

// Load a compiled model (.dvm) and allocate DMA tensors
let mut model = endpoint.load_model_from_file("model.dvm")?;
model.allocate_tensors()?;

// Run inference
let timing = model.run()?;
println!("inference timing: {timing:?}");
# Ok::<(), Box<dyn std::error::Error>>(())
```
## Runtime Requirements
The following must be present on the target system:
- `libaraclient.so.1` — Kinara client library (from the ARA-2 SDK)
- `ara2-proxy` — system service providing NPU access; must be running
- ARA-2 hardware — PCIe accelerator card visible via `lspci`
## Building
### Native
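A standard Cargo invocation (nothing project-specific assumed):

```shell
cargo build --release
```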
### Cross-compile for aarch64 (NXP i.MX)
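A sketch using the standard glibc target triple; an EdgeFirst Yocto SDK may instead supply its own toolchain and linker configuration:

```shell
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu
```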
## Performance

Benchmarked on NXP i.MX 8M Plus + ARA-2 with YOLOv8n (640x640). The Python API adds minimal overhead over native Rust thanks to DMA-BUF zero-copy tensor sharing: the GPU and NPU operate on the same physical buffers, with no CPU copies in the data path.
| Stage | Rust | Python | Overhead |
|---|---|---|---|
| GPU preprocess (RGBA → CHW) | 6.35 ms | 6.37 ms | +0.02 ms |
| NPU inference (wall clock) | 8.95 ms | 9.13 ms | +0.18 ms |
| NPU execution | 3.33 ms | 3.33 ms | — |
| DMA input upload | 2.21 ms | 2.20 ms | — |
| DMA output download | 1.96 ms | 1.96 ms | — |
| Postprocess (decode + NMS) | 1.41 ms | 2.53 ms | +1.12 ms |
| Total pipeline | 16.71 ms | 18.03 ms | +1.32 ms |
| Throughput | 59.9 FPS | 55.5 FPS | −4.4 FPS |
Steady-state mean over 20 iterations after warmup. The Python overhead is entirely in postprocessing (numpy array marshalling); GPU preprocessing and NPU inference are identical since both use the same DMA-BUF tensors.
## Examples

| Example | Description |
|---|---|
| `yolov8.rs` | Rust — YOLOv8 detection/segmentation with HAL pre/post-processing |
| `yolov8.py` | Python — YOLOv8 detection with DMA-BUF pipeline and HAL decoder |
| `endpoints.py` | Python — Connect, list endpoints, check status |
| `test_dvm_metadata.rs` | Rust — Read and display DVM model metadata |
Run examples (invocations and paths are illustrative):

```sh
# Rust
cargo run --release --example yolov8

# Python
python examples/yolov8.py
```
## Testing

Tests require an NXP i.MX + ARA-2 system with the proxy running (test filters shown are illustrative):

```sh
# All tests (on-target with hardware)
cargo test

# Metadata tests only (no hardware needed)
cargo test metadata

# Model tests (needs a .dvm file)
ARA2_TEST_MODEL=/path/to/model.dvm cargo test model
```
## Documentation
- ARCHITECTURE.md — System architecture and ownership model
- CONTRIBUTING.md — Contribution guidelines
- SECURITY.md — Security policy
- CHANGELOG.md — Release history
## License
Licensed under the Apache License 2.0. See LICENSE for details.
Copyright 2025 Au-Zone Technologies. All Rights Reserved.