# ARA-2 Client Library
[Tests](https://github.com/EdgeFirstAI/ara2-rs/actions/workflows/test.yml)
[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
[crates.io](https://crates.io/crates/ara2)

Rust client library for the [Kinara](https://kinara.ai) ARA-2 neural network
accelerator. Provides session management, model loading, and inference on
NXP i.MX platforms equipped with ARA-2 PCIe hardware.
## Supported Platforms
| Board | SoC | Status |
|-------|-----|--------|
| [NXP FRDM i.MX 8M Plus](https://www.nxp.com/design/design-center/development-boards-and-designs/FRDM-IMX8MPLUS) | i.MX 8M Plus | Tested |
| [NXP FRDM i.MX 95](https://www.nxp.com/design/design-center/development-boards-and-designs/i-mx-evaluation-and-development-boards/freedom-development-platform-for-i-mx-95:FRDM-IMX95) | i.MX 95 | Tested |

Requires [EdgeFirst Yocto Images](https://github.com/EdgeFirstAI/yocto) with ARA-2 SDK support.
## Workspace
| Crate | Description |
|-------|-------------|
| [`ara2`](crates/ara2) | Core client library — session, endpoint, model, and DVM metadata APIs |
| [`ara2-sys`](crates/ara2-sys) | FFI bindings to `libaraclient.so` via `libloading` |
### Integration with edgefirst-hal
The `ara2` crate integrates with [`edgefirst-hal`](https://crates.io/crates/edgefirst-hal)
(enabled by default via the `hal` feature) for:
- **Tensor memory management** — DMA-backed tensors for zero-copy NPU transfers
- **Image preprocessing** — Hardware-accelerated format conversion and scaling
- **Post-processing** — YOLO decoding, overlay rendering, segmentation masks
Disable the `hal` feature for a minimal FFI-only build:
```bash
cargo build --no-default-features
```
### Python Bindings
Python bindings are available as a separate package via PyPI:
```bash
pip install edgefirst-ara2
```
See [`crates/ara2-py/README.md`](crates/ara2-py/README.md) for the Python API reference.
## Quick Start
```rust
use ara2::{Session, DEFAULT_SOCKET};
use edgefirst_hal::tensor::{TensorMemory, TensorTrait as _};
// Connect to the ARA-2 proxy service
let session = Session::create_via_unix_socket(DEFAULT_SOCKET)?;
// Enumerate NPU endpoints and check status
let endpoints = session.list_endpoints()?;
let endpoint = &endpoints[0];
println!("Endpoint state: {:?}", endpoint.check_status()?);
// Load a compiled model (.dvm) and allocate DMA tensors
let mut model = endpoint.load_model_from_file("model.dvm".as_ref())?;
model.allocate_tensors(Some(TensorMemory::Dma))?;
// Run inference
let timing = model.run()?;
println!("NPU inference: {:?}", timing.run_time);
# Ok::<(), ara2::Error>(())
```
## Runtime Requirements
The following must be present on the target system:
- **`libaraclient.so.1`** — Kinara client library (from the ARA-2 SDK)
- **`ara2-proxy`** — System service providing NPU access, must be running
- **ARA-2 hardware** — PCIe accelerator card visible via `lspci`
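
A quick preflight check can catch a missing requirement before the first model load fails. The sketch below covers only the first item, looking for `libaraclient.so.1` in a few directories. The directory list is an assumption (the dynamic loader's actual search path depends on the image), and `find_in_dirs` is a hypothetical helper, not part of `ara2`. It does not check that `ara2-proxy` is running or that the card shows up in `lspci`.

```rust
use std::path::{Path, PathBuf};

/// Look for `file_name` in each candidate directory and return the first hit.
/// Hypothetical helper for illustration; not part of the `ara2` API.
fn find_in_dirs(file_name: &str, dirs: &[&Path]) -> Option<PathBuf> {
    dirs.iter()
        .map(|dir| dir.join(file_name))
        .find(|candidate| candidate.exists())
}

fn main() {
    // Assumed search locations; adjust for the target image.
    let lib_dirs = [
        Path::new("/usr/lib"),
        Path::new("/usr/lib64"),
        Path::new("/lib"),
    ];

    match find_in_dirs("libaraclient.so.1", &lib_dirs) {
        Some(path) => println!("found client library at {}", path.display()),
        None => println!("libaraclient.so.1 not found in the assumed directories"),
    }
}
```

A missing result here does not prove the library is absent, only that it is not in the directories listed.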
## Building
### Native
```bash
cargo build --release
```
### Cross-compile for aarch64 (NXP i.MX)
```bash
cargo zigbuild --release --target aarch64-unknown-linux-gnu
```
## Performance
Benchmarked on NXP i.MX 8M Plus + ARA-2 with YOLOv8n (640x640). The Python
API adds about 1.3 ms per frame (roughly 8%) over native Rust, thanks to
DMA-BUF zero-copy tensor sharing: the GPU and NPU operate on the same
physical buffers with no CPU copies in the data path.
| Stage | Rust | Python | Delta |
|-------|-----:|-------:|------:|
| GPU preprocess (RGBA → CHW) | 6.35 ms | 6.37 ms | +0.02 ms |
| NPU inference (wall clock) | 8.95 ms | 9.13 ms | +0.18 ms |
| NPU execution | 3.33 ms | 3.33 ms | — |
| DMA input upload | 2.21 ms | 2.20 ms | — |
| DMA output download | 1.96 ms | 1.96 ms | — |
| Postprocess (decode + NMS) | 1.41 ms | 2.53 ms | +1.12 ms |
| **Total pipeline** | **16.71 ms** | **18.03 ms** | **+1.32 ms** |
| **Throughput** | **59.9 FPS** | **55.5 FPS** | |
> Steady-state mean over 20 iterations after warmup. Nearly all of the Python
> overhead (1.12 ms of 1.32 ms) is in postprocessing (numpy array marshalling);
> GPU preprocessing and NPU inference differ by at most 0.18 ms since both use
> the same DMA-BUF tensors.
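
The totals and throughput figures follow from simple arithmetic on the per-stage timings. As a rough sketch of that calculation (the helper names `fps_from_ms` and `steady_state_mean` are illustrative and not part of the `ara2` API, and the benchmark's actual warmup count is not stated here):

```rust
/// Convert a per-frame latency in milliseconds to frames per second.
fn fps_from_ms(latency_ms: f64) -> f64 {
    1000.0 / latency_ms
}

/// Mean of the samples left after discarding the first `warmup` entries.
/// Returns `None` when nothing remains to average.
fn steady_state_mean(samples_ms: &[f64], warmup: usize) -> Option<f64> {
    let steady = samples_ms.get(warmup..)?;
    if steady.is_empty() {
        return None;
    }
    Some(steady.iter().sum::<f64>() / steady.len() as f64)
}

fn main() {
    // Top-level stage timings from the benchmark, in milliseconds:
    // GPU preprocess + NPU inference (wall clock) + postprocess.
    let rust_total = 6.35 + 8.95 + 1.41;
    let python_total = 6.37 + 9.13 + 2.53;

    println!("Rust:   {:.2} ms, {:.1} FPS", rust_total, fps_from_ms(rust_total));
    println!("Python: {:.2} ms, {:.1} FPS", python_total, fps_from_ms(python_total));
    println!("Python overhead: {:.2} ms", python_total - rust_total);

    // Made-up samples: one slow warmup iteration followed by steady-state runs.
    let samples = [42.0, 16.8, 16.6, 16.7];
    if let Some(mean) = steady_state_mean(&samples, 1) {
        println!("steady-state mean: {:.2} ms", mean);
    }
}
```

Throughput computed this way from the rounded totals can differ from the published FPS in the last digit, since the published values were presumably derived from unrounded measurements.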
## Examples
| Example | Description |
|---------|-------------|
| [`yolov8.rs`](examples/yolov8.rs) | Rust — YOLOv8 detection/segmentation with HAL pre/post-processing |
| [`yolov8.py`](examples/yolov8.py) | Python — YOLOv8 detection with DMA-BUF pipeline and HAL decoder |
| [`endpoints.py`](examples/endpoints.py) | Python — Connect, list endpoints, check status |
| [`test_dvm_metadata.rs`](examples/test_dvm_metadata.rs) | Rust — Read and display DVM model metadata |

Run examples:
```bash
# Rust
cargo run --release --example yolov8 -- model.dvm image.jpg --benchmark 20
# Python
python examples/yolov8.py model.dvm image.jpg --benchmark 20
```
## Testing
Most tests require an NXP i.MX + ARA-2 system with the proxy running; the metadata tests run without hardware:
```bash
# All tests (on-target with hardware)
cargo test -p ara2
# Metadata tests only (no hardware needed)
cargo test -p ara2 dvm_metadata
# Model tests (needs a .dvm file)
ARA2_TEST_MODEL=/path/to/model.dvm cargo test -p ara2 model
```
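
Because the model tests depend on `ARA2_TEST_MODEL`, a test can skip itself when the variable is absent instead of failing. A minimal sketch of that pattern, assuming an unset or empty value should mean "skip"; `model_path_from` is a hypothetical helper, and the crate's own tests may handle this differently:

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Name of the environment variable used by the model tests.
const MODEL_ENV: &str = "ARA2_TEST_MODEL";

/// Turn the raw variable value into a model path.
/// Unset or empty values yield `None` so the caller can skip the test.
fn model_path_from(value: Option<OsString>) -> Option<PathBuf> {
    value.filter(|v| !v.is_empty()).map(PathBuf::from)
}

fn main() {
    match model_path_from(std::env::var_os(MODEL_ENV)) {
        Some(path) => println!("running model tests against {}", path.display()),
        None => println!("{} not set, skipping model tests", MODEL_ENV),
    }
}
```

Taking the value as a parameter rather than reading the environment inside the helper keeps it easy to exercise without mutating process state.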
## Documentation
- [ARCHITECTURE.md](ARCHITECTURE.md) — System architecture and ownership model
- [CONTRIBUTING.md](CONTRIBUTING.md) — Contribution guidelines
- [SECURITY.md](SECURITY.md) — Security policy
- [CHANGELOG.md](CHANGELOG.md) — Release history
## License
Licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.
Copyright 2025 Au-Zone Technologies. All Rights Reserved.