# neug-rust
A Safe Rust wrapper for the [alibaba/neug](https://github.com/alibaba/neug) C++ graph database engine.
**High-performance embedded graph database for analytics and real-time transactions.**
[graphscope.io/neug](https://graphscope.io/docs/neug)
## Overview
This project provides high-level, idiomatic Rust bindings to the `neug` C++ library. It is designed as a Cargo workspace with several crates:
* `neug-sys`: Contains the low-level, unsafe FFI bindings generated by `bindgen`, the C++ compilation scripts, and the **`neug-worker`** binary.
* `neug-protocol`: Pure Rust data structures for Inter-Process Communication (IPC).
* `neug-rust` (in the `neug-bindings` directory): Provides a safe, user-friendly Rust API (`Database`, `Connection`, etc.) that orchestrates the underlying C++ engine via the sidecar worker.
> **Note:** The intelligent CMake build mechanism of this repository was heavily inspired by the excellent [zvec-rust-binding](https://github.com/igobypenn/zvec-rust-binding/) project. Like `zvec`, our build script will automatically download the required C++ source code during `cargo build` if it is not present locally.
## Architecture: The Sidecar Pattern
To prevent profound **One Definition Rule (ODR) Violations** and segmentation faults (e.g. `SIGABRT` during initialization) that occur when linking `neug-cpp` statically alongside other heavy C++ libraries (such as `zvec` which uses a conflicting version of Protobuf), this project utilizes a **Sidecar / IPC Architecture**.
When you create a new `Database` in Rust, `neug-rust` will transparently spawn a hidden child process (`neug-worker`). This completely isolates the C++ memory space, static initializers, and dependencies from your main Rust application.
* Your Rust app (`neug-rust`) communicates with the isolated C++ engine via high-speed, binary IPC (`bincode` over `stdin/stdout`).
* The worker process is automatically managed and terminated when the `Database` goes out of scope.
### Installing the Worker
Because the C++ engine is now isolated in a separate binary, you **must** build and ensure `neug-worker` is available in your `$PATH` before your application can open a database:
```bash
cargo install --path neug-sys --bin neug-worker
```
*(If you are developing inside this workspace, simply running `cargo build` is enough as the worker is built alongside the library).*
## Prerequisites
Building `neug` from source requires several C++ dependencies installed on your system, as defined by its CMake configuration. Please ensure you have the following installed (e.g., via `brew` on macOS or `apt` on Linux):
- CMake (>= 3.16)
- C++20 compatible compiler (Clang/GCC)
- OpenSSL
- gflags, glog
*Note: Heavy dependencies like Apache Arrow, Protobuf, and Abseil are automatically downloaded and built by the internal CMake script.*
## Getting Started
### Local Development
1. **Clone the repository with submodules:**
```bash
git clone --recursive https://github.com/miofthena/neug-rust.git
cd neug-rust
```
*(If already cloned, run `git submodule update --init --recursive`)*
2. **Build the workspace:**
```bash
cargo build
```
*Note: The first build will take a significant amount of time (often >5 minutes) as it compiles the entire C++ codebase and its dependencies.*
3. **Run Tests:**
```bash
cargo test
```
4. **Run Examples:**
```bash
cargo run --example simple_example
cargo run --example crud_operations
cargo run --example parallel_query
```
### Speeding up C++ Compilation
To prevent the underlying C++ library (`neug-cpp`) from compiling from scratch on subsequent builds or across workspaces, it is highly recommended to install a compiler cache tool. Our build script automatically detects and utilizes them:
* Install `sccache` (via `cargo install sccache` or brew/apt)
* Or install `ccache` (via brew/apt)
## Performance & Benchmarks
The library is continuously benchmarked using `criterion` to measure the overhead introduced by the Rust FFI boundary, data preparation, and full end-to-end workload executions. Because the wrappers are extremely thin, the actual dispatch overhead is practically non-existent.
Recent benchmarks on an Apple Silicon (M-series) chip highlight the performance and stability improvements made to the Rust/C++ interface, specifically regarding safe string buffer allocations during `CREATE` statements and `WAL` ingestion:
* **Connection Lifecycle:** `~32-33 ns` - The time required to request a connection proxy from the C++ engine pool.
* **Query Dispatch (Parse & Execute):** `~97.7 µs` - The total time it takes to allocate strings in Rust, pass them across the FFI boundary, and have the C++ engine parse and execute a simple Cypher `MATCH` query.
* **Graph Insertion (DML):** `~130 µs` - End-to-end execution of a `CREATE (:Person {id: X, name: '...'})` statement. Includes dynamic expansion of string buffers without memory corruption or "Out of bounds" exceptions.
* **Graph Traversal (Pathing):** `~79.5 µs` - Extremely fast read paths spanning multiple edges in the embedded graph.
To run the benchmarks yourself locally:
```bash
cargo bench
```
## Usage Example
```rust
use neug_rust::{Database, Mode};
use tempfile::tempdir;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let dir = tempdir()?;
// 1. Initialize the database
let mut db = Database::open(dir.path(), Mode::ReadWrite)?;
// 2. Open a connection
let mut conn = db.connect()?;
// 3. Execute DDL / DML queries
conn.execute("CREATE NODE TABLE person(id INT64, name STRING, age INT64, PRIMARY KEY(id));")?;
conn.execute("CREATE REL TABLE knows(FROM person TO person, weight DOUBLE);")?;
// 4. Query data
let _result = conn.execute("MATCH (n)-[e]-(m) return count(e);")?;
println!("Queries executed successfully.");
// The database and connections are automatically closed when dropped.
Ok(())
}
```
## Contributing
1. Add high-level, safe Rust abstractions in `neug-bindings/src/`.
2. Ensure you adhere to standard Rust community practices (e.g., `cargo fmt`, `cargo clippy`).
3. Add tests to verify your safe wrappers in `neug-bindings/tests/`.
## License
This wrapper is licensed under the same terms as the `neug` project (Apache License 2.0).