neug-rust 0.2.1

Safe Rust bindings for the NeuG C++ graph database engine

neug-rust

A safe Rust wrapper for the alibaba/neug C++ graph database engine.

High-performance embedded graph database for analytics and real-time transactions.
graphscope.io/neug

Overview

This project provides high-level, idiomatic Rust bindings to the neug C++ library. It is designed as a Cargo workspace with several crates:

  • neug-sys: Contains the low-level, unsafe FFI bindings generated by bindgen, the C++ compilation scripts, and the neug-worker binary.
  • neug-protocol: Pure Rust data structures for Inter-Process Communication (IPC).
  • neug-rust (in the neug-bindings directory): Provides a safe, user-friendly Rust API (Database, Connection, etc.) that orchestrates the underlying C++ engine via the sidecar worker.

Note: The CMake build mechanism in this repository was heavily inspired by the zvec-rust-binding project. Like that project, our build script automatically downloads the required C++ source code during cargo build if it is not present locally.

Architecture: The Sidecar Pattern

To prevent One Definition Rule (ODR) violations and the resulting crashes (e.g. SIGABRT during initialization) that occur when neug-cpp is linked statically alongside other heavy C++ libraries (such as zvec, which uses a conflicting version of Protobuf), this project uses a Sidecar / IPC architecture.

When you create a new Database in Rust, neug-rust will transparently spawn a hidden child process (neug-worker). This completely isolates the C++ memory space, static initializers, and dependencies from your main Rust application.

  • Your Rust app (neug-rust) communicates with the isolated C++ engine via high-speed, binary IPC (bincode over stdin/stdout).
  • The worker process is automatically managed and terminated when the Database goes out of scope.
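To make the stdin/stdout channel concrete, here is a minimal sketch of how binary messages can be delimited on a byte stream. It assumes a hypothetical 4-byte little-endian length prefix; the actual wire format used by neug-protocol is not documented here and may differ. The names write_frame and read_frame are illustrative and not part of the neug-rust API. The payload stands in for a bincode-encoded message.

```rust
use std::io::{self, Read, Write};

/// Write one message as a 4-byte little-endian length followed by the payload.
/// (Hypothetical framing; the real neug-protocol format may differ.)
fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    out.write_all(&(payload.len() as u32).to_le_bytes())?;
    out.write_all(payload)?;
    out.flush()
}

/// Read one frame produced by `write_frame`.
fn read_frame<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    input.read_exact(&mut len_buf)?;
    let len = u32::from_le_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    // An in-memory buffer stands in for the worker's stdin/stdout pipes.
    let mut pipe = Vec::new();
    write_frame(&mut pipe, b"MATCH (n) RETURN count(n);")?;
    write_frame(&mut pipe, b"")?;

    let mut reader = io::Cursor::new(pipe);
    assert_eq!(read_frame(&mut reader)?, b"MATCH (n) RETURN count(n);".to_vec());
    assert!(read_frame(&mut reader)?.is_empty());
    Ok(())
}
```

Because the functions are generic over Read and Write, the same code would work against a child process's ChildStdin and ChildStdout handles from std::process.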

Installing the Worker

Because the C++ engine is isolated in a separate binary, you must build neug-worker and make sure it is available on your $PATH before your application can open a database:

cargo install --path neug-sys --bin neug-worker

(If you are developing inside this workspace, running cargo build is enough, as the worker is built alongside the library.)
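If you want your application to fail early with a clear message when the worker is missing, a small startup check can help. The sketch below only uses the standard library. find_in_path is a hypothetical helper, not part of neug-rust, and how neug-rust itself locates the worker is not specified here.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Return the first regular file named `binary` in the directories of `path_var`.
/// On Windows the file name would need the `.exe` suffix.
fn find_in_path(binary: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}

fn main() {
    // An unset PATH is treated as empty, so the lookup simply finds nothing.
    let path_var = env::var_os("PATH").unwrap_or_default();
    match find_in_path("neug-worker", &path_var) {
        Some(path) => println!("neug-worker found at {}", path.display()),
        None => eprintln!("neug-worker not found on PATH; install it before opening a database"),
    }
}
```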

Prerequisites

Building neug from source requires several C++ dependencies installed on your system, as defined by its CMake configuration. Please ensure you have the following installed (e.g., via brew on macOS or apt on Linux):

  • CMake (>= 3.16)
  • C++20 compatible compiler (Clang/GCC)
  • OpenSSL
  • gflags, glog

Note: Heavy dependencies like Apache Arrow, Protobuf, and Abseil are automatically downloaded and built by the internal CMake script.

Getting Started

Local Development

  1. Clone the repository with submodules:

    git clone --recursive https://github.com/miofthena/neug-rust.git
    cd neug-rust
    

    (If already cloned, run git submodule update --init --recursive)

  2. Build the workspace:

    cargo build
    

    Note: The first build will take a significant amount of time (often >5 minutes) as it compiles the entire C++ codebase and its dependencies.

  3. Run tests:

    cargo test
    
  4. Run examples:

    cargo run --example simple_example
    cargo run --example crud_operations
    cargo run --example parallel_query
    

Speeding up C++ Compilation

To avoid recompiling the underlying C++ library (neug-cpp) from scratch on subsequent builds or across workspaces, we highly recommend installing a compiler cache. Our build script automatically detects and uses either of the following:

  • Install sccache (via cargo install sccache or brew/apt)
  • Or install ccache (via brew/apt)
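For readers curious how such detection can work, here is a hedged sketch of one approach a build script might take; it is not the actual neug-sys build script. It probes for each tool by running it with --version and maps the result to CMake's CMAKE_C_COMPILER_LAUNCHER and CMAKE_CXX_COMPILER_LAUNCHER variables. Preferring sccache over ccache is an assumption, and pick_launcher and launcher_defines are hypothetical names.

```rust
use std::process::Command;

/// Candidate tools in an assumed order of preference.
const CANDIDATES: [&str; 2] = ["sccache", "ccache"];

/// Return the first candidate for which `is_available` reports true.
fn pick_launcher(is_available: impl Fn(&str) -> bool) -> Option<&'static str> {
    CANDIDATES.iter().copied().find(|tool| is_available(tool))
}

/// Build the CMake defines that route compiler invocations through `launcher`.
fn launcher_defines(launcher: &str) -> Vec<String> {
    vec![
        format!("-DCMAKE_C_COMPILER_LAUNCHER={}", launcher),
        format!("-DCMAKE_CXX_COMPILER_LAUNCHER={}", launcher),
    ]
}

fn main() {
    // A tool counts as installed if `<tool> --version` runs and exits successfully.
    let installed = |tool: &str| {
        Command::new(tool)
            .arg("--version")
            .output()
            .map(|out| out.status.success())
            .unwrap_or(false)
    };

    match pick_launcher(installed) {
        Some(tool) => {
            for define in launcher_defines(tool) {
                println!("{}", define);
            }
        }
        None => println!("no compiler cache found; building without one"),
    }
}
```

Taking the availability check as a closure keeps the selection logic testable without either tool being installed.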

Performance & Benchmarks

The library is continuously benchmarked using criterion to measure the overhead introduced by the Rust FFI boundary, data preparation, and full end-to-end workload execution. Because the wrappers are thin, the dispatch overhead is negligible.

Recent benchmarks on an Apple Silicon (M-series) chip highlight the performance and stability improvements made to the Rust/C++ interface, specifically regarding safe string buffer allocations during CREATE statements and WAL ingestion:

  • Connection Lifecycle: ~32-33 ns - The time required to request a connection proxy from the C++ engine pool.
  • Query Dispatch (Parse & Execute): ~97.7 µs - The total time it takes to allocate strings in Rust, pass them across the FFI boundary, and have the C++ engine parse and execute a simple Cypher MATCH query.
  • Graph Insertion (DML): ~130 µs - End-to-end execution of a CREATE (:Person {id: X, name: '...'}) statement. Includes dynamic expansion of string buffers without memory corruption or "Out of bounds" exceptions.
  • Graph Traversal (Pathing): ~79.5 µs - Extremely fast read paths spanning multiple edges in the embedded graph.

To run the benchmarks yourself locally:

cargo bench

Usage Example

use neug_rust::{Database, Mode};
use tempfile::tempdir;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let dir = tempdir()?;
    
    // 1. Initialize the database
    let mut db = Database::open(dir.path(), Mode::ReadWrite)?;

    // 2. Open a connection
    let mut conn = db.connect()?;

    // 3. Execute DDL / DML queries
    conn.execute("CREATE NODE TABLE person(id INT64, name STRING, age INT64, PRIMARY KEY(id));")?;
    conn.execute("CREATE REL TABLE knows(FROM person TO person, weight DOUBLE);")?;

    // 4. Query data
    let _result = conn.execute("MATCH (n)-[e]-(m) RETURN count(e);")?;
    println!("Queries executed successfully.");

    // The database and connections are automatically closed when dropped.
    Ok(())
}

Contributing

  1. Add high-level, safe Rust abstractions in neug-bindings/src/.
  2. Ensure you adhere to standard Rust community practices (e.g., cargo fmt, cargo clippy).
  3. Add tests to verify your safe wrappers in neug-bindings/tests/.

License

This wrapper is licensed under the same terms as the neug project (Apache License 2.0).