arangors-graph-exporter (ArangoDB Rust Graph Loader)
This Rust library provides a high-performance, parallel way to load data from ArangoDB. It supports both named graphs and custom graphs, with options to specify which vertex and edge attributes to load.
API Docs Stable | API Docs Main | ArangoDB Docs | ArangoDB
Installation
Add the arangors-graph-exporter crate to the [dependencies] section of your Cargo.toml.
Usage
Initialization
There are two different approaches to initialize the graph loader:
Named Graph
A named graph is a graph in ArangoDB that has a name and whose definition is already stored in the database.
To initialize a graph loader for a named graph, use the GraphLoader::new_named method.
use ;
async
Custom Graph
A custom (or anonymous) graph is a set of vertex and edge collections that is treated as a graph but has no name and no graph definition stored in the database.
To create a graph loader for a custom graph:
use ;
async
AQL-Based Graph Loading
AQL-based graph loading provides a flexible way to load subgraphs from ArangoDB using custom AQL queries. This approach is particularly well-suited for:
- Loading relatively small subgraphs
- Using indexes or traversals to find the right subgraph
- Applying complex filtering conditions to vertices and edges
- Executing graph traversals to define the subgraph
Graph Loading Specification
A "graph loading specification" is a list of lists of AQL queries, where each query is a pair of a query string and a map of bind parameters. The specification has the following semantics:
- The outer list is executed sequentially
- Each inner list contains queries that can be executed in parallel
- Each query must return items that may carry a vertices attribute and an edges attribute; both attributes are optional
- Vertex entries must contain at least an _id attribute
- Edge entries must contain at least _from and _to attributes
- Edge values can be null and will be silently ignored (useful for traversal start nodes)
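The sequential-outer, parallel-inner semantics can be illustrated with a small standalone sketch. It does not use the crate: the `Query` and `LoadSpec` aliases, the `run_spec` function, and the `run` callback (a stand-in for sending a query to the database) are all hypothetical names, and bind parameter values are simplified to strings to keep the sketch dependency-free.

```rust
use std::collections::HashMap;
use std::thread;

/// One AQL query: the query string plus its bind parameters.
/// Bind values are plain strings here (assumption, to avoid dependencies).
type Query = (String, HashMap<String, String>);

/// A graph loading specification: a list of phases, each a list of queries.
type LoadSpec = Vec<Vec<Query>>;

/// Walks a specification: phases run one after another, while the queries
/// inside a phase run on parallel threads. `run` is a hypothetical stand-in
/// for executing a query against the database.
fn run_spec<F>(spec: &LoadSpec, run: F) -> Vec<Vec<String>>
where
    F: Fn(&Query) -> String + Sync,
{
    let run = &run;
    let mut results = Vec::new();
    for phase in spec {
        // All queries of this phase are spawned before any is joined.
        let phase_results: Vec<String> = thread::scope(|s| {
            let handles: Vec<_> = phase
                .iter()
                .map(|q| s.spawn(move || run(q)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("query thread panicked"))
                .collect()
        });
        // The next phase starts only after this one has finished.
        results.push(phase_results);
    }
    results
}
```

Placing all vertex queries in the first inner list and all edge queries in the second gives the "vertices first, then edges" ordering while still allowing parallelism within each phase.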
Attribute Specification
You can declare the vertex and edge attributes upfront with their types for efficient columnar storage.
Supported Data Types:
- DataType::Bool - Boolean values
  - Accepts: true/false, strings like "true"/"false"/"yes"/"no"/"1"/"0", numbers (0=false, non-zero=true)
- DataType::String - Text strings
  - Accepts: any value (null becomes empty string, objects/arrays become JSON strings)
- DataType::U64 - Unsigned 64-bit integers (non-negative)
  - Accepts: non-negative integers, positive floats (rounded), numeric strings
  - Rejects: negative values
- DataType::I64 - Signed 64-bit integers
  - Accepts: any integer, floats (rounded), numeric strings
- DataType::F64 - 64-bit floating-point numbers
  - Accepts: any numeric value, numeric strings
  - Rejects: infinity and NaN values
- DataType::JSON - Any JSON value
  - Accepts: anything without type conversion
Type Conversion Errors:
When a value cannot be converted to the specified type, a default value is used and the error is recorded:
- Bool → false
- String → "" (empty string)
- U64/I64 → 0
- F64 → 0.0
- JSON → null
The _id attribute for vertices and _from/_to attributes for edges are automatically included and don't need to be specified.
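The "convert, or fall back to a default and record the error" behaviour can be modelled in a few lines. The sketch below is a simplified standalone model, not the crate's implementation: `Raw`, `to_bool`, `to_u64`, and `convert_or_default` are hypothetical names, string matching is assumed to be exact, and treating null as a Bool conversion error is an assumption.

```rust
/// A simplified stand-in for a JSON value (assumption: the real loader
/// operates on full JSON values).
#[derive(Debug, Clone, PartialEq)]
enum Raw {
    Null,
    Bool(bool),
    Num(f64),
    Text(String),
}

/// Converts a raw value to bool: booleans pass through, numbers are true
/// when non-zero, and a fixed set of strings is recognised.
fn to_bool(v: &Raw) -> Result<bool, String> {
    match v {
        Raw::Bool(b) => Ok(*b),
        Raw::Num(n) => Ok(*n != 0.0),
        Raw::Text(s) => match s.as_str() {
            "true" | "yes" | "1" => Ok(true),
            "false" | "no" | "0" => Ok(false),
            other => Err(format!("cannot convert string {other:?} to Bool")),
        },
        // Assumption: null is not a valid Bool in this model.
        Raw::Null => Err("cannot convert null to Bool".to_string()),
    }
}

/// Converts a raw value to u64: non-negative finite numbers are rounded,
/// numeric strings are parsed, everything else is rejected.
fn to_u64(v: &Raw) -> Result<u64, String> {
    match v {
        Raw::Num(n) if *n >= 0.0 && n.is_finite() => Ok(n.round() as u64),
        Raw::Text(s) => s.trim().parse::<u64>().map_err(|e| format!("{s:?}: {e}")),
        other => Err(format!("cannot convert {other:?} to U64")),
    }
}

/// On failure, records the message and substitutes the type's default
/// (false for bool, 0 for u64), so loading can continue.
fn convert_or_default<T: Default>(r: Result<T, String>, errors: &mut Vec<String>) -> T {
    r.unwrap_or_else(|msg| {
        errors.push(msg);
        T::default()
    })
}
```

With this model, a negative number converted to U64 yields 0 plus one recorded error, which mirrors the default-value table above.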
Edge Buffering
Queries may produce edges whose end vertices appear later in the same query or in subsequent queries. The loader buffers such edges until all vertices are available. For optimal performance, produce vertices before edges to minimize buffering.
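One way to picture the buffering is a structure that holds back every edge until both of its end vertices have been seen. This is a standalone illustration under that assumption, not the loader's actual data structure; `EdgeBuffer`, `add_vertex`, and `add_edge` are hypothetical names.

```rust
use std::collections::HashSet;

/// Holds back edges until both of their end vertices have been seen.
#[derive(Default)]
struct EdgeBuffer {
    seen: HashSet<String>,
    pending: Vec<(String, String)>,
}

impl EdgeBuffer {
    /// Registers a vertex and returns the buffered edges that are now complete.
    fn add_vertex(&mut self, id: &str) -> Vec<(String, String)> {
        self.seen.insert(id.to_string());
        let pending = std::mem::take(&mut self.pending);
        let seen = &self.seen;
        let (ready, waiting): (Vec<_>, Vec<_>) = pending
            .into_iter()
            .partition(|(from, to)| seen.contains(from) && seen.contains(to));
        self.pending = waiting;
        ready
    }

    /// Returns the edge immediately if both ends are known, otherwise buffers it.
    fn add_edge(&mut self, from: &str, to: &str) -> Option<(String, String)> {
        let edge = (from.to_string(), to.to_string());
        if self.seen.contains(from) && self.seen.contains(to) {
            Some(edge)
        } else {
            self.pending.push(edge);
            None
        }
    }
}
```

Every call to `add_vertex` rescans the pending edges in this sketch, which shows why producing vertices first keeps the buffer empty and the work minimal.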
Example 1: Filtered Vertex and Edge Collections
This example shows how to load a subgraph from multiple vertex and edge collections with filter conditions. The first inner list loads all vertices in parallel, and the second inner list loads all edges in parallel:
This approach ensures that all vertices are loaded first (no edge buffering needed), and allows parallel loading within each phase.
Example 2: Graph Traversals
This example shows how to use graph traversals to define the subgraph. Note that traversals produce both vertices and edges in the same result:
The depth 0 case includes the starting vertex with a null edge. Multiple traversals can be executed in parallel as shown above, or sequentially by placing them in separate inner lists:
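The difference between the parallel and the sequential arrangement is only in how the queries are nested. The following sketch builds both shapes for a set of start vertices. It is illustrative only: the graph name `social`, the depth range, the helper names, and the exact shape of the returned object (lists under `vertices` and `edges`) are assumptions, and bind values are simplified to strings.

```rust
use std::collections::HashMap;

/// One AQL query: query string plus bind parameters (string values assumed).
type Query = (String, HashMap<String, String>);

/// Builds one traversal query for a start vertex. Depth 0 makes the
/// traversal emit the start vertex itself, paired with a null edge.
fn traversal_query(start: &str) -> Query {
    let aql = "FOR v, e IN 0..2 OUTBOUND @start GRAPH 'social' \
               RETURN {vertices: [v], edges: [e]}";
    let mut bind = HashMap::new();
    bind.insert("start".to_string(), start.to_string());
    (aql.to_string(), bind)
}

/// All traversals in a single inner list: they may run in parallel.
fn parallel_spec(starts: &[&str]) -> Vec<Vec<Query>> {
    vec![starts.iter().map(|s| traversal_query(s)).collect()]
}

/// One inner list per traversal: they run one after another.
fn sequential_spec(starts: &[&str]) -> Vec<Vec<Query>> {
    starts.iter().map(|s| vec![traversal_query(s)]).collect()
}
```

For two start vertices, `parallel_spec` yields one phase with two queries, while `sequential_spec` yields two phases with one query each.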
Creating an AqlGraphLoader
To create an AQL graph loader, use the AqlGraphLoader::new method:
use ;
use HashMap;
use Value;
async
Example: Graph Traversal
For graph traversals that produce vertices and edges together:
use ;
use HashMap;
use Value;
async
Loading Data with GraphLoader
Once the graph loader is initialized, you can load vertices and edges using the following methods:
- do_vertices: Load vertices from the graph.
- do_edges: Load edges from the graph.
Both methods take a closure as an argument to handle the loaded data.
If you specified global fields during initialization, the closure receives those fields as well.
If no global fields are specified, the closure receives only the required fields. For vertices, these are the vertex ID and the vertex key.
For edges, these are the IDs of the source (_from) and target (_to) vertices.
Vertices
The closure for handling vertices takes the following arguments:
let handle_vertices = ;
graph_loader.do_vertices.await?;
Edges
The closure for handling edges takes the following arguments:
let handle_edges = ;
let edges_result = graph_loader.do_edges.await?;
Loading Data with AqlGraphLoader
Once the AQL graph loader is initialized, load the graph data using the do_load method with a callback function.
The Callback Function
The callback receives a mutable reference to a GraphBatch containing both vertices and edges. The batch structure includes:
- vertex_ids: Vector of vertex IDs as byte vectors
- vertex_attribute_values: Vector of attribute vectors, parallel to vertex_ids
- edge_from_ids: Vector of source vertex IDs as byte vectors
- edge_to_ids: Vector of target vertex IDs as byte vectors
- edge_attribute_values: Vector of attribute vectors, parallel to edge_from_ids and edge_to_ids
- type_error_count: Total number of type conversion errors encountered
- type_error_messages: Type error messages (limited based on configuration)
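To make the "parallel vectors" layout concrete, here is a local stand-in that mirrors the fields listed above, together with a function that walks them in lockstep. It is a sketch, not the crate's `GraphBatch`: attribute values are simplified to strings, the counter type is assumed to be `usize`, and `describe` is a hypothetical helper.

```rust
/// Local stand-in mirroring the fields listed above. Attribute values are
/// simplified to strings; the real type may differ.
#[derive(Debug, Default)]
struct GraphBatch {
    vertex_ids: Vec<Vec<u8>>,
    vertex_attribute_values: Vec<Vec<String>>,
    edge_from_ids: Vec<Vec<u8>>,
    edge_to_ids: Vec<Vec<u8>>,
    edge_attribute_values: Vec<Vec<String>>,
    type_error_count: usize,
    type_error_messages: Vec<String>,
}

/// Renders vertices as "id [attrs]" and edges as "from -> to [attrs]",
/// followed by a summary of type errors if there were any.
fn describe(batch: &GraphBatch) -> Vec<String> {
    let mut lines = Vec::new();
    // Entry i of vertex_attribute_values belongs to entry i of vertex_ids.
    for (id, attrs) in batch.vertex_ids.iter().zip(&batch.vertex_attribute_values) {
        lines.push(format!("{} {:?}", String::from_utf8_lossy(id), attrs));
    }
    // The three edge vectors are indexed the same way.
    for ((from, to), attrs) in batch
        .edge_from_ids
        .iter()
        .zip(&batch.edge_to_ids)
        .zip(&batch.edge_attribute_values)
    {
        lines.push(format!(
            "{} -> {} {:?}",
            String::from_utf8_lossy(from),
            String::from_utf8_lossy(to),
            attrs
        ));
    }
    if batch.type_error_count > 0 {
        lines.push(format!("{} type error(s)", batch.type_error_count));
        lines.extend(batch.type_error_messages.iter().cloned());
    }
    lines
}
```

IDs arrive as byte vectors, so the sketch decodes them with `String::from_utf8_lossy` only at the point where text is needed.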
Type Error Reporting Configuration
The AqlGraphLoader allows you to configure how many type error messages are collected per GraphBatch:
- None (recommended default): All type conversion errors are reported. This is useful for debugging and ensures you see every issue.
- Some(n): At most n type error messages are collected per GraphBatch. The total count is always tracked in type_error_count, but only the first n messages are stored in type_error_messages. This can help reduce memory usage when dealing with data that has many type errors.
- Some(0): No error messages are collected (but type_error_count is still incremented). Use this if you only need to know the count of errors, not the details.
Examples:
// Report all type errors (recommended)
let loader = new?;
// Report at most 10 type errors per batch
let loader = new?;
// Track error count only, no messages
let loader = new?;
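The three settings differ only in how many messages are kept; the count is always complete. A minimal standalone model of that rule follows. `TypeErrorLog` and its methods are hypothetical and do not reflect how the crate stores errors internally.

```rust
/// Tracks type conversion errors with an optional cap on stored messages.
struct TypeErrorLog {
    limit: Option<usize>,
    count: usize,
    messages: Vec<String>,
}

impl TypeErrorLog {
    fn new(limit: Option<usize>) -> Self {
        TypeErrorLog { limit, count: 0, messages: Vec::new() }
    }

    /// Always increments the count; stores the message only while the
    /// number of stored messages is below the cap (no cap for None).
    fn record(&mut self, message: &str) {
        self.count += 1;
        let under_cap = match self.limit {
            None => true,
            Some(n) => self.messages.len() < n,
        };
        if under_cap {
            self.messages.push(message.to_string());
        }
    }
}
```

Recording five errors with a limit of `Some(2)` leaves a count of 5 and the first two messages; with `Some(0)` the count is still 5 and no messages are stored.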
The callback signature is:
Fn
Example: Basic Loading
use GraphBatch;
// Load the graph with a callback
aql_loader.do_load.await?;
Example: Collecting Data
use ;
use HashMap;
// Shared state to collect data
let vertices = new;
let edges = new;
let vertices_clone = vertices.clone;
let edges_clone = edges.clone;
// Load with closure that captures shared state
aql_loader.do_load.await?;
// Access collected data after loading
let final_vertices = vertices.lock.unwrap;
let final_edges = edges.lock.unwrap;
println!;
Important Notes
- The callback is called multiple times as batches are loaded (batch size is configured during initialization)
- Vertices and edges may arrive in the same batch (especially for traversal queries)
- The callback must be Send + Sync + Clone to support parallel query execution
- Attributes in vertex_attribute_values and edge_attribute_values correspond to the DataItem specifications provided during initialization
- Type conversion errors are collected but don't stop the loading process; check type_error_count to handle them appropriately
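A closure that captures only an `Arc<Mutex<...>>` satisfies the Send + Sync + Clone requirement automatically, which is why shared state is usually collected that way. The sketch below simulates parallel invocation with plain threads. It does not call the crate: `Batch` is a cut-down stand-in for a batch, `make_collector` and `run_parallel` are hypothetical, and the `Result<(), String>` return type is an assumption about the callback's shape.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Minimal stand-in for a batch: only vertex IDs (assumption for brevity).
struct Batch {
    vertex_ids: Vec<Vec<u8>>,
}

/// Builds a callback that appends vertex IDs to shared state. The closure
/// is Send + Sync + Clone because its only capture is an Arc.
fn make_collector(
    sink: Arc<Mutex<Vec<String>>>,
) -> impl Fn(&mut Batch) -> Result<(), String> + Send + Sync + Clone + 'static {
    move |batch: &mut Batch| {
        let mut guard = sink.lock().map_err(|e| e.to_string())?;
        for id in batch.vertex_ids.drain(..) {
            guard.push(String::from_utf8_lossy(&id).into_owned());
        }
        Ok(())
    }
}

/// Simulates parallel query execution: every worker thread receives its
/// own clone of the callback and one batch to process.
fn run_parallel<F>(callback: F, batches: Vec<Batch>) -> Result<(), String>
where
    F: Fn(&mut Batch) -> Result<(), String> + Send + Sync + Clone + 'static,
{
    let handles: Vec<_> = batches
        .into_iter()
        .map(|mut batch| {
            let cb = callback.clone();
            thread::spawn(move || cb(&mut batch))
        })
        .collect();
    for handle in handles {
        handle.join().map_err(|_| "worker panicked".to_string())??;
    }
    Ok(())
}
```

Because batches are processed concurrently, the order of collected items is not deterministic; sort or key the results if order matters.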
Configuration
Database Configuration
Provide your database configuration parameters to DatabaseConfiguration::new.
Please read the documentation for more information on the available parameters.
Data Load Configuration
Configure data loading parameters with DataLoadConfiguration::new.
Please read the documentation for more information on the available parameters.
Attributes
Named Graph
- graph_name: The name of the graph in ArangoDB.
- vertex_global_fields: Optional. List of vertex attributes to load.
- edge_global_fields: Optional. List of edge attributes to load.
Custom Graph
- vertex_collections: List of vertex collections to load.
- edge_collections: List of edge collections to load.
Special Attributes as Field Names
Right now there is only one special field available. Special fields are identified by the @ prefix.
- @collection_name: Include the collection name in the returned data.
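For context, ArangoDB document IDs have the form collection/key, so the collection name is also the prefix of every _id. The sketch below shows that relationship with a hypothetical helper, `split_id`, that is independent of the loader and of how it resolves @collection_name internally.

```rust
/// Splits an ArangoDB document ID of the form "collection/key".
/// Returns None when the ID has no slash or one of the parts is empty.
fn split_id(id: &str) -> Option<(&str, &str)> {
    let (collection, key) = id.split_once('/')?;
    if collection.is_empty() || key.is_empty() {
        return None;
    }
    Some((collection, key))
}
```

Requesting @collection_name avoids having to do this parsing in the callback.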
Flags
- load_all_vertex_attributes: Boolean flag to load all vertex attributes.
- load_all_edge_attributes: Boolean flag to load all edge attributes.
Error Handling
All methods return Result types. Handle errors using Rust's standard error handling mechanisms.
The error type is GraphLoaderError.
Example return type:
Result<(), GraphLoaderError>
match graph_loader.do_vertices.await
License
This project is licensed under the MIT License.
Getting Help
First, check whether the API documentation answers your question. If it does not, feel free to use one of the following resources:
- Please use GitHub for feature requests and bug reports: https://github.com/arangodb/arangors-graph-exporter/issues
- Ask questions about the driver, Rust, usage scenarios, etc. on StackOverflow: https://stackoverflow.com/questions/tagged/arangodb
- Chat with the community and the developers on Slack: https://arangodb-community.slack.com/
- Learn more about ArangoDB with our YouTube channel: https://www.youtube.com/@ArangoDB
- Follow us on X to stay up to date: https://x.com/arangodb
- Find out more about our community: https://www.arangodb.com/community
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
This documentation provides a comprehensive overview of the API and usage of the Rust-based ArangoDB graph loader. It covers initialization, configuration, data loading, and error handling. For more detailed examples and advanced usage, please refer to the source code and additional documentation.