helix-db 2.0.0

Library for working with HelixDB

helix-db

The crate's doc comments, especially those in src/lib.rs, document the API in detail. AI agents should read the source code and doc comments to learn the query-building patterns and the full API surface.

The helix-db crate (imported as helix_db) is the Rust SDK for HelixDB. It pairs a query-builder DSL with a small async HTTP client (helix_db::Client) for running those queries against a Helix instance.

The DSL is centered on two entry points:

  • read_batch() for read-only transactions
  • write_batch() for write-capable transactions

Everything in the DSL is designed to be composed inside those batch chains. You write one or more named traversals with .var_as(...) / .var_as_if(...), then choose the final payload with .returning(...).

Install

Add the crate under [dependencies]:

helix-db = "2.0.0"

The crate is published under the name helix-db and its library is imported as helix_db. For shorter query code, bring the curated builder API into scope:

use helix_db::dsl::prelude::*;

The examples below assume that prelude is in scope.

Core Shape

Read chain: read_batch() -> var_as / var_as_if -> returning

Write chain: write_batch() -> var_as / var_as_if -> returning

Each var_as call accepts a traversal expression, usually starting with g(). Traversals can read, traverse, filter, aggregate, or mutate depending on whether they are used in a read or write batch.

Read Batches

read_batch()
    .var_as(
        "user",
        g().n_where(SourcePredicate::eq("username", "alice")),
    )
    .var_as(
        "friends",
        g()
            .n(NodeRef::var("user"))
            .out(Some("FOLLOWS"))
            .dedup()
            .limit(100),
    )
    .returning(["user", "friends"]);

Filter, order, and project in a single traversal:

read_batch()
    .var_as(
        "active_users",
        g()
            .n_with_label_where("User", SourcePredicate::eq("status", "active"))
            .where_(Predicate::gt("score", 100i64))
            .order_by("score", Order::Desc)
            .limit(25)
            .value_map(Some(vec!["$id", "name", "score"])),
    )
    .returning(["active_users"]);

Match a property against a runtime parameter with Expr::param:

let statuses = Expr::param("statuses");

read_batch()
    .var_as(
        "matching_users",
        g()
            .n_with_label("User")
            .where_(Predicate::is_in_expr("status", statuses))
            .value_map(Some(vec!["$id", "name", "status"])),
    )
    .returning(["matching_users"]);

Conditional Queries

Use BatchCondition with var_as_if to run later queries only when earlier variables satisfy runtime conditions.

read_batch()
    .var_as(
        "user",
        g().n_where(SourcePredicate::eq("username", "alice")),
    )
    .var_as_if(
        "posts",
        BatchCondition::VarNotEmpty("user".to_string()),
        g().n(NodeRef::var("user")).out(Some("POSTED")),
    )
    .returning(["user", "posts"]);
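
The gating behavior described above can be modeled in a few lines of plain Rust. This is a standalone sketch, not the crate's implementation: `Condition`, `Env`, and `bind_if` are hypothetical names, and it assumes that a skipped variable is simply left unbound and that an unbound variable counts as empty.

```rust
use std::collections::HashMap;

/// Toy stand-in for a batch condition; only the variant used above is modeled.
enum Condition {
    VarNotEmpty(String),
}

/// Variables bound so far in the batch, each holding the IDs it matched.
type Env = HashMap<String, Vec<u64>>;

impl Condition {
    /// Evaluate the condition against the variables bound so far.
    fn holds(&self, env: &Env) -> bool {
        match self {
            // Assumption: a variable that was never bound counts as empty.
            Condition::VarNotEmpty(name) => {
                env.get(name).map_or(false, |ids| !ids.is_empty())
            }
        }
    }
}

/// Run `query` and bind its result under `name` only when `cond` holds.
/// Returns whether the query ran. Assumption: a skipped variable stays unbound.
fn bind_if(
    env: &mut Env,
    name: &str,
    cond: &Condition,
    query: impl FnOnce(&Env) -> Vec<u64>,
) -> bool {
    if cond.holds(env) {
        let rows = query(&*env);
        env.insert(name.to_string(), rows);
        true
    } else {
        false
    }
}
```

With "user" bound to an empty list, `bind_if` skips the "posts" query; once "user" holds at least one ID, the query runs and its rows are bound.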

Write Batches

write_batch()
    .var_as(
        "alice",
        g().add_n("User", vec![("name", "Alice"), ("tier", "pro")]),
    )
    .var_as("bob", g().add_n("User", vec![("name", "Bob")]))
    .var_as(
        "linked",
        g()
            .n(NodeRef::var("alice"))
            .add_e(
                "FOLLOWS",
                NodeRef::var("bob"),
                vec![("since", "2026-01-01")],
            )
            .count(),
    )
    .returning(["alice", "bob", "linked"]);

Conditionally update existing nodes:

write_batch()
    .var_as(
        "inactive_users",
        g().n_with_label_where(
            "User",
            SourcePredicate::eq("status", "inactive"),
        ),
    )
    .var_as_if(
        "deactivated_count",
        BatchCondition::VarNotEmpty("inactive_users".to_string()),
        g()
            .n(NodeRef::var("inactive_users"))
            .set_property("deactivated", true)
            .count(),
    )
    .returning(["deactivated_count"]);

Executing Queries with helix_db::Client

helix_db::Client is a thin async wrapper over reqwest for running queries against a Helix instance. Construct it with an optional base URL, then optionally attach a bearer API key:

use helix_db::Client;

// Defaults to http://localhost:6969 when `url` is None.
let client = Client::new(None)?;

// Or point at a remote cluster and attach an API key:
let client = Client::new(Some("https://11e2fc88c410fa5eb13e.cluster.helix-db.com"))?
    .with_api_key(Some("hx_your_api_key"));

Requests are built with a small fluent builder. Start with client.query::<R>(), where R is the type the response should be deserialized into (the examples here let R be inferred from the binding's type annotation instead). Optionally toggle request headers, then choose a query kind and call .send().await:

// Inline / dynamic query: POSTs a `DynamicQueryRequest` (DSL query + parameters) to `/v1/query`.
let response: MyResponse = client
    .query()
    .dynamic_query(request)        // `request` is a DynamicQueryRequest (see below)
    .send()
    .await?;

// Stored query: POSTs a serializable payload to a deployed query's route
// (`/v1/query/<name>`, e.g. `/v1/query/add_user`).
let response: MyResponse = client
    .query()
    .body(&payload)?               // optional request body for the route
    .stored_query("add_user".to_string())
    .send()
    .await?;
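
The two routes named in the comments above can be sketched as plain string construction. This is a standalone illustration, not the client's internals: `base_url`, `dynamic_query_url`, and `stored_query_url` are hypothetical helpers, and trimming a trailing slash is an assumption.

```rust
/// Default base URL documented for `Client::new(None)`.
const DEFAULT_BASE_URL: &str = "http://localhost:6969";

/// Hypothetical helper: pick the base URL, falling back to the default.
fn base_url(url: Option<&str>) -> String {
    // Assumption: a trailing slash is trimmed so routes join cleanly.
    url.unwrap_or(DEFAULT_BASE_URL)
        .trim_end_matches('/')
        .to_string()
}

/// Route that inline / dynamic queries are POSTed to.
fn dynamic_query_url(url: Option<&str>) -> String {
    format!("{}/v1/query", base_url(url))
}

/// Route of a deployed (stored) query with the given name.
fn stored_query_url(url: Option<&str>, name: &str) -> String {
    format!("{}/v1/query/{}", base_url(url), name)
}
```

For example, `stored_query_url(None, "add_user")` yields `http://localhost:6969/v1/query/add_user`.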

Optional header toggles can be chained before choosing the query kind:

  • .writer_only() — require the request to be served by a writer node (x-helix-require-writer).
  • .warm_only() — only execute if the query is already warm (x-helix-warm); reads only.
  • .should_await_durability(true) — block until the write is durable (x-helix-await-durable).
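
As a rough mental model, each toggle contributes one request header. The sketch below is standalone Rust rather than the crate's builder: `HeaderToggles` is a hypothetical type, the header names come from the list above, and the "true"/"false" header values are an assumption.

```rust
/// Standalone model of the three optional header toggles (not the crate's builder).
#[derive(Debug, Default, Clone, PartialEq)]
struct HeaderToggles {
    writer_only: bool,
    warm_only: bool,
    await_durability: Option<bool>,
}

impl HeaderToggles {
    fn writer_only(mut self) -> Self {
        self.writer_only = true;
        self
    }

    fn warm_only(mut self) -> Self {
        self.warm_only = true;
        self
    }

    fn should_await_durability(mut self, wait: bool) -> Self {
        self.await_durability = Some(wait);
        self
    }

    /// Headers implied by the toggles. The names are documented above;
    /// the string values are assumed for illustration.
    fn headers(&self) -> Vec<(&'static str, String)> {
        let mut out = Vec::new();
        if self.writer_only {
            out.push(("x-helix-require-writer", "true".to_string()));
        }
        if self.warm_only {
            out.push(("x-helix-warm", "true".to_string()));
        }
        if let Some(wait) = self.await_durability {
            out.push(("x-helix-await-durable", wait.to_string()));
        }
        out
    }
}
```

Toggles that are never called add no header, so a default request carries none of the three.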

send() is generic over the deserialized response type R and returns Result<R, HelixError>. HelixError distinguishes transport errors, non-200 responses from the server (RemoteError), serialization failures, and invalid URLs.
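
One way to act on those four categories is a small retry policy. The enum below is a stand-in defined only for this sketch: `RemoteError` is the single variant name taken from the text above, while the other variant names, the payload fields, and the policy itself are hypothetical.

```rust
/// Stand-in for the four failure categories described above.
/// Only `RemoteError` is a documented name; everything else is hypothetical.
#[derive(Debug)]
enum SketchError {
    /// The request never completed (connection refused, timeout, ...).
    Transport(String),
    /// The server answered with a non-200 status.
    RemoteError { status: u16 },
    /// The request or response could not be (de)serialized.
    Serialization(String),
    /// The base URL could not be parsed.
    InvalidUrl(String),
}

/// Example policy: retry only where a retry could plausibly succeed.
fn is_retryable(err: &SketchError) -> bool {
    match err {
        SketchError::Transport(_) => true,
        // Server-side failures may be transient; client-side ones are not.
        SketchError::RemoteError { status } => *status >= 500,
        SketchError::Serialization(_) | SketchError::InvalidUrl(_) => false,
    }
}
```

Under this policy a 503 from the server is retried, while a 404 or a malformed URL fails immediately.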

Registered queries + dynamic_query

Annotate a query builder with #[register] to get a callable helper that builds a DynamicQueryRequest directly from typed arguments. The generated function returns the request value itself (not a Result) — parameter coercion that can fail (e.g. DateTime, bytes) panics with a descriptive message rather than returning an error.

use helix_db::dsl::prelude::*;
use helix_db::Client;
use serde::Deserialize;

#[register]
pub fn add_user(name: String) -> WriteBatch {
    write_batch()
        .var_as("user_id", g().add_n("user", vec![("name", name)]))
        .returning(vec!["user_id"])
}

#[derive(Deserialize)]
struct AddUserResponse {
    user_id: u64,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new(Some("https://11e2fc88c410fa5eb13e.cluster.helix-db.com"))?
        .with_api_key(Some("hx_your_api_key"));

    // Building the request is infallible — no `?` needed here.
    let request = add_user("John".to_string());

    let response: AddUserResponse = client.query().dynamic_query(request).send().await?;
    println!("created user {}", response.user_id);
    Ok(())
}

Notes:

  • A #[register] builder generates a public callable helper only when the function is pub.
  • Private #[register] functions are still registered for bundle generation (helix_db::query_generator::generate()), but they do not generate the public callable helper.
  • The serialized payload includes request_type, query, and optional parameters / parameter_types.

Vector Search Operations (End-to-End)

The current Helix interpreter executes vector search as top-k nearest-neighbor lookup with these runtime semantics:

  • returns up to k hits (top-k behavior)
  • hit order is ascending by $distance (smaller is closer)
  • hit metadata can be read through virtual fields in projections:
    • node hits: $id, $distance
    • edge hits: $id, $from, $to, $distance
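
The ranking semantics listed above (at most k hits, ascending by distance) can be reproduced with a brute-force scan. This is a minimal sketch, not how Helix implements vector search: `Hit`, `squared_l2`, and `top_k` are hypothetical names, and squared Euclidean distance is an assumed metric, since the metric is not stated here.

```rust
/// One ranked hit: the node's ID plus its distance from the query vector.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    id: u64,
    distance: f32,
}

/// Squared Euclidean distance (assumed metric, chosen only for illustration).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Brute-force top-k: score every vector, sort ascending by distance,
/// and keep at most `k` hits.
fn top_k(items: &[(u64, Vec<f32>)], query: &[f32], k: usize) -> Vec<Hit> {
    let mut hits: Vec<Hit> = items
        .iter()
        .map(|(id, vector)| Hit {
            id: *id,
            distance: squared_l2(vector, query),
        })
        .collect();
    // Smaller distance means closer, so the best hit comes first.
    hits.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    hits.truncate(k);
    hits
}
```

Asking for more hits than there are indexed vectors returns all of them, which matches the "up to k" wording.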

Result field contract

Field      Type            Node hits  Edge hits  Meaning
$id        integer         yes        yes*       Node ID (for node hits) or edge ID (for edge hits)
$distance  floating-point  yes        yes        Vector distance from query (lower = closer)
$from      integer         no         yes        Edge source node ID
$to        integer         no         yes        Edge target node ID

* For edge hits, $id is present when an edge ID is available in storage.
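
Read as a data structure, the contract above looks roughly like the following. The types are hypothetical and exist only to make the table concrete; `u64` and `f64` are assumed widths for "integer" and "floating-point".

```rust
/// Value of a virtual field: IDs are integers, distances are floating-point.
#[derive(Debug, Clone, PartialEq)]
enum FieldValue {
    Int(u64),
    Float(f64),
}

/// A vector hit, following the contract table above.
#[derive(Debug, Clone, PartialEq)]
enum VectorHit {
    Node { id: u64, distance: f64 },
    /// `id` is optional: it is present only when storage has an edge ID.
    Edge { id: Option<u64>, from: u64, to: u64, distance: f64 },
}

impl VectorHit {
    /// Resolve a virtual field by name. `None` means the field is not
    /// available on this kind of hit.
    fn virtual_field(&self, name: &str) -> Option<FieldValue> {
        match (self, name) {
            (VectorHit::Node { id, .. }, "$id") => Some(FieldValue::Int(*id)),
            (VectorHit::Edge { id, .. }, "$id") => (*id).map(FieldValue::Int),
            (VectorHit::Node { distance, .. }, "$distance")
            | (VectorHit::Edge { distance, .. }, "$distance") => {
                Some(FieldValue::Float(*distance))
            }
            (VectorHit::Edge { from, .. }, "$from") => Some(FieldValue::Int(*from)),
            (VectorHit::Edge { to, .. }, "$to") => Some(FieldValue::Int(*to)),
            _ => None,
        }
    }
}
```

A node hit has no `$from` or `$to`, and an edge hit without a stored edge ID has no `$id`; both cases come back as `None` in this model.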

Contract scope in the current Helix interpreter:

  • available on direct vector-hit streams and projection terminals
  • available in value_map, values, project, and (for edges) edge_properties
  • once a traversal step leaves the hit stream (out, in_, both, etc.), downstream traversers no longer carry distance metadata

1) Create indexes and insert vectors

write_batch()
    .var_as(
        "create_doc_index",
        g().create_vector_index_nodes(
            "Doc",
            "embedding",
            None::<&str>,
        ),
    )
    .var_as(
        "create_similar_index",
        g().create_vector_index_edges(
            "SIMILAR",
            "embedding",
            None::<&str>,
        ),
    )
    .var_as(
        "doc_a",
        g().add_n(
            "Doc",
            vec![
                ("title", PropertyValue::from("A")),
                ("embedding", PropertyValue::from(vec![1.0f32, 0.0, 0.0])),
            ],
        ),
    )
    .var_as(
        "doc_b",
        g().add_n(
            "Doc",
            vec![
                ("title", PropertyValue::from("B")),
                ("embedding", PropertyValue::from(vec![0.9f32, 0.1, 0.0])),
            ],
        ),
    )
    .returning(["create_doc_index", "doc_a", "doc_b"]);

2) Node vector search: get ranked hits and fetch node properties

read_batch()
    .var_as(
        "doc_hits",
        g().vector_search_nodes("Doc", "embedding", vec![1.0f32, 0.0, 0.0], 5, None)
            .value_map(Some(vec!["$id", "$distance", "title"])),
    )
    .returning(["doc_hits"]);

doc_hits rows (example shape):

[
  { "$id": 42, "$distance": 0.0031, "title": "A" },
  { "$id": 77, "$distance": 0.0198, "title": "B" }
]

3) Use project(...) on vector hits (including distance)

read_batch()
    .var_as(
        "ranked_docs",
        g().vector_search_nodes("Doc", "embedding", vec![1.0f32, 0.0, 0.0], 10, None)
            .project(vec![
                PropertyProjection::renamed("$id", "doc_id"),
                PropertyProjection::renamed("$distance", "score"),
                PropertyProjection::new("title"),
            ]),
    )
    .returning(["ranked_docs"]);

4) Traverse from hit IDs to related entities

Store hit rows (with $id + $distance) and then use NodeRef::var(...) to continue graph traversal from those hit IDs.

read_batch()
    .var_as(
        "doc_hit_rows",
        g().vector_search_nodes("Doc", "embedding", vec![1.0f32, 0.0, 0.0], 5, None)
            .value_map(Some(vec!["$id", "$distance", "title"])),
    )
    .var_as(
        "authors",
        g().n(NodeRef::var("doc_hit_rows"))
            .out(Some("AUTHORED_BY"))
            .value_map(Some(vec!["$id", "name"])),
    )
    .returning(["doc_hit_rows", "authors"]);

5) Edge vector search and endpoint/property extraction

read_batch()
    .var_as(
        "edge_hits",
        g().vector_search_edges("SIMILAR", "embedding", vec![1.0f32, 0.0, 0.0], 10, None)
            .edge_properties(),
    )
    .var_as(
        "targets",
        g().e(EdgeRef::var("edge_hits"))
            .out_n()
            .value_map(Some(vec!["$id", "title"])),
    )
    .returning(["edge_hits", "targets"]);

edge_hits rows include $from, $to, and $distance (and $id when available), so you can inspect ranking metadata and still traverse from those edges.

6) Optional multitenancy

write_batch()
    .var_as(
        "create_mt_index",
        g().create_vector_index_nodes(
            "Doc",
            "embedding",
            Some("tenant_id"),
        ),
    )
    .var_as(
        "insert_acme",
        g().add_n(
            "Doc",
            vec![
                ("tenant_id", PropertyValue::from("acme")),
                ("title", PropertyValue::from("Acme doc")),
                ("embedding", PropertyValue::from(vec![1.0f32, 0.0, 0.0])),
            ],
        ),
    )
    .returning(["create_mt_index", "insert_acme"]);

Search within a single tenant by passing its value as the last argument:

read_batch()
    .var_as(
        "acme_hits",
        g().vector_search_nodes(
            "Doc",
            "embedding",
            vec![1.0f32, 0.0, 0.0],
            5,
            Some(PropertyValue::from("acme")),
        )
        .value_map(Some(vec!["$id", "$distance", "title"])),
    )
    .returning(["acme_hits"]);

Multitenant behavior in the current Helix interpreter:

  • multitenant index + missing tenant_value on search => query error
  • multitenant index + unknown tenant => empty result set
  • write with vector present but missing tenant property => write error
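
The three rules above can be captured in a toy in-memory index. This is a standalone model for reasoning about the behavior, not Helix code: `MultitenantIndex` and its methods are hypothetical, errors are plain strings, and treating a node without a vector as simply unindexed is an assumption.

```rust
use std::collections::HashMap;

/// Toy multitenant index: tenant value -> IDs of the documents indexed for it.
struct MultitenantIndex {
    by_tenant: HashMap<String, Vec<u64>>,
}

impl MultitenantIndex {
    fn new() -> Self {
        MultitenantIndex { by_tenant: HashMap::new() }
    }

    /// Searching without a tenant value is an error; an unknown tenant
    /// yields an empty result set.
    fn search(&self, tenant_value: Option<&str>) -> Result<Vec<u64>, String> {
        let tenant = tenant_value
            .ok_or_else(|| "tenant value is required for a multitenant index".to_string())?;
        Ok(self.by_tenant.get(tenant).cloned().unwrap_or_default())
    }

    /// Writing a vector without the tenant property is an error.
    fn insert(&mut self, id: u64, tenant: Option<&str>, has_vector: bool) -> Result<(), String> {
        match (has_vector, tenant) {
            (true, None) => Err("vector present but tenant property missing".to_string()),
            (true, Some(t)) => {
                self.by_tenant.entry(t.to_string()).or_default().push(id);
                Ok(())
            }
            // Assumption: a node without a vector is not indexed at all.
            (false, _) => Ok(()),
        }
    }
}
```

In this model, searching with `Some("acme")` after inserting document 1 for "acme" returns `[1]`, an unknown tenant returns an empty list, and a search with `None` fails.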

Edge-First Reads

read_batch()
    .var_as(
        "heavy_edges",
        g()
            .e_where(SourcePredicate::gt("weight", 0.8f64))
            .edge_has_label("FOLLOWS")
            .order_by("weight", Order::Desc)
            .limit(50),
    )
    .var_as(
        "targets",
        g()
            .e(EdgeRef::var("heavy_edges"))
            .out_n()
            .dedup(),
    )
    .returning(["heavy_edges", "targets"]);

Branching and Repetition

read_batch()
    .var_as(
        "recommendations",
        g()
            .n(1u64)
            .store("seed")
            .repeat(RepeatConfig::new(sub().out(Some("FOLLOWS"))).times(2))
            .without("seed")
            .union(vec![sub().out(Some("LIKES"))])
            .dedup()
            .limit(30),
    )
    .returning(["recommendations"]);
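
To make the repeat / without / dedup part of this traversal concrete, here is a plain-Rust model over a toy adjacency map. It assumes Gremlin-like step semantics, which this README does not spell out, and it leaves out the `union` branch. `Graph`, `out`, and `recommend` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

/// Toy graph: (source node, edge label) -> target nodes.
type Graph = HashMap<(u64, &'static str), Vec<u64>>;

/// One `out(label)` step applied to every current traverser.
fn out(graph: &Graph, frontier: &[u64], label: &'static str) -> Vec<u64> {
    let mut next = Vec::new();
    for node in frontier {
        if let Some(targets) = graph.get(&(*node, label)) {
            next.extend(targets.iter().copied());
        }
    }
    next
}

/// Model of: store the seed, repeat `out(label)` `times` times,
/// drop the seed again, then dedup.
fn recommend(graph: &Graph, seed: u64, label: &'static str, times: usize) -> Vec<u64> {
    let mut frontier = vec![seed];
    for _ in 0..times {
        frontier = out(graph, &frontier, label);
    }
    let mut seen = HashSet::new();
    let result: Vec<u64> = frontier
        .into_iter()
        .filter(|n| *n != seed) // the `without("seed")` step
        .filter(|n| seen.insert(*n)) // dedup, keeping the first occurrence
        .collect();
    result
}
```

With edges 1→{2, 3}, 2→{4, 1}, and 3→{4, 5}, two hops from node 1 reach 4, 1, 4, 5; removing the seed and deduplicating leaves 4 and 5.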

Traversal Building Inside var_as(...)

Common source steps:

  • n(...), n_where(...), n_with_label(...), n_with_label_where(...)
  • e(...), e_where(...), e_with_label(...)
  • vector_search_nodes(...), vector_search_edges(...)
    • current Helix runtime exposes vector hit metadata via virtual fields ($id, $distance, $from, $to) in terminal projections

Common navigation and filtering:

  • out/in_/both, out_e/in_e/both_e, out_n/in_n/other_n
  • has, has_label, has_key, where_, within, without, dedup
  • limit, skip, range, order_by, order_by_multiple

Common terminal projections:

  • count, exists, id, label
  • values, value_map, project, edge_properties

Write-only operations (usable in write_batch() traversals):

  • add_n, add_e, set_property, remove_property, drop, drop_edge, drop_edge_by_id
  • create_vector_index_nodes, create_vector_index_edges

For exhaustive catalog-style coverage of every public query-builder function, read the crate docs in src/lib.rs and browse the source directly.

License

Licensed under Apache-2.0.