helix-db 2.0.1

Library for working with HelixDB
Documentation

helix-db

There is good documentation in the crate doc comments, especially in src/lib.rs. AI agents should read the source code and doc comments to get a feel for the query-building patterns and the full API surface.

The helix-db crate (imported as helix_db) is the Rust SDK for HelixDB. It pairs a query-builder DSL with a small async HTTP client (helix_db::Client) for running those queries against a Helix instance.

The DSL is centered on two entry points:

  • read_batch() for read-only transactions
  • write_batch() for write-capable transactions

Everything in the DSL is designed to be composed inside those batch chains. You write one or more named traversals with .var_as(...) / .var_as_if(...), then choose the final payload with .returning(...).

Install

Add the crate under [dependencies]:

helix-db = "2.0.0"

The crate is published under the name helix-db and its library is imported as helix_db. For shorter query code, bring the curated builder API into scope:

use helix_db::dsl::prelude::*;

The examples below assume that prelude is in scope.

Core Shape

Read chain: read_batch() -> var_as / var_as_if -> returning

Write chain: write_batch() -> var_as / var_as_if -> returning

Each var_as call accepts a traversal expression, usually starting with g(). Traversals can read, traverse, filter, aggregate, or mutate depending on whether they are used in a read or write batch.

Read Batches

read_batch()
    .var_as(
        "user",
        g().n_where(SourcePredicate::eq("username", "alice")),
    )
    .var_as(
        "friends",
        g()
            .n(NodeRef::var("user"))
            .out(Some("FOLLOWS"))
            .dedup()
            .limit(100),
    )
    .returning(["user", "friends"]);
read_batch()
    .var_as(
        "active_users",
        g()
            .n_with_label_where("User", SourcePredicate::eq("status", "active"))
            .where_(Predicate::gt("score", 100i64))
            .order_by("score", Order::Desc)
            .limit(25)
            .value_map(Some(vec!["$id", "name", "score"])),
    )
    .returning(["active_users"]);
let statuses = Expr::param("statuses");

read_batch()
    .var_as(
        "matching_users",
        g()
            .n_with_label("User")
            .where_(Predicate::is_in_expr("status", statuses))
            .value_map(Some(vec!["$id", "name", "status"])),
    )
    .returning(["matching_users"]);

Predicate::eq, neq, gt, gte, lt, lte, and between accept either literal property values or Expr parameters. Literal values keep the original literal variants in JSON, while expressions serialize as EqExpr, GteExpr, BetweenExpr, and so on. Use Predicate::compare(...) for arbitrary expression-to-expression comparisons.

Conditional Queries

Use BatchCondition with var_as_if to run later queries only when earlier variables satisfy runtime conditions.

read_batch()
    .var_as(
        "user",
        g().n_where(SourcePredicate::eq("username", "alice")),
    )
    .var_as_if(
        "posts",
        BatchCondition::VarNotEmpty("user".to_string()),
        g().n(NodeRef::var("user")).out(Some("POSTED")),
    )
    .returning(["user", "posts"]);

Write Batches

write_batch()
    .var_as(
        "alice",
        g().add_n("User", vec![("name", "Alice"), ("tier", "pro")]),
    )
    .var_as("bob", g().add_n("User", vec![("name", "Bob")]))
    .var_as(
        "linked",
        g()
            .n(NodeRef::var("alice"))
            .add_e(
                "FOLLOWS",
                NodeRef::var("bob"),
                vec![("since", "2026-01-01")],
            )
            .count(),
    )
    .returning(["alice", "bob", "linked"]);
write_batch()
    .var_as(
        "inactive_users",
        g().n_with_label_where(
            "User",
            SourcePredicate::eq("status", "inactive"),
        ),
    )
    .var_as_if(
        "deactivated_count",
        BatchCondition::VarNotEmpty("inactive_users".to_string()),
        g()
            .n(NodeRef::var("inactive_users"))
            .set_property("deactivated", true)
            .count(),
    )
    .returning(["deactivated_count"]);

Executing Queries with helix_db::Client

helix_db::Client is a thin async wrapper over reqwest for running queries against a Helix instance. Construct it with an optional base URL, then optionally attach a bearer API key:

use helix_db::Client;

// Defaults to http://localhost:6969 when `url` is None.
let client = Client::new(None)?;

// Or point at a remote cluster and attach an API key:
let client = Client::new(Some("https://11e2fc88c410fa5eb13e.cluster.helix-db.com"))?
    .with_api_key(Some("hx_your_api_key"));

Requests are built with a small fluent builder. Start with client.query::<R>() (where R is the type you want the response deserialized into), optionally toggle request headers, then choose a query kind and .send().await:

// Inline / dynamic query: POSTs a `DynamicQueryRequest` (DSL query + parameters) to `/v1/query`.
let response: MyResponse = client
    .query()
    .dynamic(request)              // `request` is a DynamicQueryRequest (see below)
    .send()
    .await?;

// Stored query: POSTs a serializable payload to a deployed query's route
// (`/v1/query/<name>`, e.g. `/v1/query/add_user`).
let response: MyResponse = client
    .query()
    .body(&payload)?               // optional request body for the route
    .stored("add_user".to_string())
    .send()
    .await?;

Optional header toggles can be chained before choosing the query kind:

  • .writer_only() — require the request to be served by a writer node (x-helix-require-writer).
  • .warm_only() — only execute if the query is already warm (x-helix-warm); reads only.
  • .should_await_durability(true) — block until the write is durable (x-helix-await-durable).

send() is generic over the deserialized response type R and returns Result<R, HelixError>. HelixError distinguishes transport errors, non-200 responses from the server (RemoteError), serialization failures, and invalid URLs.

Registered queries + dynamic

Annotate a query builder with #[register] to get a callable helper that builds a DynamicQueryRequest directly from typed arguments. The generated function returns the request value itself (not a Result) — parameter coercion that can fail (e.g. DateTime, bytes) panics with a descriptive message rather than returning an error.

use helix_db::dsl::prelude::*;
use helix_db::Client;
use serde::Deserialize;

#[register]
pub fn add_user(name: String) -> WriteBatch {
    write_batch()
        .var_as("user_id", g().add_n("user", vec![("name", name)]))
        .returning(vec!["user_id"])
}

#[derive(Deserialize)]
struct AddUserResponse {
    user_id: u64,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new(Some("https://11e2fc88c410fa5eb13e.cluster.helix-db.com"))?
        .with_api_key(Some("hx_your_api_key"));

    // Building the request is infallible — no `?` needed here.
    let request = add_user("John".to_string());

    let response: AddUserResponse = client.query().dynamic(request).send().await?;
    println!("created user {}", response.user_id);
    Ok(())
}

Notes:

  • A #[register] builder generates a public callable helper only when the function is pub.
  • The serialized payload includes request_type, query, and optional parameters / parameter_types.
  • Private #[register] functions are still registered for bundle generation (helix_db::query_generator::generate()), but they do not generate the public callable helper.

Vector Search Operations (End-to-End)

The current Helix interpreter executes vector search as top-k nearest-neighbor lookup with these runtime semantics:

  • returns up to k hits (top-k behavior)
  • hit order is ascending by $distance (smaller is closer)
  • hit metadata can be read through virtual fields in projections:
    • node hits: $id, $distance
    • edge hits: $id, $from, $to, $distance

Result field contract

Field Type Node hits Edge hits Meaning
$id integer yes yes* Node ID (for node hits) or edge ID (for edge hits)
$distance floating-point yes yes Vector distance from query (lower = closer)
$from integer no yes Edge source node ID
$to integer no yes Edge target node ID

* For edge hits, $id is present when an edge ID is available in storage.

Contract scope in the current Helix interpreter:

  • available on direct vector-hit streams and projection terminals
  • available in value_map, values, project, and (for edges) edge_properties
  • once a traversal step leaves the hit stream (out, in_, both, etc.), downstream traversers no longer carry distance metadata

1) Create indexes and insert vectors

write_batch()
    .var_as(
        "create_doc_index",
        g().create_vector_index_nodes(
            "Doc",
            "embedding",
            None::<&str>,
        ),
    )
    .var_as(
        "create_similar_index",
        g().create_vector_index_edges(
            "SIMILAR",
            "embedding",
            None::<&str>,
        ),
    )
    .var_as(
        "doc_a",
        g().add_n(
            "Doc",
            vec![
                ("title", PropertyValue::from("A")),
                ("embedding", PropertyValue::from(vec![1.0f32, 0.0, 0.0])),
            ],
        ),
    )
    .var_as(
        "doc_b",
        g().add_n(
            "Doc",
            vec![
                ("title", PropertyValue::from("B")),
                ("embedding", PropertyValue::from(vec![0.9f32, 0.1, 0.0])),
            ],
        ),
    )
    .returning(["create_doc_index", "doc_a", "doc_b"]);

2) Node vector search: get ranked hits and fetch node properties

read_batch()
    .var_as(
        "doc_hits",
        g().vector_search_nodes("Doc", "embedding", vec![1.0f32, 0.0, 0.0], 5, None)
            .value_map(Some(vec!["$id", "$distance", "title"])),
    )
    .returning(["doc_hits"]);
doc_hits rows (example shape):
[
  { "$id": 42, "$distance": 0.0031, "title": "A" },
  { "$id": 77, "$distance": 0.0198, "title": "B" }
]

3) Use project(...) on vector hits (including distance)

read_batch()
    .var_as(
        "ranked_docs",
        g().vector_search_nodes("Doc", "embedding", vec![1.0f32, 0.0, 0.0], 10, None)
            .project(vec![
                PropertyProjection::renamed("$id", "doc_id"),
                PropertyProjection::renamed("$distance", "score"),
                PropertyProjection::new("title"),
            ]),
    )
    .returning(["ranked_docs"]);

4) Traverse from hit IDs to related entities

Store hit rows (with $id + $distance) and then use NodeRef::var(...) to continue graph traversal from those hit IDs.

read_batch()
    .var_as(
        "doc_hit_rows",
        g().vector_search_nodes("Doc", "embedding", vec![1.0f32, 0.0, 0.0], 5, None)
            .value_map(Some(vec!["$id", "$distance", "title"])),
    )
    .var_as(
        "authors",
        g().n(NodeRef::var("doc_hit_rows"))
            .out(Some("AUTHORED_BY"))
            .value_map(Some(vec!["$id", "name"])),
    )
    .returning(["doc_hit_rows", "authors"]);

5) Edge vector search and endpoint/property extraction

read_batch()
    .var_as(
        "edge_hits",
        g().vector_search_edges("SIMILAR", "embedding", vec![1.0f32, 0.0, 0.0], 10, None)
            .edge_properties(),
    )
    .var_as(
        "targets",
        g().e(EdgeRef::var("edge_hits"))
            .out_n()
            .value_map(Some(vec!["$id", "title"])),
    )
    .returning(["edge_hits", "targets"]);

edge_hits rows include $from, $to, and $distance (and $id when available), so you can inspect ranking metadata and still traverse from those edges.

6) Optional multitenancy

write_batch()
    .var_as(
        "create_mt_index",
        g().create_vector_index_nodes(
            "Doc",
            "embedding",
            Some("tenant_id"),
        ),
    )
    .var_as(
        "insert_acme",
        g().add_n(
            "Doc",
            vec![
                ("tenant_id", PropertyValue::from("acme")),
                ("title", PropertyValue::from("Acme doc")),
                ("embedding", PropertyValue::from(vec![1.0f32, 0.0, 0.0])),
            ],
        ),
    )
    .returning(["create_mt_index", "insert_acme"]);
read_batch()
    .var_as(
        "acme_hits",
        g().vector_search_nodes(
            "Doc",
            "embedding",
            vec![1.0f32, 0.0, 0.0],
            5,
            Some(PropertyValue::from("acme")),
        )
        .value_map(Some(vec!["$id", "$distance", "title"])),
    )
    .returning(["acme_hits"]);

Multitenant behavior in the current Helix interpreter:

  • multitenant index + missing tenant_value on search => query error
  • multitenant index + unknown tenant => empty result set
  • write with vector present but missing tenant property => write error

Edge-First Reads

read_batch()
    .var_as(
        "heavy_edges",
        g()
            .e_where(SourcePredicate::gt("weight", 0.8f64))
            .edge_has_label("FOLLOWS")
            .order_by("weight", Order::Desc)
            .limit(50),
    )
    .var_as(
        "targets",
        g()
            .e(EdgeRef::var("heavy_edges"))
            .out_n()
            .dedup(),
    )
    .returning(["heavy_edges", "targets"]);

Branching and Repetition

read_batch()
    .var_as(
        "recommendations",
        g()
            .n(1u64)
            .store("seed")
            .repeat(RepeatConfig::new(sub().out(Some("FOLLOWS"))).times(2))
            .without("seed")
            .union(vec![sub().out(Some("LIKES"))])
            .dedup()
            .limit(30),
    )
    .returning(["recommendations"]);

Traversal Building Inside var_as(...)

Common source steps:

  • n(...), n_where(...), n_with_label(...)
  • e(...), e_where(...), e_with_label(...)
  • vector_search_nodes(...), vector_search_edges(...)
    • current Helix runtime exposes vector hit metadata via virtual fields ($id, $distance, $from, $to) in terminal projections

Common navigation and filtering:

  • out/in_/both, out_e/in_e/both_e, out_n/in_n/other_n
  • has, has_label, has_key, where_, within, without, dedup
  • limit, skip, range, order_by, order_by_multiple

Common terminal projections:

  • count, exists, id, label
  • values, value_map, project, edge_properties

Write-only operations (usable in write_batch() traversals):

  • add_n, add_e, set_property, remove_property, drop, drop_edge, drop_edge_by_id
  • create_vector_index_nodes, create_vector_index_edges

For exhaustive catalog-style coverage of every public query-builder function, read the crate docs in src/lib.rs and browse the source directly.

License

Licensed under Apache-2.0.