Skip to main content

Module hnsw_cell

Module hnsw_cell 

Source
Expand description

On-disk format for a single HNSW graph node (Phase 7d.3).

Each cell carries one node’s per-layer neighbor lists. The cells live on TableLeaf-style pages identical to a regular table’s data tree — same slot directory, same sibling next_page chain, same interior- page mechanics from Phase 3d. The only thing different is the per-cell body, signaled by KIND_HNSW.

Reusing the table-tree shape lets Cell::peek_rowid work uniformly across all cell kinds: it skips cell_length | kind_tag and reads the first varint, which is node_id here. So slot-directory binary search by node_id works without HNSW-specific code in the page-level plumbing.

  cell_length   varint          bytes after this field
  kind_tag      u8 = 0x05       (KIND_HNSW)
  node_id       zigzag varint   the rowid this graph node represents
  max_layer     varint          highest layer this node lives in
  for layer in 0..=max_layer:
    count       varint          number of neighbors at this layer
    for each neighbor:
      neighbor  zigzag varint   neighbor's node_id

No null bitmap — every field is always present. No type tag — every field has a fixed type (varint or zigzag varint). The encoding is deliberately minimal because HNSW indexes can have N nodes each with up to ~M·log(N) total neighbors, and we don’t want the per-cell overhead to dominate disk usage.

Structs§

HnswNodeCell
One HNSW node’s persisted form. layers[i] is the list of neighbor node_ids at layer i; the node lives at every layer 0..=layers.len()-1.