Module schema

Arrow schemas for the graph store.

Three foundational tables:

  • Triples: subject/predicate/object triples with provenance
  • Embeddings: entity vectors (FixedSizeList<f32>)
  • Metadata: per-entity access tracking

The triples schema includes a layer column (UInt8) for optional sub-partitioning within namespaces. The column is always present; when layers are not needed, it is set to 0.

Modules§

chunk_col
Named column indices for the Chunks schema (fine-grained provenance).
col
Named column indices for the Triples schema. Use these instead of hardcoded integers when accessing RecordBatch columns.
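
The named-index pattern recommended for col can be illustrated with a minimal, stdlib-only sketch. Everything here is an assumption made for illustration: the constant names (SUBJECT, PREDICATE, OBJECT, LAYER), their positions, and the Batch type, which stands in for an Arrow RecordBatch. The real col module may declare different names and ordering.

```rust
/// Hypothetical stand-in for the crate's `col` module.
/// The real constant names and positions may differ.
pub mod col {
    pub const SUBJECT: usize = 0;
    pub const PREDICATE: usize = 1;
    pub const OBJECT: usize = 2;
    pub const LAYER: usize = 3;
}

/// Simplified stand-in for a RecordBatch: one Vec<String> per column.
pub struct Batch {
    pub columns: Vec<Vec<String>>,
}

impl Batch {
    /// Returns the column at `index`, or None if the batch has fewer columns.
    pub fn column(&self, index: usize) -> Option<&[String]> {
        self.columns.get(index).map(|c| c.as_slice())
    }
}

/// Builds a one-row batch, filling each column through its named index.
pub fn example_batch() -> Batch {
    let mut columns: Vec<Vec<String>> = vec![Vec::new(); 4];
    columns[col::SUBJECT].push("alice".to_string());
    columns[col::PREDICATE].push("knows".to_string());
    columns[col::OBJECT].push("bob".to_string());
    columns[col::LAYER].push("0".to_string());
    Batch { columns }
}
```

Reading batch.column(col::PREDICATE) instead of batch.column(1) keeps call sites correct if a later schema version reorders or inserts columns, since only the constants need to change.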

Constants§

CHUNKS_SCHEMA_VERSION
Current schema version for the Chunks table.
DEFAULT_EMBEDDING_DIM
Default embedding dimension (all-MiniLM-L6-v2 produces 384-dimensional vectors; 768 leaves room for larger models).
TRIPLES_SCHEMA_VERSION
Current schema version for the Triples table.
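
To show why the embedding dimension is fixed per schema, the following stdlib-only sketch models a fixed-size list column: every vector has the same length and all values share one flat buffer, so row i starts at offset i * dim. The EmbeddingColumn type and its methods are hypothetical and are not part of this module; the real schema uses Arrow's FixedSizeList type.

```rust
/// Simplified stand-in for a FixedSizeList<f32> column.
/// Row `i` occupies `values[i * dim .. (i + 1) * dim]`.
pub struct EmbeddingColumn {
    dim: usize,
    values: Vec<f32>,
}

impl EmbeddingColumn {
    /// Creates an empty column whose vectors all have `dim` elements.
    pub fn new(dim: usize) -> Self {
        assert!(dim > 0, "dimension must be non-zero");
        Self { dim, values: Vec::new() }
    }

    /// Appends a vector, rejecting any whose length differs from `dim`.
    pub fn push(&mut self, vector: &[f32]) -> Result<(), String> {
        if vector.len() != self.dim {
            return Err(format!(
                "expected {} values, got {}",
                self.dim,
                vector.len()
            ));
        }
        self.values.extend_from_slice(vector);
        Ok(())
    }

    /// Number of vectors stored in the column.
    pub fn len(&self) -> usize {
        self.values.len() / self.dim
    }

    /// Returns the vector at `row`, or None if `row` is out of range.
    pub fn get(&self, row: usize) -> Option<&[f32]> {
        let start = row.checked_mul(self.dim)?;
        let end = start.checked_add(self.dim)?;
        self.values.get(start..end)
    }
}
```

Because the dimension is part of the column type, a vector produced by a 384-dimensional model cannot be written into a 768-dimensional column without padding or re-encoding, which is presumably why a custom-dimension constructor is offered alongside the default.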

Functions§

chunks_schema
Schema for the Chunks table — fine-grained document provenance.
embeddings_schema
Schema for the Embeddings table — vector representations of entities.
embeddings_schema_with_dim
Embeddings schema with a custom vector dimension.
metadata_schema
Schema for the Metadata table — per-entity access tracking.
normalize_to_current
Normalize a RecordBatch from an older schema version to the current version.
triples_schema
Schema for the Triples table — the core knowledge representation.
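
As a rough illustration of what normalizing an older batch could involve, the stdlib-only sketch below upgrades a batch that predates the layer column by filling that column with 0, matching the convention described for the triples schema. The TripleBatch type, the normalize_sketch function, the version numbers, and the error handling are all assumptions; the actual normalize_to_current operates on Arrow RecordBatch values and may behave differently.

```rust
/// Hypothetical current version; the real TRIPLES_SCHEMA_VERSION may differ.
pub const CURRENT_VERSION: u32 = 2;

/// Simplified stand-in for a Triples RecordBatch.
#[derive(Debug, Clone, PartialEq)]
pub struct TripleBatch {
    pub version: u32,
    pub subjects: Vec<String>,
    pub predicates: Vec<String>,
    pub objects: Vec<String>,
    /// None models an older batch written before the layer column existed.
    pub layers: Option<Vec<u8>>,
}

/// Upgrades `batch` to CURRENT_VERSION, adding a zero-filled layer column
/// when it is missing. Batches from a newer version are rejected.
pub fn normalize_sketch(mut batch: TripleBatch) -> Result<TripleBatch, String> {
    if batch.version > CURRENT_VERSION {
        return Err(format!("unsupported schema version {}", batch.version));
    }
    if batch.layers.is_none() {
        // One layer value per row; 0 means "no layer".
        batch.layers = Some(vec![0; batch.subjects.len()]);
    }
    batch.version = CURRENT_VERSION;
    Ok(batch)
}
```

Normalizing on read lets the rest of the code assume a single column layout, so named indices such as those in col stay valid regardless of which version wrote the data.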