The `trible` module defines the `Trible` struct, the smallest unit of
knowledge the system stores. Instances of `Trible`s live inside
`TribleSet`s, which index each fact in several complementary ways so that
queries can be answered with as little work as possible.
``` text
┌────────────────────────────64 byte───────────────────────────┐
┌──────────────┐┌──────────────┐┌──────────────────────────────┐
│ entity-id ││ attribute-id ││ inlined value │
└──────────────┘└──────────────┘└──────────────────────────────┘
└────16 byte───┘└────16 byte───┘└────────────32 byte───────────┘
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶
```
At a high level a trible is a three-tuple consisting of an entity, an
attribute, and a value. The entity and attribute are both 128‑bit abstract
extrinsic identifiers as described in [crate::id], while the value is an
arbitrary 256‑bit [crate::value::Value]. The value width deliberately matches
the size of an intrinsic identifier so larger payloads can be referenced via
blobs without inflating the inlined representation.
## Abstract identifiers
Entities and attributes are purely extrinsic; their identifiers do not encode
any meaning beyond uniqueness. An entity may accrue additional tribles over
time and attributes simply name relationships without prescribing a schema.
This keeps the format agnostic to external ontologies and minimises accidental
coupling between datasets.
The value slot can carry any 256‑bit payload. Its size is dictated by the need
to embed an intrinsic identifier for out‑of‑line data. When a fact exceeds this
space the value typically stores a blob handle pointing to the larger payload.
Tribles are stored as a contiguous 64‑byte array with the entity occupying the
first 16 bytes, the attribute the next 16, and the value the final 32 bytes.
The name "trible" is a portmanteau of *triple* and *byte* and is pronounced
like "tribble" from Star Trek – hence the project's mascot, Robert the
tribble. This rigid layout keeps the representation friendly to SIMD
optimisations and allows the storage layer to compute sizes deterministically.
## Index permutations
`TribleSet`s index each fact under all six permutations of entity (E),
attribute (A) and value (V) so any combination of bound variables can be
resolved efficiently. Regardless of which columns a query fixes the
planner can reach matching leaves with a handful of comparisons:
```text
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
│ EAV │ │ EVA │ │ AEV │ │ AVE │ │ VEA │ │ VAE │
└──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘
│ │ │ │ │ │
┌───────────────────────────────────────────────────────┐
│ order-specific inner nodes │
└───────────────────────────────────────────────────────┘
│ │ │ │ │ │
▼ ▼ ▼ ▼ ▼ ▼
┌───────────────────────────────────────────────────────┐
│ SHARED LEAVES │
│ single canonical E–A–V tribles used by all │
└───────────────────────────────────────────────────────┘
```
Each permutation maintains its own inner nodes, but all six share leaf nodes
containing the 64‑byte trible. This avoids a naïve six‑fold memory cost while
still letting the query planner pick the most selective ordering, keeping joins
resistant to skew even when cardinalities vary widely.
## Advantages
- A total order over tribles enables efficient storage and canonicalisation.
- Simple byte‑wise segmentation supports indexing and querying without an
interning mechanism, keeping memory usage low and parallelisation easy while
avoiding the need for garbage collection.
- Schemas describe the value portion directly, making serialisation and
deserialisation straightforward.
- The fixed 64‑byte layout makes it easy to estimate the physical size of a
dataset as a function of the number of tribles stored.
- The minimalistic design aims to minimise entropy while retaining collision
resistance, making it likely that a similar format would emerge through
convergent evolution and could serve as a universal data interchange format.
## Set operations and monotonic semantics
`TribleSet`s provide familiar set-theoretic helpers such as
[`TribleSet::union`](https://docs.rs/triblespace/latest/triblespace/trible/struct.TribleSet.html#method.union),
[`TribleSet::intersection`](https://docs.rs/triblespace/latest/triblespace/trible/struct.TribleSet.html#method.intersection)
and
[`TribleSet::difference`](https://docs.rs/triblespace/latest/triblespace/trible/struct.TribleSet.html#method.difference).
`union` consumes the right-hand operand and merges its contents into the
receiver in place, while `intersection` and `difference` each produce a fresh
`TribleSet` without mutating their inputs. Together these helpers make it
straightforward to merge datasets, locate their overlap or identify the facts
that still need to propagate between replicas while keeping the original
sources intact.
This design reflects the crate's commitment to CALM-friendly, monotonic
semantics. New information can be added freely, but existing facts are never
destroyed. Consequently, `difference` is intended for comparing snapshots
(e.g. "which facts are present in the remote set that I have not yet
indexed?") rather than for destructive deletion. This keeps workflows
declarative and convergent: sets can be combined in any order without
introducing conflicts, and subtraction simply reports the gaps that remain to
be filled.
## Direction and consistency
In many triple stores the direction of an edge is chosen incidentally—there
is no intrinsic preference for `hasColor` over `colorOf`. This ambiguity often
leads to confusion, duplication, or both as different writers pick different
conventions. Common mitigations either mirror every edge automatically (as
done by OWL and RDF through `inverseOf`, doubling storage or demanding runtime
inference) or devolve into bikeshedding about the "correct" orientation.
`tribles` avoids that trap by giving edge direction explicit semantics: the
arrow points from the entity making the claim to the entity being described.
The observer owns the identifier and is responsible for the consistency of the
facts it asserts—see [ID Ownership](crate::id). This rule naturally fits the
distributed setting where each entity has a single authoritative writer.
Viewed another way, edges always flow from describing to described entities,
while cycles represent consensus between the parties involved. For example,
`hasColor` must point from the object that exhibits the colour to the entity
representing that colour. The orientation is therefore a consequence of the
statement's meaning, not an arbitrary modelling choice.