syntheca 0.2.0

Content-addressable storage on top of apotheca. Bytes go in, BLAKE3 hash comes out; the underlying cella's compare-and-swap pinax namespace is surfaced as a pass-through.

syntheca

Content-addressable storage. Bytes go in, a BLAKE3 hash comes out; lookups are by hash; identical bytes coalesce into a single stored depositum. The underlying cella's compare-and-swap pinax namespace is surfaced as a transparent pass-through, so callers do not need a parallel apotheca handle to store mutable pointers (state-chain heads and the like) on the same cella.

This crate (syntheca, binary syn) is the Rust reference implementation of the syntheca protocol. It is a thin layer over apotheca: syntheca derives the apotheca depositum name from blake3(bytes), treats apotheca's collision outcome as benign dedup when the bytes match, adds an optional verify-on-read that rehashes under BLAKE3 (in addition to apotheca's mandatory SHA-256 verification), and forwards the pinax namespace through unchanged.

Phase 2 scope: a single, fixed hash function (BLAKE3), a single apotheca cella, three depositum operations (deposit, get, stat), and two pinax operations (get_pinax, set_pinax).

Install

The binary:

cargo install syntheca

The library:

[dependencies]
syntheca = "0.2"

CLI

syn exposes the operations one-for-one. The default cella root is $HOME/.syntheca/; override with --cella <dir>.

# Depositum (content-addressed, write-once).
syn deposit <path>      # store the file's bytes; prints the hash to stdout
syn deposit -           # store stdin; prints the hash to stdout
syn get <hash>          # bytes to stdout, verified
syn stat <hash>         # size and sha256 to stdout

# Pinax (compare-and-swap, caller-named).
syn pinax get <name>
syn pinax set --name <name> --expect-absent <path>
syn pinax set --name <name> --expect <hex> <path>

<hash> is 64 lowercase hexadecimal digits (a 32-byte BLAKE3 digest). Uppercase or mixed-case input is rejected.

deposit is idempotent: re-depositing identical bytes returns the same hash and does not modify the stored depositum. A genuine BLAKE3 collision (two distinct byte sequences hashing to the same digest) fails with a hash-collision error and leaves the stored bytes unmodified; honest inputs are not expected to trigger this while BLAKE3 remains collision-resistant.

get first runs apotheca's SHA-256 verification, then rehashes under BLAKE3 unless that check is disabled (it is on by default); a mismatch is reported as an integrity error rather than silently propagated. stat does not read or rehash the bytes.

syn pinax {get,set} is a verbatim mirror of apo pinax {get,set} against the same on-disk cella; pinax names are caller-chosen apotheca names with no content-addressing constraint. pinax set requires either --expect-absent (the pinax must not yet exist) or --expect <hex> (the stored sha256 must equal <hex>); on precondition failure, exit status is non-zero with conflict: actual=<hex> or conflict: actual=absent on stderr.
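
The precondition rule can be modelled in a few lines. What follows is a minimal in-memory sketch, not apotheca's implementation: PinaxModel, Outcome, and digest are hypothetical names introduced for illustration, and digest uses the standard library's DefaultHasher as a stand-in for SHA-256 so that the example needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in digest type for illustration only; apotheca uses SHA-256.
type Digest = u64;

/// Stand-in for the storage digest (assumption: any deterministic hash will do here).
fn digest(bytes: &[u8]) -> Digest {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical outcome type modelled on the two documented results.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Ok,
    /// `actual` is the digest currently stored, or None if the name is absent.
    Conflict { actual: Option<Digest> },
}

/// In-memory model of a compare-and-swap namespace.
#[derive(Default)]
struct PinaxModel {
    entries: HashMap<String, Vec<u8>>,
}

impl PinaxModel {
    /// `expect = None` models --expect-absent; `Some(d)` models --expect <hex>.
    fn set(&mut self, name: &str, bytes: &[u8], expect: Option<Digest>) -> Outcome {
        let actual = self.entries.get(name).map(|stored| digest(stored));
        if actual != expect {
            // Precondition failed: report what is there and leave it untouched.
            return Outcome::Conflict { actual };
        }
        self.entries.insert(name.to_string(), bytes.to_vec());
        Outcome::Ok
    }
}
```

In this model a writer that receives Conflict can re-read the current value, recompute its update, and retry with the reported digest as the new expectation.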

Exit status is 0 on success, non-zero on collision, conflict, not-found, integrity error, malformed hash, or any I/O failure, with a diagnostic on stderr.

Library

use syntheca::{Cella, Hash};

let cella = Cella::open("/path/to/cella")?;

let hash = cella.deposit(b"hello")?;   // hash = blake3("hello")
let bytes = cella.get(&hash)?;         // verified before return
let stat = cella.stat(&hash)?;         // { size, sha256 } from apotheca
assert_eq!(bytes, b"hello");

// Hashes round-trip through 64-char lowercase hex.
let s = hash.to_hex();
let parsed = Hash::from_hex(&s)?;
assert_eq!(hash, parsed);

Hashes are 32-byte BLAKE3 digests. The hash function is fixed at the type level; per-cella selection is deferred. Equality is over the underlying octets; the canonical wire encoding is 64 lowercase hex digits.

use syntheca::{Cella, Options};

// Disable BLAKE3 verify-on-read. apotheca's SHA-256 verify still runs.
let cella = Cella::open_with(
    "/path/to/cella",
    Options { verify_on_read: false },
)?;

The pinax namespace is exposed verbatim from apotheca (names, outcomes, error types are re-exported):

use syntheca::{Cella, Name, SetPinaxOutcome};

let cella = Cella::open("/path/to/cella")?;
let head = Name::new(b"head")?;

// First write: require absence.
match cella.set_pinax(&head, b"v1", None)? {
    SetPinaxOutcome::Ok => {}
    SetPinaxOutcome::Conflict { actual } => {
        // Someone else wrote first; `actual` is their stored sha256
        // (or None if a concurrent observer saw absent earlier).
    }
}

let bytes = cella.get_pinax(&head)?;

Errors split into:

  • DepositError (HashCollision, Apotheca)
  • GetError (NotFound, IntegrityError, Apotheca)
  • StatError (NotFound, Apotheca)
  • GetPinaxError, SetPinaxError (re-exported from apotheca unchanged — syntheca adds nothing on top of the pass-through).

Lower-level apotheca errors propagate unchanged through the Apotheca variants. A malformed hash passed to Hash::from_hex yields HashParseError (WrongLength or InvalidChar), which is a parse error rather than a protocol error. An invalid pinax name yields apotheca::NameError, which is also re-exported.
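
To make the two parse failures concrete, here is a standalone sketch of strict lowercase-hex parsing. It is not the crate's code: ParseError, parse_hash_hex, nibble, and to_hex are hypothetical names, and checking the length before the characters is an assumption about ordering that the text above does not specify.

```rust
/// Hypothetical mirror of the two failure modes named above; not the crate's type.
#[derive(Debug, PartialEq, Eq)]
enum ParseError {
    WrongLength,
    InvalidChar,
}

/// Decode one lowercase hex digit. Uppercase digits are rejected on purpose.
fn nibble(c: u8) -> Result<u8, ParseError> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        _ => Err(ParseError::InvalidChar),
    }
}

/// Parse exactly 64 lowercase hex digits into 32 octets.
fn parse_hash_hex(s: &str) -> Result<[u8; 32], ParseError> {
    let digits = s.as_bytes();
    if digits.len() != 64 {
        return Err(ParseError::WrongLength);
    }
    let mut out = [0u8; 32];
    for (i, pair) in digits.chunks_exact(2).enumerate() {
        out[i] = (nibble(pair[0])? << 4) | nibble(pair[1])?;
    }
    Ok(out)
}

/// Encode 32 octets as 64 lowercase hex digits.
fn to_hex(octets: &[u8; 32]) -> String {
    octets.iter().map(|b| format!("{b:02x}")).collect()
}
```

Rejecting uppercase at parse time keeps the textual form canonical: each 32-byte value has exactly one accepted spelling.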

On-disk layout

A syntheca cella is an apotheca cella. Inside the cella root, the depositum namespace lives at deposita/<hex-blake3>/ with bytes and meta files (apotheca's layout); the pinax namespace lives at pinakes/<name> (one file per pinax, content = bytes).

<cella>/
  deposita/                # depositum namespace, write-once
    <hex-blake3>/
      bytes                # the depositum's bytes
      meta                 # size and sha256 (apotheca's storage digest)
  pinakes/                 # pinax namespace, compare-and-swap
    <name>                 # one file per pinax; content = bytes
    <name>.lock            # per-name advisory lock (created on demand)
  tmp/
    <staging-id>           # staging area for atomic deposit / set_pinax
      ...

For deposita, bytes is octet-for-octet what was deposited and meta records apotheca's SHA-256 storage digest; the BLAKE3 hash is implicit in the directory name and is not stored separately. A read verifies bytes against the stored SHA-256 (apotheca) and, by default, rehashes under BLAKE3 to confirm the directory name still names what it claims to. For pinakes, the local backend recomputes the digest from the file content on each read; no meta file is written.
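
For tooling that inspects a cella directly, the layout above maps onto a few path joins. This is a sketch under stated assumptions: depositum_paths, pinax_path, and pinax_lock_path are hypothetical helpers, not part of the crate, and the pinax name is assumed to already be a filesystem-safe string (how apotheca encodes names on disk is not covered here).

```rust
use std::path::{Path, PathBuf};

/// Paths of the `bytes` and `meta` files for one depositum,
/// given the cella root and the 64-digit lowercase hex BLAKE3 name.
fn depositum_paths(root: &Path, hex_blake3: &str) -> (PathBuf, PathBuf) {
    let dir = root.join("deposita").join(hex_blake3);
    (dir.join("bytes"), dir.join("meta"))
}

/// Path of the single file that holds a pinax's content.
fn pinax_path(root: &Path, name: &str) -> PathBuf {
    root.join("pinakes").join(name)
}

/// Path of the per-name advisory lock file (created on demand by the store).
fn pinax_lock_path(root: &Path, name: &str) -> PathBuf {
    root.join("pinakes").join(format!("{name}.lock"))
}
```

These helpers only compute paths; reading through them bypasses both the SHA-256 and BLAKE3 checks that get performs.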

Two digests, one depositum

Every depositum has two associated hashes:

  • BLAKE3 — names the depositum within syntheca, the unit of content equality, derived at deposit time and required at get/stat time.
  • SHA-256 — apotheca's storage-integrity digest; the value stat returns; the field apotheca uses to detect collisions and reject corrupted reads.

These are deliberately distinct. apotheca is the substrate and uses SHA-256 as its mandatory integrity hash for any caller; syntheca picks BLAKE3 for content-addressing on top. deposit collision detection rides on apotheca's SHA-256 comparison: differing bytes have differing SHA-256 digests (short of a simultaneous SHA-256 collision), so apotheca returns Collision even when the BLAKE3-derived names match, and syntheca surfaces that as HashCollision.
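
The interplay of the two digests on deposit can be illustrated with a toy model. Everything here is hypothetical and for illustration only: Model, ModelError, storage_digest, and name_by_length are not crate items, storage_digest uses the standard library's DefaultHasher as a stand-in for SHA-256, and the naming hash is pluggable so that a deliberately weak one (the byte length) can force a name collision that BLAKE3 would not produce in practice.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the SHA-256 storage digest. Illustration only.
fn storage_digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Deliberately weak naming hash so that collisions are easy to trigger.
fn name_by_length(bytes: &[u8]) -> u64 {
    bytes.len() as u64
}

#[derive(Debug, PartialEq, Eq)]
enum ModelError {
    HashCollision,
}

/// Write-once store: name comes from one hash, integrity from another.
struct Model {
    name_of: fn(&[u8]) -> u64, // stand-in for BLAKE3
    stored: HashMap<u64, u64>, // name -> storage digest
}

impl Model {
    fn new(name_of: fn(&[u8]) -> u64) -> Self {
        Model { name_of, stored: HashMap::new() }
    }

    fn deposit(&mut self, bytes: &[u8]) -> Result<u64, ModelError> {
        let name = (self.name_of)(bytes);
        let incoming = storage_digest(bytes);
        match self.stored.get(&name).copied() {
            // Same name, same storage digest: benign dedup, nothing is rewritten.
            Some(existing) if existing == incoming => Ok(name),
            // Same name, different storage digest: distinct bytes share a name.
            Some(_) => Err(ModelError::HashCollision),
            None => {
                self.stored.insert(name, incoming);
                Ok(name)
            }
        }
    }
}
```

With name_by_length, depositing b"hello" twice succeeds with the same name, while b"world" (same length, different bytes) is rejected as a collision and the first entry is left as it was.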

Pinakes are single-hash (SHA-256 only): there is no content-addressed naming contract on pinax names, so syntheca has nothing to verify against beyond apotheca's mandatory storage digest.

Status and scope

This is the Phase 2 reference implementation. It conforms to the syntheca Phase 2 protocol: depositum operations, the BLAKE3-name encoding, two-hash integrity, verify-on-read, the pinax pass-through, and the syn CLI surface (including pinax {get,set}). For the substrate it inherits Phase 2 conformance from apotheca (single local backend, atomic deposit, atomic compare-and-swap, etc.).

Out of scope here:

  • enumeration: apotheca excludes it and the exclusion carries through, so syntheca has no ls/list operation and consumers maintain their own manifests;
  • deletion: deposita are write-once, and pinax replacement goes through compare-and-swap rather than deletion;
  • alternative hash functions, multi-cella composition, and configuration files;
  • state chains, history, and schema validation.

State and history live in projects above syntheca (metatheca, literium, dbaiv); syntheca surfaces the pinax primitive those projects use, but stops below the chain semantics.

License

Licensed under either of MIT (LICENSE-MIT) or Apache-2.0 (LICENSE-APACHE) at your option.

See also

The protocol specification, decision rationale, and broader project framing live in the syntheca project group at https://gitlab.com/pantheca/syntheca. The substrate apotheca (named write-once store with a compare-and-swap pinax namespace, no content-addressing) lives at https://gitlab.com/pantheca/apotheca.