syntheca 0.3.0

Content-addressable storage on top of apotheca. Bytes go in, BLAKE3 hash comes out; the underlying cella's compare-and-swap pinax namespace is surfaced as a pass-through.
# syntheca

Content-addressable storage. Bytes go in and a BLAKE3 hash comes out;
lookups are by hash, and identical bytes coalesce into a single stored
depositum. The underlying cella's compare-and-swap pinax namespace is
surfaced as a transparent pass-through, so callers do not need a
parallel apotheca handle to store mutable pointers (state-chain heads
and the like) on the same cella.

This crate (`syntheca`, binary `syn`) is the Rust reference
implementation of the syntheca protocol. It is a thin layer over
[`apotheca`](https://crates.io/crates/apotheca): syntheca derives the
apotheca depositum name from `blake3(bytes)`, treats apotheca's
collision outcome as benign dedup when the bytes match, adds an
optional verify-on-read that rehashes under BLAKE3 (in addition to
apotheca's mandatory SHA-256 verification), and forwards the pinax
namespace through unchanged.

The crate covers both surfaces, depositum and pinax pass-through, with
a single fixed hash function (BLAKE3) and a single apotheca cella. The
depositum operations are `deposit`, `get`, and `stat`; the pinax
operations are `get_pinax` and `set_pinax`.

## Install

The binary:

```sh
cargo install syntheca
```

The library:

```toml
[dependencies]
syntheca = "0.3"
```

## CLI

`syn` exposes the operations one-for-one. The default cella root is
`$HOME/.syntheca/`; override with `--cella <dir>`.

```sh
# Depositum (content-addressed, write-once).
syn deposit <path>      # store the file's bytes; prints the hash to stdout
syn deposit -           # store stdin; prints the hash to stdout
syn get <hash>          # bytes to stdout, verified
syn stat <hash>         # size and sha256 to stdout

# Pinax (compare-and-swap, caller-named).
syn pinax get <name>
syn pinax set --name <name> --expect-absent <path>
syn pinax set --name <name> --expect <hex> <path>
```

`<hash>` is 64 lowercase hexadecimal digits (32-byte BLAKE3). Uppercase
or mixed-case input is rejected. `deposit` is idempotent: re-depositing
identical bytes returns the same hash and does not modify the stored
depositum. A genuine BLAKE3 collision (two distinct byte sequences
hashing to the same digest) fails with a hash-collision error and the
stored bytes are not modified; this does not occur from honest inputs
against a collision-resistant hash. `get` first runs apotheca's SHA-256
verification, then optionally rehashes under BLAKE3 (default on); a
mismatch is reported as an integrity error rather than silently
propagated. `stat` does not read or re-hash the bytes.

`syn pinax {get,set}` is a verbatim mirror of `apo pinax {get,set}`
against the same on-disk cella; pinax names are caller-chosen apotheca
names with no content-addressing constraint. `pinax set` requires
either `--expect-absent` (the pinax must not yet exist) or
`--expect <hex>` (the stored sha256 must equal `<hex>`); on
precondition failure, exit status is non-zero with `conflict:
actual=<hex>` or `conflict: actual=absent` on stderr.

Exit status is `0` on success, non-zero on collision, conflict,
not-found, integrity error, malformed hash, or any I/O failure, with a
diagnostic on stderr.

## Library

```rust
use syntheca::{Cella, Hash};

let cella = Cella::open("/path/to/cella")?;

let hash = cella.deposit(b"hello")?;   // hash = blake3("hello")
let bytes = cella.get(&hash)?;         // verified before return
let stat = cella.stat(&hash)?;         // { size, sha256 } from apotheca
assert_eq!(bytes, b"hello");

// Hashes round-trip through 64-char lowercase hex.
let s = hash.to_hex();
let parsed = Hash::from_hex(&s)?;
assert_eq!(hash, parsed);
```

Hashes are 32-byte BLAKE3 digests. The hash function is fixed at the
type level; per-cella selection is deferred. Equality is over the
underlying octets; the canonical wire encoding is 64 lowercase hex
digits.

```rust
use syntheca::{Cella, Options};

// Disable BLAKE3 verify-on-read. apotheca's SHA-256 verify still runs.
let cella = Cella::open_with(
    "/path/to/cella",
    Options { verify_on_read: false },
)?;
```

The pinax namespace is exposed verbatim from apotheca (names,
outcomes, error types are re-exported):

```rust
use syntheca::{Cella, Name, SetPinaxOutcome};

let cella = Cella::open("/path/to/cella")?;
let head = Name::new(b"head")?;

// First write: require absence.
match cella.set_pinax(&head, b"v1", None)? {
    SetPinaxOutcome::Ok => {}
    SetPinaxOutcome::Conflict { actual } => {
        // Someone else wrote first; `actual` is their stored sha256
        // (or None if a concurrent observer saw absent earlier).
    }
}

let bytes = cella.get_pinax(&head)?;
```
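The compare-and-swap contract behind `set_pinax` can be modelled in memory. This is a minimal sketch and not the crate's implementation: `ToyPinakes`, `Outcome`, and `toy_digest` are hypothetical names, and the standard library's `DefaultHasher` stands in for apotheca's SHA-256 only to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in digest for this sketch only (assumption: apotheca compares
/// SHA-256 digests; `DefaultHasher` is used here to avoid dependencies).
fn toy_digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Hypothetical mirror of the Ok / Conflict split described above.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Ok,
    /// `actual` is the digest currently stored, or `None` if the name is absent.
    Conflict { actual: Option<u64> },
}

/// In-memory model of a pinax namespace: name -> current bytes.
#[derive(Default)]
pub struct ToyPinakes {
    entries: HashMap<String, Vec<u8>>,
}

impl ToyPinakes {
    /// `expect = None` models "must be absent"; `Some(d)` models
    /// "the stored digest must equal `d`". The write happens only
    /// when the precondition holds.
    pub fn set(&mut self, name: &str, bytes: &[u8], expect: Option<u64>) -> Outcome {
        let actual = self.entries.get(name).map(|b| toy_digest(b));
        if actual != expect {
            return Outcome::Conflict { actual };
        }
        self.entries.insert(name.to_string(), bytes.to_vec());
        Outcome::Ok
    }

    /// Read the current bytes, if any.
    pub fn get(&self, name: &str) -> Option<&[u8]> {
        self.entries.get(name).map(|b| b.as_slice())
    }
}
```

In this model a caller that receives `Conflict` re-reads the current value, recomputes its update, and retries with the reported digest as the new expectation; the failed write leaves the stored bytes untouched.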

Errors split into:

- `DepositError` (`HashCollision`, `Apotheca`)
- `GetError` (`NotFound`, `IntegrityError`, `Apotheca`)
- `StatError` (`NotFound`, `Apotheca`)
- `GetPinaxError`, `SetPinaxError` (re-exported from `apotheca`
  unchanged — syntheca adds nothing on top of the pass-through).

Lower-level apotheca errors propagate unchanged through the `Apotheca`
variants. A malformed hash on `Hash::from_hex` is `HashParseError`
(`WrongLength` or `InvalidChar`) — not a protocol error. An invalid
pinax name is `apotheca::NameError`, also re-exported.
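
The two `HashParseError` cases can be illustrated with a dependency-free decoder for the canonical encoding. This is a sketch under stated assumptions, not the crate's `Hash::from_hex`: `parse_hash_hex`, `to_hex`, and the `ParseError` enum are hypothetical names, and checking the length before the characters is an arbitrary choice made here.

```rust
/// Hypothetical error type for this sketch; it only mirrors the
/// `WrongLength` / `InvalidChar` split described above.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    WrongLength,
    InvalidChar,
}

/// Decode exactly 64 lowercase hex digits into 32 octets.
/// Uppercase digits are rejected rather than normalised.
pub fn parse_hash_hex(s: &str) -> Result<[u8; 32], ParseError> {
    let digits = s.as_bytes();
    if digits.len() != 64 {
        return Err(ParseError::WrongLength);
    }
    let mut out = [0u8; 32];
    for (i, pair) in digits.chunks(2).enumerate() {
        out[i] = (nibble(pair[0])? << 4) | nibble(pair[1])?;
    }
    Ok(out)
}

/// Map one ASCII hex digit to its value; only `0-9` and `a-f` are accepted.
fn nibble(c: u8) -> Result<u8, ParseError> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        _ => Err(ParseError::InvalidChar),
    }
}

/// Encode 32 octets as 64 lowercase hex digits.
pub fn to_hex(octets: &[u8; 32]) -> String {
    octets.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Because only one spelling is accepted, comparing two encoded hashes as strings agrees with comparing the underlying octets.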

## On-disk layout

A syntheca cella is an apotheca cella. Inside the cella root, the
depositum namespace lives at `deposita/<hex-blake3>/` with `bytes` and
`meta` files (apotheca's layout); the pinax namespace lives at
`pinakes/<name>` (one file per pinax, content = bytes).

```
<cella>/
  deposita/                # depositum namespace, write-once
    <hex-blake3>/
      bytes                # the depositum's bytes
      meta                 # size and sha256 (apotheca's storage digest)
  pinakes/                 # pinax namespace, compare-and-swap
    <name>                 # one file per pinax; content = bytes
    <name>.lock            # per-name advisory lock (created on demand)
  tmp/
    <staging-id>           # staging area for atomic deposit / set_pinax
      ...
```

For deposita, `bytes` is octet-for-octet what was deposited and `meta`
records apotheca's SHA-256 storage digest; the BLAKE3 hash is implicit
in the directory name and is not stored separately. A read verifies
bytes against the stored SHA-256 (apotheca) and, by default, rehashes
under BLAKE3 to confirm the directory name still names what it claims
to. For pinakes, the local backend recomputes the digest from the file
content on each read; no `meta` file is written.
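
The layout above maps onto path construction as follows. This is an illustrative sketch: the helper names are hypothetical and are not part of the syntheca or apotheca API, and it assumes a pinax name is usable verbatim as a file name, which the layout suggests but this README does not spell out.

```rust
use std::path::{Path, PathBuf};

/// Directory holding one depositum: `<cella>/deposita/<hex-blake3>/`.
pub fn depositum_dir(cella: &Path, hex_blake3: &str) -> PathBuf {
    cella.join("deposita").join(hex_blake3)
}

/// The depositum's bytes, octet-for-octet as deposited.
pub fn depositum_bytes(cella: &Path, hex_blake3: &str) -> PathBuf {
    depositum_dir(cella, hex_blake3).join("bytes")
}

/// The depositum's metadata (size and SHA-256 storage digest).
pub fn depositum_meta(cella: &Path, hex_blake3: &str) -> PathBuf {
    depositum_dir(cella, hex_blake3).join("meta")
}

/// One file per pinax: `<cella>/pinakes/<name>`.
/// Assumption: `name` is already a valid single path component.
pub fn pinax_file(cella: &Path, name: &str) -> PathBuf {
    cella.join("pinakes").join(name)
}
```

Since the BLAKE3 hash appears only as the directory name, these helpers need nothing beyond the cella root and the hex string to locate a depositum.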

## Two digests, one depositum

Every depositum has two associated hashes:

- **BLAKE3** — names the depositum within syntheca, the unit of
  content equality, derived at `deposit` time and required at
  `get`/`stat` time.
- **SHA-256** — apotheca's storage-integrity digest; the value `stat`
  returns; the field apotheca uses to detect collisions and reject
  corrupted reads.

These are deliberately distinct. apotheca is the substrate and uses
SHA-256 as its mandatory integrity hash for any caller; syntheca picks
BLAKE3 for content-addressing on top. `deposit` collision detection
rides on apotheca's SHA-256 comparison: differing bytes have differing
SHA-256 digests (barring a simultaneous SHA-256 collision), so apotheca
returns `Collision` even when the BLAKE3-derived names match, which
syntheca surfaces as `HashCollision`.
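
The interplay of the two digests at `deposit` time can be sketched with a toy store. Everything here is an assumption made for illustration: `ToyStore`, `weak_name`, and `storage_digest` are hypothetical, the naming hash is deliberately weak (sum of octets modulo 256) so that a name collision can actually be provoked, and `DefaultHasher` stands in for SHA-256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Deliberately weak naming hash so collisions are easy to construct.
/// syntheca uses BLAKE3 in this role.
fn weak_name(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

/// Stand-in storage digest. apotheca uses SHA-256 in this role.
fn storage_digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Hypothetical error: same name, different stored content.
#[derive(Debug, PartialEq, Eq)]
pub struct HashCollision;

/// In-memory model: name -> (storage digest, bytes).
#[derive(Default)]
pub struct ToyStore {
    deposita: HashMap<u8, (u64, Vec<u8>)>,
}

impl ToyStore {
    /// Write-once deposit: new names are stored, identical content is
    /// deduplicated, and differing content under the same name is rejected.
    pub fn deposit(&mut self, bytes: &[u8]) -> Result<u8, HashCollision> {
        let name = weak_name(bytes);
        let digest = storage_digest(bytes);
        let existing = self.deposita.get(&name).map(|(d, _)| *d);
        match existing {
            None => {
                self.deposita.insert(name, (digest, bytes.to_vec()));
                Ok(name)
            }
            // Same name and same storage digest: benign dedup.
            Some(d) if d == digest => Ok(name),
            // Same name, different storage digest: stored bytes stay as they are.
            Some(_) => Err(HashCollision),
        }
    }

    /// Look up the stored bytes by name.
    pub fn get(&self, name: u8) -> Option<&[u8]> {
        self.deposita.get(&name).map(|(_, b)| b.as_slice())
    }
}
```

Depositing `b"ab"` twice returns the same name both times, while depositing `b"ba"` afterwards (same octet sum, different content) is rejected and leaves `b"ab"` in place. With BLAKE3 as the naming hash the rejected branch is not expected to be reachable from honest inputs.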

Pinakes are single-hash (SHA-256 only): there is no content-addressed
naming contract on pinax names, so syntheca has nothing to verify
against beyond apotheca's mandatory storage digest.

## Status and scope

Reference implementation. Conformant with syntheca v1.0-rc1 on both
surfaces: depositum operations, the BLAKE3-name encoding, two-hash
integrity, verify-on-read, the pinax pass-through, and the `syn` CLI
surface (including `pinax {get,set}`). Inherits both-surfaces
conformance from apotheca for the substrate (single local backend,
atomic deposit, atomic compare-and-swap, etc.).

Out of scope here: enumeration (excluded by apotheca and therefore here
too: syntheca has no `ls`/`list` operation; consumers maintain their own manifests),
deletion (write-once for deposita; pinax replacement is via
compare-and-swap, not deletion), alternative hash functions, multi-cella
composition, configuration files, state chains, history, schema
validation. State and history live in projects above syntheca
(metatheca, literium, dbaiv); syntheca surfaces the pinax primitive
those projects use, but stops below the chain semantics.

## License

Licensed under either of MIT (LICENSE-MIT) or Apache-2.0 (LICENSE-APACHE)
at your option.

## See also

The protocol specification, decision rationale, and broader project framing
live in the syntheca project group at
<https://gitlab.com/pantheca/syntheca>. The substrate `apotheca` (named
write-once store with a compare-and-swap pinax namespace, no
content-addressing) lives at <https://gitlab.com/pantheca/apotheca>.