quipu-core
The embedded storage and query core of Quipu-Log: append-only segments backed by an RFC 6962 Merkle history tree, versioned entity registries, field-level protection, retention, and time-travel queries. This crate is the synchronous engine. The async pipeline and HTTP proxy live in quipu-middleware and quipu-server.
Integrity: a Merkle history tree
Every record's leaf hash (SHA-256(0x00 || payload)) is appended to a Merkle spine — a retention-independent file (merkle.spine) per table whose root commits to every record ever appended (RFC 6962 history tree; node = SHA-256(0x01 || left || right)). The root subsumes a hash chain's tamper-evidence — a record edited in place no longer hashes to its spine leaf, and removing or reordering a segment changes the root — and adds something a chain can't give: independent, third-party verification in O(log n).
- Inclusion proof — "record E is committed to root R" with an audit path, verifiable without the rest of the log and without trusting the operator.
- Consistency proof — "the tree of size m is a prefix of the tree of size n", i.e. the history in between is append-only: nothing was edited or removed.
The spine holds only hashes (no payloads), so it is never purged by retention — which is what lets the root and every proof survive after the underlying records age out. Segment frames carry just [len][crc][ts][payload]; the CRC catches accidental corruption, the spine catches tampering.
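As a rough illustration of the hashing rules described in this section, the sketch below computes an RFC 6962 root over a list of leaf hashes and shows that an in-place edit or a reorder changes it. The names `h`, `leaf_hash`, `node_hash`, and `root_of` are invented for this example and are not quipu-core APIs. Rust's standard library has no SHA-256, so `h` is a 64-bit stand-in hash; a real implementation would use SHA-256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// ASSUMPTION: stand-in for SHA-256, which std does not provide.
// A 64-bit hash is not collision-resistant; swap in a real SHA-256 in practice.
fn h(data: &[u8]) -> Vec<u8> {
    let mut s = DefaultHasher::new();
    s.write(data);
    s.finish().to_be_bytes().to_vec()
}

// Leaf hash: H(0x00 || payload).
fn leaf_hash(payload: &[u8]) -> Vec<u8> {
    h(&[&[0x00u8][..], payload].concat())
}

// Interior node: H(0x01 || left || right).
fn node_hash(left: &[u8], right: &[u8]) -> Vec<u8> {
    h(&[&[0x01u8][..], left, right].concat())
}

// RFC 6962 tree hash: split at the largest power of two strictly below n.
fn root_of(leaves: &[Vec<u8>]) -> Vec<u8> {
    match leaves.len() {
        0 => h(&[]),
        1 => leaves[0].clone(),
        n => {
            let k = n.next_power_of_two() / 2;
            node_hash(&root_of(&leaves[..k]), &root_of(&leaves[k..]))
        }
    }
}

fn main() {
    let records = ["login alice", "read doc-7", "logout alice"];
    let leaves: Vec<Vec<u8>> = records.iter().map(|r| leaf_hash(r.as_bytes())).collect();
    let root = root_of(&leaves);

    // A record edited in place no longer hashes to the same leaf, so the root moves.
    let mut edited = leaves.clone();
    edited[1] = leaf_hash(b"read doc-8");
    assert_ne!(root, root_of(&edited));

    // Reordering leaves also changes the root.
    let mut reordered = leaves.clone();
    reordered.swap(0, 2);
    assert_ne!(root, root_of(&reordered));

    println!("root = {:02x?}", root);
}
```

The distinct 0x00 and 0x01 prefixes keep a leaf from ever being mistaken for an interior node, which is what makes the root a commitment to the exact sequence of records.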
let proof = store.prove_inclusion?; // O(log n) audit path
let root = store.merkle_root; // the value an anchor pins
assert!;
let c = store.prove_consistency?; // append-only between sizes
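To make the third-party side of an inclusion proof concrete, here is a minimal sketch of building an audit path and checking it against a root, following the RFC 6962 path definition and the verification steps in RFC 9162. Everything in it (`audit_path`, `verify_audit_path`, `root_of`, the stand-in hash `h`) is hypothetical and independent of quipu-core's actual API and proof encoding. The path builder here recomputes subtree roots from the leaves for simplicity; only the verifier reflects the O(log n) cost.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// ASSUMPTION: 64-bit stand-in for SHA-256 (std has none); not collision-resistant.
fn h(data: &[u8]) -> Vec<u8> {
    let mut s = DefaultHasher::new();
    s.write(data);
    s.finish().to_be_bytes().to_vec()
}
fn leaf_hash(payload: &[u8]) -> Vec<u8> {
    h(&[&[0x00u8][..], payload].concat())
}
fn node_hash(left: &[u8], right: &[u8]) -> Vec<u8> {
    h(&[&[0x01u8][..], left, right].concat())
}
fn root_of(leaves: &[Vec<u8>]) -> Vec<u8> {
    match leaves.len() {
        0 => h(&[]),
        1 => leaves[0].clone(),
        n => {
            let k = n.next_power_of_two() / 2;
            node_hash(&root_of(&leaves[..k]), &root_of(&leaves[k..]))
        }
    }
}

// Audit path for leaf `index`: sibling subtree roots, ordered leaf to root.
fn audit_path(index: usize, leaves: &[Vec<u8>]) -> Vec<Vec<u8>> {
    let n = leaves.len();
    if n <= 1 {
        return Vec::new();
    }
    let k = n.next_power_of_two() / 2;
    let (mut path, sibling) = if index < k {
        (audit_path(index, &leaves[..k]), root_of(&leaves[k..]))
    } else {
        (audit_path(index - k, &leaves[k..]), root_of(&leaves[..k]))
    };
    path.push(sibling);
    path
}

// Verifier: needs only the leaf hash, its index, the tree size, the path, and the root.
fn verify_audit_path(leaf: &[u8], index: usize, size: usize, path: &[Vec<u8>], root: &[u8]) -> bool {
    if index >= size {
        return false;
    }
    let (mut f, mut s) = (index, size - 1);
    let mut r = leaf.to_vec();
    for p in path {
        if s == 0 {
            return false; // path is longer than the tree is deep
        }
        if f & 1 == 1 || f == s {
            r = node_hash(p, &r); // sibling sits on the left
            while f & 1 == 0 && f != 0 {
                f >>= 1;
                s >>= 1;
            }
        } else {
            r = node_hash(&r, p); // sibling sits on the right
        }
        f >>= 1;
        s >>= 1;
    }
    s == 0 && r == root
}

fn main() {
    let leaves: Vec<Vec<u8>> = (0u8..7).map(|i| leaf_hash(&[i])).collect();
    let root = root_of(&leaves);
    for i in 0..leaves.len() {
        let path = audit_path(i, &leaves);
        assert!(verify_audit_path(&leaves[i], i, leaves.len(), &path, &root));
    }
    // A forged record fails against the same path and root.
    let path = audit_path(3, &leaves);
    assert!(!verify_audit_path(&leaf_hash(b"forged"), 3, leaves.len(), &path, &root));
}
```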
Signed integrity checkpoints
The spine proves partial tampering and append-only history, but not a full rewrite: an insider with disk access can delete the spine and every segment and replay a self-consistent tree from scratch. Checkpoints close that gap by pinning the tree under a signature:
| field | meaning |
|---|---|
| created_at | UTC micros at signing time |
| segment_seq | logs segment that was active at checkpoint time |
| record_count | log records on disk (decreases after retention) |
| tree_size | total records ever appended — the Merkle tree size (monotonic) |
| merkle_root | Merkle root over the first tree_size leaves |
| signature | RSA PKCS#1 v1.5 / SHA-256 over the fields above |
Each checkpoint is appended to checkpoints.log at the store root. Segment files aren't touched, so a store that never checkpoints stays byte-identical on disk.
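For orientation, a checkpoint record might be modeled roughly as below. The field names come from the table; the integer widths, the 32-byte root, the big-endian byte layout in `signed_bytes`, and the helper `plausible_successor` are all assumptions made for this sketch, since the README does not specify the on-disk encoding. The signature is left empty because RSA signing is not available in Rust's standard library.

```rust
// Hypothetical in-memory shape of one checkpoint; widths are assumptions.
struct Checkpoint {
    created_at: u64,       // UTC microseconds at signing time
    segment_seq: u64,      // segment active at checkpoint time
    record_count: u64,     // records on disk; may shrink after retention
    tree_size: u64,        // records ever appended; monotonic
    merkle_root: [u8; 32], // root over the first tree_size leaves
    signature: Vec<u8>,    // RSA PKCS#1 v1.5 / SHA-256 over the fields above
}

impl Checkpoint {
    // ASSUMED layout: the bytes a signer would cover, i.e. every field
    // except the signature, as fixed-width big-endian integers plus the root.
    fn signed_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(64);
        for v in [self.created_at, self.segment_seq, self.record_count, self.tree_size] {
            out.extend_from_slice(&v.to_be_bytes());
        }
        out.extend_from_slice(&self.merkle_root);
        out
    }
}

// Sanity rules inferred from the field descriptions, not confirmed by the crate:
// tree_size never goes backwards, and records on disk cannot exceed records ever appended.
fn plausible_successor(prev: &Checkpoint, next: &Checkpoint) -> bool {
    next.tree_size >= prev.tree_size && next.record_count <= next.tree_size
}

fn main() {
    let first = Checkpoint {
        created_at: 1_700_000_000_000_000,
        segment_seq: 3,
        record_count: 1_000,
        tree_size: 1_000,
        merkle_root: [0xab; 32],
        signature: Vec::new(), // left empty in this sketch
    };
    // After a retention purge: fewer records on disk, tree_size still grows.
    let second = Checkpoint {
        created_at: 1_700_000_600_000_000,
        segment_seq: 4,
        record_count: 400,
        tree_size: 1_250,
        merkle_root: [0xcd; 32],
        signature: Vec::new(),
    };
    assert!(plausible_successor(&first, &second));
    println!("{} bytes would be signed, signature is {} bytes", second.signed_bytes().len(), second.signature.len());
}
```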
When checkpoints are written
- On segment seal. A seal is when a prefix becomes immutable. Its frequency is bounded by max_segment_bytes, unlike the flush/sync path, which would put an RSA signing operation on the every-N-appends hot path.
- After a retention purge. Re-checkpointing keeps verification independent of legitimately deleted records.
- On demand via AuditStore::checkpoint(), from a scheduler or before a backup.
Verification
verify_integrity() re-derives each surviving record's leaf against the spine, then, if checkpoints exist:
- verifies every checkpoint signature with the RSA public key;
- confirms the latest checkpoint's merkle_root is consistent with the current tree: identical when tree_size matches, otherwise via a consistency proof.
The spine is never purged, so the current tree is always an extension of any honest checkpoint. A rewritten tree can't reproduce the checkpointed root; a truncated tail makes the current tree_size smaller than the checkpoint's. Both fail verification.
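The two failure cases can be sketched as follows. This is a simplified stand-in, not the crate's verify_integrity(): because a local verifier holds the whole spine, the sketch recomputes the root over the first tree_size leaves instead of walking an O(log n) consistency proof, which establishes the same prefix relationship. `Verdict`, `check_against_checkpoint`, `root_of`, and the 64-bit stand-in hash `h` are hypothetical names introduced here, and signature checking is omitted.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// ASSUMPTION: 64-bit stand-in for SHA-256 (std has none); not collision-resistant.
fn h(data: &[u8]) -> Vec<u8> {
    let mut s = DefaultHasher::new();
    s.write(data);
    s.finish().to_be_bytes().to_vec()
}
fn leaf_hash(payload: &[u8]) -> Vec<u8> {
    h(&[&[0x00u8][..], payload].concat())
}
fn node_hash(left: &[u8], right: &[u8]) -> Vec<u8> {
    h(&[&[0x01u8][..], left, right].concat())
}
fn root_of(leaves: &[Vec<u8>]) -> Vec<u8> {
    match leaves.len() {
        0 => h(&[]),
        1 => leaves[0].clone(),
        n => {
            let k = n.next_power_of_two() / 2;
            node_hash(&root_of(&leaves[..k]), &root_of(&leaves[k..]))
        }
    }
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Consistent,   // current tree equals or extends the checkpointed tree
    Truncated,    // current tree is smaller than the checkpoint claims
    RootMismatch, // the first tree_size leaves no longer hash to the checkpointed root
}

// Compare the current spine (leaf hashes) with the latest checkpoint's (tree_size, merkle_root).
fn check_against_checkpoint(spine: &[Vec<u8>], cp_size: usize, cp_root: &[u8]) -> Verdict {
    if spine.len() < cp_size {
        return Verdict::Truncated;
    }
    if root_of(&spine[..cp_size]) == cp_root {
        Verdict::Consistent
    } else {
        Verdict::RootMismatch
    }
}

fn main() {
    let mut spine: Vec<Vec<u8>> = (0u8..5).map(|i| leaf_hash(&[i])).collect();
    let (cp_size, cp_root) = (spine.len(), root_of(&spine)); // checkpoint taken at size 5

    // Honest growth: the checkpointed tree is a prefix of the current one.
    spine.push(leaf_hash(b"later record"));
    assert_eq!(check_against_checkpoint(&spine, cp_size, &cp_root), Verdict::Consistent);

    // Full rewrite: a fresh, self-consistent tree cannot reproduce the old root.
    let rewritten: Vec<Vec<u8>> = (10u8..16).map(|i| leaf_hash(&[i])).collect();
    assert_eq!(check_against_checkpoint(&rewritten, cp_size, &cp_root), Verdict::RootMismatch);

    // Truncated tail: fewer leaves than the checkpoint recorded.
    assert_eq!(check_against_checkpoint(&spine[..3], cp_size, &cp_root), Verdict::Truncated);
}
```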
Write-only deployments
Signing needs the RSA private key. A log-producing service configured with only the public key (the recommended split — it can encrypt fields but never read them back) can't sign, so checkpointing is silently disabled there. checkpoint() returns Ok(None), segment seals skip the step, no checkpoints.log is created, and verify_integrity() simply has no checkpoints to check.
This is a deliberate choice, not an error: write-path availability comes first. Run checkpointing where the private key lives, or accept spine-only tamper evidence on write-only nodes. (The Merkle root and proofs still work without a key — only the signed anchor needs one.)
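From the caller's side, the Ok(None) result is a third outcome to handle rather than a failure. The sketch below mimics that shape with a hypothetical `SketchStore`; the real AuditStore's fields, error type, and return payload are not shown in this README, so the types here are placeholders.

```rust
// Hypothetical stand-in for a store; only the presence of a private key matters here.
struct SketchStore {
    private_key: Option<Vec<u8>>, // None on a write-only node
    tree_size: u64,
}

impl SketchStore {
    // Mirrors the described contract: no private key means "skipped", not "failed".
    // The returned u64 (the checkpointed tree size) is a placeholder payload.
    fn checkpoint(&self) -> Result<Option<u64>, String> {
        match &self.private_key {
            None => Ok(None),
            // Real code would sign the checkpoint and append it to checkpoints.log here.
            Some(_key) => Ok(Some(self.tree_size)),
        }
    }
}

fn describe(store: &SketchStore) -> String {
    match store.checkpoint() {
        Ok(Some(size)) => format!("checkpoint written at tree_size {}", size),
        Ok(None) => "checkpointing disabled on this node (no private key)".to_string(),
        Err(e) => format!("checkpoint failed: {}", e),
    }
}

fn main() {
    let writer = SketchStore { private_key: None, tree_size: 10 };
    let signer = SketchStore { private_key: Some(vec![1, 2, 3]), tree_size: 10 };
    println!("{}", describe(&writer));
    println!("{}", describe(&signer));
}
```

A scheduler that expects signed checkpoints may want to treat Ok(None) as a configuration warning, since the store itself stays silent about it.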
External anchoring
A checkpoint inside the store still shares the store's fate. The scheme assumes the insider doesn't hold the signing key — a key-holding insider could re-sign a rewritten tree. The anchor hook exports each checkpoint somewhere the insider can't rewrite:
let cfg = new
.keys
.anchor;
The hook runs synchronously right after each checkpoint is persisted. Errors and panics inside it are swallowed — write-path availability outranks anchoring, so delivery guarantees (queueing, retries) are the hook's job. Comparing the anchored roots against checkpoints.log later proves the checkpoint file itself wasn't rewritten — and anyone holding an anchored (tree_size, root) can demand a consistency proof against the live store.
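One way such a hook boundary can be built in Rust is with std::panic::catch_unwind, discarding both an Err return and a panic. The sketch below is an assumption about the general technique, not the crate's code: `Checkpoint`, `AnchorHook`, and `run_anchor` are hypothetical, and the real hook signature is not shown in this README. It also shows a hook that only enqueues the (tree_size, root) pair on a channel, leaving delivery and retries to a separate worker.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::mpsc::channel;

// Hypothetical minimal checkpoint view handed to the hook.
struct Checkpoint {
    tree_size: u64,
    merkle_root: [u8; 32],
}

// Hypothetical hook type: may fail or panic, neither may reach the write path.
type AnchorHook = Box<dyn Fn(&Checkpoint) -> Result<(), String>>;

// Run the hook synchronously; swallow errors and panics alike.
// (The default panic handler still prints the panic message to stderr.)
fn run_anchor(hook: &AnchorHook, cp: &Checkpoint) {
    let _ = catch_unwind(AssertUnwindSafe(|| hook(cp)));
}

fn main() {
    // A hook that only enqueues; a background worker would drain `rx` and retry delivery.
    let (tx, rx) = channel::<(u64, [u8; 32])>();
    let queueing: AnchorHook = Box::new(move |cp| {
        tx.send((cp.tree_size, cp.merkle_root)).map_err(|e| e.to_string())
    });
    let failing: AnchorHook = Box::new(|_| Err("anchor endpoint unreachable".to_string()));
    let panicking: AnchorHook = Box::new(|_| -> Result<(), String> { panic!("bug in hook") });

    let cp = Checkpoint { tree_size: 42, merkle_root: [7; 32] };
    for hook in [&queueing, &failing, &panicking] {
        run_anchor(hook, &cp); // none of these interrupt the caller
    }

    let (size, root) = rx.try_recv().expect("queued by the first hook");
    assert_eq!(size, 42);
    assert_eq!(root, [7; 32]);
}
```

Because failures are invisible to the store, a hook written this way should record its own errors somewhere the operator will see them.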
Threat model summary
| attack | caught by |
|---|---|
| record edited in place (even with fixed CRC) | Merkle spine (leaf mismatch) |
| segment removed / reordered / replaced | Merkle root changes |
| full delete-and-rewrite of the tree | latest checkpoint root |
| newest records truncated | latest checkpoint tree_size / consistency proof |
| checkpoint file forged or edited | RSA signature |
| checkpoint file rewritten by the key holder | external anchor comparison |
| operator hides/forges a record from a third party | inclusion proof against the anchored root |
| operator silently edits past history | consistency proof against an anchored root |