quipu-core

English | 한국어

The embedded storage and query core of Quipu-Log: append-only segments backed by an RFC 6962 Merkle history tree, versioned entity registries, field-level protection, retention, and time-travel queries. This crate is the synchronous engine. The async pipeline and HTTP proxy live in quipu-middleware and quipu-server.

Integrity: a Merkle history tree

Every record's leaf hash (SHA-256(0x00 || payload)) is appended to a Merkle spine — a retention-independent file (merkle.spine) per table whose root commits to every record ever appended (RFC 6962 history tree; node = SHA-256(0x01 || left || right)). The root subsumes a hash chain's tamper-evidence — a record edited in place no longer hashes to its spine leaf, and removing or reordering a segment changes the root — and adds something a chain can't give: independent, third-party verification in O(log n).

Inclusion proof — "record E is committed to root R" with an audit path, verifiable without the rest of the log and without trusting the operator.
Consistency proof — "the tree of size m is a prefix of the tree of size n", i.e. the history in between is append-only: nothing was edited or removed.

The spine holds only hashes (no payloads), so it is never purged by retention — which is what lets the root and every proof survive after the underlying records age out. Segment frames carry just [len][crc][ts][payload]; the CRC catches accidental corruption, the spine catches tampering.

let proof = store.prove_inclusion(log_id)?;           // O(log n) audit path
let root  = store.merkle_root();                       // the value an anchor pins
assert!(quipu_core::merkle::verify_inclusion(
    &proof.leaf, proof.leaf_index as usize,
    proof.tree_size as usize, &proof.path, &root));

let c = store.prove_consistency(earlier_size)?;        // append-only between sizes

Signed integrity checkpoints

The spine proves partial tampering and append-only history, but not a full rewrite: an insider with disk access can delete the spine and every segment and replay a self-consistent tree from scratch. Checkpoints close that gap by pinning the tree under a signature:

field	meaning
`created_at`	UTC micros at signing time
`segment_seq`	logs segment that was active at checkpoint time
`record_count`	log records on disk (decreases after retention)
`tree_size`	total records ever appended — the Merkle tree size (monotonic)
`merkle_root`	Merkle root over the first `tree_size` leaves
`signature`	RSA PKCS#1 v1.5 / SHA-256 over the fields above

It appends that to checkpoints.log at the store root. Segment files aren't touched, so a store that never checkpoints stays byte-identical on disk.

When checkpoints are written

On segment seal. A seal is when a prefix becomes immutable. Its frequency is bounded by max_segment_bytes — unlike the flush/sync path, which would put an RSA signing operation on the every-N-appends hot path.
After a retention purge. Re-checkpointing keeps verification independent of legitimately deleted records.
On demand via AuditStore::checkpoint() — from a scheduler, or before a backup.

Verification

verify_integrity() re-derives each surviving record's leaf against the spine, then, if checkpoints exist:

verifies every checkpoint signature with the RSA public key;
confirms the latest checkpoint's merkle_root is consistent with the current tree — identical when tree_size matches, otherwise via a consistency proof.

The spine is never purged, so the current tree is always an extension of any honest checkpoint. A rewritten tree can't reproduce the checkpointed root; a truncated tail makes the current tree_size smaller than the checkpoint's. Both fail verification.

Write-only deployments

Signing needs the RSA private key. A log-producing service configured with only the public key (the recommended split — it can encrypt fields but never read them back) can't sign, so checkpointing is silently disabled there. checkpoint() returns Ok(None), segment seals skip the step, no checkpoints.log is created, and verify_integrity() simply has no checkpoints to check.

This is a deliberate choice, not an error: write-path availability comes first. Run checkpointing where the private key lives, or accept spine-only tamper evidence on write-only nodes. (The Merkle root and proofs still work without a key — only the signed anchor needs one.)

External anchoring

A checkpoint inside the store still shares the store's fate. The scheme assumes the insider doesn't hold the signing key — a key-holding insider could re-sign a rewritten tree. The anchor hook exports each checkpoint somewhere the insider can't rewrite:

let cfg = StoreConfig::new("/var/audit")
    .keys(keys)
    .anchor(|cp| {
        // ship (created_at, cp.tree_size, cp.merkle_root_hex()) anywhere outside
        // the host: another machine, a ticket, a transparency log, a printed report
    });

The hook runs synchronously right after each checkpoint is persisted. Errors and panics inside it are swallowed — write-path availability outranks anchoring, so delivery guarantees (queueing, retries) are the hook's job. Comparing the anchored roots against checkpoints.log later proves the checkpoint file itself wasn't rewritten — and anyone holding an anchored (tree_size, root) can demand a consistency proof against the live store.

Threat model summary

attack	caught by
record edited in place (even with fixed CRC)	Merkle spine (leaf mismatch)
segment removed / reordered / replaced	Merkle root changes
full delete-and-rewrite of the tree	latest checkpoint root
newest records truncated	latest checkpoint `tree_size` / consistency proof
checkpoint file forged or edited	RSA signature
checkpoint file rewritten by the key holder	external anchor comparison
operator hides/forges a record from a third party	inclusion proof against the anchored root
operator silently edits past history	consistency proof against an anchored root

quipu-core 0.2.0