ant-node 0.10.0

# Replication Logic Specification (Codebase-Agnostic)

> Status: Design-level specification for pre-implementation validation.

## 1. Purpose

This document specifies replication behavior as a pure system design, independent of any language, framework, transport, or existing codebase.
It is designed for Kademlia-style decentralized networks, and assumes Kademlia nearest-peer routing semantics.

Primary goal: validate correctness, safety, and liveness of replication logic before implementation.

## 2. Scope

### In scope

- Permanent record replication in a decentralized key-addressed network.
- Churn-aware maintenance and proactive repair.
- Admission control, quorum verification, and storage audits.

### Out of scope

- Concrete wire formats and RPC APIs.
- Disk layout, serialization details, and database choices.
- Cryptographic algorithm selection beyond required properties.

## 3. System Model

- `Node`: participant with routing view, local store, and replication worker.
- `LocalRT(N)`: node `N`'s current authenticated local routing-table peer set (does not include `N` itself).
- `SelfInclusiveRT(N)`: derived local view `LocalRT(N) ∪ {N}` used for responsibility-range and local membership evaluations that must treat `N` as a candidate.
- `CloseNeighbors(N)`: the `NEIGHBOR_SYNC_SCOPE` nearest peers to `N`'s own address in `LocalRT(N)`, ordered by distance to `N`. This is the set of peers eligible for neighbor-sync repair. Recomputed from `LocalRT(N)` at each cycle snapshot.
- `NeighborSyncOrder(N)`: deterministic ordering of peers, snapshotted from `CloseNeighbors(N)` at the start of each round-robin cycle. Peers joining `CloseNeighbors(N)` mid-cycle are not added (they enter the next cycle's snapshot). Peers may be removed from the snapshot mid-cycle if they are on per-peer cooldown or unreachable during sync.
- `NeighborSyncCursor(N)`: index into the current `NeighborSyncOrder(N)` snapshot indicating the next peer position to schedule. Valid for the lifetime of the snapshot.
- `NeighborSyncSet(N)`: current round's up-to-`NEIGHBOR_SYNC_PEER_COUNT` peers selected from `NeighborSyncOrder(N)` starting at `NeighborSyncCursor(N)`; periodic repair sync partners for `N`.
- `NeighborSyncCycleComplete(N)`: event that fires when node `N`'s cursor reaches or exceeds the end of the current `NeighborSyncOrder(N)` snapshot (all remaining peers synced, on cooldown, or unreachable). Triggers post-cycle pruning (Section 11) and a fresh snapshot from current `CloseNeighbors(N)` for the next cycle.
- `Record`: immutable, content-addressed data unit with key `K`.
- `Distance(K, N)`: deterministic distance metric between key and node identity.
- `CloseGroup(K)`: the `CLOSE_GROUP_SIZE` nearest nodes to key `K`.
- `IsResponsible(N, K)`: true if `N` is among the `CLOSE_GROUP_SIZE` nearest nodes to `K` in `SelfInclusiveRT(N)`.
- `Holder`: node that stores a valid copy of a record.
- `RecordOutOfRangeFirstSeen(N, K)`: per-key timestamp recording when key `K` was first continuously observed as out of storage-responsibility range on node `N` (`IsResponsible(N, K)` is false). Cleared (set to `None`) when `K` is back in range.
- `PaidOutOfRangeFirstSeen(N, K)`: per-key timestamp recording when key `K` was first continuously observed as out of paid-list range on node `N` (`N ∉ PaidCloseGroup(K)`). Cleared (set to `None`) when `N` re-enters `PaidCloseGroup(K)`. Independent of `RecordOutOfRangeFirstSeen`.
- `PoP`: verifiable proof that a record was authorized for initial storage/payment policy.
- `PaidNotify(K)`: fresh-replication paid-list notification carrying key `K` plus PoP/payment proof material needed for receiver-side verification and whitelisting.
- `PaidForList(N)`: persistent set of keys node `N` currently believes are paid-authorized; MUST survive node restarts.
- `PaidCloseGroup(K)`: `PAID_LIST_CLOSE_GROUP_SIZE` nearest nodes to key `K` that participate in paid-list consensus, evaluated from the querying node's local view using `SelfInclusiveRT(querying_node)`.
- `PaidGroupSize(K)`: effective paid-list consensus set size for key `K`, defined as `|PaidCloseGroup(K)|`.
- `ConfirmNeeded(K)`: dynamic paid-list confirmation count for key `K`, defined as `floor(PaidGroupSize(K)/2)+1`.
- `QuorumTargets(K)`: up to `CLOSE_GROUP_SIZE` nearest known peers for key `K` in `LocalRT(self)`, excluding `self`; used as the candidate peer set for presence checks.
- `QuorumNeeded(K)`: effective presence confirmation count for key `K`, defined as `min(QUORUM_THRESHOLD, floor(|QuorumTargets(K)|/2)+1)`.
- `BootstrapDrained(N)`: bootstrap-completion gate for node `N`; true only when peer discovery closest to `N`'s own address has populated `LocalRT(N)`, bootstrap peer requests are finished (response or timeout), and bootstrap work queues are empty (`PendingVerify`, `FetchQueue`, `InFlightFetch` for bootstrap-discovered keys).
- `RepairOpportunity(P, KSet)`: evidence that peer `P` has previously received replication hints/offers for keys in `KSet` and had at least one subsequent neighbor-sync cycle to repair before audit evaluation.
- `BootstrapClaimFirstSeen(N, P)`: timestamp when node `N` first observed peer `P` responding with a bootstrapping claim to a sync or audit request. Reset when `P` stops claiming bootstrap status.
- `TrustEngine`: local trust subsystem (EMA-based response-rate scoring with time decay) that consumes replication evidence events via `AdaptiveDHT::report_trust_event`, updates peer trust scores, and triggers peer eviction when scores fall below `block_threshold`. Consumer-reported events use `TrustEvent::ApplicationSuccess(weight)` / `TrustEvent::ApplicationFailure(weight)` with weight clamped to `MAX_CONSUMER_WEIGHT` (5.0).

## 4. Tunable Parameters

All parameters are configurable. Values below are a reference profile used for logic validation.

| Parameter | Meaning | Reference                           |
|---|---|-------------------------------------|
| `K_BUCKET_SIZE` | Maximum number of peers per k-bucket in the Kademlia routing table | `20`                                |
| `CLOSE_GROUP_SIZE` | Close-group width and target holder count per key | `7`                                 |
| `QUORUM_THRESHOLD` | Full-network target for required positive presence votes (effective per-key threshold is `QuorumNeeded(K)`) | `floor(CLOSE_GROUP_SIZE/2)+1` (`4`) |
| `PAID_LIST_CLOSE_GROUP_SIZE` | Maximum number of closest nodes tracking paid status for a key | `20`                                |
| `NEIGHBOR_SYNC_SCOPE` | Number of closest peers to self eligible for neighbor sync | `20`                                |
| `NEIGHBOR_SYNC_PEER_COUNT` | Number of close-neighbor peers synced concurrently per round-robin repair round | `4`                                 |
| `NEIGHBOR_SYNC_INTERVAL` | Neighbor sync cadence | random in `[10 min, 20 min]`        |
| `NEIGHBOR_SYNC_COOLDOWN` | Per-peer min spacing between successive syncs with the same peer | `1h`                                |
| `SELF_LOOKUP_INTERVAL` | Periodic self-lookup cadence to keep close neighborhood current | random in `[5 min, 10 min]`         |
| `MAX_PARALLEL_FETCH_BOOTSTRAP` | Bootstrap concurrent fetches | `20`                                |
| `AUDIT_TICK_INTERVAL` | Audit scheduler cadence | random in `[30 min, 1 hour]`        |
| *(dynamic)* | Audit sample count per round: `floor(sqrt(local_key_count))` | scales with store size |
| `AUDIT_RESPONSE_TIMEOUT` | Audit response deadline | `12s`                               |
| `BOOTSTRAP_CLAIM_GRACE_PERIOD` | Max duration a peer may claim bootstrap status before penalties apply | `24h`                               |
| `PRUNE_HYSTERESIS_DURATION` | Minimum continuous out-of-range duration before pruning a key | `6h`                                |

Parameter safety constraints (MUST hold):

1. `1 <= QUORUM_THRESHOLD <= CLOSE_GROUP_SIZE`.
2. Effective paid-list authorization threshold is per-key dynamic: `ConfirmNeeded(K) = floor(PaidGroupSize(K)/2)+1`.
3. If constraints are violated at runtime reconfiguration, node MUST reject the config and keep the previous valid config.

## 5. Core Invariants (Must Hold)

1. A record is accepted only if it passes integrity and responsibility checks.
2. Neighbor-sync repair traffic passes verification only if either condition holds: paid confirmations `>= ConfirmNeeded(K)` across `PaidCloseGroup(K)`, or presence positives `>= QuorumNeeded(K)` from `QuorumTargets` (which also derives paid-list authorization via close-group replica majority).
3. Fresh replication bypasses presence quorum only when PoP is valid.
4. Neighbor-sync hints are accepted only from authenticated peers currently in `LocalRT(self)`; hints from peers outside `LocalRT(self)` are dropped.
5. Presence probes return only binary key-presence evidence (`Present` or `Absent`).
6. `CLOSE_GROUP_SIZE` is both the close-group width and the target holder count, not guaranteed send fanout.
7. Receiver stores only records in its current responsible range.
8. Queue dedup prevents duplicate pending/fetch work for same key.
9. Replication emits trust evidence/penalty signals to `TrustEngine` (via `AdaptiveDHT::report_trust_event`); trust-score thresholds and eviction decisions are outside replication logic.
10. Security policy is explicit: anti-injection may sacrifice recovery of data that is simultaneously below presence quorum AND has lost paid-list authorization (including derived authorization from close-group replica majority).
11. Neighbor-sync scheduling is deterministic and round-robin, and every neighbor-sync hint exchange reaches a deterministic terminal state.
12. Presence no-response/timeout is unresolved (neutral), not an explicit negative vote.
13. A failed fetch retries from alternate verified sources before abandoning. Verification evidence is preserved across fetch retries.
14. Paid-list authorization is key-scoped and majority-based across `PaidCloseGroup(K)`, not node-global.
15. `PaidForList(N)` MUST be persisted to stable storage and is bounded: node `N` tracks only keys for which `N` is in `PaidCloseGroup(K)` (plus short-lived transition slack).
16. Fresh-replication paid-list propagation is mandatory: sender MUST attempt `PaidNotify(K)` delivery to every peer in `PaidCloseGroup(K)` (reference profile: up to 20 peers when available), not a subset.
17. A `PaidNotify(K)` only whitelists key `K` after receiver-side proof verification succeeds; sender assertions never whitelist by themselves.
18. Neighbor-sync paid hints are non-authoritative and carry no PoP; receivers MUST only whitelist by paid-list majority verification (`>= ConfirmNeeded(K)`) or close-group replica majority (Section 7.2 rule 4), never by hint claims alone, and paid-hint-only processing MUST NOT enqueue record fetch.
19. Storage-proof audits start only after `BootstrapDrained(self)` becomes true.
20. Storage-proof audits target only peers derived from closest-peer lookups for sampled local keys, filtered through local authenticated routing state (`LocalRT(self)`), and further filtered to peers for which `RepairOpportunity` holds; random global peers and never-synced peers are never audited.
21. Verification-request batching is mandatory for unknown-key neighbor-sync verification and preserves per-key quorum semantics: each key receives explicit per-key evidence, and missing/timeout evidence is unresolved per key.
22. On every `NeighborSyncCycleComplete(self)`, node MUST run a prune pass using current `SelfInclusiveRT(self)`: for stored records where `IsResponsible(self, K)` is false, record `RecordOutOfRangeFirstSeen` if not already set and delete only when `now - RecordOutOfRangeFirstSeen >= PRUNE_HYSTERESIS_DURATION`; clear `RecordOutOfRangeFirstSeen` when back in range. For `PaidForList` entries where `self ∉ PaidCloseGroup(K)`, record `PaidOutOfRangeFirstSeen` if not already set and delete only when `now - PaidOutOfRangeFirstSeen >= PRUNE_HYSTERESIS_DURATION`; clear `PaidOutOfRangeFirstSeen` when back in range. The two timestamps are independent.
23. Peers claiming bootstrap status are skipped for sync and audit without penalty for up to `BOOTSTRAP_CLAIM_GRACE_PERIOD` from first observation. After the grace period, each continued bootstrap claim emits `BootstrapClaimAbuse` evidence to `TrustEngine` (via `report_trust_event` with `ApplicationFailure(weight)`).
24. Audit trust-penalty signals require responsibility confirmation: on audit failure, challenger MUST perform fresh local RT closest-peer lookups for each challenged key and only penalize the peer for keys where it is confirmed responsible.

## 6. Replication

### 6.1 Fresh Replication

Trigger: node accepts a newly written record with valid PoP.

Rules:

1. Store locally after normal validation.
2. Compute holder target set for the key with size `CLOSE_GROUP_SIZE`.
3. Send fresh offers to remote target members only.
4. Fresh offer MUST include PoP.
5. Receiver MUST reject fresh path if PoP is missing or invalid.
6. A node that validates PoP for key `K` MUST add `K` to `PaidForList(self)`.
7. In parallel with record propagation, sender MUST send `PaidNotify(K)` to every member of `PaidCloseGroup(K)` and include the PoP for receiver verification.
8. Sender sends `PaidNotify(K)` with PoP to each peer in `PaidCloseGroup(K)` once (fire-and-forget, no ack tracking or retry).

### 6.2 Neighbor Replication Sync

Triggers:

- Periodic randomized timer (`NEIGHBOR_SYNC_INTERVAL`).

Rules:

1. At the start of each round-robin cycle, node computes `CloseNeighbors(self)` as the `NEIGHBOR_SYNC_SCOPE` nearest peers to self in `LocalRT(self)`, then snapshots `NeighborSyncOrder(self)` as a deterministic ordering of those peers and resets `NeighborSyncCursor(self)` to `0`. The snapshot is fixed for the entire cycle; peers joining `CloseNeighbors(self)` mid-cycle are not added to the current snapshot (they enter the next cycle's snapshot).
2. Node selects `NeighborSyncSet(self)` by scanning `NeighborSyncOrder(self)` forward from `NeighborSyncCursor(self)`:
   a. If a candidate peer is on per-peer cooldown (`NEIGHBOR_SYNC_COOLDOWN` not elapsed since last successful sync with that peer), remove the peer from `NeighborSyncOrder(self)` and continue scanning.
   b. Otherwise, add the peer to `NeighborSyncSet(self)`.
   c. Stop when `|NeighborSyncSet(self)| = NEIGHBOR_SYNC_PEER_COUNT` or no unscanned peers remain in the snapshot.
3. Node initiates sync with each peer in `NeighborSyncSet(self)`. If a peer cannot be synced, remove it from `NeighborSyncOrder(self)` and attempt to fill the vacated slot by resuming the scan from where rule 2 left off. A peer cannot be synced if:
   a. Unreachable (connection failure/timeout).
   b. Peer responds with a bootstrapping claim. On first observation, record `BootstrapClaimFirstSeen(self, peer)`. If `now - BootstrapClaimFirstSeen(self, peer) <= BOOTSTRAP_CLAIM_GRACE_PERIOD`, accept the claim and skip without penalty. If the grace period has elapsed, emit `BootstrapClaimAbuse` evidence to `TrustEngine` (via `report_trust_event` with `ApplicationFailure(weight)`) and skip.
4. On any sync session open (outbound or inbound), receiver validates peer authentication and checks current local route membership (`peer ∈ LocalRT(self)`).
5. If `peer ∈ LocalRT(self)`, sync is bidirectional: both sides send and receive peer-targeted hint sets.
6. If `peer ∉ LocalRT(self)`, sync is outbound-only from receiver perspective: receiver MAY send hints to that peer, but MUST NOT accept replica or paid-list hints from that peer.
7. In each session, sender-side hint construction uses peer-targeted sets:
    - `ReplicaHintsForPeer`: keys the sender believes the receiver should hold (receiver is among the `CLOSE_GROUP_SIZE` nearest to `K` in sender's `SelfInclusiveRT`).
    - `PaidHintsForPeer`: keys the sender believes the receiver should track in `PaidForList` (receiver is among the `PAID_LIST_CLOSE_GROUP_SIZE` nearest to `K` in sender's `SelfInclusiveRT`).
8. Transport-level chunking/fragmentation is implementation detail and out of scope for replication logic.
9. Receiver treats hint sets as unordered collections and deduplicates repeated keys. If a key appears in both `ReplicaHintsForPeer` and `PaidHintsForPeer` in the same session, receiver MUST keep only the replica-hint entry and drop the paid-hint duplicate (single-pipeline processing).
10. Receiver diffs replica hints against local store and pending sets, then runs per-key admission rules before quorum logic.
11. Receiver launches quorum checks exactly once per admitted unknown replica key.
12. Only admitted unknown replica keys that pass presence quorum or paid-list authorization are queued for fetch.
13. Receiver processes unknown paid hints via Section 7.2 majority checks in a paid-list pipeline: successful checks may update `PaidForList(self)` but MUST NOT queue record fetch. If the same key is also present in replica hints, rule 9 drops the paid-hint duplicate and fetch behavior is governed only by the replica-hint pipeline.
14. Sync payloads MUST NOT include PoP material; PoP remains fresh-replication-only.
15. Nodes SHOULD use ongoing neighbor sync rounds to re-announce paid hints for locally paid keys to improve paid-list convergence.
16. After each round, node sets `NeighborSyncCursor(self)` to the position after the last scanned peer in the (possibly shrunk) snapshot. Peers removed during scanning (cooldown or unreachable) do not occupy cursor positions — the cursor reflects the snapshot's state after removals.
17. When `NeighborSyncCursor(self) >= |NeighborSyncOrder(self)|`, the cycle is complete (`NeighborSyncCycleComplete(self)`). Node MUST execute post-cycle responsibility pruning (Section 11), then recompute `CloseNeighbors(self)` from current `LocalRT(self)`, take a fresh snapshot, and reset the cursor to `0` to begin the next cycle.

Rate control:

- `NEIGHBOR_SYNC_INTERVAL` governs the global sync timer cadence (how often batch selection runs).
- `NEIGHBOR_SYNC_COOLDOWN` is per-peer: a peer is skipped and removed from the snapshot if it was last successfully synced within `NEIGHBOR_SYNC_COOLDOWN`.

## 7. Authorization and Admission Rules

### 7.1 Neighbor-Sync Hint Admission (Per Key)

For each hinted key `K`, receiver accepts the hint into verification only if both conditions hold:

1. Sender is authenticated and currently in `LocalRT(self)`.
2. Key is relevant to the receiver:
    - Replica hint: receiver is currently responsible (`IsResponsible(self, K)`) or key already exists in local store/pending pipeline.
    - Paid hint: receiver is currently in `PaidCloseGroup(K)` (or key is already in local `PaidForList` pending cleanup). This admission is paid-list-tracking only and does not make the key fetch-eligible by itself.

Notes:

- Authorization decision is local-route-state only.
- Hints from peers outside current `LocalRT(self)` are dropped immediately.
- For inbound sync sessions from peers outside `LocalRT(self)`, receiver may send outbound hints but does not accept inbound hints.
- Mixed hint sets are valid: process admitted keys, drop non-admitted keys.
- Cross-set precedence is strict: if key `K` is present in both admitted replica hints and admitted paid hints, process `K` only in the replica-hint pipeline and drop the paid-hint duplicate.
- Admitted paid hints can update `PaidForList(self)` after verification but never enqueue record fetch. If the same key is also in replica hints, the paid-hint duplicate is discarded and fetch eligibility is decided only by the replica-hint pipeline.
- Receiver MAY return rejected-key metadata to help sender avoid repeating obviously invalid hints in immediate subsequent sync attempts.

### 7.2 Paid-List Authorization (Per Key)

When handling an admitted unknown key `K` from neighbor sync:

1. If `K` is already in local `PaidForList`, paid-list authorization succeeds immediately.
2. Otherwise run the single verification round defined in Section 9 and collect paid-list responses from peers in `PaidCloseGroup(K)` (same round as presence evidence; no separate paid-list-only round).
3. If paid confirmations from `PaidCloseGroup(K)` are `>= ConfirmNeeded(K)`, add `K` to local `PaidForList` and treat `K` as paid-authorized.
4. If presence positives from `QuorumTargets` (the node's local approximation of `CloseGroup(K)`, computed in Section 9 step 3) during the same verification round reach `>= QuorumNeeded(K)` (close-group replica majority), add `K` to local `PaidForList` and treat `K` as paid-authorized. Close-group replica majority constitutes derived evidence of prior authorization and serves as a paid-list recovery path after cold starts or persistence failures.
5. Fetch gating is strict: only keys in the admitted replica-hint pipeline are fetch-eligible. Keys admitted only via paid hints MUST NOT be queued for fetch, even when rules 1, 3, or 4 succeed.
6. If neither paid-list confirmations (rule 3) nor close-group replica majority via presence evidence (rule 4) are met, paid-list authorization fails for this verification round.
7. Nodes answering paid-list queries MUST answer from local `PaidForList` state only; they MUST NOT infer paid status from record presence alone. (Derived paid-list entries from rule 4 are added to `PaidForList` and are thereafter indistinguishable from PoP-derived entries when answering queries.)
8. If a node learns `K` is paid-authorized by majority or close-group replica majority, it SHOULD include `K` in outbound `PaidHintsForPeer` for relevant neighbors so peers can re-check and converge.
9. Unknown paid hints that fail majority confirmation are dropped for this lifecycle and require a new hint/session to re-enter.

### 7.3 Fresh-Replication Paid-List Notification (Per Key)

When fresh replication accepts a new key `K` with valid PoP:

1. Sender constructs `PaidNotify(K)` containing key `K` and PoP.
2. Sender sends `PaidNotify(K)` to every peer in `PaidCloseGroup(K)` (fire-and-forget, no ack tracking or retry).
3. Receiver MUST validate PoP before adding `K` to local `PaidForList`; invalid PoP is silently dropped.

### 7.4 Paid-List Convergence Maintenance (Ongoing)

Nodes that already treat key `K` as paid-authorized SHOULD help convergence by advertising paid hints during neighbor sync:

1. Trigger on neighbor-sync cadence, topology changes affecting `PaidCloseGroup(K)`, and any observation that a `PaidCloseGroup(K)` peer reports unknown for paid key `K`.
2. Compute current `PaidCloseGroup(K)` membership.
3. During sync with peer `P`, if sender believes `P` is in `PaidCloseGroup(K)` and may be missing `K`, include `K` in `PaidHintsForPeer`.
4. Receiver treats paid hints as claims only and adds `K` to `PaidForList` only after local majority confirmation (`>= ConfirmNeeded(K)`).
5. On topology churn, recompute membership and continue on the new `PaidCloseGroup(K)` set.

### 7.5 Presence Probe Handling (Per Key)

For a presence probe on key `K`:

1. Receiver checks local store for key `K`.
2. Receiver returns `Present` if key `K` exists, else `Absent`.
3. If receiver cannot respond before deadline (overload/network delay), the requester observes timeout/no-response rather than a special protocol error code.

### 7.6 Presence Response Semantics

- `Present`: key exists locally.
- `Absent`: key not found locally.

Quorum counting:

- `Present` counts positive.
- `Absent` counts non-positive.
- Timeout/no-response is unresolved (neutral, not a negative vote).

## 8. Receiver Verification State Machine

```text
Idle
  -> OfferReceived
OfferReceived
  -> FilterRejected
  -> PendingVerify
PendingVerify
  -> QuorumVerified
  -> PaidListVerified
  -> QuorumInconclusive
  -> QuorumFailed
QuorumVerified
  -> QueuedForFetch
PaidListVerified
  -> QueuedForFetch     (admitted replica-hint pipeline only; at least one source responded Present)
  -> FetchAbandoned     (admitted replica-hint pipeline; no peer responded Present — indicates possible data loss, see note below)
  -> Idle               (paid-hint-only pipeline; `PaidForList` updated)
QueuedForFetch
  -> Fetching
Fetching
  -> Stored
  -> FetchRetryable     (timeout/error, transport marks retryable, and alternate sources remain)
  -> FetchAbandoned     (transport marks terminal failure or no alternate sources)
FetchRetryable
  -> QueuedForFetch     (select next alternate source from verified source set)
FetchAbandoned
  -> Idle               (key forgotten; requires new offer to re-enter pipeline)
QuorumFailed
  -> QuorumAbandoned    (quorum failed in this verification pass)
QuorumInconclusive
  -> QuorumAbandoned    (verification pass timed out undecidable)
QuorumAbandoned
  -> Idle               (key forgotten; stops wasting probe resources)
```

Transition requirements:

- `OfferReceived -> PendingVerify` only for unknown admitted keys: replica-hint keys must satisfy replica relevance (`IsResponsible(self, K)` or already local/pending), and paid-hint-only keys must satisfy paid relevance (`self ∈ PaidCloseGroup(K)` or already in local `PaidForList` pending cleanup).
- `PendingVerify -> QuorumVerified` only for keys in the admitted replica-hint pipeline, and only if presence positives from the current verification round reach `>= QuorumNeeded(K)`. On success, record the set of positive responders as verified fetch sources and add `K` to local `PaidForList(self)` (close-group replica majority derives paid-list authorization).
- `PendingVerify -> PaidListVerified` if paid confirmations from the current verification round reach `>= ConfirmNeeded(K)`, or if a paid-hint-only key reaches presence quorum in the same round (derived paid-list authorization). On success, mark key as paid-authorized locally and record peers that responded `Present` as verified fetch sources.
- `PaidListVerified -> QueuedForFetch` only for keys in the admitted replica-hint pipeline and only when at least one peer responded `Present` (verified fetch source exists).
- `PaidListVerified -> FetchAbandoned` for keys in the admitted replica-hint pipeline when the presence-only probe completes with zero `Present` responses (no fetch source available). This transition is abnormal: paid-list authorization implies the record was previously stored, so zero holders suggests severe churn or data loss. Implementations SHOULD log this at warning level. Key is forgotten and requires a new offer to re-enter.
- `PaidListVerified -> Idle` for keys admitted only via paid hints (no record fetch).
- `PendingVerify -> QuorumInconclusive` when neither quorum nor paid-list success is reached and unresolved outcomes (timeout/no-response) keep both outcomes undecidable in this round.
- `Fetching -> Stored` only after all storage validation checks pass.
- `Fetching -> FetchRetryable` when fetch fails (timeout, corrupt response, connection error), the transport classifies the attempt as retryable, and at least one untried verified source remains. Mark the failed source as tried so it is not selected again.
- `Fetching -> FetchAbandoned` when fetch fails and either the transport classifies failure as terminal or all verified sources have been tried. Emit `ReplicationFailure` evidence for the failed source(s).
- `FetchRetryable -> QueuedForFetch` selects the next untried verified source and re-enters the fetch queue without repeating quorum verification.
- `QuorumFailed -> QuorumAbandoned` is immediate and terminal for this offer lifecycle. Key is forgotten and stops consuming probe resources. Requires a new offer to re-enter the pipeline.
- `QuorumInconclusive -> QuorumAbandoned` is immediate and terminal for this offer lifecycle. Requires a new offer to re-enter the pipeline.

## 9. Quorum Verification Logic

For each unknown key:

1. Deduplicate key in pending-verification table.
2. Determine fetch eligibility from admission context:
    - Apply cross-set precedence first (Section 6.2 rule 9): a key present in both hint sets is treated as replica-hint pipeline only.
    - `FetchEligible = true` only if `K` is in the admitted replica-hint pipeline.
    - `FetchEligible = false` for paid-hint-only keys.
3. Compute `QuorumTargets` as up to `CLOSE_GROUP_SIZE` nearest known peers for `K` in `LocalRT(self)` (excluding self).
4. If `K` is already in local `PaidForList`:
    - If `FetchEligible`, mark `PaidListVerified`. Run a presence-only probe to `QuorumTargets` to discover holders (no paid-list or authorization verification needed). Enqueue fetch using peers that responded `Present`; if no peer responds `Present`, transition to `FetchAbandoned`.
    - If not `FetchEligible`, mark `PaidListVerified` and terminate the lifecycle (`PaidListVerified -> Idle`) without fetch.
5. Otherwise compute `PaidTargets = PaidCloseGroup(K)`.
6. Compute `QuorumNeeded(K) = min(QUORUM_THRESHOLD, floor(|QuorumTargets|/2)+1)`.
7. Compute `VerifyTargets = PaidTargets ∪ QuorumTargets`.
8. Send verification requests to peers in `VerifyTargets` and continue the round until either success/fail-fast is reached or a local adaptive verification deadline for this round expires. Responses carry binary presence semantics (Section 7.6); peers in `PaidTargets` also return paid-list presence for `K`.
9. As soon as paid confirmations from `PaidTargets` reach `>= ConfirmNeeded(K)`, add `K` to local `PaidForList(self)` and mark `PaidListVerified`. Fetch sources are peers from the same round that responded `Present` (not all paid-confirming peers).
10. As soon as presence positives from `QuorumTargets` reach `>= QuorumNeeded(K)`, add `K` to local `PaidForList(self)` (derived paid-list authorization; Section 7.2 rule 4). If `FetchEligible`, mark `QuorumVerified`; otherwise mark `PaidListVerified`.
11. Verification succeeds as soon as either step 9 or step 10 condition is met (logical OR).
12. If verification succeeded and `FetchEligible`, enqueue fetch using verified sources (peers that responded `Present` during the verification round). If no peer responded `Present`, transition to `FetchAbandoned` (same abnormal condition as Section 9 step 4). The hint sender is a fetch source only if it also responded `Present`; non-holder forwarders are excluded to avoid false `ReplicationFailure` evidence.
13. If verification succeeded and `FetchEligible = false`, terminate lifecycle without fetch (`PaidListVerified -> Idle`).
14. Fail fast and mark `QuorumFailed` only when both conditions are impossible in this round: `(paid_yes + paid_unresolved < ConfirmNeeded(K))` AND `(quorum_positive + quorum_unresolved < QuorumNeeded(K))`.
15. If the verification-round deadline expires with neither success nor fail-fast, mark `QuorumInconclusive`.
16. On `QuorumFailed` or `QuorumInconclusive`, transition immediately to `QuorumAbandoned` (no automatic quorum retry/backoff).

Undersized verification-set behavior:

- Presence threshold remains dynamic per key via `QuorumNeeded(K) = min(QUORUM_THRESHOLD, floor(|QuorumTargets|/2)+1)`.

Single-round requirement:

- Unknown-key verification MUST NOT run a second sequential network round for presence after a paid-list miss; both evidence types are collected in the same request round.

Verification request batching requirement:

- Implementation MUST coalesce concurrent unknown-key verification into one request per peer carrying many keys.
- Each peer response MUST include explicit per-key results: presence (`Present`/`Absent`) for each requested key, plus paid-list presence for keys where that peer is in `PaidTargets`.
- If a peer response omits key `K`, or the peer times out/no-responds, that peer contributes unresolved evidence for key `K` (never a negative vote).

Security-liveness policy:

- Neighbor-sync repair never stores without either presence quorum or paid-list authorization.
- Fresh replication can store with valid PoP alone.
- Therefore, below-quorum data is recoverable only if paid-list authorization can still be established.

## 10. Record Storage Validation

A fetched record is written only if all checks pass:

1. Type/schema validity.
2. Content-address integrity (`hash(content) == key`).
3. Authorization validity:
    - Fresh replication: valid PoP, or
    - Neighbor-sync repair: prior quorum-verified key or paid-list-authorized key.
4. Responsibility check: `IsResponsible(self, K)` at write time.

## 11. Responsibility Check

A node `N` is responsible for key `K` if `IsResponsible(N, K)` holds — that is, `N` is among the `CLOSE_GROUP_SIZE` nearest nodes to `K` in `SelfInclusiveRT(N)`.

This check is evaluated per-key at decision points:

1. Accept/reject incoming replication writes.
2. Post-cycle pruning eligibility (prune stored records where node is no longer responsible).
3. Post-cycle paid-list retention eligibility (drop `PaidForList` entries for keys where node is no longer in `PaidCloseGroup(K)`).

Post-cycle responsibility pruning (triggered by `NeighborSyncCycleComplete(self)`):

1. For each locally stored key `K`, recompute `IsResponsible(self, K)` using current `SelfInclusiveRT(self)`:
   a. If in range: clear `RecordOutOfRangeFirstSeen(self, K)` (set to `None`).
   b. If out of range: if `RecordOutOfRangeFirstSeen(self, K)` is `None`, set it to `now`. Delete the record only when `now - RecordOutOfRangeFirstSeen(self, K) >= PRUNE_HYSTERESIS_DURATION`.
2. For each key `K` in `PaidForList(self)`, recompute `PaidCloseGroup(K)` membership using current `SelfInclusiveRT(self)`:
   a. If `self ∈ PaidCloseGroup(K)`: clear `PaidOutOfRangeFirstSeen(self, K)` (set to `None`).
   b. If `self ∉ PaidCloseGroup(K)`: if `PaidOutOfRangeFirstSeen(self, K)` is `None`, set it to `now`. Delete the entry only when `now - PaidOutOfRangeFirstSeen(self, K) >= PRUNE_HYSTERESIS_DURATION`.
3. This prune pass is local-state-only and MUST NOT require remote confirmations.

Effect:

- Small network: each node is responsible for more keys.
- Large network: each node is responsible for fewer keys.

## 12. Scheduling and Capacity Rules

Queue model:

- `PendingVerify`: keys awaiting quorum result.
- `FetchQueue`: presence-quorum-passed or paid-list-authorized keys waiting for fetch slot.
- `InFlightFetch`: active downloads.

Rules:

1. Drive quorum checks with an adaptive worker budget that scales with backlog and observed network latency while respecting local CPU/memory/network guardrails.
2. During bootstrap, enforce `MAX_PARALLEL_FETCH_BOOTSTRAP` as fetch concurrency cap; outside bootstrap, fetch concurrency is controlled by the adaptive budget from rule 1.
3. Sort fetch candidates by relevance (e.g., nearest-first) before dequeue.
4. Evict stale queued entries using implementation-defined queue-lifecycle policy.
5. On fetch failure, mark source as tried and transition per `FetchRetryable`/`FetchAbandoned` rules (Section 8). Retry decisions are transport-owned. Retry fetches reuse the verified source set from the original verification pass and do not consume additional verification slots.
6. Storage-audit scheduling and target selection MUST follow Section 15 trigger rules.
7. Responsibility/paid-list prune passes MUST run on `NeighborSyncCycleComplete(self)` per Section 11.

Capacity-managed mode (finite store):

1. If full and new in-range key arrives, evict farthest out-of-range key if available.
2. If no out-of-range key exists, reject new key.
3. On each `NeighborSyncCycleComplete(self)`, prune keys that have been continuously out of range for `>= PRUNE_HYSTERESIS_DURATION` per Section 11.
4. `PaidForList` MUST be persisted to stable storage and SHOULD be bounded with paging/eviction policies; on each `NeighborSyncCycleComplete(self)`, keys outside `PaidCloseGroup(K)` that have been continuously out of range for `>= PRUNE_HYSTERESIS_DURATION` are first candidates for removal.

## 13. Churn and Topology Change Handling

Maintain tracker for neighbor-sync eligibility/order and classify topology events:

- `Trigger`: genuine change, run neighbor sync.
- `Skip`: probable restart churn, suppress.
- `Ignore`: far peers, no action.

Goal: avoid replication storms from restart noise while still reacting to real topology shifts.

### 13.1 Close Neighborhood Maintenance

Nodes MUST periodically perform self-lookups (network closest-peer lookup for their own address) to keep `CloseNeighbors(self)` current:

1. Self-lookup runs on a randomized timer (`SELF_LOOKUP_INTERVAL`).
2. Discovered peers are added to `LocalRT(self)` through normal routing-table maintenance.
3. `CloseNeighbors(self)` is recomputed from `LocalRT(self)` at the start of each neighbor-sync cycle (Section 6.2 rule 1).
4. Without regular self-lookups, a node's close neighborhood becomes stale under churn: new close peers go undetected and departed peers remain in `CloseNeighbors` until routing-table eviction. This delays repair and may cause responsibility misjudgments.

## 14. Failure Evidence and TrustEngine Integration

Failure evidence types include:

- `ReplicationFailure`: failed fetch attempt from a source peer.
- `AuditFailure`: timeout, malformed response, or per-key `AuditKeyDigest` mismatch/absence (emitted per confirmed failed key).
- `BootstrapClaimAbuse`: peer continues claiming bootstrap status after `BOOTSTRAP_CLAIM_GRACE_PERIOD` has elapsed since `BootstrapClaimFirstSeen`.

Rules:

1. Replication MUST emit failure evidence to the local `TrustEngine` via `AdaptiveDHT::report_trust_event`; trust-score computation is out of scope for replication.
2. Replication MUST NOT apply threshold-based peer eviction; eviction/quarantine decisions are owned by `AdaptiveDHT` (which evicts peers whose trust score falls below `block_threshold`).
3. A `ReplicationFailure` is emitted per peer per failed fetch attempt, not per key. If a key requires two retries from two different peers before succeeding on the third, each of the two failed peers emits one failure event.
4. Replication SHOULD mark fetch-failure evidence as stale/low-confidence if the key later succeeds via an alternate verified source.
5. On audit failure, replication MUST first run the responsibility confirmation (Section 15 step 9). If the confirmed failure set is non-empty, emit `AuditFailure` evidence with `challenge_id`, `challenged_peer_id`, confirmed failure keys, and failure reason. If the confirmed failure set is empty, no `AuditFailure` is emitted.
6. Replication MUST emit a trust-penalty signal to `TrustEngine` (via `report_trust_event` with `ApplicationFailure(weight)`) for audit failure only when both conditions hold: the confirmed failure set from responsibility confirmation is non-empty (Section 15 step 9d) AND `RepairOpportunity(challenged_peer_id, confirmed_failure_keys)` is true.
7. On bootstrap claim past grace period, replication MUST emit `BootstrapClaimAbuse` evidence with `peer_id` and `BootstrapClaimFirstSeen` timestamp. Evidence is emitted on each sync or audit attempt where the peer claims bootstrapping after `BOOTSTRAP_CLAIM_GRACE_PERIOD`.
8. When a peer that previously claimed bootstrap status stops claiming it (responds normally to sync or audit), node MUST clear `BootstrapClaimFirstSeen(self, peer)`.
9. Final trust-score updates and any eventual peer eviction are determined by `TrustEngine` / `AdaptiveDHT`, not by replication logic.

## 15. Storage Audit Protocol (Anti-Outsourcing)

Challenge-response for claimed holders:

1. Challenger creates unique challenge id + nonce.
2. Challenger selects one peer uniformly at random from peers with `RepairOpportunity` as `challenged_peer_id`. If no eligible peers exist, the audit tick is idle.
3. Challenger samples `SeedKeys` uniformly at random from locally stored record keys, with `|SeedKeys| = max(floor(sqrt(local_store_key_count)), 1)` (capped at `local_store_key_count`). If local store is empty, the audit tick is idle.
4. For each `K` in `SeedKeys`, challenger checks whether `challenged_peer_id` appears in the `CLOSE_GROUP_SIZE` closest peers for `K` via local RT lookup. Keys where the peer is not responsible are discarded. The remaining keys form `PeerKeySet(challenged_peer_id)`.
5. If `PeerKeySet` is empty, the audit tick is idle.
6. Challenger sends `challenged_peer_id` an ordered challenge key set equal to `PeerKeySet(challenged_peer_id)`.
7. Target responds with either per-key `AuditKeyDigest` values or a bootstrapping claim:
    a. Per-key digests: for each challenged key `K_i` (in challenge order), target computes `AuditKeyDigest(K_i) = H(nonce || challenged_peer_id || K_i || record_bytes_i)`, where `record_bytes_i` is the full raw bytes of the record for `K_i`. Target returns the ordered list of per-key digests. If the target does not hold a challenged key, it MUST signal absence for that position (e.g., a sentinel/empty digest); it MUST NOT omit the position silently.
    b. Bootstrapping claim: target asserts it is still bootstrapping. Challenger applies the bootstrap-claim grace logic (Section 6.2 rule 3b): record `BootstrapClaimFirstSeen` if first observation, accept without penalty within `BOOTSTRAP_CLAIM_GRACE_PERIOD`, emit `BootstrapClaimAbuse` evidence if past grace period. Audit tick ends (no digest verification).
8. On per-key digest response, challenger recomputes the expected `AuditKeyDigest(K_i)` for each challenged key from local copies and verifies equality per key before deadline. Each key is independently classified as passed (digest matches) or failed (mismatch, absent, or malformed).
9. On any per-key audit failures (timeout, malformed response, or one or more `AuditKeyDigest` mismatches/absences), challenger MUST perform a responsibility confirmation for each failed key before emitting penalty evidence:
    a. For each failed key `K` in `PeerKeySet(challenged_peer_id)`, perform a fresh local RT closest-peer lookup for `K`.
    b. If `challenged_peer_id` does not appear in the fresh lookup result for key `K`, remove `K` from the failure set (peer is not currently responsible).
    c. If the filtered failure set is empty after all lookups, discard the audit failure entirely — no `AuditFailure` evidence or trust-penalty signal is emitted.
    d. If the filtered failure set is non-empty, emit per-key `AuditFailure` evidence scoped to the confirmed failed keys only.

Audit-proof requirements:

1. Challenger MUST hold a local copy of each challenged record to recompute per-key digests. Audit selection is therefore limited to records the challenger stores.
2. Records are opaque bytes for replication; digest construction MUST operate over raw record bytes (no schema dependency) and be deterministic.
3. Each `AuditKeyDigest(K_i)` input MUST be exactly: `H(nonce || challenged_peer_id || K_i || record_bytes_i)`. Including `K_i` binds each digest to its specific key and prevents digest reordering attacks.
4. Each `AuditKeyDigest` MUST include full record bytes; key-only digests are invalid.
5. Nodes that advertise audit support MUST produce valid responses within `AUDIT_RESPONSE_TIMEOUT`.
6. Responses MUST include exactly one digest entry per challenged key in challenge order. A response is invalid if it has fewer or more entries than challenged keys.

Audit challenge bound:

- Challenge size is dynamic per selected peer: `1 <= |PeerKeySet(challenged_peer_id)| <= floor(sqrt(local_store_key_count))` when a challenge is issued.
- Worst-case challenge bytes are bounded because each record is max `4 MiB` (`<= floor(sqrt(local_store_key_count)) * 4 MiB`).

Failure conditions:

- Timeout, malformed response, or per-key `AuditKeyDigest` mismatch/absence — subject to responsibility confirmation (step 9) before penalty.
- Bootstrapping claim past `BOOTSTRAP_CLAIM_GRACE_PERIOD` (emits `BootstrapClaimAbuse`, not `AuditFailure`).

Audit trigger and target selection:

1. Node MUST NOT schedule storage-proof audits until `BootstrapDrained(self)` is true.
2. On the transition where `BootstrapDrained(self)` becomes true, node MUST execute one audit tick immediately.
3. After the immediate start tick, audit scheduler runs periodically at randomized `AUDIT_TICK_INTERVAL`.
4. Per tick, node MUST run the round-construction flow in steps 2-6 above (select one eligible peer, sample local keys, filter to keys the peer is responsible for via local RT lookup, then challenge).
5. Node MUST NOT issue storage-proof audits to peers outside the round-construction output set for that tick.
6. If round construction yields no eligible peer, node records an idle audit tick and waits for the next tick (no forced random target).

## 16. New Node Bootstrap Logic

A joining node performs active sync:

1. Node MUST initiate peer discovery closest to its own address and wait until `LocalRT(self)` is at least partially populated before proceeding. Without a sufficiently populated routing table, the node cannot accurately evaluate `IsResponsible(self, K)`, `CloseGroup(K)`, or `PaidCloseGroup(K)`, which would cause incorrect admission decisions and quorum target selection during bootstrap.
2. Compute `CloseNeighbors(self)` from the populated `LocalRT(self)` and snapshot deterministic `NeighborSyncOrder(self)` for the bootstrap cycle.
3. Request replica hints (keys peers think self should hold) and paid hints (keys peers think self should track) in round-robin batches of up to `NEIGHBOR_SYNC_PEER_COUNT` peers at a time. If the same key appears in both hint types, collapse to replica-hint processing only.
4. For each discovered key `K`, compute `QuorumTargets` as up to `CLOSE_GROUP_SIZE` nearest known peers for `K` (excluding self), and compute `QuorumNeeded(K) = min(QUORUM_THRESHOLD, floor(|QuorumTargets|/2)+1)`.
5. Aggregate paid-list reports and add key `K` to local `PaidForList` only if paid reports are `>= ConfirmNeeded(K)`.
6. Aggregate key-presence reports and accept only replica-hint-discovered keys observed from `>= QuorumNeeded(K)` peers, or replica-hint-discovered keys that are now paid-authorized locally. Keys discovered only via paid hints are never accepted for fetch; they only update `PaidForList`. When a key meets presence quorum, also add `K` to local `PaidForList(self)` (close-group replica majority derives paid-list authorization per Section 7.2 rule 4).
7. Fetch accepted keys with bootstrap concurrency.
8. Fall back to normal concurrency after `BootstrapDrained(self)` is true.
9. Set `BootstrapDrained(self)=true` only when both conditions hold:
    - bootstrap peer requests from step 3 have all completed (response or timeout), and
    - bootstrap work queues are empty (`PendingVerify`, `FetchQueue`, `InFlightFetch` for bootstrap-discovered keys).
10. Transition `BootstrapDrained(self): false -> true` opens the audit start gate in Section 15.

This compresses quorum formation into one bootstrap round instead of waiting for multiple periodic cycles.

## 17. Logic-Risk Checklist (Pre-Implementation)

Use this list to find design flaws before coding:

1. Quorum deadlock risk:
    - Can strict admission + strict quorum prevent legitimate repair in sparse/partitioned states?
2. Bootstrap incompleteness:
    - If enough neighbor-sync peers are unavailable, is there a deterministic retry strategy?
3. Range oscillation (mitigated):
    - Pruning requires a key to be continuously out of range for `PRUNE_HYSTERESIS_DURATION` before deletion. This is time-based, not cycle-based, so pruning behavior is consistent regardless of routing-table size or cycle cadence. A single partition-and-heal event clears the timestamp and resets the clock.
4. Restart suppression false negatives:
    - Could real topology loss be suppressed too long?
5. Hint-set integrity:
    - How are duplicate keys, partial deliveries, and retries handled deterministically?
6. Neighbor-sync coverage:
    - Under sustained backlog/churn, do neighbor sync rounds still revisit all relevant keys within an acceptable bound?
7. Admission asymmetry:
    - Can temporary disagreement about `LocalRT` membership between honest nodes delay propagation?
8. Capacity fairness:
    - Can nearest-first plus finite capacity starve less-near but still responsible keys?
9. Audit bias:
    - Are audit targets selected fairly, or can adversaries avoid frequent challenge?
10. Failure attribution:
- Could transient network issues create unfair trust penalties without sufficient dampening/evidence quality? (Note: `TrustEngine` uses EMA with time decay toward neutral, which provides natural dampening for transient failures.)
11. Paid-list poisoning:
- Can colluding nodes in `PaidCloseGroup(K)` falsely mark unpaid keys as paid?
12. Paid-list cold-start (mitigated):
- `PaidForList` is now persisted, surviving normal restarts. Close-group replica majority (Section 7.2 rule 4) provides a recovery path when persistence is corrupted or unavailable. Residual risk: keys below both presence quorum AND lost paid-list remain unrecoverable — accepted as explicit security-over-liveness tradeoff.

## 18. Pre-Implementation Test Matrix

Each scenario should assert exact expected outcomes and state transitions.

1. Fresh write happy path:
    - Valid PoP propagates to target holders without quorum check.
2. Fresh write invalid PoP:
    - Receiver rejects and does not enqueue fetch.
3. Neighbor-sync unknown key quorum pass:
    - Key transitions to stored through full state machine.
4. Neighbor-sync unknown key quorum fail:
    - Key transitions to `QuorumAbandoned` (then `Idle`) and is not fetched.
5. Unauthorized sync peer:
    - Hints from peers not in `LocalRT(self)` are dropped and do not enter verification.
6. Presence probe response shape:
    - Presence responses are only `Present` or `Absent`; there are no `RejectedUnauthorized`/`RejectedBusy` presence codes.
7. Out-of-range key hint:
    - Key rejected regardless of quorum.
8. Duplicate and retry safety:
    - Duplicate keys and repeated hints do not create invalid acceptance or duplicate queue/fetch work. If the same key appears in both replica and paid hints in one session, receiver collapses to replica-hint pipeline only.
9. Fetch timeout with alternate source retry:
    - First source times out, key transitions to `FetchRetryable`, re-enters `QueuedForFetch` with next verified source, and succeeds. Verification is not re-run. Failed source receives one `ReplicationFailure`; successful alternate source clears stale failure attribution (rule 14.4).
10. Fetch retry exhaustion:
- All verified sources fail or transport classifies failure as terminal. Key transitions to `FetchAbandoned`. Each failed source receives one `ReplicationFailure`.
11. Repeated confirmed failures:
- Replication emits failure evidence and trust-penalty signals to `TrustEngine` (via `report_trust_event`); eviction decisions are made by `AdaptiveDHT` block-threshold policy rather than replication thresholds.
12. Bootstrap quorum aggregation:
- Node accepts only keys meeting multi-peer threshold.
13. Responsible range shrink:
- Out-of-range records have `RecordOutOfRangeFirstSeen` recorded; they are pruned only after being continuously out of range for `>= PRUNE_HYSTERESIS_DURATION`. New in-range keys still accepted per capacity policy.
14. Neighbor-sync coverage under backlog:
- Under load, each local key is eventually re-hinted within expected neighbor-sync timing bounds as round-robin peer batches rotate through `CloseNeighbors(self)`.
15. Partition and heal:
- Confirm below-quorum recovery succeeds when paid-list authorization survives, and fails when it cannot be re-established.
16. Quorum responder timeout handling:
- No-response/timeouts are unresolved and can yield `QuorumInconclusive`, which is terminal for that offer lifecycle (`QuorumAbandoned` -> `Idle`).
17. Neighbor-sync admission asymmetry:
- When two honest nodes temporarily disagree on `LocalRT` membership, hints are accepted only once sender is present in receiver `LocalRT`; before that, inbound sync is outbound-only at the receiver.
18. Invalid runtime config:
- Node rejects configs violating parameter safety constraints.
19. Audit per-key digest mismatch with confirmed responsibility:
- Peer `P` is challenged on keys `{K1, K2, K3}`. `P` returns per-key digests: `K1` matches, `K2` mismatches, `K3` absent. Challenger runs responsibility confirmation for failed keys `{K2, K3}`: `P` appears in fresh lookup for `K2` but not `K3`. `AuditFailure` is emitted for `{K2}` only. Trust-penalty signal is emitted only when `RepairOpportunity(P, {K2})` is also true.
20. Paid-list local hit:
- Admitted unknown replica key with local paid-list entry bypasses presence quorum and enters fetch pipeline.
21. Paid-list majority confirmation:
- Admitted unknown replica key not in local paid list is accepted for fetch only after `>= ConfirmNeeded(K)` confirmations from `PaidCloseGroup(K)`. For a paid-hint-only key, the same confirmation updates `PaidForList` but does not enqueue fetch.
22. Paid-list rejection:
- Admitted unknown replica key is rejected when paid confirmations are below threshold and presence quorum also fails.
23. Paid-list cleanup after churn:
- Node drops paid-list entries for keys where it is no longer in `PaidCloseGroup(K)`.
24. Fresh-replication paid-list propagation:
- Freshly accepted key sends `PaidNotify` with PoP to all peers in current `PaidCloseGroup(K)` (fire-and-forget).
25. Paid-list convergence repair:
- For a known paid key with incomplete `PaidCloseGroup(K)` coverage, nodes include `K` in `PaidHintsForPeer` during neighbor sync; receiver whitelists only after `>= ConfirmNeeded(K)` confirmations (no PoP in sync payloads).
26. Dynamic paid-list threshold in undersized consensus set:
- With `PaidGroupSize(K)=8`, paid-list authorization requires `ConfirmNeeded(K)=5` confirmations (not 11).
27. Single-round dual-evidence verification:
- For unknown key verification, implementation sends one request round to `VerifyTargets`; no second sequential quorum-probe round is issued after paid-list miss.
28. Dynamic quorum threshold in undersized verification set:
- With `|QuorumTargets|=3`, unknown-key presence quorum requires `QuorumNeeded(K)=2` confirmations (not 4).
29. Audit start gate:
- Node does not schedule audits before `BootstrapDrained(self)`; first audit tick fires immediately when `BootstrapDrained(self)` transitions to true.
30. Audit peer selection from sampled keys:
- Scheduler samples `floor(sqrt(total_keys))` local keys (minimum 1), finds closest peers from the local routing table, builds `PeerKeySet` from those results only, and selects one random peer to audit.
31. Audit periodic cadence with jitter:
- Consecutive audit ticks occur on randomized intervals bounded by configured `AUDIT_TICK_INTERVAL` window.
32. Dynamic challenge size:
- Challenged key count equals `|PeerKeySet(challenged_peer_id)|` and is dynamic per round; if no eligible peer remains after `LocalRT` filtering, the tick is idle and no audit is sent.
33. Batched unknown-key verification:
- When multiple unknown keys share a target peer, implementation MUST send one batched verification request (not separate per-key requests); responses must still be keyed per key with binary presence semantics (and paid-list presence where applicable).
34. Batched partial response semantics:
- If a batched response omits key `K` or a peer times out, evidence for that peer/key pair is unresolved for `K` and does not count as an explicit negative vote.
35. Neighbor-sync round-robin batch selection with cooldown skip:
- With more than `NEIGHBOR_SYNC_PEER_COUNT` eligible peers, consecutive rounds scan forward from cursor, skip and remove cooldown peers, and sync the next batch of up to `NEIGHBOR_SYNC_PEER_COUNT` non-cooldown peers. Cycle completes when all snapshot peers have been synced, skipped (cooldown), or removed (unreachable).
36. Post-cycle responsibility pruning with time-based hysteresis:
- When a full neighbor-sync round-robin cycle completes, node runs one prune pass using current `SelfInclusiveRT(self)` (`LocalRT(self) ∪ {self}`): stored keys with `IsResponsible(self, K)=false` have `RecordOutOfRangeFirstSeen` recorded (if not already set) but are deleted only when `now - RecordOutOfRangeFirstSeen >= PRUNE_HYSTERESIS_DURATION`. Keys that are in range have their `RecordOutOfRangeFirstSeen` cleared. Same logic applies independently to `PaidForList` entries where `self ∉ PaidCloseGroup(K)` using `PaidOutOfRangeFirstSeen`.
37. Non-`LocalRT` inbound sync behavior:
- If a peer opens sync while not in receiver `LocalRT(self)`, receiver may still send hints to that peer, but receiver drops all inbound replica/paid hints from that peer.
38. Neighbor-sync snapshot stability under peer join:
- Peer `P` joins `CloseNeighbors(self)` mid-cycle. `P` does not appear in the current `NeighborSyncOrder(self)` snapshot. After cycle completes and a new snapshot is taken from recomputed `CloseNeighbors(self)`, `P` is included in the next cycle's ordering.
39. Neighbor-sync unreachable peer removal and slot fill:
- Peer `P` is in the snapshot. Sync attempt with `P` fails (unreachable). `P` is removed from `NeighborSyncOrder(self)`. Node resumes scanning from where batch selection left off and picks the next available peer `Q` to fill the slot. `P` is not in the next cycle's snapshot (unless it has rejoined `CloseNeighbors`).
40. Neighbor-sync per-peer cooldown skip:
- Peer `P` was successfully synced in a prior round and is still within `NEIGHBOR_SYNC_COOLDOWN`. When batch selection reaches `P`, it is removed from `NeighborSyncOrder(self)` and scanning continues to the next peer. `P` does not consume a batch slot.
41. Neighbor-sync cycle completion is guaranteed:
- Under arbitrary churn, cooldowns, and unreachable peers, the cycle always terminates because the snapshot can only shrink (removals) and the cursor advances monotonically. Cycle completes when `NeighborSyncCursor >= |NeighborSyncOrder|`.
42. Quorum-derived paid-list authorization:
- Unknown key `K` passes presence quorum (`>= QuorumNeeded(K)` positives from `QuorumTargets`). Key is stored AND added to local `PaidForList(self)`. Node subsequently answers paid-list queries for `K` as "paid."
43. Paid-list persistence across restart:
- Node stores key `K` in `PaidForList`, restarts. After restart, `PaidForList` is loaded from stable storage and node correctly answers paid-list queries for `K` without re-verification.
44. Paid-list cold-start recovery via replica majority:
- Multiple nodes restart simultaneously and lose `PaidForList` (persistence corrupted). Key `K` has `>= QuorumNeeded(K)` replicas in the close group. During neighbor-sync verification, presence quorum passes and all verifying nodes re-derive `K` into their `PaidForList` via close-group replica majority.
45. Paid-list unrecoverable below quorum:
- Key `K` has only 1 replica (below quorum) and `PaidForList` is lost across all `PaidCloseGroup(K)` members. Key cannot be recovered via either presence quorum or paid-list majority — accepted as explicit security-over-liveness tradeoff.
46. Bootstrap claim within grace period (sync):
- Peer `P` responds with bootstrapping claim during sync. Node records `BootstrapClaimFirstSeen(self, P)`. `P` is removed from `NeighborSyncOrder(self)` and slot is filled from next peer. No penalty emitted.
47. Bootstrap claim within grace period (audit):
- Challenged peer responds with bootstrapping claim during audit. Node records `BootstrapClaimFirstSeen`. Audit tick ends without `AuditFailure`. No penalty emitted.
48. Bootstrap claim abuse after grace period:
- Peer `P` first claimed bootstrapping 25 hours ago (`> BOOTSTRAP_CLAIM_GRACE_PERIOD`). On next sync or audit attempt where `P` still claims bootstrapping, node emits `BootstrapClaimAbuse` evidence to `TrustEngine` (via `report_trust_event` with `ApplicationFailure(weight)`) with `peer_id` and `BootstrapClaimFirstSeen` timestamp.
49. Bootstrap claim cleared on normal response:
- Peer `P` previously claimed bootstrapping. `P` later responds normally to a sync or audit request. Node clears `BootstrapClaimFirstSeen(self, P)`. No residual penalty tracking.
50. Prune hysteresis prevents premature deletion:
- Key `K` goes out of range at time `T`. `RecordOutOfRangeFirstSeen(self, K)` is set to `T`. Key is NOT deleted. At `T + 3h` (less than `PRUNE_HYSTERESIS_DURATION`), key is still retained. At `T + 6h` (`>= PRUNE_HYSTERESIS_DURATION`), key is deleted on the next prune pass.
51. Prune hysteresis timestamp reset on partition heal:
- Key `K` goes out of range at time `T`. `RecordOutOfRangeFirstSeen(self, K)` is set to `T`. At `T + 4h`, partition heals, peers return, `K` is back in range. `RecordOutOfRangeFirstSeen` is cleared. Key is retained. If `K` later goes out of range again, the clock restarts from zero.
52. Prune hysteresis applies to paid-list entries:
- `PaidForList` entry for key `K` where `self ∉ PaidCloseGroup(K)` follows the same time-based hysteresis using `PaidOutOfRangeFirstSeen(self, K)`: timestamp recorded, entry deleted only when `now - PaidOutOfRangeFirstSeen >= PRUNE_HYSTERESIS_DURATION`, timestamp cleared if `self` re-enters `PaidCloseGroup(K)`. This timestamp is independent of `RecordOutOfRangeFirstSeen` — clearing one does not affect the other.
53. Audit partial per-key failure with mixed responsibility:
- Peer `P` is challenged on `{K1, K2, K3}`. Per-key digests: `K1` matches, `K2` and `K3` mismatch. Responsibility confirmation: `P` is confirmed responsible for `K2` but not `K3`. `AuditFailure` is emitted for `{K2}` only. `K3` is discarded — no penalty for a key the network confirms `P` is not responsible for. `K1` passed digest verification and is not part of the failure set.
54. Audit per-key digest all pass:
- Peer `P` is challenged on `{K1, K2, K3}`. `P` returns per-key digests for all three keys, all match challenger's expected values. Audit passes — no failure set, no responsibility confirmation needed, no evidence emitted.
55. Audit per-key failure with no confirmed responsibility:
- Peer `P` is challenged on `{K1, K2}`. Per-key digests: both mismatch. Responsibility confirmation: `P` does not appear in fresh lookup results for either key. Entire audit failure is discarded — no `AuditFailure` evidence emitted, no trust-penalty signal.
56. Audit skips never-synced peer:
- Peer `P` appears in closest-peer lookup results for sampled keys and is in `LocalRT(self)`, but `RepairOpportunity(P, _)` is false (no prior sync). `P` is removed from `CandidatePeersRT` before `PeerKeySet` construction. If no other eligible peers remain, audit tick is idle. No challenge is sent to `P`, no network resources consumed.

## 19. Acceptance Criteria for This Design

The design is logically acceptable for implementation when:

1. All invariants in Section 5 can be expressed as executable assertions.
2. Every scenario in Section 18 has deterministic pass/fail expectations.
3. Security-over-liveness tradeoffs are explicitly accepted by stakeholders.
4. Parameter sensitivity (especially, quorum, `PAID_LIST_*`, and suppression windows) has been reviewed with failure simulations.
5. Audit-proof digest requirements are implemented and test-validated.