pmat 3.22.0

PMAT - Zero-config AI context generation and code quality toolkit (CLI, MCP, HTTP)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
# Modern Agentic Coding Support (MACS)

> Sub-spec of [pmat-spec.md](../pmat-spec.md) | **Component 32 (proposed)**
> **Status**: DRAFT for review | **Version**: 1.0.0 | **Date**: 2026-07-02
> **Baseline commit**: `bab0f9a` (2026-06-25)
> **Ticket prefix**: `MACS-` | **CB range**: CB-1650–CB-1666 (verified unallocated at baseline; CB-1640 is implemented here per Component 31's reservation)
> **Audience**: autonomous coding agents (**Claude Fable 5 in autonomous mode**, model id `claude-fable-5`) and humans reviewing their work.
> **Normative internal refs**: [agent-integration.md](agent-integration.md) (C10), [pmat-work-verification-ladder.md](pmat-work-verification-ladder.md) (C28), [pmat-work-cot-proof-derivation.md](pmat-work-cot-proof-derivation.md) (C31), [autonomous-verify-loop.md](../../agent-instructions/autonomous-verify-loop.md), [provable-contract-first-agents.md](../../agent-instructions/provable-contract-first-agents.md).

---

## 0. Executive Summary

Claude Code's 2026 agentic surface — **ultracode** (xhigh reasoning + automatic
dynamic-workflow orchestration), **Fable 5**, per-skill **effort pinning**,
nested subagents, and saveable workflow scripts — changed the operating
environment PMAT's agent components were designed for. PMAT's architecture
(**agents propose, receipts dispose**) is the correct response to that
environment, but six load-bearing pieces are missing or decorative at baseline:

| # | Feature | One-line problem (evidence at baseline `bab0f9a`) |
|---|---------|-----------------------------------------------------|
| F1 | Agent provenance in the falsification ledger | `FalsificationReceipt` records *what* passed, never *which agent configuration* produced it (`src/cli/handlers/work_ledger_receipt.rs:3–22`) |
| F2 | Verification-ladder enforcement | `verification_level` is an unparsed `String`; a ticket can claim L5 with zero evidence (C28 spec, "the field is decorative") |
| F3 | CoT proof derivation | `chain_of_thought` is write-only prose; assumptions never become `require` clauses (C31 spec, "a hallucination magnet") |
| F4 | Effort pinning for skills/subagents | None of the six `.claude/skills/*` pin `effort:`; behavior and cost vary with whatever the session left set |
| F5 | Deterministic sweep harness + committed workflow | The release MCP sweep spends stochastic LLM tokens on schema-derivable work; the orchestration is an ephemeral session, not a versioned artifact |
| F6 | Canonical roadmap artifact + manifest/doc drift | `roadmap.yaml` frozen ("HISTORICAL — SUPERSEDED"); `mcp.json` declares 2 tools vs ~20 real; `docs/features/refactor-auto.md:422` ships `claude-3-opus` |

This spec delivers all six as a **totally ordered ticket sequence
MACS-000…MACS-016**, each ticket bound to a provable contract *before* any code
is written (CB-1400), gated by `pmat verify --format json`, and closed only by
a falsification receipt. §3 gives the extended reasoning (structured
chain-of-thought that dogfoods the C31 schema, plus a Five-Whys per feature).
§6 states the determinism guarantees that make the build reproducible even
though the builder is an LLM. §7 is the exact execution protocol for Fable 5.

**Non-goals** (v1): distributed agent state (C-actix spec Phase 2+), replacing
ultracode's orchestration with a pmat-native one, LSP/IDE surface, non-Rust
implementation targets, retro-fitting provenance onto pre-MACS ledger entries
(they remain valid under `schema_version: 1`).

---

## 1. Environment: Normative Facts (with citations)

Facts the design depends on. Each is dated and sourced; if a fact changes
upstream, re-derive the dependent tickets (they are marked).

- **E1 — ultracode is a harness setting, not a model effort level.** It sends
  `xhigh` to the model and additionally lets Claude Code orchestrate dynamic
  workflows automatically; it is session-only and cannot be set via
  `effortLevel`, `--effort`, or `CLAUDE_CODE_EFFORT_LEVEL` [1][4].
- **E2 — dynamic workflows fan out to 10s–100s of parallel subagents and run
  independent verification before results reach the user** [3]. Since
  v2.1.172, subagents may spawn their own subagents up to 5 levels deep [4].
- **E3 — xhigh (hence ultracode) is available on Fable 5**, alongside Mythos 5,
  Opus 4.8, Opus 4.7, and Sonnet 5; Fable 5's **default effort is `high`**, and
  Fable 5 uses always-on adaptive thinking [1][2].
- **E4 — skill and subagent frontmatter can pin `effort:`**, overriding the
  session level while that skill/subagent runs (env var still wins) [1].
- **E5 — flagged-request semantics on Fable 5**: Claude Code may auto-switch
  the session to an Opus model on a flagged request; in **non-interactive/SDK
  mode a flagged request ends the turn with a refusal** [1]. → F1/F3 must make
  such events *recorded states*, never silent gaps.
- **E6 — cost profile**: community reports include a full 5-hour Max
  allocation consumed in ~7 minutes by Fable + ultracode on a codebase-wide
  audit [6], and 62 Opus 4.8 subagents exhausting a 5-hour cap in 18 minutes
  [5]. → deterministic gates must run at zero LLM cost; ultracode is reserved
  for judgment.
- **E7 — workflow runs are session-bound (exit ⇒ restart fresh)** [7], but a
  run **can be saved as a versionable orchestration script** the whole team can
  re-run [4]. → durable state belongs in `.pmat-work/` checkpoints; the
  orchestration belongs in git.
- **E8 — the `ultracode` keyword in a prompt triggers a one-off workflow**
  (rename from `workflow` in v2.1.160) [4]. → §7 forbids the bare keyword in
  MACS build prompts to avoid accidental fan-out.
- **E9 — Claude Code exposes `CLAUDE_CODE_EFFORT_LEVEL`** for standard levels
  [5]; harness detection beyond that is best-effort. → F1 treats *declared*
  provenance (flags) as canonical and *detected* provenance (env) as advisory.

---

## 2. Design Principle: Agents Propose, Receipts Dispose

Ultracode's own verification design — a separate agent that never wrote the
answer it is judging [3] — is Popperian falsification [9] implemented in
harness form, i.e. PMAT's thesis restated by the vendor. MACS therefore does
**not** compete with orchestration. It hardens the boundary:

```
        stochastic side (may vary run-to-run)          deterministic side (must not)
  ┌────────────────────────────────────────────┐   ┌─────────────────────────────────┐
  │ Fable 5 sessions · ultracode workflows ·   │   │ pmat verify (CI-faithful gate)  │
  │ subagent fleets · skeptic re-verification  │──▶│ pmat work  (contracts, ladder)  │
  │ (discovery, judgment, adversarial review)  │   │ falsification receipts + ledger │
  └────────────────────────────────────────────┘   │ canonical artifacts (hashes)    │
        every crossing of this boundary  ──────────▶ is STAMPED with provenance (F1) │
                                                    └─────────────────────────────────┘
```

Design-by-contract [10] supplies the interface discipline (require/ensure/
invariant); Bruni et al. [8] extend it to multi-agent behavioral contracts,
which C10 already mandates (CB-1400). MACS closes the remaining gaps so the
right-hand side is *complete*: every acceptance is attributable (F1), every
claimed strength is enforced (F2), every reasoning step is dischargeable (F3),
every skill's cost/behavior is pinned (F4), every mechanical check is LLM-free
(F5), and every planning artifact is canonical (F6).

---

## 3. Extended Reasoning

### 3.1 Chain of Thought (structured; dogfoods the Component 31 schema)

Each step below uses the C31 v2 shape `{id, assumption, implication,
evidence_method}`. Chain integrity: step *N*'s assumption is discharged by step
*N−1*'s implication (or by a cited environment fact E*n* / repo evidence).
Once MACS-008 lands, this very section is CB-1640-checkable.

```yaml
chain_of_thought:
  - id: CoT-1
    assumption: "PMAT's goal is deterministic roadmap/PM/QC substrate for agentic AI (README, pmat-spec)."
    implication: "Any nondeterminism must live outside the acceptance boundary; anything crossing it must be attributable and replayable."
    evidence_method: "grep README.md for 'Determinism' section; pmat-spec component list."
  - id: CoT-2
    assumption: "Crossings are attributable and replayable (CoT-1)."
    implication: "Receipts need agent provenance {model, effort, harness, workflow_id, parent}; absent today (work_ledger_receipt.rs:3-22)."
    evidence_method: "cargo test ledger::provenance_roundtrip after MACS-001; field-presence check CB-1651."
  - id: CoT-3
    assumption: "Provenance can be silently wrong if the harness swaps models mid-loop (E5)."
    implication: "Interruptions (refusal, model-switch, session-restart) must be first-class AgentEvents in the receipt, and a refusal-terminated turn must map to a paused ticket, never a completed one."
    evidence_method: "MACS-003 RED test event_refusal_blocks_complete; CB-1654."
  - id: CoT-4
    assumption: "Acceptance strength claims exist as free-form strings (C28 evidence)."
    implication: "Type the ladder (L0–L5) and gate `work complete` on evidence ≥ claim; strict parse rejects 'l4', 'L3 ', 'strong'."
    evidence_method: "proptest: FromStr total over generated corruptions; gate test claim_L4_without_kani_blocks."
  - id: CoT-5
    assumption: "Fable 5's marginal value is reasoning depth at high/xhigh (E3), currently discarded as prose (C31 evidence)."
    implication: "Structure CoT steps and auto-derive one proof_obligation + one FalsifiableClaim per step; unmatched assumptions are CB-1640 violations."
    evidence_method: "MACS-007..009 tests; `pmat comply check` shows CB-1640 firing on a seeded broken chain."
  - id: CoT-6
    assumption: "Per-skill effort is pinnable in frontmatter (E4) and cost blowups are real (E6)."
    implication: "Mechanical skills pin low/medium, adversarial skills pin xhigh; a lint (CB-1650) makes the pin mandatory, making per-skill behavior reproducible across sessions."
    evidence_method: "CB-1650 fails on a skill missing `effort:`; passes on the six updated skills."
  - id: CoT-7
    assumption: "The MCP sweep layer is schema-derivable (tool defs in src/agent/mcp_server_tool_defs.rs) and hence needs no LLM (E6 corollary)."
    implication: "Build `pmat qa mcp-sweep` as a pure harness with byte-framing goldens and N-way concurrency; reserve ultracode for anomaly interpretation via a committed workflow script (E7)."
    evidence_method: "Two consecutive sweep runs byte-identical modulo timestamp fields; workflow file exists in git and stamps provenance."
  - id: CoT-8
    assumption: "Planning artifacts are scattered (roadmap.yaml frozen; live plan in CHANGELOG/gh/specs) and manifests drift (mcp.json=2 tools; refactor-auto.md=claude-3-opus)."
    implication: "Regenerate ROADMAP.yaml and mcp.json canonically from single sources of truth with content hashes; lint stale model strings; freshness/drift are CB checks so drift cannot recur silently."
    evidence_method: "CB-1655/1656/1657 red on seeded drift, green after regen."
```

### 3.2 Five-Whys (root cause per feature) [11]

**F1 — Why can't we attribute a green receipt to an agent config?**
Why 1: The receipt schema has no agent fields.
Why 2: It was designed when one human-supervised agent ran at a time.
Why 3: Ultracode made N concurrent, nested agents the normal case (E2) after the schema froze.
Why 4: There was no forcing function — no CB check requires provenance.
Why 5: Provenance was nobody's contract obligation.
**Root cause**: schema predates multi-agent reality and nothing enforces catching up. **Fix**: MACS-001/002 + CB-1651/1652.

**F2 — Why can a ticket claim L5 unearned?**
Why 1: `verification_level` is a `String` no code path parses.
Why 2: The ladder was specified (C28) before the storage type existed.
Why 3: Implementation stored the display value, deferring semantics.
Why 4: Completion gating reads receipts, not the contract's level field.
Why 5: No type forces the two to meet.
**Root cause**: spec/implementation type gap. **Fix**: MACS-004/005/006 (typed enum + gate).

**F3 — Why do agent assumptions inflate without discharge?**
Why 1: CoT steps are `{summary, rationale, timestamp}` prose.
Why 2: No consumer parses them (C31: "only writers exist").
Why 3: The formal side (proof_obligations, claims) has no inbound edge from CoT.
Why 4: The schema lacks the fields an edge would need (assumption/implication/evidence_method).
Why 5: CoT was built as an audit *log*, not an audit *ledger*.
**Root cause**: missing schema fields ⇒ missing derivation path. **Fix**: MACS-007/008/009 (+CB-1640/1658).

**F4 — Why does skill behavior vary run-to-run?**
Why 1: Skills inherit session effort.
Why 2: Session effort is whatever the operator last set (or ultracode).
Why 3: Frontmatter pinning (E4) postdates the skills' authoring.
Why 4: No lint requires the pin.
Why 5: Effort was treated as a human preference, not a reproducibility parameter.
**Root cause**: unpinned execution parameter. **Fix**: MACS-010 + CB-1650.

**F5 — Why do releases spend stochastic tokens on deterministic checks?**
Why 1: The whole sweep runs through agent fleets.
Why 2: The harness capable of schema-driven calls doesn't exist as a CLI.
Why 3: Tool schemas live in Rust, and the fleet was the fastest path to "exercised".
Why 4: "Exercised" conflated two layers: mechanical conformance and judgment.
Why 5: No spec separated them.
**Root cause**: layer conflation. **Fix**: MACS-011 (harness) + MACS-012 (committed judgment workflow).

**F6 — Why did the roadmap stop being an artifact?**
Why 1: `roadmap.yaml` froze at v2.180.x and was marked historical.
Why 2: Live status moved to CHANGELOG + `gh issue list` + specs.
Why 3: `pmat work` arrived with its own store, and nothing rendered it back out.
Why 4: GitHub is mutable/remote, so no committed snapshot exists to hash.
Why 5: No freshness check makes staleness visible.
**Root cause**: source-of-truth migration without a renderer. **Fix**: MACS-013 (+1655), MACS-014 (+1656), MACS-015 (+1657).

---

## 4. Features

Each feature ships a provable contract in the repo's real YAML format
(`metadata` / `equations{formula, domain, codomain, invariants, preconditions,
lean_theorem}` / `falsification{condition, severity, action}` — cf.
`contracts/benchmarking-v1.yaml`), plus `contracts/binding.yaml` entries
(Appendix B) and comply checks.

### F1 — Agent Provenance in the Falsification Ledger

**Problem.** `FalsificationReceipt { id, git_sha, timestamp, trigger,
work_item_id, verdicts, overrides, summary, content_hash }` — no agent
dimension; `LedgerEntry` likewise. Under E2/E5 a green receipt is
unattributable and model-switches are invisible.

**Schema (Rust).**

```rust
// src/cli/handlers/work_ledger_types.rs (extend)

/// Which kind of runner produced the work
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum AgentHarness {
    ClaudeCode,        // interactive or -p
    ClaudeAgentSdk,
    UltracodeWorkflow, // dynamic-workflow subagent [1][3]
    CiPipeline,
    Human,
    Other(String),
}

/// Declared-first provenance; detection is advisory (E9)
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct AgentProvenance {
    /// e.g. "claude-fable-5" — API model id, verbatim
    pub model: String,
    /// "low"|"medium"|"high"|"xhigh"|"max" — as sent to the model [2]
    pub effort: String,
    pub harness: AgentHarness,
    /// ultracode workflow id, if any (E2)
    pub workflow_id: Option<String>,
    /// parent agent/session id for nested subagents (E2)
    pub parent: Option<String>,
    /// "declared" (flags) | "detected" (env) | "mixed"
    pub source: ProvenanceSource,
}

/// Interruptions that must never be silent (E5)
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum AgentEvent {
    Refusal      { at: String, note: Option<String> },
    ModelSwitch  { at: String, from: String, to: String },
    SessionRestart { at: String },
    WorkflowSpawn { at: String, workflow_id: String, subagents: u32 },
}
```

```rust
// src/cli/handlers/work_ledger_receipt.rs (extend — additive, hash-versioned)
pub struct FalsificationReceipt {
    /// 1 = pre-MACS (fields below absent from hash), 2 = MACS
    #[serde(default = "one")] pub schema_version: u32,
    // ... existing nine fields unchanged ...
    #[serde(default)] pub agent: Option<AgentProvenance>,
    #[serde(default)] pub agent_events: Vec<AgentEvent>,
    // content_hash: SHA-256 over canonical JSON of all fields above,
    // canonical = serde_json with sorted maps (BTreeMap) + no floats.
}
```

**Capture.** `pmat work start|checkpoint|complete` gain
`--agent-model/--agent-effort/--agent-harness/--agent-workflow-id/--agent-parent`
(declared, canonical) with env fallback `PMAT_AGENT_*`; advisory detection:
`CLAUDE_CODE_EFFORT_LEVEL` [5] and, if present, harness markers — recorded
with `source: detected`. New `pmat work event --type refusal|model-switch|...`
appends an `AgentEvent` (E5): a `Refusal` on the active ticket **blocks
`complete` until acknowledged** via `--ack-event <id>`.

**Contract** — `contracts/macs-provenance-v1.yaml`:

```yaml
metadata:
  version: "1.0.0"
  created: "2026-07-02"
  author: PAIML Engineering
  registry: true
  description: >
    Agent provenance and interruption events on falsification receipts.
    Declared-first; detection advisory; hash-versioned for old receipts.
  contract: macs-provenance
  status: draft

equations:
  provenance_roundtrip:
    formula: deserialize(serialize(r)) = r
    domain: r in FalsificationReceipt{schema_version=2}
    codomain: bool
    invariants:
      - serialization uses canonical JSON (sorted keys, no NaN/Inf)
      - unknown future fields are rejected (deny_unknown_fields)
    preconditions:
      - proptest deterministic mode enabled (MACS-000)
    lean_theorem: Theorems.Macs.Provenance_Roundtrip
  hash_stability:
    formula: content_hash(r) = golden(r)
    domain: r in fixtures/receipts/v2/*.json
    codomain: hex64
    invariants:
      - byte-identical across OS/arch (CI matrix)
      - v1 receipts verify under v1 rules keyed by schema_version
    preconditions:
      - fixtures committed at MACS-001
  refusal_gates_completion:
    formula: has_unacked(Refusal) -> complete(ticket) = Err(Blocked)
    domain: ticket in .pmat-work/*
    codomain: Result
    invariants:
      - Refusal never auto-acks; ack requires --ack-event with reason
    preconditions:
      - MACS-003 landed

falsification:
  - condition: "A receipt with schema_version=2 lacks `agent` yet CB-1651 passes"
    severity: P0
    action: reject_push
  - condition: "Two serializations of the same receipt differ in bytes"
    severity: P0
    action: reject_push
  - condition: "`pmat work complete` succeeds while an unacked Refusal event exists"
    severity: P0
    action: reject_push
  - condition: "A v1 (pre-MACS) receipt fails ledger verify after upgrade"
    severity: P1
    action: block_release
```

**Checks**: **CB-1651** provenance present on v2 receipts; **CB-1652** ledger
hash re-verification (`pmat work ledger verify`, MACS-016); **CB-1654**
refusal events acked before completion.

### F2 — Verification-Ladder Enforcement (implements C28)

**Problem.** Opaque `String`; misspellings silently downgrade; completion
never consults it (C28 evidence list).

**Schema + gate.**

```rust
// src/cli/handlers/work_contract_core.rs (replace String)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord,
         Serialize, Deserialize)]
pub enum VerificationLevel { L0, L1, L2, L3, L4, L5 }

impl FromStr for VerificationLevel {
    type Err = LadderError;
    /// STRICT: exactly "L0".."L5". "l4", "L3 ", "strong" => Err (C28).
    fn from_str(s: &str) -> Result<Self, LadderError> { /* MACS-004 */ }
}

/// Evidence actually achieved, computed—never trusted (C28 §Goal):
/// L1 debug_assert contracts ran; L2 #[contract] bound (binding.yaml);
/// L3 falsification_tests[] pass in tree; L4 kani_harnesses[] exit 0;
/// L5 lean_theorem status=proved, zero `sorry`.
pub fn achieved_level(t: &Ticket) -> VerificationLevel { /* MACS-005 */ }

// gate in core_handlers/resolution.rs:
if achieved_level(t) < t.contract.verification_level {
    return Err(WorkError::LadderShortfall {
        claimed: t.contract.verification_level,
        achieved: achieved_level(t),
    }); // ticket cannot close (C28: "Completion gates are hard")
}
```

**Contract** — `contracts/macs-ladder-v1.yaml` (equations
`parse_total_strict` — parse is total and strict over a corruption corpus;
`gate_monotone` — `complete ⇒ achieved ≥ claimed`; `no_silent_downgrade` —
storing a level requires parse-success):

```yaml
metadata: {version: "1.0.0", created: "2026-07-02", author: PAIML Engineering,
           registry: true, contract: macs-ladder, status: draft,
           description: "Typed L0–L5 with evidence-gated completion (C28)."}
equations:
  parse_total_strict:
    formula: parse(s) = Ok(l) <-> s in {"L0".."L5"}
    domain: s in String (proptest-generated incl. corruptions)
    codomain: Result<VerificationLevel>
    invariants: ["Ord matches numeric order", "Display∘parse = id on valid set"]
    preconditions: []
    lean_theorem: Theorems.Macs.Ladder_Parse_Total
  gate_monotone:
    formula: complete(t)=Ok -> achieved_level(t) >= t.claimed_level
    domain: t in .pmat-work/*
    codomain: bool
    invariants: ["achieved_level is computed from evidence, never stored"]
    preconditions: ["binding.yaml well-formed (CB-1250 family)"]
falsification:
  - {condition: "\"l4\" or \"L3 \" parses successfully", severity: P0, action: reject_push}
  - {condition: "A ticket claiming L4 closes while a kani harness fails",   severity: P0, action: reject_push}
  - {condition: "A ticket claiming L5 closes with any `sorry` in its Lean proof", severity: P0, action: reject_push}
```

**Checks**: **CB-1653** claimed-vs-achieved drift surfaces in
`pmat comply check`; migration tool `pmat work ticket migrate --levels`
rewrites legacy strings interactively (invalid ⇒ L0 + audit note).

### F3 — CoT Proof Derivation (implements C31)

**Problem.** `ChainOfThoughtStep { summary, rationale, timestamp }` prose;
no derivation edge to proof obligations/claims; unmatched assumptions never
surface (C31).

**Schema + derivation.**

```rust
// C31 v2 shape — src/models/work_cot.rs (new)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ChainOfThoughtStep {
    pub id: String,                     // "CoT-1"
    pub assumption: String,             // input to the reasoning
    pub implication: String,            // output of the reasoning
    pub evidence_method: String,        // how to falsify the implication
    pub timestamp: DateTime<Utc>,
    /// which prior implication (or bound equation) discharges `assumption`
    pub discharged_by: Option<DischargeRef>,   // CoT id | contract#equation | "E<n>"
}

/// CB-1640 (reserved by C31): every assumption must be discharged.
/// Discharge graph must be a DAG rooted in evidence/equations.
pub fn check_chain(steps: &[ChainOfThoughtStep]) -> Vec<Cb1640Violation> { /* MACS-008 */ }

/// Per C31 §Goal — one derivation per step:
///   step -> proof_obligation (ticket YAML, C25 Phase 1)
///   step -> FalsifiableClaim { hypothesis: implication,
///                              method: evidence_method } (C29 roster)
///   step -> optional require/ensure clause (C30 codegen)
pub fn derive(step: &ChainOfThoughtStep) -> Derived { /* MACS-009 */ }
```

For Fable 5 specifically: at `high` (its default [1]) and above, the model's
reasoning is exactly the raw material this schema structures — E3 is what
makes F3 the *Fable integration*, not the harness plumbing.

**Contract** — `contracts/macs-cot-v1.yaml`:

```yaml
metadata: {version: "1.0.0", created: "2026-07-02", author: PAIML Engineering,
           registry: true, contract: macs-cot, status: draft,
           description: "Structured CoT with mandatory discharge + auto-derivation (C31)."}
equations:
  chain_integrity:
    formula: forall s in steps: discharged(s.assumption)
    domain: steps in Vec<ChainOfThoughtStep>
    codomain: bool
    invariants: ["discharge graph is acyclic", "roots are equations or E-facts only"]
    preconditions: ["ticket has >=1 CoT step when kind in {feature,bugfix,refactor}"]
    lean_theorem: Theorems.Macs.CoT_Discharge_DAG
  derivation_complete:
    formula: |claims_from(steps)| = |steps| and |obligations_from(steps)| = |steps|
    domain: steps as above
    codomain: bool
    invariants: ["claim.hypothesis = step.implication verbatim",
                 "claim.method    = step.evidence_method verbatim"]
    preconditions: ["MACS-007 schema active"]
falsification:
  - {condition: "A step's assumption has no discharge and CB-1640 stays green", severity: P0, action: reject_push}
  - {condition: "A ticket closes with fewer FalsifiableClaims than CoT steps",  severity: P0, action: reject_push}
  - {condition: "Legacy prose steps crash the parser instead of migrating to v2-with-L0 annotation", severity: P1, action: block_release}
```

**Checks**: **CB-1640** (implemented), **CB-1658** derivation completeness.

### F4 — Effort Pinning + Skill Lint

**Problem.** E4 exists; none of `.claude/skills/{dogfood, pmat-context,
pmat-multi-lang, pmat-quality, pmat-refactor, pmat-tech-debt}` use it; the
dogfood skill fans out via `Agent` at inherited effort — under an ultracode
session that is xhigh × N on mechanical work (E6).

**Change.** Add frontmatter to each skill; policy table becomes normative:

```yaml
# .claude/skills/dogfood/SKILL.md (frontmatter delta)
---
effort: medium        # mechanical sweep; judgment lives in F5's workflow
...
# .claude/skills/pmat-quality/skill.md
---
effort: xhigh         # adversarial/skeptic review earns depth [2][3]
```

| Skill | Pinned effort | Rationale |
|---|---|---|
| dogfood | medium | command exercising is mechanical (E6) |
| pmat-context | low | I/O-bound generation |
| pmat-multi-lang | medium | breadth over depth |
| pmat-quality | **xhigh** | adversarial verification (E3) |
| pmat-refactor | high | design judgment, single-agent |
| pmat-tech-debt | high | prioritization judgment |

**Lint**: **CB-1650** — every file under `.claude/skills/**/{SKILL,skill}.md`
has frontmatter `effort ∈ {low, medium, high, xhigh}` (session-only values
`max`/`ultracode` are rejected here by design [1][4]).

**Contract** — `contracts/macs-skill-effort-v1.yaml` (equation
`skill_effort_pinned: forall f in skills: effort(f) ∈ {low,medium,high,xhigh}`;
falsification: missing/invalid pin with CB-1650 green ⇒ P0 reject_push).

### F5 — Deterministic Sweep Harness + Committed Judgment Workflow

**Problem.** The mechanical MCP layer (per-tool calls with schema-derived
args, byte-level framing, N-way concurrency) needs no model; running it via
fleets spends stochastic tokens (E6) and its orchestration evaporates with the
session (E7).

**Part A — `pmat qa mcp-sweep` (LLM-free).**

```rust
// src/cli/handlers/qa_mcp_sweep.rs (new)
/// For every tool in src/agent/mcp_server_tool_defs.rs:
///   1. derive a minimal valid argument set from its inputSchema
///   2. call over stdio JSON-RPC; assert schema-valid response
///   3. byte-framing golden: stdout is exclusively JSON-RPC (no stray bytes)
///   4. repeat under --concurrency N (default 8) against ONE working tree:
///      zero lock errors, zero scratch leftovers (advisory-lock invariants,
///      cf. src/services/metric_trends_core.rs)
/// Output: --format json|junit ; exit non-zero on any red.
pub async fn handle_mcp_sweep(opts: SweepOpts) -> Result<SweepReport> { /* MACS-011 */ }
```

**Part B — committed workflow for the judgment layer** (E7: saved scripts are
versionable [4]). `contracts/workflows/release-sweep.ultracode.mjs` fans out
only over *anomalies* emitted by Part A and by the CLI sweep, runs skeptic
re-verification, and stamps every finding through F1:

```js
// contracts/workflows/release-sweep.ultracode.mjs  (judgment layer ONLY)
// Deterministic inputs: sweep JSON artifacts; nondeterministic judgment is
// stamped via `pmat work event` + PMAT_AGENT_* before any receipt is touched.
const anomalies = JSON.parse(read('artifacts/qa/mcp-sweep.json')).anomalies;
for (const batch of chunk(anomalies, 8)) {
  await Promise.all(batch.map(a =>
    spawnSubagent({
      prompt: `Skeptically re-verify anomaly ${a.id}; falsify before believing`,
      env: { PMAT_AGENT_HARNESS: 'ultracode-workflow',
             PMAT_AGENT_WORKFLOW_ID: WORKFLOW_ID },
    })));
}
```

**Contract** — `contracts/macs-sweep-v1.yaml` (equations:
`sweep_deterministic: run1 = run2 modulo {timestamp, duration} fields`;
`framing_pure: stdout_bytes ⊂ JSONRPC_frames`; `concurrency_safe: lock_errors
= 0 ∧ scratch_leftovers = 0 for N ≤ 8`; falsification incl. "any LLM call in
Part A code path" ⇒ P0 reject_push).

### F6 — Canonical Roadmap Artifact + Manifest/Doc Drift

**Problem.** `roadmap.yaml` header: "HISTORICAL — SUPERSEDED… live
roadmap/status lives in CHANGELOG.md, `gh issue list`, and specs." `mcp.json`
declares 2 tools; README claims 20. `docs/features/refactor-auto.md:422`:
`"primary": "claude-3-opus", "fallback": "gpt-4-turbo"` — 2.5 generations
stale in docs that double as agent instructions.

**Change.**
`pmat roadmap sync` renders **ROADMAP.yaml** deterministically from
`.pmat-work/` (tickets, ledger states) plus an optional pinned GitHub snapshot
(`--gh-snapshot <sha-of-export>`): BTreeMap ordering, ids sorted, generation
timestamp **excluded** from the content hash, source snapshot ids **included**.
`pmat mcp manifest --write` regenerates `mcp.json` from
`mcp_server_tool_defs.rs` (byte-equal or CB-1656 red). A model-registry doc
`docs/agent-models.md` becomes the single place model ids appear;
**CB-1657** lints the deny-list `{claude-3-*, gpt-4-turbo, claude-2*}` across
`docs/` (allow-listed only inside `agent-models.md` history table).

**Contract** — `contracts/macs-artifacts-v1.yaml` (equations:
`roadmap_canonical: render(state) byte-stable ∧ hash covers sources not
wall-clock`; `manifest_faithful: tools(mcp.json) = tools(tool_defs)`;
`docs_current: deny-list hits = 0 outside allow-list`; falsification rows for
each, P0/P1 as in Appendix A of `pmat-query-search-modes-v1.yaml` style).

---

## 5. Deterministic Ticket Sequence

### 5.1 Order, dependencies, and the total-order rule

```mermaid
graph LR
  T0[MACS-000 bootstrap] --> T1[001 provenance types]
  T1 --> T2[002 capture] --> T12[012 workflow]
  T1 --> T3[003 events] --> T12
  T0 --> T4[004 ladder enum] --> T5[005 gate] --> T6[006 CB checks]
  T4 --> T7[007 CoT schema] --> T8[008 CB-1640] --> T9[009 derivation]
  T0 --> T10[010 effort pins]
  T0 --> T11[011 mcp-sweep] --> T12
  T2 --> T13[013 roadmap sync]
  T0 --> T14[014 manifest]
  T0 --> T15[015 doc drift]
  T2 --> T16[016 ledger verify]; T3 --> T16
```

**Rule R1 (total order).** Execute strictly in ascending ID. The DAG above
proves ascending-ID is a valid topological order; independence between some
tickets is *not* license to parallelize — sequential execution keeps the
ledger single-writer and the receipt chain reproducible (§6-D7). Rule R1 is
itself a falsification condition: two receipts with out-of-order
`work_item_id`s in `ledger.jsonl` ⇒ P0.

**Per-ticket invariants (all tickets).** Contract before code (CB-1400);
design budget cyclomatic ≤ 10 / cognitive ≤ 7 (roadmap meta), hard gate at the
`pmat verify` incremental thresholds (cyclomatic ≤ 30 / cognitive ≤ 25 on
changed files); zero SATD; atomic commit `feat(macs): MACS-0NN <title>`;
completion only via `pmat work falsify && pmat work complete`.

### 5.2 Ticket cards

Each card: **Depends / Files / Contract (`.pmat-work/MACS-0NN/contract.json`)
/ RED tests / Level / Notes.** The canonical execution loop is defined once in
§7 — do not restate it per ticket.

---

**MACS-000 — Bootstrap: tickets, spec registration, deterministic proptest**
Depends: — | Level: **L1** | Files: `proptest.toml`, `docs/specifications/components/modern-agentic-coding-support.md`, `contracts/macs-*.yaml` (6 stubs), `contracts/binding.yaml`
Notes: run the Appendix C command block to create MACS-001…016 via `pmat work add`; flip `deterministic = false` → `true` in `proptest.toml` (found at baseline; a nondeterministic property-RNG contradicts "green here ⇒ green in CI"); commit the six contract YAML stubs with `status: draft`.

```json
{ "ticket": "MACS-000", "kind": "chore", "verification_level": "L1",
  "require": [
    {"claim": "baseline is bab0f9a or descendant", "evidence": "git merge-base --is-ancestor bab0f9a HEAD"},
    {"claim": "CB-1650..1666 unallocated", "evidence": "grep -rhoE 'CB-16(5[0-9]|6[0-6])' docs/ src/ | sort -u | wc -l  # == 0 pre-ticket"}],
  "ensure": [
    {"claim": "17 MACS tickets exist", "evidence": "pmat work status --all | grep -c 'MACS-0' # == 17"},
    {"claim": "proptest RNG deterministic", "evidence": "grep -q '^deterministic = true' proptest.toml"},
    {"claim": "spec + 6 contract stubs committed", "evidence": "ls contracts/macs-*.yaml | wc -l # == 6"}],
  "invariant": [
    {"claim": "no production code changed", "evidence": "git diff --stat HEAD~1 -- src/ | wc -l # == 0"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "build must be replayable (spec §6)",
     "implication": "ticket set + RNG must be fixed before code",
     "evidence_method": "re-run Appendix C on a clean clone; identical ticket ids"}]}
```

---

**MACS-001 — `AgentProvenance` / `AgentEvent` types + canonical serialization + `schema_version`**
Depends: 000 | Level: **L4** | Files: `src/cli/handlers/work_ledger_types.rs`, `work_ledger_receipt.rs`, `fixtures/receipts/v2/*.json`, `src/cli/handlers/work_ledger_tests.rs`
RED tests: `ledger::provenance_roundtrip_prop` (proptest), `ledger::hash_stability_golden_v2`, `ledger::v1_receipts_still_verify`, `ledger::canonical_json_sorted_keys`.

```json
{ "ticket": "MACS-001", "kind": "feature", "verification_level": "L4",
  "contract_yaml": "contracts/macs-provenance-v1.yaml#provenance_roundtrip,hash_stability",
  "require": [
    {"claim": "receipt struct at baseline has 9 fields ending content_hash", "evidence": "grep -A22 'pub struct FalsificationReceipt' src/cli/handlers/work_ledger_receipt.rs"}],
  "ensure": [
    {"claim": "roundtrip property holds", "evidence": "cargo test ledger::provenance_roundtrip_prop"},
    {"claim": "golden hashes byte-stable", "evidence": "cargo test ledger::hash_stability_golden_v2"},
    {"claim": "v1 receipts verify under v1 rules", "evidence": "cargo test ledger::v1_receipts_still_verify"},
    {"claim": "proptest for roundtrip", "evidence": "grep -q 'proptest!' src/cli/handlers/work_ledger_tests.rs"}],
  "invariant": [
    {"claim": "existing ledger.jsonl parses unchanged", "evidence": "cargo test ledger::legacy_jsonl_parse"},
    {"claim": "no field removed/renamed on receipt", "evidence": "git diff HEAD~1 -- src/cli/handlers/work_ledger_receipt.rs | grep -c '^-.*pub ' # == 0"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "old receipts hashed without new fields (F1 §Schema)",
     "implication": "hash must be versioned by schema_version, not recomputed retroactively",
     "evidence_method": "v1_receipts_still_verify red-then-green"}]}
```

---

**MACS-002 — Provenance capture: flags + env, wired into receipts and `LedgerEntry`**
Depends: 001 | Level: **L3** | Files: `src/cli/commands/work_commands_work.rs`, `src/cli/command_dispatcher/command_dispatcher_work.rs`, `src/cli/handlers/work_handlers/core_handlers/{handlers,resolution}.rs`
RED tests: `work::declared_flags_win_over_env`, `work::env_detected_marked_advisory`, `work::ledger_entry_carries_agent`, `work::missing_provenance_yields_none_v2`.

```json
{ "ticket": "MACS-002", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-provenance-v1.yaml#provenance_roundtrip",
  "require": [
    {"claim": "MACS-001 types exported", "evidence": "cargo build 2>&1 | grep -c error # == 0 after use in handlers"}],
  "ensure": [
    {"claim": "--agent-model/--agent-effort/... accepted on start|checkpoint|complete", "evidence": "pmat work start --help | grep -c 'agent-' # >= 5"},
    {"claim": "PMAT_AGENT_* env honored, CLAUDE_CODE_EFFORT_LEVEL detected as advisory", "evidence": "cargo test work::env_detected_marked_advisory"},
    {"claim": "declared beats detected (source=declared)", "evidence": "cargo test work::declared_flags_win_over_env"},
    {"claim": "LedgerEntry gains agent summary {model,effort,harness}", "evidence": "cargo test work::ledger_entry_carries_agent"}],
  "invariant": [
    {"claim": "invocation without any agent info still works (agent=None, v2)", "evidence": "cargo test work::missing_provenance_yields_none_v2"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "detection is best-effort (E9)",
     "implication": "declared flags are canonical; env is advisory and labeled",
     "evidence_method": "declared_flags_win_over_env"}]}
```

---

**MACS-003 — `pmat work event` + refusal blocking (E5)**
Depends: 001 | Level: **L3** | Files: `src/cli/commands/work_commands_work.rs`, new `src/cli/handlers/work_event.rs`, `core_handlers/resolution.rs`
RED tests: `work::event_appends_to_active_ticket`, `work::event_refusal_blocks_complete`, `work::ack_event_unblocks_with_reason`, `work::model_switch_recorded_from_to`.

```json
{ "ticket": "MACS-003", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-provenance-v1.yaml#refusal_gates_completion",
  "require": [
    {"claim": "AgentEvent enum from MACS-001 compiled", "evidence": "pmat query \"AgentEvent\" --limit 1 --files-with-matches"}],
  "ensure": [
    {"claim": "refusal blocks complete until --ack-event", "evidence": "cargo test work::event_refusal_blocks_complete"},
    {"claim": "ack requires non-empty reason, recorded in receipt", "evidence": "cargo test work::ack_event_unblocks_with_reason"},
    {"claim": "model-switch stores {from,to} verbatim", "evidence": "cargo test work::model_switch_recorded_from_to"}],
  "invariant": [
    {"claim": "events are append-only within a ticket", "evidence": "cargo test work::events_append_only"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "non-interactive flagged turns end in refusal (E5)",
     "implication": "a refusal must be a recorded, blocking state — never a silent gap",
     "evidence_method": "event_refusal_blocks_complete"}]}
```

---

**MACS-004 — `VerificationLevel` typed enum + strict parse + migration**
Depends: 000 | Level: **L4** | Files: `src/cli/handlers/work_contract_core.rs`, `src/cli/handlers/work_handlers/ticket_validate_migrate.rs`
RED tests: `ladder::parse_total_strict_prop` (proptest over corruptions: case, whitespace, words), `ladder::ord_matches_numeric`, `ladder::display_parse_id`, `migrate::legacy_strings_interactive_to_typed`.

```json
{ "ticket": "MACS-004", "kind": "feature", "verification_level": "L4",
  "contract_yaml": "contracts/macs-ladder-v1.yaml#parse_total_strict",
  "require": [
    {"claim": "verification_level is String at baseline", "evidence": "pmat query --literal 'verification_level: String' --limit 1"}],
  "ensure": [
    {"claim": "strict parse: only \"L0\"..\"L5\"", "evidence": "cargo test ladder::parse_total_strict_prop"},
    {"claim": "Ord matches numeric order", "evidence": "cargo test ladder::ord_matches_numeric"},
    {"claim": "migration rewrites legacy tickets; invalid=>L0+audit note", "evidence": "cargo test migrate::legacy_strings_interactive_to_typed"}],
  "invariant": [
    {"claim": "serde representation stays the display string (no wire break)", "evidence": "cargo test ladder::serde_string_repr"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "misspellings silently downgrade today (C28)",
     "implication": "parse must be strict and total; storage must go through it",
     "evidence_method": "parse_total_strict_prop with corruption corpus"}]}
```

---

**MACS-005 — Ladder completion gate (`achieved ≥ claimed`)**
Depends: 004 | Level: **L3** | Files: `core_handlers/resolution.rs`, new `src/quality/ladder_evidence.rs`
RED tests: `gate::claim_L4_without_kani_blocks`, `gate::claim_L5_with_sorry_blocks`, `gate::claim_L3_with_passing_falsification_closes`, `gate::achieved_never_stored`.

```json
{ "ticket": "MACS-005", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-ladder-v1.yaml#gate_monotone",
  "require": [
    {"claim": "typed enum landed (MACS-004)", "evidence": "cargo test ladder::parse_total_strict_prop"}],
  "ensure": [
    {"claim": "L4 claim + failing kani harness => LadderShortfall", "evidence": "cargo test gate::claim_L4_without_kani_blocks"},
    {"claim": "L5 claim + any `sorry` => blocked", "evidence": "cargo test gate::claim_L5_with_sorry_blocks"},
    {"claim": "L3 with green falsification_tests[] closes", "evidence": "cargo test gate::claim_L3_with_passing_falsification_closes"}],
  "invariant": [
    {"claim": "achieved_level computed each call, never persisted", "evidence": "cargo test gate::achieved_never_stored"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "evidence sources exist per level (C28 L1..L5 defs)",
     "implication": "achieved_level maps evidence->level; gate compares Ord",
     "evidence_method": "the three gate tests, one per boundary"}]}
```

---

**MACS-006 — Comply checks CB-1653 (+ ladder wiring into receipts)**
Depends: 005 | Level: **L3** | Files: `src/cli/handlers/comply_handlers/...`, `work_ledger_receipt.rs` (receipt gains `claimed_level`, `achieved_level` under v2)
RED tests: `comply::cb1653_red_on_drift`, `comply::cb1653_green_after_fix`, `ledger::receipt_records_both_levels`.

```json
{ "ticket": "MACS-006", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-ladder-v1.yaml#gate_monotone",
  "require": [{"claim": "gate active (MACS-005)", "evidence": "cargo test gate::claim_L4_without_kani_blocks"}],
  "ensure": [
    {"claim": "CB-1653 fires on seeded claim/evidence drift", "evidence": "cargo test comply::cb1653_red_on_drift"},
    {"claim": "receipts carry {claimed,achieved}", "evidence": "cargo test ledger::receipt_records_both_levels"}],
  "invariant": [{"claim": "existing comply checks unaffected", "evidence": "pmat comply check --format json | jq '.regressions|length' # == 0"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "gates without visibility rot",
     "implication": "drift must be a comply check, not only a completion error",
     "evidence_method": "cb1653_red_on_drift"}]}
```

---

**MACS-007 — `ChainOfThoughtStep` v2 schema + parser + migration**
Depends: 004 | Level: **L4** | Files: new `src/models/work_cot.rs`, `work_contract_core.rs`, `ticket_validate_migrate.rs`
RED tests: `cot::v2_roundtrip_prop`, `cot::legacy_prose_migrates_annotated_L0`, `cot::discharge_ref_parses_cot_equation_efact`.

```json
{ "ticket": "MACS-007", "kind": "feature", "verification_level": "L4",
  "contract_yaml": "contracts/macs-cot-v1.yaml#chain_integrity",
  "require": [{"claim": "baseline step is {summary,rationale,timestamp}", "evidence": "pmat query --literal 'ChainOfThoughtStep' --include-source --limit 1"}],
  "ensure": [
    {"claim": "v2 fields {id,assumption,implication,evidence_method,discharged_by}", "evidence": "cargo test cot::v2_roundtrip_prop"},
    {"claim": "legacy prose steps migrate (annotated, not dropped)", "evidence": "cargo test cot::legacy_prose_migrates_annotated_L0"},
    {"claim": "DischargeRef parses CoT-id | contract#equation | E<n>", "evidence": "cargo test cot::discharge_ref_parses_cot_equation_efact"}],
  "invariant": [{"claim": "`pmat work status` still renders steps", "evidence": "cargo test cot::status_renders_v2"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "prose has no derivable edge (C31)",
     "implication": "schema must carry the edge fields before any checker exists",
     "evidence_method": "v2_roundtrip_prop"}]}
```

---

**MACS-008 — Chain-integrity checker (implements CB-1640)**
Depends: 007 | Level: **L4** | Files: `src/models/work_cot.rs` (`check_chain`), comply wiring
RED tests: `cot::undischarged_assumption_violates_1640`, `cot::cycle_detected_violates_1640`, `cot::dag_property_prop` (proptest: random discharge graphs — checker accepts iff DAG rooted in E-facts/equations), `cot::spec_section_3_1_passes` (this document's §3.1 as fixture).

```json
{ "ticket": "MACS-008", "kind": "feature", "verification_level": "L4",
  "contract_yaml": "contracts/macs-cot-v1.yaml#chain_integrity",
  "require": [{"claim": "v2 schema live (MACS-007)", "evidence": "cargo test cot::v2_roundtrip_prop"}],
  "ensure": [
    {"claim": "unmatched assumption => CB-1640 violation", "evidence": "cargo test cot::undischarged_assumption_violates_1640"},
    {"claim": "cycles rejected", "evidence": "cargo test cot::cycle_detected_violates_1640"},
    {"claim": "acceptance iff DAG rooted in evidence (property)", "evidence": "cargo test cot::dag_property_prop"},
    {"claim": "spec §3.1 chain is itself green", "evidence": "cargo test cot::spec_section_3_1_passes"}],
  "invariant": [{"claim": "checker is pure (no I/O)", "evidence": "grep -c 'std::fs\\|tokio' src/models/work_cot.rs # == 0 in check_chain path"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "the spec dogfoods the schema (§3.1)",
     "implication": "the spec's own chain is the first fixture; self-enforcement",
     "evidence_method": "spec_section_3_1_passes"}]}
```

---

**MACS-009 — Auto-derivation: step → proof_obligation + FalsifiableClaim (+ optional require/ensure)**
Depends: 008 | Level: **L3** | Files: `src/models/work_cot.rs` (`derive`), C25/C29/C30 touchpoints, comply check **CB-1658**
RED tests: `derive::one_claim_per_step_verbatim_fields`, `derive::one_obligation_per_step_in_ticket_yaml`, `derive::optional_clause_codegen_gated_by_flag`, `comply::cb1658_red_when_counts_mismatch`.

```json
{ "ticket": "MACS-009", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-cot-v1.yaml#derivation_complete",
  "require": [{"claim": "CB-1640 checker green on fixtures", "evidence": "cargo test cot::spec_section_3_1_passes"}],
  "ensure": [
    {"claim": "|claims| == |steps|, hypothesis/method verbatim", "evidence": "cargo test derive::one_claim_per_step_verbatim_fields"},
    {"claim": "|obligations| == |steps| in generated ticket YAML", "evidence": "cargo test derive::one_obligation_per_step_in_ticket_yaml"},
    {"claim": "CB-1658 enforces completeness", "evidence": "cargo test comply::cb1658_red_when_counts_mismatch"}],
  "invariant": [{"claim": "derivation is deterministic (stable ordering by step id)", "evidence": "cargo test derive::stable_output_two_runs"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "claims must be falsifiable to be worth deriving [9]",
     "implication": "claim.method is copied from evidence_method verbatim — no paraphrase drift",
     "evidence_method": "one_claim_per_step_verbatim_fields"}]}
```

---

**MACS-010 — Effort pinning across skills + CB-1650 lint**
Depends: 000 | Level: **L1→L3** | Files: `.claude/skills/**` (6 files), lint in comply handlers
RED tests: `comply::cb1650_red_on_missing_effort`, `comply::cb1650_red_on_session_only_values`, `comply::cb1650_green_on_repo_skills`.

```json
{ "ticket": "MACS-010", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-skill-effort-v1.yaml#skill_effort_pinned",
  "require": [{"claim": "frontmatter effort is honored by harness (E4)", "evidence": "citation [1] pinned in spec §1"}],
  "ensure": [
    {"claim": "all six skills pinned per F4 table", "evidence": "grep -l '^effort:' .claude/skills/*/*.md | wc -l # == 6"},
    {"claim": "lint rejects max/ultracode in frontmatter", "evidence": "cargo test comply::cb1650_red_on_session_only_values"}],
  "invariant": [{"claim": "skill bodies unchanged (frontmatter-only diff)", "evidence": "git diff HEAD~1 -- .claude/skills | grep -c '^[+-][^+-]' # == frontmatter lines only, reviewed"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "unpinned effort => run-to-run variance + E6 cost risk",
     "implication": "pin per skill; adversarial skills earn xhigh, mechanical do not",
     "evidence_method": "CB-1650 on repo; F4 table as normative mapping"}]}
```

---

**MACS-011 — `pmat qa mcp-sweep`: LLM-free deterministic harness**
Depends: 000 | Level: **L3** | Files: new `src/cli/handlers/qa_mcp_sweep.rs`, `src/agent/mcp_server_tool_defs.rs` (read-only), goldens under `fixtures/mcp-sweep/`
RED tests: `sweep::args_derived_from_every_tool_schema`, `sweep::framing_stdout_pure_jsonrpc_golden`, `sweep::concurrency8_zero_lock_errors_zero_scratch`, `sweep::two_runs_byte_identical_modulo_time`, `sweep::no_llm_symbols_linked`.

```json
{ "ticket": "MACS-011", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-sweep-v1.yaml#sweep_deterministic,framing_pure,concurrency_safe",
  "require": [
    {"claim": "tool defs enumerate full MCP surface", "evidence": "pmat query --literal 'tool_defs' --files-with-matches"}],
  "ensure": [
    {"claim": "every tool called once with schema-derived args", "evidence": "cargo test sweep::args_derived_from_every_tool_schema"},
    {"claim": "stdout is exclusively JSON-RPC frames (byte golden)", "evidence": "cargo test sweep::framing_stdout_pure_jsonrpc_golden"},
    {"claim": "8-way concurrency: 0 lock errors, 0 scratch leftovers", "evidence": "cargo test sweep::concurrency8_zero_lock_errors_zero_scratch"},
    {"claim": "replay determinism", "evidence": "cargo test sweep::two_runs_byte_identical_modulo_time"}],
  "invariant": [
    {"claim": "no model/API client linked in this path", "evidence": "cargo test sweep::no_llm_symbols_linked"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "schema-derivable => model-free (CoT-7, §3.1)",
     "implication": "the sweep must be pure Rust with goldens; fleets add only variance and cost (E6)",
     "evidence_method": "no_llm_symbols_linked + byte-identical replays"}]}
```

---

**MACS-012 — Committed judgment workflow (`release-sweep.ultracode.mjs`) with provenance stamping**
Depends: 002, 003, 011 | Level: **L1→L3** | Files: `contracts/workflows/release-sweep.ultracode.mjs`, `Makefile` target `release-sweep`, book case-study stub
RED tests (harness-side, scripted): `workflow::file_committed_and_lint_clean`, `workflow::reads_only_sweep_artifacts`, `workflow::stamps_PMAT_AGENT_env_on_every_subagent`, `workflow::refusal_path_emits_work_event`.

```json
{ "ticket": "MACS-012", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-sweep-v1.yaml#sweep_deterministic",
  "require": [
    {"claim": "workflow scripts are saveable/versionable (E7)", "evidence": "citation [4] pinned in spec §1"},
    {"claim": "anomaly artifact format frozen by MACS-011", "evidence": "jq -e '.anomalies' artifacts/qa/mcp-sweep.json"}],
  "ensure": [
    {"claim": "script committed; judgment layer only (consumes anomalies, never raw sweep)", "evidence": "node --check contracts/workflows/release-sweep.ultracode.mjs && grep -c 'mcp-sweep.json' contracts/workflows/release-sweep.ultracode.mjs # >= 1"},
    {"claim": "every subagent spawn sets PMAT_AGENT_HARNESS/WORKFLOW_ID", "evidence": "grep -c 'PMAT_AGENT_' contracts/workflows/release-sweep.ultracode.mjs # >= 2"},
    {"claim": "refusal handling calls `pmat work event --type refusal`", "evidence": "grep -c \"work event --type refusal\" contracts/workflows/release-sweep.ultracode.mjs # >= 1"}],
  "invariant": [
    {"claim": "session-bound resume never relied on: durable state = .pmat-work checkpoints (E7)", "evidence": "grep -c 'resume' contracts/workflows/release-sweep.ultracode.mjs # == 0"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "orchestration evaporates with the session (E7)",
     "implication": "the committed script is the deterministic artifact; receipts are the durable state",
     "evidence_method": "file_committed_and_lint_clean + checkpoint greps"}]}
```

---

**MACS-013 — `pmat roadmap sync`: canonical ROADMAP.yaml**
Depends: 002 | Level: **L3** | Files: `src/roadmap/{generator,commands}/...`, root `ROADMAP.yaml`, comply **CB-1655** (freshness)
RED tests: `roadmap::render_byte_stable_two_runs`, `roadmap::hash_covers_sources_not_wallclock`, `roadmap::gh_snapshot_pinned_by_id`, `comply::cb1655_red_when_ledger_newer_than_artifact`.

```json
{ "ticket": "MACS-013", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-artifacts-v1.yaml#roadmap_canonical",
  "require": [
    {"claim": "roadmap.yaml is marked HISTORICAL at baseline", "evidence": "head -12 roadmap.yaml | grep -c SUPERSEDED # == 1"},
    {"claim": "work store is readable source of truth", "evidence": "pmat work status --all --format json | jq -e 'length >= 1'"}],
  "ensure": [
    {"claim": "two renders byte-identical", "evidence": "cargo test roadmap::render_byte_stable_two_runs"},
    {"claim": "content hash excludes wall-clock, includes source snapshot ids", "evidence": "cargo test roadmap::hash_covers_sources_not_wallclock"},
    {"claim": "CB-1655 red when ledger newer than ROADMAP.yaml", "evidence": "cargo test comply::cb1655_red_when_ledger_newer_than_artifact"}],
  "invariant": [
    {"claim": "historical roadmap.yaml untouched (new artifact is ROADMAP.yaml)", "evidence": "git diff --name-only HEAD~1 | grep -cx roadmap.yaml # == 0"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "GitHub is a mutable remote (F6 five-whys Why4)",
     "implication": "renders may only consume a pinned snapshot id, never live queries",
     "evidence_method": "gh_snapshot_pinned_by_id"}]}
```

---

**MACS-014 — `pmat mcp manifest --write` + CB-1656 drift check**
Depends: 000 | Level: **L3** | Files: new `src/cli/handlers/mcp_manifest.rs`, `mcp.json`, comply wiring
RED tests: `manifest::generated_equals_tool_defs`, `manifest::byte_equal_or_cb1656_red`, `manifest::readme_tool_count_matches`.

```json
{ "ticket": "MACS-014", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-artifacts-v1.yaml#manifest_faithful",
  "require": [{"claim": "mcp.json declares 2 tools at baseline", "evidence": "python3 -c \"import json;print(len(json.load(open('mcp.json'))['mcp']['tools']))\" # == 2"}],
  "ensure": [
    {"claim": "regenerated manifest enumerates all tool defs", "evidence": "cargo test manifest::generated_equals_tool_defs"},
    {"claim": "CB-1656 red on any byte drift", "evidence": "cargo test manifest::byte_equal_or_cb1656_red"},
    {"claim": "README tool count sourced from manifest", "evidence": "cargo test manifest::readme_tool_count_matches"}],
  "invariant": [{"claim": "manifest generation is pure (defs -> bytes)", "evidence": "cargo test manifest::two_runs_identical"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "agents read manifests as ground truth",
     "implication": "manifest must be generated, never hand-edited; drift is a comply failure",
     "evidence_method": "byte_equal_or_cb1656_red"}]}
```

---

**MACS-015 — Doc drift purge + model registry + CB-1657**
Depends: 000 | Level: **L1→L3** | Files: `docs/features/refactor-auto.md`, new `docs/agent-models.md`, comply lint
RED tests: `comply::cb1657_red_on_denylist_hit`, `comply::cb1657_allowlist_only_in_registry_history`, `docs::refactor_auto_references_registry`.

```json
{ "ticket": "MACS-015", "kind": "bugfix", "verification_level": "L3",
  "contract_yaml": "contracts/macs-artifacts-v1.yaml#docs_current",
  "require": [{"claim": "stale ids present at baseline", "evidence": "grep -n 'claude-3-opus' docs/features/refactor-auto.md # line ~422"}],
  "ensure": [
    {"claim": "deny-list {claude-3-*, claude-2*, gpt-4-turbo} clean outside registry", "evidence": "cargo test comply::cb1657_red_on_denylist_hit"},
    {"claim": "refactor-auto points at docs/agent-models.md", "evidence": "cargo test docs::refactor_auto_references_registry"}],
  "invariant": [{"claim": "registry history table preserves old ids for archaeology", "evidence": "grep -c 'claude-3-opus' docs/agent-models.md # >= 1"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "pmat docs double as agent instructions (README positioning)",
     "implication": "stale model ids are executable misinformation; lint them like code",
     "evidence_method": "cb1657_red_on_denylist_hit"}]}
```

---

**MACS-016 — `pmat work ledger verify` (capstone) + provenance report**
Depends: 001, 002, 003 | Level: **L3** | Files: extend `work_ledger.rs` service, CLI subcommand
RED tests: `ledger::verify_recomputes_all_hashes_v1_and_v2`, `ledger::verify_red_on_tampered_entry`, `ledger::report_groups_by_model_effort_harness`, `ledger::rule_R1_order_violation_detected`.

```json
{ "ticket": "MACS-016", "kind": "feature", "verification_level": "L3",
  "contract_yaml": "contracts/macs-provenance-v1.yaml#hash_stability",
  "require": [{"claim": "F1 fields live end-to-end", "evidence": "cargo test work::ledger_entry_carries_agent"}],
  "ensure": [
    {"claim": "verify recomputes every hash under its schema_version", "evidence": "cargo test ledger::verify_recomputes_all_hashes_v1_and_v2"},
    {"claim": "tamper => non-zero exit + entry id", "evidence": "cargo test ledger::verify_red_on_tampered_entry"},
    {"claim": "report: receipts grouped by {model,effort,harness}", "evidence": "cargo test ledger::report_groups_by_model_effort_harness"},
    {"claim": "Rule R1 (ascending MACS order) checkable", "evidence": "cargo test ledger::rule_R1_order_violation_detected"}],
  "invariant": [{"claim": "verify is read-only on the ledger", "evidence": "cargo test ledger::verify_readonly"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "an unverifiable ledger is a story, not evidence",
     "implication": "the capstone re-derives every receipt hash and the build's own ordering (R1)",
     "evidence_method": "verify_red_on_tampered_entry + rule_R1_order_violation_detected"}]}
```

---

## 6. Determinism Guarantees

- **D1 — Total order.** Rule R1 (§5.1): ascending-ID execution, machine-checked post-hoc by MACS-016. Independence never licenses reordering.
- **D2 — Contract precedes code.** CB-1400 (existing) + every card's `contract_yaml` binding; an implementation diff without a prior contract commit is a P0.
- **D3 — Single blocking gate.** `pmat verify --format json` is the only commit gate (format → complexity → satd → clippy → tests, fail-fast) — "green here ⇒ green in CI" per the verify spec.
- **D4 — Canonical bytes.** All new serialized artifacts (receipts v2, ROADMAP.yaml, mcp.json, sweep reports) use sorted-key canonical JSON/YAML; hashes are SHA-256 over those bytes; wall-clock is excluded from *build-artifact* hashes and versioned inside *record* hashes (receipts).
- **D5 — Deterministic tests.** MACS-000 flips `proptest.toml` `deterministic = true` (baseline ships `false` — a live nondeterminism bug for a project whose gate promises replayability); goldens pin framing and hashes; CI matrix asserts cross-OS byte-stability (F1 `hash_stability`).
- **D6 — Toolchain pinning.** Builds run under the repo's Nix flake / locked Cargo graph; MACS adds no floating-version dependency (Sovereign 80/20 policy per CLAUDE.md — zero new external deps in MACS).
- **D7 — Single-writer ledger.** Sequential execution + the existing advisory-lock discipline (`src/services/metric_trends_core.rs` pattern) keeps `ledger.jsonl` append-only with a reproducible order; MACS-011 extends the zero-lock-error invariant to 8-way MCP concurrency.
- **D8 — Model nondeterminism fenced.** The model never touches an acceptance decision: it *proposes* (edits, CoT steps, anomaly judgments) and every proposal is either discharged (F3), gated (F2), stamped (F1), or rejected. Effort is pinned per skill (F4) so even the proposal distribution is configuration-stable; interruptions are recorded states (F1/E5), so a Fable→Opus switch or refusal can never masquerade as a green path.
- **D9 — No accidental fan-out.** §7 forbids ultracode/the trigger keyword during the MACS build (E1/E8); the one sanctioned orchestration is the committed script of MACS-012, which stamps provenance by construction.

---

## 7. Agent Execution Protocol (Fable 5)

**Session setup** (once):

```bash
# In Claude Code, model claude-fable-5. Default effort `high` is correct [1];
# do NOT enable ultracode for this build: the sequence is intentionally
# serial (R1), so fan-out buys variance and token burn (E6), not speed.
# Avoid the bare trigger keyword in prompts (E8). Do not run /effort max.
export PMAT_AGENT_MODEL=claude-fable-5
export PMAT_AGENT_EFFORT=high            # keep equal to the session setting
export PMAT_AGENT_HARNESS=claude-code
pmat verify --format json                 # must be green before MACS-000
```

**Per-ticket canonical loop** (identical for MACS-000…016; referenced by every card):

```bash
T=MACS-0NN
pmat work start "$T" --agent-model "$PMAT_AGENT_MODEL" \
                     --agent-effort "$PMAT_AGENT_EFFORT" \
                     --agent-harness "$PMAT_AGENT_HARNESS"   # flags exist from MACS-002 on;
                                                             # for 000–001 env vars alone suffice
# 1. CONTRACT  — copy the card's contract.json into .pmat-work/$T/contract.json;
#                commit contract-only: "contract(macs): $T"
pmat comply check
# 2. RED       — write the card's named failing tests; run them; confirm red.
# 3. GREEN     — minimal implementation. Debugging ONLY via `pmat five-whys` [11].
# 4. GATE      — loop until green:
pmat verify --format json   # fix exactly the file:line:rule violations reported
# 5. FALSIFY + CLOSE
pmat work falsify "$T"
pmat work complete "$T"     # blocked if: ladder shortfall (F2), unacked
                            # Refusal (F3/E5), or stale receipt
git commit -am "feat(macs): $T <card title>"   # atomic, one ticket per commit
pmat work checkpoint "$T"   # durable state lives here, never in the session (E7)
```

**Interruption handling** (E5): if a turn ends in a refusal or the harness
switches models, first command of the next turn:

```bash
pmat work event --type refusal --note "turn ended flagged"     # or model-switch --from ... --to ...
# investigate via five-whys; then either rephrase and continue, or
pmat work event --ack-event <id> --reason "<root cause + disposition>"
```

**Prohibited during the build**: skipping Step 1 (CB-1400 red); `grep`/`glob`
for code search (use `pmat query`, per CLAUDE.md); iterating on git hooks
(muda — one edit max, per project memory); enabling ultracode or typing its
trigger keyword; parallel ticket execution; hand-editing `mcp.json`,
`ROADMAP.yaml`, or any receipt.

---

## 8. Component-Level Acceptance (Given/When/Then + falsification)

- **Given** a fresh clone at the MACS merge commit, **when** `pmat comply
  check` runs, **then** CB-1640, CB-1650–1658 all report and all pass.
- **Given** the ledger after the build, **when** `pmat work ledger verify
  --report` runs, **then** every hash re-derives, every MACS receipt carries
  `agent.model = "claude-fable-5"`, and R1 ordering holds.
- **Given** a ticket claiming L4 with a failing Kani harness, **when**
  `pmat work complete` runs, **then** it exits with `LadderShortfall`.
- **Given** two runs of `pmat qa mcp-sweep --concurrency 8`, **when** their
  JSON outputs are diffed modulo time fields, **then** the diff is empty.
- **Given** an edit that reintroduces `claude-3-opus` outside the registry,
  **when** `pmat comply check` runs, **then** CB-1657 is red.

```yaml
falsification:   # component roll-up (contracts/macs-v1.yaml appendix)
  - {condition: "Any MACS receipt lacks agent provenance",             severity: P0, action: reject_push}
  - {condition: "Any ticket closes above its achieved ladder level",   severity: P0, action: reject_push}
  - {condition: "Any CoT step ships undischarged (CB-1640 bypass)",    severity: P0, action: reject_push}
  - {condition: "Any skill lacks a pinned effort (CB-1650 bypass)",    severity: P1, action: block_release}
  - {condition: "mcp-sweep replay differs beyond time fields",         severity: P0, action: reject_push}
  - {condition: "ROADMAP.yaml/mcp.json drift with green comply",       severity: P0, action: reject_push}
  - {condition: "Ledger shows MACS receipts out of ascending order",   severity: P1, action: block_release}
```

---

## 9. References

**External (accessed 2026-07-02):**
[1] Anthropic. *Model configuration.* Claude Code Docs. https://code.claude.com/docs/en/model-config — ultracode definition (xhigh + orchestration; session-only), effort frontmatter for skills/subagents, Fable 5 default `high`, flagged-request fallback/refusal semantics.
[2] Anthropic. *Effort.* Claude Platform Docs. https://platform.claude.com/docs/en/build-with-claude/effort — effort levels and model support (xhigh on Fable 5, Mythos 5, Opus 4.8/4.7, Sonnet 5); Fable 5 adaptive thinking; ultracode as harness permission, not API level.
[3] Anthropic. *Introducing dynamic workflows.* https://claude.com/blog/introducing-dynamic-workflows-in-claude-code — 10s–100s parallel subagents; independent verification before results surface; availability.
[4] Developers Digest. *Ultracode: Claude Code Multi-Agent Orchestration Mode Explained.* 2026-06. https://www.developersdigest.tech/blog/ultracode-effort-level-explained — v2.1.160 trigger rename; v2.1.172 subagent nesting (5 levels); saveable, versionable workflow scripts; xhigh model list per config docs.
[5] claudefa.st. *Ultracode in Claude Code: Effort Setting Explained.* 2026-07. https://claudefa.st/blog/guide/development/ultracode — session-only semantics; `CLAUDE_CODE_EFFORT_LEVEL`; practitioner cost reports (62 subagents / 18 min).
[6] AI Weekly (summarizing r/ClaudeAI). *Fable + Ultracode consumed full 5-hour usage limit in ~7 minutes via codebase-wide audit prompt.* 2026-06. https://aiweekly.co/node/2750.
[7] BitsMinds. *What is "ultracode" mode in Claude Code?* 2026-07. https://www.bitsminds.com/news/claude-code-ultra-code-mode-ultracode-explained-2026 — session-bound resume for workflow runs.
[8] Bruni, R., et al. (2026). *Agent Behavioral Contracts.* arXiv:2602.22302 — DbC extended to multi-agent systems (already normative via C10).
[9] Popper, K. (1959). *The Logic of Scientific Discovery.* Hutchinson — falsifiability as demarcation; basis of the receipt model.
[10] Meyer, B. (1997). *Object-Oriented Software Construction*, 2nd ed. Prentice Hall — Design by Contract (require/ensure/invariant).
[11] Ohno, T. (1988). *Toyota Production System: Beyond Large-Scale Production.* Productivity Press — Five-Whys root-cause method; Jidoka.

**Internal (repo, baseline `bab0f9a`):** `docs/specifications/components/agent-integration.md` (C10, CB-1400 family), `pmat-work-verification-ladder.md` (C28), `pmat-work-cot-proof-derivation.md` (C31, CB-1640 reservation), `docs/agent-instructions/autonomous-verify-loop.md`, `docs/agent-instructions/provable-contract-first-agents.md`, `contracts/benchmarking-v1.yaml` (contract schema exemplar), `contracts/binding.yaml`, `src/cli/handlers/work_ledger_receipt.rs`, `src/services/metric_trends_core.rs` (advisory-lock exemplar).

---

## Appendix A — `.pmat-work/<TICKET>/contract.json` schema (MACS profile)

```json
{ "$schema": "https://paiml.dev/schemas/pmat-work-contract-macs-1.json",
  "ticket": "MACS-0NN",
  "kind": "feature|bugfix|refactor|chore",
  "verification_level": "L0|L1|L2|L3|L4|L5",
  "contract_yaml": "contracts/<file>.yaml#<equation>[,<equation>...]",
  "require":  [{"claim": "…", "evidence": "<shell command>"}],
  "ensure":   [{"claim": "…", "evidence": "<shell command>"}],
  "invariant":[{"claim": "…", "evidence": "<shell command>"}],
  "chain_of_thought": [
    {"id": "CoT-1", "assumption": "…", "implication": "…",
     "evidence_method": "…", "discharged_by": "CoT-0|contract#eq|E<n>|null"}]}
```

Every `evidence` value is executable verbatim from the repo root; `pmat work
falsify` runs them and writes verdicts into the receipt.

## Appendix B — `contracts/binding.yaml` additions (land with each ticket)

```yaml
- {contract: macs-provenance-v1.yaml, equation: provenance_roundtrip,
   source_file: src/cli/handlers/work_ledger_types.rs,   function: AgentProvenance,        status: planned}
- {contract: macs-provenance-v1.yaml, equation: refusal_gates_completion,
   source_file: src/cli/handlers/work_event.rs,          function: handle_work_event,      status: planned}
- {contract: macs-ladder-v1.yaml,     equation: parse_total_strict,
   source_file: src/cli/handlers/work_contract_core.rs,  function: VerificationLevel::from_str, status: planned}
- {contract: macs-ladder-v1.yaml,     equation: gate_monotone,
   source_file: src/cli/handlers/work_handlers/core_handlers/resolution.rs, function: achieved_level, status: planned}
- {contract: macs-cot-v1.yaml,        equation: chain_integrity,
   source_file: src/models/work_cot.rs,                  function: check_chain,            status: planned}
- {contract: macs-cot-v1.yaml,        equation: derivation_complete,
   source_file: src/models/work_cot.rs,                  function: derive,                 status: planned}
- {contract: macs-skill-effort-v1.yaml, equation: skill_effort_pinned,
   source_file: src/cli/handlers/comply_handlers/check_handlers/check_skill_effort.rs, function: check_cb1650, status: planned}
- {contract: macs-sweep-v1.yaml,      equation: sweep_deterministic,
   source_file: src/cli/handlers/qa_mcp_sweep.rs,        function: handle_mcp_sweep,       status: planned}
- {contract: macs-artifacts-v1.yaml,  equation: roadmap_canonical,
   source_file: src/roadmap/generator.rs,                function: render_canonical,       status: planned}
- {contract: macs-artifacts-v1.yaml,  equation: manifest_faithful,
   source_file: src/cli/handlers/mcp_manifest.rs,        function: handle_manifest_write,  status: planned}
```

## Appendix C — Bootstrap block (run inside MACS-000; ids must come out ascending)

```bash
set -euo pipefail
add() { pmat work add "$1"; }   # visible aliases: new|create|a
add "MACS-000 Bootstrap: tickets, spec, contracts, deterministic proptest"
add "MACS-001 AgentProvenance/AgentEvent types + canonical serialization + schema_version"
add "MACS-002 Provenance capture: --agent-* flags + PMAT_AGENT_* env into receipts/ledger"
add "MACS-003 pmat work event: refusal/model-switch/session-restart + completion blocking"
add "MACS-004 VerificationLevel typed enum + strict parse + ticket migration"
add "MACS-005 Ladder completion gate: achieved_level >= claimed"
add "MACS-006 CB-1653 ladder drift check + receipts record claimed/achieved"
add "MACS-007 ChainOfThoughtStep v2 schema + parser + legacy migration"
add "MACS-008 Chain-integrity checker implementing CB-1640"
add "MACS-009 CoT auto-derivation: obligations + falsifiable claims + CB-1658"
add "MACS-010 Effort frontmatter pinning across skills + CB-1650 lint"
add "MACS-011 pmat qa mcp-sweep: LLM-free deterministic MCP harness"
add "MACS-012 Committed ultracode judgment workflow with provenance stamping"
add "MACS-013 pmat roadmap sync: canonical ROADMAP.yaml + CB-1655 freshness"
add "MACS-014 pmat mcp manifest --write + CB-1656 drift check"
add "MACS-015 Doc drift purge + docs/agent-models.md registry + CB-1657"
add "MACS-016 pmat work ledger verify: hashes, provenance report, R1 order"
pmat work status --all | grep -c 'MACS-0'   # expect 17
```

---

*Self-enforcement note: once MACS-008 lands, §3.1 of this document is parsed
as a CoT fixture (`cot::spec_section_3_1_passes`) — the spec is subject to the
check it specifies.*