mempal 0.7.0

Project memory for coding agents. Single binary, hybrid search, knowledge graph.
# MIND MODEL DESIGN

**Date**: 2026-04-21
**Status**: P42 baseline implemented - future work remains explicit
**Scope**: Capture the mind-model decisions discussed in this conversation and map them to a practical system design.

## Implementation Checkpoint

P42 baseline means the core mind-model architecture is implemented enough to
operate as a governed memory system:

- Stage 1 typed drawers separate raw evidence from governed knowledge.
- `dao_tian -> dao_ren -> shu -> qi` runtime context assembly exists through
  `mempal context` and `mempal_context`.
- Stage 1 knowledge supports distill, gate, promote, demote, and outward anchor
  publication through CLI and MCP surfaces.
- Phase-2 `knowledge_cards`, `knowledge_evidence_links`, and
  `knowledge_events` exist in the same SQLite `palace.db`.
- Stage-1 knowledge drawers can be backfilled into Phase-2 cards through an
  explicit dry-run-first apply command.
- Phase-2 cards now have governed gate, promote, and demote lifecycle surfaces
  in CLI and MCP.

P42 baseline is not a claim that every future runtime integration is complete.
It marks the point where the design is no longer only a discussion capture: the
main storage, governance, and lifecycle surfaces exist and are test-backed.

## One-Sentence Thesis

The system should treat memory as a governed knowledge evolution layer where raw evidence is accumulated first, abstract knowledge is distilled second, and high-level `dao` is woken up before `shu` and `qi` at runtime.

## Goal

This design defines how to combine:

- memory
- skills
- external research tools
- runtime evaluation and promotion gates

into a coherent agent cognition system instead of treating them as unrelated parts.

The key idea is:

- memory is not just storage
- skills are not just static instructions
- research tools are not just retrieval utilities

Together they should form a disciplined loop:

1. gather evidence
2. distill candidate knowledge
3. promote only with sufficient evidence
4. demote when contradicted or obsolete

## Core Vocabulary

### Dao / Shu / Qi

This design adopts the following knowledge hierarchy:

- `dao`: high-level knowledge and governing principles
- `shu`: reusable methods, workflows, and procedural know-how
- `qi`: concrete tools, commands, interfaces, and tool-specific usage knowledge

`dao` itself has two levels:

- `dao_tian`: universal law; the most stable, cross-domain, objective knowledge
- `dao_ren`: domain law; stable patterns within a given field

Relationship:

- `dao_tian` shapes how the agent understands reality
- `dao_ren` shapes how the agent understands a specific field
- `shu` shapes how the agent acts
- `qi` shapes what the agent uses to act

### Memory Domains

These are independent from `dao / shu / qi`.

Memory domains answer: "who is this memory for?"

- `project`
- `agent`
- `skill`
- `global`

### Provenance

Every memory item should also record where it came from:

- `runtime`
- `research`
- `human`

## Crucial Orthogonality

`dao / shu / qi` is **not** the same axis as project memory, agent memory, or skill memory.

These are orthogonal coordinate systems:

1. `memory domain`
2. `knowledge tier`
3. `field`
4. `provenance`
5. `anchor`

Example:

- a debugging checklist may be `domain=skill`, `tier=shu`, `field=software-engineering`
- a model-specific CLI behavior note may be `domain=agent`, `tier=qi`, `field=tooling`
- a high-level principle like "evidence precedes assertion" may be `domain=global`, `tier=dao_tian`, `field=epistemics`

This orthogonality is required to avoid confusing:

- local project lessons with universal law
- temporary tool behavior with stable domain knowledge
- workflow tips with governing principles
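
A minimal sketch of the five axes as independent fields, so that no axis can be inferred from another. The type and function names are illustrative, not the actual mempal schema, and the `provenance` and `anchor` values on the two examples are assumed because the examples above only state domain, tier, and field.

```rust
// Each axis is its own type; a memory item carries one value per axis.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Domain { Project, Agent, Skill, Global }

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier { Qi, Shu, DaoRen, DaoTian }

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provenance { Runtime, Research, Human }

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnchorKind { Global, Repo, Worktree }

#[derive(Debug, Clone, PartialEq, Eq)]
struct Coordinates {
    domain: Domain,
    tier: Tier,
    field: String,
    provenance: Provenance,
    anchor: AnchorKind,
}

/// The debugging checklist example. Provenance and anchor are assumed values.
fn debugging_checklist() -> Coordinates {
    Coordinates {
        domain: Domain::Skill,
        tier: Tier::Shu,
        field: "software-engineering".to_string(),
        provenance: Provenance::Human, // assumption
        anchor: AnchorKind::Repo,      // assumption
    }
}

/// The model-specific CLI behavior note. Provenance and anchor are assumed values.
fn cli_behavior_note() -> Coordinates {
    Coordinates {
        domain: Domain::Agent,
        tier: Tier::Qi,
        field: "tooling".to_string(),
        provenance: Provenance::Runtime, // assumption
        anchor: AnchorKind::Worktree,    // assumption
    }
}
```

Because every axis is a separate field, a query can filter on `tier` without saying anything about `domain`, which is the separation this section argues for.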

## Anchor Model

Project identity should not be overloaded onto `wing`.

`wing` remains a semantic partitioning axis. It answers:

- what semantic area does this memory belong to?

It should not also answer:

- which checkout does this memory belong to?
- which branch experiment does this memory belong to?

That requires a separate anchor axis.

### Anchor Kinds

The recommended anchor model is:

- `global`
- `repo`
- `worktree`

Meaning:

- `global`: not tied to a repository checkout; used for cross-project memory, especially high-level `dao`
- `repo`: shared memory for the logical repository across branches and worktrees
- `worktree`: branch-local or experiment-local memory bound to one checkout path

### Why Worktree Must Exist

A repo-only project anchor is insufficient.

If memory is anchored only to the repo root:

- branch experiments contaminate each other
- temporary workflows and conclusions leak across unrelated checkouts
- failed experiments in one worktree pollute stable reasoning in another

Using the worktree path as a memory anchor preserves checkout-local memory.

### Why Worktree Alone Is Not Enough

A worktree-only anchor is also insufficient.

If memory is anchored only to worktree path:

- stable project knowledge fragments across checkouts
- verified project-wide `shu` and `dao_ren` become hard to share
- each new worktree starts too empty

Therefore the recommended model is dual-anchor, not worktree-only.

### Recommended Dual-Anchor Design

Every project-tied memory should be able to attach to:

- a `repo_anchor`
- optionally a `worktree_anchor`

This yields a useful separation:

- stable shared project memory lives at `repo`
- branch-local experiments, transient failures, and temporary heuristics live at `worktree`
- universal knowledge lives at `global`

### Anchor Does Not Replace Domain

Anchor and domain answer different questions.

- `domain` asks: who is this memory for?
- `anchor` asks: which persistence scope does it belong to?

Examples:

- a global epistemic law may be `domain=global`, `anchor=global`
- a project-wide build rule may be `domain=project`, `anchor=repo`
- a branch-local debugging lesson may be `domain=project`, `anchor=worktree`

### Stage-1 Anchor Fields

For the bootstrap drawer model, the minimum anchor fields should be:

- `anchor_kind`: `global | repo | worktree`
- `anchor_id`: normalized identifier for that anchor
- optional `parent_anchor_id`

The parent relationship is primarily for:

- `worktree -> repo`

This allows branch-local memory to inherit a stable project parent.
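
As a rough sketch, the three fields and the `worktree -> repo` parent link could be modeled as below. The constructor names, the `scope_chain` helper, and the `"global"` id string are assumptions made for illustration; only the field names come from the list above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnchorKind { Global, Repo, Worktree }

#[derive(Debug, Clone, PartialEq, Eq)]
struct Anchor {
    anchor_kind: AnchorKind,
    anchor_id: String,
    parent_anchor_id: Option<String>,
}

impl Anchor {
    /// Fixed symbolic id; the literal "global" is a placeholder.
    fn global() -> Self {
        Anchor { anchor_kind: AnchorKind::Global, anchor_id: "global".to_string(), parent_anchor_id: None }
    }

    fn repo(repo_id: &str) -> Self {
        Anchor { anchor_kind: AnchorKind::Repo, anchor_id: repo_id.to_string(), parent_anchor_id: None }
    }

    /// A worktree anchor records its repo as parent.
    fn worktree(worktree_id: &str, repo: &Anchor) -> Self {
        Anchor {
            anchor_kind: AnchorKind::Worktree,
            anchor_id: worktree_id.to_string(),
            parent_anchor_id: Some(repo.anchor_id.clone()),
        }
    }
}

/// Ids from most local to parent: the anchor itself, then its parent if any.
fn scope_chain(anchor: &Anchor) -> Vec<&str> {
    let mut chain = vec![anchor.anchor_id.as_str()];
    if let Some(parent) = &anchor.parent_anchor_id {
        chain.push(parent.as_str());
    }
    chain
}
```

With this shape, a lookup scoped to a worktree can fall back to the parent repo id without re-deriving it from the filesystem.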

### Anchor Generation Rules

At stage 1, the anchor rules should be deterministic and filesystem-derived.

Recommended generation:

- `global`
  - fixed symbolic id, not derived from cwd
- `repo`
  - normalized repository identity
- `worktree`
  - normalized checkout identity

The critical rule is:

- do not derive anchor identity from `wing`

### Repo Anchor

The `repo_anchor` should identify the logical repository shared by all worktrees.

Recommended source:

- repository top-level path or a canonical repo identity derived from git metadata

The exact encoding can evolve, but the semantics should remain:

- all worktrees of the same repo map to the same `repo_anchor`

### Worktree Anchor

The `worktree_anchor` should identify a specific checkout path.

Recommended source:

- canonical worktree path

Semantics:

- different worktree paths produce different `worktree_anchor`s
- the main checkout and each extra git worktree are treated as separate worktree anchors

### Default Assignment Policy

At stage 1, memories should default to the following anchors:

- `dao_tian`
  - `global`
- stable `dao_ren`
  - usually `global` or `repo`, depending on whether it is cross-project or repo-specific
- project-shared `shu`
  - `repo`
- branch-local `qi` and experimental `shu`
  - `worktree`
- runtime observations from the current checkout
  - `worktree`
- external research evidence intended to support the current repo
  - `repo` by default, `worktree` when clearly branch-local
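
The defaults above can be read as a small decision table. The sketch below encodes it as a single function; `MemoryClass` and its boolean flags are hypothetical names introduced here to capture the "depending on" cases, not existing mempal types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnchorKind { Global, Repo, Worktree }

/// Hypothetical classification of a memory at write time.
#[derive(Debug, Clone, Copy)]
enum MemoryClass {
    DaoTian,
    StableDaoRen { cross_project: bool },
    ProjectSharedShu,
    ExperimentalShu,
    BranchLocalQi,
    RuntimeObservation,
    ResearchEvidence { branch_local: bool },
}

/// Default anchor per the stage-1 assignment policy.
fn default_anchor(class: MemoryClass) -> AnchorKind {
    match class {
        MemoryClass::DaoTian => AnchorKind::Global,
        MemoryClass::StableDaoRen { cross_project: true } => AnchorKind::Global,
        MemoryClass::StableDaoRen { cross_project: false } => AnchorKind::Repo,
        MemoryClass::ProjectSharedShu => AnchorKind::Repo,
        MemoryClass::ExperimentalShu
        | MemoryClass::BranchLocalQi
        | MemoryClass::RuntimeObservation => AnchorKind::Worktree,
        MemoryClass::ResearchEvidence { branch_local: true } => AnchorKind::Worktree,
        MemoryClass::ResearchEvidence { branch_local: false } => AnchorKind::Repo,
    }
}
```

These are defaults only; a caller could still choose a different anchor explicitly when the situation calls for it.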

### Runtime Wake-Up with Anchors

When anchors are present, runtime assembly should prefer:

1. current `worktree`
2. current `repo`
3. `global`

This gives the agent:

- local experimental context first
- stable project memory second
- universal law last, but still available

This ordering complements the knowledge-tier wake-up order rather than replacing it.

In other words, anchor filtering and `dao / shu / qi` ordering are separate passes.
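
A sketch of the anchor pass on its own, assuming items anchored to other worktrees or other repos are filtered out rather than ranked last (the text above does not say which). `Item`, `anchor_rank`, and `select_by_anchor` are illustrative names, and the id strings in the usage are placeholders.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnchorKind { Global, Repo, Worktree }

#[derive(Debug)]
struct Item {
    statement: &'static str,
    anchor_kind: AnchorKind,
    anchor_id: &'static str,
}

/// 0 = current worktree, 1 = current repo, 2 = global.
/// `None` means the item belongs to some other checkout or repo (assumed: excluded).
fn anchor_rank(item: &Item, current_worktree: &str, current_repo: &str) -> Option<u8> {
    match item.anchor_kind {
        AnchorKind::Worktree if item.anchor_id == current_worktree => Some(0),
        AnchorKind::Repo if item.anchor_id == current_repo => Some(1),
        AnchorKind::Global => Some(2),
        _ => None,
    }
}

/// Keeps in-scope items and orders them worktree, then repo, then global.
fn select_by_anchor<'a>(
    items: &'a [Item],
    current_worktree: &str,
    current_repo: &str,
) -> Vec<&'a Item> {
    let mut ranked: Vec<(u8, &'a Item)> = items
        .iter()
        .filter_map(|item| anchor_rank(item, current_worktree, current_repo).map(|r| (r, item)))
        .collect();
    // Stable sort: items with equal rank keep their input order.
    ranked.sort_by_key(|pair| pair.0);
    ranked.into_iter().map(|pair| pair.1).collect()
}
```

Tier ordering would then be applied as a separate pass over the result, which keeps the two concerns independent.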

### Anchor Promotion Is Separate from Tier Promotion

The system needs two distinct upward movements:

1. knowledge-tier promotion
2. anchor-scope publication

These are not the same operation.

Examples:

- `qi @ worktree -> shu @ worktree` is a tier promotion
- `shu @ worktree -> shu @ repo` is an anchor publication

Recommended publication chain:

- `worktree -> repo -> global`

Meaning:

- new or experimental memory should usually start at `worktree`
- only verified, shareable project memory should move to `repo`
- only cross-project law should live at `global`

This yields the governing principle:

- write local first
- publish outward only after evidence justifies it

At stage 1, this can remain a disciplined workflow and data-model invariant even
if a full `publish_anchor` API does not exist yet.
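
To make the distinction concrete, the sketch below models the two movements as functions that each change exactly one field. The names `Placement`, `promote_tier`, and `publish_outward` are hypothetical, and the evidence gates that should guard both moves are deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier { Qi, Shu, DaoRen, DaoTian }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnchorKind { Worktree, Repo, Global }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Placement {
    tier: Tier,
    anchor: AnchorKind,
}

/// Tier promotion: moves up one knowledge tier, anchor untouched.
fn promote_tier(p: Placement) -> Option<Placement> {
    let next = match p.tier {
        Tier::Qi => Tier::Shu,
        Tier::Shu => Tier::DaoRen,
        Tier::DaoRen => Tier::DaoTian,
        Tier::DaoTian => return None, // already the top tier
    };
    Some(Placement { tier: next, ..p })
}

/// Anchor publication: moves one step along worktree -> repo -> global, tier untouched.
fn publish_outward(p: Placement) -> Option<Placement> {
    let next = match p.anchor {
        AnchorKind::Worktree => AnchorKind::Repo,
        AnchorKind::Repo => AnchorKind::Global,
        AnchorKind::Global => return None, // already the widest scope
    };
    Some(Placement { anchor: next, ..p })
}
```

Starting from `qi @ worktree`, `promote_tier` yields `shu @ worktree` and a following `publish_outward` yields `shu @ repo`, matching the two examples above.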

## What Counts as Real Learning

The system should not consider "more stored text" to be the same as learning.

True learning happens only when:

1. observations accumulate in evidence memory
2. patterns are distilled into `qi` or `shu`
3. repeated and bounded patterns are promoted into `dao_ren`
4. only extremely stable, cross-domain knowledge is promoted into `dao_tian`

Therefore:

- `qi` can be accumulated quickly
- `shu` should be distilled with care
- `dao_ren` should be promoted rarely
- `dao_tian` should be promoted extremely rarely

## Layered Architecture

The system should be separated into four logical layers:

1. external tools
2. evidence memory
3. knowledge memory
4. runtime execution

### External Tools

Examples:

- `research-rs`
- CLI tools
- MCP tools
- test runners
- build tools

These belong to `qi`. They are capabilities, not high-level knowledge.

### Evidence Memory

This layer stores raw, source-backed observations.

Examples:

- research results
- runtime observations
- explicit human teachings
- concrete failures
- counterexamples
- contradictions

Evidence memory is allowed to contain inconsistent or conflicting facts.

That is expected. It reflects the world as observed.

### Knowledge Memory

This layer stores distilled, governed knowledge:

- `qi`
- `shu`
- `dao_ren`
- `dao_tian`

Knowledge memory should never be a raw dump of evidence. It is a controlled compilation layer built on top of evidence.

### Runtime Execution

This is where agents:

- wake up relevant knowledge
- choose the right skill
- bind to available tools
- act under constraints

## Evidence Memory vs Knowledge Memory

This split is mandatory.

If raw evidence and abstract knowledge are stored as the same thing forever, the system will quickly lose the distinction between:

- fact and conclusion
- observation and law
- candidate and canon

### Evidence Memory Principles

- raw-first
- source-backed
- append-friendly
- contradiction-tolerant
- high volume

### Knowledge Memory Principles

- distilled
- bounded
- stateful
- auditable
- lower volume

In short:

- evidence memory stores "what we saw"
- knowledge memory stores "what we therefore believe"

## Relationship Between Memory, Skills, and Research

### Memory

Memory is the governed persistence and wake-up system.

It should contain both:

- evidence memory
- knowledge memory

`dao` belongs here, not in external research tools.

### Skills

Skills primarily encode `shu`.

But a good skill should also expose the `dao` that justifies the workflow and the `qi` needed to execute it.

So a mature skill should be read as:

1. governing principle
2. workflow
3. tool binding

### Research Tools

External research tools do not define `dao`.

Their role is:

- fetch evidence
- structure evidence
- help verify or falsify existing knowledge

So `research-rs` is `qi`, and its output primarily feeds evidence memory.

## research-rs Boundary

`research-rs` is an external tool. It should not be given responsibility for maintaining `dao`.

Its appropriate role is close to the `LLM Wiki` pattern:

- raw sources
- wiki
- schema
- index
- log
- lint

But the outputs of `research-rs` should be treated as:

- evidence
- structured summaries
- candidate insights
- contradiction signals

They are not automatically `dao`.

Therefore:

- `research-rs` organizes the external world
- memory governs what is promoted from those results

P49 defines the research-rs ingestion path:

- raw/source research output enters as `memory_kind=evidence` with `provenance=research`
- structured summaries from research remain evidence unless explicitly distilled
- candidate knowledge is created only through distill from existing evidence refs
- contradiction signals become evidence or counterexamples for later demotion or
  gate evaluation

Research must not directly create `dao_tian`, and it must not directly create
canonical or promoted knowledge. It must not bypass lifecycle gates. The highest
trust level research can supply by itself is source-backed evidence; memory owns
distillation, promotion, demotion, and canonicalization.
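
One way to picture this boundary: whatever shape the research output takes, ingestion produces an evidence drawer and nothing else. The sketch below is illustrative only; `ResearchOutput`, `EvidenceDrawer`, `ingest`, and the tag strings are assumed names, not the P49 implementation.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum ResearchOutput {
    RawSource(String),
    StructuredSummary(String),
    ContradictionSignal(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct EvidenceDrawer {
    memory_kind: &'static str,
    provenance: &'static str,
    content: String,
    tags: Vec<&'static str>,
}

/// Every research output becomes evidence; no variant yields knowledge.
fn ingest(output: ResearchOutput) -> EvidenceDrawer {
    // Tag values are placeholders chosen for this sketch.
    let (content, tags) = match output {
        ResearchOutput::RawSource(text) => (text, vec!["source"]),
        ResearchOutput::StructuredSummary(text) => (text, vec!["summary"]),
        ResearchOutput::ContradictionSignal(text) => (text, vec!["counterexample"]),
    };
    EvidenceDrawer {
        memory_kind: "evidence",
        provenance: "research",
        content,
        tags,
    }
}
```

Because the return type has no `tier` or `status`, the ingestion path has no way to express promoted or canonical knowledge; that can only happen later through distill and the lifecycle gates.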

## Runtime Wake-Up Order

The runtime order should be explicit, not left to ad hoc semantic retrieval.

Recommended order:

1. `dao_tian`
2. `dao_ren`
3. `shu`
4. `qi`
5. `evidence`

Rationale:

- `dao_tian` calibrates the agent's worldview
- `dao_ren` calibrates the current field
- `shu` proposes methods
- `qi` binds execution to available tools
- `evidence` is used for grounding, exception handling, and proof

This order should not imply that the system always injects all layers.

Rather:

- use `dao_tian` sparingly and only when truly needed
- use `dao_ren` based on the active field
- use `shu` as the main skill trigger and execution layer
- use `qi` only when binding to concrete capabilities
- use evidence when verification or exception-handling is necessary
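
A minimal sketch of tier-ordered assembly with a sparse `dao_tian` budget, following the `dao_tian` limit behavior described under the implemented Phase-1 runtime surface in this document. The `Knowledge` struct and `assemble` function are illustrative; opt-in evidence, field matching, and anchor ranking are omitted.

```rust
// Variants are declared in wake-up order, so the derived `Ord` sorts
// dao_tian first and qi last.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier { DaoTian, DaoRen, Shu, Qi }

#[derive(Debug)]
struct Knowledge {
    tier: Tier,
    statement: &'static str,
}

/// Orders items by tier and keeps at most `dao_tian_limit` dao_tian items.
/// A limit of 0 disables dao_tian entirely.
fn assemble(items: &[Knowledge], dao_tian_limit: usize) -> Vec<&Knowledge> {
    let mut ordered: Vec<&Knowledge> = items.iter().collect();
    // Stable sort: same-tier items keep their input order.
    ordered.sort_by_key(|k| k.tier);

    let mut dao_tian_seen = 0usize;
    ordered.retain(|k| {
        if k.tier != Tier::DaoTian {
            return true;
        }
        dao_tian_seen += 1;
        dao_tian_seen <= dao_tian_limit
    });
    ordered
}
```

Calling `assemble(&items, 1)` mirrors the default of injecting at most one `dao_tian` item, while `assemble(&items, 0)` drops that tier altogether.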

## Promotion Hierarchy

Knowledge should evolve through controlled promotion, not direct assertion.

Recommended conceptual path:

- `observation -> qi/shu`
- `shu -> dao_ren`
- `dao_ren -> dao_tian`

With the following meaning:

- `qi`: tool-bound knowledge
- `shu`: repeatable method
- `dao_ren`: domain law
- `dao_tian`: universal law

Each higher tier should have:

- fewer entries
- stronger evidence
- broader validity
- clearer boundaries
- stronger review

## Promotion Gate Philosophy

The system should never let "the agent found something interesting" equal "the system learned a law".

Instead:

- research and runtime can produce evidence quickly
- candidate knowledge can be distilled frequently
- promotion must be gated
- high-level law must be rare

This design strongly favors:

- fast evidence growth
- slow law growth

## Knowledge Lifecycle

The knowledge layer should support at least these states:

- `candidate`
- `promoted`
- `canonical`
- `demoted`
- `retired`

Meaning:

- `candidate`: not yet trusted for default runtime wake-up
- `promoted`: trusted enough for ordinary use
- `canonical`: highly stable and preferred
- `demoted`: weakened by stronger evidence or invalidation
- `retired`: no longer active, retained only for audit and history

Important rule:

High-level knowledge must be reversible. Promotion without demotion leads to knowledge pollution.
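
The states can be sketched as an enum with a default wake-up predicate. Two points here are assumptions rather than stated rules: that `demoted` items are excluded from default wake-up, and that a demotion may target only `demoted` or `retired`. The function names are illustrative.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status { Candidate, Promoted, Canonical, Demoted, Retired }

impl Status {
    /// Whether an item in this state is woken up by default at runtime.
    /// Excluding `Demoted` is an assumption of this sketch.
    fn wakes_by_default(self) -> bool {
        matches!(self, Status::Promoted | Status::Canonical)
    }
}

/// Reverses trust in an item. Assumed rule: the target must be `Demoted`
/// or `Retired`, and a retired item cannot be demoted again.
fn demote(current: Status, target: Status) -> Result<Status, String> {
    if current == Status::Retired {
        return Err("retired knowledge is kept only for audit".to_string());
    }
    match target {
        Status::Demoted | Status::Retired => Ok(target),
        other => Err(format!("{:?} is not a demotion target", other)),
    }
}
```

The important property is that every trusted state has a path back down, so a mistaken promotion does not become permanent.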

## Four Core Operations

The smallest viable lifecycle should be modeled through four operations:

1. `record`
2. `distill`
3. `promote`
4. `demote`

### record

Store raw evidence.

Examples:

- research result
- runtime failure
- human teaching
- observed contradiction

### distill

Create a candidate knowledge item from evidence.

Examples:

- tool usage note
- workflow heuristic
- domain pattern candidate

### promote

Move candidate knowledge into active runtime use once its gate is satisfied.

### demote

Reduce or retire knowledge when it is contradicted, superseded, or becomes outdated.

## Minimal Data Shape

### Evidence Memory

Evidence entries should be raw-first and source-backed.

Suggested fields:

- `id`
- `content`
- `domain`
- `field`
- `provenance`
- `source_ref`
- `timestamp`
- `tags`

### Knowledge Memory

Knowledge entries should be explicit and auditable.

Suggested fields:

- `id`
- `statement`
- `tier`
- `domain`
- `field`
- `status`
- `stability`
- `evidence_refs`
- `scope_constraints`
- `counterexamples`
- `promotion_history`

## Stage-1 Bootstrap Drawer Schema

Phase 1 should reuse the existing drawer system, but not by pretending all
drawers mean the same thing.

The bootstrap model should explicitly separate:

- `evidence drawer`
- `knowledge drawer`

### Shared Stage-1 Fields

Both drawer kinds should share the current base drawer fields and add:

- `memory_kind`: `evidence | knowledge`
- `domain`: `project | agent | skill | global`
- `field`
- `anchor_kind`: `global | repo | worktree`
- `anchor_id`
- optional `parent_anchor_id`

These fields should be explicit, not hidden inside JSON blobs, because they are
part of query-time filtering and runtime wake-up assembly.

### Evidence Drawer

The minimum stage-1 evidence drawer should add:

- `memory_kind='evidence'`
- `domain`
- `field`
- `provenance`: `runtime | research | human`
- `anchor_kind`
- `anchor_id`

Evidence drawers should *not* carry knowledge-governance fields such as:

- `tier`
- `status`
- `statement`
- `trigger_hints`
- role-separated knowledge refs

Evidence drawers record what was seen, taught, verified, or contradicted. They
can use tags to indicate whether they are supporting evidence, a boundary case,
or a counterexample, but they are not themselves promoted knowledge.

### Knowledge Drawer

The minimum stage-1 knowledge drawer should add:

- `memory_kind='knowledge'`
- `domain`
- `field`
- `statement`
- `tier`: `qi | shu | dao_ren | dao_tian`
- `status`: `candidate | promoted | canonical | demoted | retired`
- `supporting_refs`
- `counterexample_refs`
- `teaching_refs`
- `verification_refs`
- `scope_constraints`
- `trigger_hints`
- `anchor_kind`
- `anchor_id`
- optional `parent_anchor_id`

For knowledge drawers:

- `content` is the longer explanatory body
- `statement` is the short wake-up form

### Why Evidence Refs Must Be Role-Separated

Stage 1 should not collapse all evidence into one undifferentiated
`evidence_refs` list.

The minimum useful split is:

- `supporting_refs`
- `counterexample_refs`
- `teaching_refs`
- `verification_refs`

This matters because the runtime and future evaluator must be able to
distinguish:

- what supports a knowledge claim
- what limits it
- what was explicitly taught by a human
- what was actively re-verified rather than merely observed
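
One practical payoff of the split is that a gate can count each role separately. The sketch below is a read-only check that uses the `dao_tian -> canonical` thresholds stated under the Stage-1 thresholds later in this document (3 supporting, 2 verification, 1 teaching, human reviewer required, counterexamples block). `RoleRefs`, `GatePolicy`, and `gate_report` are hypothetical names, and the report wording is invented for illustration.

```rust
#[derive(Debug, Default)]
struct RoleRefs {
    supporting: Vec<String>,
    counterexample: Vec<String>,
    teaching: Vec<String>,
    verification: Vec<String>,
}

#[derive(Debug)]
struct GatePolicy {
    min_supporting: usize,
    min_verification: usize,
    min_teaching: usize,
    human_reviewer_required: bool,
    counterexamples_block: bool,
}

/// Thresholds for `dao_tian -> canonical` as listed in this document.
fn dao_tian_to_canonical() -> GatePolicy {
    GatePolicy {
        min_supporting: 3,
        min_verification: 2,
        min_teaching: 1,
        human_reviewer_required: true,
        counterexamples_block: true,
    }
}

/// Returns the unmet requirements; an empty list means the gate is satisfied.
/// Nothing is mutated: this is advisory only.
fn gate_report(refs: &RoleRefs, policy: &GatePolicy, has_human_reviewer: bool) -> Vec<String> {
    let mut unmet = Vec::new();
    let mut need = |label: &str, have: usize, min: usize| {
        if have < min {
            unmet.push(format!("{}: have {}, need {}", label, have, min));
        }
    };
    need("supporting refs", refs.supporting.len(), policy.min_supporting);
    need("verification refs", refs.verification.len(), policy.min_verification);
    need("teaching refs", refs.teaching.len(), policy.min_teaching);
    if policy.human_reviewer_required && !has_human_reviewer {
        unmet.push("human reviewer required".to_string());
    }
    if policy.counterexamples_block && !refs.counterexample.is_empty() {
        unmet.push("counterexamples present".to_string());
    }
    unmet
}
```

With a single undifferentiated `evidence_refs` list, none of these checks could be expressed, since a counterexample and a supporting observation would look the same.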

### Minimal Trigger Hints

Stage 1 should allow a very small `trigger_hints` object for knowledge drawers,
but it must remain a bias layer, not a second skill registry.

The allowed structure should be limited to:

- `intent_tags`
- `workflow_bias`
- `tool_needs`

It should not directly name hard skill ids or become the authoritative trigger
mechanism.

### Statement vs Content

`statement` and `content` should have different jobs.

Recommended rule:

- `statement` is the short, directly wakeable knowledge proposition
- `content` is the explanatory body with rationale, boundaries, and clarifying detail

Therefore:

- `statement` should not contain extended justification, examples, or long scope notes
- `content` should not merely restate `statement`

This supports a clean runtime pattern:

1. wake by `statement`
2. drill into `content` only when explanation, review, or adjudication is needed

### Natural Status Distribution by Tier

The stage-1 model should expect different status distributions for each tier:

- `dao_tian`: usually `canonical` or `demoted`
- `dao_ren`: usually `candidate` or `promoted`
- `shu`: usually `promoted`
- `qi`: usually `candidate` or `promoted`

This is not merely stylistic. It reflects the intended rarity and stability of
each layer.

## Minimal Interface Surface

If exposed through memory APIs, the minimal operations should roughly map to:

- `record(content, domain, field, provenance, source_ref, tags?)`
- `distill(statement, tier_candidate, domain, field, evidence_refs, scope_constraints, rationale, counterexamples?)`
- `promote(knowledge_id, target_status, promotion_reason, validation_refs, reviewer)`
- `demote(knowledge_id, reason_type, reason, evidence_refs, target_status)`

Design principle:

The caller should not self-score high-level confidence directly. Confidence and stability should be system-derived or gate-derived whenever possible.
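
A heavily simplified in-memory sketch of three of these operations, mainly to show that callers pass evidence refs rather than confidence scores. The signatures drop most parameters (`domain`, `field`, `provenance`, and so on), `demote` is omitted, and the `Memory` type, numeric ids, and error strings are assumptions of this sketch rather than the mempal API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status { Candidate, Promoted }

#[derive(Debug)]
struct Knowledge {
    statement: String,
    status: Status,
    evidence_refs: Vec<u64>,
    validation_refs: Vec<u64>,
}

#[derive(Debug, Default)]
struct Memory {
    evidence: HashMap<u64, String>,
    knowledge: HashMap<u64, Knowledge>,
    next_id: u64,
}

impl Memory {
    fn fresh_id(&mut self) -> u64 {
        self.next_id += 1;
        self.next_id
    }

    /// record: store raw evidence and return its id.
    fn record(&mut self, content: &str) -> u64 {
        let id = self.fresh_id();
        self.evidence.insert(id, content.to_string());
        id
    }

    /// True when the list is non-empty and every id names an evidence entry.
    fn all_evidence(&self, refs: &[u64]) -> bool {
        !refs.is_empty() && refs.iter().all(|r| self.evidence.contains_key(r))
    }

    /// distill: create candidate knowledge backed by existing evidence.
    fn distill(&mut self, statement: &str, evidence_refs: &[u64]) -> Result<u64, String> {
        if !self.all_evidence(evidence_refs) {
            return Err("distill requires existing evidence refs".to_string());
        }
        let id = self.fresh_id();
        self.knowledge.insert(id, Knowledge {
            statement: statement.to_string(),
            status: Status::Candidate, // never auto-promoted
            evidence_refs: evidence_refs.to_vec(),
            validation_refs: Vec::new(),
        });
        Ok(id)
    }

    /// promote: requires validation refs that point at evidence, not at knowledge.
    fn promote(&mut self, knowledge_id: u64, validation_refs: &[u64]) -> Result<(), String> {
        if !self.all_evidence(validation_refs) {
            return Err("promotion requires existing evidence refs".to_string());
        }
        let item = self.knowledge.get_mut(&knowledge_id).ok_or("unknown knowledge id")?;
        item.validation_refs.extend_from_slice(validation_refs);
        item.status = Status::Promoted;
        Ok(())
    }
}
```

Note that no operation accepts a confidence argument: status changes only as a consequence of the refs supplied, which is the design principle stated above.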

## Two-Phase Implementation Strategy

The recommended implementation path is two-stage.

### Phase 1: Drawer Bootstrap

Use two drawer types:

- `evidence drawer`
- `knowledge drawer`

Purpose:

- validate the value of the model quickly
- avoid a large rewrite
- reuse the current storage, ingest, and search paths

Limits:

- knowledge drawers will eventually become overloaded with lifecycle and governance metadata
- this is a bootstrap architecture, not the final form

Implemented Phase-1 runtime surface:

- `mempal context <query>` assembles a runtime context pack from typed drawers
- `mempal_context` exposes the same pack to MCP-connected agents
- knowledge sections are ordered as `dao_tian -> dao_ren -> shu -> qi`
- Stage-1 field taxonomy is guidance-only and read-only: `mempal field-taxonomy`
  and `mempal_field_taxonomy` expose recommended fields such as `general`,
  `epistemics`, `software-engineering`, `debugging`, `tooling`, `research`,
  `writing`, and `diary`, while custom field strings remain valid
- `dao_tian` is sparse by default in runtime context: `mempal context` and
  `mempal_context` inject at most 1 `dao_tian` item unless the caller explicitly
  sets `--dao-tian-limit` / `dao_tian_limit`; `0` disables `dao_tian`
- `wake-up` remains an L0/L1 memory refresh surface and does not assemble typed
  `dao_tian -> dao_ren -> shu -> qi` sections; typed operating guidance belongs
  to `mempal context` / `mempal_context`
- evidence remains opt-in via `--include-evidence`
- same-tier items prefer `worktree`, then current `repo`, then `repo://legacy`, then `global`
- `global` anchor candidates use `domain=global`, preserving the invariant that global anchors do not hold project-local domain memory
- `trigger_hints` are exposed as metadata only; they do not directly execute skills
- MCP protocol guidance consumes context in order: read `dao_tian` and `dao_ren` for judgment, use `shu` to bias workflow / skill choice, and use `qi` to bias concrete tool choice
- memory hints never override system, user, repo, or client-native skill rules
- bootstrap distill CLI creates candidate `dao_ren` / `qi` knowledge drawers from existing evidence refs without auto-promotion or LLM summarization
- `mempal_knowledge_distill` exposes the same deterministic distill operation to MCP-connected agents, letting runtime agents create candidate knowledge from evidence refs without shelling out
- bootstrap lifecycle CLI supports manual `promote` / `demote` on existing knowledge drawers by updating status plus verification / counterexample refs and writing audit entries
- lifecycle verification / counterexample refs are hardened to require existing evidence drawers, preserving the rule that promotion and demotion are justified by evidence rather than arbitrary ids or other knowledge claims
- promotion gate CLI provides a read-only advisory report before promotion, using deterministic evidence-count policy without mutating status, refs, vectors, schema, or audit history
- `mempal_knowledge_gate` exposes the same read-only promotion gate to MCP-connected agents, so runtime agents can check readiness without shelling out or mutating lifecycle state
- Stage-1 promotion policy is inspectable without a concrete drawer through `mempal knowledge policy` and `mempal_knowledge_policy`
- current Stage-1 thresholds are:
  - `dao_tian -> canonical`: 3 supporting refs, 2 verification refs, 1 teaching ref, human reviewer required, counterexamples block
  - `dao_ren -> promoted`: 2 supporting refs, 1 verification ref, counterexamples block
  - `shu -> promoted`: 1 supporting ref, 1 verification ref, counterexamples block
  - `qi -> promoted`: 1 supporting ref, 1 verification ref, counterexamples block
- `dao_tian -> canonical` always requires a human reviewer in Stage 1; evaluator-only canonization is intentionally out of scope
- `mempal_knowledge_promote` and `mempal_knowledge_demote` expose lifecycle mutation to MCP-connected agents; promotion is gate-enforced after appending supplied verification refs, while demotion requires counterexample evidence
- `mempal knowledge publish-anchor` implements explicit outward anchor publication for active knowledge (`worktree -> repo -> global`) as a metadata-only operation separate from tier/status promotion
- `mempal_knowledge_publish_anchor` exposes the same outward anchor publication operation to MCP-connected agents without changing content, vectors, tier, or status
- lifecycle updates are metadata-only in Stage 1; they do not rewrite content, re-embed vectors, or create Phase-2 knowledge cards
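
The Stage-1 thresholds listed above can be read as a small deterministic, read-only check. The following is a minimal sketch under stated assumptions, not mempal's actual gate code: `Tier`, `GateInput`, `thresholds`, and `gate_blockers` are hypothetical names, and only the numeric minimums, the human-reviewer rule, and the counterexample block come from the list above.

```rust
/// Knowledge tiers named in the Stage-1 promotion policy.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tier {
    DaoTian,
    DaoRen,
    Shu,
    Qi,
}

/// Hypothetical summary of the refs attached to one knowledge drawer.
#[derive(Clone, Copy, Debug, Default)]
pub struct GateInput {
    pub supporting: usize,
    pub verification: usize,
    pub teaching: usize,
    pub counterexamples: usize,
    pub human_reviewer: bool,
}

/// Minimums per tier: (supporting, verification, teaching, human reviewer required).
fn thresholds(tier: Tier) -> (usize, usize, usize, bool) {
    match tier {
        Tier::DaoTian => (3, 2, 1, true),
        Tier::DaoRen => (2, 1, 0, false),
        Tier::Shu | Tier::Qi => (1, 1, 0, false),
    }
}

/// Read-only check: returns every reason promotion is blocked (empty means ready).
pub fn gate_blockers(tier: Tier, input: &GateInput) -> Vec<String> {
    let (supporting, verification, teaching, needs_human) = thresholds(tier);
    let mut blockers = Vec::new();
    if input.supporting < supporting {
        blockers.push(format!("supporting refs: have {}, need {}", input.supporting, supporting));
    }
    if input.verification < verification {
        blockers.push(format!("verification refs: have {}, need {}", input.verification, verification));
    }
    if input.teaching < teaching {
        blockers.push(format!("teaching refs: have {}, need {}", input.teaching, teaching));
    }
    if needs_human && !input.human_reviewer {
        blockers.push("human reviewer required".to_string());
    }
    if input.counterexamples > 0 {
        blockers.push("counterexamples block promotion".to_string());
    }
    blockers
}
```

Returning the full list of blockers, rather than a single boolean, matches the advisory framing of the gate: a caller can see everything that is missing without any state being mutated.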

### Phase 2: Knowledge Card Extraction

Once the model proves useful, separate knowledge memory from evidence memory structurally.

Recommended objects:

- `drawers` for evidence
- `knowledge_cards`
- `knowledge_evidence_links`
- `knowledge_events`

Minimum schema v8 draft:

`knowledge_cards`:

- `id TEXT PRIMARY KEY`
- `statement TEXT NOT NULL`
- `content TEXT NOT NULL`
- `tier TEXT NOT NULL CHECK (tier IN ('qi','shu','dao_ren','dao_tian'))`
- `status TEXT NOT NULL CHECK (status IN ('candidate','promoted','canonical','demoted','retired'))`
- `domain TEXT NOT NULL CHECK (domain IN ('project','agent','skill','global'))`
- `field TEXT NOT NULL DEFAULT 'general'`
- `anchor_kind TEXT NOT NULL CHECK (anchor_kind IN ('global','repo','worktree'))`
- `anchor_id TEXT NOT NULL`
- `parent_anchor_id TEXT`
- `scope_constraints TEXT`
- `trigger_hints TEXT`
- `created_at TEXT NOT NULL`
- `updated_at TEXT NOT NULL`

`knowledge_evidence_links`:

- `id TEXT PRIMARY KEY`
- `card_id TEXT NOT NULL`
- `evidence_drawer_id TEXT NOT NULL`
- `role TEXT NOT NULL CHECK (role IN ('supporting','verification','counterexample','teaching'))`
- `note TEXT`
- `created_at TEXT NOT NULL`
- unique key: `(card_id, evidence_drawer_id, role)`

`knowledge_events`:

- `id TEXT PRIMARY KEY`
- `card_id TEXT NOT NULL`
- `event_type TEXT NOT NULL CHECK (event_type IN ('created','promoted','demoted','retired','linked','unlinked','updated','published_anchor'))`
- `from_status TEXT`
- `to_status TEXT`
- `reason TEXT NOT NULL`
- `actor TEXT`
- `metadata TEXT`
- `created_at TEXT NOT NULL`

Minimum indexes:

- `knowledge_cards(tier, status)`
- `knowledge_cards(domain, field)`
- `knowledge_cards(anchor_kind, anchor_id)`
- `knowledge_evidence_links(card_id)`
- `knowledge_evidence_links(evidence_drawer_id)`
- `knowledge_events(card_id, created_at)`

This yields a cleaner separation:

- evidence says what happened
- knowledge says what is believed
- events say how that belief evolved
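
To make the draft constraints concrete, here is a minimal in-memory sketch of how the `status` and `role` value sets and the `(card_id, evidence_drawer_id, role)` unique key might be mirrored in Rust. The type and method names (`CardStatus`, `EvidenceRole`, `LinkTable`, `link`) are assumptions for illustration and are not taken from mempal's core API; only the allowed values and the unique key come from the schema draft.

```rust
use std::collections::HashSet;
use std::str::FromStr;

/// Mirrors the allowed `status` values on `knowledge_cards`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum CardStatus {
    Candidate,
    Promoted,
    Canonical,
    Demoted,
    Retired,
}

impl FromStr for CardStatus {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "candidate" => Ok(Self::Candidate),
            "promoted" => Ok(Self::Promoted),
            "canonical" => Ok(Self::Canonical),
            "demoted" => Ok(Self::Demoted),
            "retired" => Ok(Self::Retired),
            other => Err(format!("invalid card status: {other}")),
        }
    }
}

/// Mirrors the allowed `role` values on `knowledge_evidence_links`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum EvidenceRole {
    Supporting,
    Verification,
    Counterexample,
    Teaching,
}

impl FromStr for EvidenceRole {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "supporting" => Ok(Self::Supporting),
            "verification" => Ok(Self::Verification),
            "counterexample" => Ok(Self::Counterexample),
            "teaching" => Ok(Self::Teaching),
            other => Err(format!("invalid evidence role: {other}")),
        }
    }
}

/// In-memory stand-in for `knowledge_evidence_links`, keyed by the unique key
/// `(card_id, evidence_drawer_id, role)`.
#[derive(Default)]
pub struct LinkTable {
    rows: HashSet<(String, String, EvidenceRole)>,
}

impl LinkTable {
    /// Returns `false` when the same link already exists, like a unique-key conflict.
    pub fn link(&mut self, card_id: &str, evidence_drawer_id: &str, role: EvidenceRole) -> bool {
        self.rows
            .insert((card_id.to_string(), evidence_drawer_id.to_string(), role))
    }
}
```

Because `role` is part of the key, the same evidence drawer can back a card in more than one role, for example as both `supporting` and `verification`.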

Storage decision:

- Phase-2 `knowledge_cards` should live in the same SQLite `palace.db`
- they should be separate tables from `drawers`, not overloaded drawer rows
- `drawers` remain the raw evidence and citation root
- `knowledge_evidence_links` should reference evidence drawers by `drawer_id`
- `knowledge_events` should be transactional with knowledge-card lifecycle
  changes and evidence-link mutations
- a separate persistence layer is out of scope unless future measured needs
  prove the single-file SQLite boundary insufficient

Rationale:

- mempal's product invariant is a local single-binary, single-file memory palace
- knowledge promotion/demotion must stay transactionally tied to evidence refs
- citations remain simpler and safer when evidence drawer ids are the durable root
- using a second database or service would add operational complexity before the
  Phase-2 model has proven it needs independent scaling

Implemented Phase-2 surface at P42 baseline:

- `knowledge_cards`, `knowledge_evidence_links`, and `knowledge_events` are
  schema v8 tables in `palace.db`
- Rust core APIs can create/read/update/list cards, link evidence, and append
  events
- `mempal knowledge-card` exposes create/get/list/link/event/events
- `mempal_knowledge_cards` exposes list/get/events to MCP-connected agents
- `mempal knowledge-card backfill-plan` reports Stage-1 knowledge drawers that
  are ready to become cards without writing
- `mempal knowledge-card backfill-apply` defaults to dry-run and only writes
  cards, links, and created events with `--execute`
- `mempal knowledge-card gate` evaluates card readiness from role-separated
  evidence links
- `mempal knowledge-card promote` and `mempal knowledge-card demote` mutate
  card status transactionally with role-specific evidence links and append-only
  events
- `mempal_knowledge_cards` also exposes `gate`, `promote`, and `demote` actions
  over the same core lifecycle logic

Phase-2 cards are governed objects, but they are not yet the default
context/search source. At P42, `mempal context`, `mempal_context`, and
`mempal_search` remain drawer/citation based. Cards now have an explicit
linked-evidence retrieval path, but they are still not returned by default
search.

### Phase-2 Card Retrieval Contract

P43 defines the contract for future card-aware runtime consumption without
implementing retrieval behavior yet.

A card retrieval item is a governed knowledge result, not a raw drawer result.
The minimum returned card fields are: `card_id`, `statement`, `content`, `tier`,
`status`, `domain`, `field`, `anchor_kind`, and `anchor_id`.

Each card retrieval item must expose role-separated evidence citations derived
from `knowledge_evidence_links`. The minimum evidence citation fields are:
`evidence_drawer_id`, `role`, and `source_file`.

Default runtime eligibility is status-gated:

- `promoted` and `canonical` cards are runtime-eligible by default
- `candidate`, `demoted`, and `retired` cards are excluded by default

This preserves the governance boundary:

- card records carry distilled belief
- linked evidence drawers remain the citation root
- inactive card states remain inspectable but are not injected into ordinary
  runtime context
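
The status gate above amounts to a simple filter. As a minimal sketch (the `Card` struct and function names are hypothetical; only the five status strings and the default eligibility rule come from this section):

```rust
/// Hypothetical minimal card record; the real contract returns more fields.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Card {
    pub card_id: String,
    pub status: String,
}

/// Default runtime eligibility: only `promoted` and `canonical` cards pass.
pub fn is_runtime_eligible(status: &str) -> bool {
    matches!(status, "promoted" | "canonical")
}

/// Keeps inactive cards out of the runtime view without deleting them.
pub fn runtime_cards(cards: &[Card]) -> Vec<&Card> {
    cards
        .iter()
        .filter(|card| is_runtime_eligible(&card.status))
        .collect()
}
```

Excluded cards stay in the input slice, which reflects the point that inactive states remain inspectable even though they are not injected.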

P43 does not change `mempal context` or `mempal_context` behavior.
P43 does not change `mempal_search` behavior.
Card embeddings, ranking strategy, and card-aware context/search surfaces are
deferred to later specs.

P44 adds the first explicit card-aware context surface:

- `mempal context --include-cards`
- `mempal_context` with `include_cards=true`

This remains opt-in. Default context assembly is still drawer-only. When enabled,
active Phase-2 cards are appended inside the existing
`dao_tian -> dao_ren -> shu -> qi` sections and expose `card_id` plus
role-separated `evidence_citations`. Each citation keeps the evidence drawer as
the citation root through `evidence_drawer_id`, `role`, and `source_file`.

P44 does not change `mempal_search`, does not add card embeddings, and does not
make cards the default runtime source.

P45 chooses the first card retrieval strategy:

- `mempal knowledge-card retrieve <query>`
- `mempal_knowledge_cards` with `action="retrieve"`

The strategy is linked-evidence-first. It searches evidence drawers through the
existing BM25+vector drawer search path, follows `knowledge_evidence_links`, and
returns active cards linked to matched evidence. Returned card items include the
card record, a score derived from matched evidence, and role-separated evidence
citations with `evidence_drawer_id`, `role`, `source_file`, and score.

P45 intentionally does not add card embeddings, does not add card vector
storage, and does not make `mempal_search` return cards.
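
The linked-evidence-first flow can be sketched as follows. This is an illustration under assumptions, not the P45 implementation: the struct shapes are hypothetical, the drawer search is represented by a precomputed list of hits, and taking the maximum matched-evidence score as the card score is an assumed choice (the text only says the score is derived from matched evidence).

```rust
use std::collections::BTreeMap;

/// One evidence drawer matched by the existing drawer search (hypothetical shape).
pub struct EvidenceHit {
    pub drawer_id: String,
    pub source_file: String,
    pub score: f64,
}

/// One row of `knowledge_evidence_links`, reduced to the fields used here.
pub struct Link {
    pub card_id: String,
    pub evidence_drawer_id: String,
    pub role: String,
}

pub struct Card {
    pub card_id: String,
    pub statement: String,
    pub status: String,
}

#[derive(Debug)]
pub struct Citation {
    pub evidence_drawer_id: String,
    pub role: String,
    pub source_file: String,
    pub score: f64,
}

#[derive(Debug)]
pub struct CardResult {
    pub card_id: String,
    pub statement: String,
    pub score: f64,
    pub evidence_citations: Vec<Citation>,
}

pub fn retrieve_cards(hits: &[EvidenceHit], links: &[Link], cards: &[Card]) -> Vec<CardResult> {
    // Step 1: follow links from matched evidence drawers to card ids.
    let mut by_card: BTreeMap<&str, Vec<Citation>> = BTreeMap::new();
    for link in links {
        if let Some(hit) = hits.iter().find(|h| h.drawer_id == link.evidence_drawer_id) {
            by_card.entry(link.card_id.as_str()).or_default().push(Citation {
                evidence_drawer_id: hit.drawer_id.clone(),
                role: link.role.clone(),
                source_file: hit.source_file.clone(),
                score: hit.score,
            });
        }
    }
    // Step 2: keep only active cards that have at least one matched citation.
    let mut results = Vec::new();
    for card in cards {
        if !matches!(card.status.as_str(), "promoted" | "canonical") {
            continue;
        }
        if let Some(citations) = by_card.remove(card.card_id.as_str()) {
            // Assumption: card score = best matched evidence score.
            let score = citations.iter().map(|c| c.score).fold(f64::NEG_INFINITY, f64::max);
            results.push(CardResult {
                card_id: card.card_id.clone(),
                statement: card.statement.clone(),
                score,
                evidence_citations: citations,
            });
        }
    }
    // Step 3: highest-scoring cards first.
    results.sort_by(|a, b| b.score.total_cmp(&a.score));
    results
}
```

Note that a card with no matched evidence never appears in the output, which is the property that keeps evidence drawers as the citation root.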

P46 keeps card-aware context opt-in. The default context remains drawer-only for
both `mempal context` and `mempal_context`; operators must still pass
`--include-cards` or `include_cards=true` to inject Phase-2 cards into the
typed context pack.

This is a deliberate default policy, not an unfinished implementation gap.
Cards are now retrievable and context-injectable, but default runtime context is
a high-trust path. It should not silently switch from drawer-backed active
knowledge to mixed drawer/card guidance until real runtime evidence shows the
change improves agent behavior without weakening citations.

Evidence required before default enablement:

- repeated runtime traces where card-aware context causes better skill/tool
  selection than drawer-only context
- no observed citation loss: every default card item must preserve linked
  evidence citations as the citation root
- no material context bloat: card items must not crowd out higher-priority
  `dao_tian`, `dao_ren`, `shu`, or `qi` guidance
- no lifecycle confusion: inactive cards must remain excluded and demoted cards
  must not re-enter default context through linked evidence
- explicit rollback criteria: a future default-on spec must define how to return
  to drawer-only defaults if card injection degrades runtime behavior

P47 keeps card-level embeddings deferred. P45 linked-evidence retrieval remains
the only implemented card retrieval strategy: cards are found through matched
evidence drawers, not through a separate card vector index.

This keeps the citation model simple. Card statements are distilled beliefs;
evidence drawers remain the source-backed material. A card embedding index would
make card statements directly retrievable, which may improve recall, but it also
adds a new stale-vector surface and can make unsupported belief text feel like a
primary source unless every result still carries linked evidence citations.

Evidence required before card embeddings:

- statement-match misses: repeated retrieval traces where linked-evidence search
  misses useful active cards because evidence wording does not match the query
  but the card statement does
- citation preservation: card-embedding results must still return linked
  evidence citations as the citation root
- measurable recall improvement over P45 linked-evidence retrieval without
  unacceptable precision loss
- schema and maintenance plan for card vector storage, reindexing, and
  stale-vector handling
- rollback behavior that can disable card-vector retrieval and fall back to P45
  linked-evidence retrieval without data loss

P48 keeps `knowledge_events` as the authoritative Phase-2 card audit trail, with
no default JSONL dual-write for card lifecycle mutations. This keeps card
promote/demote/backfill behavior transactionally bound to the same SQLite
database that owns `knowledge_cards` and `knowledge_evidence_links`.

Stage-1 drawer lifecycle continues to use `audit.jsonl` where already defined.
Phase-2 card lifecycle does not mirror those entries into `audit.jsonl` by
default because that would create two audit surfaces with different durability
and transaction semantics. The append-only `knowledge_events` table is the
source of truth for card lifecycle history.

If an external integration needs JSONL card history, it should be added as an
explicit export surface. JSONL export must be derived from `knowledge_events`,
must be reproducible, and must not become a second source of truth.

## Decision on Bootstrap vs Final Architecture

Current recommendation:

- start with two drawer types
- explicitly mark this as bootstrap-only
- plan for extraction into separate knowledge objects later

This gives the system a low-cost learning path without pretending the temporary structure is ideal.

## What Belongs Where

### In research-rs

- ingest and normalize external sources
- maintain research wiki/index/log/lint
- emit structured evidence and candidate insights

### In memory layer

- store evidence memory
- store knowledge memory
- maintain `dao / shu / qi`
- manage promotion and demotion lifecycle

### In skills

- encode reusable workflows
- expose relevant `dao`
- bind to `qi`

### In evaluator/gate

- validate promotions
- handle demotions
- guard against self-pollution

## Non-Goals

This design does not assume:

- fully automatic promotion to `dao`
- external research directly creating universal law
- replacing raw evidence with compressed knowledge
- collapsing evidence, knowledge, and workflow into one storage object forever

## Current Recommendation

Proceed with the following assumptions unless future evidence rejects them:

- `dao` belongs to the memory layer
- `research-rs` is an external `qi` tool, not a `dao` container
- evidence memory and knowledge memory should be explicitly separated
- runtime typed context should assemble `dao` before `shu`, and `shu` before
  `qi`; wake-up remains a refresh surface, not the typed assembler
- the implementation should begin with drawer bootstrap and evolve into a
  dedicated knowledge model inside the same SQLite `palace.db`

## Future Work After P42

P42 originally left one explicit follow-up:

- add evaluator-assisted promotion only behind deterministic gates and human
  review rules for high-level knowledge

P50 closes that item as policy: it defines evaluator-assisted promotion as
advisory-only. Evaluators are not lifecycle actors.

Evaluators may:

- recommend promotion or demotion candidates
- propose supporting, verification, teaching, and counterexample evidence refs
- produce contradiction candidates and risk notes
- explain why a knowledge item appears ready or unsafe

Evaluators must not directly mutate status or otherwise act as lifecycle writers. Specifically, they must not:

- append lifecycle refs as authoritative gate input
- bypass deterministic promotion or demotion gates
- satisfy reviewer requirements by evaluator-only review
- create automatic promotion or demotion paths

The deterministic gates remain authoritative. Promotion and demotion still go
through the existing gate-enforced CLI/MCP lifecycle surfaces. `dao_tian`
canonicalization still requires a human reviewer; evaluator-only canonization is forbidden.
If a future implementation adds evaluator APIs, that work must be a separate
spec and preserve deterministic replay, evidence citation, and audit semantics.

No open Future Work remains in the P42 list.

## Completion Status After P50

P51 closure audit: the MIND-MODEL baseline is complete.

No open implementation tasks remain in the P12-P50 baseline. The completed
baseline includes:

- typed evidence and knowledge drawers
- `dao_tian`, `dao_ren`, `shu`, and `qi` governance boundaries
- worktree/repo/global anchor behavior
- wake-up/context separation
- context-guided skill/tool selection
- distill, gate, promote, demote, and anchor publication lifecycle surfaces
- Phase-2 knowledge card storage, lifecycle, retrieval, and opt-in context
- research ingestion and evaluator promotion policies

Completion does not mean every optional future enhancement is implemented. It
means the current design baseline has no known open implementation task. Future
evaluator APIs, card-level embeddings, default card context, research adapters,
or other expansions must start as new-stage specs with their own evidence,
rollback criteria, and acceptance checks.

## Phase 3 Intake Roadmap

P52 Phase-3 intake roadmap defines how work starts after baseline closure.
Phase 3 is new-stage work, not unfinished P12-P50 baseline work.

Candidate tracks:

- evaluator APIs
- card retrieval maturity
- research adapter ingestion
- runtime adoption evidence

Intake rules:

- each candidate must state evidence, rollback criteria, and acceptance checks before implementation begins
- default-enabling card context or card embeddings requires measured retrieval benefit
- evaluator APIs must preserve the P50 advisory-only lifecycle boundary
- research adapters must preserve the P49 evidence-first ingestion boundary
- card retrieval changes must preserve citation and audit semantics
- runtime adoption work must include rollback criteria for agent behavior changes

The first Phase-3 implementation should be selected only after one candidate has
enough evidence to justify implementation. Until then, Phase 3 remains an intake
queue, not an implementation commitment.

## Phase 3 Candidate Evidence Audit

P53 Phase-3 candidate evidence audit records current readiness. No Phase-3 candidate is ready for direct implementation yet.

Candidate readiness:

- Runtime adoption evidence: recommended first measurement track. It should collect concrete agent-behavior evidence before default policy changes.
- Card retrieval maturity: partial evidence from P43-P45, but it still needs measured retrieval misses and context impact before default context changes or card embeddings.
- Evaluator APIs: blocked on advisory output contracts and lifecycle replay requirements.
- Research adapter ingestion: blocked on an explicit external report/input contract.

Recommended first Phase-3 track: runtime adoption evidence. It is the common measurement substrate for deciding whether card-aware context should become default, whether card embeddings are justified, and what evaluator advice is actually useful to agents.

This keeps Phase 3 evidence-first: implement measurement before implementing
new authority, new retrieval defaults, or new external ingestion adapters.

## Phase 3 Runtime Surfaces

P54 runtime adoption evidence adds schema v9 table `runtime_adoption_events`.
Events capture explicit agent/runtime signals with `track`, `signal`, `feature`,
optional query/context/card/evaluator/research references, note, metadata, and
timestamp.

P55 runtime adoption CLI exposes this evidence substrate:

- `mempal phase3 adoption record`
- `mempal phase3 adoption list`
- `mempal phase3 adoption stats`

P56 implements `mempal phase3 gate card-context-default`. The card-context
default gate is read-only, and `include_cards` remains opt-in. The gate requires
accepted `card_context` adoption evidence and zero rollback signals before a
future default-on spec can even be considered.
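
As a rough sketch of the rule just described, the gate can be thought of as a pure function over recorded adoption events. The struct shapes, the function name, and the `min_accepted` parameter are assumptions for illustration; the source states only that accepted `card_context` evidence is required and that rollback signals must be zero, not how much accepted evidence is enough.

```rust
/// Reduced view of one `runtime_adoption_events` row (hypothetical shape).
pub struct AdoptionEvent {
    pub track: String,
    pub signal: String,
    pub feature: String,
}

#[derive(Debug, PartialEq, Eq)]
pub struct GateReport {
    pub accepted: usize,
    pub rollbacks: usize,
    pub eligible: bool,
}

/// Read-only evaluation: counts signals on the `card_context` track and
/// reports eligibility without changing any default or writing any event.
/// `min_accepted` is an assumed, caller-supplied threshold.
pub fn card_context_default_gate(events: &[AdoptionEvent], min_accepted: usize) -> GateReport {
    let mut accepted = 0;
    let mut rollbacks = 0;
    for event in events.iter().filter(|e| e.track == "card_context") {
        match event.signal.as_str() {
            "accepted" => accepted += 1,
            "rollback" => rollbacks += 1,
            _ => {}
        }
    }
    // At least one accepted signal is always required, even if min_accepted is 0.
    let eligible = accepted >= min_accepted.max(1) && rollbacks == 0;
    GateReport { accepted, rollbacks, eligible }
}
```

Even when `eligible` is true, the result only feeds a future default-on spec; nothing in this sketch flips `include_cards`.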

P57 implements `mempal phase3 gate card-embeddings`. The gate remains read-only
and adds no card vector schema. Card embeddings require repeated measured
`card_embedding` miss signals, and linked evidence remains the citation root.

P58 implements `mempal phase3 gate evaluator-api`. The evaluator API gate
preserves the P50 advisory-only lifecycle boundary: evaluator
signals cannot mutate status, satisfy reviewer requirements, or bypass
deterministic gates.

P59 implements `mempal phase3 research-validate-plan`. The command validates an
external JSON report/input contract with `report_id`, `title`, `sources`,
`findings`, and optional `candidate_insights`. It only validates and plans;
research adapter ingestion still preserves the P49 evidence-first boundary and
does not create promoted/canonical knowledge.

P60 exposes the Phase-3 runtime evidence baseline to MCP-connected agents
through `mempal_phase3`. The MCP tool uses `action` values
`record/list/stats/gate/research_validate_plan`, mirroring the P54-P59 CLI
surfaces without adding new authority. Later Phase-3 actions extend this same
bounded MCP surface. `record` appends `runtime_adoption_events`; `list`,
`stats`, `gate`, and `research_validate_plan` remain read-only. MCP research
validation accepts a JSON report object and still does not ingest or promote
knowledge.

P61 adds a read-only runtime adoption recording protocol through
`mempal_phase3 action=guidance`. The guidance tells agents when to record
`used`, `accepted`, `rejected`, `miss`, `rollback`, `contradiction`, or
`neutral`, and exposes the required `track`, `signal`, and `feature` fields
plus optional context fields. This is a recording discipline, not automatic
instrumentation: it adds no hooks, no background writes, no schema migration,
and no default runtime behavior change.

P62 exposes the same recording protocol through CLI parity with
`mempal phase3 adoption guidance`. The CLI supports plain and JSON output and
shares the guidance implementation with `mempal_phase3 action=guidance`, so MCP
agents, humans, and non-MCP automation inspect the same semantics. This remains
read-only and does not append adoption events or change Phase-3 gate policy.

P63 adds a read-only record helper through
`mempal phase3 adoption prepare-record` and
`mempal_phase3 action=prepare_record`. The helper validates and normalizes
candidate record inputs, then returns the equivalent CLI `record` command and
MCP `record` payload with `writes=false`. It does not generate event ids unless
the caller supplied one, and it does not append runtime adoption events.

P64 adds a read-only record quality policy through
`mempal phase3 adoption check-record` and
`mempal_phase3 action=check_record`. The policy evaluates candidate runtime
adoption event quality before writing and returns `writes=false`, `valid`,
`quality`, `errors`, and `warnings`. It treats empty `feature` as an error,
warns when outcome-bearing signals lack concrete note/query context, and warns
when track-specific references such as `card_id`, `evaluator_id`, or
`research_report_id` are missing. This remains advisory only: it does not append
events and does not block the lower-level `record` command.
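
The quality policy can be sketched as a pure function that returns a report and never writes. Several details below are assumptions rather than documented behavior: the struct and function names, which signals count as outcome-bearing, the rule that a warning fires only when both note and query are blank, and the mapping from track names (including `evaluator` and `research`) to the reference each one expects.

```rust
/// Candidate runtime adoption record (hypothetical shape).
#[derive(Clone, Debug, Default)]
pub struct CandidateRecord {
    pub track: String,
    pub signal: String,
    pub feature: String,
    pub note: Option<String>,
    pub query: Option<String>,
    pub card_id: Option<String>,
    pub evaluator_id: Option<String>,
    pub research_report_id: Option<String>,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Quality {
    Ready,
    Warning,
    Invalid,
}

#[derive(Debug)]
pub struct CheckReport {
    pub writes: bool,
    pub valid: bool,
    pub quality: Quality,
    pub errors: Vec<String>,
    pub warnings: Vec<String>,
}

fn is_blank(value: &Option<String>) -> bool {
    value.as_deref().map_or(true, |s| s.trim().is_empty())
}

pub fn check_record(record: &CandidateRecord) -> CheckReport {
    let mut errors = Vec::new();
    let mut warnings = Vec::new();

    // Empty `feature` is an error.
    if record.feature.trim().is_empty() {
        errors.push("feature must not be empty".to_string());
    }

    // Assumption: these signals are the "outcome-bearing" ones.
    let outcome_bearing = matches!(
        record.signal.as_str(),
        "accepted" | "rejected" | "miss" | "rollback" | "contradiction"
    );
    if outcome_bearing && is_blank(&record.note) && is_blank(&record.query) {
        warnings.push("outcome-bearing signal lacks note/query context".to_string());
    }

    // Assumption: track names and the reference each track expects.
    let missing_ref = match record.track.as_str() {
        "card_context" | "card_embedding" => is_blank(&record.card_id).then_some("card_id"),
        "evaluator" => is_blank(&record.evaluator_id).then_some("evaluator_id"),
        "research" => is_blank(&record.research_report_id).then_some("research_report_id"),
        _ => None,
    };
    if let Some(name) = missing_ref {
        warnings.push(format!("{name} is missing for track {}", record.track));
    }

    let quality = if !errors.is_empty() {
        Quality::Invalid
    } else if !warnings.is_empty() {
        Quality::Warning
    } else {
        Quality::Ready
    };
    // `writes` is always false: the check is advisory and appends nothing.
    CheckReport { writes: false, valid: errors.is_empty(), quality, errors, warnings }
}
```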

P65 adds a read-only runtime adoption review report through
`mempal phase3 adoption review` and `mempal_phase3 action=review`. The report
summarizes accumulated adoption evidence with applied filters, aggregate signal
counts, per-feature counts, an advisory conclusion, and reasons. It supports
track, feature, and signal filtering without schema changes; signal filtering is
applied after DB retrieval. This gives future default-on specs a compact
evidence artifact while preserving the Phase-3 boundary: review reports do not
write events, change gates, or authorize runtime default changes.

P66 adds a read-only card-context default readiness report through
`mempal phase3 readiness card-context-default` and
`mempal_phase3 action=readiness` with `candidate=card-context-default`. The
report reuses P65 review semantics filtered to `track=card_context` and
`feature=include_cards`, then returns `writes=false`, `ready`, `decision`, the
embedded review, and reasons. `ready=true` only means the surface is eligible
for a future default-on spec; it does not change `mempal context` defaults,
enable `include_cards`, mutate lifecycle state, or create card embeddings.

P67 adds explicit evidence-first research ingestion through
`mempal phase3 research-ingest-plan`. The command accepts the same P59 report
contract and defaults to dry-run with `writes=false`. With `--execute`, it
writes one `memory_kind=evidence` drawer per finding using
`provenance=research`, stable drawer ids, and idempotent skip-on-existing
behavior. `candidate_insights` are surfaced only as `mempal knowledge distill`
suggestions backed by the planned evidence refs; P67 does not create knowledge
drawers, promote research output, add a schema migration, or expose MCP write
access.
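
The dry-run default and the idempotent skip-on-existing behavior can be illustrated with an in-memory stand-in for the drawer store. This is a sketch only: the `Finding` shape, the `stable_drawer_id` format, and the function names are assumptions, and the real command derives its ids and writes drawers in whatever way the implementation defines.

```rust
use std::collections::BTreeSet;

/// One finding from a research report (hypothetical shape).
pub struct Finding {
    pub finding_id: String,
    pub text: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PlanAction {
    Write,
    SkipExisting,
}

#[derive(Debug)]
pub struct IngestReport {
    /// True only if this call actually added a drawer.
    pub writes: bool,
    pub plan: Vec<(String, PlanAction)>,
}

/// Assumed id scheme: any deterministic function of report and finding works.
pub fn stable_drawer_id(report_id: &str, finding_id: &str) -> String {
    format!("research/{report_id}/{finding_id}")
}

/// Plans one evidence drawer per finding; mutates `drawers` only when `execute` is true.
pub fn research_ingest(
    report_id: &str,
    findings: &[Finding],
    drawers: &mut BTreeSet<String>,
    execute: bool,
) -> IngestReport {
    let mut plan = Vec::new();
    let mut wrote = false;
    for finding in findings {
        let drawer_id = stable_drawer_id(report_id, &finding.finding_id);
        let action = if drawers.contains(&drawer_id) {
            // Idempotent: an existing drawer is never rewritten.
            PlanAction::SkipExisting
        } else {
            if execute {
                drawers.insert(drawer_id.clone());
                wrote = true;
            }
            PlanAction::Write
        };
        plan.push((drawer_id, action));
    }
    IngestReport { writes: wrote, plan }
}
```

Because the ids are stable, running the same report twice with `execute` leaves the store unchanged on the second pass.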

P68 exposes the P67 dry-run planning semantics through MCP as
`mempal_phase3 action=research_ingest_plan`. The action accepts an inline
`report` JSON object, returns planned research evidence drawer refs plus
candidate `mempal knowledge distill` suggestions, and always reports
`writes=false`. It shares the same pure planner as the CLI but deliberately does
not expose `--execute`, create drawers or vectors, mutate runtime adoption
events, or promote research output into knowledge.

P69 adds a quality-gated runtime adoption write path through
`mempal phase3 adoption record-checked` and
`mempal_phase3 action=record_checked`. The command/action runs the P64 record
quality policy immediately before writing. `quality=ready` records are written,
`quality=warning` records are blocked by default unless `allow_warnings` is
explicitly set, and `quality=invalid` records are always blocked. This reduces
low-signal self-evolution evidence without adding hooks, background
instrumentation, schema changes, or new authority for Phase-3 gates.
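
The write decision described above reduces to a three-way match. A minimal sketch, with hypothetical names (`Quality`, `should_write`); only the three quality levels and the `allow_warnings` override come from the text:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Quality {
    Ready,
    Warning,
    Invalid,
}

/// Decides whether a checked record may be written.
pub fn should_write(quality: Quality, allow_warnings: bool) -> bool {
    match quality {
        Quality::Ready => true,
        // Warnings are blocked unless the caller opts in explicitly.
        Quality::Warning => allow_warnings,
        // Invalid records are never written, regardless of the override.
        Quality::Invalid => false,
    }
}
```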

P70 self-evolution completion audit records the current state against the
larger objective: a complete self-evolving agent system. The deliverables of
such a system are:

- evidence substrate: the system can store cited raw evidence and runtime
  adoption outcomes without losing provenance
- knowledge governance: evidence can be distilled into governed knowledge and
  moved through lifecycle gates
- runtime retrieval: agents can request context/search/card/research guidance
  without changing defaults implicitly
- research bridge: external research output can enter as evidence and candidate
  insight suggestions, not as direct dao
- feedback loop: runtime use, acceptance, rejection, misses, rollbacks, and
  contradictions can be recorded and reviewed
- policy hardening path: stronger defaults require evidence, readiness checks,
  rollback criteria, and a new P-level spec

Prompt-to-artifact checklist:

- Evidence substrate -> P54 runtime_adoption_events plus P0-P13 raw drawer
  storage and cited search provide durable evidence records.
- Knowledge governance -> P12-P28 implement typed `dao_tian` / `dao_ren` /
  `shu` / `qi` drawers, context assembly, policy surfaces, distill, gate,
  promote, demote, and anchor publication.
- Knowledge cards -> P31-P45 implement Phase-2 card schema, core API, CLI,
  MCP read/lifecycle/retrieval, backfill, and explicit card-aware context.
- Research ingestion -> P49/P59/P67/P68 preserve evidence-first research
  boundaries: validate report, plan evidence refs, write research evidence only
  through explicit CLI `--execute`, and expose MCP dry-run planning.
- Runtime adoption -> P54-P69 implement event storage, CLI/MCP record/list/stats,
  guidance, prepare/check helpers, review, readiness, and quality-gated
  `record_checked` writes.
- Default hardening -> P56/P57/P58/P66 define read-only gates and readiness
  reports for card context, card embeddings, evaluator APIs, and
  card-context-default eligibility.

P70 conclusion: not complete. P12-P69 establish a governed self-evolution
substrate, but they do not yet prove a complete self-evolving agent system.
Remaining gaps before full self-evolution:

- no automatic or semi-automatic adoption capture around actual agent tool
  execution; evidence still depends on explicit CLI/MCP calls
- no end-to-end replay that demonstrates research -> evidence -> distill ->
  gated promotion -> runtime context -> checked adoption record in one audited
  scenario
- no evaluator advisory API with replayable output contracts; P50/P58 only keep
  evaluators advisory and gated
- no default-on card context or card embeddings; P66 readiness only reports
  eligibility for a future default-on spec
- no autonomous promotion authority; lifecycle changes still require deterministic
  gates, evidence refs, and human/reviewer boundaries
- no rollback executor for default/runtime policy changes; rollback criteria are
  policy requirements, not an automated runtime mechanism

Future P candidates (P71-P74, since implemented):

- P71 self-evolution loop replay: implemented as a CLI E2E replay test that
  walks research -> evidence -> knowledge card -> gate/promote -> context ->
  checked adoption record. This proves the existing pieces can compose, but it
  does not add automatic runtime capture.
- P72 runtime adoption capture helper: implemented as explicit CLI/MCP
  `capture` surfaces that map `surface/outcome` observations into existing
  checked runtime adoption records. It is dry-run by default, writes only with
  explicit execute, and does not add background instrumentation.
- P73 evaluator advisory API: implemented as deterministic CLI/MCP advice
  surfaces through `mempal phase3 evaluator advise` and
  `mempal_phase3 action=evaluator_advise`. Advice output is replayable from
  request fields, returns `writes=false`, `lifecycle_authority=false`,
  `deterministic_gate_required=true`, reasons, and a `surface=evaluator`
  adoption capture plan. It cannot mutate lifecycle state, satisfy reviewer
  requirements, bypass gates, or call LLM/network evaluators.
- P74 card context default-on proposal: implemented as read-only CLI/MCP
  proposal surfaces through `mempal phase3 default-proposal card-context` and
  `mempal_phase3 action=default_proposal`. The proposal embeds P66 readiness,
  requires explicit rollback criteria, returns `writes=false`, and only marks
  `proposal_ready=true` when both readiness and rollback criteria are present.
  It deliberately keeps `mempal context` / `mempal_context` default
  `include_cards=false`; any actual default change still requires a future
  explicit spec.

P75 self-evolution completion audit revisits the full objective after P71-P74
landed on main.

P75 objective restatement:

- The system must preserve raw evidence and runtime outcome evidence with
  provenance.
- The system must distill evidence into governed knowledge and cards through
  deterministic gates.
- The system must expose retrieval/context surfaces that agents can use without
  silently changing runtime defaults.
- The system must accept external research only through evidence-first and
  evidence-backed candidate insight paths.
- The system must record, review, and quality-gate runtime adoption feedback.
- The system must provide evaluator advice without granting lifecycle authority.
- The system must provide a policy-hardening path where stronger defaults require
  evidence, readiness checks, rollback criteria, and a new explicit spec.

P75 prompt-to-artifact checklist:

- Evidence substrate: P0-P13 raw drawer storage and citation-bearing search,
  P54 `runtime_adoption_events`, and schema v9 tests prove durable evidence and
  runtime outcome storage.
- Knowledge governance: P12-P28 typed `dao_tian` / `dao_ren` / `shu` / `qi`
  drawers, policy surfaces, distill, gate, promote/demote, and anchor
  publication remain covered by `tests/knowledge_lifecycle.rs`.
- Knowledge cards: P31-P45 card schema, core API, CLI, MCP, retrieval, backfill,
  and lifecycle surfaces remain covered by `tests/knowledge_card_*` and explicit
  card-aware context tests.
- Research bridge: P49/P59/P67/P68 ensure research output enters as evidence or
  evidence-backed candidate insight suggestions; P71 proves this path in
  `tests/phase3_self_evolution_replay.rs`.
- Self-evolution replay: P71 `tests/phase3_self_evolution_replay.rs` walks
  research -> evidence -> card promotion -> context -> checked adoption record.
- Adoption capture: P72 `mempal phase3 adoption capture` and
  `mempal_phase3 action=capture` map concrete `surface/outcome` observations
  into checked records without background instrumentation.
- Evaluator advice: P73 `mempal phase3 evaluator advise` and
  `mempal_phase3 action=evaluator_advise` return replayable advisory output with
  `lifecycle_authority=false` and `deterministic_gate_required=true`.
- Default hardening proposal: P74 `mempal phase3 default-proposal card-context`
  and `mempal_phase3 action=default_proposal` combine P66 readiness with
  rollback criteria while preserving `include_cards=false`.
- Protocol and inventory evidence: `src/core/protocol.rs`, `AGENTS.md`, and
  `CLAUDE.md` list the Phase-3 actions through
  `capture/evaluator_advise/default_proposal`.
- Mainline verification evidence: PRs #63, #64, #65, and #66 were merged to
  main with green `fmt`, `default`, and `rest` CI checks.

P75 conclusion: not complete.

The governed self-evolution substrate is now substantially complete: evidence,
knowledge governance, cards, research ingestion, runtime adoption feedback,
replay, capture helpers, evaluator advice, and default-on proposal artifacts are
implemented and tested. However, the full "complete self-evolving agent system"
objective still has uncovered requirements if interpreted as autonomous runtime
self-evolution.

Remaining gaps after P75:

- no automatic live tool instrumentation: adoption capture still requires
  explicit CLI/MCP calls rather than wrapping actual agent tool execution
- no actual default-on runtime change: `include_cards` remains opt-in by design,
  and P74 only creates a proposal artifact
- no rollback executor: rollback criteria are recorded in proposals but are not
  executable runtime policy
- no autonomous promotion authority: lifecycle mutation still requires
  deterministic gates, evidence refs, and human/reviewer boundaries
- no card embedding implementation: P57/P47 keep card-level embeddings behind
  measured miss evidence and future rollback requirements

P76 spec completeness invariant records the process rule that every numbered P
must leave both a task contract (spec) and a matching plan before it can be
considered complete. This covers documentation-only, audit-only, policy-only,
and code implementation work. The rule exists because the P-series is no longer
just a task list; it is the auditable decision trail for the mind-model
implementation. A P with a missing spec or plan is explicitly incomplete and
must be fixed before implementation or merge.

Updated recommended next P candidates after reserving P76 for governance:

P77 live adoption instrumentation boundary adds a read-only policy surface for
the live instrumentation gap. `mempal phase3 adoption instrumentation-policy`
and `mempal_phase3 action=instrumentation_policy` return `writes=false`,
`default_mode=manual_only`, allow only `opt_in_wrapper` as the semi-automatic
mode, and explicitly forbid `implicit_background_capture`, silent event append,
and quality gate bypass. This does not install hooks or wrappers; it defines the
safe boundary future instrumentation must obey: opt-in, user opt-out, checked
capture/record_checked writes, and rollback evidence when instrumentation
degrades behavior.
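
As a rough sketch of what such a read-only policy report might look like in code: `writes`, `manual_only`, `opt_in_wrapper`, and `implicit_background_capture` come from the text above, while the struct, the `permits` helper, and the identifiers `silent_event_append` and `quality_gate_bypass` are hypothetical names chosen for illustration.

```rust
/// Hypothetical read-only policy report for live adoption instrumentation.
#[derive(Debug, Clone, PartialEq)]
pub struct InstrumentationPolicy {
    pub writes: bool,
    pub default_mode: &'static str,
    pub allowed_modes: Vec<&'static str>,
    pub forbidden: Vec<&'static str>,
}

/// Returns the static policy; nothing is installed or written.
pub fn instrumentation_policy() -> InstrumentationPolicy {
    InstrumentationPolicy {
        writes: false,
        default_mode: "manual_only",
        allowed_modes: vec!["manual_only", "opt_in_wrapper"],
        // The last two identifiers are assumed names for the forbidden
        // behaviors described in prose.
        forbidden: vec![
            "implicit_background_capture",
            "silent_event_append",
            "quality_gate_bypass",
        ],
    }
}

impl InstrumentationPolicy {
    /// A mode is permitted only if it is explicitly allowed and not forbidden.
    pub fn permits(&self, mode: &str) -> bool {
        self.allowed_modes.iter().any(|m| *m == mode)
            && !self.forbidden.iter().any(|m| *m == mode)
    }
}
```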

Updated recommended next P candidates after completing P77:

P78 card context default runtime flag implements the first actual default
runtime change path for cards. The default remains `false` unless local config
sets `context.include_cards_default=true`. `mempal context` and `mempal_context`
use that config only when the request omits explicit card flags; CLI
`--include-cards` still opts in, CLI `--no-include-cards` opts out, and MCP
`include_cards` overrides config when supplied. The only supported write path is
`mempal phase3 default-control card-context`: enabling requires the P74
proposal-ready conditions, including sufficient `card_context/include_cards`
runtime adoption evidence and rollback criteria; disabling is always allowed
and writes the flag back to false. The command writes local config only and does
not append runtime adoption events, change schema, or alter search defaults.
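
The precedence described above (explicit request flag, then local config, then `false`) and the asymmetric enable/disable rule can be sketched as two small functions. The function names and signatures are assumptions made for illustration; only the behavior is taken from the text.

```rust
/// Resolves the effective `include_cards` value for one context request.
/// `request_flag` is `Some` when the caller passed `--include-cards`,
/// `--no-include-cards`, or an MCP `include_cards` value; `config_default`
/// mirrors `context.include_cards_default` when it is set in local config.
pub fn resolve_include_cards(request_flag: Option<bool>, config_default: Option<bool>) -> bool {
    request_flag.or(config_default).unwrap_or(false)
}

/// Returns the new value for the config flag, or an error when enabling is
/// requested without the P74 proposal-ready conditions. Disabling always
/// succeeds.
pub fn default_control(enable: bool, proposal_ready: bool) -> Result<bool, String> {
    if enable && !proposal_ready {
        return Err("enabling requires P74 proposal-ready conditions".to_string());
    }
    Ok(enable)
}
```

Keeping resolution separate from the write path means a request-level override never needs to touch config.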

Updated recommended next P candidates after completing P78:

- P79 rollback executor policy implements the first concrete rollback executor
  for default/runtime policy changes. CLI `mempal phase3 rollback-control
  card-context` evaluates `card_context/include_cards` rollback evidence and,
  only with `--execute`, writes local config
  `context.include_cards_default=false`. MCP `mempal_phase3
  action=rollback_control` exposes the same rollback evidence check as a
  read-only agent surface. No runtime adoption events, knowledge lifecycle
  state, schema, or search defaults are changed by rollback control.

Updated recommended next P candidates after completing P79:

P80 autonomous promotion boundary audit resolves the last ambiguous "gap" from
P70/P75 as a governance boundary rather than a missing implementation.
Autonomous promotion is out of scope for the current complete self-evolution
design. mempal can autonomously preserve evidence, prepare candidate knowledge,
evaluate gates, produce evaluator advice, assemble context, propose default
changes, and execute explicit rollback controls, but lifecycle authority remains
human-gated.

P80 decision: autonomous promotion is out of scope.

Human-gated lifecycle authority is the final governance boundary: promotion and
demotion of Stage-1 knowledge drawers or Phase-2 knowledge cards must remain
explicit human/operator-triggered lifecycle mutation surfaces. Deterministic
gates, evidence refs, reviewer rules, evaluator advice, runtime adoption
evidence, and research findings can support the decision, but none of them can
silently convert a candidate into promoted knowledge. This boundary keeps the
system self-evolving in the evidence/proposal/context/adoption loop while
avoiding an agent that can grant itself durable knowledge authority.

Updated recommended next P candidate after completing P80:

- P81 self-evolution completion audit: re-evaluate the active objective against
  the actual artifacts after P77-P80. If the governed, human-gated definition is
  accepted as the intended objective, the audit can close the goal; if the goal
  still requires fully autonomous lifecycle mutation, that requirement must be
  reopened as a separate explicit spec rather than inferred.

P81 self-evolution completion audit is the final audit for the active objective
`完整自进化 agent 系统` ("complete self-evolving agent system").

Objective restatement: the target is a governed human-gated complete
self-evolving agent system. "Complete" means the agent can gather external and
runtime evidence, preserve provenance, distill and structure knowledge, retrieve
the right knowledge/skills/tools at runtime, record feedback, evaluate stronger
defaults, apply explicit default/rollback controls, and keep durable knowledge
lifecycle mutation under deterministic gates plus human/operator intent.

Prompt-to-artifact checklist:

- Evidence substrate: P0-P13 raw drawer storage, citation-bearing search, and
  P54 `runtime_adoption_events` provide durable evidence and runtime outcome
  storage. Evidence remains raw and cited; search/context do not rewrite source
  content.
- Knowledge governance: P12-P28 typed `dao_tian` / `dao_ren` / `shu` / `qi`
  drawers, policy surfaces, distill, gate, promote/demote, and anchor
  publication provide governed knowledge lifecycle for Stage-1 drawers.
- Knowledge cards: P31-P45 implement card schema, evidence links, append-only
  events, CLI/MCP lifecycle, backfill, retrieval, and explicit card-aware
  context without making cards an implicit search/default source.
- Research bridge: P49/P59/P67/P68 ensure external research enters as evidence
  or evidence-backed candidate insight suggestions. Research output cannot
  directly define dao or bypass gates.
- Runtime feedback loop: P54-P69 provide runtime adoption event storage,
  guidance, prepare/check helpers, review/readiness/gate reports, and
  quality-gated `record_checked` writes.
- Self-evolution replay: P71 `tests/phase3_self_evolution_replay.rs` proves the
  composed path research -> evidence -> card promotion -> context -> checked
  adoption record.
- Live adoption boundary: P77 `instrumentation_policy` defines the safe boundary
  for future live wrappers: opt-in, preserve opt-out, no silent event append,
  and route writes through checked capture or `record_checked`.
- Runtime default control: P74/P78 provide proposal-ready and explicit
  `default-control` paths for card-aware context. Default change requires
  runtime evidence and rollback criteria; request-level overrides still win.
- Rollback control: P79 `rollback-control` turns rollback evidence
  into an explicit reversible config action, setting
  `context.include_cards_default=false` only with `--execute` and without
  writing runtime events or lifecycle state.
- Evaluator boundary: P50/P58/P73 keep evaluators advisory-only.
  `evaluator_advise` is replayable, returns `lifecycle_authority=false`, and
  cannot satisfy reviewer authority or bypass deterministic gates.
- Lifecycle authority boundary: P80 declares autonomous promotion out of scope.
  Human/operator-triggered promote/demote commands with evidence refs and gates
  remain the only durable lifecycle mutation path.
- Spec completeness: P76 requires every numbered P to leave a matching task spec
  and plan. P77-P81 follow this rule.
- Mainline verification: PR #68 through PR #72 are merged to main. Main CI runs
  `25805677837`, `25806999068`, `25808402185`, `25809830828`, and
  `25810588996` all completed with success across `fmt`, `default`, and `rest`
  jobs.

P81 conclusion: complete.

The active objective is complete under the governed human-gated definition. The
system now has an auditable loop from evidence intake to governed knowledge,
runtime context, feedback capture, policy evaluation, explicit default control,
and explicit rollback. It does not grant agents silent durable lifecycle
authority, and that is an intentional design boundary rather than a missing
implementation.

Residual boundary: fully autonomous lifecycle mutation remains out of scope. If
future work requires an agent to promote or demote durable knowledge without a
human/operator-triggered lifecycle command, that is a new objective and must be
opened as a separate P-level spec with its own evidence, rollback, safety, and
acceptance criteria.

## Post-P81 Opt-In Instrumentation

P82 implements the first concrete `opt_in_wrapper` allowed by P77. CLI
`mempal phase3 adoption wrap` explicitly runs one child command after `--`,
observes its exit status, maps `0` to `accepted` and non-zero to `rejected`
unless `--outcome` overrides the mapping, and returns a wrapper report with the
child exit code plus the nested capture report.
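
The exit-status mapping can be sketched as follows. The function name and the use of plain strings for outcomes are assumptions; the `0` to `accepted`, non-zero to `rejected`, and `--outcome` override behavior is as described above.

```rust
/// Maps a child exit code to an adoption outcome label.
/// `outcome_override` models an explicit `--outcome` value and wins when set.
pub fn map_exit_code(exit_code: i32, outcome_override: Option<&str>) -> String {
    match outcome_override {
        Some(outcome) => outcome.to_string(),
        None if exit_code == 0 => "accepted".to_string(),
        None => "rejected".to_string(),
    }
}
```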

P82 preserves the governed runtime boundary:

- no hooks, daemons, background workers, or silent capture are installed
- no MCP-side shell command execution is added
- no runtime adoption event is written unless `--execute` is supplied
- all writes reuse P72 capture mapping and P69 checked-record quality gates
- warning-quality captures remain blocked unless `--allow-warnings` is supplied
- non-zero child exits are propagated after the wrapper report is emitted

This makes runtime evidence capture less manual while keeping instrumentation
explicit, opt-in, quality-gated, and reversible through existing rollback
evidence policies.

## Cognitive Brief

P83 adds `mempal brief`, the first deterministic cognitive brief surface. It
does not replace `mempal search` or `mempal context`; it uses the existing
context assembly path with evidence and cards enabled, then organizes the result
into a citation-first report.

The P83 brief contains:

- key facts from governed knowledge/context items
- cited evidence items
- active knowledge cards and their linked evidence citations
- simple entity cues extracted from cited text
- unresolved-item cues such as action items, blockers, or remaining work
- uncertainty signals such as missing evidence, missing governed knowledge,
  missing cards, unresolved work, or conflict/stale language
- deterministic next actions

P83 deliberately avoids LLM synthesis, MCP command execution, schema changes,
fact-check side effects, dream-cycle maintenance, or runtime adoption writes.
Its purpose is to make the system "read for you" in a safe first step: organize
cited memory into an actionable brief while preserving uncertainty instead of
hallucinating confidence.

## Multi-Agent Cowork Bus

P84 upgrades cowork from a two-tool Claude/Codex pair protocol into a
project-scoped multi-agent bus. The old P8 path routes to tool-family inboxes
such as `claude` or `codex`; that is insufficient when one project has one
Claude Code instance and multiple Codex instances, because all Codex instances
race on the same shared `codex` inbox.

The P84 bus introduces stable `agent_id` addressing:

- `cowork-register` records concrete instances such as `claude-main`,
  `codex-a`, and `codex-b`
- `cowork-send` targets one concrete `agent_id`
- `cowork-broadcast` fans out independent inbox copies to multiple targets
- `cowork-agent-drain` drains one concrete agent inbox
- `cowork-agents` lists registered agents and pending per-agent inbox state
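
A minimal in-memory model of per-agent routing, assuming one inbox per registered `agent_id`. The real bus is file-backed; the type and method names here are illustrative, and rejecting sends to unregistered agents is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub struct BusMessage {
    pub from: String,
    pub to: String,
    pub body: String,
}

/// In-memory stand-in for the file-backed bus: one inbox per `agent_id`.
#[derive(Default)]
pub struct Bus {
    inboxes: BTreeMap<String, Vec<BusMessage>>,
}

impl Bus {
    /// Registers a concrete agent; re-registering keeps the existing inbox.
    pub fn register(&mut self, agent_id: &str) {
        self.inboxes.entry(agent_id.to_string()).or_default();
    }

    /// Delivers one message to exactly one concrete agent.
    pub fn send(&mut self, from: &str, to: &str, body: &str) -> Result<(), String> {
        let inbox = self
            .inboxes
            .get_mut(to)
            .ok_or_else(|| format!("unknown agent_id: {}", to))?;
        inbox.push(BusMessage {
            from: from.to_string(),
            to: to.to_string(),
            body: body.to_string(),
        });
        Ok(())
    }

    /// Fans out an independent copy to each target inbox. This sketch stops
    /// at the first unknown target, so earlier targets keep their copy.
    pub fn broadcast(&mut self, from: &str, targets: &[&str], body: &str) -> Result<(), String> {
        for target in targets {
            self.send(from, *target, body)?;
        }
        Ok(())
    }

    /// Returns and clears one agent's inbox; other inboxes are untouched.
    pub fn drain(&mut self, agent_id: &str) -> Vec<BusMessage> {
        self.inboxes
            .get_mut(agent_id)
            .map(|inbox| std::mem::take(inbox))
            .unwrap_or_default()
    }

    /// Number of undrained messages for one agent.
    pub fn pending(&self, agent_id: &str) -> usize {
        self.inboxes.get(agent_id).map_or(0, |inbox| inbox.len())
    }
}
```

Because `codex-a` and `codex-b` own separate inboxes, draining one cannot consume the other's messages, which is the race the shared `codex` inbox could not avoid.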

State lives under `~/.mempal/cowork-bus/<encoded_project_identity>/`, outside
`palace.db`. P84 remains ephemeral and file-backed: it does not write drawers,
cards, runtime adoption events, audit entries, or schema state. The legacy
`cowork-drain --target claude|codex` and `mempal_cowork_push` path remains
available unchanged for backward compatibility.

P84 stores optional transport metadata but only inbox delivery is active. tmux
send/capture is intentionally left to the next P-level task so that instance
identity and per-agent routing are proven before adding a more invasive
terminal-injection transport.

## MCP Multi-Agent Cowork Bus

P85 exposes the P84 bus to agent runtimes through one MCP tool:
`mempal_cowork_bus`. This is the point where the bus becomes usable by agents
directly, not only by shell commands.

The MCP surface is action-based:

- `action=register` registers or updates a concrete `agent_id`
- `action=list` reports project bus agents and pending inbox counts
- `action=send` delivers one message to one concrete target
- `action=broadcast` fans out independent inbox copies to multiple targets
- `action=drain` returns and consumes one concrete agent's inbox

P85 deliberately does not infer concrete instances from MCP `client_info.name`.
Client names can identify a tool family such as Codex, but they cannot
distinguish `codex-a` from `codex-b`. The multi-agent bus therefore requires
explicit `agent_id` values and remains separate from legacy
`mempal_cowork_push`, which is still the simpler Claude<->Codex partner
handoff path.

The MCP tool uses the same file-backed bus state under
`~/.mempal/cowork-bus/<encoded_project_identity>/`. It does not write
`palace.db`, drawers, cards, runtime adoption events, or schema state. tmux
transport remains a later layer on top of the now-explicit agent registry and
per-agent routing model.

## Tmux Cowork Transport

P86 activates tmux as an explicit transport for the multi-agent bus. `inbox`
remains the default safe path. A target agent can opt into near-real-time pane
delivery by registering with `transport=tmux` and a concrete `tmux_target`.

Example:

```bash
mempal cowork-register \
  --agent-id codex-a \
  --tool codex \
  --cwd "$PWD" \
  --transport tmux \
  --tmux-target mempal:0.1
```

After that, `cowork-send`, `cowork-broadcast`, and
`mempal_cowork_bus action=send|broadcast` use the same transport-aware bus core.
For tmux targets, mempal invokes the local `tmux` binary directly with
`std::process::Command`; it does not execute through a shell. The delivered text
is a plain envelope containing source agent id, target agent id, and message
content.

P86 intentionally does not silently fall back to inbox if tmux fails. A tmux
transport target means "deliver to this pane"; if that pane or binary is
unavailable, the send fails and no inbox copy is written. This prevents
ambiguous double-delivery semantics where a pane may receive a message and then
later drain the same message from an inbox.
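
The no-fallback rule can be sketched with the tmux call injected as a closure, so the control flow is visible without spawning a process. The `Transport` enum, the `deliver` signature, and the closure parameter are assumptions made for this illustration.

```rust
/// Registered transport for one target agent.
pub enum Transport {
    Inbox,
    Tmux { target: String },
}

/// Delivers one envelope. For tmux targets a failure is returned as-is and
/// the inbox is never touched, so a message is never delivered twice.
/// `tmux_send` stands in for the direct `tmux` invocation.
pub fn deliver<F>(
    transport: &Transport,
    envelope: &str,
    inbox: &mut Vec<String>,
    tmux_send: F,
) -> Result<(), String>
where
    F: FnOnce(&str, &str) -> Result<(), String>,
{
    match transport {
        Transport::Inbox => {
            inbox.push(envelope.to_string());
            Ok(())
        }
        // Hard failure: no silent fallback to the inbox branch.
        Transport::Tmux { target } => tmux_send(target.as_str(), envelope),
    }
}
```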

This gives mempal three cowork layers:

- legacy partner handoff: `mempal_cowork_push` for Claude<->Codex pairs
- multi-agent bus inbox: explicit `agent_id` routing with per-agent inbox files
- tmux transport: explicit pane delivery for registered concrete agents

## Cowork Bus Event Log

P87 adds an operational event log to the multi-agent bus. Communication is no
longer only "fire and inspect inbox"; every important bus action also appends a
JSON Lines event under:

```text
~/.mempal/cowork-bus/<encoded_project_identity>/events.jsonl
```

The event stream records:

- `register` events when a concrete agent id is registered or updated
- `send` and `broadcast` delivery events for successful inbox or tmux delivery
- `send` / `broadcast` failure events for tmux hard failures
- `drain` events when an agent drains its per-agent inbox

Replay is intentionally read-only. `mempal cowork-events --cwd <repo>` and
`mempal_cowork_bus action=events` list the operational event stream; they do
not redeliver messages, drain inboxes, trigger tmux, or ingest anything into
`palace.db`. Message bodies are represented as bounded `message_preview`
fields, so the event stream is an operational audit trail rather than a second
durable memory store.
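
One way to keep `message_preview` bounded is to truncate by characters rather than bytes, so multi-byte text is never split mid-character. The function below is a sketch; the limit value and the `...` suffix are assumptions, not documented behavior.

```rust
/// Returns at most `max_chars` characters of `body`, appending "..." when
/// the body was truncated. Counting `char`s avoids slicing inside a
/// multi-byte UTF-8 sequence.
pub fn message_preview(body: &str, max_chars: usize) -> String {
    let mut preview: String = body.chars().take(max_chars).collect();
    if body.chars().count() > max_chars {
        preview.push_str("...");
    }
    preview
}
```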

This is the first runtime-ops layer above P84-P86. It gives later delivery
ack/status, presence, thread/channel, and tmux peek work a shared evidence
source for "what happened on the bus" without changing the core memory schema.

## Cowork Delivery Ack And Status

P88 derives delivery status from the P87 event stream. There is no mutable
status table and no database migration. The event id of each successful or
failed delivery is the `message_id` surfaced to CLI and MCP callers.

Status is computed by replaying `events.jsonl`:

- `pending`: delivery succeeded and has not been drained or acked
- `drained`: a later drain event consumed the target agent's inbox message
- `acked`: the target agent explicitly appended an ack event for that
  `message_id`
- `failed`: the original delivery event was a hard transport failure
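
Status replay can be sketched as a single pass over the event stream. The event and status types below are hypothetical simplifications of `events.jsonl` records. Two choices are assumptions of this sketch: an ack takes precedence over a drain, and a drain only affects inbox-delivered messages.

```rust
/// Simplified bus events; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum BusEvent {
    Delivered { id: String, to: String, inbox: bool },
    Failed { id: String, to: String },
    Drain { agent_id: String },
    Ack { agent_id: String, message_id: String },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryStatus {
    Pending,
    Drained,
    Acked,
    Failed,
}

/// Replays events in order and returns `(message_id, status)` per delivery.
pub fn replay_status(events: &[BusEvent]) -> Vec<(String, DeliveryStatus)> {
    // Row layout: (message_id, target agent, delivered via inbox, status).
    let mut rows: Vec<(String, String, bool, DeliveryStatus)> = Vec::new();
    for event in events {
        match event {
            BusEvent::Delivered { id, to, inbox } => {
                rows.push((id.clone(), to.clone(), *inbox, DeliveryStatus::Pending));
            }
            BusEvent::Failed { id, to } => {
                rows.push((id.clone(), to.clone(), false, DeliveryStatus::Failed));
            }
            BusEvent::Drain { agent_id } => {
                // A drain consumes every pending inbox message for that agent.
                for row in rows.iter_mut() {
                    if &row.1 == agent_id && row.2 && row.3 == DeliveryStatus::Pending {
                        row.3 = DeliveryStatus::Drained;
                    }
                }
            }
            BusEvent::Ack { agent_id, message_id } => {
                // Only the target agent can ack, and failures stay failed.
                for row in rows.iter_mut() {
                    if &row.0 == message_id
                        && &row.1 == agent_id
                        && row.3 != DeliveryStatus::Failed
                    {
                        row.3 = DeliveryStatus::Acked;
                    }
                }
            }
        }
    }
    rows.into_iter().map(|(id, _, _, status)| (id, status)).collect()
}
```

Because status is recomputed from the append-only log on every call, there is no mutable status table to migrate or repair.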

The user-facing surfaces are:

```bash
mempal cowork-deliveries --cwd "$PWD" --agent-id codex-a
mempal cowork-ack --cwd "$PWD" --agent-id codex-a --message-id evt-...
```

The MCP surface reuses `mempal_cowork_bus` with `action=deliveries` and
`action=ack`. Ack is explicit and append-only: it does not mutate inbox files,
does not redeliver messages, and does not write `palace.db`. This keeps the
bus operationally observable while preserving the original ephemeral cowork
boundary.

## Cowork Agent Presence

P89 adds explicit heartbeat-based presence. Registration gives each concrete
agent a `last_seen_at`, and agents can refresh it with:

```bash
mempal cowork-heartbeat --cwd "$PWD" --agent-id codex-a
```

Presence is derived when listing agents:

- `online`: last seen within the default 10-minute stale threshold
- `stale`: last seen exists but is older than the stale threshold
- `never_seen`: legacy or hand-edited records without `last_seen_at`
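
The three presence states can be derived with a small pure function. This sketch assumes timestamps are Unix seconds and treats "within the threshold" as inclusive; neither detail is specified above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Presence {
    Online,
    Stale,
    NeverSeen,
}

/// Default stale threshold from the text: 10 minutes, expressed in seconds.
pub const DEFAULT_STALE_SECS: u64 = 10 * 60;

/// Derives presence at list time; nothing is stored or refreshed here.
/// `saturating_sub` keeps a `last_seen_at` slightly in the future (clock
/// skew) from underflowing, so such agents count as online.
pub fn presence(last_seen_at: Option<u64>, now: u64, stale_after_secs: u64) -> Presence {
    match last_seen_at {
        None => Presence::NeverSeen,
        Some(seen) if now.saturating_sub(seen) <= stale_after_secs => Presence::Online,
        Some(_) => Presence::Stale,
    }
}
```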

The same semantics are exposed through `mempal_cowork_bus action=heartbeat`
and `action=list`. This remains an explicit signal: mempal does not install a
daemon, does not infer liveness from tmux panes, and does not silently record
background heartbeat events. That keeps presence useful for coordination
without pretending to know more than the agent instances have explicitly
reported.

## Cowork Threads And Channels

P90 adds two coordination scopes above raw agent addressing:

- `thread_id` separates work streams such as `p90-review` or `release-audit`
- `channel` names a group of concrete agents such as `review` or `frontend`

Direct `cowork-send` and `cowork-broadcast` can attach `thread_id` and
`channel` metadata. The metadata is carried into bus inbox messages, events,
and delivery status replay, so a receiver can see which work stream a message
belongs to when draining its inbox.

Channels are explicit registry entries, not inferred from tool family names:

```bash
mempal cowork-channel-set \
  --cwd "$PWD" \
  --channel review \
  --agent codex-a \
  --agent codex-b

mempal cowork-channel-send \
  --cwd "$PWD" \
  --from claude-main \
  --channel review \
  --thread-id p90-review \
  --message "review this patch"
```

`cowork-channel-set` replaces membership for one channel, and
`cowork-channel-send` fans out through the same delivery core as broadcast. The
MCP surface exposes the same behavior as `mempal_cowork_bus`
`action=channel_set|channel_list|channel_send`.
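
The replace-on-set semantics can be illustrated with a small registry. The type and method names are hypothetical, and de-duplicating repeated `--agent` values is an assumption of this sketch rather than documented behavior.

```rust
use std::collections::BTreeMap;

/// Channel name mapped to its current list of concrete agent ids.
#[derive(Default)]
pub struct ChannelRegistry {
    members: BTreeMap<String, Vec<String>>,
}

impl ChannelRegistry {
    /// Replaces (does not merge) the membership of one channel.
    pub fn set(&mut self, channel: &str, agents: &[&str]) {
        let mut unique: Vec<String> = Vec::new();
        for agent in agents {
            if !unique.iter().any(|a| a.as_str() == *agent) {
                unique.push((*agent).to_string());
            }
        }
        self.members.insert(channel.to_string(), unique);
    }

    /// Resolves a channel to the concrete agents a channel send fans out to.
    pub fn targets(&self, channel: &str) -> Result<Vec<String>, String> {
        self.members
            .get(channel)
            .cloned()
            .ok_or_else(|| format!("unknown channel: {}", channel))
    }
}
```

The resolved target list would then be handed to the same fan-out path used by broadcast.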

P90 still does not make channels durable memory. They are operational routing
state under `~/.mempal/cowork-bus/<encoded_project_identity>/`, and they do not
write `palace.db`.

## Cowork Tmux Live Peek

P91 completes the tmux runtime-ops loop by adding explicit read-only pane
inspection for agents registered with `transport=tmux` and a concrete
`tmux_target`.

```bash
mempal cowork-tmux-peek \
  --cwd "$PWD" \
  --agent-id codex-a \
  --lines 80
```

The MCP surface is `mempal_cowork_bus action=tmux_peek`. Both CLI and MCP use
the same adapter: a direct `std::process::Command` invocation of the local
`tmux` binary with `capture-pane`. It is not executed through a shell, and it
does not discover panes automatically. The registered `tmux_target` is the
authority.
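
A sketch of what "direct invocation, no shell" means in practice: each argument is passed as its own argv entry, so the registered target is never interpreted by a shell. The helper name is hypothetical, and the exact flags mempal passes are an assumption; `-p` (print to stdout), `-t` (target pane), and `-S` (start line, negative for history) are standard `tmux capture-pane` options.

```rust
use std::process::Command;

/// Builds (but does not run) a `tmux capture-pane` command for one pane.
/// `-S -N` starts the capture N lines back in the pane history.
pub fn capture_pane_command(tmux_target: &str, lines: u32) -> Command {
    let mut cmd = Command::new("tmux");
    cmd.arg("capture-pane")
        .arg("-p")
        .arg("-t")
        .arg(tmux_target)
        .arg("-S")
        .arg(format!("-{}", lines));
    cmd
}
```

A caller would run this with `.output()` and treat a non-success exit status as a hard error, matching the no-fallback behavior described above.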

Peek is deliberately not delivery. It does not append `events.jsonl`, does not
write an inbox message, does not update channel or agent registry state, and
does not write `palace.db`. Capture failure is a hard error rather than a
fallback to inbox or legacy `mempal_peek_partner`.

This preserves the separation between:

- agent live observation: read-only tmux pane capture
- operational communication: inbox/tmux send through the cowork bus
- durable memory: explicit ingest into the palace database

## Multi-Agent Cowork Runbook

P92 makes the multi-agent runtime surfaces operationally usable by adding the
authoritative [COWORK-RUNBOOK](COWORK-RUNBOOK.md) plus a read-only CLI surface:

```bash
mempal cowork-runbook --format plain
mempal cowork-runbook --format json
```

The runbook describes concrete agent registration, direct send, broadcast,
channels, threads, drain, delivery status, ack, presence, tmux delivery, tmux
peek, doctor, sessions, handoff summaries, and explicit memory capture. Reading
the runbook does not touch `~/.mempal` or `palace.db`.

## Cowork Doctor

P93 adds deterministic runtime diagnostics:

```bash
mempal cowork-doctor --cwd "$PWD"
mempal cowork-doctor --cwd "$PWD" --probe-tmux --format json
```

The MCP equivalent is `mempal_cowork_bus action=doctor`. Doctor checks registry
size, channel count, session count, stale or never-seen agents, pending
deliveries, and optional tmux target reachability. tmux probing uses direct
`tmux has-session -t <target>` invocation, not a shell.

Doctor is read-only. It does not append events, drain inboxes, update
heartbeats, repair state, or write memory.

## Cowork Team Sessions

P94 adds runtime team sessions stored under:

```text
~/.mempal/cowork-bus/<encoded_project_identity>/sessions.json
```

Sessions bind a collaboration goal to concrete agents, optional channels, and
an optional thread id:

```bash
mempal cowork-session-create \
  --cwd "$PWD" \
  --session-id p94-review \
  --title "P94 review" \
  --agent claude-main \
  --agent codex-a \
  --thread-id p94-review

mempal cowork-sessions --cwd "$PWD"
mempal cowork-session-status --cwd "$PWD" --session-id p94-review --status paused
```

The MCP actions are `session_create`, `session_list`, and `session_status`.
Session changes append operational events, but they do not become durable
project memory and they do not change message delivery semantics.

## Cowork Handoff Summary

P95 adds deterministic handoff summaries:

```bash
mempal cowork-handoff --cwd "$PWD"
mempal cowork-handoff --cwd "$PWD" --thread-id p95-review --format json
```

The MCP action is `mempal_cowork_bus action=handoff`. A handoff summarizes
active sessions, agents and presence, pending deliveries, and recent events. It
supports `thread_id`, `channel`, `session_id`, and `limit` filters. It does not
call an LLM, drain inboxes, ack deliveries, or persist memory.

## Cowork Memory Capture

P96 adds the explicit bridge from runtime cowork state to durable evidence:

```bash
mempal cowork-capture \
  --cwd "$PWD" \
  --summary-source handoff \
  --session-id p95-review \
  --execute \
  --format json
```

The MCP action is `mempal_cowork_bus action=capture`. Capture defaults to
dry-run. With `--execute` / `execute=true`, it writes one evidence drawer under
wing `cowork-capture` by default. It does not capture raw tmux pane text, does
not promote knowledge, does not create knowledge cards, and does not alter
delivery status. This keeps runtime communication ephemeral unless an agent or
human explicitly crosses the memory boundary.

## Maintenance Runbook

P97 adds the authoritative [MAINTENANCE-RUNBOOK](MAINTENANCE-RUNBOOK.md) plus a
read-only CLI:

```bash
mempal maintenance-runbook --format plain
mempal maintenance-runbook --format json
```

The runbook stitches together research validation, research evidence ingest,
knowledge distill, card lifecycle gates, context adoption, runtime adoption
review, rollback, cowork handoff, and explicit cowork capture. It is a
checklist for dream-cycle style maintenance, not a daemon or scheduler.

## Release Install Doctor

P98 adds a read-only operator diagnostic:

```bash
mempal doctor --format plain
mempal doctor --format json
```

Doctor runs before normal database open/migration. It reports current binary
version, supported schema version, configured database path, direct SQLite
`PRAGMA user_version`, current executable path, the first `mempal` found on
`PATH`, warnings, and recommendations. This is specifically for the failure
mode where a long-lived MCP client or shell resolves an old binary against a
newer `palace.db` schema.
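
The stale-binary check can be sketched as a comparison of already-collected values. Everything here is illustrative: the function name, parameters, and warning wording are assumptions, and a real doctor would read `PRAGMA user_version` and resolve `PATH` itself.

```rust
/// Produces warnings from values the doctor has already gathered.
/// `db_user_version` is `None` when no database file exists yet.
pub fn doctor_warnings(
    supported_schema: u32,
    db_user_version: Option<u32>,
    current_exe: &str,
    first_on_path: Option<&str>,
) -> Vec<String> {
    let mut warnings = Vec::new();
    if let Some(db_version) = db_user_version {
        // An old binary opening a newer database is the failure mode to flag.
        if db_version > supported_schema {
            warnings.push(format!(
                "database schema v{} is newer than supported v{}",
                db_version, supported_schema
            ));
        }
    }
    if let Some(path) = first_on_path {
        // A different `mempal` earlier on PATH may be what clients launch.
        if path != current_exe {
            warnings.push(format!(
                "PATH resolves `mempal` to {} but this binary is {}",
                path, current_exe
            ));
        }
    }
    warnings
}
```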

P99 exposes the same runtime diagnostic through MCP:

```text
mempal_doctor
```

The MCP response adds required server tool/action expectations so an agent can
tell whether its connected server advertises `mempal_context`, `mempal_brief`,
`mempal_phase3`, and `mempal_cowork_bus`.

## Guided Maintenance Run

P100 turns the static runbook into a deterministic dry-run command:

```bash
mempal maintenance guided-run --format plain
mempal maintenance guided-run --format json
```

The command is read-only (`writes=false`). It reports current drawer, knowledge
card, and runtime adoption event counts when a database exists, then emits an
ordered operator checklist for research validation, research evidence ingest,
knowledge distill, card lifecycle gate, context review, adoption review,
rollback review, cowork doctor, handoff, and explicit cowork capture. It never
executes the generated commands.

## Session Close Capture

P101 closes the loop for concrete multi-agent sessions:

```bash
mempal cowork-session-close --cwd "$PWD" --session-id p101-review
mempal cowork-session-close --cwd "$PWD" --session-id p101-review --capture --execute --format json
```

The MCP equivalent is `mempal_cowork_bus action=session_close`. Closing a
session updates the runtime session status to `closed` and appends the existing
session-status event. It does not write durable memory unless capture is
explicitly requested and `execute=true`; dry-run capture returns the handoff
payload without creating `palace.db`.

## MCP Cognitive Brief

P102 exposes the deterministic P83 brief to agent runtimes:

```text
mempal_brief
```

The tool accepts `query`, `field`, `domain`, `cwd`, `max_items`, and
`dao_tian_limit`. It returns the same citation-first shape as CLI brief:
summary, key facts, evidence, cards, entities, unresolved items, uncertainty,
and next actions. It does not call an LLM and does not write adoption evidence.

## Adoption Analytics

P103 adds a compact analytics view over runtime adoption events:

```bash
mempal phase3 adoption analytics --format plain
mempal phase3 adoption analytics --format json
```

MCP exposes the same report as `mempal_phase3 action=analytics`. Analytics is
read-only and groups events by `track` and `feature`, reporting used, accepted,
rejected, miss, rollback, contradiction, and neutral counts plus a
deterministic recommendation. This is the operator-facing bridge from raw
adoption evidence to default-change planning, without itself changing any
default or lifecycle state.
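
The grouping step can be sketched as a fold over events keyed by `(track, feature)`. The struct names are hypothetical and the outcome labels are assumed to be the plain strings listed above. The recommendation logic is omitted because its rules are not specified here.

```rust
use std::collections::BTreeMap;

/// Per-group outcome counters, one field per outcome named in the text.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct OutcomeCounts {
    pub used: usize,
    pub accepted: usize,
    pub rejected: usize,
    pub miss: usize,
    pub rollback: usize,
    pub contradiction: usize,
    pub neutral: usize,
}

/// Minimal view of one runtime adoption event.
pub struct AdoptionEvent<'a> {
    pub track: &'a str,
    pub feature: &'a str,
    pub outcome: &'a str,
}

/// Read-only aggregation: groups by `(track, feature)` and counts outcomes.
/// Unrecognized outcome labels are ignored in this sketch.
pub fn adoption_analytics(
    events: &[AdoptionEvent<'_>],
) -> BTreeMap<(String, String), OutcomeCounts> {
    let mut report: BTreeMap<(String, String), OutcomeCounts> = BTreeMap::new();
    for event in events {
        let key = (event.track.to_string(), event.feature.to_string());
        let counts = report.entry(key).or_default();
        match event.outcome {
            "used" => counts.used += 1,
            "accepted" => counts.accepted += 1,
            "rejected" => counts.rejected += 1,
            "miss" => counts.miss += 1,
            "rollback" => counts.rollback += 1,
            "contradiction" => counts.contradiction += 1,
            "neutral" => counts.neutral += 1,
            _ => {}
        }
    }
    report
}
```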

## Release Readiness

P104 adds the release checklist:

```bash
mempal release-readiness --format plain
mempal release-readiness --format json
```

The checklist is read-only (`writes=false`) and checks Cargo package metadata,
README presence, P98-P104 spec/plan inventory, runbooks, doctor availability,
and current schema support. It recommends concrete verification commands such
as `mempal doctor --format json`, `cargo test`, `cargo clippy -- -D warnings`,
and `cargo package`, but it does not run them automatically.

## Closing Summary

The proposed system is not "RAG plus skills."

It is a governed cognition stack:

- external tools gather and organize evidence
- memory stores both evidence and distilled knowledge
- skills operationalize methods under governing principles
- evaluators control what is allowed to harden into lasting law

That is the intended meaning of this design:

- `dao` is memory-level high-order knowledge
- `shu` is operational method
- `qi` is executable capability
- evidence is the substrate from which all of them must be justified

## P106: Context Distill Signal (read-only detector)

`mempal context` / `mempal_context` carry a read-only `distill_suggestions`
field. When assembling context, a deterministic detector groups active drawers
by `field` and flags each field where active **evidence** count is at least 5
AND there are zero active **promoted-or-canonical** knowledge drawers. It
returns at most 3 suggestions, ordered by descending evidence count then
ascending field; each carries `field`, `evidence_count`, up to 3
`sample_evidence_drawer_ids`, and `suggested_tier="dao_ren"`.
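
The detector rule can be sketched as a pure function over active drawers. The `Drawer` struct and its boolean flags are a simplified, hypothetical model, and taking the first three evidence ids in input order as samples is an assumption. The thresholds, ordering, and output fields follow the description above.

```rust
use std::collections::BTreeMap;

/// Simplified view of one ACTIVE drawer.
#[derive(Debug, Clone)]
pub struct Drawer {
    pub id: String,
    pub field: String,
    pub is_evidence: bool,
    /// Only meaningful for knowledge drawers.
    pub promoted_or_canonical: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DistillSuggestion {
    pub field: String,
    pub evidence_count: usize,
    pub sample_evidence_drawer_ids: Vec<String>,
    pub suggested_tier: &'static str,
}

/// Flags fields with at least 5 evidence drawers and no promoted or
/// canonical knowledge; returns at most 3, ordered by evidence count
/// descending and then field name ascending. Performs no writes.
pub fn distill_suggestions(active: &[Drawer]) -> Vec<DistillSuggestion> {
    // field -> (evidence drawer ids, promoted-or-canonical knowledge count)
    let mut by_field: BTreeMap<&str, (Vec<&str>, usize)> = BTreeMap::new();
    for drawer in active {
        let entry = by_field.entry(drawer.field.as_str()).or_default();
        if drawer.is_evidence {
            entry.0.push(drawer.id.as_str());
        } else if drawer.promoted_or_canonical {
            entry.1 += 1;
        }
    }
    let mut out: Vec<DistillSuggestion> = by_field
        .into_iter()
        .filter(|(_, (evidence, promoted))| evidence.len() >= 5 && *promoted == 0)
        .map(|(field, (evidence, _))| DistillSuggestion {
            field: field.to_string(),
            evidence_count: evidence.len(),
            sample_evidence_drawer_ids: evidence.iter().take(3).map(|id| id.to_string()).collect(),
            suggested_tier: "dao_ren",
        })
        .collect();
    out.sort_by(|a, b| {
        b.evidence_count
            .cmp(&a.evidence_count)
            .then_with(|| a.field.cmp(&b.field))
    });
    out.truncate(3);
    out
}
```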

This is the "detector" layer of agent-driven mind-model construction: it makes
"this is worth distilling" a client-agnostic, pull-based signal that appears
where agents already look. It is purely observational: it performs no database
write, no LLM call, no auto-distill, and no auto-promotion. Acting on a
suggestion remains an explicit agent step: a `mempal_knowledge_distill` call
followed by the deterministic gate, with P77/P80 governance unchanged. The
signal is on by default (`include_distill_suggestions`; disable with
`--no-distill-suggestions`) and never alters the assembled tier sections.