#import "lib.typ": *
#page-header(
"Codex",
"Session archival, export, and retrieval."
)
== Overview
The codex is the permanent archive for Claude Code session transcripts. Every
time you interact with Claude, the conversation is recorded as a JSONL file
under `~/.claude/projects/`. Those files are ephemeral -- Claude can overwrite
or rotate them at any time. The codex captures them into a stable, searchable
archive before they disappear.
Each archived session produces a directory containing:
- `manifest.json` -- metadata (timestamps, message count, agent count, project path, checksum)
- `session.jsonl` -- the raw transcript (omitted in `--clean` mode)
- `conversation.md` -- a clean markdown rendering of the conversation (generated in `--clean` mode, or via `migrate --clean`)
- `agents/` -- sub-agent JSONL files (when sub-agents were used)
- `images/` -- extracted base64 images (pulled out during migration or archive)
Archives are stored under `$MX_CODEX_PATH` (defaults to `$MX_HOME/codex/`),
one directory per session, named with a timestamp and short session ID.
== Archiving sessions
#command(
"mx codex archive",
[Archive session transcripts to permanent storage. With no arguments,
archives the most recent non-agent session. With `--all`, walks
`~/.claude/projects/` and archives every session not already in the codex.
Already-archived sessions are skipped (idempotent).
After archiving, the by-project index is rebuilt so subsequent reads can
locate sessions by project name.],
flags: (
([`[PATH]`], [positional], [Path to a specific session JSONL file. Conflicts with `--all` and `--backfill`.]),
([`--all`], [flag], [Archive all unarchived sessions. Conflicts with `--backfill`.]),
([`--clean`], [flag], [Save only `conversation.md`, `manifest.json`, and extracted images. Omits the raw JSONL and agent files. Produces a smaller, human-readable archive.]),
([`--include-agents`], [flag], [Fold sub-agent transcripts into `conversation.md`. Requires `--clean` and requires `subagents` in `--include` (the default).]),
([`--include <LIST>`], [string], [Comma-separated list of source artifacts to capture. Recognized tokens: `subagents` (default), `mcp`, `tool-output`, `history`, `all`, `none`. See below.]),
([`--backfill [VAULT_PATH]`], [optional path], [Ingest legacy vault snapshots into the codex. See _Backfill_ section. Conflicts with `--all` and `[PATH]`.]),
),
examples: (
"mx codex archive",
"mx codex archive --all",
"mx codex archive --clean --include-agents",
"mx codex archive --all --clean --include subagents,mcp",
"mx codex archive /path/to/specific-session.jsonl",
),
)
=== The --include flag
The `--include` flag controls which optional source artifacts are captured
alongside the session transcript. Tokens are case-insensitive and
comma-separated.
#table(
columns: (auto, auto, auto),
table.header([*Token*], [*Default*], [*Description*]),
[`subagents`], [ON], [Copy sub-agent JSONL files into `agents/`.],
[`mcp`], [OFF], [Capture MCP server logs.],
[`tool-output`], [OFF], [Capture `/tmp` tool output snapshots.],
[`history`], [OFF], [Capture a slice of `~/.claude/history.jsonl`.],
[`all`], [--], [Enable all of the above.],
[`none`], [--], [Disable all of the above.],
)
Passing both `all` and `none` in the same value is rejected as a user error.
Unknown tokens print a warning and are skipped. The hyphenated `tool-output`
is the canonical spelling; `tool_output` (with underscore) is not recognized.
#note[The `--include` flag on `archive` governs which source files are _captured_
into the archive. The separate `--include` flag on `export` governs which
captured artifacts are _rendered_ into the output. They share token names but
serve different purposes.]
=== Clean mode
When `--clean` is passed, the archive stores a rendered `conversation.md`
instead of the raw session JSONL. The transcript is generated by:
1. Extracting `user` and `assistant` messages from the JSONL
2. Stripping `<system-reminder>` blocks from user messages
3. Dropping tool-use blocks, tool results, and non-conversation message types
4. Labeling speakers with configurable names (`MX_USER_NAME` env var, or git `user.name`, falling back to "User"; `MX_ASSISTANT_NAME` env var, falling back to "Orchestrator")
With `--include-agents`, sub-agent transcripts are appended to
`conversation.md` under `## Agent: <name>` headings, separated by horizontal
rules. Agent names are resolved from the parent session's `subagent_type`
field when available, falling back to the hex ID from the filename.
== Backfill
The `--backfill` flag migrates historical session data from the legacy vault
(`~/.wonka/vault/archives/`) into the codex.
#command(
"mx codex archive --backfill",
[Walk every `session-*` snapshot under the vault path and feed each session
JSONL through the standard archive pipeline. The vault path defaults to
`~/.wonka/vault/archives/` when no value is given.
Backfill is idempotent: re-running against the same vault produces the same
codex state. Sessions already present in the codex (matched by session ID
derived from the JSONL filename) are skipped. Per-session failures are
non-fatal -- errors are accumulated and reported at the end so a single
corrupt file does not abort the entire run.
The `--clean` and `--include` flags still apply during backfill, governing
what each per-session archive captures.],
flags: (
([`--backfill [VAULT_PATH]`], [optional path], [Path to the vault archives directory. Defaults to `~/.wonka/vault/archives/`.]),
),
examples: (
"mx codex archive --backfill",
"mx codex archive --backfill /custom/vault/path",
"mx codex archive --backfill --clean --include-agents",
),
)
The expected vault layout is:
```
<vault_path>/
session-YYYYMMDD-HHMMSS-NNNNNN/
projects/
<project-slug>/
<session-uuid>.jsonl
<session-uuid>/
subagents/
agent-*.jsonl
```
#tip[When a vault exists at the default path, every `mx codex` command prints a
reminder to run `mx codex archive --backfill`. The reminder is suppressed when
you are already running backfill.]
== Exporting
#command(
"mx codex export",
[Export archived sessions as Markdown or structured JSON. Content is read
exclusively from the codex -- live `~/.claude/projects/` data is never
ingested directly. If unarchived sessions are detected, a warning is printed
to stderr (unless `--archive-first` is passed).
With no selector flags, exports the most recent archived session. Selectors
are mutually exclusive: at most one of `--session`, `--project`, or `--date`
may be passed.],
flags: (
([`--session <UUID>`], [string], [Select by session UUID (full or unique prefix).]),
([`--project <QUERY>`], [string], [Filter by project: absolute path, cwd-encoded slug, or basename. Ambiguous basenames list collisions and exit non-zero.]),
([`--date <RANGE>`], [string], [Date selector. Accepts `YYYY-MM-DD`, `YYYY-MM-DD..YYYY-MM-DD`, or `YYYY-MM`.]),
([`--format <FMT>`], [string], [Output format: `markdown` (default), `json`, or `both`. `both` requires `--output`.]),
([`--include <LIST>`], [string], [Comma-separated content to render. Default: `subagents`. Tokens: `subagents`, `tools`, `system-reminders`, `mcp`, `tool-output`, `history`, `all`, `none`.]),
([`--archive-first`], [flag], [Run `mx codex archive --all` before exporting. Suppresses the unarchived-data warning.]),
([`-o, --output <PATH>`], [path], [Output file. Default: stdout. Required when `--format both`.]),
),
examples: (
"mx codex export",
"mx codex export --session abc123",
"mx codex export --project mx --format json",
"mx codex export --date 2026-04 -o april.md",
"mx codex export --date 2026-04-01..2026-04-15 --format both -o sessions",
"mx codex export --archive-first --include all -o full.md",
),
)
=== Format: both
When `--format both` is used with `--output`, two sidecar files are written:
- If the path ends in `.json`: JSON goes to the path, markdown goes to `<path>.md`
- If the path ends in `.md`: markdown goes to the path, JSON goes to `<path>.json`
- Otherwise: both extensions are appended (`<path>.json` and `<path>.md`)
Using `--format both` without `--output` is rejected -- there is no clean way
to multiplex two formats on stdout.
=== Export --include tokens
The export `--include` set is distinct from the archive `--include` set. It
controls which content is _rendered_, not which is _captured_. Two additional
tokens are available for export only:
#table(
columns: (auto, auto),
table.header([*Token*], [*Description*]),
[`tools`], [Render `tool_use` blocks (assistant tool invocations).],
[`system-reminders`], [Render `<system-reminder>` blocks.],
)
The remaining tokens (`subagents`, `mcp`, `tool-output`, `history`, `all`,
`none`) behave identically to the archive side.
== Browsing
=== List
#command(
"mx codex list",
[List archived sessions. By default, shows only the latest version of each
session (filtering out incremental re-archives). Output is a table with
columns: archive ID, archived timestamp, message count, agent count, and
size.],
flags: (
([`--all`], [flag], [Show all archives including incremental saves.]),
([`--json`], [flag], [Output as JSON array.]),
),
examples: (
"mx codex list",
"mx codex list --all",
"mx codex list --json",
),
)
=== Read
#command(
"mx codex read <ID>",
[Read an archived session by its short archive ID (from `mx codex list`).
The ID is matched as a substring against archive directory names.
By default, outputs the raw transcript. With `--clean`, outputs the
rendered `conversation.md` (errors if no clean transcript exists -- use
`mx codex migrate --clean` to generate one). With `--human`, pretty-prints
JSONL as labeled User/Assistant blocks.],
flags: (
([`<ID>`], [positional], [Archive ID (short UUID from `list`).]),
([`--human`], [flag], [Pretty-print JSONL in human-readable format. Conflicts with `--clean`.]),
([`--agents`], [flag], [Include agent transcripts in the output.]),
([`--grep <PATTERN>`], [string], [Filter output to lines matching the pattern.]),
([`--json`], [flag], [Output the manifest as JSON.]),
([`--clean`], [flag], [Read the clean markdown transcript (`conversation.md`). Conflicts with `--human`.]),
),
examples: (
"mx codex read abc12345",
"mx codex read abc12345 --clean",
"mx codex read abc12345 --clean --agents",
"mx codex read abc12345 --human",
"mx codex read abc12345 --grep \"migration\"",
"mx codex read abc12345 --json",
),
)
=== Search
#command(
"mx codex search <PATTERN>",
[Search all archived sessions for a text pattern. Scans both
`conversation.md` (preferred) and `session.jsonl` (fallback) in each
archive. Reports matching archive IDs and the lines that contain the
pattern. Archives with no transcript file are skipped with a count
reported to stderr.],
flags: (
([`<PATTERN>`], [positional], [Text pattern to search for.]),
([`--json`], [flag], [Output matches as JSON.]),
),
examples: (
"mx codex search \"retry logic\"",
"mx codex search migration --json",
),
)
== Migration
#command(
"mx codex migrate",
[Upgrade archive formats. Archives below the current schema version are
migrated forward.
Without `--clean`, the primary migration is image extraction: v1 archives
have base64 images embedded in the JSONL. Migration extracts them to
`images/` files and rewrites the JSONL with references. Older archives
also receive a metadata-only version bump to the current schema (v5).
Original files are backed up as `*.bak`.
With `--clean`, generates `conversation.md` for archives that have
`session.jsonl` but no clean transcript. This is useful for retroactively
adding human-readable transcripts to archives created before clean mode
existed.],
flags: (
([`--dry-run`], [flag], [Show what would be migrated without making changes.]),
([`--verbose`], [flag], [Show detailed progress for each archive.]),
([`--clean`], [flag], [Generate `conversation.md` for archives missing a clean transcript.]),
([`--include-agents`], [flag], [Include sub-agent transcripts in generated clean transcripts. Requires `--clean`.]),
),
examples: (
"mx codex migrate --dry-run",
"mx codex migrate --verbose",
"mx codex migrate --clean",
"mx codex migrate --clean --include-agents --verbose",
),
)
== Deprecated alias
#deprecated[`mx codex save` was renamed to `mx codex archive` in PR \#284
(issue \#273). The `save` subcommand still works and accepts all the same
flags, but it prints a deprecation notice to stderr on every invocation:
```
note: `mx codex save` is deprecated; use `mx codex archive` instead.
```
The `save` alias is hidden from `--help` output. It will be removed in a
future release. Update scripts and muscle memory to use `mx codex archive`.]