mx 0.1.174

A Swiss army knife for Claude Code and multi-agent toolkits
#import "lib.typ": *

#page-header(
  "Codex",
  "Session archival, export, and retrieval."
)

== Overview

The codex is the permanent archive for Claude Code session transcripts. Every
time you interact with Claude, the conversation is recorded as a JSONL file
under `~/.claude/projects/`. Those files are ephemeral -- Claude can overwrite
or rotate them at any time. The codex captures them into a stable, searchable
archive before they disappear.

Each archived session produces a directory containing:

- `manifest.json` -- metadata (timestamps, message count, agent count, project path, checksum)
- `session.jsonl` -- the raw transcript (omitted in `--clean` mode)
- `conversation.md` -- a clean markdown rendering of the conversation (generated in `--clean` mode, or via `migrate --clean`)
- `agents/` -- sub-agent JSONL files (when sub-agents were used)
- `images/` -- extracted base64 images (pulled out during migration or archive)

Archives are stored under `$MX_CODEX_PATH` (defaults to `$MX_HOME/codex/`),
one directory per session, named with a timestamp and short session ID.

== Archiving sessions

#command(
  "mx codex archive",
  [Archive session transcripts to permanent storage. With no arguments,
  archives the most recent non-agent session. With `--all`, walks
  `~/.claude/projects/` and archives every session not already in the codex.
  Already-archived sessions are skipped (idempotent).

  After archiving, the by-project index is rebuilt so subsequent reads can
  locate sessions by project name.],
  flags: (
    ([`[PATH]`], [positional], [Path to a specific session JSONL file. Conflicts with `--all` and `--backfill`.]),
    ([`--all`], [flag], [Archive all unarchived sessions. Conflicts with `--backfill`.]),
    ([`--clean`], [flag], [Save only `conversation.md`, `manifest.json`, and extracted images. Omits the raw JSONL and agent files. Produces a smaller, human-readable archive.]),
    ([`--include-agents`], [flag], [Fold sub-agent transcripts into `conversation.md`. Requires `--clean` and requires `subagents` in `--include` (the default).]),
    ([`--include <LIST>`], [string], [Comma-separated list of source artifacts to capture. Recognized tokens: `subagents` (default), `mcp`, `tool-output`, `history`, `all`, `none`. See below.]),
    ([`--backfill [VAULT_PATH]`], [optional path], [Ingest legacy vault snapshots into the codex. See _Backfill_ section. Conflicts with `--all` and `[PATH]`.]),
  ),
  examples: (
    "mx codex archive",
    "mx codex archive --all",
    "mx codex archive --clean --include-agents",
    "mx codex archive --all --clean --include subagents,mcp",
    "mx codex archive /path/to/specific-session.jsonl",
  ),
)

=== The --include flag

The `--include` flag controls which optional source artifacts are captured
alongside the session transcript. Tokens are case-insensitive and
comma-separated.

#table(
  columns: (auto, auto, auto),
  table.header([*Token*], [*Default*], [*Description*]),
  [`subagents`], [ON], [Copy sub-agent JSONL files into `agents/`.],
  [`mcp`], [OFF], [Capture MCP server logs.],
  [`tool-output`], [OFF], [Capture `/tmp` tool output snapshots.],
  [`history`], [OFF], [Capture a slice of `~/.claude/history.jsonl`.],
  [`all`], [--], [Enable all of the above.],
  [`none`], [--], [Disable all of the above.],
)

Passing both `all` and `none` in the same value is rejected as a user error.
Unknown tokens print a warning and are skipped. The hyphenated `tool-output`
is the canonical spelling; `tool_output` (with underscore) is not recognized.

#note[The `--include` flag on `archive` governs which source files are _captured_
into the archive. The separate `--include` flag on `export` governs which
captured artifacts are _rendered_ into the output. They share token names but
serve different purposes.]

=== Clean mode

When `--clean` is passed, the archive stores a rendered `conversation.md`
instead of the raw session JSONL. The transcript is generated by:

1. Extracting `user` and `assistant` messages from the JSONL
2. Stripping `<system-reminder>` blocks from user messages
3. Dropping tool-use blocks, tool results, and non-conversation message types
4. Labeling speakers with configurable names (`MX_USER_NAME` env var, or git `user.name`, falling back to "User"; `MX_ASSISTANT_NAME` env var, falling back to "Orchestrator")

With `--include-agents`, sub-agent transcripts are appended to
`conversation.md` under `## Agent: <name>` headings, separated by horizontal
rules. Agent names are resolved from the parent session's `subagent_type`
field when available, falling back to the hex ID from the filename.

== Backfill

The `--backfill` flag migrates historical session data from the legacy vault
(`~/.wonka/vault/archives/`) into the codex.

#command(
  "mx codex archive --backfill",
  [Walk every `session-*` snapshot under the vault path and feed each session
  JSONL through the standard archive pipeline. The vault path defaults to
  `~/.wonka/vault/archives/` when no value is given.

  Backfill is idempotent: re-running against the same vault produces the same
  codex state. Sessions already present in the codex (matched by session ID
  derived from the JSONL filename) are skipped. Per-session failures are
  non-fatal -- errors are accumulated and reported at the end so a single
  corrupt file does not abort the entire run.

  The `--clean` and `--include` flags still apply during backfill, governing
  what each per-session archive captures.],
  flags: (
    ([`--backfill [VAULT_PATH]`], [optional path], [Path to the vault archives directory. Defaults to `~/.wonka/vault/archives/`.]),
  ),
  examples: (
    "mx codex archive --backfill",
    "mx codex archive --backfill /custom/vault/path",
    "mx codex archive --backfill --clean --include-agents",
  ),
)

The expected vault layout is:

```
<vault_path>/
  session-YYYYMMDD-HHMMSS-NNNNNN/
    projects/
      <project-slug>/
        <session-uuid>.jsonl
        <session-uuid>/
          subagents/
            agent-*.jsonl
```

#tip[When a vault exists at the default path, every `mx codex` command prints a
reminder to run `mx codex archive --backfill`. The reminder is suppressed when
you are already running backfill.]

== Exporting

#command(
  "mx codex export",
  [Export archived sessions as Markdown or structured JSON. Content is read
  exclusively from the codex -- live `~/.claude/projects/` data is never
  ingested directly. If unarchived sessions are detected, a warning is printed
  to stderr (unless `--archive-first` is passed).

  With no selector flags, exports the most recent archived session. Selectors
  are mutually exclusive: at most one of `--session`, `--project`, or `--date`
  may be passed.],
  flags: (
    ([`--session <UUID>`], [string], [Select by session UUID (full or unique prefix).]),
    ([`--project <QUERY>`], [string], [Filter by project: absolute path, cwd-encoded slug, or basename. Ambiguous basenames list collisions and exit non-zero.]),
    ([`--date <RANGE>`], [string], [Date selector. Accepts `YYYY-MM-DD`, `YYYY-MM-DD..YYYY-MM-DD`, or `YYYY-MM`.]),
    ([`--format <FMT>`], [string], [Output format: `markdown` (default), `json`, or `both`. `both` requires `--output`.]),
    ([`--include <LIST>`], [string], [Comma-separated content to render. Default: `subagents`. Tokens: `subagents`, `tools`, `system-reminders`, `mcp`, `tool-output`, `history`, `all`, `none`.]),
    ([`--archive-first`], [flag], [Run `mx codex archive --all` before exporting. Suppresses the unarchived-data warning.]),
    ([`-o, --output <PATH>`], [path], [Output file. Default: stdout. Required when `--format both`.]),
  ),
  examples: (
    "mx codex export",
    "mx codex export --session abc123",
    "mx codex export --project mx --format json",
    "mx codex export --date 2026-04 -o april.md",
    "mx codex export --date 2026-04-01..2026-04-15 --format both -o sessions",
    "mx codex export --archive-first --include all -o full.md",
  ),
)

=== Format: both

When `--format both` is used with `--output`, two sidecar files are written:

- If the path ends in `.json`: JSON goes to the path, markdown goes to `<path>.md`
- If the path ends in `.md`: markdown goes to the path, JSON goes to `<path>.json`
- Otherwise: both extensions are appended (`<path>.json` and `<path>.md`)

Using `--format both` without `--output` is rejected -- there is no clean way
to multiplex two formats on stdout.

=== Export --include tokens

The export `--include` set is distinct from the archive `--include` set. It
controls which content is _rendered_, not which is _captured_. Two additional
tokens are available for export only:

#table(
  columns: (auto, auto),
  table.header([*Token*], [*Description*]),
  [`tools`], [Render `tool_use` blocks (assistant tool invocations).],
  [`system-reminders`], [Render `<system-reminder>` blocks.],
)

The remaining tokens (`subagents`, `mcp`, `tool-output`, `history`, `all`,
`none`) behave identically to the archive side.

== Browsing

=== List

#command(
  "mx codex list",
  [List archived sessions. By default, shows only the latest version of each
  session (filtering out incremental re-archives). Output is a table with
  columns: archive ID, archived timestamp, message count, agent count, and
  size.],
  flags: (
    ([`--all`], [flag], [Show all archives including incremental saves.]),
    ([`--json`], [flag], [Output as JSON array.]),
  ),
  examples: (
    "mx codex list",
    "mx codex list --all",
    "mx codex list --json",
  ),
)

=== Read

#command(
  "mx codex read <ID>",
  [Read an archived session by its short archive ID (from `mx codex list`).
  The ID is matched as a substring against archive directory names.

  By default, outputs the raw transcript. With `--clean`, outputs the
  rendered `conversation.md` (errors if no clean transcript exists -- use
  `mx codex migrate --clean` to generate one). With `--human`, pretty-prints
  JSONL as labeled User/Assistant blocks.],
  flags: (
    ([`<ID>`], [positional], [Archive ID (short UUID from `list`).]),
    ([`--human`], [flag], [Pretty-print JSONL in human-readable format. Conflicts with `--clean`.]),
    ([`--agents`], [flag], [Include agent transcripts in the output.]),
    ([`--grep <PATTERN>`], [string], [Filter output to lines matching the pattern.]),
    ([`--json`], [flag], [Output the manifest as JSON.]),
    ([`--clean`], [flag], [Read the clean markdown transcript (`conversation.md`). Conflicts with `--human`.]),
  ),
  examples: (
    "mx codex read abc12345",
    "mx codex read abc12345 --clean",
    "mx codex read abc12345 --clean --agents",
    "mx codex read abc12345 --human",
    "mx codex read abc12345 --grep \"migration\"",
    "mx codex read abc12345 --json",
  ),
)

=== Search

#command(
  "mx codex search <PATTERN>",
  [Search all archived sessions for a text pattern. Scans both
  `conversation.md` (preferred) and `session.jsonl` (fallback) in each
  archive. Reports matching archive IDs and the lines that contain the
  pattern. Archives with no transcript file are skipped with a count
  reported to stderr.],
  flags: (
    ([`<PATTERN>`], [positional], [Text pattern to search for.]),
    ([`--json`], [flag], [Output matches as JSON.]),
  ),
  examples: (
    "mx codex search \"retry logic\"",
    "mx codex search migration --json",
  ),
)

== Migration

#command(
  "mx codex migrate",
  [Upgrade archive formats. Archives below the current schema version are
  migrated forward.

  Without `--clean`, the primary migration is image extraction: v1 archives
  have base64 images embedded in the JSONL. Migration extracts them to
  `images/` files and rewrites the JSONL with references. Older archives
  also receive a metadata-only version bump to the current schema (v5).
  Original files are backed up as `*.bak`.

  With `--clean`, generates `conversation.md` for archives that have
  `session.jsonl` but no clean transcript. This is useful for retroactively
  adding human-readable transcripts to archives created before clean mode
  existed.],
  flags: (
    ([`--dry-run`], [flag], [Show what would be migrated without making changes.]),
    ([`--detailed`], [flag], [Show detailed progress for each archive.]),
    ([`--clean`], [flag], [Generate `conversation.md` for archives missing a clean transcript.]),
    ([`--include-agents`], [flag], [Include sub-agent transcripts in generated clean transcripts. Requires `--clean`.]),
  ),
  examples: (
    "mx codex migrate --dry-run",
    "mx codex migrate --detailed",
    "mx codex migrate --clean",
    "mx codex migrate --clean --include-agents --detailed",
  ),
)

== Deprecated alias

#deprecated[`mx codex save` was renamed to `mx codex archive` in PR \#284
(issue \#273). The `save` subcommand still works and accepts all the same
flags, but it prints a deprecation notice to stderr on every invocation:

```
note: `mx codex save` is deprecated; use `mx codex archive` instead.
```

The `save` alias is hidden from `--help` output. It will be removed in a
future release. Update scripts and muscle memory to use `mx codex archive`.]