mx 0.1.159

A Swiss army knife for Claude Code and multi-agent toolkits
#import "lib.typ": *

#page-header(
  "Filesystem Layout",
  "Paths, environment variables, and configuration."
)

This page is the canonical reference for where `mx` reads and writes files,
which environment variables control those locations, and what to expect when
upgrading from an earlier release.

== Table of contents

- #link(<the-principle>)[The principle]
- #link(<layout>)[Layout]
- #link(<env-vars>)[Environment variables]
- #link(<migration>)[Migration & legacy fallbacks]
- #link(<renamed-commands>)[Renamed and removed CLI commands]
- #link(<examples>)[Examples]
- #link(<contributors>)[Notes for contributors]


// ═══════════════════════════════════════════════════════════════════════
// THE PRINCIPLE
// ═══════════════════════════════════════════════════════════════════════

== The principle <the-principle>

Every path `mx` touches derives from a single base directory: `$MX_HOME`, which
defaults to `~/.mx/`. Each subsystem owns a subdirectory beneath that base.

Overrides are layered. From most specific to least specific:

+ *Per-file* -- e.g. `MX_KV_SCHEMA`, `MX_KV_DATA`. These point at exact files
  (and may include a `{agent}` placeholder).
+ *Per-subsystem* -- e.g. `MX_SURREAL_ROOT`, `MX_CODEX_PATH`,
  `MX_ISOLATE_FASTEMBED`. These move one subsystem's root.
+ *Base* -- `MX_HOME`. Moves the entire tree at once.
+ *Default* -- `~/.mx/`.

A more specific override always wins. Setting `MX_KV_DATA=/etc/foo.json` keeps
that one file at `/etc/foo.json` regardless of `MX_HOME` or any subsystem
override.

The base path is hardcoded in exactly one place:
#link("https://github.com/coryzibell/mx/blob/main/src/paths.rs")[`src/paths.rs`].
Everything else asks `paths.rs` for its location.


// ═══════════════════════════════════════════════════════════════════════
// LAYOUT
// ═══════════════════════════════════════════════════════════════════════

== Layout <layout>

```
~/.mx/                          # base; override $MX_HOME
├── kv/
│   ├── schema/{agent}.toml     # override MX_KV_SCHEMA
│   └── data/{agent}.json       # override MX_KV_DATA
├── state/
│   └── schemas/{id}.yaml       # default ID: "tensor"; CLI --schema flag
├── memory/
│   ├── surreal/                # override MX_SURREAL_ROOT
│   ├── embed/                  # only when MX_ISOLATE_FASTEMBED is set
│   └── seed/
│       ├── agents/             # *.md (frontmatter)
│       └── knowledge/          # *.jsonl (markdown ingest tracked in #257)
├── codex/                      # override MX_CODEX_PATH
├── cache/sync/{owner-repo}/
├── artifacts/
└── swap/
```

=== `kv/` <layout-kv>

The KV store: per-agent schema (TOML) and data (JSON). Used by `mx kv` and any
agent that needs fast local state. Each agent gets one schema file and one data
file, keyed off the #link(<env-other>)[`MX_CURRENT_AGENT`] environment variable.

- `kv/schema/{agent}.toml` -- TOML schema declaring keys, types, and defaults.
- `kv/data/{agent}.json` -- JSON-serialized current values, written atomically.

Override either file with #link(<env-paths>)[`MX_KV_SCHEMA`] /
#link(<env-paths>)[`MX_KV_DATA`]. Both env vars accept the literal string
`{agent}`, which is substituted with the active agent name at resolution time.

=== `state/schemas/` <layout-state>

YAML (or JSON) schemas for the emotional-state tensor system used by
`mx state`. The default schema ID is `tensor`, resolving to
`state/schemas/tensor.yaml`.

Pick a different schema with `mx state ... --schema {id|path}`. The flag
argument is classified as a path or an ID by a simple heuristic, then looked
up:

- *Bare ID* (e.g. `--schema tensor`): the loader tries
  `state/schemas/{id}.yaml` first, then `state/schemas/{id}.yml`, then
  `state/schemas/{id}.json`. The first file that exists wins; if none exist,
  the lookup fails with a "schema not found" error.
- *Direct path* (e.g. `--schema /tmp/foo.json`): the file is loaded
  directly. Any extension is accepted; the parser tries YAML first and falls
  back to JSON based on file contents, not the extension.

The argument is classified as a path if it contains a `/` *or* ends with
`.yaml`, `.yml`, or `.json`; otherwise it is classified as a bare ID. A dot
elsewhere in the name is irrelevant -- only those three suffixes flip the
classification. So `--schema my.schema` is treated as an *ID* (no slash,
no recognized extension) and, if no matching file exists under
`state/schemas/`, fails with a "schema not found" error. To force a path
lookup of a dotted name without one of the recognized extensions, prefix it
with `./` (e.g. `--schema ./my.schema`) -- the slash flips it to path mode.

There is no env-var override for the schema choice anymore -- the old
`MX_STATE_SCHEMA` was replaced by the CLI flag. (See
#link(<renamed-commands>)[Renamed and removed CLI commands].)

=== `memory/` <layout-memory>

The knowledge graph backend and its inputs.

- `memory/surreal/` -- SurrealKV embedded database files. Override the whole
  directory with #link(<env-paths>)[`MX_SURREAL_ROOT`]. For network-mode
  SurrealDB, the directory is unused -- see the
  #link(<env-surreal>)[SurrealDB connection vars].
- `memory/embed/` -- only created when #link(<env-paths>)[`MX_ISOLATE_FASTEMBED`]
  is set. Holds FastEmbed model weights that would otherwise live in the shared
  XDG cache (`$XDG_CACHE_HOME/fastembed/`). Use this when you want mx's model
  cache isolated from other tools that also use FastEmbed.
- `memory/seed/agents/` -- markdown files with YAML frontmatter, one per agent.
  Loaded by `mx memory seed agents`.
- `memory/seed/knowledge/` -- one or more `*.jsonl` files. Loaded by
  `mx memory seed knowledge`, which scans the directory for every `.jsonl` it
  finds.

=== `codex/` <layout-codex>

Session archives written by `mx codex archive` -- transcripts, extracted images,
and per-archive manifests. Override with #link(<env-paths>)[`MX_CODEX_PATH`].
This is typically the largest directory in `~/.mx/`; point it at a roomier disk
if you archive a lot of sessions.

=== `cache/sync/{owner-repo}/` <layout-cache>

Per-repo cache directory used by `mx sync` to track GitHub issues and
discussions across runs. The repo slug replaces the `/` in `owner/repo` with
`-` (so `coryzibell/mx` becomes `coryzibell-mx`). Safe to delete -- it will be
rebuilt on the next `mx sync pull`.

=== `artifacts/` <layout-artifacts>

Generic output directory for handlers that need to drop a file somewhere
predictable but don't have a more specific home. Treat as ephemeral.

=== `swap/` <layout-swap>

Scratch space for in-flight operations. Cleared opportunistically; do not store
anything you want to keep.


// ═══════════════════════════════════════════════════════════════════════
// ENVIRONMENT VARIABLES
// ═══════════════════════════════════════════════════════════════════════

== Environment variables <env-vars>

=== Path overrides <env-paths>

#table(
  columns: (auto, auto, auto, 1fr),
  table.header([*Variable*], [*Type*], [*Default*], [*Overrides*]),
  [`MX_HOME`], [path], [`~/.mx/`], [The base directory; moves the entire tree],
  [`MX_SURREAL_ROOT`], [path], [`$MX_HOME/memory/surreal/`], [SurrealKV embedded-database root],
  [`MX_CODEX_PATH`], [path], [`$MX_HOME/codex/`], [Codex archive directory],
  [`MX_KV_SCHEMA`], [path template], [`$MX_HOME/kv/schema/{agent}.toml`], [KV schema file; `{agent}` placeholder substituted],
  [`MX_KV_DATA`], [path template], [`$MX_HOME/kv/data/{agent}.json`], [KV data file; `{agent}` placeholder substituted],
  [`MX_ISOLATE_FASTEMBED`], [boolean flag], [unset], [When non-empty, redirects FastEmbed cache from XDG to `$MX_HOME/memory/embed/`],
)

Empty-string values for the path overrides in this table (`MX_HOME`,
`MX_SURREAL_ROOT`, `MX_CODEX_PATH`, `MX_KV_SCHEMA`, `MX_KV_DATA`,
`MX_ISOLATE_FASTEMBED`) are treated as unset and fall back to the default.

The same is *not* uniformly true of other `MX_*` env vars. In particular,
the #link(<env-surreal>)[SurrealDB connection vars] below do not all filter
empty strings: setting `MX_SURREAL_USER=""` produces an empty username, not the
default `root`. Among the connection vars only `MX_SURREAL_PASS` /
`MX_SURREAL_PASS_FILE` are empty-filtered. To restore a default, *leave the
variable unset entirely* rather than setting it to an empty string.

The boolean flag (`MX_ISOLATE_FASTEMBED`) is "on" for any non-empty value
(`1`, `true`, `yes` -- it doesn't parse, it just checks for non-emptiness).

=== SurrealDB connection <env-surreal>

These configure the SurrealDB driver. They affect which database mx talks to,
not where files live on disk (with the exception of `MX_SURREAL_ROOT` above,
which is the embedded-mode storage location).

#table(
  columns: (auto, auto, auto, 1fr),
  table.header([*Variable*], [*Type*], [*Default*], [*Purpose*]),
  [`MX_SURREAL_MODE`], [enum], [`embedded`], [`embedded` for local SurrealKV, `network` for WebSocket],
  [`MX_SURREAL_URL`], [URL], [`ws://localhost:8000`], [WebSocket URL (network mode only)],
  [`MX_SURREAL_USER`], [string], [`root`], [Username],
  [`MX_SURREAL_PASS`], [secret], [unset], [Password (literal value)],
  [`MX_SURREAL_PASS_FILE`], [path], [unset], [Path to a file containing the password (e.g. an agenix secret); read when `MX_SURREAL_PASS` is unset],
  [`MX_SURREAL_NS`], [string], [`memory`], [Namespace],
  [`MX_SURREAL_DB`], [string], [`knowledge`], [Database name],
  [`MX_SURREAL_AUTH_LEVEL`], [enum], [`root`], [One of `root`, `namespace` (or `ns`), `database` (or `db`)],
)

=== GitHub App auth (sync) <env-github>

Optional. Only needed when `mx sync` runs against a private repo via a GitHub
App rather than via your personal `gh` token.

#table(
  columns: (auto, auto, auto, 1fr),
  table.header([*Variable*], [*Type*], [*Default*], [*Purpose*]),
  [`MX_GITHUB_APP_ID`], [string], [unset], [GitHub App ID],
  [`MX_GITHUB_INSTALLATION_ID`], [string], [unset], [App installation ID for the target org/user],
  [`MX_GITHUB_PRIVATE_KEY`], [secret (PEM)], [unset], [App private key, PEM-encoded],
)

=== Identity & display <env-other>

#table(
  columns: (auto, auto, auto, 1fr),
  table.header([*Variable*], [*Type*], [*Default*], [*Purpose*]),
  [`MX_CURRENT_AGENT`], [string], [unset], [Active agent identity. Required for `mx memory wake` and any command that reads/writes per-agent KV. Also the default for `--source-agent` on `mx memory add`],
  [`MX_USER_NAME`], [string], [`git config user.name`, else `"User"`], [Display name for "user" turns in codex transcripts. Resolution order: env var > git config > literal `"User"`],
  [`MX_ASSISTANT_NAME`], [string], [`"Orchestrator"`], [Display name for "assistant" turns in codex transcripts. No git fallback -- the default is the literal string `Orchestrator`],
)

=== Tuning <env-tuning>

#table(
  columns: (auto, auto, auto, 1fr),
  table.header([*Variable*], [*Type*], [*Default*], [*Purpose*]),
  [`MX_WAKE_CHUNK_BYTES`], [integer], [`28000`], [Maximum bytes per chunk during the wake-ritual presentation step. Values that fail to parse or are zero fall back silently to the default],
)

=== Removed <env-removed>

#table(
  columns: (auto, 1fr),
  table.header([*Variable*], [*Replacement*]),
  [`MX_MEMORY_PATH`], [Use #link(<env-paths>)[`MX_SURREAL_ROOT`] for just the database, or #link(<env-paths>)[`MX_HOME`] to move everything together. Setting the old name now emits a one-line stderr note and is otherwise ignored],
  [`MX_STATE_SCHEMA`], [Use the `mx state ... --schema {id|path}` CLI flag. The default schema ID also changed: it is now `tensor` (was `crewu`)],
)


// ═══════════════════════════════════════════════════════════════════════
// MIGRATION & LEGACY FALLBACKS
// ═══════════════════════════════════════════════════════════════════════

== Migration & legacy fallbacks <migration>

The path-alignment refactor (\#255, merged via PR \#259) moved several files
without breaking older installs. For one release cycle, mx will read from the
old locations as a soft fallback and emit a one-line `note:` to stderr telling
you what moved. *No data is lost. The warnings are informative, not errors.*

#table(
  columns: (1fr, 1fr, 1fr),
  table.header([*Legacy location*], [*New location*], [*Behavior*]),
  [`~/.crewu/kv/{agent}.schema.toml`, `~/.crewu/kv/{agent}.data.json`],
    [`$MX_HOME/kv/schema/{agent}.toml`, `$MX_HOME/kv/data/{agent}.json`],
    [Read-only fallback; consolidated stderr note fires once per process],
  [`$MX_HOME/agents/` (agent seed `*.md`)],
    [`$MX_HOME/memory/seed/agents/`],
    [Read-only fallback; stderr note when used],
  [`$MX_HOME/memory/index.jsonl` (knowledge seed)],
    [`$MX_HOME/memory/seed/knowledge/*.jsonl`],
    [Read-only fallback. This is a _shape_ change, not a rename: the old location was a single hardcoded file (`index.jsonl`); the new location is a directory scanned for every `*.jsonl` it finds. Stderr note when the legacy file is read],
  [`MX_MEMORY_PATH` env var],
    [`MX_SURREAL_ROOT` env var],
    [Old var *not honored*; setting it just triggers a rename note],
)

To silence the warnings, move the files (or rename the env var). The fallbacks
will be removed in a future release. To track the removal in source, grep for
`TODO(*-migration)` and `TODO(memory-path-rename-note)` in the codebase.


// ═══════════════════════════════════════════════════════════════════════
// RENAMED AND REMOVED CLI COMMANDS
// ═══════════════════════════════════════════════════════════════════════

== Renamed and removed CLI commands <renamed-commands>

#table(
  columns: (auto, auto, 1fr),
  table.header([*Old*], [*New*], [*Notes*]),
  [`mx agents seed`], [`mx memory seed agents`], [Old form still parses but bails with a one-line pointer to the new command],
  [`mx memory import`], [`mx memory seed knowledge`], [Now scans a directory; loads every `*.jsonl` it finds rather than a single hardcoded file],
  [`mx memory rebuild`], [(removed)], [Reindexing moved out of the user-facing surface; see issue \#258 for `mx doctor memory rebuild`],
  [`mx state ... --env-MX_STATE_SCHEMA`], [`mx state ... --schema {id|path}`], [CLI flag replaces the env var; accepts a bare schema ID or a direct path],
  [`codex save`], [`codex archive`], [Renamed for clarity],
  [`session export`], [`codex export`], [Moved under the codex subcommand],
)

#deprecated[The old command names in this table still parse in current builds
but emit a pointer to their replacement. They will be removed in a future
release.]


// ═══════════════════════════════════════════════════════════════════════
// EXAMPLES
// ═══════════════════════════════════════════════════════════════════════

== Examples <examples>

Move the entire mx tree to a different disk:

```bash
export MX_HOME=/data/mx
```

Keep mx's defaults but put the SurrealDB store on a fast SSD:

```bash
export MX_SURREAL_ROOT=/mnt/ssd/mx-surreal
```

Use a custom KV schema for the `inkwell` agent (without overriding the data
file):

```bash
export MX_KV_SCHEMA=/etc/mx/inkwell-schema.toml
```

Use a path template that resolves per-agent (one variable, many agents):

```bash
export MX_KV_SCHEMA='/etc/mx/schemas/{agent}.toml'
export MX_KV_DATA='/var/lib/mx/{agent}.json'
```

Isolate the FastEmbed model cache so it doesn't share with other tools:

```bash
export MX_ISOLATE_FASTEMBED=1
# Models will now download into $MX_HOME/memory/embed/
```

Encode a tensor. The first form uses the default `tensor` schema; the second
points at an explicit schema file:

```bash
mx state encode --dimensions "temp=0.8 entropy=0.75 agency=0.4"
mx state encode --schema /tmp/myschema.yaml -d "temp=0.5"
```

To target a non-default schema by ID, drop a YAML file at
`$MX_HOME/state/schemas/{id}.yaml` and pass `--schema {id}`. The bare-ID
form is what the lookup helper handles; for an absolute or relative file
path, just pass the path directly (see
#link(<layout-state>)[`state/schemas/`] for the path-vs-ID heuristic).

Point SurrealDB at a remote network instance:

```bash
export MX_SURREAL_MODE=network
export MX_SURREAL_URL=ws://surreal.internal:8000
export MX_SURREAL_USER=mx
export MX_SURREAL_PASS_FILE=/run/agenix/mx-surreal-pass
```


// ═══════════════════════════════════════════════════════════════════════
// NOTES FOR CONTRIBUTORS
// ═══════════════════════════════════════════════════════════════════════

== Notes for contributors <contributors>

Every path in mx routes through
#link("https://github.com/coryzibell/mx/blob/main/src/paths.rs")[`src/paths.rs`].
New helpers follow the `_with(env_val: Option<&str>, home: &Path)` test-seam
pattern -- see `paths::codex_dir_with` for the canonical example. The pattern
keeps resolution logic pure: tests call the `_with` variant directly with
explicit arguments and never mutate process env state, so the suite runs safely
in parallel.

Two rules:

+ Do not call `dirs::home_dir()` outside `src/paths.rs`. If you need a
  home-relative path, add a helper to `paths.rs` and call it from your module.
  `paths.rs` itself is the _only_ legitimate caller of `dirs::home_dir()` in
  the tree. It uses it for: `mx_home()` (the `~/.mx/` default),
  `legacy_crewu_kv_schema_path` and `legacy_crewu_kv_data_path` (the legacy
  fallbacks for the kv migration), and `claude_projects_dir` /
  `claude_config_path` (read-only locations owned by another tool, Claude).
  Anything new that needs `home_dir()` -- including helpers for paths owned
  by other tools -- belongs in `paths.rs` too, so this rule stays absolute
  everywhere else. Do not "fix" the existing calls for consistency; they are
  the carve-out.
+ Do not read `MX_*` env vars in handlers if a path helper already encapsulates
  that override. Add the env-var read inside the helper instead, behind the
  `_with` seam.