zim-studio 1.5.0

A Terminal-Based Audio Project Scaffold and Metadata System
# Design: `zim pdf` — Markdown Tree → PDF

## Intent

Add a `zim pdf` subcommand that walks a directory (the current one by default),
gathers every `.md` file, and produces a single nicely-typeset PDF with a table
of contents.
Primary use case: turning a zim audio project (sidecar `.md` files plus any
`README.md` / liner-notes) into a printable artifact. Secondary: any tree of
markdown docs.

The whole pipeline runs inside one `zim` invocation. The user never executes a
separate prep script — text munging, LaTeX assembly, and the LaTeX call all
happen behind one command.

## Constraints

- **One command, no helper scripts.** All markdown collection, frontmatter
  stripping, and LaTeX assembly are done in Rust inside the `pdf` handler. The
  only external call is the LaTeX engine itself.
- **Page breaks at directory boundaries only.** Files inside the same directory
  flow continuously. Each new directory starts on a fresh page.
- **Stable, deterministic ordering** so re-runs produce diff-friendly output:
  directories sorted alphabetically (depth-first), files within a directory
  sorted alphabetically, with `README.md` (if present) hoisted to the top of
  its directory.
- **Assumed installed:** MacTeX (or any TeX Live) providing `xelatex`. If not
  found on `PATH`, fail with a clear "install MacTeX" message and exit non-zero.
- **Out of scope (for v1):** custom themes, cover-page images, watermarking,
  Windows/Linux-specific install hints, embedding audio waveform renders,
  parallel directory processing.
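
The ordering constraint above could be sketched with the standard library alone. This is a minimal sketch, not the actual implementation: the function names are hypothetical, and it assumes "alphabetically" means Rust's default byte-wise, case-sensitive comparison (so uppercase names sort before lowercase ones).

```rust
use std::cmp::Ordering;
use std::path::{Path, PathBuf};

/// Returns true when the file name is exactly `README.md`.
fn is_readme(path: &Path) -> bool {
    path.file_name().map_or(false, |name| name == "README.md")
}

/// Orders two files from the same directory: `README.md` first, then by name.
/// (Hypothetical helper; comparison is byte-wise and case-sensitive.)
fn file_order(a: &Path, b: &Path) -> Ordering {
    // `false < true`, so negating `is_readme` puts the README first.
    (!is_readme(a), a.file_name()).cmp(&(!is_readme(b), b.file_name()))
}

/// Sorts one directory's markdown files in place.
fn sort_directory_files(files: &mut [PathBuf]) {
    files.sort_by(|a, b| file_order(a, b));
}

/// Sorts relative directory paths. `Path` compares component by component,
/// so a parent sorts directly before its own children (depth-first order)
/// instead of being separated from them by a sibling such as `mixes-old`.
fn sort_directories(dirs: &mut [PathBuf]) {
    dirs.sort();
}
```

Sorting paths rather than strings matters here: compared as plain strings, `mixes-old` would land between `mixes` and `mixes/v2`, because `-` sorts before `/`.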

## Approach

Pure-Rust pipeline: parse markdown with `pulldown-cmark`, emit LaTeX from a
small visitor, then shell out to `xelatex` for compilation. The only external
dependency for the user is MacTeX. (Pandoc was considered and rejected because
it would add a second install.)

### Hierarchy mapping

The user's mental model drives the structure:

- **Directory → `\section`.** Named after the directory's `README.md` H1 if
  present, else the directory name. The README's *body* (with its own H1
  consumed as the section title) flows directly under the section heading —
  this is where concept/history prose lives.
- **Sidecar / non-README `.md` → `\subsection`.** Title pulled from YAML
  frontmatter `title:` if present, else the file stem. The sidecar body flows
  under the subsection heading, with any in-body headings demoted by two
  levels so they don't compete with the subsection.
- **Nested subdirectories** are flattened into their own top-level `\section`s
  with path-style names (e.g. `mixes/v2`). Avoids subsubsection sprawl and
  keeps the TOC two levels deep.
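
The mapping rules above can be expressed as a few small pure functions. The sketch below is illustrative only: all names are hypothetical, and clamping deep headings to `\subparagraph` is an assumption, since the design does not say what happens when a demoted heading runs past the deepest `article` level.

```rust
use std::path::Path;

/// Sectioning commands of the LaTeX `article` class, shallowest first.
const SECTIONING: [&str; 5] = [
    "section",
    "subsection",
    "subsubsection",
    "paragraph",
    "subparagraph",
];

/// Maps a markdown heading level (1 = `#`) to a LaTeX command after demotion.
/// Levels past the deepest command are clamped (an assumption of this sketch).
fn demoted_command(markdown_level: usize, demote_by: usize) -> &'static str {
    let idx = (markdown_level.max(1) - 1 + demote_by).min(SECTIONING.len() - 1);
    SECTIONING[idx]
}

/// Section title for a directory: README H1 if present, else the
/// path-style relative name (e.g. `mixes/v2`).
fn section_title(rel_dir: &str, readme_h1: Option<&str>) -> String {
    readme_h1.unwrap_or(rel_dir).to_owned()
}

/// Subsection title for a sidecar: frontmatter `title:` if present,
/// else the file stem.
fn subsection_title(frontmatter_title: Option<&str>, file: &Path) -> String {
    match frontmatter_title {
        Some(title) => title.to_owned(),
        None => file
            .file_stem()
            .map(|stem| stem.to_string_lossy().into_owned())
            .unwrap_or_default(),
    }
}
```

With a demotion of two, a sidecar's `#` heading becomes `\subsubsection` and its `##` heading becomes `\paragraph`, so neither reaches the `\subsection` that titles the file.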

### Pipeline

1. **Walk** the tree single-threaded for deterministic ordering. Honor
   `.zimignore`. Skip hidden dirs (e.g. `.git`) and build output such as
   `target`. Group entries by parent directory; sort directories
   alphabetically (depth-first), files alphabetically with `README.md`
   pulled out as the section's lead content.
2. **Per-file prep:**
   - Strip YAML frontmatter (`---`-delimited block at top); capture `title`
     for the section/subsection name.
   - Parse with `pulldown-cmark`.
   - Emit LaTeX via a visitor: headings (demoted), lists → `itemize` /
     `enumerate`, code → `Verbatim` (fancyvrb), inline code → `\texttt`, links
     → `\href` (from `hyperref`), images → `\includegraphics` if the path
     resolves relative to the source file, else alt text in italic.
3. **Document assembly** (Rust, written to a temp dir):
   ```
   \documentclass[11pt]{article}
   \usepackage{hyperref, graphicx, fancyvrb, geometry, parskip}
   \title{<--title or top-level README H1 or cwd basename>}
   \author{<--author or config.default_artist>}
   \begin{document}
   \maketitle
   \tableofcontents
   <root dir's files first, no clearpage — they sit under the TOC>
   <for each subdirectory, in order:>
     \clearpage
     \section{<README H1 or dir name>}
     <README body, if any>
     <for each non-README .md:>
       \subsection{<frontmatter title or file stem>}
       <converted body>
   \end{document}
   ```
   The TOC is populated automatically from the `\section`/`\subsection` entries.
4. **Compile:** `xelatex -interaction=nonstopmode -halt-on-error
   -output-directory=<tmp> doc.tex`, run twice (second pass resolves TOC page
   numbers). Stream stderr on failure.
5. **Deliver:** copy `doc.pdf` to `--output` (default `<basename(PATH)>.pdf`).
   Clean up tempdir unless `--keep-tex` is set.
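
The frontmatter step in the per-file prep could look roughly like the following. It is a minimal sketch under stated assumptions: `strip_frontmatter` is a hypothetical name, the block is scanned line by line instead of being parsed as YAML, and only a single-line top-level `title:` key is recognised. A real handler might prefer a proper YAML parser.

```rust
/// Splits a leading `---`-delimited block off the markdown source.
/// Returns the captured `title` (if any) and the remaining body.
/// Line-based scan only; this is not a YAML parser.
fn strip_frontmatter(src: &str) -> (Option<String>, &str) {
    let mut lines = src.split_inclusive('\n');

    // Frontmatter only counts if the very first line is the delimiter.
    let first = match lines.next() {
        Some(line) if line.trim_end() == "---" => line,
        _ => return (None, src),
    };

    let mut offset = first.len();
    let mut title = None;
    for line in lines {
        offset += line.len();
        let trimmed = line.trim_end();
        if trimmed == "---" {
            // Closing delimiter: everything after it is the body.
            return (title, &src[offset..]);
        }
        if let Some(value) = trimmed.strip_prefix("title:") {
            // Drop surrounding whitespace and optional quotes.
            let value = value.trim().trim_matches(|c: char| c == '"' || c == '\'');
            if !value.is_empty() {
                title = Some(value.to_owned());
            }
        }
    }

    // No closing delimiter: treat the whole input as body.
    (None, src)
}
```

Returning the body as a slice of the input avoids copying each file before it is handed to the markdown parser.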

### CLI surface

```
zim pdf [PATH]                 # default: "."
  --output, -o <FILE>          # default: <basename(PATH)>.pdf
  --title <STR>                # default: top-level README H1, else directory basename
  --author <STR>               # default: config.default_artist
  --keep-tex                   # leave the .tex file next to the PDF
```

Wired in `src/main.rs` alongside other subcommands; handler at
`src/cli/pdf.rs`. New module `src/pdf/` holds the walk + LaTeX emitter so the
handler file stays under the ~200-line guideline.
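
One detail of the `--output` default is easy to miss: `Path::new(".").file_name()` returns `None`, so the default `PATH` has no basename until it is resolved. A sketch of one way to handle it, where `default_output` and the `output` fallback name are hypothetical:

```rust
use std::ffi::OsString;
use std::io;
use std::path::{Path, PathBuf};

/// Derives the default `--output` value, `<basename(PATH)>.pdf`.
fn default_output(path: &Path) -> io::Result<PathBuf> {
    let mut name: OsString = match path.file_name() {
        Some(name) => name.to_owned(),
        // "." and ".." have no file name; resolve to an absolute path first.
        None => path
            .canonicalize()?
            .file_name()
            .map(|name| name.to_owned())
            // Filesystem root has no name either; hypothetical fallback.
            .unwrap_or_else(|| OsString::from("output")),
    };
    // Append instead of `with_extension`, which would turn `v1.5` into `v1.pdf`.
    name.push(".pdf");
    Ok(PathBuf::from(name))
}
```

Under this sketch, `zim pdf mixes/v1.5` would default to `v1.5.pdf`, and `zim pdf` with no argument would use the name of the current working directory.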

## Domain Events

- **Consumes:** `.md` files on disk, `.zimignore` rules, `~/.config/zim/config.toml`
  (for default author).
- **Produces:** one `.pdf` artifact at `--output`. Optionally one `.tex` if
  `--keep-tex`. Stdout: progress spinner ("Walking…", "Rendering…",
  "Compiling LaTeX (pass 1/2)…"), then final `Wrote <path> (<n> files,
  <m> directories)`.
- **What must follow:** none. This is a pure read→write artifact; it does not
  mutate sidecars, the index, or config. No event is published for other
  commands to react to.

## Checkpoints

1. `zim pdf` in an empty tree errors clearly ("no .md files found").
2. `zim pdf` on a `mixes/` directory containing one `README.md` plus several
   track sidecars produces: one section titled from the README's H1, the
   README's prose immediately below it, then one subsection per track in
   alphabetical order — all on one continuous page block, no internal page
   breaks.
3. `zim pdf` on a multi-directory project (`master/`, `mixes/`, `bounces/`)
   produces one section per directory, each starting on a fresh page; the
   top-level README sits directly under the TOC with no leading clearpage.
4. Sidecar YAML frontmatter does not appear in rendered text; sidecar
   `title:` fields drive subsection names.
5. In-body headings inside a sidecar render below their `\subsection` (i.e.
   demoted), not at the same visual level.
6. Re-running on the same tree produces the same content in the same order
   (walk order is deterministic); timestamps embedded in the PDF may still
   differ between runs.
7. With MacTeX absent, the error names the missing binary and points to the
   install path; exit code is non-zero.
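
The missing-binary check in checkpoint 7 can be done before any work starts, using only the standard library. The following is a sketch, not the shipped behaviour: the function names and the exact error wording are assumptions, and it only checks that a file named `xelatex` exists in some `PATH` entry, not that it is executable.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Looks for `binary` in each directory of a `PATH`-style variable.
/// Taking the variable as a parameter keeps the function testable.
fn find_on_path(binary: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}

/// Returns the `xelatex` location, or an error that names the missing binary.
/// The message text is illustrative, not the final wording.
fn require_xelatex(path_var: &OsStr) -> Result<PathBuf, String> {
    find_on_path("xelatex", path_var).ok_or_else(|| {
        "xelatex not found on PATH; install MacTeX (or another TeX Live distribution)"
            .to_string()
    })
}
```

A caller would pass `env::var_os("PATH").unwrap_or_default()` and turn an `Err` into a non-zero exit code.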