Module transcript

Claude Code session transcript parser.

Claude Code persists every session as line-delimited JSON under ~/.claude/projects/<sanitized-cwd>/<session-id>.jsonl. The exact schema isn’t formally documented and evolves between Claude Code releases; we therefore use a permissive parser that:

  1. Reads each line as a generic serde_json::Value.
  2. Maps known shapes onto TranscriptEntry variants.
  3. Skips unknown / malformed lines with a warning on stderr (mirroring the LifecycleStore::read_all policy).

§Recognized shapes (Claude Code 2026-04+)

  • {"type":"user","message":{"role":"user","content":<str|arr>}}
  • {"type":"assistant","message":{"role":"assistant","content":<str|arr>}}
  • {"type":"tool_use","name":<str>,"input":<obj>} (and the nested-in-assistant-content variant)
  • {"type":"tool_result","content":<str|arr>}
  • everything else → TranscriptEntry::Other (preserved verbatim so distill heuristics can still see the raw shape if needed)

Each entry exposes a normalized text() view (concatenated string content) so heuristics don’t have to re-walk the message tree.
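As a rough illustration of what a normalized text() view might do, the sketch below flattens a str-or-array content field into one string. The Content and Block types are hypothetical stand-ins, not this module's types, and joining blocks with a newline is an assumption; the real text() may use a different separator or cover more block kinds.

```rust
/// Hypothetical stand-in for a message's `content` field, which the
/// recognized shapes allow to be either a plain string or an array of blocks.
#[derive(Debug, Clone, PartialEq)]
pub enum Content {
    Text(String),
    Blocks(Vec<Block>),
}

/// Hypothetical content block; only text-bearing blocks contribute to `text()`.
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(String),
    /// Anything that carries no plain text (for example a nested tool_use).
    NonText,
}

impl Content {
    /// Concatenate all string content.
    /// Assumption: blocks are joined with a single newline.
    pub fn text(&self) -> String {
        match self {
            Content::Text(s) => s.clone(),
            Content::Blocks(blocks) => blocks
                .iter()
                .filter_map(|b| match b {
                    Block::Text(s) => Some(s.as_str()),
                    Block::NonText => None,
                })
                .collect::<Vec<_>>()
                .join("\n"),
        }
    }
}
```

With a view like this, a heuristic can call text() on any entry and search the result without matching on the message tree itself.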

§What we deliberately do NOT do

  • We don’t try to reconstruct turn boundaries (the assistant may stream multiple assistant rows for one turn; heuristics handle that).
  • We don’t merge tool_use / tool_result pairs — the distill layer does, after redaction.
  • We don’t load the whole file into memory upfront for huge sessions — we provide a streaming iterator (stream) too.
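A streaming reader of this kind can be sketched with the standard library alone. The function name stream_entries and its parse parameter are hypothetical: parse stands in for a per-line parser such as parse_line, and the warning text is illustrative. The module's actual stream signature is not shown here and may differ.

```rust
use std::io::BufRead;

/// Lazily yield parsed entries from a line-delimited reader.
///
/// `parse` is a hypothetical stand-in for a per-line parser; it returns
/// `None` for lines that should be skipped. Only one line is held in
/// memory at a time.
pub fn stream_entries<R, T, F>(reader: R, parse: F) -> impl Iterator<Item = T>
where
    R: BufRead,
    F: Fn(&str) -> Option<T>,
{
    reader
        .lines()
        .enumerate()
        // Stop at the first read error instead of retrying a failing reader.
        .map_while(|(idx, line)| match line {
            Ok(l) => Some((idx, l)),
            Err(err) => {
                eprintln!("warn: line {}: read error: {}", idx + 1, err);
                None
            }
        })
        .filter_map(move |(idx, line)| {
            // Blank lines carry no entry and are skipped silently.
            if line.trim().is_empty() {
                return None;
            }
            let parsed = parse(&line);
            if parsed.is_none() {
                // Skip-and-warn: a bad line never aborts the whole read.
                eprintln!("warn: line {}: skipped malformed entry", idx + 1);
            }
            parsed
        })
}
```

Because the iterator is lazy, a caller can stop early (for example with take or find) without reading the rest of the file.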

Enums§

TranscriptEntry
One parsed line from a transcript file.

Functions§

find_latest_for_cwd
Find the most recently modified .jsonl transcript under project_dir_for(cwd, home). Returns None when:
parse_line
Parse a single JSONL line. Returns None for malformed JSON; returns Some(Other) for parseable JSON we don’t recognize so the caller can still inspect it.
project_dir_for
Resolve the directory Claude Code uses for transcripts of cwd.
read_all
Parse the entire transcript at path into memory. Returns Ok with the parsed prefix even when corrupt lines are encountered (those are skipped and reported to stderr).
read_tail
Parse at most max_lines raw lines from the end of the transcript. Useful for distill heuristics that only care about recent turns — avoids loading multi-MB transcripts in full.
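One simple way to keep only the last max_lines raw lines is a bounded ring buffer, sketched below. The name tail_lines is hypothetical and this is not necessarily how read_tail is implemented: the sketch bounds memory but still scans the input from the start, whereas an implementation that seeks backward from the end of the file would bound I/O as well.

```rust
use std::collections::VecDeque;
use std::io::{self, BufRead};

/// Return at most `max_lines` raw lines from the end of `reader`.
///
/// Memory use is bounded by `max_lines` lines, although the reader is
/// still consumed from start to finish.
pub fn tail_lines<R: BufRead>(reader: R, max_lines: usize) -> io::Result<Vec<String>> {
    if max_lines == 0 {
        return Ok(Vec::new());
    }
    // Cap the initial allocation so a huge `max_lines` does not reserve
    // memory that a short file would never use.
    let mut window: VecDeque<String> = VecDeque::with_capacity(max_lines.min(1024));
    for line in reader.lines() {
        let line = line?;
        if window.len() == max_lines {
            // Drop the oldest line to make room for the newest one.
            window.pop_front();
        }
        window.push_back(line);
    }
    Ok(window.into_iter().collect())
}
```

The returned lines are still raw text; each one would then go through a per-line parser such as parse_line, so a corrupt line in the tail can be skipped like any other.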