clawgs 0.2.0

Extract structured JSON snapshots from Claude Code and Codex JSONL transcripts
# clawgs

<div align="center">

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/build000r/clawgs/blob/main/LICENSE)
[![Rust](https://img.shields.io/badge/Rust-1.85%2B-orange.svg)](https://www.rust-lang.org/)
[![crates.io](https://img.shields.io/crates/v/clawgs.svg)](https://crates.io/crates/clawgs)
[![Protocol](https://img.shields.io/badge/protocol-clawgs.emit.v2-blue.svg)](https://github.com/build000r/clawgs/blob/main/references/emit-protocol-v2.md)

</div>

Turn Claude Code and Codex transcripts into stable JSON snapshots, then replay the status protocol locally with a built-in zero-config demo.

<div align="center">

**Quick Start**

```bash
cargo install clawgs
clawgs demo extract --tool codex --pretty
```

</div>

## TL;DR

### The Problem

Agent transcripts are useful, but they are verbose, tool-specific, and usually trapped inside private JSONL logs, tmux panes, or one-off shell glue. That makes them awkward to inspect manually and annoying to integrate into status views, dashboards, or downstream automations.

### The Solution

`clawgs` normalizes Claude Code and Codex session logs into a small, stable JSON contract that includes deterministic `action_cues` for transcript-backed attention facts. It also exposes the live thought-emission protocol over NDJSON. The new `demo` command shows all of this on a clean machine, using the embedded, sanitized examples in [examples/demo](https://github.com/build000r/clawgs/tree/main/examples/demo).

### Why Use `clawgs`?

| Feature | What It Does |
| --- | --- |
| `demo extract` | Replays a built-in transcript corpus and shows the exact normalized `clawgs.v2` output without needing private logs |
| `demo emit` | Shows a real `hello -> sync -> sync_result` exchange without model credentials or tmux |
| `extract` | Normalizes Claude/Codex JSONL into one compact machine-readable snapshot with parser-derived `action_cues` |
| `emit --stdio` | Speaks a small NDJSON protocol for downstream status reporters, including live `action_cues` |
| `tmux-emit` | Scans live tmux panes, infers context, and emits only changed thoughts |

## Quick Example

Install from crates.io:

```bash
cargo install clawgs

# See the built-in Codex transcript -> snapshot pair
clawgs demo extract --tool codex --pretty

# See the built-in Claude transcript -> snapshot pair
clawgs demo extract --tool claude --pretty

# See the canonical emit protocol exchange, no backend creds required
clawgs demo emit --pretty

# Parse a real local transcript by discovery
clawgs extract --tool auto --cwd "$PWD"

# Run the live stdio daemon
clawgs emit --stdio
```

Or build from source:

```bash
git clone https://github.com/build000r/clawgs && cd clawgs
# Requires Rust 1.85 or newer.
bash scripts/install.sh
bash scripts/check.sh
target/release/clawgs demo extract --tool codex --pretty
```

Protocol details live in [references/emit-protocol-v2.md](https://github.com/build000r/clawgs/blob/main/references/emit-protocol-v2.md), with a machine-validatable JSON Schema at [references/clawgs.emit.v2.schema.json](references/clawgs.emit.v2.schema.json). The extract schema lives in [references/schema-v2.md](https://github.com/build000r/clawgs/blob/main/references/schema-v2.md), with JSON Schema at [references/clawgs.v2.schema.json](references/clawgs.v2.schema.json).

## Design Philosophy

### 1. Public Before Private

If a stranger cannot understand the project without your personal logs, the project is not really open source yet. `clawgs demo` exists to make the core value visible without your environment.

### 2. Stable Contracts Beat Ad Hoc Log Scraping

The point is not to expose raw transcripts. The point is to collapse them into a compact contract that downstream tools can depend on.

### 3. Honest Surface Area

`clawgs` is a parser, protocol, and tmux bridge. It is not pretending to be a general observability platform, a transcript database, or a hosted service.

### 4. Real Examples Over Abstract Claims

The checked-in corpus in [examples/demo](https://github.com/build000r/clawgs/tree/main/examples/demo) and the reference docs in [references](https://github.com/build000r/clawgs/tree/main/references) are part of the product surface, not afterthoughts.

## Comparison

| Approach | Zero-config demo | Stable schema | Live protocol | tmux bridge | Good fit |
| --- | --- | --- | --- | --- | --- |
| `clawgs` | Yes | Yes | Yes | Yes | You want snapshots plus status emission from agent sessions |
| Raw JSONL files | No | No | No | No | You only want archival logs and are fine hand-parsing them |
| Ad hoc `jq` / `rg` scripts | No | Partial | No | Partial | You need a one-off local script and do not care about reuse |
| Custom tmux hook glue | No | No | Partial | Yes | You only need pane polling and are willing to maintain bespoke scripts |

**When to use `clawgs`:**
- You need a consistent JSON snapshot from Claude/Codex logs.
- You want a replayable contract for demos, tests, or downstream tools.
- You want tmux polling and thought emission without rewriting the parsing logic.

**When `clawgs` is not the right tool:**
- You need full transcript storage, search, or analytics.
- You need a hosted backend or multi-user service.
- You want Homebrew, npm, PyPI, or a curl installer today; only `cargo install` is wired up so far.

For the deeper thesis, see [docs/VISION.md](https://github.com/build000r/clawgs/blob/main/docs/VISION.md): mission, vision, values, competitive fit, and why this project intentionally stops short of becoming a dashboard, a platform, or a general-purpose agent framework.

## Installation

### From crates.io

```bash
cargo install clawgs
```

That gets you a `clawgs` binary on your `PATH` with no repo checkout required.

### From Source

```bash
git clone https://github.com/build000r/clawgs && cd clawgs
bash scripts/install.sh
bash scripts/check.sh
```

That builds `target/release/clawgs` and verifies the binary with a smoke test. You can also run `cargo install --path .` from inside a checkout.

### What Is Not Published Yet

There is no Homebrew formula, no npm package, no PyPI package, and no curl installer yet. crates.io is currently the only published distribution channel.

## Quick Start

1. Clone the repo and build the release binary.

```bash
git clone https://github.com/build000r/clawgs && cd clawgs
bash scripts/install.sh
```

2. Confirm the project works on a clean machine, with no private logs or credentials.

```bash
target/release/clawgs demo extract --tool codex --pretty
target/release/clawgs demo emit --pretty
```

3. Parse a real transcript file directly.

```bash
target/release/clawgs extract --tool codex --input tests/fixtures/codex-sample.jsonl --pretty
```

4. Let `clawgs` auto-discover the newest log for the current project.

```bash
target/release/clawgs extract --tool auto --cwd "$PWD"
```

5. If you want live pane updates, wire tmux to the checked-in hook snippet.

```tmux
source-file "/path/to/clawgs/references/tmux-clawgs.conf"
```

That snippet lives in [references/tmux-clawgs.conf](https://github.com/build000r/clawgs/blob/main/references/tmux-clawgs.conf).

## Commands

### `clawgs demo extract`

Shows the embedded corpus plus the extracted output it produces.

```bash
target/release/clawgs demo extract --tool codex --pretty
target/release/clawgs demo extract --tool claude --pretty
```

### `clawgs demo emit`

Shows a canonical `hello`, `sync`, and `sync_result` exchange with no backend setup.

```bash
target/release/clawgs demo emit --pretty
```

### `clawgs extract`

Parses a real JSONL transcript into one `clawgs.v2` document.

```bash
target/release/clawgs extract --tool auto --cwd "$PWD"
target/release/clawgs extract --tool codex --input tests/fixtures/codex-sample.jsonl --pretty
target/release/clawgs extract --tool claude --include-raw --input tests/fixtures/claude-sample.jsonl
```
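For downstream tools that want the snapshot as data rather than terminal output, a thin wrapper along these lines may help. This is a minimal sketch, not part of `clawgs`: the `extract_snapshot` helper and the `fake_run` stand-in are hypothetical, and it assumes the snapshot is written to stdout as a single JSON document. The placeholder payload below is not the real `clawgs.v2` shape.

```python
import json
import subprocess


def extract_snapshot(input_path, tool, binary="clawgs", run=subprocess.run):
    """Run `clawgs extract` on one transcript and return the parsed JSON document."""
    argv = [binary, "extract", "--tool", tool, "--input", str(input_path)]
    # check=True raises CalledProcessError on a non-zero exit status.
    completed = run(argv, capture_output=True, text=True, check=True)
    # Assumption: stdout holds exactly one JSON document.
    return json.loads(completed.stdout)


# Stand-in runner so the sketch can be exercised without the binary installed.
def fake_run(argv, **kwargs):
    return subprocess.CompletedProcess(argv, 0, stdout='{"placeholder": true}', stderr="")


snapshot = extract_snapshot("session.jsonl", tool="codex", run=fake_run)
```

Dropping the `run=fake_run` argument makes the same call shell out to the installed binary.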

### `clawgs emit --stdio`

Runs the live NDJSON daemon over stdin/stdout.

```bash
target/release/clawgs emit --stdio
```

Send one JSON `sync` message per line on stdin and read `sync_result` lines from stdout.
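The line framing itself is simple enough to sketch. The Python below is only an illustration of NDJSON (one JSON object per line), under the assumption that messages are plain JSON objects; the `type` and `note` keys are placeholders, not the real `clawgs.emit.v2` fields, which are defined in the protocol reference.

```python
import io
import json


def write_ndjson(stream, message):
    """Serialize one message as a single compact JSON line (NDJSON framing)."""
    # json.dumps escapes newlines inside strings, so each message
    # occupies exactly one physical line.
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()


def read_ndjson(stream):
    """Yield one parsed JSON object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical payload: the keys and values here are placeholders only.
outbound = io.StringIO()
write_ndjson(outbound, {"type": "sync", "note": "line one\nline two"})
framed = outbound.getvalue()

# Reading the same text back recovers the original object.
replies = list(read_ndjson(io.StringIO(framed)))
```

In a real client the two streams would be the stdin and stdout pipes of a `clawgs emit --stdio` child process instead of `io.StringIO` buffers.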

### `clawgs tmux-emit`

Scans live tmux panes, reconciles snapshots, and emits the same NDJSON envelope used by `emit --stdio`.

```bash
target/release/clawgs tmux-emit --once
target/release/clawgs tmux-emit --interval-ms 60000
```
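The feature table describes `tmux-emit` as emitting only changed thoughts. The idea can be sketched as a diff against the last emitted state. This is not the `clawgs` implementation, and the pane ids and thought strings are made up for illustration.

```python
def changed_thoughts(previous, current):
    """Return only the panes whose thought differs from the last emit."""
    return {
        pane: thought
        for pane, thought in current.items()
        if previous.get(pane) != thought
    }


# Hypothetical pane ids and thought strings.
last_emitted = {"%1": "running tests", "%2": "idle"}
latest_scan = {"%1": "running tests", "%2": "editing README", "%3": "idle"}

to_emit = changed_thoughts(last_emitted, latest_scan)
# Remember what was sent so the next scan is compared against it.
last_emitted.update(to_emit)
```

Here pane `%1` is skipped because nothing changed, while `%2` (updated) and `%3` (new) are emitted.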

### `clawgs tmux-notify`

Pokes the tmux daemon socket so a hook can trigger an immediate rescan.

```bash
target/release/clawgs tmux-notify --event session-created
```

### `clawgs defaults`

Prints resolved daemon defaults as JSON.

```bash
target/release/clawgs defaults
```

## Configuration

### Extract Tuning Flags

`extract` and `demo extract` share the same output-shaping flags:

```bash
target/release/clawgs demo extract --tool codex --max-actions 5 --max-task-chars 120 --max-detail-chars 60
```

### Thought Config JSON

`tmux-emit --config-json` accepts the same thought-config shape used by the stdio protocol:

```json
{
  "enabled": true,
  "model": "",
  "backend": "",
  "cadence_hot_ms": 15000,
  "cadence_warm_ms": 45000,
  "cadence_cold_ms": 120000,
  "agent_prompt": null,
  "terminal_prompt": null
}
```
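If you generate this config from a script, something like the following sketch can catch typos before the JSON reaches `clawgs`. The `build_thought_config` helper is hypothetical, and the positivity check on the cadence fields is an assumption rather than a documented rule. Whether `--config-json` expects the JSON inline or as a file path is not stated here, so check `--help` before wiring it up.

```python
import json

# Defaults copied from the JSON example above.
DEFAULT_THOUGHT_CONFIG = {
    "enabled": True,
    "model": "",
    "backend": "",
    "cadence_hot_ms": 15000,
    "cadence_warm_ms": 45000,
    "cadence_cold_ms": 120000,
    "agent_prompt": None,
    "terminal_prompt": None,
}


def build_thought_config(**overrides):
    """Return a copy of the defaults with overrides applied, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_THOUGHT_CONFIG)
    if unknown:
        raise KeyError(f"unknown thought-config keys: {sorted(unknown)}")
    config = {**DEFAULT_THOUGHT_CONFIG, **overrides}
    # Assumption: cadences are millisecond intervals and must be positive.
    for key in ("cadence_hot_ms", "cadence_warm_ms", "cadence_cold_ms"):
        if not isinstance(config[key], int) or config[key] <= 0:
            raise ValueError(f"{key} must be a positive integer")
    return config


config_json = json.dumps(build_thought_config(cadence_hot_ms=5000))
```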

### Environment Variables

| Variable | Purpose |
| --- | --- |
| `CLAWGS_MODEL_BACKEND` | Selects `openrouter`, `claude`, or `codex` for live emit calls |
| `OPENROUTER_API_KEY` | Enables the OpenRouter backend |
| `SWIMMERS_THOUGHT_MODEL`, `SWIMMERS_THOUGHT_MODEL_2`, `SWIMMERS_THOUGHT_MODEL_3` | Override live thought models in priority order |
| `CLAWGS_CODEX_BIN` | Override the `codex` binary path |
| `CLAWGS_CODEX_REASONING_EFFORT` | Override Codex CLI reasoning effort |
| `CLAWGS_CODEX_VERBOSITY` | Override Codex CLI verbosity |
| `CLAWGS_CODEX_WORKDIR` | Override the workdir used for Codex CLI calls |
| `CLAWGS_CLAUDE_BIN` | Override the `claude` binary path |
| `CLAWGS_CLAUDE_MAX_BUDGET` | Override the Claude CLI max budget |
| `CLAWGS_TMUX_BIN` | Override the `tmux` binary path |
| `CLAWGS_TMUX_SOCKET` | Override the tmux notify socket path |

The demo commands do not require any of the variables above.
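One plausible reading of "priority order" for the three `SWIMMERS_THOUGHT_MODEL*` variables is an ordered candidate list with unset entries skipped. The sketch below illustrates that reading only; the actual selection and fallback logic inside `clawgs` may differ, and the model names are placeholders.

```python
import os

MODEL_VARS = (
    "SWIMMERS_THOUGHT_MODEL",
    "SWIMMERS_THOUGHT_MODEL_2",
    "SWIMMERS_THOUGHT_MODEL_3",
)


def thought_model_candidates(env=None):
    """Return the configured model overrides, highest priority first."""
    env = os.environ if env is None else env
    # Skip unset or blank variables while preserving the documented order.
    return [env[name].strip() for name in MODEL_VARS if env.get(name, "").strip()]


# Placeholder model names; the second variable is blank and is ignored.
example_env = {
    "SWIMMERS_THOUGHT_MODEL_3": "model-c",
    "SWIMMERS_THOUGHT_MODEL": "model-a",
    "SWIMMERS_THOUGHT_MODEL_2": "  ",
}
candidates = thought_model_candidates(example_env)
```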

## Architecture

```text
┌───────────────────────────────────────────────────────────────┐
│ Inputs                                                        │
│ - embedded demo corpus                                        │
│ - Claude/Codex JSONL logs                                     │
│ - live tmux panes                                             │
└───────────────────────────────────────────────────────────────┘
                             │
                             ▼
┌───────────────────────────────────────────────────────────────┐
│ Normalization Layer                                            │
│ - Claude parser                                                │
│ - Codex parser                                                 │
│ - discovery / source resolution                               │
└───────────────────────────────────────────────────────────────┘
                             │
                             ▼
┌───────────────────────────────────────────────────────────────┐
│ Stable Contracts                                               │
│ - `clawgs.v2` extract snapshot                                │
│ - `clawgs.emit.v2` hello/sync/sync_result protocol            │
└───────────────────────────────────────────────────────────────┘
                             │
            ┌────────────────┴────────────────┐
            ▼                                 ▼
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ Human-facing demo surface    │ │ Live status surface          │
│ - `demo extract`             │ │ - `emit --stdio`             │
│ - `demo emit`                │ │ - `tmux-emit` / notify hooks │
└──────────────────────────────┘ └──────────────────────────────┘
```

## Troubleshooting

### `no Claude or Codex transcript JSONL found`

Auto-discovery only works if the expected session logs exist in your local tool directories.

```bash
target/release/clawgs demo extract --tool codex --pretty
target/release/clawgs extract --tool codex --input path/to/session.jsonl --pretty
```
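When discovery fails, you can locate a transcript yourself and pass it with `--input`. The helper below is a hypothetical sketch that picks the most recently modified `.jsonl` file under a directory you choose. It does not reproduce the discovery rules `clawgs` uses, and the demo directory is a throwaway temp folder.

```python
import os
import tempfile
from pathlib import Path


def newest_jsonl(root):
    """Return the most recently modified *.jsonl file under root, or None."""
    candidates = [p for p in Path(root).rglob("*.jsonl") if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Demonstrate on a throwaway directory with explicit modification times.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp, "old.jsonl")
    new = Path(tmp, "nested", "new.jsonl")
    empty_dir = Path(tmp, "empty")
    new.parent.mkdir()
    empty_dir.mkdir()
    old.write_text("{}\n")
    new.write_text("{}\n")
    os.utime(old, (1_000, 1_000))
    os.utime(new, (2_000, 2_000))
    found_name = newest_jsonl(tmp).name
    empty_result = newest_jsonl(empty_dir)
```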

### `emit requires --stdio`

`emit` is intentionally protocol-only: it runs solely as the NDJSON daemon over stdin/stdout, so the `--stdio` flag is required.

```bash
target/release/clawgs emit --stdio
```

### `OPENROUTER_API_KEY not set`

That only affects live emit backends. The demo path does not need credentials.

```bash
target/release/clawgs demo emit --pretty
target/release/clawgs defaults
```

### `--max-actions must be greater than 0`

The extract output-shaping limits must stay positive.

```bash
target/release/clawgs demo extract --tool codex --max-actions 5
```
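A wrapper script can reject bad limits before invoking the binary. This is a minimal sketch with a hypothetical `extract_limit_args` helper. Only the `--max-actions` error is documented above; applying the same rule to `--max-task-chars` and `--max-detail-chars` is an assumption.

```python
def extract_limit_args(max_actions=None, max_task_chars=None, max_detail_chars=None):
    """Build extract limit flags, rejecting non-positive values up front."""
    flags = {
        "--max-actions": max_actions,
        "--max-task-chars": max_task_chars,
        "--max-detail-chars": max_detail_chars,
    }
    args = []
    for flag, value in flags.items():
        if value is None:
            continue  # flag omitted: let clawgs apply its own default
        if value <= 0:
            raise ValueError(f"{flag} must be greater than 0")
        args += [flag, str(value)]
    return args


argv = ["clawgs", "demo", "extract", "--tool", "codex"] + extract_limit_args(max_actions=5)
```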

### `tmux list-panes failed`

Make sure tmux is installed and a server is running before using the live pane scanner.

```bash
tmux ls
target/release/clawgs tmux-emit --once
```

## Limitations

- `clawgs` only understands the Claude and Codex transcript shapes it has been taught so far.
- The built-in demo corpus is representative, not exhaustive.
- `tmux-emit` and `tmux-notify` are Unix/tmux-centric; they are not a cross-platform pane abstraction.
- crates.io is the only published distribution channel so far; Homebrew, npm, PyPI, and a curl installer are not wired up yet.
- Live thought emission can use external backends if you choose them; the repo does not pretend those calls are offline.

## Release Process

`clawgs` publishes through crates.io. Release history lives in
[CHANGELOG.md](CHANGELOG.md), and the tag/publish contract lives in
[RELEASE.md](RELEASE.md). The GitHub Actions release workflow verifies format,
clippy, tests, package contents, and `cargo publish --dry-run` before publishing
tagged `v*.*.*` releases with `CARGO_REGISTRY_TOKEN`.

## FAQ

### Is `clawgs` a log viewer?

No. It is a normalizer and protocol layer. It turns verbose session state into smaller, more stable contracts.

### Does `clawgs demo` use my local transcripts?

No. It uses the checked-in sanitized corpus in [examples/demo](https://github.com/build000r/clawgs/tree/main/examples/demo).

### Does `demo emit` call OpenRouter, Claude, or Codex?

No. The demo replay is local and self-contained. Live `emit` behavior is where backend selection matters.

### Can I use this without tmux?

Yes. `extract`, `demo`, `emit --stdio`, and `defaults` do not require tmux.

### Can I inspect the exact schema and protocol?

Yes. See [references/schema-v2.md](https://github.com/build000r/clawgs/blob/main/references/schema-v2.md), [references/clawgs.v2.schema.json](references/clawgs.v2.schema.json), [references/emit-protocol-v2.md](https://github.com/build000r/clawgs/blob/main/references/emit-protocol-v2.md), and [references/clawgs.emit.v2.schema.json](references/clawgs.emit.v2.schema.json).

### Is the demo corpus the same thing as the tests?

They are related, but the public corpus in [examples/demo](https://github.com/build000r/clawgs/tree/main/examples/demo) exists for onboarding and documentation, not just for internal regression coverage.

## About Contributions

Please don't take this the wrong way, but I do not accept outside contributions for any of my projects. I simply don't have the mental bandwidth to review anything, and it's my name on the thing, so I'm responsible for any problems it causes; thus, the risk-reward is highly asymmetric from my perspective. I'd also have to worry about other "stakeholders," which seems unwise for tools I mostly make for myself for free. Feel free to submit issues, and even PRs if you want to illustrate a proposed fix, but know I won't merge them directly. Instead, I'll have Claude or Codex review submissions via `gh` and independently decide whether and how to address them. Bug reports in particular are welcome. Sorry if this offends, but I want to avoid wasted time and hurt feelings. I understand this isn't in sync with the prevailing open-source ethos that seeks community contributions, but it's the only way I can move at this velocity and keep my sanity.

## License

MIT. See [LICENSE](https://github.com/build000r/clawgs/blob/main/LICENSE).