prx 0.5.9

Praxis — agent-native Unix tools. Single binary replacing grep, cat, find, sed, diff for AI coding agents.
# AGENTS.md -- prx (Praxis)

> Machine-readable project reference for AI coding assistants.
> Last updated: 2026-05-29

## Project Identity

**prx** (Praxis) is a busybox-style Rust binary providing agent-native
replacements for core Unix tools. Every subcommand returns structured JSON output with token
budgets, structural awareness, and content hashing -- designed for AI coding
agents, not humans.

- **Repository:** `github.com/civitas-io/prx`
- **License:** Apache 2.0
- **Language:** Rust (edition 2024)
- **Status:** v0.5.5 released

### What prx Is

- A single static binary (~49 MB) with zero runtime dependencies
- Replaces grep, cat, find, sed, diff with agent-native equivalents
- Embeds a 32M-parameter retrieval-optimized embedding model for semantic code search
- Returns structured JSON with labeled fields, token counts, and content hashes
- Works offline, in sandboxes, in containers -- no internet, no daemon, no setup

### What prx Is NOT

- NOT a wrapper around existing tools (unlike RTK, squeez, LeanCTX)
- NOT search-only -- covers the full read-search-edit-diff loop
- NOT a framework or SDK -- it is a CLI tool that agents invoke
- NOT an AI/LLM -- the tool is deterministic; the agent calling it is the LLM
- NOT a replacement for LSP -- prx provides structural awareness without a server

---

## Using prx (for agents consuming this tool)

### Installation

prx ships as a single binary. No package manager, no runtime, no dependencies.

```bash
# Download prebuilt binary (Linux x86_64 example)
curl -L https://github.com/civitas-io/prx/releases/latest/download/prx-linux-x86_64 -o prx
chmod +x prx

# Or install via cargo
cargo install prx
```

### Quick Reference

```bash
# Search -- find code by meaning, not just text
prx search "authentication flow" src/          # semantic (auto-detected)
prx search --literal "authenticate(" src/      # exact match, ripgrep-speed
prx search --structural 'fn $NAME($$$) { $$$ }' src/   # AST pattern matching

# Read -- structured file access
prx read src/auth.ts                           # full file with metadata
prx read src/auth.ts --skeleton                # signatures and exports only
prx read src/auth.ts --lines 42-67 --snap fn   # expand to enclosing function
prx read src/auth.ts --outline                 # symbol table
prx read src/auth.ts --if-changed abc123...    # skip if unchanged (returns cached stub)
prx read src/auth.ts --mode aggressive         # strip comments (1-19% savings)
prx read src/auth.ts --mode diff               # changed lines vs git HEAD (80-97%)
prx read schema.rs --mode entropy              # filter repetitive code (5-87%)

# Find -- codebase mapping
prx find src/ --pattern "*.ts" --depth 3       # bounded file discovery
prx find src/ --changed-since HEAD~3           # recently modified files

# Edit -- safe, verified modifications
prx edit src/auth.ts --find "old_call()" --replace "new_call()"
prx edit src/auth.ts --find "old_call()" --replace "new_call()" --apply

# Diff -- semantic change summaries
prx diff src/auth.ts --since HEAD~1            # structured diff
prx diff --stat-only                           # summary in ~30 tokens

# Run -- structured command output (95-99% token savings on tests)
prx run cargo test                             # parsed test results
prx run cargo clippy                           # parsed warnings/errors
prx run pytest                                 # parsed test results
prx run cargo build                            # parsed build errors

# Utilities
prx exists "pattern" src/                      # O(1) existence check
prx outline src/auth.ts                        # symbol outline
prx batch < commands.jsonl                     # parallel batch execution
```

### Output Format

All output is JSON. Every response follows this envelope:

```json
{
  "version": "0.2.0",
  "command": "search",
  "status": "ok",
  "tokens": 487,
  "data": { ... }
}
```

Errors are also JSON, on stdout, never stderr:

```json
{
  "version": "0.2.0",
  "command": "read",
  "status": "error",
  "error": {
    "code": "file_not_found",
    "message": "File not found: src/missing.ts",
    "suggestion": "Use `prx find` to discover files."
  }
}
```

Use `--plain` for human-readable output. Use `--budget N` to cap token usage.

### Workflow Guidance

1. Start with `prx search` to find relevant code. Prefer semantic queries for
   unfamiliar codebases, literal queries for known identifiers.
2. Use `prx read --skeleton` or `prx read --outline` before reading full files.
   This costs ~10% of the tokens and tells you what is in the file.
3. Use `prx read --snap function` to read specific functions without pulling the
   entire file into context.
4. Re-reading a file? Pass `--if-changed <previous_hash>` to skip if unchanged.
   The hash is in `meta.hash` from the previous response. Cache hits cost ~50 bytes.
5. Use `prx exists "pattern"` before full searches when you just need a yes/no.
6. Use `prx edit` to preview changes (dry-run is the default). Add `--apply` to write.
7. Use `prx diff --stat-only` for cheap change detection (~30 tokens).
8. Use `prx run cargo test` instead of raw `cargo test` -- it returns only failures,
   saving 95-99% of tokens on passing test suites.
9. Use `prx batch` to combine multiple independent queries in one round-trip.
10. Use `--budget N` on every content-returning command to control token cost.
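
Step 4 above can be sketched on the harness side as a small path-to-hash cache. This is a minimal illustration, not part of prx: `ReadCache`, `record`, and `read_args` are hypothetical names, and only the `read` subcommand and the `--if-changed` flag come from this document.

```rust
use std::collections::HashMap;

/// Hypothetical harness-side cache: file path -> hash from the last `prx read`.
#[derive(Default)]
pub struct ReadCache {
    hashes: HashMap<String, String>,
}

impl ReadCache {
    /// Remember the hash reported for `path` by a previous response.
    pub fn record(&mut self, path: &str, hash: &str) {
        self.hashes.insert(path.to_string(), hash.to_string());
    }

    /// Build the argument list for `prx read`, adding `--if-changed <hash>`
    /// only when a hash for this path is already known.
    pub fn read_args(&self, path: &str) -> Vec<String> {
        let mut args = vec!["read".to_string(), path.to_string()];
        if let Some(hash) = self.hashes.get(path) {
            args.push("--if-changed".to_string());
            args.push(hash.clone());
        }
        args
    }
}
```

The first read of a file carries no hash; every later read passes the stored one, so an unchanged file comes back as a cached stub.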

### Integration Strategy

prx supports three integration tiers. Use all three for full coverage.

**Tier 1: CLI on PATH (universal, works everywhere)**

Install the binary and add the usage snippet to your project's AGENTS.md or
CLAUDE.md. This is the primary integration -- it works for top-level agents,
sub-agents, scripts, CI, and humans. Run `prx init --agents-md` to append the
snippet automatically.

**Tier 2: MCP server (richer integration, top-level agents only)**

```json
{
  "mcpServers": {
    "prx": {
      "command": "prx",
      "args": ["mcp"]
    }
  }
}
```

Works with Claude Code, Cursor, Codex, and OpenCode. Provides typed tool
parameters and auto-discovery. However, sub-agents cannot call MCP tools
(confirmed limitation in Claude Code and Codex CLI).

**Tier 3: Agent definition (Claude Code sub-agents)**

```bash
prx init --agent claude-code
```

Writes `.claude/agents/prx-search.md`, creating a dedicated sub-agent that
uses prx via bash with optimized workflow guidance.

**Quick setup for any framework:**

```bash
prx init           # auto-detect frameworks, write all configs
```

### Version Compatibility

CLI flags and JSON output schemas may change between minor versions. All
breaking changes are documented in CHANGELOG.md with migration guides. Use the
`version` field in JSON output for programmatic detection.
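
A caller might gate on the `version` field roughly as follows. This sketch assumes the field is a plain `major.minor.patch` string, as in the envelope examples, and that a minor-version change is the point at which a caller should stop trusting its assumptions. `parse_version` and `is_compatible` are hypothetical helper names.

```rust
/// Parse a `major.minor.patch` string such as the envelope's `version` field.
/// Returns `None` for anything that is not exactly three numeric components.
pub fn parse_version(v: &str) -> Option<(u32, u32, u32)> {
    let mut parts = v.trim().split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Treat output as compatible only when major and minor both match the
/// version the caller was written against; patch differences are accepted.
pub fn is_compatible(expected: &str, actual: &str) -> bool {
    match (parse_version(expected), parse_version(actual)) {
        (Some((e_major, e_minor, _)), Some((a_major, a_minor, _))) => {
            e_major == a_major && e_minor == a_minor
        }
        _ => false,
    }
}
```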

---

## Developing prx (for agents contributing to this codebase)

### Conventions

| Convention | Standard |
|---|---|
| Language | Rust, edition 2024 |
| MSRV | 1.85 |
| Formatter | `cargo fmt` (rustfmt defaults) |
| Linting | `cargo clippy -- -D warnings` |
| Testing | `cargo test` |
| Build | `cargo build --release` |
| License | Apache 2.0 |
| Package layout | `src/` (single crate) |

### Code Style

- **Formatter / linter:** `cargo fmt` + `cargo clippy`. No custom rustfmt config.
- **Line length:** 100 (rustfmt default).
- **Error handling:** Use `thiserror` for library errors, `anyhow` for CLI.
  Never `unwrap()` in library code. `unwrap()` acceptable only in tests.
- **Unsafe:** Forbidden without explicit justification in a code comment.
- **Comments:** only when the WHY is non-obvious.
- **Public API:** all pub functions and types must have doc comments (`///`).
- **Dependencies:** minimize. Every new dependency must justify its inclusion.

### Coding Discipline

Based on [Karpathy's guidelines](https://github.com/multica-ai/andrej-karpathy-skills)
for reducing LLM coding mistakes. Biases toward caution over speed. For trivial
tasks (typo fixes, obvious one-liners), use judgment.

**1. Think before coding.** Do not assume. Do not hide confusion. Surface
tradeoffs.

- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them -- do not pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what is confusing. Ask.

**2. Simplicity first.** Minimum code that solves the problem. Nothing
speculative.

- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that was not requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
- The test: would a senior engineer say this is overcomplicated? If yes,
  simplify.

**3. Surgical changes.** Touch only what you must. Clean up only your own mess.

When editing existing code:
- Do not "improve" adjacent code, comments, or formatting.
- Do not refactor things that are not broken.
- Match existing style, even if you would do it differently.
- If you notice unrelated dead code, mention it -- do not delete it.

When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Do not remove pre-existing dead code unless asked.

The test: every changed line should trace directly to the request.

**4. Goal-driven execution.** Define success criteria. Loop until verified.

Transform tasks into verifiable goals:
- "Add validation" becomes "write tests for invalid inputs, then make them pass"
- "Fix the bug" becomes "write a test that reproduces it, then make it pass"
- "Refactor X" becomes "ensure tests pass before and after"

For multi-step tasks, state a brief plan:

```
1. [Step] -> verify: [check]
2. [Step] -> verify: [check]
3. [Step] -> verify: [check]
```

Strong success criteria enable independent work. Weak criteria ("make it work")
require constant clarification.

**These guidelines are working if:** fewer unnecessary changes in diffs, fewer
rewrites due to overcomplication, and clarifying questions come before
implementation rather than after mistakes.

### Testing

- **Unit tests:** inline `#[cfg(test)] mod tests` in each module.
- **Integration tests:** `tests/integration/` -- test CLI binary end-to-end.
- **Benchmarks:** `benches/` -- criterion benchmarks for search, chunking, ranking.
- Test data: `tests/fixtures/` -- small sample files in multiple languages.
- Coverage target: >= 80%.

### Repository Layout

```
prx/
├── AGENTS.md                    # This file
├── CLAUDE.md                    # Thin entrypoint (delegates to AGENTS.md)
├── README.md                    # Public-facing overview
├── CHANGELOG.md                 # Release history
├── Cargo.toml                   # Dependencies, features, metadata
├── Cargo.lock                   # Locked dependency versions
├── docs/
│   ├── vision/
│   │   ├── PRD.md               # Product requirements
│   │   └── ROADMAP.md           # Phased delivery plan
│   ├── architecture/
│   │   ├── SYSTEM.md            # System architecture
│   │   └── SEARCH.md            # Search subsystem architecture
│   ├── design/
│   │   ├── CLI.md               # CLI interface specification
│   │   ├── OUTPUT.md            # JSON output format specification
│   │   ├── BENCHMARKS.md        # Benchmarking plan and methodology
│   │   ├── SYSTEM-DESIGN.md     # Detailed design for all 20 subsystems
│   │   ├── IMPLEMENTATION.md    # Step-by-step implementation plan
│   │   ├── TESTING.md           # Testing plan (unit, integration, benchmarks)
│   │   └── CRATE-REFERENCE.md   # Crate versions, APIs, and compatibility
│   ├── research/
│   │   ├── LANDSCAPE.md         # Competitive landscape analysis
│   │   └── PLATFORM.md          # Cross-platform compatibility audit
│   └── assets/                  # SVG diagrams
├── CONTRIBUTING.md              # Developer setup and workflow guide
├── src/
│   ├── main.rs                  # CLI entry point, clap dispatch
│   ├── lib.rs                   # Library surface (public API)
│   ├── output.rs                # JSON envelope, error formatting
│   ├── tokens.rs                # Token counting (tokenizers crate)
│   ├── hash.rs                  # Content hashing (xxh3)
│   ├── walk.rs                  # File walking (ignore crate, .gitignore/.prxignore)
│   ├── workspace.rs             # Shared utilities (relative_path, is_test_file)
│   ├── fallback.rs              # Graceful fallback to grep/cat/find on internal errors
│   │
│   ├── commands/                # Subcommand handlers
│   │   ├── mod.rs
│   │   ├── search.rs            # prx search
│   │   ├── read.rs              # prx read
│   │   ├── find.rs              # prx find
│   │   ├── edit.rs              # prx edit
│   │   ├── diff.rs              # prx diff
│   │   ├── batch.rs             # prx batch
│   │   ├── bench.rs             # prx bench (synthetic benchmarks)
│   │   ├── context.rs           # prx context (module context package)
│   │   ├── impact.rs            # prx impact (reverse dependency analysis)
│   │   ├── index.rs             # prx index
│   │   ├── init.rs              # prx init
│   │   ├── mcp.rs               # prx mcp
│   │   ├── outline.rs           # prx outline
│   │   ├── exists.rs            # prx exists
│   │   ├── stats.rs             # prx stats
│   │   └── run.rs               # prx run (structured command runner)
│   │
│   ├── search/                  # Search engine
│   │   ├── mod.rs
│   │   ├── fusion.rs            # RRF fusion, adaptive alpha
│   │   ├── graph.rs             # Import graph (BFS, persistence, suffix resolution)
│   │   ├── semantic.rs          # Model2Vec embedding search
│   │   ├── literal.rs           # Regex/literal search
│   │   ├── structural.rs        # ast-grep pattern search
│   │   ├── tokenize.rs          # Identifier tokenization (camelCase/snake_case)
│   │   └── symbols.rs           # Symbol index (definition lookup, reference counting)
│   │
│   ├── chunking/                # Code chunking
│   │   ├── mod.rs
│   │   └── treesitter.rs        # Tree-sitter AST chunking
│   │
│   ├── ranking/                 # Result ranking
│   │   ├── mod.rs
│   │   ├── boosting.rs          # Definition boost, stem matching, coherence
│   │   ├── penalties.rs         # Noise penalties, saturation decay
│   │   ├── proximity.rs         # Import graph proximity boost
│   │   └── weighting.rs         # Alpha weight resolution
│   │
│   ├── index/                   # Index management
│   │   ├── mod.rs
│   │   ├── dense.rs             # Model2Vec embeddings
│   │   ├── sparse.rs            # BM25 sparse matrix
│   │   └── bloom.rs             # Bloom filter for exists
│   │
│   ├── parsing/                 # Tree-sitter integration
│   │   ├── mod.rs
│   │   ├── imports.rs           # Tree-sitter AST import extraction (10 language families)
│   │   ├── languages.rs         # Language detection, grammar loading
│   │   ├── outline.rs           # Symbol extraction
│   │   ├── snap.rs              # Structural snapping (function/class boundaries)
│   │   └── strip.rs             # Tree-sitter comment stripping (--mode aggressive)
│   └── runner/                  # prx run parsers
│       ├── mod.rs               # Runner framework, tool detection
│       ├── cargo_test.rs        # cargo test output parser
│       ├── cargo_build.rs       # cargo build/clippy output parser
│       ├── pytest.rs            # pytest output parser
│       ├── go_test.rs           # go test output parser
│       ├── jest.rs              # jest/npm test/vitest output parser
│       ├── tsc.rs               # TypeScript compiler output parser
│       ├── eslint.rs            # eslint output parser
│       └── fallback.rs          # Unknown command fallback
├── models/                      # Embedding model weights (build-time)
│   └── potion-retrieval-32M.safetensors  # Included via include_bytes!
├── skills/
│   └── agents.md                # Agent-facing skill guide (install, usage, integration)
├── tests/
│   ├── integration/             # CLI integration tests
│   └── fixtures/                # Sample source files for testing
└── benches/                     # Criterion benchmarks
```

> This layout is authoritative. If you add or remove a module, update this section.

---

## Key Architectural Decisions

These are settled decisions. Do not revisit without discussion.

| # | Decision | Rationale |
|---|---|---|
| 1 | **Single binary, busybox-style** | clap multicall. `prx search` or hardlink `prx-search`. Zero install friction -- download one file, run it. |
| 2 | **Model weights embedded in binary** | `include_bytes!` with float16 potion-retrieval-32M model (~32 MB). No internet required, works in sandboxes and air-gapped environments. |
| 3 | **Pure Rust Model2Vec inference** | No ONNX Runtime dependency. Inference is tokenize + lookup + mean pool + normalize (~50 lines). ONNX Runtime dropped x86_64 macOS support; pure Rust works everywhere. |
| 4 | **JSON output by default** | Agents parse structured data, not column-aligned text. `--plain` flag for human fallback. Errors in stdout, never stderr. |
| 5 | **Tree-sitter for structural code parsing** | Powers chunking, --snap, --skeleton, --outline, syntax validation, structural search. Import extraction uses tree-sitter AST queries (10 language families). No LSP server required. |
| 6 | **Token budgets, not truncation** | `--budget N` returns the best N tokens of results, ranked by relevance. Not `head -N` arbitrary cutoff. |
| 7 | **Dry-run edits by default** | `prx edit` previews changes. `--apply` commits. Agents see what will change before it happens. |
| 8 | **Content hashes in every response** | Enables cheap "has this changed?" checks. Eliminates ~50% of redundant file re-reads. |
| 9 | **No daemon for basic usage** | All commands work statelessly. Optional `prx index --watch` for warm caching. |
| 10 | **6-stage reranking pipeline** | Definition boost, stem matching, file coherence, import graph proximity, noise penalties, saturation decay. Quality comes from ranking, not just retrieval. |
| 11 | **BM25 with compound identifier tokenization** | camelCase/snake_case splitting without stemming. Code identifiers are semantically distinct -- "HTTPResponse" and "HTTP" mean different things. |
| 12 | **RRF fusion with adaptive alpha** | Symbol queries (Foo::bar) lean BM25 (alpha=0.3). Natural language queries stay balanced (alpha=0.5). Auto-detected. |
| 13 | **Parallel indexing via rayon** | All 5 indexing stages (read/hash/chunk, BM25, embedding, import graph, symbol index) run in parallel. No shared mutable state, no Arc, no Mutex -- pure `par_iter` on thread-safe immutable data. 7.6x speedup on a 10-core machine (11K files: 410s → 54s). BLAS thread limits prevent oversubscription. |
| 14 | **Zero-copy memory-mapped embeddings** | `embeddings.bin` is mmap'd via `memmap2` and cast to `&[f32]` with `bytemuck::cast_slice` (zero allocation, zero deserialization). OS page cache keeps index warm across queries. Falls back to owned `Array2<f32>` if mmap fails. `Embeddings` enum abstracts both paths behind a single `view() -> ArrayView2<f32>` API. |
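
Decision 12 can be illustrated with a small weighted reciprocal rank fusion sketch. Several things here are assumptions rather than facts about `src/search/fusion.rs`: alpha is taken to be the weight of the semantic ranking (so BM25 gets `1 - alpha`), the symbol-detection heuristic is invented for illustration, and `k = 60` in the usage below is simply the conventional RRF constant. All function names are hypothetical.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Assumed heuristic: a query looks like a symbol if it has no whitespace and
/// contains `::`, `_`, or a lowercase letter directly followed by an uppercase one.
pub fn looks_like_symbol(query: &str) -> bool {
    let q = query.trim();
    if q.is_empty() || q.contains(char::is_whitespace) {
        return false;
    }
    let camel = q
        .chars()
        .zip(q.chars().skip(1))
        .any(|(a, b)| a.is_lowercase() && b.is_uppercase());
    q.contains("::") || q.contains('_') || camel
}

/// Alpha values from the table: 0.3 for symbol queries, 0.5 otherwise.
pub fn adaptive_alpha(query: &str) -> f64 {
    if looks_like_symbol(query) { 0.3 } else { 0.5 }
}

/// Weighted RRF over two ranked lists of document ids (best first).
/// Each list contributes `weight / (k + rank)` with 1-based ranks.
pub fn rrf_fuse(semantic: &[&str], bm25: &[&str], alpha: f64, k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (rank, id) in semantic.iter().enumerate() {
        *scores.entry(id.to_string()).or_insert(0.0) += alpha / (k + rank as f64 + 1.0);
    }
    for (rank, id) in bm25.iter().enumerate() {
        *scores.entry(id.to_string()).or_insert(0.0) += (1.0 - alpha) / (k + rank as f64 + 1.0);
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort by descending score, breaking ties by id so the order is deterministic.
    fused.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap_or(Ordering::Equal)
            .then_with(|| a.0.cmp(&b.0))
    });
    fused
}
```

With `alpha = 0.3`, a document ranked first by BM25 outscores one ranked first by the semantic side, which matches the "lean BM25" behavior described for symbol queries.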

---

## Anti-Patterns

### 1. Returning unstructured text

All output must go through the JSON envelope in `src/output.rs`. Never
`println!()` directly to stdout from command handlers.

### 2. Using stderr for errors

Agents read stdout. Errors are structured JSON on stdout with a non-zero
exit code. stderr is reserved for debug logging only (behind `RUST_LOG`).
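
A minimal, std-only sketch of this rule follows. It is illustrative only: `ErrorBody`, `escape_json`, `render_error`, and `report_error` are hypothetical names, and the real `src/output.rs` may well use a serialization crate rather than hand-built strings. The field names match the error envelope shown under "Output Format".

```rust
/// Hypothetical error payload mirroring the `error` object in the envelope.
pub struct ErrorBody {
    pub code: String,
    pub message: String,
    pub suggestion: Option<String>,
}

/// Escape a string for embedding inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render an error envelope as a single line of JSON.
pub fn render_error(version: &str, command: &str, err: &ErrorBody) -> String {
    let suggestion = match &err.suggestion {
        Some(s) => format!(",\"suggestion\":\"{}\"", escape_json(s)),
        None => String::new(),
    };
    format!(
        "{{\"version\":\"{}\",\"command\":\"{}\",\"status\":\"error\",\"error\":{{\"code\":\"{}\",\"message\":\"{}\"{}}}}}",
        escape_json(version),
        escape_json(command),
        escape_json(&err.code),
        escape_json(&err.message),
        suggestion
    )
}

/// Write the envelope to stdout (never stderr) and return the non-zero exit
/// code the caller should pass to `std::process::exit`.
pub fn report_error(version: &str, command: &str, err: &ErrorBody) -> i32 {
    println!("{}", render_error(version, command, err));
    1
}
```

The single `println!` lives in the output layer, so command handlers never write to stdout themselves.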

### 3. Unbounded output

Every command that returns file content or search results must respect
`--budget`. The default is unlimited, but each such command must honor a
budget whenever one is passed.
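
One way to honor a budget is to rank first and then fill greedily, as sketched below. This is an assumption about the selection strategy, not a description of the actual implementation: `Hit` and `apply_budget` are hypothetical, token counts are taken as precomputed, and hits that do not fit are skipped rather than ending the scan.

```rust
use std::cmp::Ordering;

/// Hypothetical search hit with a relevance score and a precomputed token count.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub path: String,
    pub score: f64,
    pub tokens: usize,
}

/// Keep the highest-scoring hits whose combined token count fits the budget.
/// `None` means unlimited, matching the default described above.
pub fn apply_budget(mut hits: Vec<Hit>, budget: Option<usize>) -> Vec<Hit> {
    // Rank by descending relevance before spending any budget.
    hits.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap_or(Ordering::Equal));
    let Some(limit) = budget else {
        return hits;
    };
    let mut used = 0usize;
    let mut kept = Vec::new();
    for hit in hits {
        // Skip hits that would overflow the budget; smaller ones may still fit.
        if used + hit.tokens <= limit {
            used += hit.tokens;
            kept.push(hit);
        }
    }
    kept
}
```

Unlike a `head -N` cutoff, the result depends on relevance order rather than on discovery order.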

### 4. Platform-specific behavior

No `#[cfg(target_os)]` in command logic. Platform differences are isolated
to `src/parsing/languages.rs` (grammar loading) and the notify crate
(file watching). Everything else is pure cross-platform Rust.

### 5. Unwrap in library code

`unwrap()` and `expect()` are forbidden outside `#[cfg(test)]` modules.
Use `?` propagation with typed errors.
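
As a rough illustration of `?` with typed errors, the sketch below uses only the standard library. In the codebase itself `thiserror` would normally generate the `Display` and `From` impls. `ReadError` and `count_lines` are hypothetical names chosen for the example.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical typed error for a read-style operation.
#[derive(Debug)]
pub enum ReadError {
    NotFound(PathBuf),
    Io(io::Error),
}

impl fmt::Display for ReadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReadError::NotFound(p) => write!(f, "File not found: {}", p.display()),
            ReadError::Io(e) => write!(f, "I/O error: {e}"),
        }
    }
}

impl std::error::Error for ReadError {}

/// Lets `?` convert `io::Error` into `ReadError` automatically.
impl From<io::Error> for ReadError {
    fn from(e: io::Error) -> Self {
        ReadError::Io(e)
    }
}

/// Count the lines in a file without any `unwrap()` or `expect()`.
pub fn count_lines(path: &Path) -> Result<usize, ReadError> {
    if !path.exists() {
        return Err(ReadError::NotFound(path.to_path_buf()));
    }
    // Any remaining I/O failure is propagated as `ReadError::Io` via `From`.
    let text = fs::read_to_string(path)?;
    Ok(text.lines().count())
}
```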

### 6. Adding dependencies without justification

Every crate added to Cargo.toml must have a comment explaining why it is
needed and why an existing dependency cannot serve the purpose.

---

## Git Workflow

**No direct pushes to `main`.** All work happens on `dev/vX.Y.Z` branches.

**Version semantics:** `v0.X.0` = features (new capabilities). `v0.X.Y` = fixes/improvements only.

```
git checkout -b dev/v0.4.1 main   # cut branch
# ... develop, commit, test ...
# >>> GET HUMAN SIGN-OFF <<<       # mandatory before merge
git checkout main && git merge --no-ff dev/v0.4.1   # merge
git tag -a v0.4.1 -m "..."        # tag
git push origin main && git push origin v0.4.1      # push + release
git branch -d dev/v0.4.1          # cleanup
```

Full workflow: `docs/design/GIT-WORKFLOW.md`.

---

## Pre-Merge Checklist

- [ ] On a `dev/vX.Y.Z` branch (not `main`)
- [ ] `cargo fmt --check` passes
- [ ] `cargo clippy -- -D warnings` passes
- [ ] `cargo test` passes
- [ ] `cargo deny check` passes
- [ ] `cargo build --release` succeeds
- [ ] No `unwrap()` in non-test code
- [ ] Public functions have `///` doc comments
- [ ] JSON output matches schemas in docs/design/OUTPUT.md
- [ ] AGENTS.md updated if layout or conventions changed
- [ ] CHANGELOG.md updated for user-visible changes (used as GitHub release notes)
- [ ] Cargo.toml version bumped