prx 0.5.9 - Docs.rs

# AGENTS.md -- prx (Praxis)

> Machine-readable project reference for AI coding assistants.
> Last updated: 2026-05-29

## Project Identity

**prx** (Praxis) is a busybox-style Rust binary providing agent-native
replacements for core Unix tools. Every subcommand returns structured JSON output with token
budgets, structural awareness, and content hashing -- designed for AI coding
agents, not humans.

- **Repository:** `github.com/civitas-io/prx`
- **License:** Apache 2.0
- **Language:** Rust (edition 2024)
- **Status:** v0.5.5 released

### What prx Is

- A single static binary (~49 MB) with zero runtime dependencies
- Replaces grep, cat, find, sed, diff with agent-native equivalents
- Embeds a 32M-parameter retrieval-optimized embedding model for semantic code search
- Returns structured JSON with labeled fields, token counts, and content hashes
- Works offline, in sandboxes, in containers -- no internet, no daemon, no setup

### What prx Is NOT

- NOT a wrapper around existing tools (unlike RTK, squeez, LeanCTX)
- NOT search-only — covers the full read-search-edit-diff loop
- NOT a framework or SDK -- it is a CLI tool that agents invoke
- NOT an AI/LLM -- the tool is deterministic; the agent calling it is the LLM
- NOT a replacement for LSP -- prx provides structural awareness without a server

---

## Using prx (for agents consuming this tool)

### Installation

prx ships as a single binary. No package manager, no runtime, no dependencies.

```bash
# Download prebuilt binary (Linux x86_64 example)
curl -L https://github.com/civitas-io/prx/releases/latest/download/prx-linux-x86_64 -o prx
chmod +x ag

# Or install via cargo
cargo install prx
```

### Quick Reference

```bash
# Search -- find code by meaning, not just text
prx search "authentication flow" src/          # semantic (auto-detected)
prx search --literal "authenticate(" src/      # exact match, ripgrep-speed
prx search --structural 'fn $NAME($$$) { $$$ }' src/   # AST pattern matching

# Read -- structured file access
prx read src/auth.ts                           # full file with metadata
prx read src/auth.ts --skeleton                # signatures and exports only
prx read src/auth.ts --lines 42-67 --snap fn   # expand to enclosing function
prx read src/auth.ts --outline                 # symbol table
prx read src/auth.ts --if-changed abc123...    # skip if unchanged (returns cached stub)
prx read src/auth.ts --mode aggressive         # strip comments (1-19% savings)
prx read src/auth.ts --mode diff               # changed lines vs git HEAD (80-97%)
prx read schema.rs --mode entropy              # filter repetitive code (5-87%)

# Find -- codebase mapping
prx find src/ --pattern "*.ts" --depth 3       # bounded file discovery
prx find src/ --changed-since HEAD~3           # recently modified files

# Edit -- safe, verified modifications
prx edit src/auth.ts --find "old_call()" --replace "new_call()"
prx edit src/auth.ts --find "old_call()" --replace "new_call()" --apply

# Diff -- semantic change summaries
prx diff src/auth.ts --since HEAD~1            # structured diff
prx diff --stat-only                           # summary in ~30 tokens

# Run -- structured command output (95-99% token savings on tests)
prx run cargo test                             # parsed test results
prx run cargo clippy                           # parsed warnings/errors
prx run pytest                                 # parsed test results
prx run cargo build                            # parsed build errors

# Utilities
prx exists "pattern" src/                      # O(1) existence check
prx outline src/auth.ts                        # symbol outline
prx batch < commands.jsonl                     # parallel batch execution
```

### Output Format

All output is JSON. Every response follows this envelope:

```json
{
  "version": "0.2.0",
  "command": "search",
  "status": "ok",
  "tokens": 487,
  "data": { ... }
}
```

Errors are also JSON, on stdout, never stderr:

```json
{
  "version": "0.2.0",
  "command": "read",
  "status": "error",
  "error": {
    "code": "file_not_found",
    "message": "File not found: src/missing.ts",
    "suggestion": "Use `prx find` to discover files."
  }
}
```

Use `--plain` for human-readable output. Use `--budget N` to cap token usage.

### Workflow Guidance

1. Start with `prx search` to find relevant code. Prefer semantic queries for
   unfamiliar codebases, literal queries for known identifiers.
2. Use `prx read --skeleton` or `prx read --outline` before reading full files.
   This costs ~10% of the tokens and tells you what is in the file.
3. Use `prx read --snap function` to read specific functions without pulling the
   entire file into context.
4. Re-reading a file? Pass `--if-changed <previous_hash>` to skip if unchanged.
   The hash is in `meta.hash` from the previous response. Cache hits cost ~50 bytes.
5. Use `prx exists "pattern"` before full searches when you just need a yes/no.
6. Use `prx edit` to preview changes (dry-run is the default). Add `--apply` to write.
7. Use `prx diff --stat-only` for cheap change detection (~30 tokens).
8. Use `prx run cargo test` instead of raw `cargo test` — returns only failures,
   saves 95-99% tokens on passing test suites.
9. Use `prx batch` to combine multiple independent queries in one round-trip.
10. Use `--budget N` on every content-returning command to control token cost.

### Integration Strategy

prx supports three integration tiers. Use all three for full coverage.

**Tier 1: CLI on PATH (universal, works everywhere)**

Install the binary and add the usage snippet to your project's AGENTS.md or
CLAUDE.md. This is the primary integration — it works for top-level agents,
sub-agents, scripts, CI, and humans. Run `prx init --agents-md` to append the
snippet automatically.

**Tier 2: MCP server (richer integration, top-level agents only)**

```json
{
  "mcpServers": {
    "prx": {
      "command": "prx",
      "args": ["mcp"]
    }
  }
}
```

Works with Claude Code, Cursor, Codex, OpenCode. Provides typed tool
parameters and auto-discovery. However, sub-agents cannot call MCP tools
(confirmed limitation in Claude Code and Codex CLI).

**Tier 3: Agent definition (Claude Code sub-agents)**

```bash
prx init --agent claude-code
```

Writes `.claude/agents/prx-search.md`, creating a dedicated sub-agent that
uses prx via bash with optimized workflow guidance.

**Quick setup for any framework:**

```bash
prx init           # auto-detect frameworks, write all configs
```

### Version Compatibility

CLI flags and JSON output schemas may change between minor versions. All
breaking changes are documented in CHANGELOG.md with migration guides. Use the
`version` field in JSON output for programmatic detection.

---

## Developing prx (for agents contributing to this codebase)

### Conventions

| Convention | Standard |
|---|---|
| Language | Rust, edition 2024 |
| MSRV | 1.85 |
| Formatter | `cargo fmt` (rustfmt defaults) |
| Linting | `cargo clippy -- -D warnings` |
| Testing | `cargo test` |
| Build | `cargo build --release` |
| License | Apache 2.0 |
| Package layout | `src/` (single crate) |

### Code Style

- **Formatter / linter:** `cargo fmt` + `cargo clippy`. No custom rustfmt config.
- **Line length:** 100 (rustfmt default).
- **Error handling:** Use `thiserror` for library errors, `anyhow` for CLI.
  Never `unwrap()` in library code. `unwrap()` acceptable only in tests.
- **Unsafe:** Forbidden without explicit justification in a code comment.
- **Comments:** only when the WHY is non-obvious.
- **Public API:** all pub functions and types must have doc comments (`///`).
- **Dependencies:** minimize. Every new dependency must justify its inclusion.

### Coding Discipline

Based on [Karpathy's guidelines](https://github.com/multica-ai/andrej-karpathy-skills)
for reducing LLM coding mistakes. Biases toward caution over speed. For trivial
tasks (typo fixes, obvious one-liners), use judgment.

**1. Think before coding.** Do not assume. Do not hide confusion. Surface
tradeoffs.

- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them -- do not pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what is confusing. Ask.

**2. Simplicity first.** Minimum code that solves the problem. Nothing
speculative.

- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that was not requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
- The test: would a senior engineer say this is overcomplicated? If yes,
  simplify.

**3. Surgical changes.** Touch only what you must. Clean up only your own mess.

When editing existing code:
- Do not "improve" adjacent code, comments, or formatting.
- Do not refactor things that are not broken.
- Match existing style, even if you would do it differently.
- If you notice unrelated dead code, mention it -- do not delete it.

When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Do not remove pre-existing dead code unless asked.

The test: every changed line should trace directly to the request.

**4. Goal-driven execution.** Define success criteria. Loop until verified.

Transform tasks into verifiable goals:
- "Add validation" becomes "write tests for invalid inputs, then make them pass"
- "Fix the bug" becomes "write a test that reproduces it, then make it pass"
- "Refactor X" becomes "ensure tests pass before and after"

For multi-step tasks, state a brief plan:

```
1. [Step] -> verify: [check]
2. [Step] -> verify: [check]
3. [Step] -> verify: [check]
```

Strong success criteria enable independent work. Weak criteria ("make it work")
require constant clarification.

**These guidelines are working if:** fewer unnecessary changes in diffs, fewer
rewrites due to overcomplication, and clarifying questions come before
implementation rather than after mistakes.

### Testing

- **Unit tests:** inline `#[cfg(test)] mod tests` in each module.
- **Integration tests:** `tests/integration/` -- test CLI binary end-to-end.
- **Benchmarks:** `benches/` -- criterion benchmarks for search, chunking, ranking.
- Test data: `tests/fixtures/` -- small sample files in multiple languages.
- Coverage target: >= 80%.

### Repository Layout

```
ag/
├── AGENTS.md                    # This file
├── CLAUDE.md                    # Thin entrypoint (delegates to AGENTS.md)
├── README.md                    # Public-facing overview
├── CHANGELOG.md                 # Release history
├── Cargo.toml                   # Dependencies, features, metadata
├── Cargo.lock                   # Locked dependency versions
│
├── docs/
│   ├── vision/
│   │   ├── PRD.md               # Product requirements
│   │   └── ROADMAP.md           # Phased delivery plan
│   ├── architecture/
│   │   ├── SYSTEM.md            # System architecture
│   │   └── SEARCH.md            # Search subsystem architecture
│   ├── design/
│   │   ├── CLI.md               # CLI interface specification
│   │   ├── OUTPUT.md            # JSON output format specification
│   │   ├── BENCHMARKS.md        # Benchmarking plan and methodology
│   │   ├── SYSTEM-DESIGN.md     # Detailed design for all 20 subsystems
│   │   ├── IMPLEMENTATION.md   # Step-by-step implementation plan
│   │   ├── TESTING.md          # Testing plan (unit, integration, benchmarks)
│   │   └── CRATE-REFERENCE.md  # Crate versions, APIs, and compatibility
│   ├── research/
│   │   ├── LANDSCAPE.md         # Competitive landscape analysis
│   │   └── PLATFORM.md          # Cross-platform compatibility audit
│   └── assets/                  # SVG diagrams
│
├── CONTRIBUTING.md              # Developer setup and workflow guide
│
├── src/
│   ├── main.rs                  # CLI entry point, clap dispatch
│   ├── lib.rs                   # Library surface (public API)
│   ├── output.rs                # JSON envelope, error formatting
│   ├── tokens.rs                # Token counting (tokenizers crate)
│   ├── hash.rs                  # Content hashing (xxh3)
│   ├── walk.rs                  # File walking (ignore crate, .gitignore/.prxignore)
│   ├── workspace.rs             # Shared utilities (relative_path, is_test_file)
│   ├── fallback.rs              # Graceful fallback to grep/cat/find on internal errors
│   │
│   ├── commands/                # Subcommand handlers
│   │   ├── mod.rs
│   │   ├── search.rs            # prx search
│   │   ├── read.rs              # prx read
│   │   ├── find.rs              # prx find
│   │   ├── edit.rs              # prx edit
│   │   ├── diff.rs              # prx diff
│   │   ├── batch.rs             # prx batch
│   │   ├── bench.rs             # prx bench (synthetic benchmarks)
│   │   ├── context.rs           # prx context (module context package)
│   │   ├── impact.rs            # prx impact (reverse dependency analysis)
│   │   ├── index.rs             # prx index
│   │   ├── init.rs              # prx init
│   │   ├── mcp.rs               # prx mcp
│   │   ├── outline.rs           # prx outline
│   │   ├── exists.rs            # prx exists
│   │   ├── stats.rs             # prx stats
│   │   ├── run.rs               # prx run (structured command runner)
│   │
│   ├── search/                  # Search engine
│   │   ├── mod.rs
│   │   ├── fusion.rs            # RRF fusion, adaptive alpha
│   │   ├── graph.rs             # Import graph (BFS, persistence, suffix resolution)
│   │   ├── semantic.rs          # Model2Vec embedding search
│   │   ├── literal.rs           # Regex/literal search
│   │   ├── structural.rs        # ast-grep pattern search
│   │   ├── tokenize.rs          # Identifier tokenization (camelCase/snake_case)
│   │   └── symbols.rs           # Symbol index (definition lookup, reference counting)
│   │
│   ├── chunking/                # Code chunking
│   │   ├── mod.rs
│   │   └── treesitter.rs        # Tree-sitter AST chunking
│   │
│   ├── ranking/                 # Result ranking
│   │   ├── mod.rs
│   │   ├── boosting.rs          # Definition boost, stem matching, coherence
│   │   ├── penalties.rs         # Noise penalties, saturation decay
│   │   ├── proximity.rs         # Import graph proximity boost
│   │   └── weighting.rs         # Alpha weight resolution
│   │
│   ├── index/                   # Index management
│   │   ├── mod.rs
│   │   ├── dense.rs             # Model2Vec embeddings
│   │   ├── sparse.rs            # BM25 sparse matrix
│   │   └── bloom.rs             # Bloom filter for exists
│   │
│   └── parsing/                 # Tree-sitter integration
│       ├── mod.rs
│       ├── imports.rs           # Tree-sitter AST import extraction (10 language families)
│       ├── languages.rs         # Language detection, grammar loading
│       ├── outline.rs           # Symbol extraction
│       ├── snap.rs              # Structural snapping (function/class boundaries)
│       └── strip.rs             # Tree-sitter comment stripping (--mode aggressive)
│
│   └── runner/                  # prx run parsers
│       ├── mod.rs               # Runner framework, tool detection
│       ├── cargo_test.rs        # cargo test output parser
│       ├── cargo_build.rs       # cargo build/clippy output parser
│       ├── pytest.rs            # pytest output parser
│       ├── go_test.rs           # go test output parser
│       ├── jest.rs              # jest/npm test/vitest output parser
│       ├── tsc.rs               # TypeScript compiler output parser
│       ├── eslint.rs            # eslint output parser
│       └── fallback.rs          # Unknown command fallback
│
├── models/                      # Embedding model weights (build-time)
│   └── potion-retrieval-32M.safetensors  # Included via include_bytes!
│
├── skills/
│   └── agents.md                # Agent-facing skill guide (install, usage, integration)
│
├── tests/
│   ├── integration/             # CLI integration tests
│   └── fixtures/                # Sample source files for testing
│
└── benches/                     # Criterion benchmarks
```

> This layout is authoritative. If you add or remove a module, update this section.

---

## Key Architectural Decisions

These are settled decisions. Do not revisit without discussion.

| # | Decision | Rationale |
|---|---|---|
| 1 | **Single binary, busybox-style** | clap multicall. `prx search` or hardlink `prx-search`. Zero install friction -- download one file, run it. |
| 2 | **Model weights embedded in binary** | `include_bytes!` with float16 potion-retrieval-32M model (~32 MB). No internet required, works in sandboxes and air-gapped environments. |
| 3 | **Pure Rust Model2Vec inference** | No ONNX Runtime dependency. Inference is tokenize + lookup + mean pool + normalize (~50 lines). ONNX Runtime dropped x86_64 macOS support; pure Rust works everywhere. |
| 4 | **JSON output by default** | Agents parse structured data, not column-aligned text. `--plain` flag for human fallback. Errors in stdout, never stderr. |
| 5 | **Tree-sitter for structural code parsing** | Powers chunking, --snap, --skeleton, --outline, syntax validation, structural search. Import extraction uses tree-sitter AST queries (10 language families). No LSP server required. |
| 6 | **Token budgets, not truncation** | `--budget N` returns the best N tokens of results, ranked by relevance. Not `head -N` arbitrary cutoff. |
| 7 | **Dry-run edits by default** | `prx edit` previews changes. `--apply` commits. Agents see what will change before it happens. |
| 8 | **Content hashes in every response** | Enables cheap "has this changed?" checks. Eliminates ~50% of redundant file re-reads. |
| 9 | **No daemon for basic usage** | All commands work statelessly. Optional `prx index --watch` for warm caching. |
| 10 | **6-stage reranking pipeline** | Definition boost, stem matching, file coherence, import graph proximity, noise penalties, saturation decay. Quality comes from ranking, not just retrieval. |
| 11 | **BM25 with compound identifier tokenization** | camelCase/snake_case splitting without stemming. Code identifiers are semantically distinct -- "HTTPResponse" and "HTTP" mean different things. |
| 12 | **RRF fusion with adaptive alpha** | Symbol queries (Foo::bar) lean BM25 (alpha=0.3). Natural language queries stay balanced (alpha=0.5). Auto-detected. |
| 13 | **Parallel indexing via rayon** | All 5 indexing stages (read/hash/chunk, BM25, embedding, import graph, symbol index) run in parallel. No shared mutable state, no Arc, no Mutex — pure `par_iter` on thread-safe immutable data. 7.6x speedup on 10-core (11K files: 410s → 54s). BLAS thread limits prevent oversubscription. |
| 14 | **Zero-copy memory-mapped embeddings** | `embeddings.bin` is mmap'd via `memmap2` and cast to `&[f32]` with `bytemuck::cast_slice` (zero allocation, zero deserialization). OS page cache keeps index warm across queries. Falls back to owned `Array2<f32>` if mmap fails. `Embeddings` enum abstracts both paths behind a single `view() -> ArrayView2<f32>` API. |

---

## Anti-Patterns

### 1. Returning unstructured text

All output must go through the JSON envelope in `src/output.rs`. Never
`println!()` directly to stdout from command handlers.

### 2. Using stderr for errors

Agents read stdout. Errors are structured JSON on stdout with a non-zero
exit code. stderr is reserved for debug logging only (behind `RUST_LOG`).

### 3. Unbounded output

Every command that returns file content or search results must respect
`--budget`. Default is unlimited, but the infrastructure must support it.

### 4. Platform-specific behavior

No `#[cfg(target_os)]` in command logic. Platform differences are isolated
to `src/parsing/languages.rs` (grammar loading) and the notify crate
(file watching). Everything else is pure cross-platform Rust.

### 5. Unwrap in library code

`unwrap()` and `expect()` are forbidden outside `#[cfg(test)]` modules.
Use `?` propagation with typed errors.

### 6. Adding dependencies without justification

Every crate added to Cargo.toml must have a comment explaining why it is
needed and why an existing dependency cannot serve the purpose.

---

## Git Workflow

**No direct pushes to `main`.** All work happens on `dev/vX.Y.Z` branches.

**Version semantics:** `v0.X.0` = features (new capabilities). `v0.X.Y` = fixes/improvements only.

```
git checkout -b dev/v0.4.1 main   # cut branch
# ... develop, commit, test ...
# >>> GET HUMAN SIGN-OFF <<<       # mandatory before merge
git checkout main && git merge --no-ff dev/v0.4.1   # merge
git tag -a v0.4.1 -m "..."        # tag
git push origin main && git push origin v0.4.1      # push + release
git branch -d dev/v0.4.1          # cleanup
```

Full workflow: `docs/design/GIT-WORKFLOW.md`.

---

## Pre-Merge Checklist

- [ ] On a `dev/vX.Y.Z` branch (not `main`)
- [ ] `cargo fmt --check` passes
- [ ] `cargo clippy -- -D warnings` passes
- [ ] `cargo test` passes
- [ ] `cargo deny check` passes
- [ ] `cargo build --release` succeeds
- [ ] No `unwrap()` in non-test code
- [ ] Public functions have `///` doc comments
- [ ] JSON output matches schemas in docs/design/OUTPUT.md
- [ ] AGENTS.md updated if layout or conventions changed
- [ ] CHANGELOG.md updated for user-visible changes (used as GitHub release notes)
- [ ] Cargo.toml version bumped