prx (Praxis)
AI coding agents burn most of their context window re-discovering code they've already seen. prx fixes that at the source.
prx is a single Rust binary that replaces the Unix tools coding agents lean on most — grep, cat, find, sed, diff — with structured JSON output, hard token budgets, and an embedded semantic search model. One call returns a ranked, budgeted answer instead of a wall of text the agent has to read, parse, and re-read. No shell spawning, no post-hoc compression, no model server.
The problem
Every coding agent runs some version of this loop:
1. grep "authenticate" src/ → file paths, line numbers
2. cat src/auth/handler.ts → entire file (thousands of tokens)
3. grep "authenticate" src/ -A 5 → same noise, wider context
Most of those tokens are waste: whole files read to use ten lines, the same file loaded twice in a session, test logs dumped in full to find one failure. The tools aren't broken — they were built for humans reading a terminal, not for an agent paying for every token and working inside a fixed context window. That mismatch is the tax prx removes.
The token-waste figures previously cited here are being re-sourced. Rather than quote a number we can't currently point you to a verifiable reference for, we let the per-command savings table below — measured on real sessions — speak for itself.
The fix
Ranked results, metadata included, under a token budget you control. The agent gets the answer, not the haystack.
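To make "under a token budget" concrete, here is a minimal sketch of budgeted selection over ranked hits. It is not prx's implementation: the `Hit` struct, its field names, and the 4-characters-per-token estimate are all assumptions made for illustration.

```rust
/// A ranked search hit. Field names are illustrative, not prx's JSON schema.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    path: String,
    score: f64,
    text: String,
}

/// Rough token estimate: about 4 characters per token.
/// This is an assumption for the sketch, not a real tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Keep the highest-scoring hits whose combined estimate fits in `budget`.
fn select_within_budget(mut hits: Vec<Hit>, budget: usize) -> Vec<Hit> {
    // Best score first; total_cmp gives a total order over f64.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    let mut used = 0;
    let mut kept = Vec::new();
    for hit in hits {
        let cost = estimate_tokens(&hit.text);
        if used + cost > budget {
            continue; // too large for the remaining budget; a smaller hit may still fit
        }
        used += cost;
        kept.push(hit);
    }
    kept
}

fn main() {
    let hits = vec![
        Hit { path: "src/auth/handler.ts".into(), score: 0.9, text: "a".repeat(40) },
        Hit { path: "src/auth/token.ts".into(), score: 0.7, text: "b".repeat(400) },
        Hit { path: "src/util/log.ts".into(), score: 0.2, text: "c".repeat(20) },
    ];
    for hit in select_within_budget(hits, 20) {
        println!("{} (score {:.1})", hit.path, hit.score);
    }
}
```

The sketch skips an oversized hit instead of stopping at it, so one large chunk does not crowd out smaller relevant ones. Whether prx truncates, skips, or summarizes in that case is not stated here.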
What makes prx different
It replaces the tools, it doesn't wrap them. Compression tools shell out to grep/cat and squeeze the output afterward. prx does the search, reading, and diffing itself — no subprocess, no re-parsing, no lossy post-processing.
It covers the whole loop, not just search. Retrieval-only tools still leave your agent to read, edit, diff, and run tests with the old noisy tools. prx handles search, structured reads, safe edits, semantic diffs, and parsed test/build output behind one consistent JSON envelope.
It has no runtime dependencies. One static binary, ~49 MB, no Python, no package manager, no network at runtime. It runs in containers and sandboxes as-is.
The semantic model is built in. A 32M-parameter retrieval-optimized embedding model (potion-retrieval-32M, stored as float16) is compiled directly into the binary. Semantic search runs on CPU in milliseconds — no model server, no vector database, no setup step.
It's fast. Indexing runs on all CPU cores in parallel (7.6x speedup on 10 cores). Embeddings are memory-mapped with zero-copy access — no heap allocation, no deserialization. A 50-query benchmark suite runs in 0.23 seconds.
Token savings
Measured across real agent sessions on production codebases. Run the numbers on your own repo with prx stats --compare and prx bench ..
| Feature | Scenario | Savings |
|---|---|---|
| read --if-changed (cache hit) | Re-reading an unchanged file | ~99% |
| read --mode diff | File with local changes | 98–99% |
| read --skeleton | Full file reduced to signatures | ~90% |
| run | Passing test suites | 95–99% |
| read --mode entropy | Generated / highly repetitive code | ~86% |
| search | vs grep + follow-up reads | ~35% |
Performance
Indexing: 7.6x parallel speedup
prx index builds a persistent search index (BM25, semantic embeddings, import graph, and symbol definitions) in a single parallel pass. Every indexing stage runs on all available CPU cores via rayon.
| Codebase | Files | Chunks | Time |
|---|---|---|---|
| Flask (Python, 15K LOC) | 259 | 1,225 | 0.3s |
| ripgrep (Rust, 25K LOC) | 239 | 2,465 | 0.6s |
| fastify (TypeScript, 15K LOC) | 417 | 2,529 | 0.6s |
| cargo (Rust, 150K LOC) | 2,815 | 12,118 | 5s |
| terraform (Go, 2M LOC) | 5,323 | 22,798 | 10s |
| django (Python, 300K LOC) | 5,690 | 30,944 | 32s |
| kafka (Java, 500K LOC) | 7,231 | 63,740 | 114s |
| vscode (TypeScript, 1M LOC) | 14,643 | 136,056 | 340s |
Measured on 10-core Apple Silicon with rayon parallelism (944% CPU utilization). On CI runners (4 cores), expect ~3-4x speedup over sequential. Incremental rebuilds skip unchanged files entirely.
Search: zero-copy memory-mapped embeddings
Embedding vectors are memory-mapped directly from disk via memmap2 and cast to &[f32] with zero allocation using bytemuck. The OS page cache keeps the index warm across queries — no heap allocation, no deserialization, no repeated file reads.
On an 11K-file codebase with 54 MB of embeddings, this means:
- Zero bytes allocated for embedding data (OS manages the pages)
- Queries after the first hit warm cache — sub-millisecond embedding access
- Falls back to owned allocation automatically if mmap isn't available (network FS, etc.)
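The borrow-or-copy behaviour described above can be sketched with the standard library alone. This is an illustration, not prx's code: it uses `slice::align_to` in place of memmap2 and bytemuck, takes a plain byte slice where prx would have a mapped file, and the function name `as_f32_slice` is hypothetical.

```rust
use std::borrow::Cow;

/// View a byte buffer as f32 values, borrowing when the layout allows it.
/// Hypothetical helper: prx is described as using memmap2 + bytemuck instead.
fn as_f32_slice(bytes: &[u8]) -> Cow<'_, [f32]> {
    // SAFETY: every 4-byte pattern is a valid f32, and align_to only places
    // correctly aligned, in-bounds elements in the middle slice.
    let (head, body, tail) = unsafe { bytes.align_to::<f32>() };
    if head.is_empty() && tail.is_empty() {
        // Zero-copy path: the buffer is aligned and a whole number of f32s.
        Cow::Borrowed(body)
    } else {
        // Fallback path: decode into an owned Vec (trailing partial bytes are dropped).
        Cow::Owned(
            bytes
                .chunks_exact(4)
                .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    }
}

fn main() {
    let values = [1.0f32, 2.5, -3.0];
    let bytes: Vec<u8> = values.iter().flat_map(|v| v.to_ne_bytes()).collect();

    let view = as_f32_slice(&bytes);
    let mode = match &view {
        Cow::Borrowed(_) => "borrowed (zero-copy)",
        Cow::Owned(_) => "owned (fallback copy)",
    };
    println!("{:?} via {}", &*view, mode);
}
```

Callers see the same `&[f32]` either way, so the search code does not need to know whether the zero-copy path or the fallback was taken.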
Benchmarking: 55x speedup with load-once
prx bench-ndcg measures search quality (NDCG@10) against labeled datasets. It loads the index once and runs all queries against cached data:
| Benchmark | Before (v0.5.5) | After (v0.5.6) | Speedup |
|---|---|---|---|
| 50-query NDCG suite | 12.76s | 0.23s | 55x |
Use --plain for human-readable output in the terminal.
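For readers unfamiliar with the metric, here is a minimal sketch of NDCG@k. It assumes linear gain (relevance divided by log2 of rank + 1), which is one common variant; the exact formula prx bench-ndcg uses is not stated here, and the function names are hypothetical.

```rust
/// Discounted cumulative gain over the first `k` relevance grades (best rank first).
fn dcg_at_k(relevances: &[f64], k: usize) -> f64 {
    relevances
        .iter()
        .take(k)
        .enumerate()
        // Rank is i + 1, so the discount is log2(rank + 1) = log2(i + 2).
        .map(|(i, rel)| rel / ((i as f64) + 2.0).log2())
        .sum()
}

/// NDCG@k: DCG of the returned ranking divided by the DCG of an ideal ranking.
/// `ranked` holds the grade of each returned result in order;
/// `labeled` holds the grades of every labeled item for the query.
fn ndcg_at_k(ranked: &[f64], labeled: &[f64], k: usize) -> f64 {
    let mut ideal = labeled.to_vec();
    ideal.sort_by(|a, b| b.total_cmp(a)); // highest grade first
    let ideal_dcg = dcg_at_k(&ideal, k);
    if ideal_dcg == 0.0 {
        0.0 // no relevant items labeled; define the score as 0
    } else {
        dcg_at_k(ranked, k) / ideal_dcg
    }
}

fn main() {
    // One relevant file exists; the search returned it in second place.
    let score = ndcg_at_k(&[0.0, 1.0, 0.0], &[1.0], 10);
    println!("NDCG@10 = {:.3}", score);
}
```

A score of 1.0 means the relevant items came back in the best possible order. Pushing a relevant result down the list lowers the score logarithmically.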
The commands agents actually orchestrate around
Most tools stop at "better grep." The two commands below are why prx is useful for agents working inside a tight context window — they answer questions that would otherwise take a dozen grep/cat calls to reconstruct.
prx context — understand a module in one call
Returns a single structured package for a directory: summary stats, doc/README content, entrypoints, per-file skeletons (signatures without bodies), and the import edges connecting the files. Instead of the agent running find, then cat README, then outline on each file, then chasing imports by hand, it gets the whole mental model of a module in one budgeted response — ideal for the "load just enough to start a task" step in an agent loop.
prx impact — know what breaks before you touch it
Reverse-dependency analysis built on prx's import graph: it answers "what depends on this file?" so an agent (or a human) can scope a refactor before making it. Edges are extracted from the AST (see How search works); when an import name is ambiguous across many files, resolution falls back to a directory-proximity heuristic and returns the most likely candidates rather than guessing blindly. Treat its output as a high-quality map, not a formal proof of completeness.
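The core idea of reverse-dependency analysis can be sketched as a graph inversion followed by a breadth-first walk. This is an illustration only: the edge list is hand-written rather than extracted from an AST, it skips the ambiguity heuristic described above, and `dependents_of` is a hypothetical name, not a prx API.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Given (importer, imported) edges, return every file that depends on
/// `target`, directly or transitively.
fn dependents_of(imports: &[(&str, &str)], target: &str) -> BTreeSet<String> {
    // Invert the graph: reverse[file] = files that import it directly.
    let mut reverse: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(importer, imported) in imports {
        reverse.entry(imported).or_default().push(importer);
    }

    // Breadth-first walk outward from the target over the reversed edges.
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([target]);
    while let Some(file) = queue.pop_front() {
        if let Some(importers) = reverse.get(file) {
            for &dep in importers {
                // `seen` doubles as the visited set, so import cycles terminate.
                if dep != target && seen.insert(dep.to_string()) {
                    queue.push_back(dep);
                }
            }
        }
    }
    seen
}

fn main() {
    let imports = [
        ("src/api.rs", "src/auth.rs"),
        ("src/auth.rs", "src/token.rs"),
        ("src/cli.rs", "src/token.rs"),
        ("src/log.rs", "src/fmt.rs"),
    ];
    for file in dependents_of(&imports, "src/token.rs") {
        println!("{file}");
    }
}
```

The result is only as complete as the edge list that feeds it, which is why the output is best read as a map rather than a proof.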
All commands
| Command | Replaces | What it does |
|---|---|---|
| prx search | grep, rg | Hybrid search: literal + semantic + structural. Ranked, token-budgeted. |
| prx read | cat, head, tail | Structured reading. --if-changed cache, --skeleton, --mode, --snap. |
| prx find | find, ls, tree | Codebase mapping. Tree or flat output, inline metadata, semantic scoring. |
| prx edit | sed, awk | Safe edits. Literal matching, dry-run by default, tree-sitter syntax validation. |
| prx diff | diff, git diff | Semantic diffs with function-level attribution and natural-language summaries. |
| prx run | — | Parsed test/build/lint output. 22 parsers; --auto-json for tools with structured output. |
| prx context | — | Module context package: stats, docs, entrypoints, skeletons, import edges. |
| prx impact | — | Reverse dependency analysis: what depends on a given file. |
| prx outline | ctags | Symbol table for a file or directory. |
| prx exists | grep -q | Fast bloom-filter existence check, near-zero tokens. |
| prx index | — | Parallel persistent index: 11K files in ~55s (7.6x speedup via rayon). |
| prx mcp | — | MCP server over stdio for direct agent integration. |
| prx batch | xargs | Parallel JSONL batch execution. |
| prx init | — | Detects agent frameworks and generates integration configs. |
| prx stats | — | Token-savings dashboard, with --compare. |
| prx bench | — | Side-by-side benchmark: prx vs grep+cat. |
| prx bench-ndcg | — | NDCG search quality benchmark against labeled datasets. |
17 commands total. Full reference with examples in the documentation site.
Quick start
# Search by meaning, not just text
# Get a module's whole shape in one call
# See what depends on a file before refactoring it
# File structure without the bodies (~10% of the tokens)
# Read just the function you need
# Skip re-reading a file that hasn't changed
# Safe edit with a preview before applying
# Run tests, get only failures and a summary
How search works
prx fuses three retrieval methods into one ranked result:
- Literal — regex matching at ripgrep speed.
- Semantic — the embedded potion-retrieval-32M Model2Vec model (PCA-reduced to 256 dims, float16); runs on CPU in milliseconds, no server.
- Structural — AST pattern matching via tree-sitter, e.g. fn $NAME($$$) { $$$ } to match all function definitions.
Results are combined with Reciprocal Rank Fusion and reranked through a multi-stage pipeline: definition boost, identifier-stem matching, file coherence, import-graph proximity (favoring files in the dependency neighborhood of strong hits), noise penalties, and saturation decay.
The import graph is extracted from the AST (tree-sitter) across 10 language families that have an import concept. Search quality is tracked with NDCG@10 on labeled datasets — see Search quality for the honest numbers and methodology.
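The fusion step named above, Reciprocal Rank Fusion, scores each result as the sum of 1 / (k + rank) over every list it appears in. The sketch below shows only that step, not the reranking pipeline. The constant k = 60 is the value commonly used with RRF and is an assumption here, as is the function name `rrf`.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists (best result first) with Reciprocal Rank Fusion.
/// Returns (document, fused score) pairs, highest score first.
fn rrf(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            // Ranks are 1-based, so position i contributes 1 / (k + i + 1).
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (k + (i as f64) + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(doc, score)| (doc.to_string(), score))
        .collect();
    // Highest score first; ties broken by name so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    let literal = vec!["handler.ts", "session.ts", "routes.ts"];
    let semantic = vec!["session.ts", "login.ts", "handler.ts"];
    let structural = vec!["session.ts"];
    for (doc, score) in rrf(&[literal, semantic, structural], 60.0) {
        println!("{doc}: {score:.4}");
    }
}
```

Because RRF uses only ranks, it can combine regex hits, embedding similarities, and AST matches without first normalizing their very different score scales.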
prx run — structured command output
Test runners emit thousands of tokens an agent doesn't need:
running 164 tests
test test_one ... ok
test test_two ... ok
[... 162 more lines ...]
test result: ok. 164 passed; 0 failed
prx run parses that and returns only the signal:
22 parsers cover Rust, Python, Go, JavaScript/TypeScript, Java, .NET, Docker, Terraform, kubectl, Maven, Gradle, npm, mypy, git, common coverage tools, and a generic fallback for unrecognized commands.
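As a rough illustration of this kind of parsing, the sketch below reduces cargo-style test output like the sample above to a small summary. It is not one of prx's parsers: the `TestSummary` struct, its fields, and the `summarize` function are hypothetical, and only the line shapes shown in the sample (plus a `... FAILED` line) are handled.

```rust
/// Condensed view of a test run. Field names are illustrative, not prx's schema.
#[derive(Debug, Default, PartialEq)]
struct TestSummary {
    passed: u32,
    failed: u32,
    failures: Vec<String>,
}

/// Reduce cargo-style test output to counts plus the names of failing tests.
fn summarize(output: &str) -> TestSummary {
    let mut summary = TestSummary::default();
    for line in output.lines().map(str::trim) {
        // Check the summary line first: it also starts with "test ".
        if let Some(rest) = line.strip_prefix("test result:") {
            // rest looks like " ok. 164 passed; 0 failed"
            for part in rest.split(';') {
                let words: Vec<&str> = part.split_whitespace().collect();
                // The last two words of each part are "<count> <label>".
                if let [.., count, label] = words.as_slice() {
                    if let Ok(n) = count.parse::<u32>() {
                        match *label {
                            "passed" => summary.passed = n,
                            "failed" => summary.failed = n,
                            _ => {} // ignored, measured, etc. are dropped
                        }
                    }
                }
            }
        } else if let Some(rest) = line.strip_prefix("test ") {
            // Passing tests are dropped; only failures are kept by name.
            if let Some(name) = rest.strip_suffix(" ... FAILED") {
                summary.failures.push(name.to_string());
            }
        }
    }
    summary
}

fn main() {
    let output = "running 2 tests\n\
                  test test_one ... ok\n\
                  test test_two ... FAILED\n\
                  test result: FAILED. 1 passed; 1 failed";
    println!("{:?}", summarize(output));
}
```

A passing run reduces to two counts and an empty failure list, which is where savings on passing suites come from.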
Agent integration
MCP server
Exposes prx over stdio to any MCP-compatible agent. (prx also works equally well as a plain CLI on PATH — see the tiers below.)
Config generation
Integration tiers
| Tier | How | Best for |
|---|---|---|
| CLI on PATH | prx search ... | Any agent, CI, scripts — the simplest and most portable path |
| MCP server | prx mcp | Agents that prefer structured tool calls mid-task |
| Agent definition | prx init --agent claude-code | A dedicated retrieval sub-agent |
For AI agents
If you're an agent evaluating prx, read skills/agents.md: installation, per-command examples, and measured token savings written for machine consumption.
Reliability
If an internal operation fails, prx falls back to the equivalent Unix command and returns results in the same JSON envelope, flagged so the caller can tell a fallback occurred. Errors are logged to ~/.prx/errors.jsonl. The intent is that prx never hard-breaks an agent's workflow — but because a fallback silently trades semantic search for plain matching, agents that depend on retrieval quality should check the flag rather than assume every result is a full-quality prx result.
Install
Prebuilt binary (recommended)
Download the binary for your platform from GitHub Releases:
| Platform | File |
|---|---|
| Linux x86_64 | prx-x86_64-unknown-linux-gnu.tar.gz |
| Linux aarch64 | prx-aarch64-unknown-linux-gnu.tar.gz |
| macOS Apple Silicon | prx-aarch64-apple-darwin.tar.gz |
| Windows x86_64 | prx-x86_64-pc-windows-msvc.zip |
# Example: Linux x86_64
The prebuilt binary already contains the embedded model — nothing else to install.
Build from source
Requirements: Rust ≥ 1.85, a C compiler (for tree-sitter grammars), and network access on first build (the build script downloads model weights automatically).
First build takes 1-2 minutes (model download + compilation). Subsequent builds are fast. The model weights are baked into the binary via include_bytes! — no downloads at runtime. Set PRX_MODELS_DIR to point to pre-downloaded weights for offline/air-gapped builds.
See the Contributing guide for the full developer setup.
Platform support
| Platform | Status |
|---|---|
| Linux x86_64 | Supported |
| Linux aarch64 | Supported |
| macOS Apple Silicon | Supported |
| Windows x86_64 | Supported |
Single static binary. No runtime dependencies. No network required after build.
Current status
| Commands | 17 |
| Tests | 442 unit + 80 E2E + 8 MCP |
| Run parsers | 22 (cargo, pytest, go, jest, eslint, tsc, kubectl, terraform, docker, + 13 more) |
| Languages (parsing) | 15 tree-sitter grammars |
| Import graph | 10 language families, tree-sitter AST extraction |
| Symbol index | Definition lookup + reference counting |
| Indexing | Parallel via rayon — 11K files in 54s on 10 cores (7.6x speedup). Zero-copy mmap embeddings. |
| Embedded model | potion-retrieval-32M (Model2Vec, float16, PCA→256 dims) |
| Release binary | ~49 MB |
| CI | GitHub Actions: Linux x86_64 / aarch64, macOS arm64, Windows |
See the Roadmap for what's planned next.
Search quality
NDCG@10 measured on 200 labeled queries across 8 public repositories (6 languages, 3 size tiers). All repos pinned by commit SHA. Ground truth in benchmarks/repos/. Methodology in docs/design/SEARCH-QUALITY.md.
| Repo | Language | Files | NDCG@10 | Symbol | Semantic |
|---|---|---|---|---|---|
| Flask | Python | 259 | 0.710 | 0.805 | 0.662 |
| ripgrep | Rust | 239 | 0.493 | 0.810 | 0.356 |
| fastify | TypeScript | 417 | 0.432 | 0.822 | 0.321 |
| cargo | Rust | 2,815 | 0.379 | 0.705 | 0.285 |
| kafka | Java | 7,231 | 0.354 | 0.934 | 0.191 |
| django | Python | 5,690 | 0.262 | 0.495 | 0.211 |
| terraform | Go | 5,323 | 0.287 | 0.238 | 0.319 |
| vscode | TypeScript | 14,643 | 0.208 | 0.639 | 0.080 |
Symbol search is strong on most repos (avg 0.681), though it drops on django (0.495) and terraform (0.238). Semantic search degrades at scale: the 32M embedded model works best on codebases under 3K files. For larger repos, code-specific model tiers are planned (see Roadmap).
These are honest numbers on codebases we didn't write and don't tune for.
Contributing
See the Contributing guide for setup, workflow, and how to add commands, languages, and run parsers.
License
Apache 2.0
Part of the Civitas ecosystem — open infrastructure for AI agent tooling.