ripgrepx (rgx)
Instant ripgrep for codebases you search over and over.
rgx is ripgrep's matcher fronted by a Russ Cox–style
trigram index — the candidate-index idea behind Google
Code Search and zoekt. The index narrows which files to
scan; ripgrep still does the matching, so results are byte-for-byte rg's, just faster. A stale
index can only cost a little speed, never a missed or invented match. It searches content (full
ripgrep regex) and locates files by name (find/fd-style), from the terminal or an AI agent over MCP.
Warm, rgx answers most queries in well under 60 ms where rg takes 100 ms to 2.5 s — a 15–50×
speedup on the kind of symbol searches a developer actually runs, up to 128× on the most
selective. See the benchmarks for the full numbers.
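The candidate-index idea can be sketched in a few lines. This is a minimal illustration, not rgx's implementation: the function names are hypothetical, it handles only literal patterns (a real index also derives trigrams from regexes), and Python's `re` stands in for ripgrep's matcher. It shows why the index can only over-approximate: candidates are narrowed by trigram, and the matcher still decides every hit.

```python
import re
from collections import defaultdict

def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(files):
    """Map each trigram to the set of file names whose content contains it."""
    index = defaultdict(set)
    for name, content in files.items():
        for gram in trigrams(content):
            index[gram].add(name)
    return index

def candidates(index, files, literal):
    """Files that could contain `literal`; every file if no trigram is usable."""
    grams = trigrams(literal)
    if not grams:  # fewer than 3 characters: the index cannot narrow anything
        return set(files)
    return set.intersection(*(index.get(g, set()) for g in grams))

def search(index, files, literal):
    """Narrow with the index, then let the matcher confirm each line."""
    pattern = re.compile(re.escape(literal))
    hits = []
    for name in sorted(candidates(index, files, literal)):
        for lineno, line in enumerate(files[name].splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, lineno, line))
    return hits

FILES = {
    "a.rs": "fn content_search() {}\nfn other() {}",
    "b.rs": "fn file_search() {}",
    "c.rs": "let x = 1;",
}
INDEX = build_index(FILES)
```

Here `candidates(INDEX, FILES, "content_search")` narrows to a single file, while a 2-character pattern such as `fn` falls back to scanning every file.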
Install
rgx is one self-contained ~3 MB binary — ripgrep's engine is linked in, so you do not need rg
installed. Pick whichever channel you prefer:
- curl | sh: prebuilt binary, no toolchain (macOS/Linux; re-run to update)
- npm: fetches the right prebuilt binary
- pipx (or pip): prebuilt wheel that bundles the binary
- Cargo: prebuilt via binstall, or compiled from source
Or download a prebuilt archive (Windows included) from the
latest release and put rgx on your PATH.
On Windows, use npm, pipx, Cargo, or the release .zip (x86_64-pc-windows-msvc /
aarch64-pc-windows-msvc).
For AI agents
rgx is built first for AI coding agents: fast, token-frugal code search an agent calls over MCP
or as a CLI. After installing the binary above, one command wires it into your agent:
install and uninstall print the exact changes and ask before touching anything (--yes skips
the prompt, and is required when stdin isn't a TTY; --dry-run only previews). install is
deliberately non-intrusive: it writes only where rgx owns the namespace (Claude's
skill dir, a Gemini extension), and for shared files it edits idempotently — a removable marked
block in AGENTS.md / copilot-instructions.md, or a merged "rgx" key in .cursor/mcp.json /
.vscode/mcp.json. It never blind-appends to a file you authored, and uninstall reverses it
exactly. MCP registration that belongs to a host's own CLI (claude/codex mcp add) is printed
for you to run, not executed.
| Agent | What it installs | Scope (--user / --project) |
|---|---|---|
| Claude Code | …/.claude/skills/rgx/SKILL.md + prints claude mcp add | user (default) or project |
| Codex | marked block in …/.codex/AGENTS.md + prints codex mcp add | user (default) or project |
| Gemini CLI | …/.gemini/extensions/rgx/ (manifest + context, carries MCP) | user (default) or project |
| Cursor | .cursor/rules/rgx.mdc + "rgx" in .cursor/mcp.json | project only |
| VS Code (Copilot) | "rgx" in .vscode/mcp.json + block in .github/copilot-instructions.md | project (default) or user |
Scope defaults to user-global for tools that support it, so a personal preference doesn't land in a
teammate's repo; pass --project to commit it, or --user to keep Cursor/VS Code out of the tree.
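The two idempotent edits to shared files (a removable marked block, and a merged "rgx" key) might look roughly like the sketch below. The marker strings and function names are hypothetical, not what rgx actually writes; the server entry mirrors the MCP config this README gives for unlisted agents. The sketch assumes the Markdown file ends with a newline and that the JSON file contains no comments.

```python
import json

BEGIN = "<!-- rgx:begin -->"  # hypothetical marker, not rgx's actual one
END = "<!-- rgx:end -->"      # hypothetical marker, not rgx's actual one
RGX_SERVER = {"command": "rgx", "args": ["--agent", "mcp"]}

def remove_block(text):
    """Delete the marked block if present; otherwise return text unchanged."""
    start = text.find(BEGIN)
    if start == -1:
        return text
    end = text.find(END, start)
    if end == -1:
        return text  # unterminated block: leave the file alone
    end += len(END)
    if text[end:end + 1] == "\n":
        end += 1
    return text[:start] + text[end:]

def install_block(text, body):
    """Insert or refresh the marked block; a second run changes nothing."""
    base = remove_block(text)
    if base and not base.endswith("\n"):
        base += "\n"
    return base + BEGIN + "\n" + body.rstrip("\n") + "\n" + END + "\n"

def merge_rgx(config_text, top_key="mcpServers"):
    """Add or refresh the "rgx" entry, leaving every other server untouched."""
    config = json.loads(config_text) if config_text.strip() else {}
    config.setdefault(top_key, {})["rgx"] = dict(RGX_SERVER)
    return json.dumps(config, indent=2) + "\n"

def unmerge_rgx(config_text, top_key="mcpServers"):
    """Remove only the "rgx" entry."""
    config = json.loads(config_text)
    config.get(top_key, {}).pop("rgx", None)
    return json.dumps(config, indent=2) + "\n"
```

Because installing always removes any existing block before writing a fresh one, and the JSON merge overwrites a single key, running either step twice yields the same file as running it once.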
For an agent not listed, rgx --agent skill prints the raw markdown and the MCP config is just
{ "mcpServers": { "rgx": { "command": "rgx", "args": ["--agent", "mcp"] } } } (VS Code uses the key
"servers" instead of "mcpServers").
Token savings (--compact)
Like rtk, rgx can compact search output to save agent tokens:
--compact groups matches by file (the path is printed once), pages the result behind an opaque
cursor, and trims very long lines around the match. Unlike a lossy filter, nothing is dropped —
the match set is exactly rg's, the header reports the full total so you know what you have not seen,
and because the index is warm, fetching the next page is cheap, so every match stays reachable.
[matches 1-50 of 142 in 18 files]
src/server.rs
210: fn content_search(...) -> Result<()> {
src/main.rs
168: fn content_cmd(args: &[String]) -> ExitCode {
next: rgx --compact --cursor '9d13ff881'
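As a rough illustration of the grouping shown above, the sketch below renders one page of matches with each path printed once. The function name and signature are hypothetical; it assumes the header's file count covers the whole result set rather than just the page, and it omits the trimming of long lines.

```python
def compact_page(matches, start=0, page_size=50):
    """Render one page of (path, line_number, text) matches, grouped by file."""
    page = matches[start:start + page_size]
    total_files = len({path for path, _, _ in matches})
    lines = [
        f"[matches {start + 1}-{start + len(page)} "
        f"of {len(matches)} in {total_files} files]"
    ]
    grouped = {}  # dicts keep insertion order, so files stay in match order
    for path, lineno, text in page:
        grouped.setdefault(path, []).append((lineno, text))
    for path, rows in grouped.items():
        lines.append(path)  # the path appears once per file
        for lineno, text in rows:
            lines.append(f"{lineno}: {text}")
    return "\n".join(lines)
```

The header always reports the full match total, so a reader of page one can tell how much remains unseen.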
The cursor records the entire query (pattern + every flag) plus a keyset resume position, so the next
page is always the same search — never a different one — and a result set that changed between pages
is flagged with a note: line. The token you echo back is a short id: the daemon parks the cursor for
a couple of minutes and hands you the id in its place. It's single-use; if it expires (or the daemon
was stopped) you get pagination expired — re-run the search.
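The parked-cursor behaviour can be sketched as a small in-memory store plus keyset paging. Everything here is an assumption for illustration: the class and function names are hypothetical, the 120-second lifetime stands in for "a couple of minutes", and the resume key is taken to be a (path, line) pair.

```python
import secrets
import time

class CursorStore:
    """Parks full cursors in memory and hands out short single-use ids."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._parked = {}

    def park(self, query, resume_after):
        """Store the whole query plus a keyset position; return a short id."""
        token = secrets.token_hex(5)
        self._parked[token] = (self._clock() + self._ttl, query, resume_after)
        return token

    def take(self, token):
        """Redeem an id once; an unknown, used, or expired id raises LookupError."""
        entry = self._parked.pop(token, None)
        if entry is None or entry[0] < self._clock():
            raise LookupError("pagination expired")
        return entry[1], entry[2]

def next_page(sorted_matches, resume_after, page_size):
    """Keyset paging: return matches strictly after the (path, line) key."""
    rest = [
        m for m in sorted_matches
        if resume_after is None or (m[0], m[1]) > resume_after
    ]
    return rest[:page_size]
```

Resuming from a key rather than a numeric offset means the next page starts after the last match already shown, even if earlier matches appeared or disappeared in between.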
MCP or CLI
- MCP: rgx --agent mcp exposes content_search (returns the --compact paged view by default; pass the response cursor to advance, or files_only / count to orient), file_search, and status. See docs/mcp.md.
- CLI: a near-drop-in for rg. rgx <pattern> takes the same command line and just runs faster. A bare rgx <pattern> is plain (accelerated) ripgrep; rgx --find <name> locates files; --server manages the daemon. See docs/cli.md.
State (index + daemon socket) lives outside the repo under $RGX_CACHE_DIR, else the config file's
cache_dir, else $XDG_CACHE_HOME/rgx, else ~/.cache/rgx — a rebuildable cache, safe to delete,
never written into the indexed tree.
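The lookup order for the cache directory can be written out as a short function. This is a sketch of the precedence described above, not rgx's code: the function name is hypothetical, and treating an empty environment variable as unset is an assumption.

```python
import os
from pathlib import Path

def resolve_cache_dir(env, config_cache_dir=None, home=None):
    """Pick the cache directory: env var, then config, then XDG, then ~/.cache."""
    if env.get("RGX_CACHE_DIR"):
        return Path(env["RGX_CACHE_DIR"])
    if config_cache_dir:
        return Path(config_cache_dir)
    if env.get("XDG_CACHE_HOME"):
        return Path(env["XDG_CACHE_HOME"]) / "rgx"
    return Path(home or Path.home()) / ".cache" / "rgx"
```

Calling `resolve_cache_dir(os.environ)` applies the same order to the real environment; passing a plain dict makes the precedence easy to check in isolation.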
Config
Optional TOML at $RGX_CONFIG, else $XDG_CONFIG_HOME/rgx/config.toml, else
~/.config/rgx/config.toml. A missing file is fine; a malformed or invalid one is an error.
# Base directory for the rebuildable cache (index + socket). $RGX_CACHE_DIR overrides this.
# Must be an absolute path (no ~ expansion).
cache_dir = "/var/tmp/rgx-cache"
# Persist the index only if the cold build took at least this long; below it the index stays
# RAM-only and is rebuilt on each daemon start. 0 always persists. Default 1000.
= 1000
# Exit the daemon after this many seconds with no search, freeing its RAM; the next search
# respawns it. Zero or negative stays resident forever. Default 3600.
= 3600
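The two thresholds above reduce to simple comparisons, sketched here with hypothetical function and parameter names. The persistence threshold is assumed to be in milliseconds; the idle timeout is in seconds, as stated.

```python
def should_persist(cold_build_ms, threshold_ms=1000):
    """Persist the index only if the cold build took at least threshold_ms.

    A threshold of 0 always persists, since every build time is >= 0.
    """
    return cold_build_ms >= threshold_ms

def should_exit(idle_seconds, idle_timeout_s=3600):
    """Exit once idle for idle_timeout_s; zero or negative means never exit."""
    if idle_timeout_s <= 0:
        return False
    return idle_seconds >= idle_timeout_s
```

Under these assumptions a cold build of about 1.5 s would be persisted with the default threshold, while a 400 ms build would stay RAM-only.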
Benchmarks
rgx (warm daemon, index resident) vs ripgrep 15.1.0 on four real repositories. Output is
byte-for-byte rg's, so this measures only how much less work the index lets ripgrep do.
| repo | files | index size | cold build |
|---|---|---|---|
| lucene | 7.4k | 22 MB | ~1.5 s |
| vscode | 15.1k | 46 MB | ~1.2 s |
| kubernetes | 30.2k | 53 MB | ~1.5 s |
| linux | 93.6k | 210 MB | ~7.4 s |
Real queries (the kind of symbol / error string / API name a developer actually searches for, drawn
from each project's own code and commit history), mean ± σ over 10 runs:
| repo | query | rg | rgx | speedup |
|---|---|---|---|---|
| lucene | CorruptIndexException | 101 ± 2 ms | 4.6 ± 0.2 ms | 22× |
| lucene | IndexWriter | 103 ± 1 ms | 17.8 ± 0.8 ms | 6× |
| lucene | TieredMergePolicy\|LogMergePolicy | 101 ± 1 ms | 6.3 ± 0.3 ms | 16× |
| vscode | TreeDataProvider | 198 ± 2 ms | 4.1 ± 0.1 ms | 48× |
| vscode | onDidChangeConfiguration | 201 ± 2 ms | 13.6 ± 0.3 ms | 15× |
| vscode | registerCommand | 200 ± 2 ms | 14.0 ± 0.2 ms | 14× |
| kubernetes | func (kl *Kubelet) | 409 ± 6 ms | 3.2 ± 0.2 ms | 128× |
| kubernetes | context deadline exceeded | 418 ± 7 ms | 5.7 ± 0.1 ms | 73× |
| kubernetes | EndpointSlice | 419 ± 9 ms | 8.4 ± 0.2 ms | 50× |
| kubernetes | metav1.ObjectMeta | 411 ± 10 ms | 29.9 ± 0.2 ms | 14× |
| linux | struct task_struct | 1803 ± 373 ms | 42.8 ± 1.0 ms | 42× |
| linux | kmalloc | 2308 ± 507 ms | 57.5 ± 1.4 ms | 40× |
| linux | EXPORT_SYMBOL_GPL | 1606 ± 56 ms | 54.0 ± 1.3 ms | 30× |
| linux | MODULE_LICENSE (broad) | 2518 ± 176 ms | 161.6 ± 1.8 ms | 16× |
The more selective the query, the bigger the win (a rare symbol touches few files; a func (kl *Kubelet) receiver hits 13 of 30k). rgx is also markedly more consistent: its σ stays sub-2 ms
while a full rg scan's swings with cache state (linux kmalloc: rg 2308 ± 507 ms vs rgx 57 ± 1 ms).
The full set (and the fallback rows below) is in bench/baseline.txt.
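The speedup column appears to be the ratio of the two means, rounded to the nearest integer. That reading is inferred from the table's own numbers rather than stated by the harness, and the helper name is hypothetical. A few rows recomputed:

```python
def speedup(rg_ms, rgx_ms):
    """Ratio of mean rg time to mean rgx time, rounded to a whole multiple."""
    return round(rg_ms / rgx_ms)

# (repo, query, rg mean ms, rgx mean ms), copied from the table above
ROWS = [
    ("lucene", "IndexWriter", 103, 17.8),
    ("vscode", "TreeDataProvider", 198, 4.1),
    ("kubernetes", "func (kl *Kubelet)", 409, 3.2),
    ("linux", "MODULE_LICENSE", 2518, 161.6),
]

for repo, query, rg_ms, rgx_ms in ROWS:
    print(f"{repo:<11} {query:<20} {speedup(rg_ms, rgx_ms)}x")
```

The variance in rg's timings is not reflected in this ratio, so rows with a large σ (the linux ones) carry more uncertainty than the single figure suggests.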
Honest caveat. A fallback query the index can't narrow — no usable trigram, e.g. \w+ or a
2-char pattern — is handled by an in-process pipelined scan and lands at parity with rg. The one
exception is a match-everything query like .* over the largest repo (printing all 1.5 GB), at
~0.8×: a degenerate "cat the repo", not a search. See
docs/index-and-storage.md §8 for why.
Methodology
- Machine: 12-core / 24 GB, macOS; ripgrep 15.1.0 (rg --version, recorded by the harness); timings via hyperfine (1 warmup, 10 runs, reported as mean ± σ), output discarded.
- rgx <pattern> <repo> (CLI talking to its warm daemon) vs rg -n <pattern> <repo>; both pipe to the same sink, so the comparison is apples-to-apples.
- Reproduce: RGX=target/release/rgx bench/bench.sh <repo> <pattern>... (the script prints the rg version, warms the daemon, benchmarks each pattern, and flags any regression). Numbers vary with hardware and cache state.
Documentation
- docs/design.md: mission, the index-in-front-of-ripgrep model, correctness contract, open questions.
- docs/cli.md: command surface and the --server gate.
- docs/mcp.md: the agent-facing MCP tools.
- docs/indexing.md: streaming index, freshness, incremental updates.
- docs/profiling.md: how to profile build/query (criterion, samply, dhat).
- docs/index-and-storage.md: trigram index design, storage engine choice, and benchmark results vs rg.
License
MIT — see LICENSE.