rgx-cli 0.1.1

A terminal regex debugger with real-time matching, capture group highlighting, and plain-English explanations
# The side project most likely to earn mass GitHub stars

**A terminal-native regex debugger TUI ("regex101 for the terminal") is the strongest candidate**, combining the largest untapped audience with multiple layers of AI replication resistance and a feasible build scope. It scores highest on the combined metric of star potential, AI resistance, and buildability, narrowly beating a smart terminal theme manager and a universal format converter. Below is the full analysis of all candidates (the original 8 ideas plus 6 new ones discovered through community research), scored and ranked.

## Why most of the original 8 ideas fail the moat test

The original candidate list contains five "LLM wrapper" tools: the token counter, repo explainer, AI log analyzer, git commit generator, and, implicitly, the AI-assisted variant of the config diff tool. These share a fatal weakness: **their core logic is "pipe data to an LLM API."** A developer with Cursor can replicate any of them in 15–30 minutes. The git commit message generator is the worst offender: Andrej Karpathy demonstrated it as a shell one-liner, and the space already has **aicommits (7.8k stars)** and **opencommit (6.5k stars)**. The cross-platform port manager is equally doomed: **fkill-cli already has 6.9k stars**, and the underlying logic is just `lsof` parsing plus `kill`. The env var validator competes with **t3-env (3.7k stars)** and is little more than schema-validation boilerplate.

Only one original idea has genuine AI resistance: the **semantic config diff tool**, which requires tree-matching algorithms, edit-distance computation, and multi-format parsers. However, **graphtage (2.3k stars, DARPA-funded)** already occupies this exact niche, and the build complexity is punishing — rated 8/10, making it a risky side project.

The repo explainer CLI deserves a partial pass on star potential — **Repomix's 21.7k stars** prove adjacent demand — but its moat is tissue-thin. The "AI log analyzer" has moderate potential but competes with **lnav (9.6k stars)** and **angle-grinder (3.5k stars)** in the non-AI space, and the AI version is trivially replicable.

## Top 7 candidates ranked by combined score

Each candidate is scored on star potential (1–10), AI replication resistance (1–10), and build complexity (1–10, lower = easier). The combined score weights stars and AI resistance equally while penalizing excessive build difficulty.

| Rank | Project | Star potential | AI resistance | Build complexity | Combined | Moat category |
|------|---------|:----:|:---------:|:-----:|:--------:|---------------|
| 1 | Terminal regex debugger TUI | 8 | 7 | 6 | ★★★★★ | UX polish + format expertise + composition |
| 2 | Smart terminal theme manager | 7 | 7 | 6 | ★★★★☆ | Curated data + system integration |
| 3 | Universal format converter CLI | 7 | 6 | 5 | ★★★★☆ | Format expertise + edge-case depth |
| 4 | Smart universal file previewer | 7 | 8 | 7 | ★★★☆☆ | Format breadth + rendering + composition |
| 5 | Semantic config diff tool | 6 | 8 | 8 | ★★★☆☆ | Handcrafted algorithms + format expertise |
| 6 | Cross-shell command translator | 6 | 9 | 9 | ★★☆☆☆ | Deep format/protocol expertise |
| 7 | Repo explainer CLI | 7 | 3 | 4 | ★★☆☆☆ | Prompting polish only (weak moat) |

## #1: Terminal regex debugger TUI — the clear winner

**One-liner:** "regex101.com, but in your terminal." The concept is dead simple. The execution is not.

**Why stars are almost guaranteed.** regex101.com handles millions of developer visits monthly, yet **no terminal-native equivalent exists with more than 200 GitHub stars**. The closest tools — `rexi`, `regex-tui`, `retest` — are all sub-200-star weekend projects missing critical features. Meanwhile, **grex (7.2k stars)** proves massive demand for regex tooling in the terminal, though it solves a different problem (generating regex from examples). The target audience is essentially every developer who uses a terminal and writes regex — a near-universal demographic. The demo is inherently visual and shareable: colored capture groups, real-time matching, plain-English explanations.

**What makes this brutally hard for AI to replicate in an hour:**

The gap between "basic regex tester" and "regex101-quality experience" is enormous and spans multiple moat categories simultaneously:

- **Category 4 — UX/rendering polish:** Character-by-character match highlighting with nested capture groups (each group a distinct color), real-time updates on every keystroke, smooth scrolling through large inputs, and intuitive keybindings. This is hundreds of hours of TUI iteration. The difference between "it works" and "it feels great" is the entire product.
- **Category 7 — Deep format expertise:** Supporting multiple regex engines (PCRE2, Python `re`, Go `regexp`, Rust `regex`, JavaScript) with their genuinely different behaviors around lookaheads, backreferences, Unicode properties, and named groups. An AI agent would need to know that `\b` means different things in different engines, that JavaScript lacks lookbehinds in older specs, that Rust's `regex` crate doesn't support backreferences at all.
- **Category 1 — Handcrafted algorithm:** The plain-English explanation engine requires walking a regex AST and generating natural language descriptions of each node. This is a recursive descent problem with dozens of construct types (character classes, quantifiers, alternations, assertions, groups, Unicode categories). Getting the explanations both accurate and readable requires iterative tuning.
- **Category 5 — Emergent complexity from composition:** The tool integrates regex parsing, AST construction, multi-engine execution, terminal rendering, real-time input handling, and explanation generation. Each piece is tractable alone; the integration — keeping everything responsive at keystroke speed — is where the difficulty compounds.
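The nested capture-group highlighting described in the list above can be sketched in a few lines. This is a minimal illustration using Python's standard `re` module, not the Rust/`ratatui` implementation the plan calls for; the names `group_owner`, `highlight`, and the colour table are hypothetical, and only the first match is handled.

```python
import re

# ANSI background colours, cycled by group number (an illustrative choice).
COLORS = ["\x1b[41m", "\x1b[42m", "\x1b[44m", "\x1b[45m"]
RESET = "\x1b[0m"


def group_owner(pattern: str, text: str) -> list[int]:
    """For each character, return the highest-numbered group covering it.

    -1 means "not matched", 0 means "inside the overall match only".
    Groups are numbered by opening parenthesis, so a nested group always has
    a higher number than its parent and ends up painted on top of it.
    """
    owner = [-1] * len(text)
    m = re.search(pattern, text)
    if m is None:
        return owner
    for g in range(m.re.groups + 1):
        start, end = m.span(g)
        if start == -1:  # this group did not participate in the match
            continue
        for i in range(start, end):
            owner[i] = g
    return owner


def highlight(pattern: str, text: str) -> str:
    """Wrap every matched character in the colour of its owning group."""
    out = []
    for ch, g in zip(text, group_owner(pattern, text)):
        out.append(ch if g < 0 else f"{COLORS[g % len(COLORS)]}{ch}{RESET}")
    return "".join(out)


if __name__ == "__main__":
    print(highlight(r"(\d{4})-((\d{2})-(\d{2}))", "on 2024-05-17 ok"))
```

Even this toy version hints at where the polish goes: a real TUI would need to handle every match, overlapping styles, wide characters, and redraws on each keystroke.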

**Concrete build plan:** Write in Rust using `ratatui` for the TUI. Start with Rust's `regex` crate engine, add PCRE2 via FFI. Build the AST explainer by walking `regex-syntax`'s parse tree. Ship a v0.1 with single-engine support and explanation, then expand to multi-engine comparison. Target **3–6 months** for a polished v1.0. The narrow scope (it's still "just" regex) keeps it feasible as a side project despite the depth.
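To give a feel for the explainer step, here is a hedged sketch of "walk the tree, emit English". It does not use `regex-syntax`; instead it assumes a hand-rolled recursive descent parser for a deliberately tiny subset (literals, `.`, `\d`, `\w`, `\s`, `*`, `+`, `?`, groups, alternation). `Parser`, `explain`, and `describe` are hypothetical names chosen for this example.

```python
ESCAPES = {"d": "a digit", "w": "a word character", "s": "a whitespace character"}
QUANTIFIERS = {"*": "zero or more times", "+": "one or more times", "?": "optionally"}


class Parser:
    """Recursive descent parser producing nested tuples as a toy AST."""

    def __init__(self, pattern):
        self.p, self.i, self.groups = pattern, 0, 0

    def peek(self):
        return self.p[self.i] if self.i < len(self.p) else None

    def take(self):
        ch = self.p[self.i]
        self.i += 1
        return ch

    def alternation(self):
        branches = [self.sequence()]
        while self.peek() == "|":
            self.take()
            branches.append(self.sequence())
        return branches[0] if len(branches) == 1 else ("alt", branches)

    def sequence(self):
        items = []
        while self.peek() not in (None, "|", ")"):
            node = self.atom()
            if self.peek() in QUANTIFIERS:
                node = ("repeat", self.take(), node)
            items.append(node)
        return ("seq", items)

    def atom(self):
        ch = self.take()
        if ch == "(":
            self.groups += 1
            number = self.groups  # groups are numbered by opening parenthesis
            inner = self.alternation()
            if self.peek() != ")":
                raise ValueError("unbalanced parenthesis")
            self.take()
            return ("group", number, inner)
        if ch == "\\":
            if self.peek() is None:
                raise ValueError("trailing backslash")
            esc = self.take()
            return ("class", ESCAPES[esc]) if esc in ESCAPES else ("lit", esc)
        return ("class", "any character") if ch == "." else ("lit", ch)


def explain(node, depth=0):
    """Return indented plain-English lines for one AST node."""
    pad, kind = "  " * depth, node[0]
    if kind == "lit":
        return [f"{pad}the literal {node[1]!r}"]
    if kind == "class":
        return [f"{pad}{node[1]}"]
    if kind == "repeat":
        return [f"{pad}{QUANTIFIERS[node[1]]}:"] + explain(node[2], depth + 1)
    if kind == "group":
        return [f"{pad}capture group {node[1]}:"] + explain(node[2], depth + 1)
    if kind == "alt":
        lines = [f"{pad}either:"]
        for n, branch in enumerate(node[1]):
            if n:
                lines.append(f"{pad}or:")
            lines += explain(branch, depth + 1)
        return lines
    return [line for item in node[1] for line in explain(item, depth)]  # "seq"


def describe(pattern):
    parser = Parser(pattern)
    tree = parser.alternation()
    if parser.i != len(pattern):
        raise ValueError("unbalanced parenthesis")
    return explain(tree)


if __name__ == "__main__":
    print("\n".join(describe(r"(\d+)-a?")))
```

The subset here covers a handful of constructs. The "dozens of construct types" mentioned earlier (character classes, counted repetition, assertions, Unicode categories) are where the accuracy and readability tuning would actually be spent.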

## #2: Smart terminal theme manager fills a surprising gap

**One-liner:** "Change one theme, and bat, delta, lazygit, starship, fzf, tmux, vim, and your terminal emulator all update."

The pain is acute for anyone who customizes their terminal. Every tool — **bat, delta, eza, lazygit, starship, fzf, tmux, neovim, alacritty, kitty, wezterm** — has its own config format, theme file format, and reload mechanism. Switching from Catppuccin Mocha to Tokyo Night means editing 8–12 config files manually. **Tinty (350–500 stars)** exists but is immature, limited to Base16 templates, and has known bugs.

The moat is **curated configuration knowledge**. Supporting N tools × M themes requires understanding each tool's config syntax, theme file location, color variable names, and hot-reload mechanism. This is exactly Category 2 (hard-to-curate data) plus Category 3 (system-level integration). An AI agent can generate a config snippet for one tool, but orchestrating live theme switching across a dozen tools with their idiosyncratic reload behaviors — `tmux source-file`, `nvim` Lua commands, `kitty` socket commands, `alacritty` TOML live-reload — requires deep, tested knowledge that accumulates over months. The moat grows with every tool added.
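The "one palette, many config formats" idea can be sketched as a table of per-tool renderers. This is an illustration under assumptions: the palette values are placeholders, the function names are hypothetical, and the emitted snippets reflect a best understanding of each tool's colour syntax that should be checked against the tool's own documentation. Reloading each tool, which the paragraph above identifies as the hard part, is deliberately left out.

```python
# Placeholder palette; not an official export of any published theme.
PALETTE = {"background": "#1e1e2e", "foreground": "#cdd6f4", "accent": "#f38ba8"}


def render_kitty(p):
    # kitty.conf style: "<key> <value>" pairs, one per line (assumed syntax).
    return f"foreground {p['foreground']}\nbackground {p['background']}\n"


def render_alacritty(p):
    # Alacritty TOML style: colours under a [colors.primary] table (assumed syntax).
    return (
        "[colors.primary]\n"
        f'background = "{p["background"]}"\n'
        f'foreground = "{p["foreground"]}"\n'
    )


def render_fzf(p):
    # fzf takes colours as a single --color option (assumed key names: fg, bg, hl).
    return f"--color=fg:{p['foreground']},bg:{p['background']},hl:{p['accent']}"


# One entry per supported tool; the catalogue is what grows over time.
RENDERERS = {"kitty": render_kitty, "alacritty": render_alacritty, "fzf": render_fzf}


def render_all(palette):
    """Return {tool name: config snippet} for every registered renderer."""
    return {tool: render(palette) for tool, render in RENDERERS.items()}


if __name__ == "__main__":
    for tool, snippet in render_all(PALETTE).items():
        print(f"# {tool}\n{snippet}\n")
```

Each renderer is trivial on its own. The accumulated value lies in the number of entries in `RENDERERS` and in knowing where each snippet must be written and how each tool is told to reload it.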

**Star ceiling: 5–15k.** The r/unixporn community alone has **1.2M+ subscribers**, and theme management is a perennial topic. The demo is inherently visual — a 10-second GIF showing a terminal transforming from one theme to another across every visible tool would be irresistible.

## #3: Universal format converter has a quiet, compounding moat

**One-liner:** `conv input.yaml -o output.toml` — convert between JSON, YAML, TOML, XML, CSV, INI, HCL, and .env.

No single tool covers all 8 formats well. **yj (~1k stars)** handles only 4. **dasel (~5k stars)** focuses on querying, not conversion. **yq (~12k stars)** is YAML-first. The concept is dead simple, the audience is universal, and the edge cases are where the moat lives: **YAML anchors and aliases** have no equivalent in TOML, **TOML datetime types** collapse to strings in JSON, **XML attributes vs. elements** create ambiguous mappings, **CSV lacks nesting** entirely, and **HCL's expression syntax** is its own beast. Round-trip fidelity and comment preservation across formats require deep format expertise (Category 7). An AI agent could build a basic 3-format converter in an hour but would fail on the long tail of format interactions.
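One of those edge cases, "CSV lacks nesting", can be made concrete. The sketch below flattens nested JSON into dotted column names before writing CSV. The dotted-path convention and the names `flatten` and `json_records_to_csv` are assumptions made for this example, one of several possible mappings rather than a standard.

```python
import csv
import io
import json


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into {dotted.path: scalar}.

    Lossy by design: empty dicts and lists vanish, and a key that already
    contains a dot becomes ambiguous. A real converter must decide how to
    handle both cases.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


def json_records_to_csv(text):
    """Convert a JSON array of objects into CSV with one column per path."""
    records = [flatten(record) for record in json.loads(text)]
    columns = sorted({key for record in records for key in record})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # paths missing from a record become empty cells
    return buf.getvalue()


if __name__ == "__main__":
    print(json_records_to_csv('[{"a": {"b": 1}}, {"a": {"c": 2}}]'))
```

The happy path fits in thirty lines. Round-tripping the CSV back to the original nesting, with types and empty containers intact, is where the long tail begins.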

Build complexity is the lowest of the top 5 — a polished v1.0 in Rust using existing parser crates (`serde_yaml`, `toml`, `serde_json`, `quick-xml`) is achievable in **2–3 months**. Star ceiling: **5–10k**, driven by daily utility rather than flashy demos.

## #4 and #5 offer high AI resistance but harder builds

The **smart universal file previewer** ("preview any file in the terminal") scores among the highest on raw AI resistance (**8/10**, behind only the cross-shell command translator's 9 in the ranking table) because it requires handling 30+ file formats: images via sixel/kitty protocol, PDFs via text extraction, archives via listing, binaries via hex + structural annotation, code via syntax highlighting, data files via formatted tables. Each format handler is its own mini-project, and the breadth is the moat. Existing tools are fragmented: **pistol** and **ctpv** are file-manager-specific, **bat** only does code, and **chafa** only does images. A standalone `preview` command doesn't exist. Build complexity is **7/10**, making it a longer commitment.
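The dispatch at the heart of such a previewer can be sketched as "sniff the leading bytes, then pick a handler". The following is a minimal illustration with a four-entry signature table and a hex-dump fallback. The function names and the kind labels are hypothetical, and the real handlers (image rendering, PDF text extraction, archive listing) are stubbed out.

```python
# A few well-known file signatures; a real tool would carry far more.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "archive/zip"),
    (b"\x1f\x8b", "archive/gzip"),
]


def sniff(head: bytes) -> str:
    """Classify a file by its leading bytes, falling back to text or binary."""
    for magic, kind in MAGIC:
        if head.startswith(magic):
            return kind
    try:
        # Caveat: a multi-byte character cut off at the end of `head`
        # would be misreported as binary in this simple version.
        head.decode("utf-8")
        return "text"
    except UnicodeDecodeError:
        return "binary"


def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex columns, and printable ASCII."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def preview(data: bytes) -> str:
    """Dispatch to a handler; only text and binary are implemented here."""
    kind = sniff(data[:64])
    if kind == "text":
        return data.decode("utf-8", errors="replace")
    if kind == "binary":
        return hexdump(data[:256])
    return f"[{kind}, {len(data)} bytes: handler not implemented in this sketch]"


if __name__ == "__main__":
    print(preview(b"\xff\xfe\x00binary blob"))
```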

The **semantic config diff tool** has the deepest algorithmic moat (**8/10 AI resistance**): proper tree matching combines Levenshtein edit distance (computed with the Wagner-Fischer algorithm) with minimum-weight bipartite matching. **Graphtage** proved the concept works but is academic and slow. A faster, simpler alternative supporting TOML, HCL, and Terraform (formats graphtage doesn't cover) could capture the DevOps audience. But the **8/10 build complexity**, roughly equivalent to graphtage's DARPA-funded effort, makes this risky for a solo side project.
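For reference, the edit-distance building block is small. Below is a sketch of the Wagner-Fischer dynamic program for Levenshtein distance, written to accept any pair of sequences so it works on strings as well as lists of config keys. The function name is hypothetical, and the tree matching and bipartite matching layered on top are not shown.

```python
def levenshtein(a, b) -> int:
    """Wagner-Fischer: minimum insertions, deletions, and substitutions
    needed to turn sequence `a` into sequence `b`.

    Keeps only the previous row of the DP table, so memory is O(len(b)).
    """
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]  # turning a[:i] into the empty sequence costs i deletions
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(levenshtein(["host", "port"], ["host", "tls", "port"]))
```

The primitive itself is textbook material. The difficulty cited above comes from applying it recursively to trees and choosing which children to pair up, which is where the bipartite matching enters.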

## What separates the winners from the also-rans

Three patterns emerged from analyzing every CLI tool that crossed **10k GitHub stars** in 2023–2025:

**Pattern 1: The moat is in the long tail, not the happy path.** ripgrep's value isn't "search files for a pattern" — it's the Unicode-aware regex engine, SIMD optimization, smart `.gitignore` handling, and thousands of edge cases fixed over years. fzf's value isn't "fuzzy match a list" — it's the scoring algorithm tuned across millions of real inputs, the preview system, and shell integration across bash/zsh/fish/nushell. The regex debugger follows this pattern: the happy path (match regex against string) is trivial; the long tail (multi-engine differences, Unicode properties, nested capture groups, performance at scale) is the product.
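A small, verifiable instance of that long tail: in Python's `re`, the meaning of `\d` depends on a flag. The sketch below compares the default Unicode behaviour for `str` patterns with `re.ASCII`. It makes no claim about other engines' defaults; the helper name `digit_report` is hypothetical.

```python
import re

# The same pattern compiled twice, with and without ASCII-only classes.
UNICODE_DIGIT = re.compile(r"\d")
ASCII_DIGIT = re.compile(r"\d", re.ASCII)


def digit_report(chars):
    """Map each character to (matches by default, matches under re.ASCII)."""
    return {
        ch: (bool(UNICODE_DIGIT.fullmatch(ch)), bool(ASCII_DIGIT.fullmatch(ch)))
        for ch in chars
    }


if __name__ == "__main__":
    # "7" is an ASCII digit; "\u0663" is ARABIC-INDIC DIGIT THREE.
    for ch, (default, ascii_only) in digit_report(["7", "\u0663", "x"]).items():
        print(f"U+{ord(ch):04X}: default={default} ascii={ascii_only}")
```

A debugger that shows this kind of difference side by side, per engine and per flag, is doing the work that the happy-path "match regex against string" demo never reaches.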

**Pattern 2: The best tools replace muscle memory.** `bat` replaced `cat`. `fd` replaced `find`. `eza` replaced `ls`. The regex debugger would replace "open browser → navigate to regex101.com → paste pattern → paste test string." That workflow interruption — leaving the terminal, context-switching to a browser — is the pain point. A TUI that stays in the terminal and pipes results directly to other commands captures a **workflow slot**, not just a feature.

**Pattern 3: Curated breadth compounds over time.** tldr-pages (55k stars) is valuable because of 5,000+ community-contributed pages, not because of the rendering code. The terminal theme manager gains value with every new tool config it supports. The file previewer gains value with every new format handler. This "curated breadth" moat is the hardest for AI to replicate because it requires real-world testing across hundreds of configurations — not something a single prompt can generate.

## Conclusion

The terminal regex debugger TUI wins because it sits at a rare intersection: **massive audience** (every developer), **zero serious competition** (nothing above 200 stars), **inherently visual demos** (GIFs of colored capture groups sell themselves), and **multi-layered AI resistance** spanning UX polish, format expertise, and compositional complexity. It looks like a weekend project but requires months of tuning to get right — exactly the profile that earns stars and resists commoditization.

The terminal theme manager is the strongest runner-up, with a curated-data moat that grows over time. The universal format converter is the safest bet — lowest build complexity, clearest value proposition, and edge-case depth that compounds. Any of the top 3 could realistically reach **5–15k stars** within a year of launch. The bottom line: **build something that looks like a one-liner to describe but requires a thousand edge cases to get right — and make sure those edge cases can't be generated by a single AI prompt.**