# ๐ก๏ธ ARGUS -- The verification layer for AI-generated code
> **AI generates code at near-zero cost. Human review didn't get faster. The bottleneck inverted: it's no longer generation -- it's verification.**
>
> ARGUS is the verification infrastructure. **15 Rust crates, 4 specialists, an audit chain that's BLAKE3-hash-chained and Ed25519-signed -- EU AI Act Art. 12 Level 2 ready by default.** MIT licensed. BYOK. Zero SaaS lock-in.
[](https://github.com/SuarezPM/apohara-argus/actions/workflows/ci.yml)
[](https://github.com/SuarezPM/apohara-argus/actions/workflows/aislop.yml)






[](https://scorecard.dev/viewer/?uri=github.com/SuarezPM/apohara-argus)
[](https://www.bestpractices.dev/projects/13242)
[Install](#-install) ยท [Quickstart](#-quickstart) ยท [Why](#-why-this-exists) ยท [What](#-what-argus-is) ยท [Numbers](#-the-numbers) ยท [Pricing](docs/pricing.md) ยท [Security](SECURITY.md)
---
## The problem is here. Now.
**Open Source is dying in 2026.** La confianza comunitaria se ahoga ante un +206% de scripts de Bash en proyectos AIยน, revisiones de PRs **4.6ร mรกs lentas**ยฒ y **15-18% mรกs de vulnerabilidades**ยฒ. Con **42% del cรณdigo commiteado hoy siendo AI-generated o AI-assisted**ยณ y el **96% de los devs desconfiando de รฉl**ยณ, el AI slop -- *Palabra del Aรฑo 2025*โด -- ha forzado medidas extremas:
| Project | Response | Date |
|---|---|---|
| ๐ **Ladybird** (browser) | Cerrรณ sus PRs pรบblicas. *"We will no longer accept public pull requests."* | Jun 2026โต |
| ๐จ **tldraw** (whiteboard) | Auto-close de PRs externas. *"Open source contribution has always been a gift economy held together by proof of work. AI has changed that."* | Jan 2026โถ |
| ๐ฎ **RPCS3** (PS3 emulator) | Tuvo que **revertir mรบltiples PRs AI que causaron regresiones en producciรณn**. | May 2026โท |
| ๐ **cURL** (web infrastructure) | **Cancelรณ su bug bounty** porque 19 de cada 20 reportes eran alucinaciones sintรฉticas. | Jan 2026โธ |
**Fuentes:** ยน[GitHub Octoverse 2025](https://github.blog/ai-and-ml/generative-ai/how-ai-is-reshaping-developer-choice-and-octoverse-data-proves-it/) ยท ยฒ[Opsera 2026 AI Coding Impact Report](https://opsera.ai/resources/report/ai-coding-impact-2026-benchmark-report/) ยท ยณ[Sonar State of Code Developer Survey 2026](https://www.sonarsource.com/blog/state-of-code-developer-survey-report-the-current-reality-of-ai-coding) ยท โด[Merriam-Webster Word of the Year 2025](https://www.globenewswire.com/news-release/2025/12/15/3205236/0/en/Merriam-Webster-Announces-Slop-as-the-2025-Word-of-the-Year.html) ยท โต[Ladybird blog](https://linuxiac.com/ladybird-browser-closes-public-pull-requests-ahead-of-first-alpha/) ยท โถ[tldraw issue #7695](https://github.com/tldraw/tldraw/issues/7695) ยท โท[RPCS3 commit c0b3580](https://github.com/RPCS3/rpcs3/commit/c0b358003f813e28d7902cd65251c3506847619a) ยท โธ[Daniel Stenberg, "The end of the curl bug-bounty"](https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-bug-bounty/)
> ๐ค *AI slop is a tragedy of the commons* ([arXiv:2603.27249](https://arxiv.org/abs/2603.27249)): individual productivity gains externalize costs onto reviewers and maintainers. **The bottleneck isn't generation. It's verification.**
---
## ๐ก What ARGUS is
ARGUS = **AI Review & Governance for Undermining Slop** -- the trust layer for AI-generated code.
**One product. Three layers. Four specialists. One signed certificate per analysis.**
Built for **engineering managers, OSS maintainers, and CISOs** who need an audit-grade, EU AI Act-ready answer to the verification bottleneck. Pure Rust (15 crates, zero Python, zero Node.js in production). BYOK (your NVIDIA NIM key, never persisted). MIT licensed.
---
## ๐ก๏ธ The 3 layers (one worker each)
| Worker | When it runs | What it does | Latency |
|---|---|---|---|
| **Aegis Guard** | Pre-commit / pre-push | Hybrid scan on the staged diff: deterministic AST pre-flight (5 SLOP rules, regex, <100ms) + LLM semantic. Blocks critical issues. | <2s |
| **Aegis Verify** | PR review (webhook or one-shot) | 4 specialists in parallel via Tokio `join!` + **CordonEnforcer** (synthesizer never sees raw code). Emits a `fix_plan.json` for downstream coding agents. | 4-8s |
| **Aegis Lens** | Weekly digest | Aggregates findings across an org, ranks top offenders, generates an executive briefing (text + optional HeyGen video). | 5-15s |
---
## ๐ค The 4 specialists (run in parallel inside Verify)
| Specialist | Prompt | What it catches | Hybrid? |
|---|---|---|---|
| **Aegis Slop** | `slop-detector` | Narrative comments, swallowed errors, oversized fns (>80 LOC), `.unwrap()` outside tests, TODO stubs, unused `pub fn` | โ
regex + LLM |
| **Aegis Security** | `redteam-security` | Hardcoded credentials, injection, unsafe panic, unhandled errors, OWASP Top 10 | LLM |
| **Aegis Arch** | `architecture-fit` | Repo coherence, pattern matching, idiom detection, separation of concerns | LLM |
| **Aegis Verdict** | `verdict-synthesizer` | Synthesizes the 3 above into Approved/ReviewRequired/Halted + `FixPlan` | LLM |
> **CordonEnforcer is the moat:** the verdict synthesizer in the pipeline **never sees raw code**. It only sees the structured outputs of the other three specialists. No competitor (CodeRabbit, Greptile, Qodo) has this constraint.
---
## โจ The 7 things that make ARGUS different
### 1. Hybrid detection -- cheap + deep
```
SLOP-001 oversized fn (size) โโบ regex < 1ms catches 40-60% of slop
SLOP-002 swallowed error arm โโบ regex < 1ms
SLOP-003 TODO stub โโบ regex < 1ms
SLOP-004 unwrap/expect outside tests โโบ regex < 1ms
SLOP-005 unused pub fn โโบ regex < 1ms
+ semantic reasoning โโบ LLM 2-4s catches the rest
```
No competitor has this combination. The result: **60-80% LLM cost reduction** on typical PRs. **Measured: P=1.000, R=0.818, F1=0.900 on 40-PR benchmark** ([BENCHMARK.md](docs/BENCHMARK.md)).
### 2. EU AI Act Article 12 Level 2 ready by default
The 16-field `AuditEvent` is automatically emitted on every LLM call:
```json
{
"audit_id": "...",
"timestamp": "2026-06-12T19:00:00Z",
"model_id": "deepseek-ai/deepseek-v4-flash",
"prompt_template_version": "abc123",
"prompt_fingerprint": "BLAKE3 hex (GDPR-safe)",
"response_fingerprint": "BLAKE3 hex",
"data_class": "source_code",
"policy_version": "verify-worker-v1-policy",
"decision": { "verdict": "warn", "findings_count": 2, "rationale": "..." },
"prev_hash": "...", "signature": "Ed25519 hex"
}
```
Verifiable: `curl /audit/export?from=2026-01-01&to=2026-12-31` returns NDJSON with a BLAKE3 manifest footer. **No cleartext prompts, ever.** GDPR derivative-liability-safe by construction. Enforcement starts **Aug 2, 2026** -- 51 days from this README.
### 3. MCP server for Claude Code / Codex / Cursor
```json
// ~/.config/claude-code/mcp.json
{
"mcpServers": {
"argus": {
"command": "argus-mcp",
"env": { "ARGUS_NIM_KEY": "nvapi-..." }
}
}
}
```
Four tools land in your agent's toolbox:
- `aegis_slop` โ AI slop signals
- `aegis_security` โ adversarial review
- `aegis_arch` โ architectural fit score
- `aegis_verdict` โ final verdict + FixPlan
Your coding agent now has ARGUS on tap. It can run a slop check, a security check, and a verdict on its own draft PR -- **automatically, before it ever asks for human review**.
### 4. A2A AgentCards -- discoverable to Google's open protocol
```
GET /.well-known/agent-card.json
GET /a2a/message
```
Opt-in via `ARGUS_A2A_DISABLED=false`. Google A2A orchestrators can discover and message our 4 specialists.
### 5. BYOK economics -- $0.05/dev/month
- User provides the NVIDIA NIM key (`X-LLM-Key` header or `ARGUS_NIM_KEY` env)
- No telemetry, no tracking, no per-seat fees
- We don't see your diffs -- they go directly from your process to NIM
- 100ร cheaper than CodeRabbit ($0.10-0.50/PR) at scale
### 6. Production resilience out of the box
- **LLM circuit breaker** with full-jitter exponential backoff (rolled our own, no `llm-retry` dep)
- **Idempotency-Key** support on `POST /analyze` (24h TTL)
- **Graceful shutdown** on SIGINT/SIGTERM (Axum `with_graceful_shutdown`)
- **OpenTelemetry** stdout exporter (env-gated via `ARGUS_OTEL_DISABLED`)
- **SQLite** audit persistence (`InMemoryAuditStore` for ephemeral, `SqliteAuditStore` for durable)
### 7. Pure Rust 100%, MSRV 1.88
- 15 crates, 4 binaries
- **194 tests passing** (no flaky)
- `cargo build --release` in 1m 27s
- Zero Python, zero Node.js in the production binary
- `RUSTFLAGS="-D warnings" cargo test` is the CI gate
---
## ๐๏ธ Architecture
```
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ GitHub PR / commit / org scan โ
โโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ARGUS -- Three Layers โ
โ โ
โ Aegis Guard Aegis Verify Aegis Lens โ
โ (pre-commit) โโโบ (PR review) โโโบ (weekly digest) โ
โ <2s 4-8s 5-15s โ
โ โ โ
โ โผ โ
โ 4 specialists in parallel โ
โ (slop, security, arch, verdict) โ
โ โ โ
โ โผ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ AuditEvent (16 fields) -- EU AI Act L2 ready โ โ
โ โ BLAKE3 chain + Ed25519 signature + BLAKE3 NDJSON โ โ
โ โ manifest at /audit/export โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ โ
โ โผ โ
โ SQLite (in-process) โโโโบ Supabase Postgres (remote, opt.) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Dashboard (axum + htmx + SSR) โ
โ Weekly briefings (HeyGen deeplink)โ
โ Cohort view (CodeRabbit-style) โ
โ + /audit/export for regulators โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
External:
โโโโโโโ
MCP server (apohara-argus-mcp) โโโบ Claude Code / Codex / Cursor
A2A AgentCards โโโบ Google A2A orchestrators
```
---
## ๐ฆ Install (30 seconds)
Pick the path that matches your environment. All three ship the same MIT-licensed core.
| Path | Command | What you get |
|---|---|---|
| **npm** (no Rust needed) | `npx @apohara/argus --help` | The CLI + the MCP server. Downloads the right binary on first run. |
| **cargo** (Rust toolchain) | `cargo install apohara-argus-cli` | Just the CLI. Faster startup, no download step. |
| **Docker** | `docker run -e ARGUS_NIM_KEY=$YOUR_NIM_KEY SuarezPM/apohara-argus --help` | Full containerized ARGUS, no host dependencies. |
### Build from source
```bash
git clone https://github.com/SuarezPM/apohara-argus
cd apohara-argus
cargo build --release
./target/release/argus --help
```
### Verify the install
```bash
npx @apohara/argus health
# or
argus health
# or
docker run -e ARGUS_NIM_KEY=$YOUR_NIM_KEY SuarezPM/apohara-argus health
```
---
## ๐ Quickstart (90 seconds end-to-end)
```bash
git clone https://github.com/SuarezPM/apohara-argus.git
cd apohara-argus
# 1. Get a free NVIDIA NIM key at https://build.nvidia.com/
export ARGUS_NIM_KEY=nvapi-xxx
# 2. Build everything (pure Rust, MSRV 1.88, ~1m 27s on a modern laptop)
cargo build --release
# 3. Pre-commit guard on a local diff
echo "+ user.password = 'hunter2'" | cargo run -p apohara-argus-cli -- guard --diff -
# 4. PR review (one-shot, with the 4 specialists)
cargo run -p apohara-argus-cli -- verify --pr-url https://github.com/owner/repo/pull/42
# 5. Weekly digest for an org
cargo run -p apohara-argus-cli -- lens --org acme --mock-prs "acme/api#1,acme/web#2"
# 6. Start the dashboard (SSR, port 3000)
cargo run -p argus-dashboard
# 7. Start the MCP server (for Claude Code / Codex)
cargo run -p apohara-argus-mcp
# 8. Verify EU AI Act compliance (BLAKE3 chain + manifest)
curl http://localhost:8080/audit/export?from=2026-01-01 | tail -1
# โ { "# manifest: { "count": 47, "b3_hash": "...", ... } }
```
---
## ๐ The numbers
Numbers we **measured** ([BENCHMARK.md](docs/BENCHMARK.md)), not promised:
| Metric | Value | Why it matters |
|---|---|---|
| **Precision** | **1.000** on 40-PR dataset | Zero false positives on the deterministic layer |
| **Recall** | **0.818** on 40-PR dataset | Catches 82% of AI-slop patterns; the 2 FNs are documented as rule-scope gaps |
| **F1 score** | **0.900** | Above the 0.70 plan target |
| **Deterministic slop pass** | <100ms on 10k LOC | 60-80% of LLM cost saved |
| **`cargo build --release`** | 1m 27s | Fast iteration |
| **Tests** | **194** passing | Boring reliable |
| **Per-dev cost** | $0.05/month (BYOK) | 100ร cheaper than CodeRabbit at scale |
| **EU AI Act Art. 12** | Level 2 ready | Regulators can verify via `curl /audit/export` |
| **Crates** | 15 | 4 binaries |
| **MSRV** | 1.88 | Compatible with stable Rust 2024 |
| **Pure Rust** | 100% | No Python, no Node.js in production |
---
## ๐ Comparison
| | **ARGUS** | CodeRabbit | Greptile | Qodo |
|---|---|---|---|---|
| **BYOK** | โ
NVIDIA NIM | โ SaaS only | โ SaaS only | โ SaaS only |
| **Per-dev cost** | $0.05/mo | $0.10-0.50/PR | $25/mo | $40-60/mo |
| **EU AI Act ready** | โ
Art.12 L2 | โ | โ | โ |
| **Audit trail signed** | โ
Ed25519 + BLAKE3 | โ | โ | โ |
| **MCP server** | โ
4 tools | โ | โ | โ |
| **A2A AgentCards** | โ
| โ | โ | โ |
| **CordonEnforcer** (synthesizer doesn't see raw code) | โ
| โ | โ | โ |
| **Hybrid detection** (deterministic + LLM) | โ
| โ LLM-only | โ LLM-only | โ LLM-only |
| **Measured P/R/F1** | โ
P=1.0, R=0.82 | โ | โ | โ |
| **Open source** | โ
MIT | โ | โ | โ |
| **Pure Rust** | โ
| โ TS/Node | โ TS/Node | โ TS/Node |
---
## ๐ฅ For the [target user]
### For the **CISO** ๐
EU AI Act Art. 12 compliance is **one `curl`**, not a 6-month audit. The audit chain is **BLAKE3-hash-chained and Ed25519-signed** -- your regulator can verify it offline without trusting ARGUS. **BYOK + offline-first** means your code never leaves your host. No data residency issue. See [docs/for-ciso.md](docs/for-ciso.md) for the full pitch.
### For the **engineering manager** ๐
ARGUS pays for itself in week 1 of any team > 3 developers:
- **Per dev:** 25-40 min/PR saved in review (only edit the bot's draft) + ~15 min/week avoided in re-work
- **Per team of 10 devs:** 4-7 hrs/week in maintainer time + 5-10 AI slop bugs prevented/month
- **Per engineering manager:** 4-6 hrs/week in manual reporting โ 0 with Aegis Lens
### For the **OSS maintainer** ๐ ๏ธ
Stop drowning in AI slop. Add ARGUS as a pre-commit hook or a PR webhook. **P=1.0, R=0.82** on the deterministic layer means **zero false positives** for the rules we ship. The LLM semantic layer catches the rest. Triage in 4-8 seconds, not 40 minutes.
---
## ๐บ๏ธ Roadmap (what's shipped, what's next)
The 19 features shipped (1 of 20 deliberately not done):
| # | Feature | Status |
|---|---|---|
| 1.1 | Cohort view (dashboard) | โ
Shipped |
| 1.2 | `fix_plan.json` hand-off | โ
Shipped |
| 1.3 | aislop CI badge | โ
Shipped (dogfooding virtuous loop) |
| 2.1 | `AuditEvent` (16 fields) BLAKE3 + Ed25519 | โ
Shipped |
| 2.2 | NDJSON audit export | โ
Shipped (regulator-ready) |
| 2.4 | Retention in `argus health` | โ
Shipped (warns if <180d per Art. 19) |
| 3.1 | LLM circuit breaker | โ
Shipped (no retry storms on NIM outage) |
| 3.2 | A2A AgentCards | โ
Shipped (Google's open protocol) |
| 4 | EU AI Act L2 conformance | โ
Shipped (default) |
| 4.1 | Per-role model registry | โ
Shipped (deepseek-v4 / nemotron-3 / glm-5.1) |
| 5 | MCP server | โ
Shipped (4 tools for Claude Code/Codex/Cursor) |
| 5.1 | Deterministic slop pre-flight | โ
Shipped (5 SLOP rules, <100ms) |
| 6.1 | Graceful shutdown | โ
Shipped (Axum `with_graceful_shutdown`) |
| 6.2 | Idempotency-Key | โ
Shipped (24h TTL, no double-billing) |
| 6.3 | OpenTelemetry stdout | โ
Shipped (env-gated) |
| 6.4 | SQLite audit persistence | โ
Shipped (sqlx 0.7) |
| 7.1 | HeyGen deeplink | โ
Shipped (url_encode, 0% cost) |
| 8.2 | SPIFFE primitives | โ
Shipped (spiffe 0.16) |
| **7.2** | **BYVK opt-in** (HeyGen/D-ID video integration) | **โ Deliberately not done** -- the $78-460/yr cost kills the $0.05/dev/month story. 7.1 (deeplink) gives 80% of the value at 0% of the cost. |
### What's next (human-action items)
- ๐ **crates.io publishing** -- 13 crates ready; awaiting `CARGO_REGISTRY_TOKEN` repo secret
- ๐ **OpenSSF Best Practices Silver** -- evidence map ready at `docs/best-practices-silver.md`; awaiting form submission at `bestpractices.dev`
- ๐ **First release on GitHub** with SLSA L3 attestation, SHA256 manifest, and distroless Docker image
---
## ๐ ๏ธ Use it. Fork it. Ship it.
```bash
git clone https://github.com/SuarezPM/apohara-argus.git
cd apohara-argus
export ARGUS_NIM_KEY=nvapi-xxx
cargo run -p apohara-argus-cli -- scan-diff ./your-pr.diff
```
**License: MIT.** Self-host, modify, redistribute. No telemetry, no phone-home.
Questions? Open an issue at `https://github.com/SuarezPM/apohara-argus/issues`.
---
### ๐ Read the docs
| Doc | What's in it |
|---|---|
| [docs/VERIFICATION.md](docs/VERIFICATION.md) | The 22-check local verification report |
| [docs/CI-VERIFICATION.md](docs/CI-VERIFICATION.md) | The 4 auto-trigger GitHub Actions workflows |
| [docs/HANDS-ON-QA.md](docs/HANDS-ON-QA.md) | 22/22 hands-on QA checks pass |
| [docs/SCOPE-FIDELITY.md](docs/SCOPE-FIDELITY.md) | 95/100 scope fidelity, 24/28 sub-tasks delivered |
| [docs/best-practices-silver.md](docs/best-practices-silver.md) | OpenSSF Best Practices Silver evidence map |
| [docs/BENCHMARK.md](docs/BENCHMARK.md) | P/R/F1 on 40 PRs + latency + cost |
| [docs/pricing.md](docs/pricing.md) | 3 tiers (Free / Team / Enterprise) |
| [docs/for-ciso.md](docs/for-ciso.md) | CISO-targeted EU AI Act pitch |
| [docs/branch-protection.md](docs/branch-protection.md) | Branch protection policy + `gh api` snippet |
| [SECURITY.md](SECURITY.md) | Threat model (covers / does NOT cover) |
| [GOVERNANCE.md](GOVERNANCE.md) | Roles, access continuity, fork-ability |
| [CONTRIBUTING.md](CONTRIBUTING.md) | DCO + coding standards + testing policy |
| [CHANGELOG.md](CHANGELOG.md) | Keep a Changelog format |
---
> *Built for the **Platzi Reto AI Academy** as 5 projects in one product:*
> *System of Prompts ยท Automate the Flow ยท Web App ยท The Agent ยท MVP with Real Intelligence.*
> *1 Cargo workspace, 15 crates, 194 tests, MIT license. The verification layer for the AI-generated code era.*