apohara-argus-cli 0.1.0

Unified CLI for ARGUS. `argus health`, `argus verify`, `argus guard`, `argus lens` -- all surfaces in one binary.

🛡️ ARGUS -- The verification layer for AI-generated code

AI generates code at near-zero cost. Human review didn't get faster. The bottleneck inverted: it's no longer generation -- it's verification.

ARGUS is the verification infrastructure. 15 Rust crates, 4 specialists, an audit chain that's BLAKE3-hash-chained and Ed25519-signed -- EU AI Act Art. 12 Level 2 ready by default. MIT licensed. BYOK. Zero SaaS lock-in.

Install · Quickstart · Why · What · Numbers · Pricing · Security


The problem is here. Now.

Open source is dying in 2026. Community trust is drowning under a 206% increase in Bash scripts in AI projects¹, PR reviews that are 4.6× slower², and 15-18% more vulnerabilities². With 42% of the code committed today being AI-generated or AI-assisted³ and 96% of developers distrusting it³, AI slop -- Word of the Year 2025⁴ -- has forced extreme measures:

| Project | Response | Date |
|---|---|---|
| 🌐 Ladybird (browser) | Closed its public PRs. "We will no longer accept public pull requests." | Jun 2026⁵ |
| 🎨 tldraw (whiteboard) | Auto-closes external PRs. "Open source contribution has always been a gift economy held together by proof of work. AI has changed that." | Jan 2026⁶ |
| 🎮 RPCS3 (PS3 emulator) | Had to revert multiple AI-generated PRs that caused regressions in production. | May 2026⁷ |
| 🌐 cURL (web infrastructure) | Cancelled its bug bounty because 19 out of every 20 reports were synthetic hallucinations. | Jan 2026⁸ |

Sources: ¹GitHub Octoverse 2025 · ²Opsera 2026 AI Coding Impact Report · ³Sonar State of Code Developer Survey 2026 · ⁴Merriam-Webster Word of the Year 2025 · ⁵Ladybird blog · ⁶tldraw issue #7695 · ⁷RPCS3 commit c0b3580 · ⁸Daniel Stenberg, "The end of the curl bug-bounty"

🤖 AI slop is a tragedy of the commons (arXiv:2603.27249): individual productivity gains externalize costs onto reviewers and maintainers. The bottleneck isn't generation. It's verification.


💡 What ARGUS is

ARGUS = AI Review & Governance for Undermining Slop -- the trust layer for AI-generated code.

One product. Three layers. Four specialists. One signed certificate per analysis.

Built for engineering managers, OSS maintainers, and CISOs who need an audit-grade, EU AI Act-ready answer to the verification bottleneck. Pure Rust (15 crates, zero Python, zero Node.js in production). BYOK (your NVIDIA NIM key, never persisted). MIT licensed.


🛡️ The 3 layers (one worker each)

| Worker | When it runs | What it does | Latency |
|---|---|---|---|
| Aegis Guard | Pre-commit / pre-push | Hybrid scan on the staged diff: deterministic AST pre-flight (5 SLOP rules, regex, <100ms) + LLM semantic. Blocks critical issues. | <2s |
| Aegis Verify | PR review (webhook or one-shot) | 4 specialists in parallel via Tokio join! + CordonEnforcer (synthesizer never sees raw code). Emits a fix_plan.json for downstream coding agents. | 4-8s |
| Aegis Lens | Weekly digest | Aggregates findings across an org, ranks top offenders, generates an executive briefing (text + optional HeyGen video). | 5-15s |

🤖 The 4 specialists (run in parallel inside Verify)

| Specialist | Prompt | What it catches | Hybrid? |
|---|---|---|---|
| Aegis Slop | slop-detector | Narrative comments, swallowed errors, oversized fns (>80 LOC), .unwrap() outside tests, TODO stubs, unused pub fn | ✅ regex + LLM |
| Aegis Security | redteam-security | Hardcoded credentials, injection, unsafe panic, unhandled errors, OWASP Top 10 | LLM |
| Aegis Arch | architecture-fit | Repo coherence, pattern matching, idiom detection, separation of concerns | LLM |
| Aegis Verdict | verdict-synthesizer | Synthesizes the 3 above into Approved/ReviewRequired/Halted + FixPlan | LLM |

CordonEnforcer is the moat: the verdict synthesizer in the pipeline never sees raw code. It only sees the structured outputs of the other three specialists. No competitor (CodeRabbit, Greptile, Qodo) has this constraint.
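One way to picture the cordon is as a type-level constraint: the synthesizer's input type simply has no field that can carry a diff. The sketch below is an illustration under assumptions, not the ARGUS implementation. `SpecialistReport`, `Finding`, `Severity`, and the severity-to-verdict mapping are hypothetical; only the three verdict names come from the table above.

```rust
/// Verdicts named in the specialist table.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    Approved,
    ReviewRequired,
    Halted,
}

/// Hypothetical severity scale (assumption; the real schema is not shown here).
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
pub enum Severity {
    Info,
    Warning,
    Critical,
}

/// One structured finding. It names a rule and a severity, nothing else.
#[derive(Debug, Clone)]
pub struct Finding {
    pub rule_id: &'static str,
    pub severity: Severity,
}

/// Structured output of one specialist. There is deliberately no field
/// for the raw diff or source text.
#[derive(Debug, Clone)]
pub struct SpecialistReport {
    pub specialist: &'static str,
    pub findings: Vec<Finding>,
}

/// The synthesizer only receives reports, so the signature itself acts as
/// the cordon. The mapping below (worst severity wins) is an assumption.
pub fn synthesize(reports: &[SpecialistReport]) -> Verdict {
    let worst = reports
        .iter()
        .flat_map(|r| r.findings.iter())
        .map(|f| f.severity)
        .max();
    match worst {
        Some(Severity::Critical) => Verdict::Halted,
        Some(Severity::Warning) => Verdict::ReviewRequired,
        Some(Severity::Info) | None => Verdict::Approved,
    }
}
```

Calling `synthesize` with a clean slop report, a warning-level arch report, and a critical security report would yield `Verdict::Halted` under this mapping.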


✨ The 7 things that make ARGUS different

1. Hybrid detection -- cheap + deep

SLOP-001 oversized fn (size)          ─► regex  < 1ms     catches 40-60% of slop
SLOP-002 swallowed error arm          ─► regex  < 1ms
SLOP-003 TODO stub                    ─► regex  < 1ms
SLOP-004 unwrap/expect outside tests  ─► regex  < 1ms
SLOP-005 unused pub fn                ─► regex  < 1ms
  + semantic reasoning                ─► LLM    2-4s     catches the rest

No competitor has this combination. The result: 60-80% LLM cost reduction on typical PRs. Measured: P=1.000, R=0.818, F1=0.900 on 40-PR benchmark (BENCHMARK.md).
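To make the cheap layer concrete, here is a minimal sketch of two of the five rules (SLOP-003 and SLOP-004) run over the added lines of a unified diff. It is an approximation under assumptions: plain substring matching stands in for the regexes ARGUS ships, and `SlopHit`, `preflight`, and the `in_test_file` flag are hypothetical names.

```rust
/// One deterministic finding on an added diff line.
#[derive(Debug, PartialEq, Eq)]
pub struct SlopHit {
    pub rule: &'static str,
    pub line_no: usize, // 1-based line number within the diff text
}

/// Scans only the added lines (`+` prefix) of a unified diff.
/// `in_test_file` is a hypothetical flag the caller derives from the path.
pub fn preflight(diff: &str, in_test_file: bool) -> Vec<SlopHit> {
    let mut hits = Vec::new();
    for (idx, line) in diff.lines().enumerate() {
        let Some(added) = line.strip_prefix('+') else {
            continue; // context or removed line
        };
        // Skip the `+++ b/file` header. Simplification: an added line whose
        // content itself starts with `++` is skipped too.
        if added.starts_with("++") {
            continue;
        }
        let line_no = idx + 1;
        // SLOP-003: TODO stub.
        if added.contains("TODO") || added.contains("todo!(") || added.contains("unimplemented!(") {
            hits.push(SlopHit { rule: "SLOP-003", line_no });
        }
        // SLOP-004: unwrap/expect outside tests.
        if !in_test_file && (added.contains(".unwrap()") || added.contains(".expect(")) {
            hits.push(SlopHit { rule: "SLOP-004", line_no });
        }
    }
    hits
}
```

Because a pass like this never calls the LLM, any diff it can fully classify costs nothing beyond the scan itself.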

2. EU AI Act Article 12 Level 2 ready by default

The 16-field AuditEvent is automatically emitted on every LLM call:

{
  "audit_id": "...",
  "timestamp": "2026-06-12T19:00:00Z",
  "model_id": "deepseek-ai/deepseek-v4-flash",
  "prompt_template_version": "abc123",
  "prompt_fingerprint": "BLAKE3 hex (GDPR-safe)",
  "response_fingerprint": "BLAKE3 hex",
  "data_class": "source_code",
  "policy_version": "verify-worker-v1-policy",
  "decision": { "verdict": "warn", "findings_count": 2, "rationale": "..." },
  "prev_hash": "...", "signature": "Ed25519 hex"
}

Verifiable: curl /audit/export?from=2026-01-01&to=2026-12-31 returns NDJSON with a BLAKE3 manifest footer. No cleartext prompts, ever. GDPR derivative-liability-safe by construction. Enforcement starts Aug 2, 2026 -- 51 days from this README.
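The `prev_hash` field is what makes the log tamper-evident: each record commits to the one before it. The sketch below shows that linking and an offline check. It is a simplified illustration, not the ARGUS code: std's `DefaultHasher` stands in for BLAKE3 (it is not cryptographic), the Ed25519 signature is omitted, and `ChainedEvent`, `append`, `first_broken_link`, and the genesis value `0` are assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Trimmed-down audit record; the real AuditEvent has 16 fields.
#[derive(Debug, Clone)]
pub struct ChainedEvent {
    pub audit_id: String,
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Stand-in digest: std's DefaultHasher, NOT BLAKE3 and not cryptographic.
fn digest(audit_id: &str, payload: &str, prev_hash: u64) -> u64 {
    let mut h = DefaultHasher::new();
    audit_id.hash(&mut h);
    payload.hash(&mut h);
    prev_hash.hash(&mut h);
    h.finish()
}

/// Appends a record whose hash commits to the previous record's hash.
/// The first record uses 0 as its prev_hash (an assumed genesis value).
pub fn append(chain: &mut Vec<ChainedEvent>, audit_id: &str, payload: &str) {
    let prev_hash = chain.last().map_or(0, |e| e.hash);
    let hash = digest(audit_id, payload, prev_hash);
    chain.push(ChainedEvent {
        audit_id: audit_id.to_string(),
        payload: payload.to_string(),
        prev_hash,
        hash,
    });
}

/// Returns the index of the first record that fails verification,
/// or None if every link in the chain holds.
pub fn first_broken_link(chain: &[ChainedEvent]) -> Option<usize> {
    let mut expected_prev = 0u64;
    for (i, e) in chain.iter().enumerate() {
        let recomputed = digest(&e.audit_id, &e.payload, e.prev_hash);
        if e.prev_hash != expected_prev || e.hash != recomputed {
            return Some(i);
        }
        expected_prev = e.hash;
    }
    None
}
```

Editing any stored record changes its recomputed digest, so a verifier walking the export from the first line detects the edit at that record without contacting the service.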

3. MCP server for Claude Code / Codex / Cursor

// ~/.config/claude-code/mcp.json
{
  "mcpServers": {
    "argus": {
      "command": "argus-mcp",
      "env": { "ARGUS_NIM_KEY": "nvapi-..." }
    }
  }
}

Four tools land in your agent's toolbox:

  • aegis_slop → AI slop signals
  • aegis_security → adversarial review
  • aegis_arch → architectural fit score
  • aegis_verdict → final verdict + FixPlan

Your coding agent now has ARGUS on tap. It can run a slop check, a security check, and a verdict on its own draft PR -- automatically, before it ever asks for human review.

4. A2A AgentCards -- discoverable to Google's open protocol

GET /.well-known/agent-card.json
GET /a2a/message

Opt-in via ARGUS_A2A_DISABLED=false. Google A2A orchestrators can discover and message our 4 specialists.

5. BYOK economics -- $0.05/dev/month

  • User provides the NVIDIA NIM key (X-LLM-Key header or ARGUS_NIM_KEY env)
  • No telemetry, no tracking, no per-seat fees
  • We don't see your diffs -- they go directly from your process to NIM
  • 100× cheaper than CodeRabbit ($0.10-0.50/PR) at scale

6. Production resilience out of the box

  • LLM circuit breaker with full-jitter exponential backoff (rolled our own, no llm-retry dep)
  • Idempotency-Key support on POST /analyze (24h TTL)
  • Graceful shutdown on SIGINT/SIGTERM (Axum with_graceful_shutdown)
  • OpenTelemetry stdout exporter (env-gated via ARGUS_OTEL_DISABLED)
  • SQLite audit persistence (InMemoryAuditStore for ephemeral, SqliteAuditStore for durable)
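For readers unfamiliar with the term, "full jitter" means the retry delay is drawn uniformly between zero and an exponentially growing, capped ceiling. The following is a minimal std-only sketch of that calculation, not the code in the ARGUS circuit breaker. The function names are hypothetical, and the random sample is passed in as `unit` because std has no built-in RNG.

```rust
use std::time::Duration;

/// Upper bound for the sleep before retry `attempt` (0-based):
/// min(cap, base * 2^attempt), saturating instead of overflowing.
pub fn backoff_ceiling(base: Duration, cap: Duration, attempt: u32) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}

/// Full jitter: scale the ceiling by a uniform sample in [0.0, 1.0].
/// `unit` comes from the caller's RNG; out-of-range values are clamped
/// and non-finite values are treated as 0.0.
pub fn full_jitter(base: Duration, cap: Duration, attempt: u32, unit: f64) -> Duration {
    let unit = if unit.is_finite() { unit.clamp(0.0, 1.0) } else { 0.0 };
    backoff_ceiling(base, cap, attempt).mul_f64(unit)
}
```

With a 100 ms base and a 5 s cap, the ceiling for attempt 3 is 800 ms, and anything from attempt 6 onward is held at the cap. Spreading the delay over the whole interval keeps many clients from retrying in lockstep after an outage.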

7. Pure Rust 100%, MSRV 1.88

  • 15 crates, 4 binaries
  • 194 tests passing (no flaky)
  • cargo build --release in 1m 27s
  • Zero Python, zero Node.js in the production binary
  • RUSTFLAGS="-D warnings" cargo test is the CI gate

🏛️ Architecture

                         ┌─────────────────────────────────────┐
                         │  GitHub PR / commit / org scan      │
                         └──────────────┬──────────────────────┘
                                        │
                                        ▼
        ┌────────────────────────────────────────────────────────────────┐
        │                       ARGUS -- Three Layers                    │
        │                                                                │
        │   Aegis Guard       Aegis Verify       Aegis Lens              │
        │   (pre-commit)  ──► (PR review)   ──► (weekly digest)          │
        │   <2s             4-8s              5-15s                      │
        │                   │                                            │
        │                   ▼                                            │
        │          4 specialists in parallel                             │
        │          (slop, security, arch, verdict)                       │
        │                   │                                            │
        │                   ▼                                            │
        │   ┌────────────────────────────────────────────────────┐       │
        │   │  AuditEvent (16 fields) -- EU AI Act L2 ready      │       │
        │   │  BLAKE3 chain + Ed25519 signature + BLAKE3 NDJSON  │       │
        │   │  manifest at /audit/export                         │       │
        │   └────────────────────────────────────────────────────┘       │
        │                       │                                        │
        │                       ▼                                        │
        │   SQLite (in-process)  ◄──►  Supabase Postgres (remote, opt.)  │
        └────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
                ┌────────────────────────────────────┐
                │  Dashboard  (axum + htmx + SSR)    │
                │  Weekly briefings (HeyGen deeplink)│
                │  Cohort view (CodeRabbit-style)    │
                │  + /audit/export for regulators    │
                └────────────────────────────────────┘

        External:
        ─────────
        MCP server (apohara-argus-mcp) ──► Claude Code / Codex / Cursor
        A2A AgentCards                 ──► Google A2A orchestrators

📦 Install (30 seconds)

Pick the path that matches your environment. All three ship the same MIT-licensed core.

| Path | Command | What you get |
|---|---|---|
| npm (no Rust needed) | npx @apohara/argus --help | The CLI + the MCP server. Downloads the right binary on first run. |
| cargo (Rust toolchain) | cargo install apohara-argus-cli | Just the CLI. Faster startup, no download step. |
| Docker | docker run -e ARGUS_NIM_KEY=$YOUR_NIM_KEY suarezpm/apohara-argus --help | Full containerized ARGUS, no host dependencies. Docker image references must be lowercase. |

Build from source

git clone https://github.com/SuarezPM/apohara-argus
cd apohara-argus
cargo build --release
./target/release/argus --help

Verify the install

npx @apohara/argus health
# or
argus health
# or
docker run -e ARGUS_NIM_KEY=$YOUR_NIM_KEY suarezpm/apohara-argus health

🚀 Quickstart (90 seconds end-to-end)

git clone https://github.com/SuarezPM/apohara-argus.git
cd apohara-argus

# 1. Get a free NVIDIA NIM key at https://build.nvidia.com/
export ARGUS_NIM_KEY=nvapi-xxx

# 2. Build everything (pure Rust, MSRV 1.88, ~1m 27s on a modern laptop)
cargo build --release

# 3. Pre-commit guard on a local diff
echo "+ user.password = 'hunter2'" | cargo run -p apohara-argus-cli -- guard --diff -

# 4. PR review (one-shot, with the 4 specialists)
cargo run -p apohara-argus-cli -- verify --pr-url https://github.com/owner/repo/pull/42

# 5. Weekly digest for an org
cargo run -p apohara-argus-cli -- lens --org acme --mock-prs "acme/api#1,acme/web#2"

# 6. Start the dashboard (SSR, port 3000)
cargo run -p argus-dashboard

# 7. Start the MCP server (for Claude Code / Codex)
cargo run -p apohara-argus-mcp

# 8. Verify EU AI Act compliance (BLAKE3 chain + manifest)
curl "http://localhost:8080/audit/export?from=2026-01-01" | tail -1
# → { "# manifest: { "count": 47, "b3_hash": "...", ... } }

📊 The numbers

Numbers we measured (BENCHMARK.md), not promised:

| Metric | Value | Why it matters |
|---|---|---|
| Precision | 1.000 on 40-PR dataset | Zero false positives on the deterministic layer |
| Recall | 0.818 on 40-PR dataset | Catches 82% of AI-slop patterns; the 2 FNs are documented as rule-scope gaps |
| F1 score | 0.900 | Above the 0.70 plan target |
| Deterministic slop pass | <100ms on 10k LOC | 60-80% of LLM cost saved |
| cargo build --release | 1m 27s | Fast iteration |
| Tests | 194 passing | Boring reliable |
| Per-dev cost | $0.05/month (BYOK) | 100× cheaper than CodeRabbit at scale |
| EU AI Act Art. 12 | Level 2 ready | Regulators can verify via curl /audit/export |
| Crates | 15 | 4 binaries |
| MSRV | 1.88 | Compatible with stable Rust 2024 |
| Pure Rust | 100% | No Python, no Node.js in production |
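The three headline scores are linked by simple arithmetic, which the sketch below reproduces. The counts are an inference rather than figures quoted from BENCHMARK.md: a recall of 0.818 with 2 false negatives is consistent with 9 true positives (9 / 11), and a precision of 1.000 implies 0 false positives. `prf1` is a hypothetical helper name.

```rust
/// Returns num / den as f64, or 0.0 when the denominator is zero.
fn ratio(num: u32, den: u32) -> f64 {
    if den == 0 {
        0.0
    } else {
        f64::from(num) / f64::from(den)
    }
}

/// Precision, recall and F1 from confusion-matrix counts.
pub fn prf1(true_pos: u32, false_pos: u32, false_neg: u32) -> (f64, f64, f64) {
    let precision = ratio(true_pos, true_pos + false_pos);
    let recall = ratio(true_pos, true_pos + false_neg);
    // F1 is the harmonic mean of precision and recall.
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    (precision, recall, f1)
}
```

Under the inferred counts, `prf1(9, 0, 2)` gives precision 1.0, recall 9/11 (about 0.818), and F1 of 18/20 = 0.9, matching the table.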

🆚 Comparison

| | ARGUS | CodeRabbit | Greptile | Qodo |
|---|---|---|---|---|
| BYOK | ✅ NVIDIA NIM | ❌ SaaS only | ❌ SaaS only | ❌ SaaS only |
| Per-dev cost | $0.05/mo | $0.10-0.50/PR | $25/mo | $40-60/mo |
| EU AI Act ready | ✅ Art.12 L2 | ❌ | ❌ | ❌ |
| Audit trail signed | ✅ Ed25519 + BLAKE3 | ❌ | ❌ | ❌ |
| MCP server | ✅ 4 tools | ❌ | ❌ | ❌ |
| A2A AgentCards | ✅ | ❌ | ❌ | ❌ |
| CordonEnforcer (synthesizer doesn't see raw code) | ✅ | ❌ | ❌ | ❌ |
| Hybrid detection (deterministic + LLM) | ✅ | ❌ LLM-only | ❌ LLM-only | ❌ LLM-only |
| Measured P/R/F1 | ✅ P=1.0, R=0.82 | ❌ | ❌ | ❌ |
| Open source | ✅ MIT | ❌ | ❌ | ❌ |
| Pure Rust | ✅ | ❌ TS/Node | ❌ TS/Node | ❌ TS/Node |

👥 Who ARGUS is for

For the CISO 👔

EU AI Act Art. 12 compliance is one curl, not a 6-month audit. The audit chain is BLAKE3-hash-chained and Ed25519-signed -- your regulator can verify it offline without trusting ARGUS. BYOK + offline-first means your code never leaves your host. No data residency issue. See docs/for-ciso.md for the full pitch.

For the engineering manager 📊

ARGUS pays for itself in week 1 of any team > 3 developers:

  • Per dev: 25-40 min/PR saved in review (only edit the bot's draft) + ~15 min/week avoided in re-work
  • Per team of 10 devs: 4-7 hrs/week in maintainer time + 5-10 AI slop bugs prevented/month
  • Per engineering manager: 4-6 hrs/week in manual reporting → 0 with Aegis Lens

For the OSS maintainer 🛠️

Stop drowning in AI slop. Add ARGUS as a pre-commit hook or a PR webhook. P=1.0, R=0.82 on the deterministic layer means zero false positives for the rules we ship. The LLM semantic layer catches the rest. Triage in 4-8 seconds, not 40 minutes.


🗺️ Roadmap (what's shipped, what's next)

The 19 features shipped (1 of 20 deliberately not done):

| # | Feature | Status |
|---|---|---|
| 1.1 | Cohort view (dashboard) | ✅ Shipped |
| 1.2 | fix_plan.json hand-off | ✅ Shipped |
| 1.3 | aislop CI badge | ✅ Shipped (dogfooding virtuous loop) |
| 2.1 | AuditEvent (16 fields) BLAKE3 + Ed25519 | ✅ Shipped |
| 2.2 | NDJSON audit export | ✅ Shipped (regulator-ready) |
| 2.4 | Retention in argus health | ✅ Shipped (warns if <180d per Art. 19) |
| 3.1 | LLM circuit breaker | ✅ Shipped (no retry storms on NIM outage) |
| 3.2 | A2A AgentCards | ✅ Shipped (Google's open protocol) |
| 4 | EU AI Act L2 conformance | ✅ Shipped (default) |
| 4.1 | Per-role model registry | ✅ Shipped (deepseek-v4 / nemotron-3 / glm-5.1) |
| 5 | MCP server | ✅ Shipped (4 tools for Claude Code/Codex/Cursor) |
| 5.1 | Deterministic slop pre-flight | ✅ Shipped (5 SLOP rules, <100ms) |
| 6.1 | Graceful shutdown | ✅ Shipped (Axum with_graceful_shutdown) |
| 6.2 | Idempotency-Key | ✅ Shipped (24h TTL, no double-billing) |
| 6.3 | OpenTelemetry stdout | ✅ Shipped (env-gated) |
| 6.4 | SQLite audit persistence | ✅ Shipped (sqlx 0.7) |
| 7.1 | HeyGen deeplink | ✅ Shipped (url_encode, 0% cost) |
| 8.2 | SPIFFE primitives | ✅ Shipped (spiffe 0.16) |
| 7.2 | BYVK opt-in (HeyGen/D-ID video integration) | ⛔ Deliberately not done -- the $78-460/yr cost kills the $0.05/dev/month story. 7.1 (deeplink) gives 80% of the value at 0% of the cost. |

What's next (human-action items)

  • 🔓 crates.io publishing -- 13 crates ready; awaiting CARGO_REGISTRY_TOKEN repo secret
  • 🔓 OpenSSF Best Practices Silver -- evidence map ready at docs/best-practices-silver.md; awaiting form submission at bestpractices.dev
  • 🔓 First release on GitHub with SLSA L3 attestation, SHA256 manifest, and distroless Docker image

🛠️ Use it. Fork it. Ship it.

git clone https://github.com/SuarezPM/apohara-argus.git
cd apohara-argus
export ARGUS_NIM_KEY=nvapi-xxx
cargo run -p apohara-argus-cli -- scan-diff ./your-pr.diff

License: MIT. Self-host, modify, redistribute. No telemetry, no phone-home.

Questions? Open an issue at https://github.com/SuarezPM/apohara-argus/issues.


📚 Read the docs

| Doc | What's in it |
|---|---|
| docs/VERIFICATION.md | The 22-check local verification report |
| docs/CI-VERIFICATION.md | The 4 auto-trigger GitHub Actions workflows |
| docs/HANDS-ON-QA.md | 22/22 hands-on QA checks pass |
| docs/SCOPE-FIDELITY.md | 95/100 scope fidelity, 24/28 sub-tasks delivered |
| docs/best-practices-silver.md | OpenSSF Best Practices Silver evidence map |
| docs/BENCHMARK.md | P/R/F1 on 40 PRs + latency + cost |
| docs/pricing.md | 3 tiers (Free / Team / Enterprise) |
| docs/for-ciso.md | CISO-targeted EU AI Act pitch |
| docs/branch-protection.md | Branch protection policy + gh api snippet |
| SECURITY.md | Threat model (covers / does NOT cover) |
| GOVERNANCE.md | Roles, access continuity, fork-ability |
| CONTRIBUTING.md | DCO + coding standards + testing policy |
| CHANGELOG.md | Keep a Changelog format |

Built for the Platzi Reto AI Academy as 5 projects in one product: System of Prompts · Automate the Flow · Web App · The Agent · MVP with Real Intelligence. 1 Cargo workspace, 15 crates, 194 tests, MIT license. The verification layer for the AI-generated code era.