vibecheck
"I don't always write Rust, but when I do, every function has a doc comment and zero
.unwrap() calls." — The Most Interesting LLM in the World
vibecheck detects AI-generated code and attributes it to a model family. It sniffs out the telltale "vibes" that different AI models leave in code — the suspiciously perfect formatting, the teaching-voice comments, the conspicuous absence of TODO: fix this later.
The 5 stages of vibecheck grief:
1. Denial "I wrote this myself"
2. Anger "The heuristics are WRONG"
3. Bargaining "Ok but I modified 2 lines"
4. Depression vibecheck src/my_code.rs
> Verdict: Claude (94%)
5. Acceptance "...yeah that's fair"
───────────────────────────────────────
Nobody:
Absolutely nobody:
Your AI-generated code:
/// Processes the input data by applying the configured
/// transformation pipeline and returning the validated result.
pub fn process_and_validate_input_data(
&self,
input_data: &InputData,
) -> Result<ValidatedOutput, ProcessingError> {
How It Works
vibecheck runs your source code through two layers of analysis:
Layer 1 — Text-pattern analyzers (all languages):
| Analyzer | What It Sniffs | Example Signal |
|---|---|---|
| Comment Style | Density, teaching voice, doc comments | "12 comments with teaching/explanatory voice" |
| AI Signals | TODO absence, no dead code, eerie perfection | "Every function has a doc comment — suspiciously thorough" |
| Error Handling | unwrap vs expect vs ?, panic usage | "Zero .unwrap() calls — careful error handling" |
| Naming | Variable length, descriptiveness, single-char names | "Very descriptive variable names (avg 14.2 chars)" |
| Code Structure | Type annotations, import ordering, formatting | "Import statements are alphabetically sorted" |
| Idiom Usage | Iterator chains, builder patterns, Display impls | "8 iterator chain usages — textbook-idiomatic Rust" |
Layer 2 — tree-sitter CST analyzers (language-aware):
| Language | Signals |
|---|---|
| Rust | Cyclomatic complexity, doc comment coverage on pub fns, identifier entropy, nesting depth, import ordering |
| Python | Docstring coverage, type annotation coverage, f-string vs %-format ratio |
| JavaScript | Arrow function ratio, async/await vs .then() chaining, optional chaining density |
| Go | Godoc coverage on exported functions, goroutine count, err != nil check density |
Each signal has a weight (positive = evidence for, negative = evidence against) and points to a model family. The pipeline aggregates all signals into a probability distribution.
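As a rough illustration of that aggregation step, here is a minimal sketch. The `Signal` struct, the `aggregate` function, and the choice to clamp negative totals to zero before normalizing are all assumptions made for this example; the real pipeline in vibecheck-core may combine weights differently.

```rust
use std::collections::HashMap;

/// Hypothetical signal shape: a family label plus a signed weight.
#[derive(Debug, Clone)]
struct Signal {
    family: &'static str,
    weight: f64,
}

/// Sum weights per family, clamp negative totals to zero, then normalize
/// so the scores add up to 1.0. Returns an empty map if nothing scored.
fn aggregate(signals: &[Signal]) -> HashMap<&'static str, f64> {
    let mut totals: HashMap<&'static str, f64> = HashMap::new();
    for s in signals {
        *totals.entry(s.family).or_insert(0.0) += s.weight;
    }
    for v in totals.values_mut() {
        *v = v.max(0.0);
    }
    let sum: f64 = totals.values().sum();
    if sum <= 0.0 {
        return HashMap::new();
    }
    totals.into_iter().map(|(k, v)| (k, v / sum)).collect()
}

fn main() {
    let signals = vec![
        Signal { family: "claude", weight: 1.5 },
        Signal { family: "claude", weight: 2.0 },
        Signal { family: "human", weight: 2.0 },
        Signal { family: "human", weight: -0.5 },
    ];
    // claude totals 3.5 and human totals 1.5, so the split is 3.5/5.0 vs 1.5/5.0.
    for (family, p) in &aggregate(&signals) {
        println!("{family}: {p:.2}");
    }
}
```

A negative weight only ever reduces its own family's total in this sketch; it never adds to a rival family.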
Results are stored in a content-addressed cache (redb, keyed by SHA-256 of file contents) so unchanged files are never re-analyzed. A Merkle hash tree extends this to directory level — unchanged subdirectories are skipped entirely, making repeated directory scans near-instant.
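The directory-level skip can be pictured with a small in-memory Merkle tree. This is a sketch under loud assumptions: `Node`, `digest`, and `merkle_hash` are names invented here, the tree lives in memory rather than on disk, and `DefaultHasher` stands in for SHA-256 purely to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// In-memory stand-in for a directory tree (hypothetical type).
enum Node {
    File(Vec<u8>),
    Dir(BTreeMap<String, Node>),
}

/// Stand-in digest. The tool is described as using SHA-256;
/// DefaultHasher is used here only so the sketch needs no crates.
fn digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// A file hashes its contents; a directory hashes its sorted
/// (name, child hash) pairs, so any change below bubbles up to the root.
fn merkle_hash(node: &Node) -> u64 {
    match node {
        Node::File(contents) => digest(contents),
        Node::Dir(children) => {
            let mut h = DefaultHasher::new();
            for (name, child) in children {
                name.hash(&mut h);
                merkle_hash(child).hash(&mut h);
            }
            h.finish()
        }
    }
}

/// Hash of one named child of a directory node, if it exists.
fn subtree_hash(root: &Node, name: &str) -> Option<u64> {
    match root {
        Node::Dir(children) => children.get(name).map(merkle_hash),
        Node::File(_) => None,
    }
}

/// Build a tiny tree where only src/lib.rs varies.
fn tree(lib_rs: &str) -> Node {
    let mut src = BTreeMap::new();
    src.insert("lib.rs".to_string(), Node::File(lib_rs.as_bytes().to_vec()));
    let mut docs = BTreeMap::new();
    docs.insert("guide.md".to_string(), Node::File(b"hello".to_vec()));
    let mut root = BTreeMap::new();
    root.insert("src".to_string(), Node::Dir(src));
    root.insert("docs".to_string(), Node::Dir(docs));
    Node::Dir(root)
}

fn main() {
    let before = tree("fn a() {}");
    let after = tree("fn a() { todo!() }");
    // The root hash changes because src/lib.rs changed.
    println!("root changed: {}", merkle_hash(&before) != merkle_hash(&after));
    // docs/ hashes the same in both trees, so a scanner could skip it.
    println!(
        "docs unchanged: {}",
        subtree_hash(&before, "docs") == subtree_hash(&after, "docs")
    );
}
```

Comparing a stored subtree hash against a freshly computed one is what lets a scan stop descending early.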
Installation
# Install the CLI
# Add the library to your project
Usage
CLI
# No arguments: opens the TUI browser in the current directory
# Analyze a single file (pretty output with colors)
# Analyze a directory (supports .rs, .py, .js, .ts, .go)
# Symbol-level attribution — breaks down each function/method individually
# Plain text output
# JSON output (for piping to other tools)
# Enforce attribution in CI — exit 1 if any file isn't attributed to one of these families
# Assert human authorship specifically
# Skip the cache (always re-analyze, useful for CI reproducibility)
# List all detection signals with their default weights (pretty table)
vibecheck heuristics
# Same list as a TOML block ready to paste into .vibecheck
vibecheck heuristics --format toml
All commands are also available as explicit subcommands: vibecheck analyze, vibecheck tui, vibecheck watch, vibecheck history.
--assert-family accepts a comma-separated list of claude, gpt, copilot, gemini, or human. If any analyzed file's primary attribution is not in the list, vibecheck prints a failure summary to stderr and exits with code 1. This is the flag that makes vibecheck useful in CI.
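The check itself is simple enough to sketch. Everything named below (`FileVerdict`, `parse_allowed`, `failures`, `exit_code`) is hypothetical and exists only to illustrate the rule described above; the message format printed in `main` is also just an illustration, not the CLI's actual output.

```rust
/// Hypothetical per-file result: path plus the primary family label.
struct FileVerdict {
    path: &'static str,
    primary: &'static str,
    confidence: u8,
}

/// Parse a comma-separated list such as "claude,human" into lowercase labels.
fn parse_allowed(list: &str) -> Vec<String> {
    list.split(',')
        .map(|s| s.trim().to_lowercase())
        .filter(|s| !s.is_empty())
        .collect()
}

/// Return the verdicts whose primary family is not in the allowed list.
fn failures<'a>(verdicts: &'a [FileVerdict], allowed: &[String]) -> Vec<&'a FileVerdict> {
    verdicts
        .iter()
        .filter(|v| !allowed.iter().any(|a| a == &v.primary.to_lowercase()))
        .collect()
}

/// Exit code as described above: 0 when every file passes, 1 otherwise.
fn exit_code(verdicts: &[FileVerdict], allowed: &[String]) -> i32 {
    if failures(verdicts, allowed).is_empty() { 0 } else { 1 }
}

fn main() {
    let verdicts = [
        FileVerdict { path: "src/lib.rs", primary: "human", confidence: 61 },
        FileVerdict { path: "src/new_feature.rs", primary: "claude", confidence: 89 },
    ];
    let allowed = parse_allowed("human");
    for f in failures(&verdicts, &allowed) {
        eprintln!(
            "{}: detected as {} ({}%), expected one of: {}",
            f.path,
            f.primary,
            f.confidence,
            allowed.join(", ")
        );
    }
    std::process::exit(exit_code(&verdicts, &allowed));
}
```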
TUI Codebase Navigator
# Open TUI in the current directory (same as running vibecheck with no args)
# Or point at a specific directory
Two-pane browser: file tree with family badges on the left, signal/score/symbol breakdown on the right. Press h on any file to open a git history panel showing per-commit AI attribution (loaded in the background). Confidence rolls up from symbol → file → directory (weighted by lines of code).
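One plausible reading of the roll-up is a line-weighted mean, sketched below. `Scored` and `roll_up` are hypothetical names, and the sketch assumes a single confidence value per child; the real roll-up may track one value per family.

```rust
/// Hypothetical child score: a confidence in [0, 1] and the lines it covers.
struct Scored {
    confidence: f64,
    lines: usize,
}

/// Line-weighted mean of the children; None when there are no lines to weigh.
fn roll_up(children: &[Scored]) -> Option<f64> {
    let total: usize = children.iter().map(|c| c.lines).sum();
    if total == 0 {
        return None;
    }
    let weighted: f64 = children
        .iter()
        .map(|c| c.confidence * c.lines as f64)
        .sum();
    Some(weighted / total as f64)
}

fn main() {
    // Two symbols in one file: a long confident one and a short uncertain one.
    let symbols = [
        Scored { confidence: 0.9, lines: 90 },
        Scored { confidence: 0.5, lines: 10 },
    ];
    // (0.9 * 90 + 0.5 * 10) / 100, so the long symbol dominates.
    if let Some(file_confidence) = roll_up(&symbols) {
        println!("file confidence: {file_confidence:.2}");
    }
}
```

The same function can be applied again one level up, with files as the children of a directory.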
| Key | Action |
|---|---|
| j / ↓ | Move down |
| k / ↑ | Move up |
| Enter / → / l | Expand directory |
| ← | Collapse directory or go to parent |
| d / PageDown | Scroll detail pane down |
| u / PageUp | Scroll detail pane up |
| ⇧→ / ⇧← | Scroll detail pane right / left |
| h | Toggle git history panel (files only) |
| ↑ / ↓ in history | Navigate commits |
| Esc | Close history panel |
| q / Ctrl+C | Quit |
Live Watch Mode
# Re-analyze on every file save, print deltas to stdout
Uses OS file-system events (inotify/kqueue/FSEvents) with a 300 ms debounce and a 2 s per-file cooldown to suppress duplicate events from a single save.
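The per-file cooldown half of that can be sketched with nothing but the standard library. `Cooldown` and `accept` are hypothetical names; the sketch takes the timestamp as a parameter so it can be exercised without sleeping, and it does not model the 300 ms debounce or the OS event source.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Drops events for a path that arrive within `cooldown` of the last accepted one.
struct Cooldown {
    cooldown: Duration,
    last_accepted: HashMap<PathBuf, Instant>,
}

impl Cooldown {
    fn new(cooldown: Duration) -> Self {
        Self { cooldown, last_accepted: HashMap::new() }
    }

    /// Returns true if an event seen at `now` should trigger a re-analysis.
    /// Rejected events do not reset the timer.
    fn accept(&mut self, path: &Path, now: Instant) -> bool {
        let too_soon = self
            .last_accepted
            .get(path)
            .map_or(false, |&prev| now.duration_since(prev) < self.cooldown);
        if too_soon {
            return false;
        }
        self.last_accepted.insert(path.to_path_buf(), now);
        true
    }
}

fn main() {
    let t0 = Instant::now();
    let mut cooldown = Cooldown::new(Duration::from_secs(2));
    let file = Path::new("src/lib.rs");
    // First event is accepted, a duplicate 500 ms later is suppressed,
    // and an event after the 2 s window is accepted again.
    println!("{}", cooldown.accept(file, t0));
    println!("{}", cooldown.accept(file, t0 + Duration::from_millis(500)));
    println!("{}", cooldown.accept(file, t0 + Duration::from_millis(2500)));
}
```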
Ignore Rules
vibecheck respects .gitignore automatically. For additional exclusions, drop a .vibecheck file in your project root:
# .vibecheck
[]
# Extra patterns (gitignore glob syntax), additive on top of .gitignore.
= [
"vendor/",
"dist/",
"*.min.js",
"*.generated.ts",
]
# Set to false to stop reading .gitignore (default: true).
= true
# Set to false to stop reading the global gitignore (default: true).
= true
Discovery walks upward from the analyzed path to the nearest .vibecheck file or .git directory. Falls back to gitignore-only if no config file is found.
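The upward walk can be sketched with `Path::ancestors`. `Discovery` and `discover` are hypothetical names, and checking for .vibecheck before .git within the same directory is an assumption of this example. The existence check is injected as a closure so the sketch runs without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// What the walk found first (hypothetical type).
#[derive(Debug, PartialEq)]
enum Discovery {
    /// A .vibecheck file: use it, plus gitignore rules.
    ConfigFile(PathBuf),
    /// A .git directory with no config on the way up: gitignore-only.
    GitRoot(PathBuf),
    /// Reached the filesystem root without finding either.
    NotFound,
}

/// Walk from `start` toward the root, stopping at the nearest marker.
/// Assumption: .vibecheck wins over .git when both sit in the same directory.
fn discover(start: &Path, exists: impl Fn(&Path) -> bool) -> Discovery {
    for dir in start.ancestors() {
        let config = dir.join(".vibecheck");
        if exists(&config) {
            return Discovery::ConfigFile(config);
        }
        if exists(&dir.join(".git")) {
            return Discovery::GitRoot(dir.to_path_buf());
        }
    }
    Discovery::NotFound
}

fn main() {
    // Fake filesystem for the demo; a real caller would pass `|p| p.exists()`.
    let fake = ["/repo/.git", "/repo/crates/.vibecheck"];
    let exists = |p: &Path| fake.iter().any(|f| Path::new(f) == p);
    println!("{:?}", discover(Path::new("/repo/crates/core/src"), exists));
    println!("{:?}", discover(Path::new("/repo/docs"), exists));
}
```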
To point at a config file explicitly on any subcommand:
Ignored paths are excluded from all traversal layers — they do not enter the file list, the Merkle hash tree, or the watch event queue.
Heuristics
Every detection rule in vibecheck is a signal with three properties:
- Stable ID (rust.errors.zero_unwrap) — used as the config key and for cache invalidation
- Weight — how strongly the signal shifts the score (positive = evidence for the family; 0.0 = disabled)
- Family — which model family the signal points toward (Claude, Gpt, Copilot, Human, …)
There are currently 151 signals across Rust, Python, JavaScript, and Go.
Viewing signals
# Pretty table grouped by language then analyzer (default)
vibecheck heuristics
# Output:
# Language Analyzer Signal ID Family Weight Description
# ─────────────────────────────────────────────────────────────────────────────
# rust errors rust.errors.zero_unwrap Claude 1.50 Zero .unwrap() calls in a large file
# rust errors rust.errors.many_unwraps Human 1.50 5+ .unwrap() calls — pragmatic style
# …
# TOML block ready to paste into .vibecheck
vibecheck heuristics --format toml
# Output:
# [heuristics]
# # "rust.errors.zero_unwrap" = 1.5 # Claude: Zero .unwrap() calls in a large file
# # "rust.errors.many_unwraps" = 1.5 # Human: 5+ .unwrap() calls — pragmatic style
# …
Overriding weights
Add a [heuristics] section to your .vibecheck config. Any signal not listed falls back to its default weight.
# .vibecheck
[]
= ["vendor/", "dist/"]
[heuristics]
# Double the zero-unwrap signal — you care a lot about this one
"rust.errors.zero_unwrap" = 3.0
# Disable the trailing-whitespace signal — your auto-formatter isn't deterministic
= 0.0
# Your codebase uses panic! legitimately; reduce human penalty
= 0.5
Setting a weight to 0.0 disables the signal entirely — it won't appear in reports or affect scores. Weights above the default amplify a signal you find particularly reliable.
Run vibecheck heuristics --format toml to get a pre-commented block of every signal with its default — copy, uncomment, and edit.
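The fallback rule reads naturally as a two-map lookup. The sketch below is only an illustration: `effective_weight` and `is_enabled` are hypothetical helpers, and the two signal IDs and their 1.5 defaults are taken from the sample output above.

```rust
use std::collections::HashMap;

/// Effective weight for a signal: the override if present, else the default.
/// Returns None for an ID that is in neither map.
fn effective_weight(
    id: &str,
    defaults: &HashMap<&str, f64>,
    overrides: &HashMap<&str, f64>,
) -> Option<f64> {
    overrides.get(id).or_else(|| defaults.get(id)).copied()
}

/// A weight of exactly 0.0 means the signal is disabled.
fn is_enabled(weight: f64) -> bool {
    weight != 0.0
}

fn main() {
    let defaults = HashMap::from([
        ("rust.errors.zero_unwrap", 1.5),
        ("rust.errors.many_unwraps", 1.5),
    ]);
    // Only one signal is overridden (here: disabled); the other keeps its default.
    let overrides = HashMap::from([("rust.errors.many_unwraps", 0.0)]);

    for id in ["rust.errors.zero_unwrap", "rust.errors.many_unwraps"] {
        if let Some(w) = effective_weight(id, &defaults, &overrides) {
            println!("{id}: weight {w}, enabled: {}", is_enabled(w));
        }
    }
}
```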
Signal catalogue
Top signals by weight per language (regenerated by cargo build --release -p vibecheck-cli; run vibecheck heuristics for the full live table):
| Language | Signal ID | Family | Weight | Description |
|---|---|---|---|---|
| rust | rust.ai_signals.all_fns_documented | Claude | 2.0 | Every function has a doc comment — suspiciously thorough |
| rust | rust.ai_signals.commented_out_code | Human | 2.0 | 2+ lines of commented-out code |
| rust | rust.comments.external_refs | Human | 2.0 | 2+ ticket/issue references in comments |
| rust | rust.comments.terse_markers | Human | 2.0 | 2+ terse/frustrated comments (TODO, HACK, etc.) |
| rust | rust.naming.many_single_char_vars | Human | 2.0 | 3+ single-character variable names |
| python | python.ai_signals.all_fns_documented | Claude | 2.0 | Every function has a docstring — suspiciously thorough |
| python | python.ai_signals.commented_out_code | Human | 2.0 | 2+ lines of commented-out code |
| python | python.comments.external_refs | Human | 2.0 | 2+ ticket/issue references in comments |
| python | python.comments.terse_markers | Human | 2.0 | 2+ terse/frustrated comments |
| python | python.naming.many_single_char | Human | 2.0 | 3+ single-character names |
| javascript | js.ai_signals.commented_out_code | Human | 2.0 | 2+ lines of commented-out code |
| javascript | js.ai_signals.console_log | Human | 2.0 | 3+ console.log calls — likely debugging artifacts |
| javascript | js.comments.external_refs | Human | 2.0 | 2+ ticket/issue references in comments |
| javascript | js.comments.terse_markers | Human | 2.0 | 2+ terse/frustrated comments (TODO, HACK, etc.) |
| javascript | js.naming.many_single_char | Human | 2.0 | 3+ single-character names |
| go | go.ai_signals.all_exported_documented | Claude | 2.0 | All exported identifiers have doc comments |
| go | go.ai_signals.commented_out_code | Human | 2.0 | 2+ lines of commented-out code |
| go | go.comments.external_refs | Human | 2.0 | 2+ ticket/issue references in comments |
| go | go.comments.terse_markers | Human | 2.0 | 2+ terse/frustrated comments (TODO, HACK, etc.) |
| go | go.naming.many_single_char | Human | 2.0 | 3+ single-character names |
Git History
# Replay git history for a file and show how attribution changed over commits
# Limit to the last N commits that touched the file (default: 20)
Reads blobs directly from the git object store (no working-tree checkout). Prints a table: COMMIT | DATE | FAMILY | CONFIDENCE | CHANGE.
The Ultimate Test: Self-Detection
vibecheck was written by an AI. Does it know?
$ vibecheck vibecheck-core/src/ --format text
vibecheck-core/src/store.rs → Claude (70%) # highest confidence
vibecheck-core/src/pipeline.rs → Claude (68%)
vibecheck-core/src/colors.rs → Claude (60%)
vibecheck-core/src/heuristics.rs → Claude (58%)
vibecheck-core/src/analyzers/cst/go.rs → Human (41%) # tree-sitter code: short vars, .unwrap()
vibecheck-core/src/project_tools.rs → Gemini (36%) # struct-heavy config detection
20 of 25 source files correctly attributed to Claude (34–70% confidence). The CST analyzer files — full of single-character tree-sitter cursor variables and pragmatic .unwrap() calls — read as Human, which is honestly fair. One config-detection module reads as Gemini (compact struct-heavy style).
$ vibecheck vibecheck-core/src/ --assert-family claude,human,gemini --no-cache
All files passed the vibe check. # exits 0
When the AI detector you wrote with AI detects itself as AI:
┌────────────────────────┐
│ │
│ ◉_◉ │
│ │
│ ...well, well, well. │
│ │
│ If it isn't the │
│ consequences of my │
│ own architecture. │
│ │
└────────────────────────┘
"I'm in this photo and I don't like it"
— this crate's source code, probably
Library API
use Path;
use ModelFamily;
// Analyze a source string directly (no file I/O)
let report = analyze;
println!;
// Analyze a file — content-addressed cache is consulted automatically
// Returns std::io::Result<Report>
let report = analyze_file?;
if report.attribution.primary != Human
// Bypass the cache entirely
let report = analyze_file_no_cache?;
// Symbol-level attribution — Report.symbol_reports is populated
// Returns anyhow::Result<Report>
let report = analyze_file_symbols?;
if let Some = &report.symbol_reports
// Symbol-level, cache bypassed
let report = analyze_file_symbols_no_cache?;
// Directory analysis — Merkle tree skips unchanged subtrees when use_cache=true
// Returns anyhow::Result<Vec<(PathBuf, Report)>>
let results = analyze_directory?;
for in results
// Directory analysis with custom ignore rules (dependency injection)
use ;
// Production: auto-discover .vibecheck + .gitignore
let ignore = load;
let results = analyze_directory_with?;
// Load from an explicit config file
let ignore = from_file?;
let results = analyze_directory_with?;
// Tests: inject a lightweight in-memory impl — no filesystem access needed
let ignore = PatternIgnore;
let results = analyze_directory_with?;
// Or implement the trait directly for full control
;
let results = analyze_directory_with?;
GitHub Action / CI Integration
A ready-to-use workflow lives at .github/workflows/vibecheck.yml. It triggers on every pull request and exits 1 if any file's attribution isn't in the allowed list — blocking the PR automatically.
Use case 1: enforce that all code is AI-generated (vibecheck dogfoods this on itself)
- name: Vibecheck source code
run: cargo run --release -p vibecheck-cli -- vibecheck-core/src/ --format text --assert-family claude,gpt,copilot,gemini --no-cache
Use case 2: enforce that all code is human-written (block AI slop from landing)
- name: No AI slop allowed
run: vibecheck src/ --assert-family human
When a file fails, stderr shows exactly what was caught and why:
src/new_feature.rs — detected as Claude (89%), expected one of: human
Exit code 1 fails the job and blocks the PR. Both use cases work the same way — --assert-family is just a comma-separated list of families you're willing to accept.
Architecture
Crate split:
| Crate | Contents | Who uses it |
|---|---|---|
| vibecheck-core | Analysis engine, CST analyzers, cache, corpus store | any tool that imports it |
| vibecheck-cli | CLI binary | end users |
vibecheck-core has no CLI dependencies — it is a clean library crate that any tool can import.
Model Family Profiles
How vibecheck tells them apart:
- Claude: Thorough doc comments, teaching voice, zero unwrap(), textbook iterator chains, format!() over concatenation, sorted imports, suspiciously complete
- GPT: Explicit type annotations, builder patterns, method chaining, explanatory (but less pedagogical) comments
- Copilot: Works but cuts corners — moderate unwrap() usage, less documentation, pragmatic completion style
- Gemini: Currently limited signal set (future improvement area)
- Human: TODOs everywhere, // HACK, commented-out code, single-character variables, panic!() calls, string concatenation, chaotic formatting
Feature Flags
| Crate | Feature | Default | What it enables |
|---|---|---|---|
| vibecheck-core | corpus | No | SQLite corpus + trend store (rusqlite) |
| vibecheck-cli | — | — | CLI binary; always has clap, walkdir, colored, anyhow |
The corpus feature
The corpus store is separate from the content-addressed redb cache. They serve different purposes:
- redb cache (always on) — performance. If a file's SHA-256 hash hasn't changed, return the cached Report instantly without re-running any analyzers.
- corpus store (opt-in) — data collection. Every result is written to SQLite in two tables:
  - corpus_entries — one deduplicated row per unique file hash, recording its attribution and confidence.
  - trend_entries — a timestamped row on every analysis run (no deduplication). This lets you plot how a file's attribution drifts over time as you edit it or as the heuristics improve.
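The difference between the two tables comes down to insert-if-absent versus always-append. The sketch below models that with in-memory collections instead of SQLite; `CorpusSketch`, `Row`, and `record` are hypothetical names, and keeping the first row seen for a hash (rather than updating it) is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical row shape shared by both tables in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    family: String,
    confidence: f64,
    timestamp: u64,
}

/// In-memory stand-in for the two SQLite tables.
#[derive(Default)]
struct CorpusSketch {
    /// One row per unique file hash (deduplicated).
    corpus_entries: HashMap<String, Row>,
    /// One row per analysis run (append-only).
    trend_entries: Vec<(String, Row)>,
}

impl CorpusSketch {
    /// Record one analysis result under the given content hash.
    fn record(&mut self, file_hash: &str, row: Row) {
        // Assumption: the first row seen for a hash is kept as-is.
        self.corpus_entries
            .entry(file_hash.to_string())
            .or_insert_with(|| row.clone());
        self.trend_entries.push((file_hash.to_string(), row));
    }
}

fn main() {
    let mut store = CorpusSketch::default();
    // The same file contents analyzed on two separate runs.
    store.record("abc123", Row { family: "claude".into(), confidence: 0.70, timestamp: 1 });
    store.record("abc123", Row { family: "claude".into(), confidence: 0.72, timestamp: 2 });
    println!("corpus rows: {}", store.corpus_entries.len());
    println!("trend rows: {}", store.trend_entries.len());
}
```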
To enable the corpus store:
What's Coming
THE GRAND PLAN
──────────────────────────────────────────────────────
v0.1 - "It Works On My Machine" ✓ shipped
v0.2 - "Infrastructure That Doesn't Lie" ✓ shipped
(Merkle cache, symbol-level, TUI,
watch mode, git history)
v0.3 - "Please Don't Scan My node_modules" ✓ shipped
(ignore rules, .vibecheck config, IgnoreRules DI)
v0.4 - "Trust No Signal You Can't Override" ✓ shipped
(heuristics config system, signal IDs, weight overrides,
vibecheck heuristics command, TUI history panel)
v0.5 - "It's Giving Claude" ✓ shipped
(canonical color source, full model display names, Codecov)
v0.6 - "Signals Are Data, Not Code" <- next
(heuristics catalog: patterns/thresholds as structured definitions,
per-language and per-model configurability, deduped signal logic)
v0.7 - "Your Codebase Has a Trend Problem"
(persistent trend store, sparklines, TUI attribution drift panel)
v0.8 - "More Languages, Fewer Excuses"
(TypeScript-specific signals, Ruby, Java, expanded Go/Python depth,
accuracy benchmarks against known human/AI repos)
v0.9 - "We Trained a Model On This"
(corpus scraper via git co-author metadata, linfa classifier,
hand-tuned weights replaced by trained model, version detection)
v1.0 - "Skynet But For Code Review"
(vibecheck-core 1.0 API stability, WASM plugin interface,
IDE integration, published benchmark suite)
──────────────────────────────────────────────────────
Roadmap
Phase 1 — Infrastructure ✅
- Crate split — vibecheck-core (library) + vibecheck-cli (binary)
- Content-addressed cache — SHA-256 per file; skip re-analysis of unchanged files (redb)
- tree-sitter CST analysis — Rust (5 signals), Python (3 signals), JavaScript (3 signals), Go (3 signals)
- Corpus store — SQLite-backed labeled dataset + trend log, feature-gated (--features corpus)
- Library API — vibecheck-core is a clean library crate with no CLI dependencies
- JSON output — pipe results to other tools
- GitHub Action — run vibecheck in CI, fail PRs based on AI attribution (--assert-family)
Phase 2 — Visible Product ✅
- Historical trend tracking — vibecheck history <path> replays git log
- Live watch mode — vibecheck watch <path> re-analyzes on file saves
- TUI navigator — ratatui-based codebase browser with confidence bars
- Symbol-level attribution — vibecheck --symbols <file> breaks down each function/method
- Merkle hash tree — incremental directory analysis; unchanged subtrees are skipped entirely
- Ignore rules — .vibecheck config file; auto-respects .gitignore; --ignore-file flag; IgnoreRules trait for DI in library consumers
Phase 3 — Configurability
- Heuristics catalog — patterns and thresholds as structured data, not scattered imperative logic
- Per-language signal config — tune or disable signals per language in .vibecheck
- Trend store + sparklines — persistent per-file attribution history; drift visible in TUI
- Expanded language support — TypeScript-specific signals, Ruby, Java, deeper Go/Python coverage
Phase 4 — Intelligence
- Corpus scraper — acquire labeled samples from public repos via git co-author metadata
- ML classification — linfa-based model trained on corpus; replaces hand-tuned weights
- Version detection — distinguish Claude 3.5 vs Claude 4, GPT-3.5 vs GPT-4o (corpus permitting)
- Benchmark suite — accuracy metrics against known human/AI code datasets
Phase 5 — Platform
- WASM plugin interface — external analyzers without recompiling
- IDE integration — LSP server or VS Code extension
- vibecheck-core 1.0 — stable semver API guarantee
Limitations
┌─────────────────────────────────────────────────┐
│ │
│ DISCLAIMER (legally required vibes disclosure) │
│ │
│ vibecheck is a heuristic tool. │
│ It detects VIBES, not PROOF. │
│ │
│ A meticulous human might code like Claude. │
│ A sloppy prompt might produce messy AI. │
│ │
│ Do NOT use this to: │
│ - accuse your coworker in a code review │
│ - settle bets on who wrote the bug │
│ - submit as evidence in a court of law │
│ │
│ DO use this to: │
│ - win bets on who wrote the bug │
│ - roast your team's PR descriptions │
│ - feel seen when it detects your AI code │
│ │
│ (Also, this entire crate was written by an AI │
│ so we are absolutely not throwing stones.) │
│ │
└─────────────────────────────────────────────────┘
Current limitations:
- Heuristic-based — no ML model; weights are hand-tuned, not learned from a corpus
- Not adversarial-resistant — deliberately obfuscated AI code will fool it
- Model family overlap — GPT and Claude share many patterns; attribution between them is fuzzy
- Symbol-level is file-cached — --symbols results are cached per file hash; mixed authorship within a file is detected but symbol boundaries depend on tree-sitter parse quality
- Watch/history are read-only — no persistent trend store yet; trend deltas are printed to stdout only
Contributing
Contributions welcome! Some high-impact areas:
- More signals — if you notice a pattern that screams "AI wrote this", open a PR
- Weight tuning — help calibrate signal weights against real-world code
- More CST signals — extend the existing JS/Go/Rust/Python CST analyzers or add a new language (implement CstAnalyzer and register in default_cst_analyzers())
- Test corpus — curate labeled examples of human vs AI code for training and benchmarking
- New text analyzers — implement the Analyzer trait (analyze(&str) -> Vec<Signal>) and register in default_analyzers()
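For anyone eyeing the text-analyzer route, here is a rough sketch of what a new analyzer might look like. The trait shape follows the signature quoted in the list above, but the `&self` receiver, the `Signal` fields, the `example.comments.terse_markers` ID, and the body of `default_analyzers()` are all assumptions; check the real definitions in vibecheck-core before copying anything.

```rust
/// Hypothetical Signal shape; the real struct in vibecheck-core may differ.
#[derive(Debug, PartialEq)]
struct Signal {
    id: &'static str,
    family: &'static str,
    weight: f64,
    description: String,
}

/// Trait shape based on the signature quoted above (receiver is assumed).
trait Analyzer {
    fn analyze(&self, source: &str) -> Vec<Signal>;
}

/// Example analyzer: flags files with 2+ terse marker comments.
struct TerseMarkers;

impl Analyzer for TerseMarkers {
    fn analyze(&self, source: &str) -> Vec<Signal> {
        // Count lines whose `//` comment contains a marker word.
        let count = source
            .lines()
            .filter_map(|line| line.split_once("//").map(|(_, comment)| comment))
            .filter(|c| ["TODO", "HACK", "FIXME"].iter().any(|m| c.contains(*m)))
            .count();
        if count >= 2 {
            vec![Signal {
                id: "example.comments.terse_markers", // made-up ID for this sketch
                family: "human",
                weight: 2.0,
                description: format!("{count} terse comments (TODO, HACK, etc.)"),
            }]
        } else {
            Vec::new()
        }
    }
}

/// Hypothetical registry; the real default_analyzers() may return another type.
fn default_analyzers() -> Vec<Box<dyn Analyzer>> {
    vec![Box::new(TerseMarkers)]
}

fn main() {
    let source = "// TODO: fix this later\nlet x = 1; // HACK\n";
    for analyzer in default_analyzers() {
        for signal in analyzer.analyze(source) {
            println!("{} -> {} ({})", signal.id, signal.family, signal.description);
        }
    }
}
```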
License
MIT
Made with massive vibes by an AI that is fully aware
of the irony of writing a tool to detect itself.
┌──────────────────────────────────────────────────┐
│ $ vibecheck vibecheck-core │
│ │
│ Verdict: Claude (81%) │
│ │
│ Signals: │
│ [ai_signals] Zero TODOs, alphabetized │
│ imports, and every function has a doc │
│ comment. This is either a very disciplined │
│ human or — and I cannot stress this enough │
│ — a chatbot. │
│ │
│ Source: I am literally that chatbot. │
│ │
└──────────────────────────────────────────────────┘