# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build & Test Commands

```bash
# Build
cargo build
cargo build --release

# Run all tests
cargo test --verbose

# Run only unit tests
cargo test --lib

# Run only integration tests
cargo test --test '*'

# Run a single test by name
cargo test test_full_workflow -- --nocapture

# Lint and format
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings

# Security audit
cargo audit
```

## CLI Usage

```bash
# Run locally during development
cargo run -- <command>

# Setup and configuration
cargo run -- setup              # One-time global setup (Claude Code integration)
cargo run -- doctor             # Verify configuration
cargo run -- init               # Initialize repository hooks

# Core attribution commands
cargo run -- blame src/main.rs
cargo run -- blame src/main.rs --ai-only
cargo run -- show HEAD
cargo run -- show HEAD --format json
cargo run -- prompt src/main.rs:42
cargo run -- summary --base main --format markdown

# Status and utility commands
cargo run -- status             # Show pending changes
cargo run -- clear              # Discard pending changes

# Data export and management
cargo run -- export --format json
cargo run -- export --since 2024-01-01 --until 2024-12-31
cargo run -- retention preview
cargo run -- retention apply --execute
cargo run -- audit --limit 100

# Developer integration (GitHub, git)
cargo run -- annotations --base main --head HEAD
cargo run -- annotations --base main --min-ai-lines 5 --sort-by coverage
cargo run -- annotations --base main --diff-only --group-ai-types
cargo run -- pager              # Read diff from stdin

# Privacy testing
cargo run -- redact-test --text "api_key=secret"
cargo run -- redact-test --list-patterns

# Copy attribution (after cherry-pick or recovery)
cargo run -- copy-notes <old-sha> <new-sha>
cargo run -- copy-notes abc123 def456 --dry-run
```

## Architecture Overview

whogitit tracks AI-generated code at line-level granularity. A dual-hook capture system (PreToolUse and PostToolUse) records each edit, and a three-way diff analysis attributes every committed line.

### Data Flow

```
Claude Code Session (Edit/Write tools)
    │
    ├─► PreToolUse Hook: saves file state before edit
    ├─► PostToolUse Hook: captures change + prompt from transcript
    │
    ▼
Pending Buffer (.whogitit-pending.json)
    │   - Full content snapshots (original → AI edit 1 → AI edit 2 → ...)
    │   - Session metadata, prompts, file histories
    │
    ▼ (git commit triggers post-commit hook)
    │
Three-Way Analysis
    │   - Compares: original → AI snapshots → final committed
    │   - Attributes each line: AI / AIModified / Human / Original
    │
    ▼
Git Notes (refs/notes/whogitit)
    - AIAttribution JSON per commit
    - Prompts (redacted), line-level attribution, summaries
```
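
The three-way step in the flow above can be sketched roughly as follows. This is a simplified illustration, not the algorithm in `threeway.rs`: it assumes a single AI snapshot, compares whole lines by set membership, and uses a made-up token-overlap heuristic with a 0.5 threshold to decide that a line is AI output later edited by a human. The names `Source`, `token_overlap`, and `attribute` are hypothetical.

```rust
use std::collections::HashSet;

// Hypothetical attribution labels, mirroring the four outcomes in the data flow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    Ai,
    AiModified,
    Human,
    Original,
}

// Crude similarity: share of `a`'s whitespace-separated tokens that also occur in `b`.
// This heuristic is an assumption made for the sketch.
fn token_overlap(a: &str, b: &str) -> f64 {
    let a_tokens: HashSet<&str> = a.split_whitespace().collect();
    let b_tokens: HashSet<&str> = b.split_whitespace().collect();
    if a_tokens.is_empty() {
        return 0.0;
    }
    let shared = a_tokens.intersection(&b_tokens).count();
    shared as f64 / a_tokens.len() as f64
}

// Labels every committed line by comparing original, AI snapshot, and final content.
fn attribute(original: &str, ai_snapshot: &str, committed: &str) -> Vec<Source> {
    let original_lines: HashSet<&str> = original.lines().collect();
    // Lines the AI introduced: present in the snapshot but absent from the original.
    let ai_lines: Vec<&str> = ai_snapshot
        .lines()
        .filter(|line| !original_lines.contains(line))
        .collect();

    committed
        .lines()
        .map(|line| {
            if ai_lines.contains(&line) {
                Source::Ai
            } else if original_lines.contains(&line) {
                Source::Original
            } else if ai_lines.iter().any(|ai| token_overlap(line, ai) >= 0.5) {
                Source::AiModified
            } else {
                Source::Human
            }
        })
        .collect()
}

fn main() {
    let original = "fn main() {\n}\n";
    let ai_snapshot = "fn main() {\n    let total = add(1, 2);\n    println!(\"{}\", total);\n}\n";
    let committed =
        "fn main() {\n    let total = add(1, 3);\n    println!(\"{}\", total);\n    log(total);\n}\n";

    for (source, line) in attribute(original, ai_snapshot, committed).iter().zip(committed.lines()) {
        println!("{:?}: {}", source, line);
    }
}
```

A real analyzer would need a positional diff rather than set membership, since duplicate lines (for example a lone `}`) are ambiguous under this scheme.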

### Key Modules

- **capture/**: Hook handlers and pending buffer
  - `hook.rs`: CaptureHook - handles PreToolUse/PostToolUse from Claude Code
  - `pending.rs`: PendingBuffer - stores snapshots until commit
  - `threeway.rs`: ThreeWayAnalyzer - core attribution algorithm
  - `snapshot.rs`: Data structures (ContentSnapshot, AIEdit, FileEditHistory, LineAttribution)
  - `diff.rs`: Diff utilities

- **core/**: Attribution data models and blame engine
  - `attribution.rs`: AIAttribution, PromptInfo, SessionMetadata, ModelInfo
  - `blame.rs`: AIBlamer - combines git blame with AI notes

- **storage/**: Git notes persistence
  - `notes.rs`: NotesStore - read/write attribution to `refs/notes/whogitit`
  - `trailers.rs`: TrailerGenerator - git trailers from attribution
  - `audit.rs`: AuditLog, AuditEvent - compliance event logging

- **cli/**: Command implementations
  - `blame.rs`, `show.rs`, `prompt.rs`, `summary.rs` - core attribution commands
  - `annotations.rs`: GitHub Checks API annotation generation
  - `pager.rs`: Git diff pager with AI attribution markers
  - `export.rs`: Bulk attribution export (JSON/CSV)
  - `setup.rs`: Global setup, doctor, and init commands
  - `retention.rs`: Data retention policy management
  - `audit.rs`: Audit log viewing
  - `redact.rs`: Redaction pattern testing
  - `copy.rs`: Copy attribution between commits
  - `output.rs`: Formatting (Pretty, JSON, Markdown)

- **privacy/**: Sensitive data protection
  - `redaction.rs`: Redactor - regex patterns for API keys, emails, passwords, etc.
  - `config.rs`: WhogititConfig, PrivacyConfig, RetentionConfig - `.whogitit.toml` parsing
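
As a rough illustration of what the redaction step in `privacy/` does to a prompt, here is a standard-library-only sketch. It is a stand-in, not the `Redactor` implementation: the real module uses regex patterns, whereas this version only recognizes `key=value` words. The key list, the `[REDACTED]` marker, and the `redact` function are assumptions made for the example.

```rust
// Hypothetical list of sensitive keys; the real patterns live in `redaction.rs`.
const SENSITIVE_KEYS: [&str; 3] = ["api_key", "password", "token"];

// Replaces the value of any `key=value` word whose key is considered sensitive.
// Words are split on single spaces so the original spacing is preserved on join.
fn redact(text: &str) -> String {
    text.split(' ')
        .map(|word| match word.split_once('=') {
            Some((key, _)) if SENSITIVE_KEYS.contains(&key.to_ascii_lowercase().as_str()) => {
                format!("{key}=[REDACTED]")
            }
            _ => word.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    println!("{}", redact("deploy with api_key=secret to staging"));
}
```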

### Line Attribution Types

| Source | Symbol | Description |
|--------|--------|-------------|
| AI | `●` | Generated by AI, unchanged |
| AIModified | `◐` | Generated by AI, then edited by human |
| Human | `+` | Added by human after AI edits |
| Original | `─` | Existed before AI session |
| Unknown | `?` | Could not determine source |
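
The table maps naturally onto an enum. The sketch below is a hypothetical mirror of it for illustration; the actual type in `snapshot.rs` may use different names and carry extra data.

```rust
// Hypothetical mirror of the attribution table; variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LineSource {
    Ai,
    AiModified,
    Human,
    Original,
    Unknown,
}

impl LineSource {
    // Marker shown next to each line, taken from the Symbol column.
    fn symbol(self) -> char {
        match self {
            LineSource::Ai => '●',
            LineSource::AiModified => '◐',
            LineSource::Human => '+',
            LineSource::Original => '─',
            LineSource::Unknown => '?',
        }
    }

    // True for lines that started as AI output, whether or not a human edited them.
    fn is_ai_origin(self) -> bool {
        matches!(self, LineSource::Ai | LineSource::AiModified)
    }
}

fn main() {
    let lines = [
        (LineSource::Original, "fn main() {"),
        (LineSource::Ai, "    let total = add(1, 2);"),
        (LineSource::AiModified, "    println!(\"{}\", total);"),
        (LineSource::Human, "    log(total);"),
        (LineSource::Original, "}"),
    ];
    for (source, text) in lines {
        println!("{} {}", source.symbol(), text);
    }
    let ai_count = lines.iter().filter(|(s, _)| s.is_ai_origin()).count();
    println!("AI-origin lines: {}", ai_count);
}
```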

### Data Formats

**Pending Buffer** (`.whogitit-pending.json`): Version 2 format with full content snapshots per file, edit history chain, and session metadata.

**Git Notes** (`refs/notes/whogitit`): AIAttribution JSON with schema version 2, containing session info, prompts array, and per-file line-level attribution results.

## Code Style

- Rust 2021 edition
- Max line width: 100 characters
- 4-space indentation
- Run `cargo fmt` before committing

## Hook Integration

The shell hook at `hooks/whogitit-capture.sh` (installed to `~/.claude/hooks/`) reads the `transcript_path` JSONL file provided by Claude Code to extract user prompts. It uses `jq -s` (slurp mode) because the transcript is in JSON Lines format (one JSON value per line), not a single JSON array.
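
To make the JSON Lines distinction concrete, the sketch below shows in Rust what slurp mode does conceptually: it gathers the per-line JSON values into one array so they can be processed together. It assumes every non-empty line is a complete JSON value and does no parsing or validation. The function name `slurp_jsonl` and the sample records are hypothetical; the real transcript fields may differ.

```rust
// Wraps JSON Lines input in a single JSON array, the way slurp mode collects its inputs.
// Assumes every non-empty line is one complete JSON value; nothing is validated.
fn slurp_jsonl(jsonl: &str) -> String {
    let records: Vec<&str> = jsonl
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .collect();
    format!("[{}]", records.join(","))
}

fn main() {
    // Hypothetical transcript records, one JSON object per line.
    let transcript =
        "{\"type\":\"user\",\"text\":\"add a test\"}\n{\"type\":\"assistant\",\"text\":\"done\"}\n";
    println!("{}", slurp_jsonl(transcript));
}
```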