forgekit-core 0.2.2

# ForgeKit Manual - Developer Guide

> **TL;DR**: ForgeKit is your Black Box Recorder and Verification Engine. It stops the LLM from gaslighting you and gives you an "I Screwed Up" button when things go sideways.

---

## Table of Contents

1. [The "Anti-Hallucination" Check](#1-the-anti-hallucination-check)
2. [Audit Trails for Change Verification](#2-audit-trails-for-change-verification)
3. [The "I Screwed Up" Button (Temporal Checkpointing)](#3-the-i-screwed-up-button-temporal-checkpointing)
4. [Quick Start](#quick-start)
5. [Common Workflows](#common-workflows)
6. [Pre-Change Verification Checklist](#pre-change-verification-checklist)
7. [See Also](#see-also)

---

## 1. The "Anti-Hallucination" Check

### The Problem

The LLM claims: *"This function handles error case X"*

You look at the code... does it? Or is the LLM hallucinating?

### The Solution: Contradiction Detection

ForgeKit's CFG (Control Flow Graph) analysis lets you check a claim about a function's behavior against the control flow that is actually present in the code:

```rust
use forge_core::Forge;
use std::path::Path;

let forge = Forge::open("./my-project").await?;

// LLM claims: "function foo handles the error path"
// Verify it:
let cfg = forge.cfg()
    .extract_function_cfg(Path::new("src/lib.rs"), "foo")
    .await?;

if let Some(cfg) = cfg {
    // Check if there's actually an error path
    let paths = cfg.enumerate_paths();
    let error_paths: Vec<_> = paths.iter()
        .filter(|p| p.is_error())
        .collect();
    
    if error_paths.is_empty() {
        println!("🚨 CONTRADICTION: LLM claims error handling, but CFG shows no error path!");
    } else {
        println!("✅ Verified: {} error path(s) found", error_paths.len());
    }
}
```

### The Rule

**If the agent claims a function does X, but the CFG extracted by `forge_core` contains no path that does X, the Contradiction Detector flags the claim.**

No more "trust me bro" from the LLM.
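To make the rule concrete, here is a minimal, standalone sketch of the idea behind it. It does not use `forge_core`: `ToyCfg`, `BlockKind`, `divide_cfg`, and `count_error_paths` are hypothetical names invented for illustration, and the real extractor may represent graphs quite differently. The sketch enumerates every acyclic path through a tiny hand-built graph and counts the paths that pass through an error block.

```rust
use std::collections::HashMap;

/// Kind of a basic block in this toy graph (hypothetical, not a forge_core type).
#[derive(Clone, Copy)]
enum BlockKind {
    Normal,
    Error,
}

/// A toy control flow graph: block id -> (kind, successor ids).
struct ToyCfg {
    blocks: HashMap<u32, (BlockKind, Vec<u32>)>,
    entry: u32,
}

impl ToyCfg {
    /// Enumerate paths from the entry block. Successors already on the
    /// current path are skipped, so loops are cut instead of followed forever.
    /// A path is finished when no unvisited successor remains.
    fn enumerate_paths(&self) -> Vec<Vec<u32>> {
        let mut finished = Vec::new();
        let mut pending = vec![vec![self.entry]];
        while let Some(path) = pending.pop() {
            let last = *path.last().expect("paths are never empty");
            let succs: &[u32] = match self.blocks.get(&last) {
                Some((_, s)) => s.as_slice(),
                None => &[],
            };
            let next: Vec<u32> = succs
                .iter()
                .copied()
                .filter(|b| !path.contains(b))
                .collect();
            if next.is_empty() {
                finished.push(path);
            } else {
                for b in next {
                    let mut longer = path.clone();
                    longer.push(b);
                    pending.push(longer);
                }
            }
        }
        finished
    }

    /// A path counts as an error path if it visits at least one error block.
    fn is_error_path(&self, path: &[u32]) -> bool {
        path.iter()
            .any(|b| matches!(self.blocks.get(b), Some((BlockKind::Error, _))))
    }
}

/// Hand-built graph for a `divide` function, with or without a `b == 0` check.
fn divide_cfg(checked: bool) -> ToyCfg {
    let mut blocks = HashMap::new();
    if checked {
        blocks.insert(0, (BlockKind::Normal, vec![1, 2])); // entry: branch on b == 0
        blocks.insert(1, (BlockKind::Error, vec![])); // early error return
    } else {
        blocks.insert(0, (BlockKind::Normal, vec![2])); // entry: falls straight through
    }
    blocks.insert(2, (BlockKind::Normal, vec![])); // a / b
    ToyCfg { blocks, entry: 0 }
}

/// Number of enumerated paths that touch an error block.
fn count_error_paths(cfg: &ToyCfg) -> usize {
    cfg.enumerate_paths()
        .iter()
        .filter(|p| cfg.is_error_path(p))
        .count()
}
```

With this sketch, `count_error_paths(&divide_cfg(false))` is zero, which is exactly the situation where a claim of "handles division by zero" would be contradicted by the graph.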

### Supported Languages

| Language | CFG Extraction | Status |
|----------|---------------|--------|
| C | ✅ Working | Implemented |
| Java | ✅ Working | Implemented |
| Rust | ⚠️ Partial | In Development |

### Real-World Example

```rust
// LLM claims this function handles division by zero
fn divide(a: i32, b: i32) -> i32 {
    a / b  // No check!
}

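// ForgeKit analysis (run from your tooling; `path` is the file containing `divide`):
let cfg = forge.cfg().extract_function_cfg(path, "divide").await?;
// `extract_function_cfg` yields an Option, so a missing CFG counts as "no error path"
let has_error_check = cfg.map(|c| c.has_error_path()).unwrap_or(false); // false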

// 🚨 CONTRADICTION DETECTED
// The CFG shows no error handling path for b == 0
```

---

## 2. Audit Trails for Change Verification

### The Problem

The LLM made a change. Why? What was it thinking? Can you prove this change was justified?

### The Solution: Hypothesis Board

Every change must be linked to a **Confirmed Hypothesis** with evidence.

```rust
use forge_core::Forge;
use forge_reasoning::{HypothesisBoard, Evidence, Confidence};

let forge = Forge::open("./my-project").await?;
let board = HypothesisBoard::new(&forge).await?;

// Create a hypothesis
let hypothesis = board.create_hypothesis(
    "Refactor: Extract helper function",
    "The main function is too complex (cyclomatic complexity > 10)"
).await?;

// Attach evidence
let evidence = Evidence::from_cfg_analysis(&forge, "main").await?;
hypothesis.add_evidence(evidence).await?;

// Confirm or reject
if hypothesis.meets_confidence_threshold(Confidence::High) {
    hypothesis.confirm().await?;
    // Now the refactoring can proceed with audit trail
} else {
    hypothesis.reject("Insufficient evidence").await?;
}
```

### The Rule

**Every PR generated by the agent must link to a ConfirmedHypothesisId.**

If there's no evidence attached to that ID in the ForgeKit store, the PR is automatically suspect.
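The rule can be sketched as a small, standalone check. Everything here is an assumption made for illustration: `HypothesisRecord`, `Status`, `Verdict`, and `audit_pr` are hypothetical names, and the in-memory `HashMap` merely stands in for the ForgeKit store, whose actual schema is not described in this guide.

```rust
use std::collections::HashMap;

/// Lifecycle of a hypothesis in this sketch (hypothetical, not a ForgeKit type).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Status {
    Proposed,
    Confirmed,
    Rejected,
}

/// What the stand-in store keeps per hypothesis id.
struct HypothesisRecord {
    status: Status,
    evidence: Vec<String>,
}

/// Result of auditing one PR.
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Trusted,
    Suspect(&'static str),
}

/// A PR is trusted only if it links to a known, confirmed hypothesis
/// that has at least one piece of evidence attached.
fn audit_pr(store: &HashMap<u64, HypothesisRecord>, linked: Option<u64>) -> Verdict {
    let id = match linked {
        Some(id) => id,
        None => return Verdict::Suspect("no hypothesis linked"),
    };
    match store.get(&id) {
        None => Verdict::Suspect("unknown hypothesis id"),
        Some(h) if h.status != Status::Confirmed => Verdict::Suspect("hypothesis not confirmed"),
        Some(h) if h.evidence.is_empty() => Verdict::Suspect("no evidence attached"),
        Some(_) => Verdict::Trusted,
    }
}
```

Returning a reason alongside the `Suspect` verdict keeps the audit output actionable: a reviewer can see whether the link, the confirmation, or the evidence is missing.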

### Audit Query

```rust
// Audit: Find all changes without proper evidence
let suspect_changes = forge.analysis()
    .find_changes_without_evidence()
    .await?;

for change in suspect_changes {
    println!("🚨 SUSPECT: {} has no hypothesis evidence!", change.description);
}
```

### WebSocket Real-Time Monitoring

```rust
// Monitor hypothesis changes in real-time
let ws = forge_reasoning::WebSocketServer::new(8080);

ws.on_hypothesis_confirmed(|hypothesis| {
    println!("✅ Hypothesis confirmed: {}", hypothesis.id);
    println!("   Evidence: {:?}", hypothesis.evidence);
});

ws.on_hypothesis_rejected(|hypothesis| {
    println!("❌ Hypothesis rejected: {}", hypothesis.id);
    println!("   Reason: {}", hypothesis.rejection_reason);
});
```

---

## 3. The "I Screwed Up" Button (Temporal Checkpointing)

### The Problem

The LLM just deleted half the source tree. Or introduced a subtle bug. You need to go back.

### The Solution: State Insurance

**Temporal Checkpointing is not just a feature - it's State Insurance.**

```rust
use forge_core::Forge;
use forge_reasoning::Checkpoint;

let forge = Forge::open("./my-project").await?;

// 🎯 Create checkpoint before dangerous operation
checkpoint!("Before LLM refactoring session");

// Let the LLM do its work...
let result = risky_llm_operation().await;

// Something went wrong?
if result.is_err() {
    // 🚨 CRISIS AVERTED
    forge.checkpoint().restore_last().await?;
    println!("Restored to pre-LLM state");
}
```
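Conceptually, a checkpoint is a labelled snapshot that can be put back later. The following standalone sketch shows that idea with an in-memory file map. It is not how ForgeKit stores checkpoints: `Workspace`, `Checkpoint`, and `Files` are hypothetical names used only here, and a real implementation would presumably snapshot more than file contents and avoid full copies.

```rust
use std::collections::BTreeMap;

/// File path -> file contents. A stand-in for the project on disk.
type Files = BTreeMap<String, String>;

/// A labelled snapshot of the whole file map (hypothetical type).
struct Checkpoint {
    label: String,
    files: Files,
}

/// Current state plus the history of snapshots taken so far.
struct Workspace {
    files: Files,
    checkpoints: Vec<Checkpoint>,
}

impl Workspace {
    fn new(files: Files) -> Self {
        Workspace { files, checkpoints: Vec::new() }
    }

    /// Record a full copy of the current files under `label`.
    fn checkpoint(&mut self, label: &str) {
        self.checkpoints.push(Checkpoint {
            label: label.to_string(),
            files: self.files.clone(),
        });
    }

    /// Overwrite the current files with the most recent snapshot.
    /// Returns the snapshot's label, or None if no checkpoint exists.
    /// The snapshot stays in the history, so it can be restored again.
    fn restore_last(&mut self) -> Option<String> {
        let cp = self.checkpoints.last()?;
        self.files = cp.files.clone();
        Some(cp.label.clone())
    }
}
```

Keeping the snapshot in the history after a restore is a deliberate choice in this sketch: restoring is then idempotent, and a second failed attempt can fall back to the same known-good state.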

### Checkpoint CLI

```bash
# Create checkpoint
forge checkpoint create "Stable state before feature X"

# List checkpoints
forge checkpoint list
# Output:
# ID            TIMESTAMP         DESCRIPTION
# checkpoint_5  2026-02-18T18:30  Stable state before feature X
# checkpoint_4  2026-02-18T18:15  Working tests
# checkpoint_3  2026-02-18T18:00  Before LLM session

# Restore checkpoint
forge checkpoint restore checkpoint_3

# The LLM just deleted half the source tree?
# Crisis averted in 5 seconds.
```

### Automatic Checkpointing

```rust
// Auto-checkpoint every N changes
let forge = Forge::builder()
    .auto_checkpoint_interval(10)  // Every 10 changes
    .auto_checkpoint_on_error(true)
    .open("./my-project")
    .await?;
```
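One plausible reading of these two settings is a simple counter-based policy, sketched below as standalone code. This is an assumption, not a description of ForgeKit's behavior: `AutoCheckpoint` and `record_change` are hypothetical names, and in particular the guide does not say exactly when `auto_checkpoint_on_error` fires. Here it is taken to mean "checkpoint as soon as a change reports a failure".

```rust
/// Counter-based auto-checkpoint policy (hypothetical, for illustration only).
struct AutoCheckpoint {
    interval: u32,
    on_error: bool,
    changes_since_last: u32,
}

impl AutoCheckpoint {
    fn new(interval: u32, on_error: bool) -> Self {
        AutoCheckpoint { interval, on_error, changes_since_last: 0 }
    }

    /// Record one change. Returns true when a checkpoint should be taken now:
    /// either `interval` changes have accumulated, or the change failed and
    /// `on_error` is enabled. The counter resets whenever a checkpoint is due.
    fn record_change(&mut self, failed: bool) -> bool {
        self.changes_since_last += 1;
        let due = self.changes_since_last >= self.interval || (failed && self.on_error);
        if due {
            self.changes_since_last = 0;
        }
        due
    }
}
```

With `AutoCheckpoint::new(10, true)`, the tenth consecutive successful change triggers a checkpoint, and so does any failing change before that.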

### Checkpoint Scope

| What Gets Saved | Description |
|-----------------|-------------|
| Source code | All file changes |
| Graph state | Symbol database |
| Hypotheses | Current hypothesis board |
| CFG cache | Extracted control flow graphs |
| Session state | LLM conversation context |

---

## Quick Start

### Installation

```bash
cargo add forge_core --features treesitter-cfg
```

### Basic Usage

```rust
use forge_core::Forge;
use std::path::Path;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Open your project
    let forge = Forge::open("./my-project").await?;
    
    // Create safety checkpoint
    checkpoint!("Before automated changes");
    
    // Verify LLM claim about code
    let cfg = forge.cfg()
        .extract_function_cfg(Path::new("src/lib.rs"), "main")
        .await?;
    
    // Check if claim matches reality
    if let Some(cfg) = cfg {
        let loops = cfg.detect_loops();
        println!("Detected {} loops", loops.len());
    }
    
    Ok(())
}
```

---

## Common Workflows

### Workflow 1: Verify LLM Refactoring Claim

```rust
// LLM: "I simplified the error handling"
let before = forge.cfg().extract_function_cfg(path, "old_func").await?;
let after = forge.cfg().extract_function_cfg(path, "new_func").await?;

// Verify error paths still exist
let before_errors = before.map(|c| c.count_error_paths()).unwrap_or(0);
let after_errors = after.map(|c| c.count_error_paths()).unwrap_or(0);

if after_errors < before_errors {
    println!("🚨 LLM may have removed error handling!");
}
```

### Workflow 2: Pre-Commit Safety Check

```rust
// Before committing LLM changes
let issues = forge.analysis()
    .find_dead_code()
    .await?;

if !issues.is_empty() {
    println!("🚨 LLM may have introduced dead code:");
    for issue in issues {
        println!("  - {}", issue.name);
    }
}
```
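Dead code detection is commonly framed as a reachability problem on the call graph: anything that cannot be reached from an entry point is a candidate. The standalone sketch below illustrates that framing. It is not ForgeKit's implementation, and the function shown is a hypothetical free function that happens to share a name with the method above; the call graph is supplied by hand as a plain map.

```rust
use std::collections::{BTreeSet, HashMap};

/// Return every function listed in `calls` that cannot be reached from
/// any of the `entry_points` by following caller -> callee edges.
/// Only functions that appear as keys of `calls` are considered.
fn find_dead_code<'a>(
    calls: &HashMap<&'a str, Vec<&'a str>>,
    entry_points: &[&'a str],
) -> BTreeSet<&'a str> {
    let mut reachable: BTreeSet<&str> = BTreeSet::new();
    let mut stack: Vec<&str> = entry_points.to_vec();

    // Depth-first walk; `insert` returns false for already-visited nodes,
    // which keeps recursion cycles in the call graph from looping forever.
    while let Some(f) = stack.pop() {
        if reachable.insert(f) {
            if let Some(callees) = calls.get(f) {
                stack.extend(callees.iter().copied());
            }
        }
    }

    calls
        .keys()
        .copied()
        .filter(|f| !reachable.contains(f))
        .collect()
}
```

A check like this is conservative in one direction only: functions reached through dynamic dispatch, macros, or FFI may be reported as dead unless the call graph records those edges, so results are best treated as candidates for review.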

### Workflow 3: Full Audit Trail

```rust
// Create tracked change
let hypothesis = forge.hypotheses()
    .create("Optimize hot path")
    .with_evidence(Evidence::from_profiler(&forge, "main").await?)
    .confirm()
    .await?;

// Make the change
let edit = forge.edit()
    .patch_symbol("slow_function", optimized_code)
    .with_hypothesis(hypothesis.id)  // Linked!
    .await?;

// Later: Audit
let changes = forge.analysis()
    .changes_for_hypothesis(hypothesis.id)
    .await?;
```

---

## Pre-Change Verification Checklist

Before trusting any LLM change:

- [ ] **Checkpoint created?** (The "I Screwed Up" button)
- [ ] **Hypothesis documented?** (Why is this change being made?)
- [ ] **Evidence attached?** (CFG analysis, test results, etc.)
- [ ] **Contradictions checked?** (Does the code match the claim?)
- [ ] **Audit trail linked?** (Can you trace this change later?)

**Remember**: ForgeKit doesn't replace your judgment - it gives you the tools to exercise it effectively.

---

## See Also

- [E2E Test Reports](tests/e2e/reports/) - Verification that the tools work
- [API Documentation](src/lib.rs) - Code-level details
- [README.md](../README.md) - Project overview