forge-reasoning 0.1.2

Reasoning tools for LLM debugging - checkpointing, hypotheses, contradictions
Documentation
# TDD Wave 10: Data Integrity & Validation

**Status**: ✅ Complete  
**Date**: 2026-02-19  
**Focus**: Checksum verification and corruption detection

---

## Overview

Wave 10 adds data integrity features to the checkpointing system. Each checkpoint now includes a SHA-256 checksum computed over its data, enabling:
- Detection of data corruption
- Validation on restore/import
- Health checks with integrity verification

---

## Test Results

| Test | Description | Status |
|------|-------------|--------|
| 78 | Checkpoints include checksum on creation ||
| 79 | Checksum computation is correct ||
| 80 | Different data produces different checksum ||
| 81 | Validate returns true for valid checkpoint ||
| 82 | Service validate_checkpoint method works ||
| 83 | Export includes checksums ||
| 84 | Import handles checksums ||
| 85 | Health check validates recent checkpoints ||
| 86 | Batch validation of all checkpoints ||
| 87 | Validation report provides details ||
| 88 | Concurrent validation is safe ||
| 89 | Checksum algorithm is SHA-256 ||
| 90 | Validation catches tampered data ||

**Results**: 13 passed, 0 failed

---

## Implementation Details

### Checksum Field

Added `checksum` field to `TemporalCheckpoint`:

```rust
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct TemporalCheckpoint {
    // ... other fields ...
    /// SHA-256 checksum for data integrity verification
    pub checksum: String,
}
```

### Checksum Computation

Uses SHA-256 hash of serialized checkpoint data (excluding checksum itself):

```rust
impl TemporalCheckpoint {
    pub fn new(...) -> Self {
        let mut checkpoint = Self {
            // ... init fields ...
            checksum: String::new(),
        };
        checkpoint.checksum = checkpoint.compute_checksum();
        checkpoint
    }
    
    fn compute_checksum(&self) -> String {
        let data_for_hash = CheckpointDataForHash {
            // All fields except checksum
            id: self.id,
            timestamp: self.timestamp,
            // ... etc
        };
        
        let json = serde_json::to_vec(&data_for_hash).unwrap_or_default();
        compute_checksum(&json)  // SHA-256
    }
}
```

### Validation

```rust
impl TemporalCheckpoint {
    pub fn validate(&self) -> Result<()> {
        let expected = self.compute_checksum();
        if self.checksum != expected {
            return Err(ReasoningError::ValidationFailed(
                format!("Checksum mismatch: expected {}, got {}", expected, self.checksum)
            ));
        }
        Ok(())
    }
}
```

---

## API Additions

### CheckpointService

| Method | Description |
|--------|-------------|
| `validate_checkpoint(id)` | Validate a single checkpoint's checksum |
| `health_check_with_validation()` | Health check + validate recent checkpoints |
| `validate_all_checkpoints()` | Full validation of all checkpoints |

### ValidationReport

```rust
pub struct ValidationReport {
    pub valid: usize,       // Number of valid checkpoints
    pub invalid: usize,     // Number of invalid (corrupted) checkpoints
    pub skipped: usize,     // Legacy checkpoints without checksums
    pub checked_at: Option<DateTime<Utc>>,
}

impl ValidationReport {
    pub fn total(&self) -> usize { self.valid + self.invalid + self.skipped }
    pub fn all_valid(&self) -> bool { self.invalid == 0 }
}
```

---

## Storage Format

Checkpoints stored in SQLiteGraph include the checksum:

```json
{
  "id": "uuid",
  "timestamp": "2026-02-19T12:34:56Z",
  "sequence_number": 42,
  "message": "...",
  "tags": [],
  "trigger": "manual",
  "session_id": "uuid",
  "state_data": "{...}",
  "checksum": "a1b2c3d4..."  // 64 hex chars (SHA-256)
}
```

---

## Backward Compatibility

- **Legacy checkpoints** (without checksum) are treated as valid
- Empty checksum field skips validation
- Storage layer handles missing checksum field gracefully

---

## Usage Examples

### Validate a Checkpoint

```rust
let service = CheckpointService::new(storage);
let id = service.checkpoint(&session, "Important state")?;

// Later, validate it
if service.validate_checkpoint(&id)? {
    println!("Checkpoint is valid!");
} else {
    println!("Checkpoint is corrupted!");
}
```

### Health Check with Validation

```rust
let health = service.health_check_with_validation()?;
if health.healthy {
    println!("Service healthy: {}", health.message);
} else {
    eprintln!("Data corruption detected: {}", health.message);
}
```

### Batch Validation

```rust
let report = service.validate_all_checkpoints()?;
println!("Validated {} checkpoints", report.total());
println!("  Valid: {}", report.valid);
println!("  Invalid: {}", report.invalid);
println!("  Skipped (legacy): {}", report.skipped);

if !report.all_valid() {
    eprintln!("WARNING: Data corruption detected!");
}
```

---

## Security Considerations

- **SHA-256**: Industry-standard cryptographic hash
- **Not for security**: Checksums detect accidental corruption, not malicious tampering
- **No signature**: For malicious protection, add digital signatures

---

## Performance

| Operation | Cost |
|-----------|------|
| Create checkpoint | +~1μs (SHA-256 of ~1KB) |
| Validate checkpoint | ~1μs |
| Health check validation | O(sessions × 5) |
| Full validation | O(total checkpoints) |

---

## Files Modified

| File | Changes |
|------|---------|
| `Cargo.toml` | Added `sha2` dependency |
| `src/checkpoint.rs` | Added checksum field, compute_checksum, validate |
| `src/errors.rs` | Added `ValidationFailed` error variant |
| `src/storage_sqlitegraph.rs` | Store/retrieve checksum |
| `src/service.rs` | Validation methods, ValidationReport |
| `src/lib.rs` | Export ValidationReport |
| `tests/data_integrity_tests.rs` | New test file (13 tests) |

---

## Future Enhancements

1. **Digital signatures**: Cryptographic signing for tamper-proofing
2. **Incremental validation**: Background validation of older checkpoints
3. **Repair mechanism**: Attempt to repair corrupted checkpoints from backups
4. **Checksum algorithms**: Configurable (BLAKE3, SHA3-256)

---

**Wave 10 Complete** ✅

Data integrity features ensure checkpoint reliability and provide tools for detecting corruption in production deployments.