
<div align="center">

<p align="center">
  <img src=".github/certeza-hero.svg" alt="certeza" width="800">
</p>

<h1 align="center">certeza</h1>

<p align="center">
  <b>A scientific experiment into realistic provability with Rust</b>
</p>

<p align="center">
  <a href="https://github.com/paiml/certeza/actions/workflows/ci.yml"><img src="https://github.com/paiml/certeza/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License"></a>
</p>

</div>

---

## Overview

certeza is a comprehensive framework for approaching **asymptotic test effectiveness** in Rust software systems through the pragmatic integration of:

- **Property-based testing** (using proptest)
- **Mutation testing** (using cargo-mutants)
- **Structural coverage analysis** (using cargo-llvm-cov)
- **Selective formal verification** (using Kani)

### Philosophy

While complete verification remains out of reach in practice (Dijkstra: "Program testing can be used to show the presence of bugs, but never to show their absence"), this framework provides a reproducible methodology for reaching the highest practical level of confidence in critical systems.

## Tiered TDD-X Workflow

The framework implements a three-tiered verification workflow that balances rigor with developer productivity:

### Tier 1: ON-SAVE (Sub-second feedback)
- Unit tests and focused property tests
- Static analysis (`cargo check`, `cargo clippy`)
- Enables rapid iteration in flow state

**Command**: `make tier1`

### Tier 2: ON-COMMIT (1-5 minutes)
- Full property-based test suite
- Coverage analysis (target: 95%+ line coverage)
- Integration tests
- Pre-commit hook enforcement

**Command**: `make tier2`

### Tier 3: ON-MERGE/NIGHTLY (Hours)
- Comprehensive mutation testing (target: >85% mutation score)
- Formal verification for critical paths
- Performance benchmarks
- CI/CD gate for main branch

**Command**: `make tier3`

**Critical Principle**: Different verification techniques operate at different time scales. Fast feedback enables flow; slow feedback forces wasteful context switching. Never run mutation testing or formal verification in the inner development loop.

## Quick Start

```bash
# Clone the repository
git clone https://github.com/paiml/certeza.git
cd certeza

# Run quick checks (Tier 1)
make tier1

# Run full test suite with coverage (Tier 2)
make tier2

# Build the project
cargo build

# Run all tests
cargo test

# Generate documentation
cargo doc --open
```

## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
certeza = "0.1.0"
```

## Usage Example

### TruenoVec - Custom Growable Vector

The primary demonstration of the certeza framework is `TruenoVec<T>`, a custom growable vector implementation that showcases the complete three-tiered testing approach.

```rust
use certeza::TruenoVec;

fn main() {
    // Create a new vector
    let mut vec = TruenoVec::new();

    // Push elements
    vec.push(1);
    vec.push(2);
    vec.push(3);

    // Access elements
    assert_eq!(vec.len(), 3);
    assert_eq!(vec.get(1), Some(&2));

    // Pop elements
    assert_eq!(vec.pop(), Some(3));
    assert_eq!(vec.len(), 2);

    // Pre-allocate capacity for performance
    let mut vec2 = TruenoVec::with_capacity(1000);
    for i in 0..1000 {
        vec2.push(i);  // No reallocation
    }
}
```

### Key Features

- **Manual Memory Management**: Uses `NonNull<T>` for safe manual allocation
- **2x Growth Factor**: Exponential growth for amortized O(1) push operations
- **RAII Semantics**: Proper `Drop` implementation ensures no memory leaks
- **Thread Safety**: `Send` when `T: Send`, `Sync` when `T: Sync`
- **Zero-Cost Abstraction**: 24-byte overhead (ptr + len + capacity) on 64-bit targets
- **Full Iterator Support**: Bidirectional iterators (`iter`, `iter_mut`, `into_iter`)
- **Advanced Operations**: `insert`, `remove`, `clear` with optimal performance
- **Complete Trait Support**: All standard collection traits (Deref, Clone, Hash, Ord, Display, Borrow, etc.)
- **Comprehensive Testing**: 260 tests with 97.7% mutation score
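
The first three features (manual allocation through `NonNull<T>`, 2x growth, and cleanup in `Drop`) can be illustrated with a minimal sketch. `SketchVec` is a hypothetical name used only here; it is not the `TruenoVec` source, it assumes a first capacity of 1, and it does not handle zero-sized types.

```rust
use std::alloc::{self, Layout};
use std::ptr::{self, NonNull};

/// Illustrative only: a hypothetical stand-in, not certeza's `TruenoVec`.
pub struct SketchVec<T> {
    ptr: NonNull<T>,
    len: usize,
    cap: usize,
}

impl<T> SketchVec<T> {
    pub fn new() -> Self {
        // Zero-sized types need special handling that this sketch omits.
        assert!(std::mem::size_of::<T>() != 0, "ZSTs are not handled here");
        SketchVec { ptr: NonNull::dangling(), len: 0, cap: 0 }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn capacity(&self) -> usize {
        self.cap
    }

    /// Doubles the capacity, or allocates a single slot the first time.
    fn grow(&mut self) {
        let new_cap = if self.cap == 0 {
            1
        } else {
            self.cap.checked_mul(2).expect("capacity overflow")
        };
        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");
        let raw = if self.cap == 0 {
            // SAFETY: the layout has non-zero size (T is not a ZST, new_cap >= 1).
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).expect("valid when allocated");
            // SAFETY: `ptr` came from this allocator with `old_layout`.
            unsafe { alloc::realloc(self.ptr.as_ptr() as *mut u8, old_layout, new_layout.size()) }
        };
        self.ptr = match NonNull::new(raw as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }

    pub fn push(&mut self, value: T) {
        if self.len == self.cap {
            self.grow();
        }
        // SAFETY: len < cap, so the slot is in bounds and holds no live value.
        unsafe { ptr::write(self.ptr.as_ptr().add(self.len), value) };
        self.len += 1;
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: the slot at the new `len` is initialized; ownership moves out.
        Some(unsafe { ptr::read(self.ptr.as_ptr().add(self.len)) })
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.len {
            // SAFETY: index < len, so the slot is initialized.
            Some(unsafe { &*self.ptr.as_ptr().add(index) })
        } else {
            None
        }
    }
}

impl<T> Drop for SketchVec<T> {
    fn drop(&mut self) {
        // Run destructors for remaining elements, then free the buffer.
        while self.pop().is_some() {}
        if self.cap != 0 {
            let layout = Layout::array::<T>(self.cap).expect("valid when allocated");
            // SAFETY: `ptr` was allocated with exactly this layout.
            unsafe { alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout) };
        }
    }
}
```

Under these assumptions, ten pushes leave the sketch with a capacity of 16, since the buffer grows 1, 2, 4, 8, 16.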

### Complete Trait Implementation

TruenoVec implements the standard Rust collection traits listed below, aiming for feature parity with `std::vec::Vec`:

**Core Traits:**
- `Default`, `Drop`, `Clone`
- `Send + Sync` (thread safety)

**Conversion Traits:**
- `From<Vec<T>>`, `From<&[T]>`
- `FromIterator<T>`, `Extend<T>`
- `IntoIterator` (owned, `&`, `&mut`)

**Indexing Traits:**
- `Index<usize>`, `IndexMut<usize>`

**Comparison Traits:**
- `PartialEq`, `Eq`
- `PartialOrd`, `Ord` (sorting support)

**Hashing:**
- `Hash` (HashMap/HashSet keys)

**Ergonomic Traits (Phase 3.1):**
- `Deref<Target=[T]>`, `DerefMut` (automatic slice coercion)
- `AsRef<[T]>`, `AsMut<[T]>` (generic bounds)

**Display & Formatting (Phase 3.3):**
- `Debug` (pretty printing)
- `Display` (user-friendly output)

**Advanced Borrowing (Phase 3.3):**
- `Borrow<[T]>`, `BorrowMut<[T]>` (HashMap/BTreeMap lookup support)

This enables TruenoVec to be used in:
- ✅ Generic algorithms accepting `&[T]` or `AsRef<[T]>`
- ✅ `HashMap` and `HashSet` as keys
- ✅ `BTreeMap` and `BTreeSet` as keys
- ✅ Sorting with `.sort()` (the slice method, reached through `DerefMut`)
- ✅ All slice methods through `Deref` coercion
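
To show how these traits combine, here is a small sketch built on a hypothetical `Wrapper<T>` newtype around `Vec<T>`. It stands in for a custom vector so the example stays dependency-free; the type and the helper functions are assumptions, not certeza APIs.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for a custom vector type.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Wrapper<T>(pub Vec<T>);

impl<T> Deref for Wrapper<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        &self.0
    }
}

impl<T> DerefMut for Wrapper<T> {
    fn deref_mut(&mut self) -> &mut [T] {
        &mut self.0
    }
}

impl<T> AsRef<[T]> for Wrapper<T> {
    fn as_ref(&self) -> &[T] {
        &self.0
    }
}

// `Borrow<[T]>` is sound here because the derived `Hash`, `Eq`, and `Ord`
// delegate to `Vec<T>`, which agrees with `[T]`.
impl<T> Borrow<[T]> for Wrapper<T> {
    fn borrow(&self) -> &[T] {
        &self.0
    }
}

/// Generic function that accepts anything viewable as a slice.
pub fn total(items: impl AsRef<[u32]>) -> u32 {
    items.as_ref().iter().sum()
}

/// Uses the wrapper as a `HashMap` key and looks it up by plain slice.
pub fn lookup_by_slice(key: Wrapper<u32>, probe: &[u32]) -> Option<&'static str> {
    let mut map = HashMap::new();
    map.insert(key, "found");
    map.get(probe).copied()
}
```

With `DerefMut` in place, calling `.sort()` on a mutable `Wrapper<u32>` resolves to the slice method, and `total(&wrapper)` works through `AsRef<[u32]>`.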

### Comprehensive Test Coverage

**Total Tests: 260 tests across all tiers** ✅ **97.7% mutation score**

#### Tier 1: Unit Tests (180+ tests)
- Basic operations: `new`, `with_capacity`, `push`, `pop`, `clear`
- Index access: `get`, `get_mut` with bounds checking
- Advanced operations: `insert`, `remove` at various positions
- Iterator support: `iter`, `iter_mut`, `into_iter` with bidirectional iteration
- Growth behavior and capacity management
- Drop trait verification with destructor counting
- Memory deallocation and leak prevention tests
- Edge case handling (empty vectors, single elements, boundary conditions)
- Mutation-resistant test cases for critical operations
- **Phase 3.1 Ergonomic Traits** (50 tests):
  - Deref/DerefMut coercion and slice method access
  - AsRef/AsMut with generic functions
  - Slice operation consistency
- **Phase 3.2 Comparison & Hash Traits** (33 tests):
  - PartialOrd: `<`, `>`, `<=`, `>=` operators
  - Ord: Total ordering, BTreeMap/BTreeSet support
  - Hash: HashMap/HashSet key support
- **Phase 3.3 Display & Borrow Traits** (22 tests):
  - Display: User-friendly formatting, consistency with std::Vec
  - Borrow/BorrowMut: Generic function support, HashMap lookup compatibility
  - Slice operation interoperability
- Sub-second execution
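
The destructor-counting technique mentioned above can be sketched as follows. The sketch uses `std::vec::Vec` as a stand-in for the container under test, and `DropCounter` and `drops_after_filling` are hypothetical names chosen for illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Increments a shared counter each time an instance is dropped.
pub struct DropCounter(Rc<Cell<usize>>);

impl Drop for DropCounter {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Pushes `n` counters into a container, drops the container,
/// and returns how many destructors ran.
pub fn drops_after_filling(n: usize) -> usize {
    let counter = Rc::new(Cell::new(0));
    {
        // `Vec` stands in for the container whose `Drop` is being verified.
        let mut container = Vec::new();
        for _ in 0..n {
            container.push(DropCounter(Rc::clone(&counter)));
        }
        // Nothing has been dropped while the container is alive.
        assert_eq!(counter.get(), 0);
    } // container goes out of scope here and must drop every element
    counter.get()
}
```

A leak shows up as a count lower than `n`, and a double drop as a count higher than `n`.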

#### Tier 2: Property-Based Tests (60 properties)
- Length invariant after push operations
- Capacity bound (capacity >= len) maintained
- Push/pop symmetry (inverse operations)
- Index access correctness
- Out-of-bounds safety
- Exponential growth factor verification
- Behavioral equivalence with `std::Vec`
- Empty vector invariants
- Mutable access correctness
- Repeated operations integrity
- Clear operation preserves capacity
- Insert/remove maintain order
- Insert-then-remove inverse operations
- Clear-then-reuse functionality
- **Phase 3.1 Deref/AsRef Properties** (15 properties):
  - Deref/AsRef equivalence with as_slice
  - Slice method correctness vs std::Vec
  - Generic function interoperability
  - DerefMut mutation operations
- **Phase 3.2 Comparison/Hash Properties** (10 properties):
  - PartialOrd/Ord matches std::Vec behavior
  - Ordering transitivity and antisymmetry
  - Hash consistency with Eq
  - HashMap/BTreeMap operations correctness
  - Sorting and lexicographic ordering
- **Phase 3.3 Display/Borrow Properties** (7 properties):
  - Display format consistency with std::Vec Debug output
  - Borrow equivalence with AsRef/AsMut
  - Display respects element Display implementation
  - Slice operation correctness through Borrow
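
As an illustration of the "insert-then-remove inverse" and "capacity bound" properties, here is a dependency-free sketch. The project itself uses proptest; this version swaps in a hand-rolled xorshift generator, and it checks `std::vec::Vec` as a stand-in for the vector under test. All names are hypothetical.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        // The state must be non-zero for xorshift to produce a sequence.
        XorShift(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `0..bound`; `bound` must be non-zero.
    pub fn below(&mut self, bound: u64) -> u64 {
        self.next_u64() % bound
    }
}

/// Runs `cases` random trials and reports the first failing case, if any.
pub fn check_insert_remove_inverse(cases: u32, seed: u64) -> Result<(), String> {
    let mut rng = XorShift::new(seed);
    for case in 0..cases {
        let len = rng.below(32) as usize;
        let original: Vec<u64> = (0..len).map(|_| rng.next_u64()).collect();
        let index = rng.below(len as u64 + 1) as usize;
        let value = rng.next_u64();

        let mut v = original.clone();
        v.insert(index, value);
        if v.len() != original.len() + 1 {
            return Err(format!("case {case}: length invariant broken"));
        }
        if v.capacity() < v.len() {
            return Err(format!("case {case}: capacity < len"));
        }
        let removed = v.remove(index);
        if removed != value || v != original {
            return Err(format!("case {case}: insert/remove not inverse at {index}"));
        }
    }
    Ok(())
}
```

Unlike proptest, this sketch does no shrinking, so a failure reports only the case number and index.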

#### Tier 2: Integration Tests (26 scenarios)
- **Basic Integration Tests**:
  - Basic vector operations workflow
  - Capacity management and preallocation
  - Stack semantics (LIFO behavior)
  - Integration with standard Rust types (String, complex structs)
  - Real-world buffer usage patterns
  - Interleaved push/pop operations
  - Out-of-bounds access safety
  - Thread safety (Send + Sync)
  - Arc integration for shared ownership
- **Advanced Scenario-Based Tests**:
  - User records management system
  - Stack-based expression evaluation
  - Batch processing with pre-allocation
  - Undo/Redo system implementation
  - Graph adjacency list data structures
  - Ring buffer patterns
  - Large dataset processing (10,000 elements)
  - Type safety with various element types
  - Thread safety compilation checks
  - Memory efficiency validation (24-byte overhead)
  - Edge case handling
  - std::Vec behavioral equivalence
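
The thread-safety compilation checks and the memory-overhead validation can be sketched without any runtime machinery. `RawParts<T>` below is a hypothetical three-field layout (pointer, length, capacity) used only for this example, not the actual `TruenoVec` definition.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical layout: pointer + length + capacity.
pub struct RawParts<T> {
    ptr: NonNull<T>,
    len: usize,
    cap: usize,
}

// `NonNull<T>` is neither `Send` nor `Sync`, so an owning container has to
// opt in explicitly. SAFETY (for this sketch): the struct is assumed to be
// the unique owner of the `T` values it points to.
unsafe impl<T: Send> Send for RawParts<T> {}
unsafe impl<T: Sync> Sync for RawParts<T> {}

fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// Fails to compile if the bounds do not hold; returns the struct size in bytes.
pub fn compile_time_checks() -> usize {
    assert_send::<RawParts<String>>();
    assert_sync::<RawParts<String>>();
    size_of::<RawParts<String>>()
}
```

The returned size is three machine words, which is 24 bytes on a 64-bit target.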

#### Tier 3: Formal Verification (3 Kani proofs)
- Capacity invariant proof (capacity >= len for all paths)
- Push/pop correctness proof (mathematical verification)
- Bounds checking verification (no buffer overflows)
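
A capacity-invariant harness might look like the sketch below. `next_capacity` is a hypothetical doubling rule introduced for illustration, not certeza's code. The harness sits behind `#[cfg(kani)]`, so a normal `cargo build` ignores it and only `cargo kani` compiles it.

```rust
/// Hypothetical growth rule: first capacity 1, then doubling.
/// Returns `None` if doubling would overflow `usize`.
pub fn next_capacity(cap: usize) -> Option<usize> {
    if cap == 0 {
        Some(1)
    } else {
        cap.checked_mul(2)
    }
}

#[cfg(kani)]
mod verification {
    use super::*;

    /// For every `len <= cap`, a successful growth step leaves room
    /// for at least one more element and never shrinks the buffer.
    #[kani::proof]
    fn capacity_invariant_survives_growth() {
        let len: usize = kani::any();
        let cap: usize = kani::any();
        kani::assume(len <= cap);
        if let Some(new_cap) = next_capacity(cap) {
            assert!(new_cap > len);
            assert!(new_cap >= cap);
        }
    }
}
```

Because `kani::any()` ranges over every `usize`, the proof covers all inputs rather than a sample, which is what separates this tier from property-based testing.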

#### Documentation Tests (21 tests)
- All public API examples verified
- 100% rustdoc coverage for public items
- Comprehensive examples in module documentation

### Performance Characteristics

Benchmarks show performance competitive with `std::Vec`:

- **Push (sequential)**: ~10 ns/operation average
- **Pop**: O(1) constant time
- **Get (random access)**: O(1) with cache efficiency
- **Growth pattern**: ~14 reallocations for 10,000 elements (log₂ n)
- **Memory efficiency**: 24-byte overhead per vector
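
The growth-pattern figure follows from the doubling policy and can be checked with a few lines of arithmetic. The sketch below assumes the first allocation holds one element, which the text above does not state; `capacity_changes` is a hypothetical helper.

```rust
/// Simulates pushing `n` elements under a doubling policy and returns
/// `(number of capacity changes, final capacity)`.
/// The first change is the initial allocation; the rest are reallocations.
pub fn capacity_changes(n: usize, first_capacity: usize) -> (usize, usize) {
    assert!(first_capacity > 0, "first capacity must be non-zero");
    let mut cap = 0usize;
    let mut changes = 0usize;
    for len in 0..n {
        if len == cap {
            // Buffer is full: allocate for the first time or double.
            cap = if cap == 0 { first_capacity } else { cap * 2 };
            changes += 1;
        }
    }
    (changes, cap)
}
```

For 10,000 elements and a first capacity of 1 this yields 15 capacity changes, meaning one initial allocation followed by 14 reallocations, and a final capacity of 16,384. A larger first capacity lowers the count accordingly.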

## Chaos Engineering & Fuzz Testing

**certeza** includes chaos engineering and fuzz testing capabilities adapted from [**renacer v0.4.1**](https://github.com/paiml/renacer) (Sprint 29).

### Chaos Engineering

Test system resilience under adverse conditions:

```rust
use certeza::chaos::{ChaosConfig, ChaosResult};
use std::time::Duration;

// Gentle preset for development (512MB, 80% CPU, 120s timeout)
let gentle = ChaosConfig::gentle();

// Aggressive preset for CI/CD (64MB, 25% CPU, 10s timeout, signals)
let aggressive = ChaosConfig::aggressive();

// Custom configuration using renacer builder pattern
let custom = ChaosConfig::new()
    .with_memory_limit(128 * 1024 * 1024)  // 128MB
    .with_cpu_limit(0.5)                   // 50% CPU
    .with_timeout(Duration::from_secs(30))
    .with_signal_injection(true)
    .build();
```
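
For readers unfamiliar with the builder style used above, here is a rough sketch of how such a configuration type could be shaped. `SketchChaosConfig` is hypothetical and is not the certeza or renacer implementation. The preset values are taken from the comments above; the defaults in `new()` and the absence of signal injection in the gentle preset are assumptions.

```rust
use std::time::Duration;

/// Hypothetical mirror of a chaos configuration, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchChaosConfig {
    pub memory_limit: u64, // bytes
    pub cpu_limit: f64,    // fraction of one CPU, 0.0..=1.0
    pub timeout: Duration,
    pub signal_injection: bool,
}

impl SketchChaosConfig {
    /// Placeholder defaults (assumed, not taken from the real crate).
    pub fn new() -> Self {
        SketchChaosConfig {
            memory_limit: 256 * 1024 * 1024,
            cpu_limit: 1.0,
            timeout: Duration::from_secs(60),
            signal_injection: false,
        }
    }

    /// 512MB, 80% CPU, 120s timeout.
    pub fn gentle() -> Self {
        Self::new()
            .with_memory_limit(512 * 1024 * 1024)
            .with_cpu_limit(0.8)
            .with_timeout(Duration::from_secs(120))
    }

    /// 64MB, 25% CPU, 10s timeout, signal injection enabled.
    pub fn aggressive() -> Self {
        Self::new()
            .with_memory_limit(64 * 1024 * 1024)
            .with_cpu_limit(0.25)
            .with_timeout(Duration::from_secs(10))
            .with_signal_injection(true)
    }

    pub fn with_memory_limit(mut self, bytes: u64) -> Self {
        self.memory_limit = bytes;
        self
    }

    pub fn with_cpu_limit(mut self, fraction: f64) -> Self {
        self.cpu_limit = fraction;
        self
    }

    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = timeout;
        self
    }

    pub fn with_signal_injection(mut self, enabled: bool) -> Self {
        self.signal_injection = enabled;
        self
    }

    /// Ends the chain; this sketch performs no validation.
    pub fn build(self) -> Self {
        self
    }
}
```

Each `with_*` method takes and returns `self` by value, which is what allows the calls to be chained in a single expression.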

**Chaos Features** (renacer pattern):
- `chaos-basic`: Core chaos configuration and error types
- `chaos-network`: Network failure simulation (planned)
- `chaos-byzantine`: Byzantine fault injection (planned)
- `chaos-full`: All chaos features combined

**Commands**:
```bash
# Run chaos engineering tests (Tier 2)
make chaos-test

# Enable specific chaos features
cargo test --features chaos-basic
cargo test --features chaos-full
```

### Fuzz Testing

Automated testing with libFuzzer to discover edge cases and crashes:

```bash
# Run fuzz tests for 60 seconds
make fuzz

# Extended fuzzing session
cargo +nightly fuzz run fuzz_target_1 -- -max_total_time=300
```

The fuzz target tests `TruenoVec` operations:
- Push/pop operations
- Random access
- Iterator operations
- Invariant verification
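
The body of such a fuzz target could be structured like the sketch below, which decodes each input byte into one operation and compares the result against a reference model. It is not the project's `fuzz_target_1`: `run_ops` is a hypothetical name, and `VecDeque` stands in for the vector under test so the example compiles without certeza or libFuzzer.

```rust
use std::collections::VecDeque;

/// Interprets every byte as one operation and panics on any divergence
/// between the subject and the `Vec` model.
///
/// With cargo-fuzz this would typically be called from
/// `fuzz_target!(|data: &[u8]| { run_ops(data); });`.
pub fn run_ops(data: &[u8]) {
    let mut subject: VecDeque<u8> = VecDeque::new(); // stand-in subject
    let mut model: Vec<u8> = Vec::new(); // trusted reference

    for &byte in data {
        match byte % 3 {
            // Push the raw byte onto both containers.
            0 => {
                subject.push_back(byte);
                model.push(byte);
            }
            // Pop from both and compare the results.
            1 => assert_eq!(subject.pop_back(), model.pop()),
            // Random access, possibly out of bounds.
            _ => {
                let index = byte as usize / 3;
                assert_eq!(subject.get(index), model.get(index));
            }
        }
        // Invariants that must hold after every operation.
        assert_eq!(subject.len(), model.len());
        assert!(subject.capacity() >= subject.len());
    }

    // Iteration order must match the model.
    assert!(subject.iter().eq(model.iter()));
}
```

Because the function panics on any mismatch, the fuzzer reports a crash together with the byte sequence that triggered it.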

**Integration**: Chaos tests run in Tier 2 (ON-COMMIT) for fast feedback.

## PMAT Compliance

This project is fully compliant with the **Pragmatic AI Labs Multi-Language Agent Toolkit (PMAT)** standards, enforcing **EXTREME TDD**:

### Quality Standards

- **Coverage**: ≥85% line coverage (minimum), 95% (target)
- **Complexity**: ≤10 cyclomatic complexity per function
- **SATD**: Zero tolerance for TODO/FIXME/HACK comments
- **Mutation Testing**: ≥85% mutation score
- **Documentation**: ≥90% rustdoc coverage

### Makefile Commands

```bash
# Development workflow
make tier1              # Quick checks (sub-second)
make tier2              # Full test suite (1-5 min)
make tier3              # Mutation testing (hours)

# Testing
make test               # Run all tests
make test-quick         # Run unit tests only
make test-property      # Run property-based tests
make coverage           # Generate coverage report
make chaos-test         # Chaos engineering tests (renacer)
make fuzz               # Fuzz testing (60s)

# Code quality
make clippy             # Run clippy linter
make clippy-strict      # Run strict clippy
make fmt                # Format code
make fmt-check          # Check formatting

# Analysis
make complexity         # Analyze complexity with PMAT
make tdg                # Technical debt grading
make security           # Security audit

# Documentation
make docs               # Generate documentation
make validate-docs      # Validate docs with PMAT

# Setup
make install-tools      # Install required tooling
make install-hooks      # Install PMAT git hooks
```

## Testing Framework

### Running Tests

```bash
# Tier 1: Quick checks (unit tests)
make tier1
cargo test --lib

# Tier 2: Full test suite (unit + property + integration + doc tests)
make tier2
cargo test --all

# Integration tests
cargo test --test integration_tests

# Benchmarks (built and run in test mode; use `cargo bench` to measure timings)
cargo test --benches

# Property-based tests with custom case count
PROPTEST_CASES=1000 cargo test property_
```

### Mutation Testing

Run mutation testing to verify test suite quality:

```bash
make mutation
# or
cargo mutants --no-times
```

**Target**: >85% mutation score

### Coverage Analysis

Generate coverage report:

```bash
make coverage
# or
cargo llvm-cov --all-features --workspace
```

**Target**: 95%+ line coverage, 90%+ branch coverage

## Risk-Based Verification

Not all code requires the same verification intensity:

| Risk Level | Components | Verification Approach |
|------------|------------|----------------------|
| **Very High** | `unsafe` blocks, memory allocators, crypto | Full framework: Property + Coverage + Mutation (90%) + Formal |
| **High** | Core algorithms, data structures, parsers | Property + Coverage + Mutation (85-90%) |
| **Medium** | Business logic, API handlers | Property + Coverage + Mutation (80%) |
| **Low** | Simple accessors, config | Unit tests + Coverage (90%) |

**Resource Allocation**: Spend 40% of verification time on the 5-10% highest-risk code.
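
One way to make the table actionable is to encode it as data that a quality gate can query. The sketch below is a hypothetical encoding, not part of certeza. It uses the lower bound (85%) for the High row and treats the Low row as having no mutation requirement, since the table lists none.

```rust
/// Risk levels from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RiskLevel {
    VeryHigh,
    High,
    Medium,
    Low,
}

impl RiskLevel {
    /// Minimum mutation score in percent, or `None` when the table
    /// does not call for mutation testing.
    pub fn min_mutation_score(self) -> Option<u8> {
        match self {
            RiskLevel::VeryHigh => Some(90),
            RiskLevel::High => Some(85),
            RiskLevel::Medium => Some(80),
            RiskLevel::Low => None,
        }
    }

    /// Only the highest tier adds formal verification.
    pub fn needs_formal_verification(self) -> bool {
        matches!(self, RiskLevel::VeryHigh)
    }

    /// Checks a measured mutation score against the tier's threshold.
    pub fn passes(self, measured_score: f64) -> bool {
        match self.min_mutation_score() {
            Some(min) => measured_score >= f64::from(min),
            None => true,
        }
    }
}
```

A gate could then call `RiskLevel::VeryHigh.passes(score)` for modules containing `unsafe` code and apply a lower tier elsewhere.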

## Architecture

### Testing Pyramid

```
┌─────────────────┐
│  Formal (Kani)  │  ~1-5% code (invariant proofs)
├─────────────────┤
│   Integration   │  ~10% tests (system properties)
├─────────────────┤
│  Property-Based │  ~30% tests (algorithmic correctness)
├─────────────────┤
│   Unit Tests    │  ~60% tests (basic functionality)
└─────────────────┘
```

### Key Design Principles

- **Sustainable Workflows**: Tiered feedback loops prevent burnout
- **Risk-Based Resource Allocation**: Focus on high-risk components
- **Human-Centered Analysis**: Mutation analysis as learning exercise
- **Economic Realism**: Acknowledge costs and diminishing returns

## Documentation

- [Main Specification](docs/specifications/theoretical-max-testing-spec.md) - Complete framework specification (v1.1, ~14K words)
- [Improvements v1.1](docs/specifications/IMPROVEMENTS_v1.1.md) - Philosophy shift from "theoretical maximum" to "asymptotic effectiveness"
- [CLAUDE.md](CLAUDE.md) - Guidance for Claude Code development

## CI/CD

GitHub Actions workflow enforces quality gates:

- **Tier 1**: Quick checks on every push
- **Tier 2**: Full test suite + coverage on PR (≥85%)
- **Security**: Parallel security audit (cargo-audit, cargo-deny)
- **Tier 3**: Mutation testing on merge to main (≥85% mutation score)

## Contributing

Contributions are welcome! Please ensure:

1. All tests pass: `make tier2`
2. Coverage ≥85%: `make coverage`
3. Clippy passes: `make clippy-strict`
4. Code is formatted: `cargo fmt`
5. No SATD comments (TODO/FIXME/HACK)

## License

MIT License - see [LICENSE](LICENSE) file for details.

## References

This project implements research findings from the certeza specification, incorporating:

- Property-based testing methodologies (QuickCheck, Hypothesis, proptest)
- Mutation testing research (Jia & Harman, 2011)
- Toyota Production System principles (Kaizen, Muda elimination)
- Continuous integration best practices (Fowler, Google)
- Software economics (Boehm, McConnell)
- Safety-critical systems standards (DO-178C, IEC 61508)

## Acknowledgments

- **trueno project** (https://github.com/paiml/trueno) - Reference implementation
- **PMAT** (https://github.com/paiml/paiml-mcp-agent-toolkit) - Test orchestration toolkit
- Pragmatic AI Labs Research Division