primordial-opt 0.1.4

Genetic algorithm library with composable selection, crossover, and mutation operators
Documentation
# Primordial Specification

This document captures design requirements and constraints for the Primordial library.

## Overview

Primordial is an optimization framework/library, starting with genetic algorithms (with potential expansion to other metaheuristics).

## Architecture

### Representation-Agnostic Design

All genetic operators work on a generic byte stream representation (`Chromosome`), decoupling the optimization algorithms from problem-specific data types.

**Two-layer architecture:**

1. **Library layer** - Operates on `Chromosome` (raw bytes)
   - Mutation (bit_flip, etc.)
   - Crossover (single_point, etc.)
   - Selection

2. **Application layer** - Interprets bytes as domain types for fitness evaluation
   - Uses RKYV for zero-copy serialization/deserialization
   - Example: `Vec<f64>` for continuous optimization, custom structs for combinatorial problems

**Rationale:**

- Generic operators work across all problem types
- Zero-copy RKYV bridging minimizes overhead
- Users define only fitness functions and type mappings

**Example:**
```rust
// Library provides byte-level operations
mutation::bit_flip(&mut chromosome, 0.01, &mut rng);

// Application interprets bytes via RKYV for fitness evaluation
let params: &Archived<Vec<f64>> = rkyv::access(&chromosome.data).unwrap();
let fitness = evaluate(params);
```

## Core Principles

### Determinism / Reproducibility

All stochastic operations must be deterministic and reproducible given the same seed.

**Requirements:**
- All functions involving randomness must accept an RNG parameter (`&mut impl Rng`) rather than creating their own
- Use seedable RNGs (e.g., `StdRng::seed_from_u64()`) for reproducible results
- Never use `rand::rng()` or other non-seedable RNG sources in library code

**Rationale:**
- Enables reproducible experiments and debugging
- Allows users to replay exact evolutionary runs
- Facilitates testing with deterministic outcomes

**Example:**
```rust
use rand::{Rng, SeedableRng, rngs::StdRng};

// Good: accepts RNG parameter
pub fn bit_flip<R: Rng>(chromosome: &mut Chromosome, p: f64, rng: &mut R) { ... }

// Usage with reproducible seed
let mut rng = StdRng::seed_from_u64(42);
bit_flip(&mut chromosome, 0.05, &mut rng);
```

### Fitness Caching

Chromosomes cache their fitness values to avoid redundant computation. Fitness is expected to be CPU-intensive and the critical path.

**Design:**
- Fitness is stored as `Option<Box<[f64]>>` supporting both single and multi-objective optimization
- Single-objective: `chromosome.set_fitness(vec![0.5])`
- Multi-objective: `chromosome.set_fitness(vec![0.5, 0.3, 0.8])`
- Fitness is automatically invalidated when chromosome data changes

**Implicit Invalidation:**
- All mutation functions (`bit_flip`, `fill_random`) automatically set `fitness = None`
- Crossover produces children with `fitness = None`
- Internal `set_data()` method ensures fitness is invalidated on any data change

**Rationale:**
- Avoids recomputing expensive fitness for unchanged chromosomes
- Uniform `&[f64]` API simplifies multi-objective support
- Implicit invalidation prevents stale fitness bugs

**Example:**
```rust
// Create chromosome and evaluate fitness
let mut chromosome = Chromosome::new(16, &mut rng);
chromosome.set_fitness(vec![rastrigin(&chromosome)]);

// Access cached fitness
if let Some(fitness) = chromosome.fitness() {
    println!("Fitness: {}", fitness[0]);
}

// Mutation automatically invalidates fitness
mutation::bit_flip(&mut chromosome, 0.01, &mut rng);
assert!(chromosome.fitness().is_none()); // Must re-evaluate
```

### Concurrency

Tokio is used for task-level parallelism (multiple GA runs, island model). SIMD is expected to provide benefit for individual fitness calculations.

**Rationale:**
- Task-level parallelism enables running multiple independent optimizations concurrently
- Async patterns support distributed/networked scenarios
- SIMD optimization is left to the application layer for domain-specific fitness functions

### Selection

All selection methods return `impl Iterator<Item = Chromosome>`, enabling flexible composition with `.take(n)` to select any number of parents.

**Available algorithms:**

| Algorithm | Objective | Deterministic | Description |
|-----------|-----------|---------------|-------------|
| `rank` | Single | Yes | Sorts by fitness ascending (lower = better) |
| `tournament` | Single | No | Best of k random individuals per selection |
| `roulette` | Single | No | Probability proportional to fitness quality |
| `nsga2` | Multi | Yes | Pareto fronts + crowding distance |

**Single-objective constraints:**

- `rank`, `tournament`, and `roulette` require fitness length = 1
- Enforced via `debug_assert!` (panics in debug builds only)

**Multi-objective (NSGA-II):**

- Supports any number of objectives
- Returns chromosomes ordered by: Pareto front (lower = better), then crowding distance (higher = more diverse)
- Boundary solutions receive infinite crowding distance

**Example:**

```rust
// Single-objective: get top 50 by rank
let parents: Vec<_> = selection::rank(&population).take(50).collect();

// Single-objective: tournament selection with k=3
let parents: Vec<_> = selection::tournament(&population, 3, &mut rng)
    .take(50)
    .collect();

// Multi-objective: NSGA-II for Pareto-optimal selection
let parents: Vec<_> = selection::nsga2(&population).take(100).collect();
```