# Primordial Specification
This document captures design requirements and constraints for the Primordial library.
## Overview
Primordial is an optimization framework/library, starting with genetic algorithms (with potential expansion to other metaheuristics).
## Architecture
### Representation-Agnostic Design
All genetic operators work on a generic byte stream representation (`Chromosome`), decoupling the optimization algorithms from problem-specific data types.
**Two-layer architecture:**
1. **Library layer** - Operates on `Chromosome` (raw bytes)
- Mutation (bit_flip, etc.)
- Crossover (single_point, etc.)
- Selection
2. **Application layer** - Interprets bytes as domain types for fitness evaluation
- Uses RKYV for zero-copy serialization/deserialization
- Example: `Vec<f64>` for continuous optimization, custom structs for combinatorial problems
**Rationale:**
- Generic operators work across all problem types
- Zero-copy RKYV bridging minimizes overhead
- Users define only fitness functions and type mappings
**Example:**
```rust
// Library provides byte-level operations
mutation::bit_flip(&mut chromosome, 0.01, &mut rng);
// Application interprets bytes via RKYV for fitness evaluation
let params: &Archived<Vec<f64>> = rkyv::access(&chromosome.data).unwrap();
let fitness = evaluate(params);
```
## Core Principles
### Determinism / Reproducibility
All stochastic operations must be deterministic and reproducible given the same seed.
**Requirements:**
- All functions involving randomness must accept an RNG parameter (`&mut impl Rng`) rather than creating their own
- Use seedable RNGs (e.g., `StdRng::seed_from_u64()`) for reproducible results
- Never use `rand::rng()` or other non-seedable RNG sources in library code
**Rationale:**
- Enables reproducible experiments and debugging
- Allows users to replay exact evolutionary runs
- Facilitates testing with deterministic outcomes
**Example:**
```rust
use rand::{Rng, SeedableRng, rngs::StdRng};
// Good: accepts RNG parameter
pub fn bit_flip<R: Rng>(chromosome: &mut Chromosome, p: f64, rng: &mut R) { ... }
// Usage with reproducible seed
let mut rng = StdRng::seed_from_u64(42);
bit_flip(&mut chromosome, 0.05, &mut rng);
```
### Fitness Caching
Chromosomes cache their fitness values to avoid redundant computation. Fitness is expected to be CPU-intensive and the critical path.
**Design:**
- Fitness is stored as `Option<Box<[f64]>>` supporting both single and multi-objective optimization
- Single-objective: `chromosome.set_fitness(vec![0.5])`
- Multi-objective: `chromosome.set_fitness(vec![0.5, 0.3, 0.8])`
- Fitness is automatically invalidated when chromosome data changes
**Implicit Invalidation:**
- All mutation functions (`bit_flip`, `fill_random`) automatically set `fitness = None`
- Crossover produces children with `fitness = None`
- Internal `set_data()` method ensures fitness is invalidated on any data change
**Rationale:**
- Avoids recomputing expensive fitness for unchanged chromosomes
- Uniform `&[f64]` API simplifies multi-objective support
- Implicit invalidation prevents stale fitness bugs
**Example:**
```rust
// Create chromosome and evaluate fitness
let mut chromosome = Chromosome::new(16, &mut rng);
chromosome.set_fitness(vec![rastrigin(&chromosome)]);
// Access cached fitness
if let Some(fitness) = chromosome.fitness() {
println!("Fitness: {}", fitness[0]);
}
// Mutation automatically invalidates fitness
mutation::bit_flip(&mut chromosome, 0.01, &mut rng);
assert!(chromosome.fitness().is_none()); // Must re-evaluate
```
### Concurrency
Tokio is used for task-level parallelism (multiple GA runs, island model). SIMD is expected to provide benefit for individual fitness calculations.
**Rationale:**
- Task-level parallelism enables running multiple independent optimizations concurrently
- Async patterns support distributed/networked scenarios
- SIMD optimization is left to the application layer for domain-specific fitness functions
### Selection
All selection methods return `impl Iterator<Item = Chromosome>`, enabling flexible composition with `.take(n)` to select any number of parents.
**Available algorithms:**
| `rank` | Single | Yes | Sorts by fitness ascending (lower = better) |
| `tournament` | Single | No | Best of k random individuals per selection |
| `roulette` | Single | No | Probability proportional to fitness quality |
| `nsga2` | Multi | Yes | Pareto fronts + crowding distance |
**Single-objective constraints:**
- `rank`, `tournament`, and `roulette` require fitness length = 1
- Enforced via `debug_assert!` (panics in debug builds only)
**Multi-objective (NSGA-II):**
- Supports any number of objectives
- Returns chromosomes ordered by: Pareto front (lower = better), then crowding distance (higher = more diverse)
- Boundary solutions receive infinite crowding distance
**Example:**
```rust
// Single-objective: get top 50 by rank
let parents: Vec<_> = selection::rank(&population).take(50).collect();
// Single-objective: tournament selection with k=3
let parents: Vec<_> = selection::tournament(&population, 3, &mut rng)
.take(50)
.collect();
// Multi-objective: NSGA-II for Pareto-optimal selection
let parents: Vec<_> = selection::nsga2(&population).take(100).collect();
```