genetic_algorithm 0.11.1

A genetic algorithm implementation
Documentation

genetic-algorithm

Crates.io MSRV Crates.io Version Rust Crates.io License

A genetic algorithm implementation for Rust. Inspired by the book Genetic Algorithms in Elixir

There are three main elements to this approach:

  • The Genotype (the search space)
  • The Fitness function (the search goal)
  • The strategy (the search strategy)
    • Evolve (evolution strategy)
    • Permutate (for small search spaces, with a 100% guarantee)
    • HillClimb (when search space is convex with little local optima or when crossover is impossible/inefficient)

Terminology:

  • Population: a population has population_size number of individuals (called chromosomes).
  • Chromosome: a chromosome has genes_size number of genes
  • Gene: a gene is a combination of position in the chromosome and value of the gene (allele)
  • Allele: alleles are the possible values of the genes
  • Genotype: holds the genes_size and alleles and knows how to generate and mutate chromosomes efficiently
  • Fitness: knows how to determine the fitness of a chromosome

All multithreading mechanisms are implemented using rayon::iter and std::sync::mpsc.

Documentation

See docs.rs

Quick Usage

use genetic_algorithm::strategy::evolve::prelude::*;

// the search space
let genotype = BinaryGenotype::builder() // boolean alleles
    .with_genes_size(100)                // 100 genes per chromosome
    .build()
    .unwrap();

println!("{}", genotype);

// the search goal to optimize towards (maximize or minimize)
#[derive(Clone, Debug)]
pub struct CountTrue;
impl Fitness for CountTrue {
    type Allele = BinaryGenotype; // bool
    fn calculate_for_chromosome(&mut self, chromosome: &Chromosome<Self::Allele>) -> Option<FitnessValue> {
        Some(chromosome.genes.iter().filter(|&value| *value).count() as FitnessValue)
    }
}

// the search strategy
let evolve = Evolve::builder()
    .with_genotype(genotype)
    .with_compete(CompeteElite::new())             // sort the chromosomes by fitness to determine crossover order
    .with_crossover(CrossoverUniform::new(0.5))    // crossover all individual genes between 2 chromosomes for offspring, keep 50% of parents around for next generation
    .with_mutate(MutateOnce::new(0.2))             // mutate a single gene with a 20% probability per chromosome
    .with_fitness(CountTrue)                       // count the number of true values in the chromosomes
    .with_target_population_size(100)              // evolve with 100 chromosomes
    .with_target_fitness_score(100)                // goal is 100 times true in the best chromosome
    .with_reporter(EvolveReporterSimple::new(100)) // optional builder step, report every 100 generations
    .call();
    .unwrap()

println!("{}", evolve);

Examples

Run with cargo run --example [EXAMPLE_BASENAME] --release

Performance considerations

For the Evolve strategy:

  • Compete: no considerations. All competes are basically some form of in-place sorting of some kind. This is relatively fast compared to the rest of the operations.
  • Crossover: the workhorse of internal parts. Crossover touches most genes each generation and clones the whole population if you keep the parents around. See performance tips below.
  • Mutate: no considerations. It touches genes like crossover does, but should be used sparingly anyway; with low gene counts (<10%) and low probability (5-20%)
  • Fitness: can be anything. This fully depends on the user domain. Parallelize it using with_par_fitness() in the Builder. But beware that parallelization has it's own overhead and is not always faster.

Performance Tips

  • Small genes sizes
    • It seems that CrossoverMultiGene with number_of_crossovers = genes_size / 2 and allow_duplicates = true is the best tradeoff between performance and effect. CrossoverUniform is an alias for the same approach, taking the genes_size from the genotype at runtime.
    • Keeping the parents around doesn't matter that much as the cloning is relatively less pronounced (but becomes more prominent for larger population sizes)
  • Large genes sizes
    • It seems that CrossoverMultiPoint with number_of_crossovers = genes_size / 9 and allow_duplicates = false is the best tradeoff between performance and effect.
    • Keeping the parents around has major performance effects and should be avoided

Tests

Run tests with cargo test

Use .with_rng_seed_from_u64(0) builder step to create deterministic tests results.

Benchmarks

Implemented using criterion. Run benchmarks with cargo bench

Profiling

Implemented using criterion and pprof.

Uncomment in Cargo.toml

[profile.release]
debug = 1

Run with cargo run --example profile_evolve_binary --release -- --bench --profile-time 5

Find the flamegraph in: ./target/criterion/profile_evolve_binary/profile/flamegraph.svg

TODO

MAYBE

  • Target cardinality range for Mutate Dynamic to avoid constant switching
  • Default max_stale_generations to 1 for SteepestAscent
  • Add scaling permutate? Can be done by grid search and then search within last grid with new scale
  • Add scaling helper function
  • Add simulated annealing strategy
  • Add Roulette competition with and without duplicates (with fitness ordering)
  • Add OrderOne crossover for UniqueGenotype?
  • Add WholeArithmetic crossover for RangeGenotype?
  • Add CountTrueWithWork instead of CountTrueWithSleep for better benchmarks?
  • Explore non-Vec genes:
    • All Mutate and Crossover logic should be internal to the Genotype (where the genes internal structure is known). Fitness would know about the genes internal structure as well of course. Compete, Extension and the rest of the main loop don't care, as these only care about the fitness value.
    • Switch the user API associated trait back from Allele to Genotype. And then Chromosome would store Genotype::Genes, which is not necessarily a Vec<Allele> anymore.
    • Then we could add PackedSimd, SmallVec or FixedBitSet genotypes. And see how they perform differently

ISSUES

  • permutate (and possibly others) with gene_size 0 panics. Maybe it should just return a empty chromosome?