1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
#![deny(missing_docs)]

//! # Hypo - Automatic hypothesis testing
//!
//! This library consists of various algorithms for automatic hypothesis testing.
//!
//! ### Introduction
//!
//! In the scientific method, one formulates a falsifiable hypothesis
//! which is tested against experiments.
//!
//! Instead of designing one experiment to test a single hypothesis,
//! one can design an experiment to eliminate as many hypotheses as possible.
//! This method is somewhat efficient when hypotheses are easy to generate
//! and computationally fast to make predictions.
//!
//! Automatic hypothesis testing can be used when hypotheses are generated
//! in advance and the experiments can be configured and executed automatically.
//!
//! ### Optimal guesses
//!
//! An optimal guess is an experiment which is picked based on its capability
//! to eliminate potentially false hypotheses.
//!
//! Assume that an experiment simply returns a conclusion `true` or `false`.
//! Before performing the experiment, the conclusion is unknown.
//!
//! To maximize the number of eliminated false hypotheses,
//! one exploits the property that the conclusion of the experiment is yet unknown.
//! The optimal guess is picked such that competing hypotheses disagree with each other.
//!
//! The ideal experiment is one that eliminates half the hypotheses
//! no matter what the conclusion of the experiment will be.
//! This works in a similar way to binary search,
//! except that hypotheses do not need to be sorted.
//! In practice one can choose the experiment for which the number of hypotheses
//! that returns `true` is as close as possible to half the total number of hypotheses.
//!
//! In [path semantical notation](https://github.com/advancedresearch/path_semantics),
//! the following measure is minimized over `e`:
//!
//! ```
//! abs(|h : (prediction e)| - |Hypothesis| / 2)
//!
//! h : Hypothesis
//! e : Experiment
//! prediction : Hypothesis x Experiment -> bool
//! ```
//!
//! By repeating such experiments,
//! one is more likely to minimize the number of experiments required
//! to eliminate as many false hypotheses as possible.

/// Returns an optimal guess given a list of hypotheses.
///
/// An optimal guess is an experiment that is as close as possible
/// to eliminating half the hypotheses when being tested.
///
/// Here, the experiment is referred to by an index.
///
/// - `n` is the number of available experiments
/// - `f` is a function that makes a prediction from experiment
pub fn optimal_guess<H, F>(n: usize, hypos: &[H], f: F) -> Option<usize>
    where F: Fn(&H, usize) -> bool
{
    let mut min: Option<(usize, usize)> = None;
    let half = hypos.len() / 2;
    for i in 0..n {
        let mut count = 0;
        for hy in hypos {
            if f(hy, i) {count += 1}
        }
        let dist = if count > half {count - half} else {half - count};
        if min.is_none() || min.unwrap().0 > dist {
            min = Some((dist, i));
        }
    }
    min.map(|(_, b)| b)
}

/// Remove all hypotheses that predicted the wrong answer.
pub fn update<H, F>(hypos: &mut Vec<H>, val: usize, answer: bool, f: F)
    where F: Fn(&H, usize) -> bool
{
    for i in (0..hypos.len()).rev() {
        if f(&hypos[i], val) != answer {
            hypos.swap_remove(i);
        }
    }
}

/// Reduce number of hypotheses by repeatedly making optimal guesses for experiments.
pub fn experiment<H, F, P>(n: usize, hypos: &mut Vec<H>, mut f: F, p: P)
    where F: FnMut(usize) -> bool,
          P: Copy + Fn(&H, usize) -> bool
{
    loop {
        if let Some(guess) = optimal_guess(n, hypos, p) {
            let n = hypos.len();
            update(hypos, guess, f(guess), p);
            if hypos.len() == n {break}
        } else {break}
    }
}