1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
/*!
# BoolNetEvo

Evolve populations of boolean networks to approximate bitstring functions and their (unknown) inverses.

<details>
<summary>
► Machinery <i>(click to show/hide)</i>
</summary>

```text
Layer k+1               ●                         ●
↑               ┌───────┼───────┐         ┌───────┼───────┐
↑               ○       ○       ○         ○       ○       ○
↑             / | \   / | \   / | \     / | \   / | \   / | \
Layer k      ●  ●  ● ●  ●  ● ●  ●  ●   ●  ●  ● ●  ●  ● ●  ●  ●
```

[Boolean network](https://en.wikipedia.org/wiki/Boolean_network), or *boolnet*, is a well-known particular case of an [artificial neural network](https://en.wikipedia.org/wiki/Artificial_neural_network) where inputs/outputs of each cell, or *boolon*, are bits — 0 (false) and 1 (true) — rather than values from a range, say, (-∞; +∞). Output of a boolon is a boolean function of its inputs, which are outputs of boolons from previous layer.

In this implementation, boolean functions have 3 variables and are encoded by `u8` values, which represent the truth tables of such functions. Each boolon has input tree, whose leaves are its input boolons, and then, at each level of that tree, a boolean function of 3 values at current level gives 1 value at next level. Single value at root level is the output of this boolon. On the figure above, input tree has height 2, so each boolon's output is affected by 3<sup>2</sup> = 9 outputs of boolons from previous layer (some may coincide).

Number of boolons is the same in all layers, and the signal can propagate through a boolnet for few times, called *cycles*: at the end of each cycle, outputs of the last layer become inputs for the first layer. Under these constraints, layers can be viewed as steps residing in time rather than space, if you identify i-th boolon of all layers, which changes its input tree at every step within cycle.

Layers may have *rands* boolons whose outputs are random. Enter robustness and free (unpredictable) will... at the cost of slower learning.

Array of 256 possible 3-variable boolean functions is used to perform simultaneous calculation for 128 data flows, in a *batch*.

Several boolnets of the same architecture make a *population*, which evolves in time by means of the [genetic algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm). At each *epoch*:

- all boolnets are scored w.r.t. how they calculate some bitstring function or its inverse, on the average (the more wrong bits, the less the score),

- boolnets with the lowest scores are replaced by inheritants of boolnets with the highest scores.

Inheritance, as usually, is [crossover](https://en.wikipedia.org/wiki/Crossover_(genetic_algorithm)) and [mutation](https://en.wikipedia.org/wiki/Mutation_(genetic_algorithm)),
in this case for 1) indices of input boolons and 2) boolean functions in the input tree, for each boolon.

Note that a single boolnet does not evolve; a population does.

The purpose of the crate is to provide controls for this process and make it more interactive.
</details>

## Quick Start

**Warning**: use `release` profile, without optimizations the execution is 20–30 times slower.

In your `code.rs`, bring `BitransformRegister` and `Evolver` structs into scope:

```ignore
extern crate boolnetevo;

use boolnetevo::{
    BitransformRegister,
    Evolver
};
```

and create their default instances:

```ignore
fn app() -> Result<(), String> {
    let register = BitransformRegister::new_default();
    let mut evolver = Evolver::new_default(&register)?;
    ...
}
```

Now, you can run the built-in shell...

```ignore
    evolver.shell(&register)?;
```

and get something like this (with colors) on your terminal:

<div style="background: #000000; font-family: mono; font-size: 80%; color: #D0D0D0; padding: 16px;">
<font color="#729FCF"><b>BITRANSFORM:</b></font> InvXor { direct: Xor }<br>
<font color="#729FCF"><b>POPULATION:</b></font> 250<br>
<font color="#729FCF"><b>ARCHITECTURE:</b></font> <font color="#FCE94F">height</font> = 1, <font color="#FCE94F">size</font> = 48, <font color="#FCE94F">rands</font> = 2, <font color="#FCE94F">layers</font> = 1, <font color="#FCE94F">cycles</font> = 1<br>
<font color="#729FCF"><b>PARAMETERS:</b></font> <font color="#FCE94F">batches</font> = 8, <font color="#FCE94F">replace_ratio</font> = 0.7, <font color="#FCE94F">par_ratio</font> = 0.8, <font color="#FCE94F">parents</font> = 2, <font color="#FCE94F">par_sw_prob</font> = 0.9, <font color="#FCE94F">mutat_prob</font> = 0.1<br>
<font color="#3465A4"><b>SETTINGS:</b></font> <font color="#C4A000">log</font> <font color="#06989A">off</font>, <font color="#C4A000">test</font> <font color="#06989A">new</font>, <font color="#C4A000">print</font> <font color="#06989A">improve</font><br>
<font color="#555753">Type &quot;?&quot; to see the list of available commands...</font><br>
$ evol<br>
<font color="#555753">Press any key to stop...</font><br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1 ( &nbsp;&nbsp;&nbsp; 41 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -9.0586<font color="#AD7FA8">, average</font> = &nbsp; -7.9585<font color="#AD7FA8">, max</font> = &nbsp; -6.8008<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2 ( &nbsp;&nbsp;&nbsp; 29 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -8.9746<font color="#AD7FA8">, average</font> = &nbsp; -7.7293<font color="#AD7FA8">, max</font> = &nbsp; -6.7021<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3 ( &nbsp;&nbsp;&nbsp; 29 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -8.9043<font color="#AD7FA8">, average</font> = &nbsp; -7.5695<font color="#AD7FA8">, max</font> = &nbsp; -6.4512<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 4 ( &nbsp;&nbsp;&nbsp; 18 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -8.9697<font color="#AD7FA8">, average</font> = &nbsp; -7.4082<font color="#AD7FA8">, max</font> = &nbsp; -5.7324<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 6 ( &nbsp;&nbsp;&nbsp; 17 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -8.7197<font color="#AD7FA8">, average</font> = &nbsp; -7.1513<font color="#AD7FA8">, max</font> = &nbsp; -5.7129<br>
...<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 58 ( &nbsp;&nbsp;&nbsp; 29 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -5.0742<font color="#AD7FA8">, average</font> = &nbsp; -1.9552<font color="#AD7FA8">, max</font> = &nbsp; -0.4736<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 64 ( &nbsp;&nbsp;&nbsp; 16 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -4.5156<font color="#AD7FA8">, average</font> = &nbsp; -1.8883<font color="#AD7FA8">, max</font> = &nbsp; -0.2500<br>
<font color="#AD7FA8"><b>Epoch</b></font> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 65 ( &nbsp;&nbsp;&nbsp; 17 <font color="#3465A4">ms</font>)<font color="#AD7FA8"><b>:</b></font> <font color="#AD7FA8">min</font> = &nbsp; -5.0068<font color="#AD7FA8">, average</font> = &nbsp; -1.8879<font color="#AD7FA8">, max</font> = &nbsp;&nbsp; 0.0000<br>
<br>
<font color="#729FCF"><b>BITRANSFORM:</b></font> InvXor { direct: Xor }<br>
<font color="#729FCF"><b>POPULATION:</b></font> 250<br>
<font color="#729FCF"><b>ARCHITECTURE:</b></font> <font color="#FCE94F">height</font> = 1, <font color="#FCE94F">size</font> = 48, <font color="#FCE94F">rands</font> = 2, <font color="#FCE94F">layers</font> = 1, <font color="#FCE94F">cycles</font> = 1<br>
<font color="#729FCF"><b>PARAMETERS:</b></font> <font color="#FCE94F">batches</font> = 8, <font color="#FCE94F">replace_ratio</font> = 0.7, <font color="#FCE94F">par_ratio</font> = 0.8, <font color="#FCE94F">parents</font> = 2, <font color="#FCE94F">par_sw_prob</font> = 0.9, <font color="#FCE94F">mutat_prob</font> = 0.1<br>
<font color="#3465A4"><b>SETTINGS:</b></font> <font color="#C4A000">log</font> <font color="#06989A">off</font>, <font color="#C4A000">test</font> <font color="#06989A">new</font>, <font color="#C4A000">print</font> <font color="#06989A">improve</font><br>
$ run 0 a7b8<br>
e9924e2a
</div>

<br>

(To verify that `0xe992 XOR 0x4e2a = 0xa7b8`,

<div style="background: #000000; font-family: mono; font-size: 80%; color: #D0D0D0; padding: 16px;">
$ dir e9924e2a<br>
a7b8
</div>

<br>

or use any calculator in programming mode.)

...*OR* you can call `Evolver`'s methods directly:

```ignore
    evolver.bitran(&register, "inv-sha1", &[1, 4])?; // 1 round of SHA-1 with 4-byte message
    evolver.arch(2, 240, 8, 2, 1)?;
    evolver.par("mutat_prob", "0.2")?;
    evolver.evol(100)?;
    evolver.par("mutat_prob", "0.1")?;
    evolver.evol(50)?;
    evolver.save("evolvers/invsha1-demo")?;
```

See `src/bin/shell.rs` and `src/bin/script.rs`.

## Custom Bitransforms

The evolution optimizes a population towards precise calculation of some bitstring function. The related methods — constructing the function's parameters block, providing input and output size in bits, scoring the input-output batch pair in regard to the function — are gathered in `Bitransform` trait. The crate implements it for `xor`, `add`, `mult`, [`CRC32`](https://en.wikipedia.org/wiki/Cyclic_redundancy_check#CRC-32_algorithm), [`MD5`](https://en.wikipedia.org/wiki/MD5), [`SHA1`](https://en.wikipedia.org/wiki/SHA-1), [`SHA2`](https://en.wikipedia.org/wiki/SHA-2), [`SHA3/Keccak`](https://en.wikipedia.org/wiki/SHA-3) functions and their inverses, and you can implement it for your own functions.

As an example, see `src/bitransform/xor.rs` and `src/bitransform/sha1.rs`. The difference between direct and inverse bitransforms is that in the former case, the *function of input* is compared bitwise with the *output*, while in the latter case, the same *function of output* is compared with the *input*. In other words, you do not need explicit inverse function to score something that tries to perform inversion... actually, "something" should become an inverter.

Assume you have done it for `Func` in `func` module, then you add it to register:

```ignore
mod func;

use func::{Func, InvFunc};

fn app() -> Result<(), String> {
    let mut register = BitransformRegister::new_default();
    register.add(vec![
        Func::reg_entry(), InvFunc::reg_entry()
    ])?;
    ...
}
```

and now you can use it with evolver:

```ignore
    let mut evolver = Evolver::new_default(&register)?;
    evolver.bitran(&register, "inv-func", &[7, 5, 3])?;
    evolver.evol(1000)?;
```

(Here `7, 5, 3` are your function's parameters.)

## Modifying the crate

Well, copy its source and do whatever you like: extend architecture, apply another genetic algorithm, optimize performance, ... It is something else then, and you should give it a different name.

## Kin

Some resembling frameworks, packages, projects:

- [Boolean-Circuit-Evolution](https://github.com/JBQuim/Boolean-Circuit-Evolution)

- [BoolNet](https://cran.r-project.org/web/packages/BoolNet/)

- [PyBoolNet](https://github.com/hklarner/PyBoolNet)

- [revonet](https://crates.io/crates/revonet)

- [rsgenetic](https://crates.io/crates/rsgenetic)

There are probably more... When searching for those, keep in mind that the "evolution" term in this area sometimes means change of a single boolnet state in time rather than optimization of a boolnet population.

## A*s*hievements

So far, well... While populations relatively quickly learn to invert simple bitstring functions like bitwise XOR, they struggle with more complex ones, especially those created to prevent inversion, like rounds of cryptographic hashes SHA-x. Even ADD, due to carry, requires more "advanced" boolnet architecture (more layers etc.), which makes learning significantly slower. There seem to be non-zero bounds for the best score, and, of course, there is no guarantee to reach score 0, that is, to breed a boolnet that *always* calculates a function precisely.

There may be a *proof* somewhere that this approach has insurmountable limitations when a function's complexity exceeds certain threshold, and then to apply it would be like trying to make a bridge of ashes. On the other hand, it can be a part of some more sophisticated scheme.

## A glitch

**Case A.** Set the following evolver:

- bitransform is `inv-sha2` with 64 rounds (i.e. full SHA2-256) and message length 60 bytes (480 bits) *OR* `inv-keccak` with hash length 256, 24 rounds (i.e. full SHA3-256), and the same message length;

- population is 500;

- architecture is `height = 1`, `size = 8000` (yes, most will be unused), `rands = 0`, `layers = 1` (single layer...), `cycles = 1`;

- evolution parameters are default except for `batches = 16` (or `batches = 32`) and `mutat_prob = 0.001`;

- settings: change `test` to `all` and `print` to `all`.

Now, start the evolution, `$ e` in shell, and watch the `average` score column. It remains very close to `-128.0` however many epochs pass, you should never get more than `-127.95` or so (the more the `batches`, the more precision the law of large numbers provides).

*This* outcome is quite expectable: not a single network of such simple architecture, — where each output bit is a function of 3 or less hash bits, — is able to invert these *preimage-resistant* cryptographic hashes statistically better than guessing a message (and thus its hash) randomly. The population as a whole shows the absence of improvement as well, of course.

**Case B.** Same as Case A with a single difference: decrease message length to 40 bytes (320 bits — notice it is still larger than hash length); do not forget to `reset` the population after. Now, start the evolution again. After 100 or so epochs, the `average` score increases to nearly `-127.0` and remains there. It seemingly begins to improve at 20–30th epoch, when some networks gain (?) a very, very tiny ability to invert hashes better (?) than a random guessing; then their descendants spread in the population.

Reset populations and repeat evolutions from Cases A and B several times to verify their stability. Or try it for SHA1, MD5 with similar results.

What would that 1 bit (which is >10σ-deviation from mean of approximated normal distribution of the score) mean, were it not the header of this section... Hint: consider the inverter that, given a hash, tries several (even 2) fixed messages and selects one whose hash has maximum number of coinciding bits.
*/

mod base;
mod bitransform;
mod bool3funcs;
mod boolnet;
mod evolearn;
mod evolver;
mod rwbytes;

pub use bitransform::{
    Bitransform,
    BitransformDesc,
    BitransformRegister
};

pub use evolver::Evolver;