symbios 1.4.0

A derivation engine for L-Systems (ABOP compliant).
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
# Symbios Architecture

This document explains the design decisions and architectural patterns used in Symbios.

## Table of Contents

1. [System Overview]#system-overview
2. [Structure-of-Arrays (SoA) Layout]#structure-of-arrays-soa-layout
3. [Virtual Machine Design]#virtual-machine-design
4. [Symbol Interning]#symbol-interning
5. [Topology Skip-Links]#topology-skip-links
6. [Compilation Pipeline]#compilation-pipeline
7. [Derivation Engine]#derivation-engine
8. [Genetic Algorithm Architecture]#genetic-algorithm-architecture
9. [Source Round-Tripping Pipeline]#source-round-tripping-pipeline
10. [Memory Bounds and Safety]#memory-bounds-and-safety

---

## System Overview

Symbios is organized into four core modules, with the `system` module further split into focused submodules:

```
src/
├── core/              # State management and data structures
│   ├── mod.rs         # SymbiosState with SoA layout
│   └── interner.rs    # Symbol table for string interning
├── parser/            # Rule and expression parsing
│   ├── mod.rs         # Parser combinators (nom-based)
│   └── ast.rs         # Abstract syntax tree definitions
├── vm/                # Expression evaluation engine
│   ├── mod.rs         # Stack-based virtual machine
│   ├── ops.rs         # Bytecode instruction set
│   ├── compiler.rs    # AST-to-bytecode compiler
│   └── decompiler.rs  # Bytecode-to-AST decompiler
└── system/            # Main orchestrator
    ├── mod.rs         # System struct, RuntimeRule, config types
    ├── derivation.rs  # Derive loop with stochastic rule selection
    ├── matching.rs    # Context-sensitive pattern matching
    ├── mutate.rs      # Mutation operators (9 types)
    ├── crossover.rs   # Crossover operators (4 strategies)
    ├── export.rs      # Rule decompilation and to_source()
    └── source.rs      # SourceGenotype wrapper, from_source()
```

### Data Flow

```
Rule String → Parser → AST → Compiler → Bytecode → VM → f64 result
Axiom String → Parser → Interned Symbols + Parameters → State
Derivation Loop:
  1. Match predecessor + context (matching.rs)
  2. Select stochastic rule (derivation.rs)
  3. Evaluate guard condition (VM)
  4. Compile successor expressions (VM)
  5. Build new state with results
    Export Pipeline (reverse):
  Bytecode → Decompiler → AST → Formatter → Source Text
```

---

## Structure-of-Arrays (SoA) Layout

### Why SoA?

Traditional object-oriented approaches store module data as:

```rust
// Array-of-Structs (AoS) - NOT used
struct Module {
    symbol: u16,
    birth_time: f64,
    parameters: Vec<f64>,
    topology_link: Option<usize>,
}
let modules: Vec<Module>;
```

Symbios instead uses **Structure-of-Arrays**:

```rust
// SoA - What Symbios actually uses
struct ModuleData {
    symbol: u16,
    birth_time: f64,
    param_start: u32,
    param_len: u16,
    topology_link: u32,
}

pub struct SymbiosState {
    modules: Vec<ModuleData>,  // Flat array of metadata
    params: Vec<f64>,          // Separate flat array of all parameters
    current_time: f64,
    max_capacity: usize,
}
```

### Benefits

1. **Cache Locality**: During derivation, we iterate over all modules checking symbols and conditions. With SoA, all `symbol` fields are packed contiguously in memory, leading to fewer cache misses.

2. **Memory Efficiency**:
   - No per-module `Vec` overhead (24 bytes each on 64-bit systems)
   - Parameters stored in a single flat `Vec<f64>` with O(1) indexing via `param_start`

3. **WASM Compatibility**: Flat memory layout translates cleanly to WASM linear memory, avoiding pointer indirection.

4. **Serialization**: The entire state can be serialized/deserialized efficiently with serde.

### Trade-offs

- **Complexity**: Accessing a module requires indexing into multiple arrays
- **Solution**: Provide a `ModuleView<'a>` helper that presents a clean interface:

```rust
pub struct ModuleView<'a> {
    pub sym: u16,
    pub age: f64,
    pub params: &'a [f64],
    pub skip_idx: Option<usize>,
}
```

---

## Virtual Machine Design

### Stack-Based Bytecode VM

Symbios compiles arithmetic expressions (from conditions and successor parameters) into bytecode instructions, then executes them on a stack-based virtual machine.

#### Instruction Set

```rust
pub enum Op {
    // Data loading
    Push(f64),          // Push constant onto stack
    LoadParam(u16),     // Load predecessor parameter by index
    LoadAge,            // Load module age (current_time - birth_time)

    // Arithmetic
    Add, Sub, Mul, Div, Pow, Neg,

    // Relational (epsilon-based comparison)
    Eq, Ne, Gt, Lt, Ge, Le,

    // Logical
    And, Or, Not,

    // Math functions
    Math(MathOp),
}

pub enum MathOp {
    Sin, Cos, Tan,        // Trigonometric (unary)
    Sqrt, Abs,            // Unary
    Floor, Ceil, Round,   // Rounding (unary)
    Min, Max,             // Binary (arity=2)
}
```

#### Example Compilation

```
Expression:  x + 2 * sin(y)
Bytecode:    LoadParam(0)  // x
             Push(2.0)
             LoadParam(1)  // y
             Math(Sin)
             Mul
             Add
Stack trace: [x] → [x, 2.0] → [x, 2.0, y] → [x, 2.0, sin(y)] → [x, 2.0*sin(y)] → [x + 2.0*sin(y)]
```

### Why a VM?

**Alternative 1: Interpret AST directly**
- Problem: Tree traversal overhead on every derivation step
- Overhead: Pointer chasing, dynamic dispatch, repeated allocations

**Alternative 2: Code generation**
- Problem: Requires JIT compilation or unsafe FFI
- Not viable for WASM or embedded targets

**VM Approach:**
- Compile once, execute many times
- Bytecode is compact (`Vec<Op>`, enum-based)
- Stack operations are cache-friendly
- Deterministic execution (no JIT non-determinism)

### Safety Guarantees

- Stack overflow protection (max 256 items)
- Stack underflow checks before each pop
- NaN/Infinity rejection (returns `VMError::MathError`)
- Division by zero produces NaN, caught by finite check
- Epsilon-based float comparison for relational operators

---

## Symbol Interning

### Problem

L-System strings can grow exponentially. Storing symbol names as `String` for each module would be prohibitively expensive.

### Solution

A **symbol table** (interner) maps strings to u16 IDs:

```rust
pub struct SymbolTable {
    to_id: HashMap<Arc<str>, u16>,
    to_str: Vec<Arc<str>>,
    current_bytes: usize,
    max_bytes: usize,  // Default: 10 MB
}
```

**Usage:**
```rust
let id = interner.get_or_intern("A")?;  // Returns u16
let name = interner.resolve(id)?;       // Returns &str
```

### Benefits

1. **Memory**: Each module stores only 2 bytes (u16) instead of 24+ bytes (String)
2. **Comparison**: Symbol equality is integer comparison (O(1))
3. **Bounds**: Maximum 65,535 unique symbols (u16::MAX)

### Safety

- Heap overflow protection: Rejects new symbols if `current_bytes > max_bytes`
- Returns error instead of OOM panics
- Uses `Arc<str>` for shared ownership

---

## Topology Skip-Links

### Context-Sensitive Matching

L-Systems support context-sensitive rules like:

```
B < A(x) > C : condition -> successor
```

This means "match A only when preceded by B and followed by C".

### Branch Problem

In branching structures, context can span branches:

```
String: X [ Y A ] Z
Rule:   Y < A > Z
```

The predecessor `A` is inside a branch `[]`, but its right context `Z` is outside.

### Skip-Link Solution

Symbios pre-calculates **topology links** that map:
- Open bracket `[` → matching close bracket `]`
- Close bracket `]` → matching open bracket `[`

**Algorithm:**
1. Scan the string once
2. Use a stack to match bracket pairs
3. Store forward/backward indices in `topology_link: u32`

**Result:** O(1) navigation to skip entire branches during context matching.

**Example:**
```
String:     X [ Y A ] Z
Indices:    0 1 2 3 4 5
Links:      - 1→4 - - 4→1 -
```

### Interaction with `#ignore`

Topology skip-links take precedence over `#ignore` for bracket symbols. Even if `[` and `]` appear in an `#ignore` directive, they are still used for branch navigation during context matching. The `#ignore` only suppresses them as context match targets.

---

## Compilation Pipeline

### Phase 1: Parsing and Interning

```rust
// Input
sys.add_rule("A(x) : x < 10 -> B(x + 1)")?;

// Parse (nom combinators)
let rule = parse_rule(input)?;  // Returns Rule AST

// Intern symbols
let pred_sym = interner.get_or_intern("A")?;
let succ_sym = interner.get_or_intern("B")?;
```

### Phase 2: Bytecode Compilation

```rust
// Compile guard condition
let guard_code = compiler.compile(&rule.condition)?;  // Vec<Op>

// Compile successor parameter expressions
for expr in &rule.successor.parameters {
    let param_code = compiler.compile(expr)?;
    // Store in RuntimeModule
}
```

### Phase 3: Execution (Derivation)

```rust
for each module in state {
    for each rule matching module.symbol {
        // Evaluate guard
        let guard_result = vm.eval(&rule.guard_code, module.params, module.age)?;
        if guard_result != 0.0 {
            // Collect matching rules, select stochastically
            // Evaluate successor parameters
            // Add to new state
        }
    }
}
```

### Phase 4: Decompilation (Reverse)

The decompiler reconstructs ASTs from bytecode, and the exporter formats them back to source text:

```
Vec<Op> → Decompiler → Expr AST → format_expr() → String
RuntimeRule → export_rule() → Rule AST → format_rule() → Source text
```

This supports lossless round-tripping with preserved parameter names.

---

## Derivation Engine

### Stochastic Rule Selection

When multiple rules match a module, Symbios uses weighted random selection:

1. Collect all rules whose predecessor symbol matches and whose guard condition evaluates to nonzero
2. Sum their `probability` weights to get `total_weight`
3. Generate a random value in `[0, total_weight)` using the system RNG
4. Select the rule whose cumulative weight range contains the random value
5. A single matching rule always fires regardless of its probability value

### Zero-Allocation Matching

The `MatchScratch` struct provides reusable buffers to avoid per-module allocation:

```rust
pub struct MatchScratch {
    pub context_frame: Vec<f64>,    // Reusable parameter buffer
    pub left_indices: Vec<usize>,   // Left context match indices
    pub right_indices: Vec<usize>,  // Right context match indices
}
```

### Context Parameter Binding

When a context-sensitive rule matches, parameters from context modules are loaded into the VM evaluation frame. The derivation engine builds a combined parameter vector: predecessor params, then left context params, then right context params.

---

## Genetic Algorithm Architecture

### Mutation Operators

Symbios provides 9 mutation operators, organized by granularity:

| Operator | Config | Targets |
|----------|--------|---------|
| **Probability mutation** | `MutationConfig` | Rule probability weights |
| **Constant mutation** | `MutationConfig` | `#define` constant values |
| **Gaussian jitter** | `MutationConfig` | `Push(f64)` bytecode literals |
| **Structural mutation** | `StructuralMutationConfig` | Successor module sequence |
| **Operator flip** | `OperatorFlipConfig` | Arithmetic/relational ops |
| **Rule duplication** | `RuleDuplicationConfig` | Clone + specialize rules |
| **Topological mutation** | `TopologicalMutationConfig` | Turtle command pairs |
| **Literal promotion** | `LiteralPromotionConfig` | Promote literals to constants |
| **Sub-expression graft** | `graft_rate: f64` | Branch blocks between systems |

### Crossover Strategies

| Strategy | Config | Behavior |
|----------|--------|----------|
| **Uniform** | `CrossoverConfig` | Select entire rule sets per predecessor |
| **Homologous** | `AdvancedCrossoverConfig` | Select individual rules per shared predecessor |
| **BLX-alpha** | `AdvancedCrossoverConfig` | Interpolate bytecode literals between parents |
| **Sub-expression grafting** | `graft_rate: f64` | Swap balanced `[...]` branch blocks |

### Arity Safety

The system tracks `symbol_arities: HashMap<u16, usize>` to ensure structural mutations insert modules with the correct number of parameters. When inserting a new module, the mutation engine samples from known symbols and respects their observed arity.

### All Operators Have `_with_rng` Variants

Every mutation and crossover method delegates to a `_with_rng` variant that accepts an external `&mut impl Rng`. The non-`_with_rng` methods use the system's internal `Pcg64` RNG. This separation supports both convenience and deterministic testing.

---

## Source Round-Tripping Pipeline

### `from_source(source: &str) -> Result<System>`

Parses a multi-line L-system definition:

1. Lines starting with `#define` → constants
2. Lines starting with `#ignore` → ignored symbols
3. Lines starting with `omega:` → axiom
4. Other non-empty lines → preamble (comments) or rules
5. Rules are parsed with `parse_rule`, compiled, and stored

### `to_source() -> String`

Reconstructs complete source text from a compiled system:

1. Emit preamble lines (comments, `#ignore` directives)
2. Emit `#define` entries from stored constants (sorted alphabetically)
3. Emit `omega: <axiom>` from `axiom_source`
4. For each rule, decompile bytecode to AST, format with preserved parameter names

### `SourceGenotype`

A thin wrapper around source text that provides the Parse → Mutate → Reconstruct cycle:

```rust
pub struct SourceGenotype {
    source: String,
}
```

Each mutation/crossover call:
1. Parses `self.source` into a `System`
2. Applies the genetic operator
3. Calls `to_source()` to reconstruct source text
4. Stores the result back in `self.source`

This enables evolutionary loops that operate entirely in source space while leveraging the full compiled system for operator application.

---

## Memory Bounds and Safety

### DoS Attack Protection

Symbios is designed to reject adversarial input that could cause resource exhaustion.

| Limit | Value | Error |
|-------|-------|-------|
| Max recursion depth (parser) | 64 | `ParseError` |
| Max identifier length | 64 chars | `ParseError` |
| Max function arguments | 32 | `ParseError` |
| Max successor count | 128 | `CompileError::TooManySuccessors` |
| Max nesting depth (topology) | 4,096 | `SymbiosError::MaxNestingExceeded` |
| VM stack size | 256 | `VMError::StackOverflow` |
| State capacity | 1,000,000 modules | `SymbiosError::CapacityOverflow` |
| Interner heap | 10 MB | Interner error |

### Numeric Safety

- All floats validated with `.is_finite()` during parsing
- NaN and Infinity rejected at parse time via `InvalidNumericValue`
- VM rejects NaN/Inf results with `VMError::MathError`
- Float equality uses epsilon comparison
- Time overflow in `advance_time` is hardened against wrapping

### Index Safety

- All array accesses bounds-checked
- u16/u32 overflow protection in `SymbiosState::push`
- Parameter indices validated before access
- Topology links validated during construction

### Zero Unsafe Code

Symbios contains **zero `unsafe` blocks**. All safety is achieved through:
- Rust's type system
- Explicit bounds checks
- Result-based error handling
- Comprehensive validation

---

## Error Hierarchy

```
SystemError (top-level)
├── ParseError(String)              # Syntax errors from nom parser
├── CompileError(String)            # Variable resolution, successor limits
├── InvalidPredecessorParam         # Context symbols with parameters
├── InternerError(String)           # Symbol table overflow
├── VMError(String)                 # Stack overflow, NaN/Inf, underflow
├── State(SymbiosError)             # Core state errors
│   ├── ParameterOverflow           # > u16::MAX parameters per module
│   ├── UnmatchedBracket(usize)     # Topology link failure
│   ├── AmbiguousTopology           # Same symbol for open and close
│   ├── MaxNestingExceeded          # > 4096 bracket levels
│   ├── CapacityOverflow            # > max_capacity modules
│   ├── InvalidIndex(usize)         # Out-of-bounds module access
│   └── InvalidNumericValue         # NaN/Inf in input
└── StateCorruption(usize)          # Internal invariant violation
```

---

## Performance Characteristics

| Operation | Complexity | Notes |
|-----------|-----------|-------|
| Rule matching | O(N × R) | N = modules, R = rules per symbol |
| Context matching | O(1) per context | Via skip-links |
| Parameter access | O(1) | Direct indexing into flat array |
| Symbol comparison | O(1) | Integer equality |
| VM execution | O(I) | I = instruction count |
| Topology calculation | O(N) | Single scan with stack |
| Stochastic selection | O(R) | Linear scan of matching rules |
| Mutation (all types) | O(R × S) | R = rules, S = successors per rule |
| Crossover | O(R) | R = total rules across both parents |

---

## Design Philosophy

1. **Determinism**: Same input always produces same output (given same RNG seed)
2. **No Surprises**: Reject bad input with errors, never panic
3. **Performance**: Optimize for common case (large strings, many derivations)
4. **Embeddability**: No heavy dependencies, portable to WASM/embedded
5. **Correctness**: Strictly adhere to ABOP specification
6. **Evolvability**: First-class support for genetic operations and source preservation

---

## References

- [Prusinkiewicz, P., & Lindenmayer, A. (1990). *The Algorithmic Beauty of Plants*. Springer-Verlag.]https://algorithmicbotany.org/papers/abop/abop.pdf
- [L-Systems Wikipedia]https://en.wikipedia.org/wiki/L-system
- [nom Parser Combinator Library]https://github.com/rust-bakery/nom