do-memory-core 0.1.31

Core episodic learning system for AI agents with pattern extraction, reward scoring, and dual storage backend
# Property-Based Testing Framework for Memory-Core

## Overview

This guide documents the property-based testing implementation in do-memory-core, designed to catch edge cases and verify invariants that must hold regardless of input values.

## What is Property-Based Testing?

Property-based testing (PBT) is a testing approach where instead of testing specific input-output pairs, you test general properties (invariants) that should always hold true for any valid input. The test framework automatically generates hundreds or thousands of random inputs to verify these properties.
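To make the generate-and-check loop concrete, here is a dependency-free sketch of what a property test does under the hood. Everything in it is hypothetical and for illustration only: `Lcg` stands in for proptest's input generation, `normalize` is a made-up function under test, and `check_idempotent` is a hand-rolled runner. The real tests in this crate delegate all of this to `proptest`.

```rust
/// Tiny deterministic pseudo-random generator (linear congruential).
/// Hypothetical stand-in for proptest's generator: same seed, same sequence.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        // Multiplier and increment from Knuth's MMIX generator.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }

    /// Generate a string of 0 to 12 characters from a small alphabet
    /// that includes uppercase letters, digits, and spaces.
    fn next_string(&mut self) -> String {
        const ALPHABET: &[u8] = b"abcXYZ 09-";
        let len = (self.next_u64() % 13) as usize;
        (0..len)
            .map(|_| ALPHABET[(self.next_u64() % ALPHABET.len() as u64) as usize] as char)
            .collect()
    }
}

/// Hypothetical function under test: trim whitespace and lowercase.
fn normalize(input: &str) -> String {
    input.trim().to_lowercase()
}

/// Property: normalizing twice gives the same result as normalizing once.
/// Returns the first generated input that violates the property, if any.
fn check_idempotent(seed: u64, cases: usize) -> Option<String> {
    let mut rng = Lcg(seed);
    for _ in 0..cases {
        let input = rng.next_string();
        let once = normalize(&input);
        let twice = normalize(&once);
        if once != twice {
            return Some(input);
        }
    }
    None
}
```

Calling `check_idempotent(42, 256)` runs 256 generated inputs and returns `None` when the property holds for all of them. A real framework adds smarter generators, shrinking, and failure persistence on top of this basic loop.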

## Benefits Over Traditional Testing

1. **Edge Case Discovery**: Random input generation reveals edge cases developers might miss
2. **Input Space Exploration**: Tests cover more of the input space than hand-written tests
3. **Invariant Verification**: Tests focus on properties that must always be true
4. **Shrinking**: When a test fails, the framework automatically finds the minimal failing input
5. **Reproducibility**: Same seed produces same random inputs for debugging

## Installation

The property tests use the `proptest` crate (version 1.5):

```toml
[dev-dependencies]
proptest = "1.5"
```

## Running Property Tests

Run all tests in the crate, including the property tests:
```bash
cargo test -p do-memory-core
```

Run specific property test files:
```bash
cargo test -p do-memory-core episode_property_tests
cargo test -p do-memory-core pattern_property_tests
cargo test -p do-memory-core relationship_property_tests
cargo test -p do-memory-core tag_property_tests
```

Run specific properties:
```bash
cargo test -p do-memory-core episode_id_is_valid_uuid
cargo test -p do-memory-core similarity_is_reflexive
```

Increase test cases for exhaustive testing:
```bash
PROPTEST_CASES=10000 cargo test -p do-memory-core
```

## Property Test Structure

### Basic Template

```rust
use memory_core::types::MyType;
use proptest::prelude::*;

proptest! {
    /// Property description
    #[test]
    fn property_name(input1 in strategy1, input2 in strategy2) {
        // Arrange & Act: borrow the inputs so they stay available for the failure message
        let result = function_under_test(&input1, &input2);

        // Assert: the property must hold for every generated input
        prop_assert!(result.is_valid(), "Property failed for inputs: {:?}, {:?}", input1, input2);
    }
}
```

### Common Strategies

```rust
// Strings
"[a-zA-Z0-9]{1,50}"              // Alphanumeric, 1-50 chars
"[a-z]{10}"                      // Lowercase letters, exactly 10
"\\PC{0,100}"                    // Any non-control Unicode characters, 0-100 chars

// Collections (size ranges are half-open, so 0..10 yields 0-9 elements)
proptest::collection::vec("[a-z]{1,10}", 0..10)  // 0-9 elements
proptest::collection::vec(any::<T>(), 1..5)       // 1-4 elements
proptest::collection::hash_map(".*", ".*", 0..10) // HashMap with 0-9 entries

// Numbers
0..100usize                      // Unsigned integer 0-99
0.0f32..1.0                      // Float in [0.0, 1.0)

// Custom types
any::<MyEnum>()                  // Any enum variant (requires an Arbitrary impl)
any::<TaskType>()                // Requires an Arbitrary impl for TaskType
```

### Advanced Patterns

```rust
// Branching strategies
let input = prop_oneof![
    0..5usize,
    10..20usize,
    100..200usize,
];

// Optional values
let opt = prop::option::of(0..100usize);

// Weighted choice: the first branch is picked about 9 times out of 10
let freq = prop_oneof![
    9 => prop::sample::select(vec![1usize, 2, 3]),
    1 => Just(0usize),
];
```

## Test Files

### episode_property_tests.rs

Tests for Episode and ExecutionStep invariants:

**Episode Creation:**
- `episode_id_is_valid_uuid`: Episode IDs are always valid, non-nil UUIDs
- `episode_start_time_is_set`: Start time is close to creation time
- `new_episode_is_incomplete`: New episodes have no outcome or end_time
- `new_episode_has_no_steps`: New episodes have zero steps

**Episode Tags:**
- `tag_normalization_is_idempotent`: Normalizing tags twice gives same result
- `tags_are_normalized`: Tags are case-insensitive and trimmed
- `tags_maintain_uniqueness`: No duplicate tags exist
- `invalid_tags_are_rejected`: Empty/invalid tags return errors

**Episode Completion:**
- `completing_episode_sets_outcome`: Completion sets end_time and outcome
- `duration_requires_completed_episode`: Duration only available after completion
- `duration_is_non_negative`: Duration is always non-negative

**Episode Steps:**
- `adding_step_increases_count`: Step count increases correctly
- `step_numbers_accurate`: Step numbering is tracked accurately
- `successful_step_count_accurate`: Success/failure counts are correct

**Serialization:**
- `episode_serialization_roundtrip`: Episode serializes/deserializes correctly
- `step_serialization_roundtrip`: ExecutionStep serializes/deserializes correctly

**Invariants:**
- `episode_modification_preserves_invariants`: Modifications preserve episode ID, etc.

**Test Count:** 20+ property tests
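The completion and duration invariants above can be illustrated with a small model. This is a hypothetical sketch, not the crate's `Episode` type: the field names, the millisecond timestamps, and the clamping in `complete` are all assumptions chosen so that the listed properties hold by construction.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Success,
    Failure,
}

/// Hypothetical, simplified episode model (timestamps in milliseconds).
#[derive(Debug)]
struct Episode {
    start_ms: u64,
    end_ms: Option<u64>,
    outcome: Option<Outcome>,
    steps: Vec<String>,
}

impl Episode {
    /// A new episode has no steps, no outcome, and no end time.
    fn new(start_ms: u64) -> Self {
        Episode { start_ms, end_ms: None, outcome: None, steps: Vec::new() }
    }

    /// Append a step and return the new step count.
    fn add_step(&mut self, description: &str) -> usize {
        self.steps.push(description.to_string());
        self.steps.len()
    }

    /// Completing sets both the end time and the outcome.
    fn complete(&mut self, now_ms: u64, outcome: Outcome) {
        // Clamp so a clock that moved backwards cannot yield a negative duration.
        self.end_ms = Some(now_ms.max(self.start_ms));
        self.outcome = Some(outcome);
    }

    fn is_complete(&self) -> bool {
        self.outcome.is_some() && self.end_ms.is_some()
    }

    /// Duration is only available once the episode is complete.
    fn duration_ms(&self) -> Option<u64> {
        self.end_ms.map(|end| end - self.start_ms)
    }
}
```

In this model, `duration_ms()` returns `None` until `complete` is called, and the subtraction cannot underflow because `end_ms` is never earlier than `start_ms`. A property test would assert the same facts for arbitrary start and end times.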

### relationship_property_tests.rs

Tests for episode relationship invariants:

**Validation:**
- `self_relationships_rejected`: Self-relationships always fail
- `duplicate_relationships_rejected`: Same relationship can't be added twice
- `invalid_priority_rejected`: Priority must be 1-10

**Cycle Detection:**
- `depends_on_prevents_cycles`: DependsOn cannot create cycles
- `parent_child_prevents_cycles`: ParentChild cannot create cycles
- `blocks_prevents_cycles`: Blocks cannot create cycles
- `non_acyclic_allows_cycles`: Follows/RelatedTo/Duplicates allow cycles

**Removal:**
- `relationship_removal_idempotent`: Removing twice is safe
- `removal_updates_indexes`: Removal clears all indexes

**Querying:**
- `relationship_exists_consistency`: Existence check is consistent
- `outgoing_incoming_symmetry`: Outgoing/incoming are symmetric
- `get_by_type_returns_both_directions`: Returns both directions
- `relationship_count_accurate`: Count is accurate

**Type Properties:**
- `relationship_type_string_roundtrip`: String conversion is round-trippable
- `relationship_type_serializable`: Types serialize/deserialize correctly
- `directionality_property_consistent`: Directional types have inverses
- `acyclic_requirement_consistent`: Acyclic types are directional

**Graph:**
- `load_relationships_preserves_state`: Loading preserves state
- `multiple_relationships_between_pairs`: Multiple different pairs work
- `custom_fields_preserved`: Custom metadata fields preserved

**Test Count:** 25+ property tests
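The validation and cycle-detection properties above can be sketched with a small directed graph. This is an illustrative model under stated assumptions, not the crate's relationship manager: `DependencyGraph`, `AddError`, and the `u32` node IDs (in place of episode UUIDs) are hypothetical names, and only one acyclic relationship type is modelled.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum AddError {
    SelfRelationship,
    Duplicate,
    WouldCreateCycle,
}

/// Hypothetical graph for a single acyclic relationship type (e.g. "depends on").
#[derive(Default)]
struct DependencyGraph {
    edges: HashMap<u32, HashSet<u32>>,
}

impl DependencyGraph {
    /// True if `to` can be reached from `from` by following existing edges
    /// (iterative depth-first search).
    fn reaches(&self, from: u32, to: u32) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(node) = stack.pop() {
            if node == to {
                return true;
            }
            if seen.insert(node) {
                if let Some(next) = self.edges.get(&node) {
                    stack.extend(next.iter().copied());
                }
            }
        }
        false
    }

    fn add(&mut self, from: u32, to: u32) -> Result<(), AddError> {
        if from == to {
            return Err(AddError::SelfRelationship);
        }
        if self.edges.get(&from).map_or(false, |targets| targets.contains(&to)) {
            return Err(AddError::Duplicate);
        }
        // The edge from -> to closes a cycle exactly when `from` is
        // already reachable from `to`.
        if self.reaches(to, from) {
            return Err(AddError::WouldCreateCycle);
        }
        self.edges.entry(from).or_default().insert(to);
        Ok(())
    }
}
```

With edges 1 -> 2 and 2 -> 3 in place, `add(3, 1)` is rejected because node 3 is already reachable from node 1. A property test would generate arbitrary sequences of `add` calls and assert that no accepted sequence ever leaves a cycle in the graph.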

### pattern_property_tests.rs

Tests for pattern similarity and effectiveness:

**ID Properties:**
- `pattern_id_is_valid_uuid`: Pattern IDs are valid UUIDs

**Similarity:**
- `similarity_is_reflexive`: Pattern similarity with self is 1.0
- `similarity_is_symmetric`: similarity(A, B) equals similarity(B, A)
- `similarity_is_bounded`: Scores between 0.0 and 1.0
- `different_types_zero_similarity`: Different pattern types have 0.0 similarity
- `similarity_key_deterministic`: Key is deterministic for same pattern

**Scores:**
- `success_rate_bounded`: Success rates 0.0-1.0
- `pattern_success_rate_bounded`: Pattern success rate is bounded

**Effectiveness:**
- `initial_effectiveness_zero`: New trackers have zero counts
- `record_retrieval_increments`: Retrieval counting works
- `application_success_rate_bounded`: Application success rate is bounded
- `application_stats_sum`: Success/failure counts sum correctly
- `usage_rate_bounded`: Usage rate is bounded
- `reward_delta_converges`: Moving average converges correctly

**Serialization:**
- `pattern_serialization_roundtrip`: Patterns serialize/deserialize correctly
- `effectiveness_serialization_roundtrip`: Effectiveness serializes correctly

**Context:**
- `pattern_context_retrieval`: Context retrieval works
- `context_pattern_no_context`: ContextPattern has no context

**Sample Size:**
- `sample_size_non_negative`: Sample size is non-negative
- `context_pattern_sample_size`: Matches evidence length

**Test Count:** 20+ property tests
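To show what the similarity properties mean in practice, the sketch below uses Jaccard similarity over sets of step names. This metric is an assumption made for illustration; the crate's actual similarity computation is not shown in this guide and may differ. The function name `similarity` and its slice-based signature are likewise hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical similarity: Jaccard index over the sets of step names,
/// i.e. |intersection| / |union|.
fn similarity(a: &[&str], b: &[&str]) -> f32 {
    let a: HashSet<&str> = a.iter().copied().collect();
    let b: HashSet<&str> = b.iter().copied().collect();
    let union = a.union(&b).count();
    if union == 0 {
        // Two empty patterns are treated as identical.
        return 1.0;
    }
    a.intersection(&b).count() as f32 / union as f32
}
```

For this metric the three core properties follow directly: a pattern compared with itself has equal intersection and union (reflexive), intersection and union do not depend on argument order (symmetric), and the intersection is never larger than the union (bounded by 0.0 and 1.0). The property tests check the same facts against generated patterns instead of arguing them by hand.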

### tag_property_tests.rs

Tests for tag operations and normalization:

**Normalization:**
- `tag_normalization_lowercase`: Tags are lowercase
- `tag_normalization_trims`: Whitespace is trimmed
- `tag_normalization_deterministic`: Same input normalizes to same value

**Uniqueness:**
- `tags_maintain_uniqueness`: No duplicate tags
- `duplicate_tag_add_returns_false`: Adding duplicate returns false
- `case_variations_duplicates`: Case variations are duplicates
- `whitespace_variations_duplicates`: Whitespace variations are duplicates

**Idempotence:**
- `add_tag_idempotent`: Adding repeatedly is idempotent
- `remove_tag_idempotent`: Removing repeatedly is safe
- `clear_tags_idempotent`: Clearing repeatedly is safe

**Validation:**
- `empty_tag_rejected`: Empty tags fail
- `whitespace_only_tag_rejected`: Whitespace-only fails
- `invalid_characters_rejected`: Invalid chars fail
- `too_short_tag_rejected`: Tags < 2 chars fail
- `tag_length_limit_enforced`: Length limit enforced

**Querying:**
- `has_tag_correct`: has_tag returns correct results
- `has_tag_case_insensitive`: Case-insensitive lookups
- `has_tag_whitespace_insensitive`: Whitespace-insensitive lookups
- `get_tags_returns_all`: Returns all added tags

**Combination:**
- `add_multiple_unique_tags`: Adding multiple unique tags works
- `remove_and_readd_tag`: Remove and readd works correctly
- `tags_preserve_addition_order`: Tags maintain addition order

**Test Count:** 25+ property tests
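The normalization, uniqueness, and validation properties above fit together as in the following sketch. It is a hypothetical model, not the crate's tag API: `normalize_tag`, `Tags`, `TagError`, and the set of allowed characters (ASCII alphanumerics, `-`, `_`) are assumptions, and the upper length limit is omitted for brevity.

```rust
#[derive(Debug, PartialEq)]
enum TagError {
    TooShort,
    InvalidCharacter(char),
}

/// Hypothetical normalization: trim, lowercase, then validate.
fn normalize_tag(raw: &str) -> Result<String, TagError> {
    let tag = raw.trim().to_lowercase();
    // Empty and whitespace-only inputs end up here as well.
    if tag.chars().count() < 2 {
        return Err(TagError::TooShort);
    }
    let is_allowed = |c: &char| c.is_ascii_alphanumeric() || *c == '-' || *c == '_';
    if let Some(bad) = tag.chars().find(|c| !is_allowed(c)) {
        return Err(TagError::InvalidCharacter(bad));
    }
    Ok(tag)
}

/// Hypothetical tag collection that keeps insertion order and rejects duplicates.
#[derive(Default)]
struct Tags(Vec<String>);

impl Tags {
    /// Returns Ok(true) if the tag was newly added and Ok(false) if its
    /// normalized form was already present.
    fn add(&mut self, raw: &str) -> Result<bool, TagError> {
        let tag = normalize_tag(raw)?;
        if self.0.contains(&tag) {
            return Ok(false);
        }
        self.0.push(tag);
        Ok(true)
    }

    /// Lookups normalize first, so they ignore case and surrounding whitespace.
    fn has(&self, raw: &str) -> bool {
        normalize_tag(raw).map_or(false, |tag| self.0.contains(&tag))
    }
}
```

Because every entry point normalizes before comparing, `"  Rust "` and `"RUST"` collapse to the same stored tag, which is what properties such as `case_variations_duplicates` and `has_tag_whitespace_insensitive` assert for arbitrary inputs.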

## Total Test Coverage

- **Episode Tests**: 20+ properties
- **Relationship Tests**: 25+ properties
- **Pattern Tests**: 20+ properties
- **Tag Tests**: 25+ properties
- **Total**: 90+ property tests

## Writing New Property Tests

### Step 1: Identify the Property

Find an invariant that should always be true:

- "Episode IDs are never nil"
- "Adding a tag twice doesn't create duplicates"
- "Similarity scores are between 0 and 1"
- "Cycles are prevented in depends_on relationships"

### Step 2: Choose Appropriate Strategies

Select the right proptest strategy for your inputs:

```rust
// For simple values
input in ".*"              // Any string
input in 0..100usize       // Integer range
input in any::<MyType>()   // Custom type

// For collections
inputs in proptest::collection::vec(".*", 1..10)
```

### Step 3: Write the Test

```rust
proptest! {
    #[test]
    fn my_property(input in strategy) {
        // Act
        let result = function_under_test(input);

        // Assert - property must hold
        prop_assert!(result.is_valid());
    }
}
```

### Step 4: Run and Debug

```bash
# Run the test
cargo test my_property

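# If it fails, proptest shrinks the failing input
# and prints the minimal counterexample

# Failing cases are persisted (by default in a proptest-regressions/ directory)
# and replayed on later runs; commit those files to keep failures reproducible.
# Recent proptest releases can also take a fixed RNG seed from the environment
# (variable name assumed here; check the Config docs for your proptest version)
PROPTEST_RNG_SEED=0x3f8a... cargo test my_property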
```

## Test Configuration

### Increasing Test Cases

```bash
# Default: 256 cases per property
PROPTEST_CASES=1000 cargo test
```

### Custom Configuration in Cargo.toml

```toml
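# Profile settings take effect only in the workspace root Cargo.toml.

# Optimize the proptest dependency itself so input generation runs faster
[profile.test.package.proptest]
opt-level = 3

# Lightly optimize the code under test in test builds
[profile.test]
opt-level = 1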
```

### Fuzzing Configuration

To override the case count for a single `proptest!` block, set the configuration inside the block in the test file or module:

```rust
proptest! {
    #![proptest_config(ProptestConfig::with_cases(10000))]
    #[test]
    fn exhaustive_property(input in ".*") {
        // ...
    }
}
```

## Shrinking (Finding Minimal Failures)

When a property test fails, proptest automatically shrinks the failing input to find a minimal counterexample. The output below is illustrative; the exact format depends on the proptest version:

```text
---- property::test_name stdout ----
thread 'property::test_name' panicked at 'assertion failed',
        tests/property/test.rs:42:9:
note: failing input test_name(
    "a very long string that causes a bug"
)

minimized failing test case:
    test_name("")
```
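The idea behind shrinking can be sketched without any framework. The function below is a deliberately naive, hypothetical shrinker for strings: it repeatedly deletes one character and keeps any shorter candidate that still fails. proptest's real shrinking is driven by the strategy that generated the value and is considerably more sophisticated, so treat this only as a model of the concept.

```rust
/// Greedily shrink a failing string. `fails` returns true when the
/// property under test is violated for the given input.
fn shrink<F>(mut failing: String, fails: F) -> String
where
    F: Fn(&str) -> bool,
{
    loop {
        let chars: Vec<char> = failing.chars().collect();
        // Try every candidate that is exactly one character shorter.
        let smaller = (0..chars.len())
            .map(|skip| {
                chars
                    .iter()
                    .enumerate()
                    .filter(|(i, _)| *i != skip)
                    .map(|(_, c)| *c)
                    .collect::<String>()
            })
            .find(|candidate| fails(candidate.as_str()));
        match smaller {
            // A shorter input still fails: continue shrinking from it.
            Some(candidate) => failing = candidate,
            // No shorter input fails: this is a local minimum.
            None => return failing,
        }
    }
}
```

If a property breaks on any string containing `'x'`, then `shrink("a very long xylophone string".to_string(), |s| s.contains('x'))` reduces the input to `"x"`, the smallest string that still triggers the failure. That reduction is why a shrunk counterexample is usually far easier to debug than the original random input.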

## Best Practices

### DO:
- Test properties that represent real invariants
- Use descriptive property names
- Add property descriptions in doc comments
- Keep properties simple and focused
- Use appropriate strategies for input types
- Consider edge cases in strategies

### DON'T:
- Test implementation details
- Write properties that depend on specific values
- Use overly complex strategies
- Ignore failing tests without investigation
- Remove tests that find bugs

## Common Anti-Patterns

### 1. Testing Implementation Instead of Properties

❌ Bad:
```rust
proptest! {
    #[test]
    fn internal_state_correct(input in ".*") {
        assert_eq!(obj._internal_field, expected);
    }
}
```

✅ Good:
```rust
proptest! {
    #[test]
    fn public_api_preserves_invariant(input in ".*") {
        let result = obj.process(input);
        assert!(result.is_valid());
    }
}
```

### 2. Overly Complex Strategies

❌ Bad:
```rust
proptest! {
    #[test]
    fn complex_property(
        (combined, c) in (".*", ".*", ".*").prop_map(|(a, b, c)| (format!("{}_{}", a, b), c))
    ) { /* ... */ }
}
```

✅ Good:
```rust
proptest! {
    #[test]
    fn simpler_property(a in ".*", b in ".*", c in ".*") {
        let combined = format!("{}_{}", a, b);
        // ...
    }
}
```

### 3. Ignoring Falsifications

❌ Bad:
```rust
proptest! {
    #[test]
    fn failing_property(input in ".*") {
        if input == "special case" {
            return Ok(()); // Silently skips the known failure!
        }
        prop_assert!(process(&input).is_ok());
    }
}
```

✅ Good:
```rust
proptest! {
    #[test]
    fn handles_special_case(input in ".*") {
        // Borrow the input so it is still available for the failure message
        let result = process(&input);
        prop_assert!(result.is_ok(), "Failed with input: {}", input);
    }
}
```

## Integration with Existing Tests

Property tests complement traditional unit tests:

```rust
#[cfg(test)]
mod tests {
    use super::*;
    use proptest::prelude::*;

    // Traditional unit test: one specific input, one expected output
    #[test]
    fn test_specific_case() {
        let result = function("specific input");
        assert_eq!(result, expected);
    }

    // Property test: an invariant checked against many generated inputs
    proptest! {
        #[test]
        fn test_general_property(input in ".*") {
            let result = function(&input);
            prop_assert!(result.is_valid());
        }
    }
}
```

## Performance Considerations

Property tests can be slower than unit tests due to:
- Multiple test cases per property
- Input generation overhead
- Shrinking on failures

Tips for performance:
- Keep test count reasonable (default 256)
- Use efficient strategies
- Avoid expensive operations in property generation
- Run slow property tests with optimizations enabled using `cargo test --release`

## Continuous Integration

In CI, consider:
- Running property tests with higher case count on nightly builds
- Using fixed seeds for reproducible failures
- Separating property test runs from unit test runs
- Failing the build on any property test failure

```yaml
# Example CI configuration
- name: Run property tests
  run: |
    PROPTEST_CASES=1000 cargo test -p do-memory-core
```

## Resources

- [proptest documentation](https://docs.rs/proptest/)
- [Prop-based testing book](https://propertesting.com/)
- [Rust proptest guide](https://altsysrq.github.io/proptest-book/intro.html)

## Summary

The property-based testing framework provides:
- 90+ property tests across 4 major areas
- Automatic input generation and shrinking
- Invariant verification for core types
- Edge case discovery beyond traditional tests
- Higher confidence that core invariants hold across the input space

This complements the existing 92.5% test coverage by testing properties rather than just specific inputs, helping achieve the >95% coverage goal.