embeddenator 0.20.0-alpha.1

Sparse ternary VSA holographic computing substrate
# Local Development Guide

**Version:** 0.20.0  
**Last Updated:** January 4, 2026

## Overview

This guide covers **local development workflows** for working across multiple Embeddenator components simultaneously. It focuses on:

- Setting up a multi-repo workspace
- Using `[patch.crates-io]` for local path dependencies
- Development iteration patterns
- Testing strategies
- Pre-release validation

## Prerequisites

- **Rust:** 1.84 or later (`rustup update`)
- **Git:** 2.40+ recommended
- **Disk Space:** ~2GB for all component repos + build artifacts
- **Optional:** FUSE libraries for `embeddenator-fs` development

## Workspace Setup

### Directory Structure

Clone all component repos into a common parent directory:

```bash
mkdir ~/embeddenator-workspace
cd ~/embeddenator-workspace

# Clone core orchestrator
git clone https://github.com/tzervas/embeddenator

# Clone components (libraries)
git clone https://github.com/tzervas/embeddenator-vsa
git clone https://github.com/tzervas/embeddenator-io
git clone https://github.com/tzervas/embeddenator-retrieval
git clone https://github.com/tzervas/embeddenator-fs
git clone https://github.com/tzervas/embeddenator-interop
git clone https://github.com/tzervas/embeddenator-obs

# Clone tools (optional)
git clone https://github.com/tzervas/embeddenator-testkit
git clone https://github.com/tzervas/embeddenator-contract-bench
git clone https://github.com/tzervas/embeddenator-workspace
```

**Final structure:**
```
~/embeddenator-workspace/
├── embeddenator/                    (core)
├── embeddenator-vsa/                (components)
├── embeddenator-io/
├── embeddenator-retrieval/
├── embeddenator-fs/
├── embeddenator-interop/
├── embeddenator-obs/
├── embeddenator-testkit/            (tools)
├── embeddenator-contract-bench/
└── embeddenator-workspace/
```

### Verify Setup

Test that all repos are healthy:

```bash
cd ~/embeddenator-workspace

for repo in embeddenator-*; do
  echo "Testing $repo..."
  # tail keeps the last matching lines, which is where Finished or error appears
  (cd "$repo" && cargo check 2>&1 | grep -E "(Compiling|Finished|error)" | tail -5)
done

echo "Testing embeddenator core..."
(cd embeddenator && cargo check 2>&1 | grep -E "(Compiling|Finished|error)" | tail -5)
```

Expected output: a `Finished` line for each repo, for example ``Finished `dev` profile [unoptimized + debuginfo] target(s)`` (the exact wording varies between Cargo versions).
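
The `grep` filter above hides the exit status of `cargo check`, so a failure is easy to miss. A minimal sketch of an alternative that reports pass or fail per repo from the exit code follows. `check_all` and its output format are hypothetical, not part of the Embeddenator tooling.

```bash
#!/usr/bin/env bash
# check_all WORKSPACE CMD [ARGS...]
# Runs CMD in every embeddenator* directory under WORKSPACE and prints
# "ok" or "FAIL" per directory based on the command's exit status.
# (Illustrative helper; the name and output format are assumptions.)
check_all() {
  local workspace=$1; shift
  local failed=0 dir
  for dir in "$workspace"/embeddenator*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    if (cd "$dir" && "$@") >/dev/null 2>&1; then
      echo "ok   $(basename "$dir")"
    else
      echo "FAIL $(basename "$dir")"
      failed=1
    fi
  done
  return "$failed"                      # non-zero if any directory failed
}
```

Typical use would be `check_all ~/embeddenator-workspace cargo check`. The function returns non-zero when any repo fails, so it can gate a script.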

## Using [patch.crates-io]

### What is [patch.crates-io]?

Cargo's `[patch]` mechanism allows a **temporary override** of dependencies. When you specify:

```toml
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
```

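Cargo will:
1. Use the crate at the **local path** instead of the published one, but only for dependencies that resolve from crates.io. A dependency declared with `git = "..."` comes from a different source and needs a `[patch."<repository URL>"]` table keyed by that URL.
2. Apply this override **transitively** to every crate in the dependency graph that depends on the patched crate
3. Still check version requirements: the `version` in the local crate's `Cargo.toml` must satisfy what dependents ask for, otherwise Cargo warns that the patch was not used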

**Critical:** `[patch.crates-io]` is for **development only**. Never commit it to production code.

### When to Use [patch.crates-io]

✅ **Use when:**
- Developing features spanning multiple components
- Debugging cross-component issues
- Testing API changes before releasing
- Rapid iteration with immediate feedback

❌ **Don't use when:**
- Preparing for release (must test with git tags)
- Code review (reviewers need reproducible builds)
- CI/CD pipelines (patches break reproducibility)
- Single-component changes (work directly in that repo)

### Adding [patch.crates-io]

**Option 1: Workspace-level (recommended for core development)**

In `embeddenator/Cargo.toml` (root), add at the bottom:

```toml
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
embeddenator-io = { path = "../embeddenator-io" }
embeddenator-retrieval = { path = "../embeddenator-retrieval" }
embeddenator-fs = { path = "../embeddenator-fs" }
embeddenator-interop = { path = "../embeddenator-interop" }
embeddenator-obs = { path = "../embeddenator-obs" }
```

**Option 2: Component-level (for testing component changes in isolation)**

In a component's `Cargo.toml` (e.g., `embeddenator-retrieval/Cargo.toml`):

```toml
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
embeddenator-io = { path = "../embeddenator-io" }
```
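
Listing a path that does not exist makes Cargo fail to load the dependency (see Issue 2 under Common Issues). As a sketch, the section can be generated so it only includes components that are actually cloned. `gen_patch_section` is a hypothetical helper name, and it assumes each component lives in a sibling directory named after the crate, as in the layout above.

```bash
#!/usr/bin/env bash
# gen_patch_section WORKSPACE CRATE...
# Prints a [patch.crates-io] table with one entry per CRATE whose directory
# under WORKSPACE contains a Cargo.toml. Missing components are skipped.
# (Illustrative helper; not part of the project tooling.)
gen_patch_section() {
  local workspace=$1; shift
  local name
  echo '[patch.crates-io]'
  for name in "$@"; do
    if [ -f "$workspace/$name/Cargo.toml" ]; then
      printf '%s = { path = "../%s" }\n' "$name" "$name"
    fi
  done
}
```

For example, `gen_patch_section ~/embeddenator-workspace embeddenator-vsa embeddenator-io >> Cargo.toml`, run from the `embeddenator/` directory, would append entries for whichever of the two repos is present.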

### Removing [patch.crates-io]

Before committing or releasing:

```bash
cd embeddenator

# Option 1: Comment out (preserves setup)
sed -i '/\[patch.crates-io\]/,/^$/s/^/# /' Cargo.toml

# Option 2: Delete entirely
# Edit Cargo.toml and remove [patch.crates-io] section

# Verify no active (uncommented) patch section remains
grep -n '^\[patch\.crates-io\]' Cargo.toml || echo "Patches removed ✓"

# Update to use git tags again
cargo update
cargo build --release
cargo test --all
```
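
The `sed` range above ends at the first blank line, so a patch section that contains a blank line is only partly handled. A sketch of a stricter variant that drops everything from the `[patch.crates-io]` header up to the next table header follows. `strip_patch_section` is a hypothetical name, and it assumes no line inside the section starts with `[`.

```bash
#!/usr/bin/env bash
# strip_patch_section FILE
# Prints FILE to stdout without its [patch.crates-io] table. The section is
# taken to run from its header to the next line that starts a new table.
# (Illustrative helper; not part of the project tooling.)
strip_patch_section() {
  awk '
    /^\[patch\.crates-io\][ \t]*$/ { skipping = 1; next }  # drop the header itself
    /^\[/                          { skipping = 0 }        # any other table ends the section
    !skipping                      { print }
  ' "$1"
}
```

Because it writes to stdout, a typical use would be `strip_patch_section Cargo.toml > Cargo.toml.new && mv Cargo.toml.new Cargo.toml`.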

## Development Workflows

### Workflow 1: Single Component Change

**Scenario:** Fix a bug in `embeddenator-vsa`, test in core.

```bash
# 1. Work in component repo
cd ~/embeddenator-workspace/embeddenator-vsa
git checkout -b fix/cosine-precision

# Make changes to src/similarity.rs
vim src/similarity.rs

# Test locally
cargo test

# 2. Test in core with [patch.crates-io]
cd ../embeddenator

# Add patch (if not already present)
echo '
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
' >> Cargo.toml

# Test integration
cargo test --all

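# 3. Release component
cd ../embeddenator-vsa
git add src/similarity.rs
git commit -m "Fix cosine similarity precision loss"
git push origin fix/cosine-precision
# Create PR, merge to main, then tag the merged commit
git checkout main && git pull
git tag -a v0.1.1 -m "v0.1.1: Precision fix"
git push origin --tags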

# 4. Update core to use new tag
cd ../embeddenator

# Remove patch
sed -i '/\[patch.crates-io\]/,/^$/d' Cargo.toml

# Update dependency
vim Cargo.toml  # Change embeddenator-vsa tag = "v0.1.1"
cargo update -p embeddenator-vsa

# Test with git tag
cargo test --all

# Commit core update
git commit -am "Update embeddenator-vsa to v0.1.1"
git push origin main
```

### Workflow 2: Cross-Component Feature

**Scenario:** Add new query algorithm affecting vsa, io, and retrieval.

```bash
# 1. Branch all affected repos
cd ~/embeddenator-workspace
for repo in embeddenator-vsa embeddenator-io embeddenator-retrieval embeddenator; do
  (cd "$repo" && git checkout -b feat/semantic-search)
done

# 2. Enable local paths in core
cd embeddenator
cat >> Cargo.toml <<'EOF'

[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
embeddenator-io = { path = "../embeddenator-io" }
embeddenator-retrieval = { path = "../embeddenator-retrieval" }
EOF

# 3. Develop iteratively
cd ../embeddenator-vsa
# Add semantic distance metric
vim src/similarity.rs
cargo test

cd ../embeddenator-io
# Add metadata for semantic queries
vim src/manifest.rs
cargo test

cd ../embeddenator-retrieval
# Implement semantic search
vim src/semantic.rs
cargo test

cd ../embeddenator
# Wire up CLI interface
vim src/cli/query.rs
cargo test --all

# 4. Pre-release validation (remove patches, test with tags)
cd ~/embeddenator-workspace/embeddenator
sed -i '/\[patch.crates-io\]/,/^$/d' Cargo.toml

# This will FAIL because new versions aren't tagged yet - that's expected!
cargo build 2>&1 | grep "error"

# 5. Release in dependency order
cd ../embeddenator-vsa
git push origin feat/semantic-search
# Create PR, merge to main
git checkout main && git pull
git tag -a v0.2.0 -m "v0.2.0: Semantic distance metric"
git push origin --tags

cd ../embeddenator-io
# Update vsa dependency to v0.2.0
vim Cargo.toml
cargo update -p embeddenator-vsa
git commit -am "Update embeddenator-vsa to v0.2.0"
git push origin feat/semantic-search
# Create PR, merge to main
git checkout main && git pull
git tag -a v0.2.0 -m "v0.2.0: Semantic metadata"
git push origin --tags

cd ../embeddenator-retrieval
# Update dependencies to v0.2.0
vim Cargo.toml
cargo update -p embeddenator-vsa -p embeddenator-io
git commit -am "Update dependencies to v0.2.0"
git push origin feat/semantic-search
# Create PR, merge to main
git checkout main && git pull
git tag -a v0.2.0 -m "v0.2.0: Semantic search engine"
git push origin --tags

cd ../embeddenator
# Update all dependencies to v0.2.0
vim Cargo.toml
cargo update
cargo test --all
git commit -am "Add semantic search (vsa v0.2.0, io v0.2.0, retrieval v0.2.0)"
git push origin feat/semantic-search
# Create PR, merge to main
```
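
The release order in step 5 follows the dependency edges described in the workflow: io depends on vsa, retrieval depends on vsa and io, and core depends on retrieval. As a sketch, the same order can be derived with the POSIX `tsort` utility instead of being remembered. `release_order` is a hypothetical helper, and the edge list is copied from this workflow, not read from the manifests.

```bash
#!/usr/bin/env bash
# release_order
# Prints the crates in an order where every dependency comes before the
# crates that depend on it. Each input line reads "dependency dependent".
# (Illustrative helper; the edges are hard-coded from the workflow above.)
release_order() {
  tsort <<'EOF'
embeddenator-vsa embeddenator-io
embeddenator-vsa embeddenator-retrieval
embeddenator-io embeddenator-retrieval
embeddenator-retrieval embeddenator
EOF
}
```

With these four edges there is only one valid order: vsa, io, retrieval, then core.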

### Workflow 3: Rapid Prototyping

**Scenario:** Experiment with API changes without git overhead.

```bash
# 1. Set up persistent patches in a Cargo config file.
#    Cargo.toml has no way to include another TOML file, but Cargo does read
#    [patch] tables from .cargo/config.toml.
#    Note: this overwrites any existing .cargo/config.toml in the repo.
cd ~/embeddenator-workspace/embeddenator
mkdir -p .cargo
cat > .cargo/config.toml <<EOF
# Local development patches - DO NOT COMMIT
[patch.crates-io]
embeddenator-vsa = { path = "$HOME/embeddenator-workspace/embeddenator-vsa" }
embeddenator-io = { path = "$HOME/embeddenator-workspace/embeddenator-io" }
embeddenator-retrieval = { path = "$HOME/embeddenator-workspace/embeddenator-retrieval" }
embeddenator-fs = { path = "$HOME/embeddenator-workspace/embeddenator-fs" }
embeddenator-interop = { path = "$HOME/embeddenator-workspace/embeddenator-interop" }
embeddenator-obs = { path = "$HOME/embeddenator-workspace/embeddenator-obs" }
EOF

# Add to .gitignore
echo ".cargo/config.toml" >> .gitignore

# 2. Develop freely
cd ../embeddenator-vsa
# Try breaking API change
vim src/lib.rs

cd ../embeddenator
cargo test --all  # Instant feedback!

# 3. When done, commit components first
cd ../embeddenator-vsa
git commit -am "Refactor: Simplify SparseVec API"
git tag v0.2.0 && git push origin main --tags

cd ../embeddenator
# Disable the local patches
rm .cargo/config.toml
vim Cargo.toml  # Point embeddenator-vsa at v0.2.0
cargo test --all
git commit -am "Update to embeddenator-vsa v0.2.0"
```

## Testing Strategies

### Unit Tests (Per Component)

Test each component in isolation:

```bash
cd ~/embeddenator-workspace/embeddenator-vsa
cargo test
cargo test --doc  # Doc tests
cargo test --release  # Optimized builds (slower to compile, more realistic)
```

### Integration Tests (Cross-Component)

Test component interactions:

```bash
cd ~/embeddenator-workspace/embeddenator
cargo test --test integration_retrieval  # Tests vsa + io + retrieval
cargo test --test integration_fs         # Tests fs + io + vsa
```

### Contract Tests (API Stability)

Validate component contracts:

```bash
cd ~/embeddenator-workspace/embeddenator-contract-bench
cargo bench --no-run  # Compile without running
cargo bench           # Run and measure
```

### E2E Tests (Full Pipeline)

Test complete workflows:

```bash
cd ~/embeddenator-workspace/embeddenator
cargo test --test e2e -- --test-threads=1 --nocapture
```

### Test with Local Paths

```bash
cd ~/embeddenator-workspace/embeddenator

# Add patches
echo '
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
embeddenator-io = { path = "../embeddenator-io" }
' >> Cargo.toml

# Run full test suite
cargo test --all --all-features

# Check for warnings
cargo clippy --all-targets --all-features -- -D warnings

# Build docs
cargo doc --no-deps --all-features
```

### Test with Git Tags (Pre-Release)

```bash
cd ~/embeddenator-workspace/embeddenator

# Remove patches
sed -i '/\[patch.crates-io\]/,/^$/d' Cargo.toml

# Clean and rebuild
cargo clean
cargo build --release
cargo test --all --release

# Test installation
cargo install --path . --force
embeddenator --version
```

## Common Issues

### Issue 1: "Failed to resolve patches"

**Error:**
```
error: failed to resolve patches for `https://github.com/rust-lang/crates.io-index`
Caused by: patch for `embeddenator-vsa` in `https://github.com/rust-lang/crates.io-index` points to the same source
```

**Cause:** Component is specified as both a git dependency AND a path patch, but versions don't match.

**Fix:**
```bash
# Option 1: Remove version constraint in [dependencies]
[dependencies]
embeddenator-vsa = { git = "..." }  # No tag = accepts any version

# Option 2: Update path version to match tag
cd ../embeddenator-vsa
vim Cargo.toml  # Set version = "0.1.1"
```

### Issue 2: "No such file or directory"

**Error:**
```
error: failed to load source for dependency `embeddenator-vsa`
Caused by: Unable to update file:///home/user/embeddenator-workspace/embeddenator-vsa
```

**Cause:** Path in `[patch.crates-io]` is incorrect.

**Fix:**
```bash
# Check actual location
ls -la ../embeddenator-vsa

# Update Cargo.toml with correct relative path
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }  # From embeddenator/ dir
```

### Issue 3: Changes not reflected

**Error:** Code changes in component don't appear in core builds.

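**Cause:** Cargo's build cache was not invalidated, or the patch is not being applied at all. In the second case Cargo prints a `Patch ... was not used in the crate graph` warning.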

**Fix:**
```bash
# Force clean rebuild (cargo clean already removes target/)
cd ~/embeddenator-workspace/embeddenator
cargo clean
cargo build

# Or use touch to force recompilation
cd ../embeddenator-vsa
touch src/lib.rs
cd ../embeddenator
cargo build
```

### Issue 4: Clippy warnings differ

**Error:** Clippy passes in component but fails in core (or vice versa).

**Cause:** Different Rust toolchains or clippy versions.

**Fix:**
```bash
# Standardize toolchain
rustup update
rustup default stable

# Run clippy consistently
cd ~/embeddenator-workspace/embeddenator-vsa
cargo clippy --all-targets -- -D warnings

cd ../embeddenator
cargo clippy --all-targets -- -D warnings
```
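
`rustup default stable` only aligns the toolchain on one machine. One possible way to keep every repo on the same compiler and clippy version is a `rust-toolchain.toml` file in each repo, which rustup picks up automatically. The sketch below writes one into every `embeddenator*` directory. `pin_toolchain` is a hypothetical helper, and pinning to a specific channel is a suggestion, not the project's stated policy.

```bash
#!/usr/bin/env bash
# pin_toolchain WORKSPACE CHANNEL
# Writes a rust-toolchain.toml that selects CHANNEL (e.g. "1.84.0" or
# "stable") into every embeddenator* directory under WORKSPACE.
# (Illustrative helper; overwrites an existing rust-toolchain.toml.)
pin_toolchain() {
  local workspace=$1 channel=$2 dir
  for dir in "$workspace"/embeddenator*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    cat > "${dir}rust-toolchain.toml" <<EOF
[toolchain]
channel = "$channel"
components = ["clippy", "rustfmt"]
EOF
  done
}
```

For example, `pin_toolchain ~/embeddenator-workspace 1.84.0` would match the minimum version listed under Prerequisites.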

## Pre-Release Checklist

Before releasing any component:

- [ ] **Remove `[patch.crates-io]`** from all Cargo.toml files
- [ ] **Update git tag dependencies** to new versions
- [ ] **Run full test suite:**
  ```bash
  cargo clean
  cargo test --all --all-features --release
  ```
- [ ] **Check for warnings:**
  ```bash
  cargo clippy --all-targets --all-features -- -D warnings
  ```
- [ ] **Build docs:**
  ```bash
  cargo doc --no-deps --all-features
  ```
- [ ] **Verify version numbers:**
  ```bash
  grep '^version' Cargo.toml
  git tag -l --sort=version:refname | tail -1
  ```
- [ ] **Test installation:**
  ```bash
  cargo install --path . --force
  embeddenator --version
  ```
- [ ] **Update CHANGELOG.md** with release notes
- [ ] **Commit, tag, and push:**
  ```bash
  git commit -am "Release v0.X.Y"
  git tag -a v0.X.Y -m "v0.X.Y: Summary"
  git push origin main --tags
  ```
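
The "Verify version numbers" step compares the manifest version with the latest tag by eye. A sketch of how that comparison might be automated follows, assuming tags follow the `v<version>` pattern used throughout this guide. `package_version` and `expected_tag` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# package_version FILE
# Prints the version string from the [package] table of a Cargo.toml.
# Prints nothing if the version is inherited (version.workspace = true).
# (Illustrative helper; not part of the project tooling.)
package_version() {
  awk '
    /^\[/ { in_package = ($0 ~ /^\[package\]/) }   # track the current table
    in_package && /^version[ \t]*=/ {
      sub(/^version[ \t]*=[ \t]*"/, "")            # strip key and opening quote
      sub(/".*$/, "")                              # strip closing quote and rest
      print
      exit
    }
  ' "$1"
}

# expected_tag FILE
# Prints the tag name this guide would use for the manifest version.
expected_tag() {
  echo "v$(package_version "$1")"
}
```

A release script could then check `[ "$(git describe --tags --abbrev=0)" = "$(expected_tag Cargo.toml)" ]` after tagging and stop if the two disagree.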

## Advanced: Scripted Workflows

### Script: Update All Components

```bash
#!/usr/bin/env bash
# update-all.sh - Update all component repos to latest main

set -euo pipefail

WORKSPACE=~/embeddenator-workspace
COMPONENTS=(
  embeddenator-vsa
  embeddenator-io
  embeddenator-retrieval
  embeddenator-fs
  embeddenator-interop
  embeddenator-obs
  embeddenator-testkit
  embeddenator-contract-bench
  embeddenator-workspace
)

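failed=0
for comp in "${COMPONENTS[@]}"; do
  echo "Updating $comp..."
  # Chain with && so a failed step skips the rest: `set -e` has no effect
  # inside a subshell that is the left-hand side of ||
  (
    cd "$WORKSPACE/$comp" &&
      git checkout main &&
      git pull --tags &&
      cargo update
  ) || {
    echo "⚠️  Failed to update $comp (skipping)"
    failed=1
  }
done

if [ "$failed" -eq 0 ]; then
  echo "✓ All components updated"
else
  echo "⚠️  Some components failed to update"
  exit 1
fi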
```

### Script: Enable/Disable Patches

```bash
#!/usr/bin/env bash
# patches.sh - Toggle [patch.crates-io] in embeddenator core

set -euo pipefail

CARGO_TOML=~/embeddenator-workspace/embeddenator/Cargo.toml
PATCH_MARKER="# LOCAL_DEV_PATCHES"

enable_patches() {
  if grep -q "$PATCH_MARKER" "$CARGO_TOML"; then
    echo "Patches already enabled"
    return
  fi
  
  cat >> "$CARGO_TOML" <<EOF

$PATCH_MARKER
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
embeddenator-io = { path = "../embeddenator-io" }
embeddenator-retrieval = { path = "../embeddenator-retrieval" }
embeddenator-fs = { path = "../embeddenator-fs" }
embeddenator-interop = { path = "../embeddenator-interop" }
embeddenator-obs = { path = "../embeddenator-obs" }
EOF
  
  echo "✓ Patches enabled"
}

disable_patches() {
  if ! grep -q "$PATCH_MARKER" "$CARGO_TOML"; then
    echo "Patches already disabled"
    return
  fi
  
  sed -i "/$PATCH_MARKER/,\$d" "$CARGO_TOML"
  echo "✓ Patches disabled"
}

case "${1:-}" in
  on|enable)
    enable_patches
    ;;
  off|disable)
    disable_patches
    ;;
  *)
    echo "Usage: $0 {on|off}"
    exit 1
    ;;
esac
```

**Usage:**
```bash
chmod +x patches.sh
./patches.sh on   # Enable local development
./patches.sh off  # Prepare for release
```

## See Also

- [COMPONENT_ARCHITECTURE.md](COMPONENT_ARCHITECTURE.md) - Architecture overview
- [VERSIONING.md](VERSIONING.md) - Versioning strategy
- [Cargo Book: Overriding Dependencies](https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html)