
embeddenator-vsa

Vector Symbolic Architecture (VSA) operations for sparse and dense balanced ternary representations.

An independent component extracted from the Embeddenator monolithic repository; part of the Embeddenator workspace.

Status

Phase 2A Component Extraction - Production-ready independent component.

Current Version: v0.20.0-alpha.1

Repository: https://github.com/tzervas/embeddenator-vsa

Implementation Status

  • Core VSA Operations: Bundle, bind, permute
  • Bitsliced Ternary: Optimized 2-bits-per-trit representation
  • Sparse & Dense: Adaptive representation (SparseVec / PackedTritVec)
  • SIMD-Ready: Word-level parallelism with auto-vectorization
  • Well-Tested: 53 tests passing (unit + integration + doc tests)
  • 📋 Explicit SIMD: Planned for v0.21.0 (AVX2, NEON)
  • 📋 GPU Acceleration: Planned for v1.1.0 (CUDA, OpenCL)

Features

  • Balanced Ternary Arithmetic: {-1, 0, +1} operations with mathematical guarantees
  • Bitsliced Representation: 32 trits per u64 word for SIMD efficiency (see the sketch after this list)
  • Adaptive Storage: Automatic selection between sparse (< 25% density) and dense (≥ 25% density)
  • High Performance: Word-level operations with compiler auto-vectorization
  • GPU-Ready Design: Coalesced memory access, branchless operations
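
A minimal sketch of the 2-bits-per-trit idea referenced above, assuming the common 00 = 0, 01 = +1, 10 = -1 encoding inside each bit pair; the crate's actual bit layout and APIs may differ, and the helper names here are illustrative only.

// Illustrative 2-bits-per-trit packing: 32 trits per u64 word.
// Encoding assumed here: 00 = 0, 01 = +1, 10 = -1 (not necessarily the crate's layout).

/// Set trit `i` of a packed word to `value` (-1, 0, or +1).
fn set_trit(word: u64, i: usize, value: i8) -> u64 {
    debug_assert!(i < 32);
    let bits = match value {
        1 => 0b01,
        -1 => 0b10,
        _ => 0b00,
    };
    let shift = 2 * i;
    (word & !(0b11u64 << shift)) | ((bits as u64) << shift)
}

/// Read trit `i` of a packed word.
fn get_trit(word: u64, i: usize) -> i8 {
    match (word >> (2 * i)) & 0b11 {
        0b01 => 1,
        0b10 => -1,
        _ => 0,
    }
}

fn main() {
    let mut w = 0u64;
    w = set_trit(w, 0, 1);
    w = set_trit(w, 1, -1);
    assert_eq!((get_trit(w, 0), get_trit(w, 1), get_trit(w, 2)), (1, -1, 0));
}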

Documentation

Core Documentation

Planning & Analysis

Quick Start

use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

// Encode data to hypervectors
let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"hello", &config, None);
let vec2 = SparseVec::encode_data(b"world", &config, None);

// Bundle (superposition)
let bundled = vec1.bundle(&vec2);

// Bind (association)
let bound = vec1.bind(&vec2);

// Compute similarity
let similarity = vec1.cosine(&vec2);
println!("Similarity: {:.3}", similarity);
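
The snippet above covers bundle, bind, and cosine similarity. The third core operation, permute, is typically a coordinate rotation used to encode order or role information; since its exact API is not shown here, the sketch below illustrates the idea on plain (index, sign) pairs, with illustrative helper names rather than the crate's API.

// Illustrative permutation of a sparse ternary vector represented as
// index-sorted (index, sign) pairs in a space of `dim` coordinates:
// rotate every index by `shift` (mod dim). Rotating by `shift` and then
// by `dim - shift` recovers the original vector, which is what makes
// permute useful for encoding positions in a sequence.
fn permute(entries: &[(usize, i8)], dim: usize, shift: usize) -> Vec<(usize, i8)> {
    let mut out: Vec<(usize, i8)> = entries
        .iter()
        .map(|&(idx, sign)| ((idx + shift) % dim, sign))
        .collect();
    out.sort_unstable_by_key(|&(idx, _)| idx); // keep indices sorted for merge-style ops
    out
}

fn main() {
    let v = vec![(2, 1i8), (5, -1), (9, 1)];
    let rotated = permute(&v, 16, 3);
    assert_eq!(rotated, vec![(5, 1), (8, -1), (12, 1)]);
}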

Usage

[dependencies]
# Pin `tag` to the release you are targeting (the tag below may lag the current version)
embeddenator-vsa = { git = "https://github.com/tzervas/embeddenator-vsa", tag = "v0.1.0" }

Development

# Build
cargo build

# Run tests
cargo test

# Run tests with all features
cargo test --all-features

# Check code quality
cargo clippy --all-features -- -D warnings

# For cross-repo work, use Cargo patches:
# Add to Cargo.toml:
# [patch."https://github.com/tzervas/embeddenator-vsa"]
# embeddenator-vsa = { path = "../embeddenator-vsa" }

Architecture

Representation Types

SparseVec: For sparse vectors (< 25% density)

  • Memory: 8 bytes per non-zero trit
  • Best for: Random data encoding, feature vectors
  • Operations: Sorted merge algorithms (see the sketch below)
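
A minimal sketch of the sorted-merge pattern those operations rely on, shown for a dot product over index-sorted (index, sign) pairs; this illustrates the idea rather than the crate's internal types.

// Dot product of two sparse ternary vectors stored as index-sorted
// (index, sign) pairs, where sign is -1 or +1. A single merge pass
// visits each non-zero entry once, so the cost is linear in the
// number of non-zeros rather than in the full dimensionality.
fn sparse_dot(a: &[(usize, i8)], b: &[(usize, i8)]) -> i64 {
    let (mut i, mut j, mut acc) = (0usize, 0usize, 0i64);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
            std::cmp::Ordering::Equal => {
                acc += (a[i].1 as i64) * (b[j].1 as i64);
                i += 1;
                j += 1;
            }
        }
    }
    acc
}

fn main() {
    let a = [(1usize, 1i8), (4, -1), (7, 1)];
    let b = [(1usize, 1i8), (5, 1), (7, -1)];
    assert_eq!(sparse_dot(&a, &b), 0); // +1 at index 1, -1 at index 7
}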

PackedTritVec: For dense vectors (≥ 25% density)

  • Memory: 2 bits per trit
  • Best for: Bundle operations, SIMD/GPU acceleration
  • Operations: Word-level bitwise operations (32 trits per u64; see the sketch below)
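
A minimal sketch of a branchless word-level operation, shown for bind (elementwise trit multiplication) on a single packed u64, reusing the illustrative 00/01/10 layout from the Features sketch above; the crate's internals may differ.

// Branchless bind (elementwise trit multiplication) of two packed words,
// assuming 2 bits per trit with 01 = +1 and 10 = -1 in each pair.
const LOW_BITS: u64 = 0x5555_5555_5555_5555; // low bit of every 2-bit pair

fn bind_words(a: u64, b: u64) -> u64 {
    let (a_pos, a_neg) = (a & LOW_BITS, (a >> 1) & LOW_BITS);
    let (b_pos, b_neg) = (b & LOW_BITS, (b >> 1) & LOW_BITS);
    // (+1,+1) or (-1,-1) => +1; (+1,-1) or (-1,+1) => -1; anything with 0 => 0
    let out_pos = (a_pos & b_pos) | (a_neg & b_neg);
    let out_neg = (a_pos & b_neg) | (a_neg & b_pos);
    out_pos | (out_neg << 1)
}

// The same bit planes give a word's dot-product contribution via popcounts.
fn dot_word(a: u64, b: u64) -> i32 {
    let bound = bind_words(a, b);
    (bound & LOW_BITS).count_ones() as i32 - ((bound >> 1) & LOW_BITS).count_ones() as i32
}

fn main() {
    let a = 0b00_01_01u64; // trits [+1, +1, 0] at positions 0, 1, 2
    let b = 0b01_10_01u64; // trits [+1, -1, +1]
    // (+1)*(+1) = +1 and (+1)*(-1) = -1 cancel, third trit is 0
    assert_eq!(dot_word(a, b), 0);
}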

See Ternary Representation Guide for detailed comparison.

Performance

  • Dot Product: ~n/32 word operations for PackedTritVec (n = dimensionality); O(k log k) for SparseVec (k = non-zero count)
  • Bind/Bundle: ~n/32 word operations for PackedTritVec; O(k) for SparseVec
  • SIMD: Designed for compiler auto-vectorization; actual speedup varies by platform
  • Future: Explicit SIMD (Phase 2) and GPU acceleration (Phase 5) planned

Note: Performance characteristics are theoretical and depend on hardware, data patterns, and compiler optimizations. Run cargo bench to measure on your specific system.
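
As a starting point for such measurements, a benchmark along these lines could be added under benches/ (this assumes criterion as a dev-dependency and a [[bench]] target with harness = false; the file name and setup are illustrative, and the calls mirror the Quick Start).

// benches/vsa_ops.rs -- hypothetical benchmark target, assuming `criterion`
// in [dev-dependencies] and `[[bench]] name = "vsa_ops", harness = false`.
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use embeddenator_vsa::{ReversibleVSAConfig, SparseVec};

fn bench_bind(c: &mut Criterion) {
    let config = ReversibleVSAConfig::default();
    let a = SparseVec::encode_data(b"hello", &config, None);
    let b = SparseVec::encode_data(b"world", &config, None);
    c.bench_function("sparse bind", |bench| {
        bench.iter(|| black_box(&a).bind(black_box(&b)))
    });
}

criterion_group!(benches, bench_bind);
criterion_main!(benches);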

Roadmap

See IMPLEMENTATION_PLAN.md for the complete roadmap.

Near Term

  • v0.21.0 (Phase 2): Explicit SIMD (AVX2, AVX-512, NEON)
  • v0.22.0 (Phase 3): Documentation & usability improvements
  • v1.0.0 (Phase 4): Production optimization

Future

  • v1.1.0 (Phase 5): GPU acceleration (CUDA, OpenCL, Vulkan)

Contributing

See CONTRIBUTING.md for contribution guidelines.

References

  • Component Decomposition: ADR-016
  • VSA Theory: Kanerva, P. (2009). "Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors". Cognitive Computation 1(2).
  • Balanced Ternary: Knuth, D. E. (1998). "The Art of Computer Programming, Vol. 2: Seminumerical Algorithms"

License

MIT