pub struct SparseVec {
pub pos: Vec<usize>,
pub neg: Vec<usize>,
}
Sparse ternary vector with positive and negative indices
Fields

pos: Vec<usize> - Indices with +1 value
neg: Vec<usize> - Indices with -1 value
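A minimal sketch of building a vector directly from its public fields; the index values chosen here are arbitrary and assumed to be valid dimensions:

use embeddenator_vsa::SparseVec;
// +1 at indices 3 and 17, -1 at index 42, 0 everywhere else
let vec = SparseVec { pos: vec![3, 17], neg: vec![42] };
assert_eq!(vec.pos.len() + vec.neg.len(), 3);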
Implementations

impl SparseVec

pub fn new() -> Self
Create an empty sparse vector
Examples
use embeddenator_vsa::SparseVec;
let vec = SparseVec::new();
assert!(vec.pos.is_empty());
assert!(vec.neg.is_empty());

pub fn random() -> Self
Generate a random sparse vector with ~1% density
Examples
use embeddenator_vsa::SparseVec;
let vec = SparseVec::random();
// Vector should have approximately 1% density (100 positive + 100 negative)
assert!(vec.pos.len() > 0);
assert!(vec.neg.len() > 0);

pub fn encode_data(
    data: &[u8],
    config: &ReversibleVSAConfig,
    path: Option<&str>,
) -> Self
Encode data into a reversible sparse vector using block-based mapping
This method implements hierarchical encoding with path-based permutations for lossless data recovery. The encoding process:
- Splits data into blocks of configurable size
- Applies path-based permutations to each block
- Combines blocks using hierarchical bundling
Arguments

data - The data to encode
config - Configuration for encoding parameters
path - Optional path string for hierarchical encoding (affects permutation)
Returns
A SparseVec that can be decoded back to the original data
Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let data = b"hello world";
let config = ReversibleVSAConfig::default();
let encoded = SparseVec::encode_data(data, &config, None);
// encoded vector contains reversible representation of the data
assert!(!encoded.pos.is_empty() || !encoded.neg.is_empty());

pub fn decode_data(
    &self,
    config: &ReversibleVSAConfig,
    path: Option<&str>,
    expected_size: usize,
) -> Vec<u8>
Decode data from a reversible sparse vector
Reverses the encoding process to recover the original data. Requires the same configuration and path used during encoding.
Arguments

config - Same configuration used for encoding
path - Same path string used for encoding
expected_size - Expected size of the decoded data (for validation)
Returns
The original data bytes (may need correction layer for 100% fidelity)
Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let data = b"hello world";
let config = ReversibleVSAConfig::default();
let encoded = SparseVec::encode_data(data, &config, None);
let decoded = encoded.decode_data(&config, None, data.len());
// Note: For 100% fidelity, use CorrectionStore with EmbrFS
// Raw decode may have minor differences that corrections compensate for

pub fn from_data(data: &[u8]) -> Self

Deprecated since 0.2.0: Use encode_data() for reversible encoding

Generate a deterministic sparse vector from data using a SHA256 seed. Deprecated: use encode_data() for new code.
Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let data = b"hello world";
let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(data, &config, None);
let vec2 = SparseVec::encode_data(data, &config, None);
// Same input produces same vector (deterministic)
assert_eq!(vec1.pos, vec2.pos);
assert_eq!(vec1.neg, vec2.neg);

pub fn bundle_with_config(
    &self,
    other: &SparseVec,
    config: Option<&ReversibleVSAConfig>,
) -> SparseVec
Bundle operation: pairwise conflict-cancel superposition (A ⊕ B)
This is a fast, commutative merge for two vectors:
- same sign => keep
- opposite signs => cancel to 0
- sign vs 0 => keep sign
Note: While this is well-defined for two vectors, repeated application across 3+ vectors is generally not associative because early cancellation/thresholding can discard multiplicity information.
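A hand-constructed sketch of the conflict-cancel rules above, assuming the chosen indices are valid dimensions and that passing None for config skips thinning:

use embeddenator_vsa::SparseVec;
// a: +1 at 1 and 2, -1 at 3;  b: +1 at 3, -1 at 4
let a = SparseVec { pos: vec![1, 2], neg: vec![3] };
let b = SparseVec { pos: vec![3], neg: vec![4] };
let c = a.bundle_with_config(&b, None);
// Unopposed signs are kept; the +1/-1 conflict at index 3 cancels to 0
assert!(c.pos.contains(&1) && c.pos.contains(&2));
assert!(c.neg.contains(&4));
assert!(!c.pos.contains(&3) && !c.neg.contains(&3));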
Arguments

other - The vector to bundle with self
config - Optional ReversibleVSAConfig for controlling sparsity via thinning
Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"data1", &config, None);
let vec2 = SparseVec::encode_data(b"data2", &config, None);
let bundled = vec1.bundle_with_config(&vec2, Some(&config));
// Bundled vector contains superposition of both inputs
// Should be similar to both original vectors
let sim1 = vec1.cosine(&bundled);
let sim2 = vec2.cosine(&bundled);
assert!(sim1 > 0.3);
assert!(sim2 > 0.3);Sourcepub fn bundle(&self, other: &SparseVec) -> SparseVec
pub fn bundle(&self, other: &SparseVec) -> SparseVec
Bundle operation: pairwise conflict-cancel superposition (A ⊕ B)
Semantics are identical to bundle_with_config(); use that variant to optionally apply thinning via a ReversibleVSAConfig.
Examples
use embeddenator_vsa::SparseVec;
let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"data1", &config, None);
let vec2 = SparseVec::encode_data(b"data2", &config, None);
let bundled = vec1.bundle(&vec2);
// Bundled vector contains superposition of both inputs
// Should be similar to both original vectors
let sim1 = vec1.cosine(&bundled);
let sim2 = vec2.cosine(&bundled);
assert!(sim1 > 0.3);
assert!(sim2 > 0.3);

pub fn bundle_sum_many<'a, I>(vectors: I) -> SparseVec
where
    I: IntoIterator<Item = &'a SparseVec>,
Associative bundle over many vectors: sums contributions per index, then thresholds to sign. This is order-independent because all contributions are accumulated before applying sign. Complexity: O(K log K) where K is total non-zero entries across inputs.
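A minimal sketch of the order-independence claim, assuming the accumulated result is deterministic for a fixed set of inputs (index order is normalized by sorting before comparison):

use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let config = ReversibleVSAConfig::default();
let a = SparseVec::encode_data(b"a", &config, None);
let b = SparseVec::encode_data(b"b", &config, None);
let c = SparseVec::encode_data(b"c", &config, None);
// Contributions are summed before thresholding, so input order should not matter
let x = SparseVec::bundle_sum_many([&a, &b, &c]);
let y = SparseVec::bundle_sum_many([&c, &a, &b]);
let (mut xp, mut yp) = (x.pos.clone(), y.pos.clone());
xp.sort_unstable();
yp.sort_unstable();
assert_eq!(xp, yp);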
pub fn bundle_hybrid_many<'a, I>(vectors: I) -> SparseVec
where
    I: IntoIterator<Item = &'a SparseVec>,
Hybrid bundle: chooses a fast pairwise fold in very sparse regimes (to preserve sparsity), otherwise uses the associative sum-then-threshold path (order-independent, more faithful to a majority vote).
Heuristic: estimate the expected overlap/collision count assuming uniform hashing into DIM. If the expected number of colliding dimensions is below a small budget, use the pairwise bundle; otherwise use bundle_sum_many.
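A minimal sketch, assuming that whichever path the heuristic selects, the bundled result still resembles each input:

use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let config = ReversibleVSAConfig::default();
let a = SparseVec::encode_data(b"first", &config, None);
let b = SparseVec::encode_data(b"second", &config, None);
let bundled = SparseVec::bundle_hybrid_many([&a, &b]);
// Either bundling path keeps the result similar to its inputs
assert!(a.cosine(&bundled) > 0.0);
assert!(b.cosine(&bundled) > 0.0);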
pub fn bind(&self, other: &SparseVec) -> SparseVec
Bind operation: non-commutative composition (A ⊙ B). Performs element-wise multiplication. Self-inverse: A ⊙ A ≈ I.
Examples
use embeddenator_vsa::SparseVec;
let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let bound = vec.bind(&vec);
// Bind with self should produce high similarity (self-inverse property)
let identity = SparseVec::encode_data(b"identity", &config, None);
let sim = bound.cosine(&identity);
// Result is approximately identity, so similarity varies
assert!(sim >= -1.0 && sim <= 1.0);

pub fn cosine(&self, other: &SparseVec) -> f64
Calculate cosine similarity between two sparse vectors. Returns a value in [-1, 1], where 1 is identical and 0 is orthogonal.
When the simd feature is enabled, this will automatically use
AVX2 (x86_64) or NEON (aarch64) acceleration if available.
Examples
use embeddenator_vsa::SparseVec;
let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"cat", &config, None);
let vec2 = SparseVec::encode_data(b"cat", &config, None);
let vec3 = SparseVec::encode_data(b"dog", &config, None);
// Identical data produces identical vectors
assert!((vec1.cosine(&vec2) - 1.0).abs() < 0.01);
// Different data produces low similarity
let sim = vec1.cosine(&vec3);
assert!(sim < 0.3);

pub fn cosine_scalar(&self, other: &SparseVec) -> f64
Scalar (non-SIMD) cosine similarity implementation.
This is the original implementation and serves as the baseline for SIMD optimizations. It’s also used when SIMD is not available.
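A minimal sketch, assuming the scalar path and the (possibly SIMD-accelerated) cosine() compute the same quantity up to floating-point rounding:

use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let config = ReversibleVSAConfig::default();
let a = SparseVec::encode_data(b"alpha", &config, None);
let b = SparseVec::encode_data(b"beta", &config, None);
// cosine() may dispatch to SIMD; cosine_scalar() is the reference implementation
let diff = (a.cosine(&b) - a.cosine_scalar(&b)).abs();
assert!(diff < 1e-6);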
pub fn permute(&self, shift: usize) -> SparseVec
Apply cyclic permutation to vector indices. Used for encoding sequence order in hierarchical structures.
Arguments

shift - Number of positions to shift indices cyclically
Examples
use embeddenator_vsa::SparseVec;
let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let permuted = vec.permute(100);
// Permuted vector should have different indices but same structure
assert_eq!(vec.pos.len(), permuted.pos.len());
assert_eq!(vec.neg.len(), permuted.neg.len());

pub fn inverse_permute(&self, shift: usize) -> SparseVec
Apply inverse cyclic permutation to vector indices. Decodes sequence order by reversing the permutation shift.
Arguments

shift - Number of positions to reverse-shift indices cyclically
Examples
use embeddenator_vsa::SparseVec;
let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let permuted = vec.permute(100);
let recovered = permuted.inverse_permute(100);
// Round-trip should recover original vector
assert_eq!(vec.pos, recovered.pos);
assert_eq!(vec.neg, recovered.neg);

pub fn thin(&self, target_non_zero: usize) -> SparseVec
Context-Dependent Thinning Algorithm
Thinning controls vector sparsity during bundle operations to prevent exponential density growth that degrades VSA performance. The algorithm:
- Calculate current density = (pos.len() + neg.len()) as f32 / DIM as f32
- If current_density <= target_density, return unchanged
- Otherwise, randomly sample indices to reduce to target count
- Preserve pos/neg ratio to maintain signal polarity balance
- Use deterministic seeding for reproducible results
Edge Cases:
- Empty vector: return unchanged
- target_non_zero = 0: return empty vector (not recommended)
- target_non_zero >= current: return clone
- Single polarity vectors: preserve polarity distribution
Performance: O(n log n) due to sorting, where n = target_non_zero
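A minimal sketch of the edge cases described above, assuming a target at or above the current count returns a clone and a smaller target reduces the non-zero count:

use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};
let config = ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"some data to thin", &config, None);
let total = vec.pos.len() + vec.neg.len();
// target_non_zero >= current count: returned unchanged
let same = vec.thin(total);
assert_eq!(same.pos.len() + same.neg.len(), total);
// Smaller target: the number of non-zero entries shrinks toward the target
let thinned = vec.thin(total / 2);
assert!(thinned.pos.len() + thinned.neg.len() < total);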