Struct SparseVec 

Source
pub struct SparseVec {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

Sparse ternary vector that stores the indices of its +1 and -1 entries; every other position is implicitly 0

Fields§

§pos: Vec<usize>

Indices with +1 value

§neg: Vec<usize>

Indices with -1 value
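
To make the two-list layout concrete, here is a minimal sketch that converts between a dense ternary slice and index lists. `SparseTernary`, `from_dense`, and `to_dense` are hypothetical names used only for illustration; they are not part of this crate's API.

```rust
/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>, // indices holding +1
    pub neg: Vec<usize>, // indices holding -1
}

/// Collect the +1 and -1 positions of a dense ternary slice.
pub fn from_dense(dense: &[i8]) -> SparseTernary {
    let mut out = SparseTernary::default();
    for (i, &v) in dense.iter().enumerate() {
        match v {
            1 => out.pos.push(i),
            -1 => out.neg.push(i),
            _ => {} // zeros stay implicit; other values are ignored in this sketch
        }
    }
    out
}

/// Expand back to a dense vector of length `dim`.
/// Panics if any stored index is >= `dim`.
pub fn to_dense(v: &SparseTernary, dim: usize) -> Vec<i8> {
    let mut dense = vec![0i8; dim];
    for &i in &v.pos {
        dense[i] = 1;
    }
    for &i in &v.neg {
        dense[i] = -1;
    }
    dense
}
```

For the dense input `[0, 1, -1, 0, 1]` this yields `pos = [1, 4]` and `neg = [2]`.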

Implementations§

Source§

impl SparseVec

Source

pub fn from_seed(seed: &[u8; 32], dim: usize) -> Self

Create a sparse vector from a seed (deterministic)

Source

pub fn from_bytes(data: &[u8]) -> Self

Create a sparse vector directly from bytes

Source§

impl SparseVec

Source

pub fn new() -> Self

Create an empty sparse vector

§Examples
use embeddenator_vsa::SparseVec;

let vec = SparseVec::new();
assert!(vec.pos.is_empty());
assert!(vec.neg.is_empty());
Source

pub fn random() -> Self

Generate a random sparse vector with ~1% density

§Examples
use embeddenator_vsa::SparseVec;

let vec = SparseVec::random();
// Vector should have approximately 1% density (100 positive + 100 negative)
assert!(vec.pos.len() > 0);
assert!(vec.neg.len() > 0);
Source

pub fn encode_data( data: &[u8], config: &ReversibleVSAConfig, path: Option<&str>, ) -> Self

Encode data into a reversible sparse vector using block-based mapping

This method implements hierarchical encoding with path-based permutations for lossless data recovery. The encoding process:

  1. Splits data into blocks of configurable size
  2. Applies path-based permutations to each block
  3. Combines blocks using hierarchical bundling
§Arguments
  • data - The data to encode
  • config - Configuration for encoding parameters
  • path - Optional path string for hierarchical encoding (affects permutation)
§Returns

A SparseVec that can be decoded back to the original data

§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let data = b"hello world";
let config = ReversibleVSAConfig::default();
let encoded = SparseVec::encode_data(data, &config, None);

// encoded vector contains reversible representation of the data
assert!(!encoded.pos.is_empty() || !encoded.neg.is_empty());
Source

pub fn decode_data( &self, config: &ReversibleVSAConfig, path: Option<&str>, expected_size: usize, ) -> Vec<u8>

Decode data from a reversible sparse vector

Reverses the encoding process to recover the original data. Requires the same configuration and path used during encoding.

§Arguments
  • config - Same configuration used for encoding
  • path - Same path string used for encoding
  • expected_size - Expected size of the decoded data (for validation)
§Returns

The decoded data bytes. Raw decoding may differ slightly from the original, so a correction layer may be needed for 100% fidelity.

§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let data = b"hello world";
let config = ReversibleVSAConfig::default();
let encoded = SparseVec::encode_data(data, &config, None);
let decoded = encoded.decode_data(&config, None, data.len());

// Note: For 100% fidelity, use CorrectionStore with EmbrFS
// Raw decode may have minor differences that corrections compensate for
Source

pub fn from_data(data: &[u8]) -> Self

👎Deprecated since 0.2.0: Use encode_data() for reversible encoding

Generate a deterministic sparse vector from data using a SHA256 seed.

DEPRECATED: Use encode_data() for new code.

§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let data = b"hello world";
let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(data, &config, None);
let vec2 = SparseVec::encode_data(data, &config, None);

// Same input produces same vector (deterministic)
assert_eq!(vec1.pos, vec2.pos);
assert_eq!(vec1.neg, vec2.neg);
Source

pub fn bundle_with_config( &self, other: &SparseVec, config: Option<&ReversibleVSAConfig>, ) -> SparseVec

Bundle operation: pairwise conflict-cancel superposition (A ⊕ B)

This is a fast, commutative merge for two vectors:

  • same sign => keep
  • opposite signs => cancel to 0
  • sign vs 0 => keep sign

Note: While this is well-defined for two vectors, repeated application across 3+ vectors is generally not associative because early cancellation/thresholding can discard multiplicity information.
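
The three rules and the non-associativity note can be illustrated with a small sketch. It assumes a stand-in type `SparseTernary` and a function `bundle_pair`, both hypothetical, and leaves out thinning entirely; the crate's actual implementation may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Pairwise conflict-cancel merge of two ternary vectors.
pub fn bundle_pair(a: &SparseTernary, b: &SparseTernary) -> SparseTernary {
    // Sum the signs per index; each input contributes at most +1 or -1.
    let mut acc: BTreeMap<usize, i32> = BTreeMap::new();
    for v in [a, b] {
        for &i in &v.pos {
            *acc.entry(i).or_insert(0) += 1;
        }
        for &i in &v.neg {
            *acc.entry(i).or_insert(0) -= 1;
        }
    }
    let mut out = SparseTernary::default();
    for (i, s) in acc {
        if s > 0 {
            out.pos.push(i); // same sign, or sign vs 0: keep +1
        } else if s < 0 {
            out.neg.push(i); // same sign, or sign vs 0: keep -1
        }
        // s == 0: opposite signs cancelled, index is dropped
    }
    out
}
```

With A = +1, B = +1, and C = -1 at the same index, `(A ⊕ B) ⊕ C` cancels to 0 while `A ⊕ (B ⊕ C)` keeps +1, because the first grouping has already collapsed the two positive votes into one before C arrives.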

§Arguments
  • other - The vector to bundle with self
  • config - Optional ReversibleVSAConfig for controlling sparsity via thinning
§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"data1", &config, None);
let vec2 = SparseVec::encode_data(b"data2", &config, None);
let bundled = vec1.bundle_with_config(&vec2, Some(&config));

// Bundled vector contains superposition of both inputs
// Should be similar to both original vectors
let sim1 = vec1.cosine(&bundled);
let sim2 = vec2.cosine(&bundled);
assert!(sim1 > 0.3);
assert!(sim2 > 0.3);
Source

pub fn bundle(&self, other: &SparseVec) -> SparseVec

Bundle operation: pairwise conflict-cancel superposition (A ⊕ B)

See bundle_with_config() for semantic details; this variant takes no ReversibleVSAConfig argument, whereas bundle_with_config() can optionally apply thinning.

§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"data1", &config, None);
let vec2 = SparseVec::encode_data(b"data2", &config, None);
let bundled = vec1.bundle(&vec2);

// Bundled vector contains superposition of both inputs
// Should be similar to both original vectors
let sim1 = vec1.cosine(&bundled);
let sim2 = vec2.cosine(&bundled);
assert!(sim1 > 0.3);
assert!(sim2 > 0.3);
Source

pub fn bundle_sum_many<'a, I>(vectors: I) -> SparseVec
where I: IntoIterator<Item = &'a SparseVec>,

Associative bundle over many vectors: sums contributions per index, then thresholds to sign. This is order-independent because all contributions are accumulated before applying sign. Complexity: O(K log K) where K is total non-zero entries across inputs.
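
A minimal sketch of the sum-then-threshold idea follows, assuming a hypothetical stand-in type `SparseTernary` and function `bundle_sum`. An ordered map is used here so that accumulating K entries costs O(K log K); the crate may use a different accumulation strategy.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Accumulate every contribution per index, then keep only the sign.
pub fn bundle_sum<'a, I>(vectors: I) -> SparseTernary
where
    I: IntoIterator<Item = &'a SparseTernary>,
{
    let mut acc: BTreeMap<usize, i64> = BTreeMap::new();
    for v in vectors {
        for &i in &v.pos {
            *acc.entry(i).or_insert(0) += 1;
        }
        for &i in &v.neg {
            *acc.entry(i).or_insert(0) -= 1;
        }
    }
    let mut out = SparseTernary::default();
    // BTreeMap iterates in ascending index order, so outputs stay sorted.
    for (i, sum) in acc {
        if sum > 0 {
            out.pos.push(i);
        } else if sum < 0 {
            out.neg.push(i);
        }
        // sum == 0: exact tie, index is dropped
    }
    out
}
```

Because integer addition is commutative and associative, any ordering of the inputs produces the same sums and therefore the same output.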

Source

pub fn bundle_hybrid_many<'a, I>(vectors: I) -> SparseVec
where I: IntoIterator<Item = &'a SparseVec>,

Hybrid bundle: choose a fast pairwise fold for very sparse regimes (to preserve sparsity), otherwise use the associative sum-then-threshold path (order-independent, more faithful to majority).

Heuristic: estimate expected overlap/collision count assuming uniform hashing into DIM. If expected colliding dimensions is below a small budget, use pairwise bundle; else use bundle_sum_many.
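
One way such an estimate could look is sketched below. The occupancy formula, the `Strategy` enum, the function names, and the `budget` parameter are all assumptions made for illustration; the crate's actual estimate and threshold are not documented here.

```rust
/// Expected number of colliding entries when `total_nnz` indices are placed
/// uniformly at random into `dim` slots (assumed model, not the crate's code).
pub fn expected_collisions(total_nnz: usize, dim: usize) -> f64 {
    if dim == 0 {
        return 0.0;
    }
    let k = total_nnz as f64;
    let d = dim as f64;
    // Expected number of distinct occupied slots after k uniform throws.
    let occupied = d * (1.0 - (1.0 - 1.0 / d).powf(k));
    // Every entry beyond the occupied slots landed on an already used index.
    (k - occupied).max(0.0)
}

/// Hypothetical labels for the two bundling paths.
#[derive(Debug, PartialEq)]
pub enum Strategy {
    PairwiseFold,
    SumThenThreshold,
}

/// Pick the pairwise fold only while expected collisions stay under `budget`.
pub fn choose_strategy(total_nnz: usize, dim: usize, budget: f64) -> Strategy {
    if expected_collisions(total_nnz, dim) < budget {
        Strategy::PairwiseFold
    } else {
        Strategy::SumThenThreshold
    }
}
```

Under this model, a handful of entries in a large dimension almost never collide, so the cheap pairwise path loses little; once the total entry count is comparable to the dimension, collisions dominate and the order-independent path is the safer choice.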

Source

pub fn bind(&self, other: &SparseVec) -> SparseVec

Bind operation: composition (A ⊙ B). Performs element-wise multiplication, which is commutative (A ⊙ B = B ⊙ A). Self-inverse: A ⊙ A ≈ I
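
Element-wise multiplication of two sparse ternary vectors can be sketched as follows. `SparseTernary` and `bind_sketch` are hypothetical names, and the sketch covers only the multiplication itself, not any extra step the crate's `bind` might perform.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Element-wise product: only indices that are non-zero in both inputs survive.
pub fn bind_sketch(a: &SparseTernary, b: &SparseTernary) -> SparseTernary {
    let a_pos: HashSet<usize> = a.pos.iter().copied().collect();
    let a_neg: HashSet<usize> = a.neg.iter().copied().collect();
    let mut out = SparseTernary::default();
    for &i in &b.pos {
        if a_pos.contains(&i) {
            out.pos.push(i); // (+1) * (+1) = +1
        } else if a_neg.contains(&i) {
            out.neg.push(i); // (-1) * (+1) = -1
        }
    }
    for &i in &b.neg {
        if a_pos.contains(&i) {
            out.neg.push(i); // (+1) * (-1) = -1
        } else if a_neg.contains(&i) {
            out.pos.push(i); // (-1) * (-1) = +1
        }
    }
    out.pos.sort_unstable();
    out.neg.sort_unstable();
    out
}
```

In this sketch, binding a vector with itself yields +1 at every index where the vector was non-zero and nothing elsewhere, which is the sense in which the product acts like an identity on that support. The result of the multiplication alone does not depend on argument order.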

§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let bound = vec.bind(&vec);

// Binding a vector with itself approximates the identity (self-inverse property)
let identity = SparseVec::encode_data(b"identity", &config, None);
let sim = bound.cosine(&identity);
// `identity` here is just another encoded vector, so only the valid cosine range is checked
assert!(sim >= -1.0 && sim <= 1.0);
Source

pub fn cosine(&self, other: &SparseVec) -> f64

Calculate cosine similarity between two sparse vectors. Returns a value in [-1, 1], where 1 is identical and 0 is orthogonal.

When the simd feature is enabled, this will automatically use AVX2 (x86_64) or NEON (aarch64) acceleration if available.
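
For ternary vectors the cosine reduces to counting index matches, as the scalar sketch below shows. `SparseTernary` and `cosine_sketch` are hypothetical names, and returning 0.0 for an empty vector is an assumption of this sketch, not documented behaviour of the crate.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Cosine similarity: dot product divided by the product of the norms.
pub fn cosine_sketch(a: &SparseTernary, b: &SparseTernary) -> f64 {
    // Each non-zero entry is +1 or -1, so the squared norm is the entry count.
    let na = (a.pos.len() + a.neg.len()) as f64;
    let nb = (b.pos.len() + b.neg.len()) as f64;
    if na == 0.0 || nb == 0.0 {
        return 0.0; // assumed convention for empty vectors
    }
    let b_pos: HashSet<usize> = b.pos.iter().copied().collect();
    let b_neg: HashSet<usize> = b.neg.iter().copied().collect();
    let mut dot: i64 = 0;
    for i in &a.pos {
        if b_pos.contains(i) {
            dot += 1; // signs agree
        } else if b_neg.contains(i) {
            dot -= 1; // signs disagree
        }
    }
    for i in &a.neg {
        if b_neg.contains(i) {
            dot += 1;
        } else if b_pos.contains(i) {
            dot -= 1;
        }
    }
    dot as f64 / (na.sqrt() * nb.sqrt())
}
```

Identical vectors score 1, a vector and its sign-flipped copy score -1, and vectors with disjoint supports score 0.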

§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"cat", &config, None);
let vec2 = SparseVec::encode_data(b"cat", &config, None);
let vec3 = SparseVec::encode_data(b"dog", &config, None);

// Identical data produces identical vectors
assert!((vec1.cosine(&vec2) - 1.0).abs() < 0.01);

// Different data produces low similarity
let sim = vec1.cosine(&vec3);
assert!(sim < 0.3);
Source

pub fn cosine_scalar(&self, other: &SparseVec) -> f64

Scalar (non-SIMD) cosine similarity implementation.

This is the original implementation and serves as the baseline for SIMD optimizations. It’s also used when SIMD is not available.

Source

pub fn permute(&self, shift: usize) -> SparseVec

Apply cyclic permutation to vector indices. Used for encoding sequence order in hierarchical structures.

§Arguments
  • shift - Number of positions to shift indices cyclically
§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let permuted = vec.permute(100);

// Permuted vector should have different indices but same structure
assert_eq!(vec.pos.len(), permuted.pos.len());
assert_eq!(vec.neg.len(), permuted.neg.len());
Source

pub fn inverse_permute(&self, shift: usize) -> SparseVec

Apply inverse cyclic permutation to vector indices. Decodes sequence order by reversing the permutation shift.

§Arguments
  • shift - Number of positions to reverse shift indices cyclically
§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let permuted = vec.permute(100);
let recovered = permuted.inverse_permute(100);

// Round-trip should recover original vector
assert_eq!(vec.pos, recovered.pos);
assert_eq!(vec.neg, recovered.neg);
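
The cyclic shift and its inverse can be sketched as modular index arithmetic. In this sketch the dimension is an explicit `dim` parameter and the names `SparseTernary`, `permute_sketch`, and `inverse_permute_sketch` are hypothetical; the crate's methods take only `shift`. Index lists are assumed to be sorted and to contain values below `dim`.

```rust
/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Shift every index forward by `shift`, wrapping around at `dim`.
pub fn permute_sketch(v: &SparseTernary, shift: usize, dim: usize) -> SparseTernary {
    assert!(dim > 0, "dimension must be non-zero");
    let s = shift % dim;
    let rotate = |indices: &Vec<usize>| -> Vec<usize> {
        let mut out: Vec<usize> = indices.iter().map(|&i| (i + s) % dim).collect();
        out.sort_unstable(); // wrap-around can break ordering, so re-sort
        out
    };
    SparseTernary {
        pos: rotate(&v.pos),
        neg: rotate(&v.neg),
    }
}

/// Undo `permute_sketch` by shifting forward by the complementary amount.
pub fn inverse_permute_sketch(v: &SparseTernary, shift: usize, dim: usize) -> SparseTernary {
    assert!(dim > 0, "dimension must be non-zero");
    permute_sketch(v, dim - shift % dim, dim)
}
```

Shifting forward by `dim - shift` is the same as shifting backward by `shift`, which avoids unsigned underflow when an index is smaller than the shift.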
Source

pub fn thin(&self, target_non_zero: usize) -> SparseVec

Context-Dependent Thinning Algorithm

Thinning controls vector sparsity during bundle operations to prevent exponential density growth that degrades VSA performance. The algorithm:

  1. Calculate the current non-zero count, pos.len() + neg.len() (as a density: count as f32 / DIM as f32)
  2. If the current count <= target_non_zero, return unchanged
  3. Otherwise, randomly sample indices to reduce to target count
  4. Preserve pos/neg ratio to maintain signal polarity balance
  5. Use deterministic seeding for reproducible results

Edge Cases:

  • Empty vector: return unchanged
  • target_non_zero = 0: return empty vector (not recommended)
  • target_non_zero >= current: return clone
  • Single polarity vectors: preserve polarity distribution

Performance: O(n log n) due to sorting, where n = target_non_zero
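
The steps and edge cases above can be sketched as follows. Everything here is an assumption made for illustration: the stand-in type `SparseTernary`, the function names, the fixed seed, and the hash-ranking scheme that replaces a seeded random sampler. The crate's sampling method and cost may differ.

```rust
/// Hypothetical stand-in with the same two fields as `SparseVec`.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SparseTernary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// SplitMix64-style mixer used as a deterministic pseudo-random key.
fn mix(mut x: u64) -> u64 {
    x = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
    x = (x ^ (x >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    x = (x ^ (x >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    x ^ (x >> 31)
}

/// Keep the `k` indices with the smallest keys, returned in ascending order.
fn sample(indices: &[usize], k: usize, seed: u64) -> Vec<usize> {
    let mut keyed: Vec<(u64, usize)> =
        indices.iter().map(|&i| (mix(i as u64 ^ seed), i)).collect();
    keyed.sort_unstable();
    let mut out: Vec<usize> = keyed.into_iter().take(k).map(|(_, i)| i).collect();
    out.sort_unstable();
    out
}

/// Reduce `v` to `target_non_zero` entries while keeping the pos/neg ratio.
pub fn thin_sketch(v: &SparseTernary, target_non_zero: usize) -> SparseTernary {
    let current = v.pos.len() + v.neg.len();
    if current == 0 || target_non_zero >= current {
        return v.clone(); // nothing to remove
    }
    if target_non_zero == 0 {
        return SparseTernary::default();
    }
    // Split the budget in proportion to the current polarity counts (rounded).
    let keep_pos = ((target_non_zero * v.pos.len() + current / 2) / current).min(v.pos.len());
    let keep_neg = (target_non_zero - keep_pos).min(v.neg.len());
    const SEED: u64 = 0x5EED; // arbitrary fixed seed for reproducibility
    SparseTernary {
        pos: sample(&v.pos, keep_pos, SEED),
        neg: sample(&v.neg, keep_neg, SEED ^ 0xFFFF),
    }
}
```

For a vector with 60 positive and 40 negative entries, a target of 10 keeps 6 positive and 4 negative indices, and repeated calls return the same selection.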

Trait Implementations§

Source§

impl Clone for SparseVec

Source§

fn clone(&self) -> SparseVec

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for SparseVec

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl Default for SparseVec

Source§

fn default() -> Self

Returns the “default value” for a type. Read more
Source§

impl<'de> Deserialize<'de> for SparseVec

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl Serialize for SparseVec

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,