
Struct SparseVec 

pub struct SparseVec {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

Sparse ternary vector that stores the indices of its +1 entries and of its -1 entries; every other index is implicitly 0

Fields§

§pos: Vec<usize>

Indices with +1 value

§neg: Vec<usize>

Indices with -1 value
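
To make the two-field layout concrete, here is a minimal sketch of expanding such a vector into a dense array of -1/0/+1 values. `Ternary` is a hypothetical stand-in that mirrors the `pos` and `neg` fields, and `to_dense` and its `dim` parameter are assumptions for illustration; neither is part of this crate's API.

```rust
/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Expand to a dense vector of -1/0/+1 values.
/// `dim` is an assumed parameter; indices at or beyond `dim` are ignored.
pub fn to_dense(v: &Ternary, dim: usize) -> Vec<i8> {
    let mut dense = vec![0i8; dim];
    for &i in &v.pos {
        if i < dim {
            dense[i] = 1;
        }
    }
    // In this sketch an index listed in both fields ends up as -1.
    for &i in &v.neg {
        if i < dim {
            dense[i] = -1;
        }
    }
    dense
}
```

With `pos = [0, 3]`, `neg = [1]` and `dim = 5`, the dense form is `[1, -1, 0, 1, 0]`.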

Implementations§


impl SparseVec


pub fn from_seed(seed: &[u8; 32], dim: usize) -> Self

Create a sparse vector from a seed (deterministic)


pub fn from_bytes(data: &[u8]) -> Self

Create a sparse vector directly from bytes


impl SparseVec


pub fn new() -> Self

Create an empty sparse vector

§Examples
use embeddenator_vsa::SparseVec;

let vec = SparseVec::new();
assert!(vec.pos.is_empty());
assert!(vec.neg.is_empty());

pub fn random() -> Self

Generate a random sparse vector with ~1% density

§Examples
use embeddenator_vsa::SparseVec;

let vec = SparseVec::random();
// Vector should have approximately 1% density (100 positive + 100 negative)
assert!(vec.pos.len() > 0);
assert!(vec.neg.len() > 0);

pub fn random_with_config(config: &VsaConfig) -> Self

Generate a random sparse vector with configurable dimensions and density

This method allows creating vectors with custom dimensions, useful for benchmarking with different data types or when dimensions need to be determined at runtime.

§Arguments
  • config - Configuration specifying dimension and density
§Examples
use embeddenator_vsa::{SparseVec, VsaConfig};

// Create a small vector (1000 dims, 2% density)
let config = VsaConfig::small();
let vec = SparseVec::random_with_config(&config);
assert!(vec.pos.len() > 0);

// Create a custom-sized vector
let custom = VsaConfig::new(5000).with_density(0.02);
let vec = SparseVec::random_with_config(&custom);

pub fn encode_data(data: &[u8], config: &ReversibleVSAConfig, path: Option<&str>) -> Self

Encode data into a reversible sparse vector using block-based mapping

This method implements hierarchical encoding with path-based permutations for lossless data recovery. The encoding process:

  1. Splits data into blocks of configurable size
  2. Applies path-based permutations to each block
  3. Combines blocks using hierarchical bundling
§Arguments
  • data - The data to encode
  • config - Configuration for encoding parameters
  • path - Optional path string for hierarchical encoding (affects permutation)
§Returns

A SparseVec that can be decoded back to the original data

§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let data = b"hello world";
let config = ReversibleVSAConfig::default();
let encoded = SparseVec::encode_data(data, &config, None);

// encoded vector contains reversible representation of the data
assert!(!encoded.pos.is_empty() || !encoded.neg.is_empty());

pub fn decode_data(&self, config: &ReversibleVSAConfig, path: Option<&str>, expected_size: usize) -> Vec<u8>

Decode data from a reversible sparse vector

Reverses the encoding process to recover the original data. Requires the same configuration and path used during encoding.

§Arguments
  • config - Same configuration used for encoding
  • path - Same path string used for encoding
  • expected_size - Expected size of the decoded data (for validation)
§Returns

The decoded data bytes. Raw decoding may differ slightly from the original input; a correction layer is needed for 100% fidelity.

§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let data = b"hello world";
let config = ReversibleVSAConfig::default();
let encoded = SparseVec::encode_data(data, &config, None);
let decoded = encoded.decode_data(&config, None, data.len());

// Note: For 100% fidelity, use CorrectionStore with EmbrFS
// Raw decode may have minor differences that corrections compensate for

pub fn from_data(data: &[u8]) -> Self

👎Deprecated since 0.2.0: Use encode_data() for reversible encoding

Generate a deterministic sparse vector from data using a SHA256 seed.

DEPRECATED: Use encode_data() for new code.

§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let data = b"hello world";
let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(data, &config, None);
let vec2 = SparseVec::encode_data(data, &config, None);

// Same input produces same vector (deterministic)
assert_eq!(vec1.pos, vec2.pos);
assert_eq!(vec1.neg, vec2.neg);

pub fn bundle_with_config(&self, other: &SparseVec, config: Option<&ReversibleVSAConfig>) -> SparseVec

Bundle operation: pairwise conflict-cancel superposition (A ⊕ B)

This is a fast, commutative merge for two vectors:

  • same sign => keep
  • opposite signs => cancel to 0
  • sign vs 0 => keep sign

Note: While this is well-defined for two vectors, repeated application across 3+ vectors is generally not associative because early cancellation/thresholding can discard multiplicity information.
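
The three rules and the non-associativity note can be illustrated with a small sketch. `Ternary` and `bundle_pair` are hypothetical stand-ins written for this illustration, not the crate's implementation, and thinning is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Map each non-zero index to its sign (+1 or -1).
fn signs(v: &Ternary) -> BTreeMap<usize, i8> {
    let mut m = BTreeMap::new();
    for &i in &v.pos {
        m.insert(i, 1i8);
    }
    for &i in &v.neg {
        m.insert(i, -1i8);
    }
    m
}

/// Pairwise conflict-cancel bundle (assumed semantics, no thinning).
pub fn bundle_pair(a: &Ternary, b: &Ternary) -> Ternary {
    let mut acc = signs(a);
    for (i, s) in signs(b) {
        match acc.get(&i).copied() {
            // sign vs 0 => keep sign
            None => {
                acc.insert(i, s);
            }
            // same sign => keep
            Some(t) if t == s => {}
            // opposite signs => cancel to 0
            Some(_) => {
                acc.remove(&i);
            }
        }
    }
    let mut out = Ternary::default();
    for (i, s) in acc {
        if s > 0 {
            out.pos.push(i);
        } else {
            out.neg.push(i);
        }
    }
    out
}
```

Take A = +1, B = +1 and C = -1 at the same index. Folding left to right gives (A ⊕ B) ⊕ C = 0, because the surviving +1 cancels against C. Grouping the other way gives A ⊕ (B ⊕ C) = +1, because B and C cancel first. The two groupings disagree, which is the multiplicity loss the note describes.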

§Arguments
  • other - The vector to bundle with self
  • config - Optional ReversibleVSAConfig for controlling sparsity via thinning
§Examples
use embeddenator_vsa::{SparseVec, ReversibleVSAConfig};

let config = ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"data1", &config, None);
let vec2 = SparseVec::encode_data(b"data2", &config, None);
let bundled = vec1.bundle_with_config(&vec2, Some(&config));

// Bundled vector contains superposition of both inputs
// Should be similar to both original vectors
let sim1 = vec1.cosine(&bundled);
let sim2 = vec2.cosine(&bundled);
assert!(sim1 > 0.3);
assert!(sim2 > 0.3);

pub fn bundle(&self, other: &SparseVec) -> SparseVec

Bundle operation: pairwise conflict-cancel superposition (A ⊕ B)

See bundle_with_config() for semantic details. This variant takes no ReversibleVSAConfig, so it offers no config-controlled thinning.

§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"data1", &config, None);
let vec2 = SparseVec::encode_data(b"data2", &config, None);
let bundled = vec1.bundle(&vec2);

// Bundled vector contains superposition of both inputs
// Should be similar to both original vectors
let sim1 = vec1.cosine(&bundled);
let sim2 = vec2.cosine(&bundled);
assert!(sim1 > 0.3);
assert!(sim2 > 0.3);

pub fn bundle_sum_many<'a, I>(vectors: I) -> SparseVec
where I: IntoIterator<Item = &'a SparseVec>,

Associative bundle over many vectors: sums contributions per index, then thresholds to sign. This is order-independent because all contributions are accumulated before applying sign. Complexity: O(K log K) where K is total non-zero entries across inputs.
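
A minimal sketch of the sum-then-threshold idea follows. `Ternary` and `bundle_sum` are hypothetical names for illustration, and dropping exact ties to 0 is an assumption; the crate's tie handling is not stated in this page.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Accumulate +1/-1 contributions per index, then keep only the sign.
pub fn bundle_sum<'a, I>(vectors: I) -> Ternary
where
    I: IntoIterator<Item = &'a Ternary>,
{
    let mut sums: BTreeMap<usize, i32> = BTreeMap::new();
    for v in vectors {
        for &i in &v.pos {
            *sums.entry(i).or_insert(0) += 1;
        }
        for &i in &v.neg {
            *sums.entry(i).or_insert(0) -= 1;
        }
    }
    let mut out = Ternary::default();
    for (i, s) in sums {
        if s > 0 {
            out.pos.push(i);
        } else if s < 0 {
            out.neg.push(i);
        }
        // s == 0: contributions cancelled exactly (assumed to map to 0).
    }
    out
}
```

Because every contribution is added before the sign is taken, passing the inputs in any order yields the same result. The ordered map keeps the sketch within the O(K log K) bound quoted above.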


pub fn bundle_hybrid_many<'a, I>(vectors: I) -> SparseVec
where I: IntoIterator<Item = &'a SparseVec>,

Hybrid bundle: choose a fast pairwise fold for very sparse regimes (to preserve sparsity), otherwise use the associative sum-then-threshold path (order-independent, more faithful to majority).

Heuristic: estimate expected overlap/collision count assuming uniform hashing into DIM. If expected colliding dimensions is below a small budget, use pairwise bundle; else use bundle_sum_many.
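
One way such a heuristic could look, as a rough sketch only: estimate the expected number of colliding dimensions by summing the pairwise expectation n_i * n_j / dim over all pairs of inputs, then compare against a budget. The function names, the pairwise-sum estimate, and the `budget` parameter are all assumptions for illustration; the crate's actual estimate and threshold are not given here.

```rust
/// Rough expected number of colliding dimensions when vectors with the given
/// non-zero counts are placed uniformly at random into `dim` dimensions.
/// Sums the pairwise expectation n_i * n_j / dim over all pairs i < j.
pub fn expected_collisions(nnz: &[usize], dim: usize) -> f64 {
    if dim == 0 {
        // Degenerate case: no dimensions, so nothing can collide.
        return 0.0;
    }
    let total: f64 = nnz.iter().map(|&n| n as f64).sum();
    let squares: f64 = nnz.iter().map(|&n| (n as f64) * (n as f64)).sum();
    // sum over i < j of n_i * n_j equals ((sum n)^2 - sum n^2) / 2
    (total * total - squares) / (2.0 * dim as f64)
}

/// Hypothetical decision rule: use the pairwise fold only when few
/// collisions are expected; otherwise fall back to sum-then-threshold.
pub fn prefer_pairwise(nnz: &[usize], dim: usize, budget: f64) -> bool {
    expected_collisions(nnz, dim) < budget
}
```

For two vectors with 100 non-zero entries each in 10,000 dimensions, this estimate is 100 * 100 / 10,000 = 1 expected collision.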


pub fn bind(&self, other: &SparseVec) -> SparseVec

Bind operation: composition (A ⊙ B).

Performs element-wise multiplication, which is commutative on ternary values (A ⊙ B = B ⊙ A). Self-inverse: A ⊙ A ≈ I
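
Element-wise multiplication on the sparse representation can be sketched as follows. `Ternary` and this `bind` function are hypothetical stand-ins, not the crate's implementation: the product is non-zero only where both inputs are non-zero, +1 where the signs agree and -1 where they differ.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Element-wise product of two sparse ternary vectors (assumed semantics).
pub fn bind(a: &Ternary, b: &Ternary) -> Ternary {
    let ap: BTreeSet<usize> = a.pos.iter().copied().collect();
    let an: BTreeSet<usize> = a.neg.iter().copied().collect();
    let bp: BTreeSet<usize> = b.pos.iter().copied().collect();
    let bn: BTreeSet<usize> = b.neg.iter().copied().collect();

    // (+1)(+1) and (-1)(-1) give +1.
    let mut pos: Vec<usize> = ap
        .intersection(&bp)
        .chain(an.intersection(&bn))
        .copied()
        .collect();
    // (+1)(-1) and (-1)(+1) give -1.
    let mut neg: Vec<usize> = ap
        .intersection(&bn)
        .chain(an.intersection(&bp))
        .copied()
        .collect();
    pos.sort_unstable();
    neg.sort_unstable();
    Ternary { pos, neg }
}
```

In this sketch, binding a vector with itself yields +1 at every index of its support and 0 elsewhere, which acts as an identity only on that support. That is one reading of the "≈ I" in the self-inverse property.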

§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let bound = vec.bind(&vec);

// Binding a vector with itself approximates the identity (self-inverse property).
// `identity` here is just another encoded vector used for comparison, not a true
// identity element, so this assertion only checks that cosine stays in [-1, 1].
let identity = SparseVec::encode_data(b"identity", &config, None);
let sim = bound.cosine(&identity);
assert!(sim >= -1.0 && sim <= 1.0);

pub fn cosine(&self, other: &SparseVec) -> f64

Calculate cosine similarity between two sparse vectors.

Returns a value in [-1, 1], where 1 is identical and 0 is orthogonal.

When the simd feature is enabled, this will automatically use AVX2 (x86_64) or NEON (aarch64) acceleration if available.
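
For ternary vectors the cosine reduces to counting: the dot product is the number of indices where the signs agree minus the number where they disagree, and each norm is the square root of the non-zero count. The sketch below shows that scalar calculation on a hypothetical `Ternary` stand-in. It is not the crate's scalar or SIMD implementation, and returning 0.0 for an empty vector is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Number of indices present in both slices (indices assumed unique).
fn overlap(a: &[usize], b: &[usize]) -> usize {
    let set: HashSet<usize> = a.iter().copied().collect();
    b.iter().filter(|&&i| set.contains(&i)).count()
}

/// Cosine similarity of two sparse ternary vectors.
pub fn cosine(a: &Ternary, b: &Ternary) -> f64 {
    let na = a.pos.len() + a.neg.len();
    let nb = b.pos.len() + b.neg.len();
    if na == 0 || nb == 0 {
        // Assumed convention: similarity with an empty vector is 0.
        return 0.0;
    }
    let agree = overlap(&a.pos, &b.pos) + overlap(&a.neg, &b.neg);
    let disagree = overlap(&a.pos, &b.neg) + overlap(&a.neg, &b.pos);
    let dot = agree as f64 - disagree as f64;
    // Each entry is +1 or -1, so the squared norm is the non-zero count.
    dot / ((na * nb) as f64).sqrt()
}
```

A vector compared with itself scores 1, with its sign-flipped copy scores -1, and with a vector on disjoint indices scores 0.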

§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec1 = SparseVec::encode_data(b"cat", &config, None);
let vec2 = SparseVec::encode_data(b"cat", &config, None);
let vec3 = SparseVec::encode_data(b"dog", &config, None);

// Identical data produces identical vectors
assert!((vec1.cosine(&vec2) - 1.0).abs() < 0.01);

// Different data produces low similarity
let sim = vec1.cosine(&vec3);
assert!(sim < 0.3);

pub fn cosine_scalar(&self, other: &SparseVec) -> f64

Scalar (non-SIMD) cosine similarity implementation.

This is the original implementation and serves as the baseline for SIMD optimizations. It’s also used when SIMD is not available.


pub fn permute(&self, shift: usize) -> SparseVec

Apply cyclic permutation to vector indices.

Used for encoding sequence order in hierarchical structures.

§Arguments
  • shift - Number of positions to shift indices cyclically
§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let permuted = vec.permute(100);

// Permuted vector should have different indices but same structure
assert_eq!(vec.pos.len(), permuted.pos.len());
assert_eq!(vec.neg.len(), permuted.neg.len());

pub fn inverse_permute(&self, shift: usize) -> SparseVec

Apply inverse cyclic permutation to vector indices.

Decodes sequence order by reversing the permutation shift.

§Arguments
  • shift - Number of positions to reverse shift indices cyclically
§Examples
use embeddenator_vsa::SparseVec;

let config = embeddenator_vsa::ReversibleVSAConfig::default();
let vec = SparseVec::encode_data(b"test", &config, None);
let permuted = vec.permute(100);
let recovered = permuted.inverse_permute(100);

// Round-trip should recover original vector
assert_eq!(vec.pos, recovered.pos);
assert_eq!(vec.neg, recovered.neg);
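
The permute/inverse_permute pair can be sketched as index arithmetic modulo the dimension. In this sketch `Ternary` is a hypothetical stand-in and `dim` is passed explicitly as an assumption; the crate's methods take only `shift` and presumably use a fixed dimension internally.

```rust
/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Shift every index by `by` positions, wrapping at `dim`; output is sorted.
fn shift_all(indices: &[usize], by: usize, dim: usize) -> Vec<usize> {
    let mut out: Vec<usize> = indices.iter().map(|&i| (i % dim + by) % dim).collect();
    out.sort_unstable();
    out
}

/// Cyclic shift of all indices by `shift` (assumed semantics).
pub fn permute(v: &Ternary, shift: usize, dim: usize) -> Ternary {
    if dim == 0 {
        return v.clone();
    }
    let by = shift % dim;
    Ternary { pos: shift_all(&v.pos, by, dim), neg: shift_all(&v.neg, by, dim) }
}

/// Undo `permute` by shifting the remaining distance around the cycle.
pub fn inverse_permute(v: &Ternary, shift: usize, dim: usize) -> Ternary {
    if dim == 0 {
        return v.clone();
    }
    let by = (dim - shift % dim) % dim;
    Ternary { pos: shift_all(&v.pos, by, dim), neg: shift_all(&v.neg, by, dim) }
}
```

With `dim = 10` and `shift = 3`, indices `pos = [0, 9]`, `neg = [5]` become `pos = [2, 3]`, `neg = [8]`, and the inverse shift restores the original. The counts of positive and negative entries are unchanged, matching the assertions in the examples above.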

pub fn thin(&self, target_non_zero: usize) -> SparseVec

Context-Dependent Thinning Algorithm

Thinning controls vector sparsity during bundle operations to prevent exponential density growth that degrades VSA performance. The algorithm:

  1. Calculate current density = (pos.len() + neg.len()) as f32 / DIM as f32
  2. If current_density <= target_density (the density implied by target_non_zero), return unchanged
  3. Otherwise, randomly sample indices to reduce to target count
  4. Preserve pos/neg ratio to maintain signal polarity balance
  5. Use deterministic seeding for reproducible results

Edge Cases:

  • Empty vector: return unchanged
  • target_non_zero = 0: return empty vector (not recommended)
  • target_non_zero >= current: return clone
  • Single polarity vectors: preserve polarity distribution

Performance: O(n log n) due to sorting, where n = target_non_zero
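
The steps and edge cases above can be sketched as follows. This is an illustration under stated assumptions, not the crate's algorithm: `Ternary` is a hypothetical stand-in, the deterministic "random" choice is made by ranking indices with a splitmix64-style mixer, and the budget is split between positive and negative entries in proportion to their current counts.

```rust
/// Hypothetical stand-in mirroring the two public fields of `SparseVec`.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Ternary {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// splitmix64-style mixer used as a fixed, reproducible ranking key.
fn mix(mut x: u64) -> u64 {
    x = x.wrapping_add(0x9E3779B97F4A7C15);
    x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
    x ^ (x >> 31)
}

/// Keep the `keep` indices with the smallest mixed keys; output is sorted.
fn sample(indices: &[usize], keep: usize) -> Vec<usize> {
    let mut keyed: Vec<(u64, usize)> =
        indices.iter().map(|&i| (mix(i as u64), i)).collect();
    keyed.sort_unstable();
    let mut kept: Vec<usize> = keyed.into_iter().take(keep).map(|(_, i)| i).collect();
    kept.sort_unstable();
    kept
}

/// Reduce the vector to `target_non_zero` entries, preserving the pos/neg ratio.
pub fn thin(v: &Ternary, target_non_zero: usize) -> Ternary {
    let total = v.pos.len() + v.neg.len();
    // Empty vector, or already sparse enough: return a clone.
    if total == 0 || target_non_zero >= total {
        return v.clone();
    }
    // Positive share of the budget, rounded to the nearest integer.
    let keep_pos = ((target_non_zero * v.pos.len() + total / 2) / total).min(v.pos.len());
    let keep_neg = (target_non_zero - keep_pos).min(v.neg.len());
    Ternary { pos: sample(&v.pos, keep_pos), neg: sample(&v.neg, keep_neg) }
}
```

Thinning a vector with 60 positive and 40 negative entries to 10 keeps 6 positive and 4 negative indices, and repeated calls return the same selection. Note that this sketch sorts all current entries, so its cost grows with the current non-zero count rather than with target_non_zero.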

Trait Implementations§

impl Clone for SparseVec

fn clone(&self) -> SparseVec

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for SparseVec

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for SparseVec

fn default() -> Self

Returns the “default value” for a type.

impl<'de> Deserialize<'de> for SparseVec

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for SparseVec

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,