Crate simsimd

§SpatialSimilarity - Hardware-Accelerated Similarity Metrics and Distance Functions

  • Targets Arm NEON and SVE, and x86 AVX2 and AVX-512 (VNNI, FP16) hardware backends.
  • Handles f64 double-precision, f32 single-precision, and f16 half-precision floating-point vectors, as well as i8 integer and binary vectors.
  • Zero-dependency header-only C99 library with bindings for Rust and other languages.

§Implemented distance functions include:

  • Euclidean (L2), Inner Product, and Cosine (Angular) spatial distances.
  • Hamming (~ Manhattan) and Jaccard (~ Tanimoto) binary distances.
  • Kullback-Leibler and Jensen-Shannon divergences for probability distributions.

§Example

use simsimd::SpatialSimilarity;

// Two i8 vectors of equal length
let a: &[i8] = &[1, 2, 3];
let b: &[i8] = &[4, 5, 6];

// Compute cosine similarity; each call returns an Option<Distance>
let cos_sim = i8::cos(a, b);

// Compute the dot product
let dot_product = i8::dot(a, b);

// Compute squared Euclidean distance
let l2sq_dist = i8::l2sq(a, b);

§Traits

The SpatialSimilarity trait covers the following methods:

  • cosine(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the cosine similarity between two slices.
  • dot(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the dot product of two slices.
  • sqeuclidean(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the squared Euclidean distance between two slices.
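For orientation, the three spatial metrics can be sketched as plain scalar Rust. This is not the crate's SIMD implementation: the `*_ref` names are hypothetical, `Distance` is assumed to be `f64`, and returning `None` for mismatched lengths or zero-norm inputs is an assumption of this sketch. Whether the crate reports cosine as a similarity or as the distance `1 - similarity` should be checked against its own documentation.

```rust
/// Hypothetical stand-in for the crate's `Distance` type (assumed to be `f64`).
type Distance = f64;

/// Scalar dot product; `None` on mismatched lengths (an assumption of this sketch).
fn dot_ref(a: &[f32], b: &[f32]) -> Option<Distance> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(&x, &y)| x as f64 * y as f64).sum())
}

/// Scalar squared Euclidean distance: sum of squared element-wise differences.
fn sqeuclidean_ref(a: &[f32], b: &[f32]) -> Option<Distance> {
    if a.len() != b.len() {
        return None;
    }
    Some(
        a.iter()
            .zip(b)
            .map(|(&x, &y)| {
                let d = x as f64 - y as f64;
                d * d
            })
            .sum(),
    )
}

/// Scalar cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns `None` when either vector has zero norm (a choice made for this sketch).
fn cosine_similarity_ref(a: &[f32], b: &[f32]) -> Option<Distance> {
    let ab = dot_ref(a, b)?;
    let aa = dot_ref(a, a)?;
    let bb = dot_ref(b, b)?;
    if aa == 0.0 || bb == 0.0 {
        return None;
    }
    Some(ab / (aa.sqrt() * bb.sqrt()))
}
```

A scalar reference like this is mainly useful for checking accelerated results in tests, within a tolerance suited to the element type.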

The BinarySimilarity trait covers the following methods:

  • hamming(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the Hamming distance between two slices.
  • jaccard(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the Jaccard index between two slices.
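As a rough illustration of the two binary metrics, the sketch below assumes the vectors are bit-packed into `u8` bytes, which is an assumption about the layout rather than something stated here. The function names are hypothetical. The Jaccard function returns the index (intersection over union); the corresponding distance would be `1 - index`.

```rust
/// Hamming distance: number of bit positions in which the two vectors differ.
/// Assumes bit-packed `u8` input; `None` on mismatched lengths.
fn hamming_ref(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    let differing: u32 = a.iter().zip(b).map(|(&x, &y)| (x ^ y).count_ones()).sum();
    Some(differing as f64)
}

/// Jaccard index: |A AND B| / |A OR B| over the set bits.
/// Two all-zero vectors are treated as identical (index 1.0), a convention of this sketch.
fn jaccard_index_ref(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    let mut intersection_bits = 0u32;
    let mut union_bits = 0u32;
    for (&x, &y) in a.iter().zip(b) {
        intersection_bits += (x & y).count_ones();
        union_bits += (x | y).count_ones();
    }
    if union_bits == 0 {
        return Some(1.0);
    }
    Some(intersection_bits as f64 / union_bits as f64)
}
```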

The ProbabilitySimilarity trait covers the following methods:

  • jensenshannon(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the Jensen-Shannon divergence between two slices.
  • kullbackleibler(a: &[Self], b: &[Self]) -> Option<Distance>: Computes the Kullback-Leibler divergence between two slices.
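The textbook definitions of both divergences can be sketched in scalar Rust as follows. The function names are hypothetical, the natural logarithm is used, and terms with a zero probability in the first argument contribute nothing. The crate may differ in logarithm base, epsilon handling, or in returning a square root of the Jensen-Shannon value, so treat this as a reference for the math only.

```rust
/// Kullback-Leibler divergence KL(P || Q) = sum_i p_i * ln(p_i / q_i).
/// Terms with p_i == 0 are skipped; `None` on mismatched lengths.
fn kullbackleibler_ref(p: &[f64], q: &[f64]) -> Option<f64> {
    if p.len() != q.len() {
        return None;
    }
    let mut sum = 0.0;
    for (&pi, &qi) in p.iter().zip(q) {
        if pi > 0.0 {
            sum += pi * (pi / qi).ln();
        }
    }
    Some(sum)
}

/// Jensen-Shannon divergence: 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
fn jensenshannon_ref(p: &[f64], q: &[f64]) -> Option<f64> {
    if p.len() != q.len() {
        return None;
    }
    // Midpoint distribution between P and Q.
    let m: Vec<f64> = p.iter().zip(q).map(|(&pi, &qi)| 0.5 * (pi + qi)).collect();
    Some(0.5 * kullbackleibler_ref(p, &m)? + 0.5 * kullbackleibler_ref(q, &m)?)
}
```

Unlike Kullback-Leibler, the Jensen-Shannon divergence is symmetric and, with the natural logarithm, bounded above by ln 2.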

Modules§

  • The capabilities module provides functions for detecting the hardware features available on the current system.
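To give a sense of what runtime hardware detection involves, the sketch below uses only the Rust standard library's feature-detection macros. It does not call the capabilities module, whose function names are not listed here; `detected_simd_features` is a hypothetical helper.

```rust
/// Hypothetical helper: names of a few SIMD extensions detected on the current CPU.
/// Uses the standard library's runtime detection, not the crate's capabilities module.
fn detected_simd_features() -> Vec<&'static str> {
    let mut features = Vec::new();

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            features.push("avx2");
        }
        if std::arch::is_x86_feature_detected!("avx512f") {
            features.push("avx512f");
        }
    }

    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            features.push("neon");
        }
        if std::arch::is_aarch64_feature_detected!("sve") {
            features.push("sve");
        }
    }

    features
}
```

On architectures other than x86 and AArch64 the function returns an empty list, which corresponds to falling back to a portable scalar code path.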

Structs§

  • A 16-bit brain floating-point number (bfloat16).
  • An IEEE 754 half-precision (16-bit) floating-point number.
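The brain float format keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an f32, so a conversion can be sketched with bit shifts alone. The helper names below are hypothetical and say nothing about how the crate's structs are laid out or constructed; the narrowing conversion truncates rather than rounds.

```rust
/// Narrows an `f32` to bfloat16 bits by keeping the upper 16 bits (truncation, no rounding).
fn f32_to_bf16_bits(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Widens bfloat16 bits back to `f32` by zero-filling the lower 16 mantissa bits.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}
```

Values such as 1.0 survive the round trip exactly, while a value like π loses its low mantissa bits and comes back as 3.140625.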

Traits§

  • BinarySimilarity provides trait methods for computing similarity metrics that are commonly used with binary data vectors, such as Hamming distance and Jaccard index.
  • ComplexProducts provides trait methods for computing products between complex number vectors. This includes standard and Hermitian dot products.
  • ProbabilitySimilarity provides trait methods for computing similarity or divergence measures between probability distributions, such as the Jensen-Shannon divergence and the Kullback-Leibler divergence.
  • SpatialSimilarity provides a set of trait methods for computing similarity or distance between spatial data vectors in a SIMD (Single Instruction, Multiple Data) context. These methods can be used to calculate metrics such as cosine similarity, dot product, and squared Euclidean distance between two slices of data.
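The difference between the standard and Hermitian dot products mentioned for ComplexProducts can be illustrated with a scalar sketch. It assumes complex vectors stored as interleaved `[re, im, re, im, ...]` slices and a `(re, im)` tuple result, and it conjugates the first argument in the Hermitian case. All of these are assumptions of this sketch, as are the function names.

```rust
/// Standard complex dot product: sum_k a_k * b_k.
/// Input is assumed interleaved as [re, im, re, im, ...]; `None` on bad lengths.
fn complex_dot_ref(a: &[f64], b: &[f64]) -> Option<(f64, f64)> {
    if a.len() != b.len() || a.len() % 2 != 0 {
        return None;
    }
    let (mut re, mut im) = (0.0, 0.0);
    for (x, y) in a.chunks_exact(2).zip(b.chunks_exact(2)) {
        // (xr + i*xi) * (yr + i*yi)
        re += x[0] * y[0] - x[1] * y[1];
        im += x[0] * y[1] + x[1] * y[0];
    }
    Some((re, im))
}

/// Hermitian dot product: sum_k conj(a_k) * b_k (first argument conjugated).
fn complex_vdot_ref(a: &[f64], b: &[f64]) -> Option<(f64, f64)> {
    if a.len() != b.len() || a.len() % 2 != 0 {
        return None;
    }
    let (mut re, mut im) = (0.0, 0.0);
    for (x, y) in a.chunks_exact(2).zip(b.chunks_exact(2)) {
        // (xr - i*xi) * (yr + i*yi)
        re += x[0] * y[0] + x[1] * y[1];
        im += x[0] * y[1] - x[1] * y[0];
    }
    Some((re, im))
}
```

For a = (1+2i, 3+4i) and b = (5+6i, 7+8i), the standard product is -18+68i and the Hermitian product is 70-8i.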