§NumKong - Hardware-Accelerated Numerics
Provides SIMD-accelerated distance metrics, elementwise operations, and tensor algebra targeting ARM NEON/SVE/SME and x86 AVX2/AVX-512 backends.
§Modules
- types: Mixed-precision scalar types (f16, bf16, FP8, packed integers) and the FloatLike trait
- spatial: Dot products, angular (cosine), and Euclidean distances
- each: Elementwise operations and trigonometry
- reduce: Statistical reductions (moments, min/max)
- set: Binary set similarity (Hamming, Jaccard)
- probability: Probability divergences (KL, JS)
- curved: Curved metric spaces (Bilinear, Mahalanobis)
- mesh: Mesh alignment (Kabsch, Umeyama, RMSD)
- geospatial: Geospatial distances (Haversine, Vincenty)
- sparse: Sparse set operations
- cast: Type casting between scalar formats
- capabilities: Runtime SIMD feature detection
- matrix: Batch matrix operations (GEMM, packed spatial distances)
- tensor: N-dimensional tensors with elementwise/reduction operations
- vector: Owning and non-owning vector types
- maxsim: MaxSim (late-interaction) scoring
§Implemented Operations
- Euclidean (L2), inner product, and angular (cosine) spatial distances.
- Hamming and Jaccard binary distances.
- Kullback-Leibler divergence and Jensen-Shannon distance.
- Elementwise scale, sum, blend, and FMA operations.
- Trigonometric functions (sin, cos, atan).
- Type casting between all scalar formats.
- Matrix multiplication with pre-packing (GEMM).
§Example
```rust
use numkong::{Dot, Angular, Euclidean};

let a = &[1.0_f32, 2.0, 3.0];
let b = &[4.0_f32, 5.0, 6.0];

let dot_product = f32::dot(a, b);
let angular_dist = f32::angular(a, b);
let l2sq_dist = f32::sqeuclidean(a, b);

// Enable AMX and other platform-specific SIMD features
numkong::capabilities::configure_thread();
```

§Mixed Precision Support
```rust
use numkong::{Angular, f16, bf16};

// Work with half-precision floats
let half_a: Vec<f16> = vec![1.0, 2.0, 3.0].iter().map(|&x| f16::from_f32(x)).collect();
let half_b: Vec<f16> = vec![4.0, 5.0, 6.0].iter().map(|&x| f16::from_f32(x)).collect();
let half_angular_dist = f16::angular(&half_a, &half_b);

// Work with brain floats
let brain_a: Vec<bf16> = vec![1.0, 2.0, 3.0].iter().map(|&x| bf16::from_f32(x)).collect();
let brain_b: Vec<bf16> = vec![4.0, 5.0, 6.0].iter().map(|&x| bf16::from_f32(x)).collect();
let brain_angular_dist = bf16::angular(&brain_a, &brain_b);

// Direct bit manipulation
let half = f16::from_f32(3.14);
let bits = half.0; // Access the raw u16 representation
let reconstructed = f16(bits);
```

§Traits
The SpatialSimilarity trait (combining Dot, Angular, and Euclidean) covers:

- dot(a, b): Computes the dot product of two slices.
- angular(a, b) / cosine(a, b): Computes the angular distance (1 − cosine similarity).
- sqeuclidean(a, b): Computes the squared Euclidean distance.
- euclidean(a, b): Computes the Euclidean distance.
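The quantities these methods return can be pinned down with a plain scalar reference. The sketch below is independent of NumKong (the free functions only mirror the trait method names); the crate's SIMD kernels compute the same definitions, only faster and for more scalar types:

```rust
// Scalar reference for the spatial metrics: dot product,
// squared Euclidean distance, and angular distance.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn sqeuclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

fn angular(a: &[f32], b: &[f32]) -> f32 {
    // Angular distance = 1 − cosine similarity.
    1.0 - dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

fn main() {
    let a = [1.0_f32, 2.0, 3.0];
    let b = [4.0_f32, 5.0, 6.0];
    assert_eq!(dot(&a, &b), 32.0);
    assert_eq!(sqeuclidean(&a, &b), 27.0);
    assert!(angular(&a, &b) < 0.03); // nearly parallel vectors
}
```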
The BinarySimilarity trait (combining Hamming and Jaccard) covers:

- hamming(a, b): Computes the Hamming distance between two slices.
- jaccard(a, b): Computes the Jaccard distance between two slices.
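For binary metrics, the inputs are bit-packed, so every byte carries eight dimensions. A standalone scalar sketch of the two definitions (not the crate's API, just the math it implements):

```rust
// Hamming distance: number of differing bits.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

// Jaccard distance: 1 − |A ∩ B| / |A ∪ B| over the set bits.
fn jaccard(a: &[u8], b: &[u8]) -> f32 {
    let intersection: u32 = a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum();
    let union: u32 = a.iter().zip(b).map(|(x, y)| (x | y).count_ones()).sum();
    if union == 0 { 0.0 } else { 1.0 - intersection as f32 / union as f32 }
}

fn main() {
    let a = [0b1010_1010_u8, 0b1111_0000];
    let b = [0b1010_0000_u8, 0b1111_1111];
    assert_eq!(hamming(&a, &b), 6);
    assert_eq!(jaccard(&a, &b), 0.5); // 6 shared bits out of 12 set
}
```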
The ProbabilitySimilarity trait (combining KullbackLeibler and JensenShannon) covers:

- jensenshannon(a, b): Computes the Jensen-Shannon distance.
- kullbackleibler(a, b): Computes the Kullback-Leibler divergence.
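Both measures operate on discrete probability distributions. A scalar reference for the definitions, independent of the crate (note that JS is a proper distance: the square root of the JS divergence, the symmetrized KL against the mixture m = (p + q) / 2):

```rust
// Kullback-Leibler divergence D(p ‖ q) for discrete distributions.
fn kullbackleibler(p: &[f64], q: &[f64]) -> f64 {
    p.iter()
        .zip(q)
        .map(|(&pi, &qi)| if pi > 0.0 { pi * (pi / qi).ln() } else { 0.0 })
        .sum()
}

// Jensen-Shannon distance: sqrt of the JS divergence,
// i.e. the symmetrized KL against the mixture m = (p + q) / 2.
fn jensenshannon(p: &[f64], q: &[f64]) -> f64 {
    let m: Vec<f64> = p.iter().zip(q).map(|(&pi, &qi)| 0.5 * (pi + qi)).collect();
    (0.5 * kullbackleibler(p, &m) + 0.5 * kullbackleibler(q, &m)).sqrt()
}

fn main() {
    let p = [0.5, 0.5];
    let q = [0.9, 0.1];
    assert!(kullbackleibler(&p, &q) > 0.0); // KL is asymmetric and non-negative
    assert_eq!(jensenshannon(&p, &p), 0.0); // JS of a distribution with itself is zero
}
```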
The elementwise traits (including EachScale, EachSum, EachBlend, and EachFMA) cover:

- scale(a, alpha, beta, result): Element-wise result[i] = α × a[i] + β.
- sum(a, b, result): Element-wise result[i] = a[i] + b[i].
- blend(a, b, alpha, beta, result): Blend result[i] = α × a[i] + β × b[i].
- fma(a, b, c, alpha, beta, result): Fused multiply-add result[i] = α × a[i] × b[i] + β × c[i].
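The output-parameter convention is the same throughout: the caller supplies a result slice that the kernel fills. A scalar sketch of two of these formulas (free functions for illustration only, mirroring the trait method names):

```rust
// Element-wise blend: result[i] = α × a[i] + β × b[i].
fn blend(a: &[f32], b: &[f32], alpha: f32, beta: f32, result: &mut [f32]) {
    for i in 0..result.len() {
        result[i] = alpha * a[i] + beta * b[i];
    }
}

// Fused multiply-add: result[i] = α × a[i] × b[i] + β × c[i].
fn fma(a: &[f32], b: &[f32], c: &[f32], alpha: f32, beta: f32, result: &mut [f32]) {
    for i in 0..result.len() {
        result[i] = alpha * a[i] * b[i] + beta * c[i];
    }
}

fn main() {
    let (a, b, c) = ([1.0_f32, 2.0], [3.0_f32, 4.0], [5.0_f32, 6.0]);
    let mut out = [0.0_f32; 2];
    blend(&a, &b, 2.0, 1.0, &mut out);
    assert_eq!(out, [5.0, 8.0]);
    fma(&a, &b, &c, 1.0, 2.0, &mut out);
    assert_eq!(out, [13.0, 20.0]);
}
```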
The Trigonometry trait (combining EachSin, EachCos, and EachATan) covers:

- sin(input, result): Element-wise sine.
- cos(input, result): Element-wise cosine.
- atan(input, result): Element-wise arctangent.
Additional traits: VDot, Roots, SparseIntersect, SparseDot.
§Re-exports
- pub use types::{bf16, bf16c, e2m3, e3m2, e4m3, e5m2, f16, f16c, f32c, f64c, i4x2, is_close, u1x8, u4x2, DimMut, DimRef, FloatConvertible, FloatLike, NumberLike, StorageElement};
- pub use spatial::{Angular, Dot, Euclidean, Roots, SpatialSimilarity, VDot};
- pub use set::{BinarySimilarity, Hamming, Jaccard};
- pub use probability::{JensenShannon, KullbackLeibler, ProbabilitySimilarity};
- pub use each::{EachATan, EachBlend, EachCos, EachFMA, EachScale, EachSin, EachSum, Trigonometry};
- pub use reduce::{ReduceMinMax, ReduceMoments, Reductions};
- pub use curved::{Bilinear, Mahalanobis};
- pub use mesh::{MeshAlignment, MeshAlignmentResult};
- pub use geospatial::{Geospatial, Haversine, Vincenty};
- pub use sparse::{SparseDot, SparseIntersect};
- pub use cast::{cast, CastDtype};
- pub use capabilities::{available, cap, configure_thread, uses_dynamic_dispatch};
- pub use tensor::{AllCloseOps, Allocator, AxisIterator, AxisIteratorMut, BlendOps, CastOps, FmaOps, Global, Matrix, MatrixSpan, MatrixView, MinMaxOps, MinMaxResult, MomentsOps, RangeStep, ScaleOps, SliceArg, SliceRange, SliceSpec, SumOps, Tensor, TensorDims, TensorError, TensorIterator, TensorMut, TensorRef, TensorSpan, TensorSpanDims, TensorSpanIterator, TensorView, TensorViewDims, TensorViewIterator, TrigAtanOps, TrigCosOps, TrigSinOps, DEFAULT_MAX_RANK, SIMD_ALIGNMENT};
- pub use matrix::{Angulars, Dots, Euclideans, Hammings, Jaccards, PackedMatrix, SymmetricAngulars, SymmetricDots, SymmetricEuclideans, SymmetricHammings, SymmetricJaccards};
- pub use vector::{Vector, VectorIndex, VectorIterator, VectorSpan, VectorSpanIterator, VectorView, VectorViewIterator};
- pub use maxsim::{MaxSim, MaxSimPackedMatrix};
§Modules

- capabilities: Runtime CPU capability detection.
- cast: Type casting between scalar formats.
- curved: Curved metric spaces: Bilinear forms and Mahalanobis distance.
- each: Elementwise operations and trigonometry.
- geospatial: Geospatial distance functions: Haversine and Vincenty.
- matrix: Batch matrix operations: GEMM, packed spatial distances.
- maxsim: MaxSim (ColBERT late-interaction) scoring with pre-packed matrices.
- mesh: Mesh superposition and alignment: Kabsch, Umeyama, RMSD.
- probability: Probability measures: Kullback-Leibler divergence and Jensen-Shannon distance.
- reduce: Statistical reductions: moments (sum/sum-of-squares) and min/max.
- set: Binary set similarity: Hamming and Jaccard distances.
- sparse: Sparse set intersection and weighted dot products.
- spatial: Spatial similarity: dot products, angular (cosine), and Euclidean distances.
- tensor: Core N-dimensional tensor types with elementwise, trigonometric, reduction, and cast operations.
- types: Scalar types and conversion trait for mixed-precision computing.
- vector: Owning and non-owning vector types with signed indexing and sub-byte support.