Crate cfavml

§CFAVML

CF’s Accelerated Vector Math Library

Various accelerated vector operations over Rust primitives with SIMD.

This is the core base library. It has no dependencies beyond Rust's core library and does not perform any allocations.

This library is guaranteed to be no-std compatible; to build without the standard library, disable the std feature flag:

§Default Setup

cfavml = "0.3.0"

§No-std Setup

cfavml = { version = "0.3.0", default-features = false }

§Important Version Upgrade Notes

If you are upgrading across a breaking release, e.g. 0.2.0 to 0.3.0, there may be some important changes that affect your system, although I try my best to avoid breaking the public safe APIs.

  • AVX512 required CPU features changed in 0.3.0+
    • In versions older than 0.3.0, AVX512 was used when only the avx512f CPU feature was available, since this is the base/foundation version of AVX512. However, 0.3.0 introduced more extensive cmp operations (eq/neq/lt/lte/gt/gte), which changed the required CPU features to include avx512bw.
    • This means that on unsafe APIs you must update your feature checks to include avx512bw.
    • Safe APIs do not require changes but may fall back to AVX2 on CPUs that implement avx512f without avx512bw, e.g. Knights Landing (Xeon Phi).
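For callers of the unsafe APIs, the updated check can be sketched with the standard library's runtime detection macro. This is only an illustration of testing for both features together; the function name `avx512_features_available` is hypothetical and is not part of cfavml.

```rust
/// Returns true only when BOTH avx512f and avx512bw are detected at runtime.
/// Hypothetical helper for illustration; not a cfavml function.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn avx512_features_available() -> bool {
    std::is_x86_feature_detected!("avx512f")
        && std::is_x86_feature_detected!("avx512bw")
}

/// On non-x86 targets neither feature can be present.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn avx512_features_available() -> bool {
    false
}
```

A check that only looks at avx512f would have been enough before 0.3.0, but would now admit CPUs that the AVX512 routines cannot run on.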

§Available SIMD Architectures

  • AVX2
  • AVX2 + FMA
  • AVX512 (avx512f + avx512bw), nightly only
  • NEON
  • Fallback (Typically optimized to SSE automatically by LLVM on x86)

§Supported Primitives

  • f32
  • f64
  • i8
  • i16
  • i32
  • i64
  • u8
  • u16
  • u32
  • u64

§Note on non-f32/f64 division

Division operations on non-floating-point primitives are currently still scalar operations, as integer division is incredibly hard to do any more efficiently with SIMD and adds a significant amount of cognitive overhead when reading the code.

Although to be honest I have some serious questions about your application if you’re doing heavy integer division…

§Supported Operations

§Spatial distances

These are routines that can be used for things like KNN classification or index building.

  • Dot product of two vectors
  • Cosine distance of two vectors
  • Squared Euclidean distance of two vectors
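As a reference for what these three routines compute, here is a plain scalar sketch. It does not call cfavml; the `_ref` function names are hypothetical. Two things are assumptions of the sketch rather than confirmed library behavior: cosine distance is taken as 1 minus cosine similarity, and 1.0 is returned when either vector has zero norm.

```rust
/// Dot product: the sum of element-wise products.
pub fn dot_ref(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Squared Euclidean distance: the sum of squared element-wise differences.
/// No square root is taken.
pub fn squared_euclidean_ref(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = x - y;
            d * d
        })
        .sum()
}

/// Cosine distance, ASSUMED here to be 1 - cosine similarity.
pub fn cosine_distance_ref(a: &[f32], b: &[f32]) -> f32 {
    let ab = dot_ref(a, b);
    let aa = dot_ref(a, a); // squared norm of a
    let bb = dot_ref(b, b); // squared norm of b
    if aa == 0.0 || bb == 0.0 {
        // Arbitrary choice for this sketch: a zero vector has no direction.
        return 1.0;
    }
    1.0 - ab / (aa.sqrt() * bb.sqrt())
}
```

Skipping the square root in the Euclidean case preserves the ordering of distances, which is all that KNN-style ranking needs.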

§Arithmetic

  • Add single value to vector
  • Sub single value from vector
  • Mul vector by single value
  • Div vector by single value
  • Add two vectors vertically
  • Sub two vectors vertically
  • Mul two vectors vertically
  • Div two vectors vertically
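The difference between the "single value" and "vertical" forms is easiest to see in scalar code. The sketch below is not cfavml's API: the `_ref` names and the argument order are hypothetical, and only addition is shown since the other operators follow the same shape.

```rust
/// Broadcast form: the same `value` is added to every element of `a`.
pub fn add_value_ref(value: f32, a: &[f32], result: &mut [f32]) {
    assert_eq!(a.len(), result.len(), "output must match input length");
    for (out, x) in result.iter_mut().zip(a) {
        *out = *x + value;
    }
}

/// Vertical form: element i of `a` is added to element i of `b`.
pub fn add_vector_ref(a: &[f32], b: &[f32], result: &mut [f32]) {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    assert_eq!(a.len(), result.len(), "output must match input length");
    for ((out, x), y) in result.iter_mut().zip(a).zip(b) {
        *out = *x + *y;
    }
}
```

Both forms write into a caller-provided buffer, so the sketch itself performs no allocation.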

§Comparison

  • Horizontal max element in a vector
  • Horizontal min element in a vector
  • Vertical max element of two vectors
  • Vertical min element of two vectors
  • Vertical max element of a vector and broadcast value
  • Vertical min element of a vector and broadcast value
  • EQ/NEQ/LT/LTE/GT/GTE cmp of a vector and broadcast value
  • EQ/NEQ/LT/LTE/GT/GTE cmp of two vectors
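A scalar sketch of the three shapes in this list: a horizontal reduction, a vertical max, and a comparison that produces a mask. The `_ref` names and argument order are hypothetical. Encoding the mask as 1 for true and 0 for false is an assumption of this sketch; the source only states that the mask has the same element type as the input.

```rust
/// Horizontal max: reduces one vector to a single element.
/// Returns None for an empty slice.
pub fn max_ref(a: &[i32]) -> Option<i32> {
    a.iter().copied().max()
}

/// Vertical max: element i of the output is the larger of a[i] and b[i].
pub fn max_vector_ref(a: &[i32], b: &[i32], result: &mut [i32]) {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    assert_eq!(a.len(), result.len(), "output must match input length");
    for ((out, x), y) in result.iter_mut().zip(a).zip(b) {
        *out = (*x).max(*y);
    }
}

/// Less-than against a broadcast value, producing a mask of the same type.
/// ASSUMPTION: the mask uses 1 for true and 0 for false.
pub fn lt_value_ref(value: i32, a: &[i32], result: &mut [i32]) {
    assert_eq!(a.len(), result.len(), "output must match input length");
    for (out, x) in result.iter_mut().zip(a) {
        *out = if *x < value { 1 } else { 0 };
    }
}
```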

§Aggregation

  • Horizontal sum of a vector

§Misc

  • Squared L2 norm of a vector
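For reference, the horizontal sum and the squared L2 norm reduce to the following scalar definitions. This is a sketch with hypothetical `_ref` names, not the library's implementation.

```rust
/// Horizontal sum: adds up all elements of one vector.
pub fn sum_ref(a: &[f32]) -> f32 {
    a.iter().sum()
}

/// Squared L2 norm: the sum of each element squared, which is the same
/// as the dot product of the vector with itself.
pub fn squared_norm_ref(a: &[f32]) -> f32 {
    a.iter().map(|x| x * x).sum()
}
```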

§Dangerous routine naming convention

If you’ve looked at the danger folder at all, you’ll notice a few things. First, SIMD operations are gated behind the SimdRegister<T> trait, which provides a generic abstraction over the various SIMD register types and architectures.

This trait, combined with the Math<T> trait, forms the core of all operations, which are provided as generic functions (with no target features):

  • generic_dot
  • generic_squared_euclidean
  • generic_cosine
  • generic_squared_norm
  • generic_cmp_max
  • generic_cmp_max_vector
  • generic_cmp_max_value
  • generic_cmp_min
  • generic_cmp_min_vector
  • generic_cmp_min_value
  • generic_cmp_eq_vector
  • generic_cmp_eq_value
  • generic_cmp_neq_vector
  • generic_cmp_neq_value
  • generic_cmp_lt_vector
  • generic_cmp_lt_value
  • generic_cmp_lte_vector
  • generic_cmp_lte_value
  • generic_cmp_gt_vector
  • generic_cmp_gt_value
  • generic_cmp_gte_vector
  • generic_cmp_gte_value
  • generic_sum
  • generic_add_value
  • generic_sub_value
  • generic_mul_value
  • generic_div_value
  • generic_add_vector
  • generic_sub_vector
  • generic_mul_vector
  • generic_div_vector

We also export functions with the target_features pre-specified for each SIMD register type; these are found under the cfavml::danger::export_* modules. It is not recommended to use these routines directly unless you know what you are doing.
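The general pattern of a generic routine with no target features, wrapped by a function that pre-specifies them, can be sketched in plain Rust as follows. This is a minimal illustration of the technique only: every name here is hypothetical, the body is a scalar stand-in rather than a SimdRegister<T> implementation, and the crate's real exports may be structured differently.

```rust
/// Generic routine with no target features attached. It is forced inline
/// so that it is compiled with the features of whichever wrapper calls it.
#[inline(always)]
fn generic_sum_sketch(a: &[f32]) -> f32 {
    a.iter().sum()
}

/// Wrapper with the target feature pre-specified. The compiler is allowed
/// to use AVX2 instructions in the inlined body. Calling this on a CPU
/// without AVX2 is undefined behavior, hence `unsafe`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn avx2_sum_sketch(a: &[f32]) -> f32 {
    generic_sum_sketch(a)
}

/// Safe entry point: checks the CPU at runtime, then picks a wrapper.
pub fn sum_sketch(a: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            return unsafe { avx2_sum_sketch(a) };
        }
    }
    // Fallback path for CPUs or architectures without AVX2.
    generic_sum_sketch(a)
}
```

Calling a pre-specified wrapper directly skips the runtime check, which is why doing so is only sound when the caller has already verified the required CPU features.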

§Features

  • nightly: Enables optimizations available only on nightly toolchains.
    • This is required for AVX512 support, as it is currently unstable.

§Is this a replacement for BLAS?

No. At least, not unless you’re only doing dot products… BLAS and LAPACK are huge, and I am certainly not in the market for implementing all BLAS routines in Rust. That being said, if your application, like ndarray, only uses BLAS for the dot product, then maybe.

Modules§

buffer
The core buffer trait used to represent writeable buffers.
danger
dispatch
math
safe_trait_agg_ops
Safe but somewhat low-level variants of the aggregation operations in CFAVML.
safe_trait_arithmetic_ops
Safe but somewhat low-level variants of the arithmetic operations in CFAVML.
safe_trait_cmp_ops
Safe but somewhat low-level variants of the comparison operations in CFAVML.
safe_trait_distance_ops
Safe but somewhat low-level variants of the distance operations in CFAVML.

Macros§

dispatch
Dispatches a set of functions based on the available CPU features.

Functions§

add_value
Performs an element wise addition of each element of vector a and the provided broadcast value, writing the result to result.
add_vector
Performs an element wise addition of each element pair of vector a and b, writing the result to result.
cosine
Calculates the cosine distance of vectors a and b.
div_value
Performs an element wise division of each element of vector a and the provided broadcast value, writing the result to result.
div_vector
Performs an element wise division on each element pair from vectors a and b, writing the result to result.
dot
Calculates the dot product of vectors a and b.
eq_value
Checks each element within vector a of size dims against a provided broadcast value comparing if they are equal returning a mask vector of the same type.
eq_vector
Checks each element pair from vectors a and b of size dims comparing if element a is equal to element b returning a mask vector of the same type.
gt_value
Checks each element within vector a of size dims against a provided broadcast value comparing if they are greater than returning a mask vector of the same type.
gt_vector
Checks each element pair from vectors a and b of size dims comparing if element a is greater than element b returning a mask vector of the same type.
gte_value
Checks each element within vector a of size dims against a provided broadcast value comparing if they are greater than or equal returning a mask vector of the same type.
gte_vector
Checks each element pair from vectors a and b of size dims comparing if element a is greater than or equal to element b returning a mask vector of the same type.
lt_value
Checks each element within vector a of size dims against a provided broadcast value comparing if they are less than returning a mask vector of the same type.
lt_vector
Checks each element pair from vectors a and b of size dims comparing if element a is less than element b returning a mask vector of the same type.
lte_value
Checks each element within vector a of size dims against a provided broadcast value comparing if they are less than or equal returning a mask vector of the same type.
lte_vector
Checks each element pair from vectors a and b of size dims comparing if element a is less than or equal to element b returning a mask vector of the same type.
max
Finds the horizontal max element of a given vector and returns the result.
max_value
Performs an element wise max on each element of vector a and the provided broadcast value, writing the result to result.
max_vector
Performs an element wise max on each element pair from vectors a and b, writing the result to result.
min
Finds the horizontal min element of a given vector and returns the result.
min_value
Performs an element wise min on each element of vector a and the provided broadcast value, writing the result to result.
min_vector
Performs an element wise min on each element pair from vectors a and b, writing the result to result.
mul_value
Performs an element wise multiplication of each element of vector a and the provided broadcast value, writing the result to result.
mul_vector
Performs an element wise multiplication of each element pair from vectors a and b, writing the result to result.
neq_value
Checks each element within vector a of size dims against a provided broadcast value comparing if they are not equal returning a mask vector of the same type.
neq_vector
Checks each element pair from vectors a and b of size dims comparing if element a is not equal to element b returning a mask vector of the same type.
squared_euclidean
Calculates the squared Euclidean distance of vectors a and b.
squared_norm
Calculates the squared L2 norm of vector a.
sub_value
Performs an element wise subtraction of each element of vector a and the provided broadcast value, writing the result to result.
sub_vector
Performs an element wise subtraction of each element pair from vectors a and b, writing the result to result.
sum
Performs a horizontal sum of all elements in a returning the result.