Crate simd_aligned

NOTE - Do not use this crate for now. It has been reactivated to make FFSVM compile again, but needs some architectural work.

In One Sentence

You want to use std::simd but realized there is no simple, safe and fast way to align your f32x8 (and friends) in memory and treat them as regular f32 slices for easy loading and manipulation; simd_aligned to the rescue.

Highlights

  • built on top of std::simd for easy data handling
  • supports everything from u8x2 to f64x8
  • think in flat slices (&[f32]), but get performance of properly aligned SIMD vectors (&[f32x16])
  • defines u8s, …, f64s as “best guess” for current platform (WIP)
  • provides N-dimensional VectorD and NxM-dimensional MatrixD.

Note: Right now this is an experimental crate. Features might be added or removed depending on how std::simd evolves. At the end of the day it’s just about being able to load and manipulate data without much fuss.

Examples

Produces a vector that can hold 10 elements of type f64. Might internally allocate 5 elements of type f64x2, or 3 of type f64x4, depending on the platform. All elements are guaranteed to be properly aligned for fast access.

#![feature(portable_simd)]
use std::simd::*;
use simd_aligned::*;

// Create vectors of `10` f64 elements with value `0.0`.
let mut v1 = VectorD::<f64s>::with(0.0, 10);
let mut v2 = VectorD::<f64s>::with(0.0, 10);

// Get "flat", mutable view of the vector, and set individual elements:
let v1_m = v1.flat_mut();
let v2_m = v2.flat_mut();

// Set some elements on v1
v1_m[0] = 0.0;
v1_m[4] = 4.0;
v1_m[8] = 8.0;

// Set some others on v2
v2_m[1] = 0.0;
v2_m[5] = 5.0;
v2_m[9] = 9.0;

// Eventually, do something with the actual SIMD types. This does
// `std::simd` vector math, e.g., f64x8 + f64x8 in one operation:
let sum: f64s = v1[0] + v2[0];
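
The "5 elements of type f64x2, or 3 of type f64x4" figures above follow from rounding the element count up to whole SIMD chunks. A minimal sketch of that arithmetic on stable Rust; the helper names `chunks_needed` and `padding_lanes` are hypothetical and not part of simd_aligned:

```rust
/// Number of SIMD chunks needed to hold `len` scalars when each chunk
/// has `lanes` lanes. Rounds up, so the last chunk may be partly unused.
fn chunks_needed(len: usize, lanes: usize) -> usize {
    assert!(lanes > 0, "a SIMD vector has at least one lane");
    (len + lanes - 1) / lanes
}

/// Number of unused (padding) lanes in the last chunk.
fn padding_lanes(len: usize, lanes: usize) -> usize {
    chunks_needed(len, lanes) * lanes - len
}
```

With 10 elements, two lanes give 5 chunks and no padding, while four lanes give 3 chunks with 2 padding lanes at the end.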

Benchmarks

Using simd_aligned carries no performance penalty, and you retain all the simplicity of handling flat arrays.

test vectors::packed       ... bench:          77 ns/iter (+/- 4)
test vectors::scalar       ... bench:       1,177 ns/iter (+/- 464)
test vectors::simd_aligned ... bench:          71 ns/iter (+/- 5)

FAQ

How does it relate to faster and std::simd?

  • simd_aligned builds on top of std::simd. It aims to provide common, SIMD-aligned data structures that support simple and safe scalar access patterns.

  • faster (as of today) is really good if you already have existing flat slices in your code and want to operate on them “full SIMD ahead”. However, in particular when dealing with multiple slices at the same time (e.g., kernel computations), the performance impact of unaligned arrays can become a bit more noticeable (e.g., in the case of ffsvm up to 10% to 20%).
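
To make the "unaligned arrays" point concrete, here is a small sketch of how one might check whether a flat slice starts on a SIMD-friendly boundary. The names `starts_aligned` and `Aligned32` are assumptions made for illustration; neither simd_aligned nor faster is known to expose them:

```rust
/// Returns true if the first element of `data` sits on an `align`-byte boundary.
fn starts_aligned(data: &[f32], align: usize) -> bool {
    assert!(align.is_power_of_two(), "alignment must be a power of two");
    (data.as_ptr() as usize) % align == 0
}

/// Hypothetical 32-byte aligned buffer, standing in for SIMD-aligned storage.
#[repr(C, align(32))]
struct Aligned32([f32; 16]);
```

A slice over the whole buffer starts aligned, but a sub-slice that skips one f32 starts 4 bytes past the boundary, which is the situation arbitrary flat slices can end up in.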

Modules

  • Contains vector definitions with a fixed bit width.
  • Unified views on SIMD types.

Macros

  • simd_swizzle (Experimental)
    Constructs a new SIMD vector by copying elements from selected lanes in other vectors.
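
As a rough mental model of what "copying elements from selected lanes" means, the following scalar sketch does the same thing on plain arrays. It is not the std::simd macro; the function name `swizzle` and its runtime index array are assumptions made for illustration:

```rust
/// Scalar model of a swizzle: lane `i` of the result is `source[indices[i]]`.
/// Panics if any index is out of range for `source`.
fn swizzle<const N: usize, const M: usize>(source: [f32; N], indices: [usize; M]) -> [f32; M] {
    let mut out = [0.0; M];
    for (slot, &lane) in out.iter_mut().zip(indices.iter()) {
        *slot = source[lane];
    }
    out
}
```

Indices may repeat or be omitted, so the result can duplicate, drop, or reorder lanes of the input.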

Structs

  • LaneCount (Experimental)
    Specifies the number of lanes in a SIMD vector as a type.
  • Mask (Experimental)
    A SIMD vector mask for LANES elements of width specified by Element.
  • A dynamic (heap allocated) matrix with one axis aligned for fast and safe SIMD access that also provides a flat view on its data.
  • Produced by MatrixD::flat, this allows for flat matrix access.
  • Provided by MatrixD::flat_mut, this allows for flat, mutable matrix access.
  • Simd (Experimental)
    A SIMD vector of LANES elements of type T. Simd<T, N> has the same shape as [T; N], but operates like T.
  • A dynamic (heap allocated) vector aligned for fast and safe SIMD access that also provides a flat view on its data.
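
One way to picture "a matrix with one axis aligned" is a row-major layout in which every row is padded up to a whole number of SIMD chunks, so each row starts on a chunk boundary. The sketch below is a hypothetical layout under that assumption, not the crate's implementation; `PaddedMatrix` and `LANES` are made-up names, and plain `[f32; LANES]` arrays stand in for real SIMD vectors (so no extra alignment is enforced here):

```rust
/// Hypothetical lane count of the stand-in SIMD chunk.
const LANES: usize = 4;

/// Row-major matrix whose rows are padded to a multiple of `LANES`.
struct PaddedMatrix {
    rows: usize,
    cols: usize,
    chunks_per_row: usize,
    data: Vec<[f32; LANES]>,
}

impl PaddedMatrix {
    /// Creates a zero-filled matrix; each row occupies `ceil(cols / LANES)` chunks.
    fn new(rows: usize, cols: usize) -> Self {
        let chunks_per_row = (cols + LANES - 1) / LANES;
        let data = vec![[0.0; LANES]; rows * chunks_per_row];
        Self { rows, cols, chunks_per_row, data }
    }

    /// Chunk view of one row, the shape a SIMD kernel would iterate over.
    fn row_chunks(&self, r: usize) -> &[[f32; LANES]] {
        let start = r * self.chunks_per_row;
        &self.data[start..start + self.chunks_per_row]
    }

    /// Scalar write at (row, column), mapped to chunk index and lane.
    fn set(&mut self, r: usize, c: usize, value: f32) {
        assert!(r < self.rows && c < self.cols, "index out of bounds");
        self.data[r * self.chunks_per_row + c / LANES][c % LANES] = value;
    }

    /// Scalar read at (row, column).
    fn get(&self, r: usize, c: usize) -> f32 {
        assert!(r < self.rows && c < self.cols, "index out of bounds");
        self.data[r * self.chunks_per_row + c / LANES][c % LANES]
    }
}
```

The trade-off in this layout is a few wasted lanes per row in exchange for never having a chunk straddle two rows.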

Enums

  • Which (Experimental)
    Specifies a lane index into one of two SIMD vectors.

Traits

  • MaskElement (Experimental)
    Marker trait for types that may be used as SIMD mask elements.
  • SimdElement (Experimental)
    Marker trait for types that may be used as SIMD vector elements.
  • SimdFloat (Experimental)
    Operations on SIMD vectors of floats.
  • SimdInt (Experimental)
    Operations on SIMD vectors of signed integers.
  • SimdOrd (Experimental)
    Parallel Ord.
  • SimdPartialEq (Experimental)
    Parallel PartialEq.
  • SimdPartialOrd (Experimental)
    Parallel PartialOrd.
  • SimdUint (Experimental)
    Operations on SIMD vectors of unsigned integers.
  • StdFloat (Experimental)
    This trait provides a possibly-temporary implementation of float functions that may, in the absence of hardware support, canonicalize to calling an operating system’s math.h dynamically-loaded library (also known as a shared object). As these conditionally require runtime support, they should only appear in binaries built assuming OS support: std.
  • SupportedLaneCount (Experimental)
    Statically guarantees that a lane count is marked as supported.
  • Swizzle (Experimental)
    Create a vector from the elements of another vector.
  • Swizzle2 (Experimental)
    Create a vector from the elements of two other vectors.
  • ToBitMask (Experimental)
    Converts masks to and from integer bitmasks.

Functions

  • Converts a slice of SIMD vectors into a flat slice of elements.
  • Converts a mutable slice of SIMD vectors into a flat slice of elements.
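
The idea behind these two conversions can be sketched on stable Rust with a stand-in type. This is an illustration of the technique only, not the crate's code: `Lanes8`, `as_flat`, and `as_flat_mut` are hypothetical names, and `Lanes8` merely imitates an f32x8 by wrapping `[f32; 8]` with 32-byte alignment:

```rust
/// Hypothetical stand-in for a SIMD vector: 8 f32 lanes, 32-byte aligned.
#[derive(Clone, Copy)]
#[repr(C, align(32))]
struct Lanes8([f32; 8]);

/// Views a slice of chunks as one flat slice of scalars.
fn as_flat(chunks: &[Lanes8]) -> &[f32] {
    // SAFETY: `Lanes8` is `repr(C)` around `[f32; 8]`; its size (32 bytes)
    // equals its alignment, so there is no padding and the chunks form
    // `chunks.len() * 8` contiguous, initialized f32 values.
    unsafe { std::slice::from_raw_parts(chunks.as_ptr().cast::<f32>(), chunks.len() * 8) }
}

/// Mutable counterpart of `as_flat`.
fn as_flat_mut(chunks: &mut [Lanes8]) -> &mut [f32] {
    // SAFETY: same layout argument as in `as_flat`; the exclusive borrow of
    // `chunks` guarantees the returned slice is the only live reference.
    unsafe { std::slice::from_raw_parts_mut(chunks.as_mut_ptr().cast::<f32>(), chunks.len() * 8) }
}
```

Writing through the flat view at index 9 lands in lane 1 of the second chunk, which is what lets callers think in scalars while the storage stays chunk-aligned.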

Type Definitions

  • f32x2 (Experimental)
    A 64-bit SIMD vector with two elements of type f32.
  • f32x4 (Experimental)
    A 128-bit SIMD vector with four elements of type f32.
  • f32x8 (Experimental)
    A 256-bit SIMD vector with eight elements of type f32.
  • f32x16 (Experimental)
    A 512-bit SIMD vector with 16 elements of type f32.
  • f64x2 (Experimental)
    A 128-bit SIMD vector with two elements of type f64.
  • f64x4 (Experimental)
    A 256-bit SIMD vector with four elements of type f64.
  • f64x8 (Experimental)
    A 512-bit SIMD vector with eight elements of type f64.
  • i8x4 (Experimental)
    A 32-bit SIMD vector with four elements of type i8.
  • i8x8 (Experimental)
    A 64-bit SIMD vector with eight elements of type i8.
  • i8x16 (Experimental)
    A 128-bit SIMD vector with 16 elements of type i8.
  • i8x32 (Experimental)
    A 256-bit SIMD vector with 32 elements of type i8.
  • i8x64 (Experimental)
    A 512-bit SIMD vector with 64 elements of type i8.
  • i16x2 (Experimental)
    A 32-bit SIMD vector with two elements of type i16.
  • i16x4 (Experimental)
    A 64-bit SIMD vector with four elements of type i16.
  • i16x8 (Experimental)
    A 128-bit SIMD vector with eight elements of type i16.
  • i16x16 (Experimental)
    A 256-bit SIMD vector with 16 elements of type i16.
  • i16x32 (Experimental)
    A 512-bit SIMD vector with 32 elements of type i16.
  • i32x2 (Experimental)
    A 64-bit SIMD vector with two elements of type i32.
  • i32x4 (Experimental)
    A 128-bit SIMD vector with four elements of type i32.
  • i32x8 (Experimental)
    A 256-bit SIMD vector with eight elements of type i32.
  • i32x16 (Experimental)
    A 512-bit SIMD vector with 16 elements of type i32.
  • i64x2 (Experimental)
    A 128-bit SIMD vector with two elements of type i64.
  • i64x4 (Experimental)
    A 256-bit SIMD vector with four elements of type i64.
  • i64x8 (Experimental)
    A 512-bit SIMD vector with eight elements of type i64.
  • isizex2 (Experimental)
    A SIMD vector with two elements of type isize.
  • isizex4 (Experimental)
    A SIMD vector with four elements of type isize.
  • isizex8 (Experimental)
    A SIMD vector with eight elements of type isize.
  • mask8x8 (Experimental)
    A mask for SIMD vectors with eight elements of 8 bits.
  • mask8x16 (Experimental)
    A mask for SIMD vectors with 16 elements of 8 bits.
  • mask8x32 (Experimental)
    A mask for SIMD vectors with 32 elements of 8 bits.
  • mask8x64 (Experimental)
    A mask for SIMD vectors with 64 elements of 8 bits.
  • mask16x4 (Experimental)
    A mask for SIMD vectors with four elements of 16 bits.
  • mask16x8 (Experimental)
    A mask for SIMD vectors with eight elements of 16 bits.
  • mask16x16 (Experimental)
    A mask for SIMD vectors with 16 elements of 16 bits.
  • mask16x32 (Experimental)
    A mask for SIMD vectors with 32 elements of 16 bits.
  • mask32x2 (Experimental)
    A mask for SIMD vectors with two elements of 32 bits.
  • mask32x4 (Experimental)
    A mask for SIMD vectors with four elements of 32 bits.
  • mask32x8 (Experimental)
    A mask for SIMD vectors with eight elements of 32 bits.
  • mask32x16 (Experimental)
    A mask for SIMD vectors with 16 elements of 32 bits.
  • mask64x2 (Experimental)
    A mask for SIMD vectors with two elements of 64 bits.
  • mask64x4 (Experimental)
    A mask for SIMD vectors with four elements of 64 bits.
  • mask64x8 (Experimental)
    A mask for SIMD vectors with eight elements of 64 bits.
  • masksizex2 (Experimental)
    A mask for SIMD vectors with two elements of pointer width.
  • masksizex4 (Experimental)
    A mask for SIMD vectors with four elements of pointer width.
  • masksizex8 (Experimental)
    A mask for SIMD vectors with eight elements of pointer width.
  • u8x4 (Experimental)
    A 32-bit SIMD vector with four elements of type u8.
  • u8x8 (Experimental)
    A 64-bit SIMD vector with eight elements of type u8.
  • u8x16 (Experimental)
    A 128-bit SIMD vector with 16 elements of type u8.
  • u8x32 (Experimental)
    A 256-bit SIMD vector with 32 elements of type u8.
  • u8x64 (Experimental)
    A 512-bit SIMD vector with 64 elements of type u8.
  • u16x2 (Experimental)
    A 32-bit SIMD vector with two elements of type u16.
  • u16x4 (Experimental)
    A 64-bit SIMD vector with four elements of type u16.
  • u16x8 (Experimental)
    A 128-bit SIMD vector with eight elements of type u16.
  • u16x16 (Experimental)
    A 256-bit SIMD vector with 16 elements of type u16.
  • u16x32 (Experimental)
    A 512-bit SIMD vector with 32 elements of type u16.
  • u32x2 (Experimental)
    A 64-bit SIMD vector with two elements of type u32.
  • u32x4 (Experimental)
    A 128-bit SIMD vector with four elements of type u32.
  • u32x8 (Experimental)
    A 256-bit SIMD vector with eight elements of type u32.
  • u32x16 (Experimental)
    A 512-bit SIMD vector with 16 elements of type u32.
  • u64x2 (Experimental)
    A 128-bit SIMD vector with two elements of type u64.
  • u64x4 (Experimental)
    A 256-bit SIMD vector with four elements of type u64.
  • u64x8 (Experimental)
    A 512-bit SIMD vector with eight elements of type u64.
  • usizex2 (Experimental)
    A SIMD vector with two elements of type usize.
  • usizex4 (Experimental)
    A SIMD vector with four elements of type usize.
  • usizex8 (Experimental)
    A SIMD vector with eight elements of type usize.