Module simd

SIMD abstraction layer for video codec implementations.

This module provides a unified interface for SIMD operations used in video encoding and decoding. It abstracts over different SIMD instruction sets (AVX2, AVX-512, NEON) while providing a scalar fallback for portability.

§Architecture

The SIMD abstraction consists of:

  • Types (types.rs): Vector types like I16x8, I32x4, U8x16
  • Traits (traits.rs): SimdOps and SimdOpsExt for SIMD operations
  • Architecture-specific: x86 (AVX2/AVX-512), ARM (NEON), scalar fallback
  • Codec-specific: AV1 and VP9 optimized operations
  • Operations: Domain-specific modules for blending (blend), DCT (dct), filtering (filter), and SAD (sad)

§Usage

use oximedia_codec::simd::{detect_simd, select_transform_impl};

// Detect SIMD capabilities
let caps = detect_simd();
println!("Best SIMD: {}", caps.best_level());

// Use codec-specific SIMD operations.
// `input` and `output` are caller-provided block buffers (declarations omitted here).
use oximedia_codec::simd::av1::TransformSimd;
let transform = TransformSimd::new(select_transform_impl());
transform.forward_dct_8x8(&input, &mut output);

§Feature Detection and Dispatch

The SIMD implementation is selected at runtime based on CPU capabilities:

use oximedia_codec::simd::{SimdCapabilities, detect_simd};

let caps = detect_simd();
if caps.avx512 {
    // Use AVX-512 optimized path
} else if caps.avx2 {
    // Use AVX2 path
} else if caps.neon {
    // Use ARM NEON path
} else {
    // Use scalar fallback
}
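
The priority order in the chain above (AVX-512, then AVX2, then NEON, then scalar) can be expressed as a small ordered enum. This is only an illustrative sketch: `Level` and `best_level` are hypothetical names, and the crate's actual `best_level()` return type is not shown in this documentation.

```rust
use std::fmt;

/// Hypothetical SIMD levels, declared from least to most preferred so that
/// the derived `Ord` reflects the dispatch priority.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Scalar,
    Neon,
    Avx2,
    Avx512,
}

impl fmt::Display for Level {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Level::Scalar => "scalar",
            Level::Neon => "neon",
            Level::Avx2 => "avx2",
            Level::Avx512 => "avx512",
        };
        f.write_str(name)
    }
}

/// Pick the most preferred level among the detected flags
/// (assumed flags mirroring `caps.avx512`, `caps.avx2`, `caps.neon`).
pub fn best_level(avx512: bool, avx2: bool, neon: bool) -> Level {
    if avx512 {
        Level::Avx512
    } else if avx2 {
        Level::Avx2
    } else if neon {
        Level::Neon
    } else {
        Level::Scalar
    }
}
```

Implementing `Display` is what would allow a level to be printed with `{}`, as in the usage example.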

§SIMD Dispatch Mechanism

OxiMedia uses a three-tier dispatch strategy to guarantee correctness on every target while using the fastest available code path on modern hardware.

Tier 1: Compile-time cfg selection. Target-specific code paths are gated with #[cfg(target_arch = "...")], so only the code relevant to the current build target is compiled in:

  • x86_64 — AVX-512 (avx512f + avx512bw + avx512dq), AVX2, SSE4.2 paths
  • aarch64 — ARM NEON path (always present on AArch64)
  • wasm32 — WASM SIMD128 path (simd/wasm.rs, core::arch::wasm32 intrinsics)
  • All other targets — scalar fallback only
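
As a rough illustration of Tier 1, the sketch below defines one function per target architecture and lets `#[cfg(target_arch = "...")]` decide which definition is compiled. The function name `compiled_simd_family` is hypothetical and is not part of this module.

```rust
/// Describes which SIMD family was compiled in for this build target.
/// Exactly one of the four definitions below exists in any given build.
#[cfg(target_arch = "x86_64")]
pub fn compiled_simd_family() -> &'static str {
    "x86_64"
}

#[cfg(target_arch = "aarch64")]
pub fn compiled_simd_family() -> &'static str {
    "aarch64"
}

#[cfg(target_arch = "wasm32")]
pub fn compiled_simd_family() -> &'static str {
    "wasm32"
}

/// Every other target gets only the scalar path.
#[cfg(not(any(
    target_arch = "x86_64",
    target_arch = "aarch64",
    target_arch = "wasm32"
)))]
pub fn compiled_simd_family() -> &'static str {
    "scalar"
}
```

Because the selection happens at compile time, code for other architectures never reaches the binary and cannot be dispatched to by mistake.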

Tier 2: Runtime SimdCapabilities detection. Even on x86_64, AVX-512 may not be available at runtime. detect_simd probes the CPU at startup using is_x86_feature_detected! and fills a SimdCapabilities struct:

use oximedia_codec::simd::{SimdCapabilities, detect_simd};

let caps: SimdCapabilities = detect_simd();
if caps.avx512 {
    // 512-bit vector path — Ice Lake, Skylake-X, Zen 4+
} else if caps.avx2 {
    // 256-bit vector path — Haswell 2013+, Excavator 2015+
} else if caps.neon {
    // ARM NEON path — all ARMv8/AArch64
} else {
    // Pure scalar fallback
}
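
A minimal sketch of what such a runtime probe could look like is given below. `Caps` and `detect` are hypothetical stand-ins for `SimdCapabilities` and `detect_simd`; the real struct may carry more fields, and the real detection logic may differ. The AVX-512 check assumes the avx512f + avx512bw + avx512dq combination listed under Tier 1.

```rust
/// Hypothetical stand-in for `SimdCapabilities`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Caps {
    pub avx512: bool,
    pub avx2: bool,
    pub neon: bool,
}

/// Probe the current CPU. Each architecture block is compiled only on its
/// own target, so the x86 macro is never expanded elsewhere.
pub fn detect() -> Caps {
    #[allow(unused_mut)]
    let mut caps = Caps::default();

    #[cfg(target_arch = "x86_64")]
    {
        caps.avx2 = std::arch::is_x86_feature_detected!("avx2");
        // Require all three subsets that the AVX-512 path relies on.
        caps.avx512 = std::arch::is_x86_feature_detected!("avx512f")
            && std::arch::is_x86_feature_detected!("avx512bw")
            && std::arch::is_x86_feature_detected!("avx512dq");
    }

    #[cfg(target_arch = "aarch64")]
    {
        // NEON is always present on AArch64, so no runtime probe is needed.
        caps.neon = true;
    }

    caps
}
```

On targets other than x86_64 and aarch64, every flag stays `false`, which leads the caller to the scalar fallback.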

The get_simd() helper encapsulates the dispatch and returns a &'static dyn SimdOps:

use oximedia_codec::simd::get_simd;

let ops = get_simd();  // picks AVX-512 → AVX2 → NEON → scalar
ops.sad_8x8(&src, &ref_block); // calls fastest available path

Tier 3: Scalar fallback. ScalarFallback provides a pure-Rust implementation of every SimdOps operation. It is always compiled in and is selected whenever no supported SIMD extension is detected. This means OxiMedia:

  • compiles on any Rust target (including wasm32, riscv64, mips, etc.)
  • runs correctly on any hardware, even without SIMD support
  • uses SIMD acceleration automatically when an extension is available

No unsafe dispatch tables or runtime dynamic linking are used. Each backend is a statically allocated instance (for example, static AVX2_INSTANCE: Avx2Simd = Avx2Simd), and callers reach it through a single &'static dyn SimdOps fat pointer.
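
The static-instance pattern can be sketched as follows. All names here (`Ops`, `Scalar`, `Wide`, `select`) are hypothetical and much smaller than the real `SimdOps` trait; `Wide` is a placeholder for an accelerated backend and uses plain Rust arithmetic so the example stays portable.

```rust
/// Hypothetical, minimal stand-in for the `SimdOps` trait.
pub trait Ops: Sync {
    fn name(&self) -> &'static str;
    /// Sum of absolute differences over an 8x8 block stored row-major.
    fn sad_8x8(&self, src: &[u8; 64], reference: &[u8; 64]) -> u32;
}

/// Portable backend: one pass over all 64 pixels.
pub struct Scalar;

impl Ops for Scalar {
    fn name(&self) -> &'static str {
        "scalar"
    }
    fn sad_8x8(&self, src: &[u8; 64], reference: &[u8; 64]) -> u32 {
        src.iter()
            .zip(reference.iter())
            .map(|(&a, &b)| u32::from(a.abs_diff(b)))
            .sum()
    }
}

/// Placeholder for an accelerated backend: works row by row, the way a
/// vector implementation would, but without any intrinsics.
pub struct Wide;

impl Ops for Wide {
    fn name(&self) -> &'static str {
        "wide"
    }
    fn sad_8x8(&self, src: &[u8; 64], reference: &[u8; 64]) -> u32 {
        let mut total = 0u32;
        for (row_a, row_b) in src.chunks_exact(8).zip(reference.chunks_exact(8)) {
            for i in 0..8 {
                total += u32::from(row_a[i].abs_diff(row_b[i]));
            }
        }
        total
    }
}

// Zero-sized backends live in statics, so no allocation or unsafe is needed.
static SCALAR_INSTANCE: Scalar = Scalar;
static WIDE_INSTANCE: Wide = Wide;

/// Return a fat pointer to the chosen backend. The `accelerated` flag is an
/// assumed input; a real selector would consult the detected capabilities.
pub fn select(accelerated: bool) -> &'static dyn Ops {
    if accelerated {
        return &WIDE_INSTANCE;
    }
    &SCALAR_INSTANCE
}
```

Both backends must return identical results for the same input, which makes the scalar path a convenient reference when testing an accelerated one.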

Re-exports§

pub use blend::blend_ops;
pub use blend::BlendOps;
pub use dct::dct_ops;
pub use dct::DctOps;
pub use filter::filter_ops;
pub use filter::FilterOps;
pub use sad::sad_ops;
pub use sad::SadOps;
pub use traits::SimdOps;
pub use traits::SimdOpsExt;
pub use traits::SimdSelector;
pub use types::I16x16;
pub use types::I16x8;
pub use types::I32x4;
pub use types::I32x8;
pub use types::U8x16;
pub use types::U8x32;
pub use arm::NeonSimd;
pub use scalar::ScalarFallback;
pub use x86::Avx2Simd;
pub use x86::Avx512Simd;
pub use av1::CdefSimd;
pub use av1::IntraPredSimd;
pub use av1::LoopFilterSimd;
pub use av1::MotionCompSimd;
pub use av1::TransformSimd;
pub use vp9::Vp9DctSimd;
pub use vp9::Vp9InterpolateSimd;
pub use vp9::Vp9IntraPredSimd;
pub use vp9::Vp9LoopFilterSimd;

Modules§

arm
ARM NEON SIMD implementations.
av1
AV1-specific SIMD operations.
blend
Blending operations for video codec implementations.
dct
Discrete Cosine Transform (DCT) operations.
filter
Filter operations for video codec implementations.
pixel_convert
SIMD-accelerated pixel format conversion.
sad
Sum of Absolute Differences (SAD) operations.
scalar
Portable scalar fallback implementation.
traits
SIMD operation traits for video codec implementations.
types
Common SIMD types and type aliases.
vp9
VP9-specific SIMD operations.
x86
x86_64 SIMD implementations (AVX2 and AVX-512).
yuv_convert
SIMD-accelerated YUV subsampling format conversion.

Structs§

SimdCapabilities
CPU SIMD capabilities.

Enums§

TransformImpl
Transform implementation selection.

Functions§

detect_capabilities (Deprecated)
Legacy capabilities detection (deprecated, use detect_simd instead).
detect_simd
Detect CPU SIMD capabilities at runtime.
get_simd
Get the best SIMD implementation for the current CPU.
get_simd_ext
Get the best extended SIMD implementation for the current CPU.
scalar_simd (Deprecated)
Legacy scalar SIMD accessor (deprecated, use ScalarFallback directly).
select_transform_impl
Select the best transform implementation for the current CPU.