SIMD abstraction layer for video codec implementations.
This module provides a unified interface for SIMD operations used in video encoding and decoding. It abstracts over different SIMD instruction sets (AVX2, AVX-512, NEON) while providing a scalar fallback for portability.
§Architecture
The SIMD abstraction consists of:
- Types (types.rs): Vector types like I16x8, I32x4, U8x16
- Traits (traits.rs): SimdOps and SimdOpsExt for SIMD operations
- Architecture-specific: x86 (AVX2/AVX-512), ARM (NEON), scalar fallback
- Codec-specific: AV1 and VP9 optimized operations
- Operations: Domain-specific modules for codec operations
§Usage
use oximedia_codec::simd::{detect_simd, select_transform_impl};
// Detect SIMD capabilities
let caps = detect_simd();
println!("Best SIMD: {}", caps.best_level());
// Use codec-specific SIMD operations
use oximedia_codec::simd::av1::TransformSimd;
let transform = TransformSimd::new(select_transform_impl());
transform.forward_dct_8x8(&input, &mut output);
§Feature Detection and Dispatch
The SIMD implementation is selected at runtime based on CPU capabilities:
use oximedia_codec::simd::{SimdCapabilities, detect_simd};
let caps = detect_simd();
if caps.avx512 {
// Use AVX-512 optimized path
} else if caps.avx2 {
// Use AVX2 path
} else if caps.neon {
// Use ARM NEON path
} else {
// Use scalar fallback
}
§SIMD Dispatch Mechanism
OxiMedia uses a three-tier dispatch strategy (compile-time selection, runtime detection, scalar fallback) to guarantee correctness on every target while using the fastest code path the hardware supports.
Tier 1: Compile-time cfg selection.
Target-specific code paths are gated with #[cfg(target_arch = "...")], so only the
code relevant to the current build target is compiled in:
- x86_64: AVX-512 (avx512f + avx512bw + avx512dq), AVX2, SSE4.2 paths
- aarch64: ARM NEON path (always present on AArch64)
- wasm32: WASM SIMD128 path (simd/wasm.rs, core::arch::wasm32 intrinsics)
- All other targets: scalar fallback only
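The compile-time tier can be illustrated with a minimal sketch. The function name compiled_simd_family is hypothetical and not part of the crate; it only shows how #[cfg(target_arch = "...")] keeps exactly one definition in any given build:

```rust
// Hypothetical helper (not part of oximedia_codec): reports which family of
// code paths was compiled in. Exactly one definition survives per target.

#[cfg(target_arch = "x86_64")]
pub fn compiled_simd_family() -> &'static str {
    // AVX-512, AVX2 and SSE4.2 candidates; the final choice is made at runtime.
    "x86_64"
}

#[cfg(target_arch = "aarch64")]
pub fn compiled_simd_family() -> &'static str {
    // NEON is always present on AArch64.
    "aarch64"
}

#[cfg(target_arch = "wasm32")]
pub fn compiled_simd_family() -> &'static str {
    // WASM SIMD128 path.
    "wasm32"
}

#[cfg(not(any(
    target_arch = "x86_64",
    target_arch = "aarch64",
    target_arch = "wasm32"
)))]
pub fn compiled_simd_family() -> &'static str {
    // Every other target gets the scalar fallback only.
    "scalar"
}
```

Because the unused definitions are removed before type checking, code for other architectures never has to compile on the current one.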
Tier 2: Runtime SimdCapabilities detection.
Even on x86_64, AVX-512 may not be available at runtime. detect_simd probes the
CPU at startup using is_x86_feature_detected! and fills a SimdCapabilities struct:
use oximedia_codec::simd::{SimdCapabilities, detect_simd};
let caps: SimdCapabilities = detect_simd();
if caps.avx512 {
// 512-bit vector path — Ice Lake, Skylake-X, Zen 4+
} else if caps.avx2 {
// 256-bit vector path — Haswell 2013+, Excavator 2015+
} else if caps.neon {
// ARM NEON path — all ARMv8/AArch64
} else {
// Pure scalar fallback
}
The get_simd() helper encapsulates the dispatch and returns a &'static dyn SimdOps:
use oximedia_codec::simd::get_simd;
let ops = get_simd(); // picks AVX-512 → AVX2 → NEON → scalar
ops.sad_8x8(&src, &ref_block); // calls fastest available path
Tier 3: Scalar fallback.
ScalarFallback provides a 100% pure-Rust implementation of every SimdOps
operation. It is always compiled in and always selected when no SIMD extension is
detected. This means OxiMedia:
- compiles on any Rust target (including wasm32, riscv64, mips, etc.)
- runs correctly on any hardware, even without SIMD support
- achieves SIMD acceleration silently when the extension is available
No unsafe dispatch tables or runtime dynamic linking are used; all dispatch paths are
statically allocated (static AVX2_INSTANCE: Avx2Simd = Avx2Simd) and accessed
via a single &'static dyn SimdOps fat pointer.
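The static-instance pattern described above might look roughly like the following sketch. It is a simplified stand-in rather than the crate's actual code: the trait is reduced to two methods, the sad_8x8 signature and the name method are assumptions, and the Avx2Simd body reuses the scalar loop where a real implementation would use intrinsics.

```rust
// Simplified stand-in for the crate's SimdOps trait (assumed signatures).
pub trait SimdOps: Sync {
    fn name(&self) -> &'static str;
    fn sad_8x8(&self, src: &[u8; 64], reference: &[u8; 64]) -> u32;
}

// Sum of absolute differences over an 8x8 block, in plain Rust.
fn sad_scalar(src: &[u8; 64], reference: &[u8; 64]) -> u32 {
    src.iter()
        .zip(reference.iter())
        .map(|(&a, &b)| u32::from(a.abs_diff(b)))
        .sum()
}

pub struct ScalarFallback;
impl SimdOps for ScalarFallback {
    fn name(&self) -> &'static str { "scalar" }
    fn sad_8x8(&self, src: &[u8; 64], reference: &[u8; 64]) -> u32 {
        sad_scalar(src, reference)
    }
}

pub struct Avx2Simd;
impl SimdOps for Avx2Simd {
    fn name(&self) -> &'static str { "avx2" }
    fn sad_8x8(&self, src: &[u8; 64], reference: &[u8; 64]) -> u32 {
        // Placeholder: a real AVX2 path would use core::arch intrinsics here.
        sad_scalar(src, reference)
    }
}

// Zero-sized, statically allocated instances; no heap or dispatch table.
static SCALAR_INSTANCE: ScalarFallback = ScalarFallback;
#[allow(dead_code)] // unused on non-x86_64 targets
static AVX2_INSTANCE: Avx2Simd = Avx2Simd;

// Returns a fat pointer to the best implementation known to this sketch.
pub fn get_simd() -> &'static dyn SimdOps {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return &AVX2_INSTANCE;
        }
    }
    &SCALAR_INSTANCE
}
```

Callers hold only the &'static dyn SimdOps reference, so adding another backend in this sketch means adding one more static and one more branch in get_simd.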
Re-exports§
pub use blend::blend_ops;
pub use blend::BlendOps;
pub use dct::dct_ops;
pub use dct::DctOps;
pub use filter::filter_ops;
pub use filter::FilterOps;
pub use sad::sad_ops;
pub use sad::SadOps;
pub use traits::SimdOps;
pub use traits::SimdOpsExt;
pub use traits::SimdSelector;
pub use types::I16x16;
pub use types::I16x8;
pub use types::I32x4;
pub use types::I32x8;
pub use types::U8x16;
pub use types::U8x32;
pub use arm::NeonSimd;
pub use scalar::ScalarFallback;
pub use x86::Avx2Simd;
pub use x86::Avx512Simd;
pub use av1::CdefSimd;
pub use av1::IntraPredSimd;
pub use av1::LoopFilterSimd;
pub use av1::MotionCompSimd;
pub use av1::TransformSimd;
pub use vp9::Vp9DctSimd;
pub use vp9::Vp9InterpolateSimd;
pub use vp9::Vp9IntraPredSimd;
pub use vp9::Vp9LoopFilterSimd;
Modules§
- arm
- ARM NEON SIMD implementations.
- av1
- AV1-specific SIMD operations.
- blend
- Blending operations for video codec implementations.
- dct
- Discrete Cosine Transform (DCT) operations.
- filter
- Filter operations for video codec implementations.
- pixel_convert
- SIMD-accelerated pixel format conversion.
- sad
- Sum of Absolute Differences (SAD) operations.
- scalar
- Portable scalar fallback implementation.
- traits
- SIMD operation traits for video codec implementations.
- types
- Common SIMD types and type aliases.
- vp9
- VP9-specific SIMD operations.
- x86
- x86_64 SIMD implementations (AVX2 and AVX-512).
- yuv_convert
- SIMD-accelerated YUV subsampling format conversion.
Structs§
- SimdCapabilities
- CPU SIMD capabilities.
Enums§
- TransformImpl
- Transform implementation selection.
Functions§
- detect_capabilities (Deprecated)
- Legacy capabilities detection (deprecated, use detect_simd instead).
- detect_simd
- Detect CPU SIMD capabilities at runtime.
- get_simd
- Get the best SIMD implementation for the current CPU.
- get_simd_ext
- Get the best extended SIMD implementation for the current CPU.
- scalar_simd (Deprecated)
- Legacy scalar SIMD accessor (deprecated, use ScalarFallback directly).
- select_transform_impl
- Select the best transform implementation for the current CPU.