Crate oximedia_simd

Hand-written assembly SIMD kernels for OxiMedia

This crate provides hand-optimized assembly implementations of the performance-critical paths in the OxiMedia video codec, including:

  • DCT (Discrete Cosine Transform) in various sizes
  • Interpolation kernels (bilinear, bicubic, 8-tap)
  • SAD (Sum of Absolute Differences) for motion estimation

All assembly is wrapped in safe Rust APIs that perform alignment checks, buffer validation, and runtime CPU feature detection before dispatching to a kernel.
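The wrapping pattern described above (validate the buffers, detect CPU features at runtime, then pick a kernel) can be sketched with the standard library alone. This is an illustrative sketch, not the crate's implementation: `sad_scalar`, `sad_dispatch`, and `select_backend` are hypothetical names, the error type is a plain `String` rather than `SimdError`, and the scalar kernel stands in for the assembly kernels.

```rust
/// Scalar reference SAD over two equally sized 8-bit blocks.
fn sad_scalar(a: &[u8], b: &[u8]) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| u32::from(x.abs_diff(y)))
        .sum()
}

/// Hypothetical backend selection using the standard library's
/// runtime feature-detection macros. Returns a label only.
fn select_backend() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return "avx512";
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    "scalar"
}

/// Hypothetical safe wrapper: validates the buffers first, then reports
/// which backend would be chosen. A real wrapper would call the matching
/// assembly kernel; this sketch always computes with the scalar fallback.
fn sad_dispatch(a: &[u8], b: &[u8]) -> Result<(u32, &'static str), String> {
    if a.len() != b.len() {
        return Err(format!("length mismatch: {} vs {}", a.len(), b.len()));
    }
    Ok((sad_scalar(a, b), select_backend()))
}
```

Keeping a scalar fallback behind the same wrapper means every target gets a correct result, and the accelerated paths can be tested against it.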

Modules§

accumulator
Running statistics accumulator for audio/video sample streams.
alpha_premul
Alpha pre-multiplication and un-pre-multiplication helpers.
audio_ops
SIMD-optimized audio buffer operations.
avx512
AVX-512 SIMD implementations for high-throughput media processing.
bitwise_ops
Bitwise SIMD-style operations for multimedia processing.
blend
Alpha blending and compositing operations.
blend_simd
SIMD-optimized blending operations.
color_convert_simd
BT.709 RGB ↔ YUV colour conversion using scalar, SIMD-friendly code.
color_space
Color space conversion operations optimized for SIMD-friendly access patterns.
convolution
SIMD convolution operations.
deblock_filter
SIMD-accelerated deblocking filter for codec post-processing.
dispatch
Runtime-dispatched SIMD wrappers for core media processing operations.
dot_product
Dot product operations for SIMD-accelerated signal processing.
entropy_coding
SIMD helpers for bit packing / unpacking used in entropy coding.
filter
SIMD-friendly image filtering operations.
fixed_point
Fixed-point arithmetic helpers for SIMD-friendly computations.
gather_scatter
SIMD-style gather and scatter memory operations.
histogram
Histogram computation for pixel data analysis.
interleave
Audio channel interleaving and de-interleaving.
lookup_table
Lookup table (LUT) primitives for fast pixel transformations.
math_ops
SIMD-optimized mathematical operations on f32 slices.
matrix
SIMD-friendly 4×4 matrix operations.
min_max
SIMD-optimized minimum and maximum operations for pixel/sample data.
motion_search
SIMD-accelerated motion search using diamond and hexagonal search patterns.
neon
ARM NEON SIMD backend for OxiMedia.
pack_unpack
Pixel packing and unpacking utilities for SIMD pipelines.
pixel_ops
SIMD-optimized pixel / image operations.
portable
Portable SIMD-friendly operations using scalar code structured for auto-vectorization by the compiler.
prefix_sum
Parallel prefix sum (scan) operations for multimedia processing.
psnr
Peak Signal-to-Noise Ratio (PSNR) computation kernel.
reduce
Reduction operations (sum, min, max, mean, variance, normalize) on f32 slices.
resize
SIMD-accelerated image scaling (resize) with bilinear and Lanczos filters.
satd
Sum of Absolute Transformed Differences (SATD) kernel.
saturate
Saturating arithmetic primitives for media pixel pipelines.
simd_bench
Runtime benchmark utilities for oximedia-simd.
ssim
Structural Similarity Index Measure (SSIM) kernel with scalar and SIMD-accelerated implementations.
swizzle
SIMD lane swizzle, permute, and shuffle operations.
threshold
SIMD threshold operations.
transpose
Block transpose primitives for matrix and image data.
vector_math
Vector math types: Vec2, Vec3, Vec4.
yuv_ops
YUV colour-space conversion helpers.
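
Several of the modules listed above wrap well-known metrics. As one example, the quantity computed by a PSNR kernel for 8-bit samples is 10·log10(255² / MSE). The scalar sketch below illustrates that formula only; `psnr_u8` is a hypothetical name and signature, not the API of the `psnr` module.

```rust
/// Scalar PSNR in dB for 8-bit samples (peak value 255).
/// Returns `None` for empty or mismatched inputs and
/// `Some(f64::INFINITY)` when the two buffers are identical.
fn psnr_u8(reference: &[u8], distorted: &[u8]) -> Option<f64> {
    if reference.is_empty() || reference.len() != distorted.len() {
        return None;
    }
    // Sum of squared errors, accumulated in u64 to avoid overflow.
    let sse: u64 = reference
        .iter()
        .zip(distorted)
        .map(|(&r, &d)| {
            let e = u64::from(r.abs_diff(d));
            e * e
        })
        .sum();
    if sse == 0 {
        return Some(f64::INFINITY);
    }
    let mse = sse as f64 / reference.len() as f64;
    Some(10.0 * (255.0_f64 * 255.0 / mse).log10())
}
```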

Structs§

CpuFeatures
CPU features detected at runtime.

Enums§

BlockSize
Block sizes for SAD operations.
DctSize
DCT transform sizes.
InterpolationFilter
Interpolation filter types.
SimdError
Error types for SIMD operations.

Functions§

detect_cpu_features
Detect CPU features at runtime.
forward_dct
Perform forward DCT transform.
has_neon
Returns true when the executing CPU provides NEON (always true on aarch64).
interpolate
Perform interpolation for motion compensation.
inverse_dct
Perform inverse DCT transform.
is_aligned
Check if a pointer is properly aligned for SIMD operations.
sad
Calculate Sum of Absolute Differences (SAD).
validate_avx2_alignment
Validate buffer alignment for AVX2 (32-byte alignment).
validate_avx512_alignment
Validate buffer alignment for AVX-512 (64-byte alignment).
validate_neon_alignment
Validate buffer alignment for NEON (16-byte alignment).
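
The alignment requirements listed above (16 bytes for NEON, 32 for AVX2, 64 for AVX-512) come down to a power-of-two mask test on the buffer's start address. The sketch below shows that check under stated assumptions: `is_aligned_to`, `validate_alignment`, the constants, and the `Aligned64` helper are hypothetical names for illustration, and the crate's own functions may take different arguments and return `SimdError` rather than `String`.

```rust
// Alignment requirements in bytes, taken from the function summaries.
const NEON_ALIGN: usize = 16;
const AVX2_ALIGN: usize = 32;
const AVX512_ALIGN: usize = 64;

/// Returns true when `ptr` is a multiple of `align` (a power of two).
fn is_aligned_to<T>(ptr: *const T, align: usize) -> bool {
    debug_assert!(align.is_power_of_two());
    ((ptr as usize) & (align - 1)) == 0
}

/// Hypothetical validator: checks the start address of a byte buffer.
fn validate_alignment(buf: &[u8], align: usize) -> Result<(), String> {
    if is_aligned_to(buf.as_ptr(), align) {
        Ok(())
    } else {
        Err(format!("buffer at {:p} is not {}-byte aligned", buf.as_ptr(), align))
    }
}

/// Helper type that forces 64-byte alignment of its contents,
/// which satisfies all three requirements at offset 0.
#[repr(align(64))]
struct Aligned64([u8; 64]);
```

A buffer held in an `Aligned64` passes all three checks at offset 0, while a sub-slice starting at offset 16 passes only the 16-byte check.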

Type Aliases§

Result