zoomvtools 1.1.1

Video motion vector analysis utilities in pure Rust
# SAD KERNELS KNOWLEDGE BASE

**Generated:** 2026-04-17

## OVERVIEW

Sum of Absolute Differences (SAD) kernels for block matching. SAD is the core cost metric for motion search.
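
To make the metric concrete: the SAD of two blocks is the sum of `|a - b|` over every pair of co-located pixels. The following is a minimal scalar sketch, not the code in `rust.rs`. The name `sad_scalar`, its signature, the `u64` return type, and the convention that pitch is measured in pixels are all assumptions made for illustration.

```rust
/// Sum of absolute differences between two WIDTH x HEIGHT blocks.
///
/// `src_pitch` and `ref_pitch` are row strides measured in pixels. That is an
/// assumption of this sketch; the crate may define pitch differently.
pub fn sad_scalar<T, const WIDTH: usize, const HEIGHT: usize>(
    src: &[T],
    src_pitch: usize,
    reference: &[T],
    ref_pitch: usize,
) -> u64
where
    T: Copy + Into<u64>,
{
    let mut sum = 0u64;
    for y in 0..HEIGHT {
        // Only the first WIDTH pixels of each row belong to the block;
        // anything between WIDTH and the pitch is padding and is skipped.
        let src_row = &src[y * src_pitch..y * src_pitch + WIDTH];
        let ref_row = &reference[y * ref_pitch..y * ref_pitch + WIDTH];
        for (&a, &b) in src_row.iter().zip(ref_row) {
            let (a, b): (u64, u64) = (a.into(), b.into());
            sum += a.abs_diff(b);
        }
    }
    sum
}
```

Because the bound is `T: Copy + Into<u64>`, the same sketch accepts both `u8` and `u16` planes, for example `sad_scalar::<u8, 8, 8>(&src, src_pitch, &reference, ref_pitch)`.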

## STRUCTURE

```
src/sad/
├── rust.rs    # Scalar implementation
├── avx2.rs    # AVX2 SIMD implementation
├── avx512.rs  # AVX-512 SIMD implementation with AVX2 fallback below thresholds
├── tests.rs   # Test generator macro (get_sad_tests!)
└── sad.rs     # Module root, dispatch
```

## WHERE TO LOOK

| Area          | File        | Notes                                                                                                    |
| ------------- | ----------- | -------------------------------------------------------------------------------------------------------- |
| Scalar SAD    | `rust.rs`   | Baseline reference implementation                                                                        |
| AVX2 SAD      | `avx2.rs`   | SIMD-optimized, compiled with `avx2` feature                                                             |
| AVX-512 SAD   | `avx512.rs` | Compiled with `avx512` feature; preferred on `has_avx512_skylake()`, falls back to AVX2 for small blocks |
| Test coverage | `tests.rs`  | `get_sad_tests!` macro generates tests for the `rust`, `avx2`, and `avx512` backends                     |
| Dispatch      | `sad.rs`    | `#[cfg]` + runtime CPU feature check                                                                     |

## CONVENTIONS

- Follows standard `rust.rs` + `avx2.rs` + `avx512.rs` + `tests.rs` pattern.
- `avx2` feature gates AVX2 backend compilation; `avx512` feature gates AVX-512 compilation and implies `avx2`.
- Dispatch combines compile-time `#[cfg]` gating with runtime CPU checks: `crate::util::has_avx512_skylake()` is tested before `has_avx2()`.
- AVX-512 thresholds: `u8` widths below 32 and `u16` widths below 16 pass through to AVX2 via const-generic dispatch.
- Test generator `get_sad_tests!($module)` produces tests for `rust`, `avx2`, and `avx512` backends.
- Tested across sizes: 2x2 through 128x128, plus edge cases (non-square, odd dimensions).
- Both `u8` and `u16` pixel types covered.
- Pitch/padding scenarios tested systematically.
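
The dispatch order and width thresholds described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `sad.rs`: `Backend`, `CpuFeatures`, `detect_features`, and `select_backend` are hypothetical names, and the sketch stands in for `crate::util::has_avx512_skylake()` by checking only `avx512f` and `avx512bw`, since the exact feature set that helper tests is not stated here. The real module also gates each backend behind the `avx2` / `avx512` Cargo features, which a standalone snippet cannot show.

```rust
/// Which kernel family a call would be routed to (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Rust,
    Avx2,
    Avx512,
}

/// Runtime CPU capabilities relevant to SAD dispatch (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CpuFeatures {
    pub avx2: bool,
    pub avx512: bool,
}

/// Runtime detection on x86. The AVX-512 check is an approximation of
/// whatever `has_avx512_skylake()` tests in the crate.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn detect_features() -> CpuFeatures {
    CpuFeatures {
        avx2: std::is_x86_feature_detected!("avx2"),
        avx512: std::is_x86_feature_detected!("avx512f")
            && std::is_x86_feature_detected!("avx512bw"),
    }
}

/// On other architectures only the scalar path is available.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn detect_features() -> CpuFeatures {
    CpuFeatures { avx2: false, avx512: false }
}

/// Picks a backend for pixel type `T` and a compile-time block width.
/// AVX-512 is considered first; narrow blocks pass through to AVX2.
pub fn select_backend<T, const WIDTH: usize>(cpu: CpuFeatures) -> Backend {
    // Thresholds from the conventions: 32 pixels for u8, 16 pixels for u16.
    let avx512_min_width = if std::mem::size_of::<T>() == 1 { 32 } else { 16 };
    if cpu.avx512 && WIDTH >= avx512_min_width {
        Backend::Avx512
    } else if cpu.avx512 || cpu.avx2 {
        Backend::Avx2
    } else {
        Backend::Rust
    }
}
```

Because `WIDTH` is a const generic, the threshold comparison involves only compile-time constants per instantiation, so the optimizer can typically fold the small-block fallback away rather than re-evaluate it on every call.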

## ANTI-PATTERNS

- Diverging results between scalar and SIMD paths (the `verify_asm!` macro checks that they match).
- Missing edge case coverage for non-square blocks.
- Using direct indexing instead of `semisafe_get()` in test code.
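
One way to guard against the first two anti-patterns is to compare every candidate kernel against the scalar reference over square, non-square, odd-sized, and padded blocks. The sketch below illustrates that idea only. It does not reproduce `get_sad_tests!`, `verify_asm!`, or `semisafe_get()`, whose definitions are not shown here; `sad_reference`, `sad_chunked`, `make_plane`, and `sad_consistency_checks!` are hypothetical names, and `sad_chunked` is a plain-Rust stand-in for a SIMD backend.

```rust
/// Scalar reference: row-by-row SAD over u8 pixels, pitch in pixels.
fn sad_reference(src: &[u8], src_pitch: usize, reference: &[u8], ref_pitch: usize,
                 width: usize, height: usize) -> u64 {
    src.chunks(src_pitch)
        .zip(reference.chunks(ref_pitch))
        .take(height)
        .map(|(s_row, r_row)| {
            s_row.iter().zip(r_row).take(width)
                .map(|(&a, &b)| u64::from(a.abs_diff(b)))
                .sum::<u64>()
        })
        .sum()
}

/// Stand-in for an optimized backend: accumulates in groups of 8 pixels
/// with u32 partial sums, the way a vectorized kernel might.
fn sad_chunked(src: &[u8], src_pitch: usize, reference: &[u8], ref_pitch: usize,
               width: usize, height: usize) -> u64 {
    src.chunks(src_pitch)
        .zip(reference.chunks(ref_pitch))
        .take(height)
        .map(|(s_row, r_row)| {
            s_row[..width].chunks(8).zip(r_row[..width].chunks(8))
                .map(|(s, r)| {
                    s.iter().zip(r).map(|(&a, &b)| u32::from(a.abs_diff(b))).sum::<u32>()
                })
                .map(u64::from)
                .sum::<u64>()
        })
        .sum()
}

/// Builds a deterministic pseudo-random plane, returning (data, pitch).
/// Padding bytes are filled too, so a kernel that reads them is likely
/// to disagree with the reference.
fn make_plane(width: usize, height: usize, padding: usize, seed: u32) -> (Vec<u8>, usize) {
    let pitch = width + padding;
    let mut state = seed;
    let data: Vec<u8> = (0..pitch * height)
        .map(|_| {
            state = state.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
            (state >> 24) as u8
        })
        .collect();
    (data, pitch)
}

/// Runs `$candidate` against the reference for each (width, height, padding)
/// case and evaluates to the list of cases that disagreed.
macro_rules! sad_consistency_checks {
    ($candidate:expr; $( ($w:expr, $h:expr, $pad:expr) ),+ $(,)?) => {{
        let mut mismatches: Vec<(usize, usize, usize)> = Vec::new();
        $(
            let (src, src_pitch) = make_plane($w, $h, $pad, 1);
            // The reference plane gets a different pitch on purpose.
            let (reference, ref_pitch) = make_plane($w, $h, $pad + 3, 2);
            let expected = sad_reference(&src, src_pitch, &reference, ref_pitch, $w, $h);
            let actual = ($candidate)(&src, src_pitch, &reference, ref_pitch, $w, $h);
            if expected != actual {
                mismatches.push(($w, $h, $pad));
            }
        )+
        mismatches
    }};
}

/// Square, non-square, odd, and padded cases for the stand-in backend.
pub fn check_chunked_backend() -> Vec<(usize, usize, usize)> {
    sad_consistency_checks!(sad_chunked;
        (2, 2, 0), (8, 4, 0), (16, 8, 5), (7, 3, 1), (128, 128, 16))
}
```

An empty result from `check_chunked_backend()` means the stand-in agreed with the reference on every listed case; a non-empty result names the block shapes to investigate.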