poulpy-cpu-avx 0.6.0

A crate providing concrete AVX accelerated CPU implementations of poulpy-hal through its open extension points
docs.rs failed to build poulpy-cpu-avx-0.6.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build: poulpy-cpu-avx-0.1.1

🐙 Poulpy-CPU-AVX

Poulpy-CPU-AVX is a Rust crate that provides an AVX2 + FMA accelerated CPU backend for Poulpy.

This backend implements the Poulpy HAL extension traits and can be used by:

🚩 Safety and Requirements

To avoid illegal hardware instructions (SIGILL) on unsupported CPUs, this backend is opt-in and only builds when explicitly requested.

Requirement Status
Cargo feature flag --features enable-avx must be enabled
CPU architecture x86_64
CPU target features AVX2 + FMA

If enable-avx is enabled but the target does not provide these capabilities, the build fails immediately with a clear error message, rather than generating invalid binaries.

When enable-avx is not enabled, this crate is simply skipped and Poulpy automatically falls back to the portable poulpy-cpu-ref backend. This ensures that Poulpy's workspace remains portable (e.g. for macOS ARM).

⚙️ Building with the AVX backend enabled

Because the compiler must generate AVX2 + FMA instructions, both the Cargo feature and CPU target flags must be specified:

RUSTFLAGS="-C target-feature=+avx2,+fma" \
cargo build --features enable-avx

Running an example

RUSTFLAGS="-C target-feature=+avx2,+fma" \
cargo run --example <name> --features enable-avx

Running benchmarks

RUSTFLAGS="-C target-feature=+avx2,+fma" \
cargo bench --features enable-avx

Running Tests

RUSTFLAGS="-C target-feature=+avx2,+fma" \
cargo test -p poulpy-cpu-avx --features enable-avx

To include CKKS backend wiring in the AVX test build:

RUSTFLAGS="-C target-feature=+avx2,+fma" \
cargo test -p poulpy-cpu-avx --features enable-avx,enable-ckks

Basic Usage

This crate exposes two AVX2-accelerated backends:

use poulpy_cpu_avx::{FFT64Avx, NTT120Avx};
use poulpy_hal::{api::ModuleNew, layouts::Module};

let log_n: usize = 10;

// f64 FFT backend (AVX2 + FMA)
let module: Module<FFT64Avx> = Module::<FFT64Avx>::new(1 << log_n);

// Q120 NTT backend (AVX2, CRT over four ~30-bit primes)
let module: Module<NTT120Avx> = Module::<NTT120Avx>::new(1 << log_n);

Once compiled with enable-avx, both backends are usable anywhere Poulpy expects a backend type in the HAL/core/CKKS layers. poulpy-bin-fhe has a separate enable-avx feature for its current crate-local integration.

🤝 Contributors

To implement your own Poulpy backend (SIMD or accelerator):

  1. Define a backend struct and implement the Backend trait from poulpy-hal.
  2. For each HAL operation family, either call the blanket default or implement the OEP trait directly with a custom dispatch.
  3. For each poulpy-core operation family, either call the corresponding impl_*_defaults_full! macro to inherit the portable implementation, or implement the OEP trait directly to override it.
  4. Optionally, do the same for poulpy-ckks behind a backend-owned enable-ckks feature using the impl_ckks_*_defaults! macros or direct OEP trait implementations.

At every layer the macro and the direct implementation are mutually exclusive per operation family: the macro opts the backend into the portable default path, while a direct OEP impl replaces it entirely. There is no requirement to use the macros — a backend that needs full control can implement every OEP trait by hand.

Your backend will automatically integrate with the backend-generic layers:

  • poulpy-hal
  • poulpy-core
  • poulpy-ckks

No modifications to those crates are required — the HAL provides the extension points. Scheme crates that still carry crate-specific backend glue, such as parts of poulpy-bin-fhe in v0.6.0, may need follow-up integration work. Only operations that need a faster implementation require explicit overrides; everything else is inherited from the default layer for free.


For questions or guidance, feel free to open an issue or discussion in the repository.