// gmcrypto_simd/lib.rs

//! SIMD backends for `gmcrypto-core` (v0.5 W4).
//!
//! This crate quarantines the unavoidable SIMD `unsafe` (AVX2
//! intrinsics on `x86_64`, NEON on `aarch64` in W4 phase 3) so that
//! `gmcrypto-core` itself can keep `unsafe_code = "forbid"`. The
//! posture mirrors the established [`gmcrypto-c`] precedent (FFI
//! shim with `unsafe_code = "warn"`).
//!
//! The crate exposes a small Rust-internal API surface only (no raw
//! pointers, no C ABI). It is `rlib`-only; the single C-ABI surface
//! for downstream callers remains [`gmcrypto-c`].
//!
//! # v0.5 W4 phase 2 scope
//!
//! - x86_64 AVX2 8-way packed bitsliced SM4 S-box
//!   ([`sm4::sbox_x8::sbox_x8`]), with runtime AVX2 detection via the
//!   `cpufeatures` crate and silent scalar fallback on non-AVX2 CPUs.
//! - Scalar fallback path delegates to `gmcrypto-core`'s v0.4 W3
//!   single-block bitslice ([`gmcrypto_core::sm4::sbox_bitsliced::sbox`]),
//!   so the byte-output is identical across AVX2-on / AVX2-off /
//!   non-x86_64 dispatch.
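//!
//! The detect-then-fall-back shape described above can be sketched as
//! follows. This is a minimal illustration, not this crate's code: the
//! `toy_*` names are hypothetical, the transform is a plain XOR rather
//! than the SM4 S-box, the width is 32 bytes rather than this crate's
//! 8-way packing, and `std::is_x86_feature_detected!` stands in for
//! the `cpufeatures` crate so that the example needs no dependencies.
//!
//! ```rust
//! // Toy per-byte transform standing in for an S-box (NOT the SM4 S-box).
//! fn toy_scalar(x: u8) -> u8 {
//!     x ^ 0x5a
//! }
//!
//! // Portable fallback: apply the toy transform to each byte in turn.
//! fn toy_x32_scalar(input: [u8; 32]) -> [u8; 32] {
//!     input.map(toy_scalar)
//! }
//!
//! // AVX2 path: XOR all 32 bytes with the constant in one 256-bit operation.
//! #[cfg(target_arch = "x86_64")]
//! #[target_feature(enable = "avx2")]
//! unsafe fn toy_x32_avx2(input: [u8; 32]) -> [u8; 32] {
//!     use std::arch::x86_64::*;
//!     let mut out = [0u8; 32];
//!     // SAFETY: both arrays are exactly 32 bytes and the unaligned
//!     // load/store intrinsics place no alignment requirement on them.
//!     unsafe {
//!         let v = _mm256_loadu_si256(input.as_ptr() as *const __m256i);
//!         let k = _mm256_set1_epi8(0x5a);
//!         let r = _mm256_xor_si256(v, k);
//!         _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, r);
//!     }
//!     out
//! }
//!
//! // Dispatcher: take the AVX2 path only when the CPU reports support,
//! // otherwise fall back silently to the scalar path.
//! fn toy_x32(input: [u8; 32]) -> [u8; 32] {
//!     #[cfg(target_arch = "x86_64")]
//!     {
//!         if std::is_x86_feature_detected!("avx2") {
//!             // SAFETY: AVX2 support was confirmed at runtime just above.
//!             return unsafe { toy_x32_avx2(input) };
//!         }
//!     }
//!     toy_x32_scalar(input)
//! }
//! ```
//!
//! For any `input`, `toy_x32(input)` is expected to equal
//! `toy_x32_scalar(input)` on every target, which is the same
//! byte-identical property the scalar fallback is meant to provide.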
//!
//! # v0.5 W4 phase 3 scope (deferred)
//!
//! - aarch64 NEON 4-way bitsliced S-box (NEON is baseline on
//!   `aarch64`, so no runtime detection is needed).
//! - `Sm4CbcDecryptor::process_chunk` SIMD fanout per Q5.10: the
//!   public-API surface that batches 8 (or 4 on NEON) ciphertext
//!   blocks at once. Until phase 3 lands, the phase 2 SIMD path is
//!   exercised with 7 of 8 lanes carrying replicated input.
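//!
//! One way to picture the replicated-lane arrangement is the sketch
//! below. It is an assumption-laden illustration only: `toy_sbox_x8`
//! and `toy_sbox_single` are hypothetical names, each lane is modelled
//! as a single byte, and the transform is a toy, not the SM4 S-box or
//! this crate's actual lane layout.
//!
//! ```rust
//! // Hypothetical 8-lane batch transform (toy arithmetic, NOT the SM4 S-box).
//! fn toy_sbox_x8(lanes: [u8; 8]) -> [u8; 8] {
//!     lanes.map(|x| x.rotate_left(1) ^ 0x5a)
//! }
//!
//! // Push one input through the 8-lane path: lane 0 carries the input,
//! // the other 7 lanes carry copies of it, and only lane 0 is read back.
//! fn toy_sbox_single(x: u8) -> u8 {
//!     let out = toy_sbox_x8([x; 8]);
//!     out[0]
//! }
//! ```
//!
//! Because every lane sees the same input, all 8 output lanes agree,
//! so reading lane 0 loses nothing; the other lanes are simply unused
//! capacity until a caller can supply 8 distinct blocks.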
//!
//! [`gmcrypto-c`]: https://docs.rs/gmcrypto-c

#![no_std]

pub mod sm4;

mod detect;

pub use detect::has_avx2;