# gm-crypto-rs
Constant-time-designed pure-Rust SM2 / SM3 / SM4 SDK for Chinese national
cryptography (GB/T 32905 / 32918 / 32907 / GM/T 0009). Sign / verify,
public-key encrypt / decrypt, SM4-CBC, SM4-CTR (single-shot + streaming),
length-flexible batched SM4 block encryption, HMAC-SM3, PBKDF2-HMAC-SM3 —
all secret-touching paths guarded by an in-CI `dudect-bencher`
detectable-leak regression harness.
[crates.io](https://crates.io/crates/gmcrypto-core) · [docs.rs](https://docs.rs/gmcrypto-core)
**Personal project notice:** not affiliated with, endorsed by, sponsored by, or
certified by any upstream cryptography project, payment gateway, standards body,
or vendor.
## What this is
A small, auditable, pure-Rust SM2 / SM3 / SM4 SDK whose central
differentiating commitment is that secret-touching code paths are
**constant-time-designed and guarded by an in-CI [`dudect-bencher`](https://docs.rs/dudect-bencher/)
detectable-leak regression harness**: 17 real `ct_*` targets (12
always-on + 2 cfg-gated under `sm4-bitsliced-simd` + 3 cfg-gated under
`sm4-aead`) plus a deliberately-leaky `negative_control` that proves
the harness can detect leaks. Most real targets gate at `|tau| < 0.20`;
`ct_sign_k_class` and the direct `ct_fn_invert` / `ct_fp_invert` invert
diagnostics carry target-specific gate policy after the 2026-05-12
recalibration — see [`SECURITY.md`](SECURITY.md) and
[`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md).
The harness reports timing-leak detection events. **It does not prove
constant-time behavior.** Low `|tau|` values mean the test could not detect a
leak with the budget given, not that no leak exists. This wording is taken
directly from `dudect-bencher`'s own docs.
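
To make the `|tau|` gate concrete, here is a minimal sketch of the statistic behind it. It assumes the usual dudect construction: Welch's t-test over two classes of timing samples, with `tau = t / sqrt(n)` as the sample-size-normalized value. All function names and the sample data are hypothetical; the real measurements come from `dudect-bencher`, not from this code.

```rust
/// Sample mean and unbiased sample variance.
fn mean_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch's t statistic for two timing classes.
fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let (ma, va) = mean_var(a);
    let (mb, vb) = mean_var(b);
    (ma - mb) / (va / a.len() as f64 + vb / b.len() as f64).sqrt()
}

/// ASSUMPTION: tau is t normalized by the square root of the total sample count.
fn tau(a: &[f64], b: &[f64]) -> f64 {
    welch_t(a, b) / ((a.len() + b.len()) as f64).sqrt()
}

/// Gate in the style of `|tau| < 0.20`.
fn passes_gate(a: &[f64], b: &[f64], threshold: f64) -> bool {
    tau(a, b).abs() < threshold
}

/// Hypothetical timing samples (in ns) with a small deterministic jitter.
fn samples(base: f64, n: usize) -> Vec<f64> {
    (0..n).map(|i| base + (i % 3) as f64 - 1.0).collect()
}
```

With these helpers, `passes_gate(&samples(100.0, 300), &samples(100.0, 300), 0.20)` passes, while comparing against `samples(110.0, 300)` fails the gate. A pass only says that no difference was detected at that sample budget.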
The harness covers: SM2 sign (split by both private key `d` and nonce
`k` magnitude, with both retry nonces class-tied), SM2 decrypt (split
by recipient `d_B`), SM4 key schedule + single-block encrypt (split by
master key, under default linear-scan and `sm4-bitsliced` paths), the
v0.5 SIMD-packed dispatch (`ct_sm4_encrypt_block_bitsliced_simd`,
cfg-gated), v0.6's batched CBC-decrypt fanout
(`ct_sm4_cbc_decrypt_fanout`, cfg-gated), v0.7's SM4-CTR encrypt
(`ct_sm4_ctr_encrypt`, exercising the public batch path on every
cipher matrix entry), v0.8's SM4-GCM + SM4-CCM decrypt
(`ct_sm4_gcm_decrypt` and `ct_sm4_ccm_decrypt`, cfg-gated on
`sm4-aead`), v0.9's incremental-input buffered SM4-GCM decrypt
(`ct_sm4_gcm_decrypt_buffered`, cfg-gated on `sm4-aead`), HMAC-SM3
(split by key), encrypted-PKCS#8
decrypt (split by password bytes — both classes' blobs valid for their
class's password so both succeed via identical control flow), plus
direct `Fn::invert` and `Fp::invert` diagnostics. The `ct_sign_k_class`
target closes v0.1's structural blind spot to nonce-only leaks.
The `crypto-bigint 0.6 → 0.7.3` upgrade resolved the v0.1-era
`ConstMontyForm::invert` leak directly: on the v0.2 W0 harness both
direct invert diagnostics measured `|tau|` below roughly `0.01`, more than an
order of magnitude below the gate. Subsequent GH Actions runner-image drift on
2026-05-12 raised the empirical noise floor on `ct_fn_invert` /
`ct_fp_invert` — both targets moved to PR-smoke telemetry + a nightly
gross-regression sentinel at `|tau| ≥ 0.55`. See
[`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md)
for the data and posture. See [`SECURITY.md`](SECURITY.md) for the full
constant-time discipline.
The differentiator vs. existing Rust SM2 crates (notably
[`RustCrypto/sm2`](https://docs.rs/sm2/), which already aims for constant-time
secret-dependent operations in its design) is **the in-CI regression gate**, not
the design intent in isolation.
## What this isn't
- Not a TLS/TLCP implementation.
- Not SM9, ZUC, or post-quantum.
- Not an HSM/SDF/SKF integration.
- Not a certified cryptographic module.
- Not constant-time on CPUs with data-dependent multiply latencies (some older
x86, some embedded).
- Not a comprehensive SM-crypto library yet — see the milestone roadmap.
## v0.12 scope (shipping)
**SM4-XTS — tweakable mode for disk/sector encryption.** v0.12 adds
`sm4::mode_xts` behind the new opt-in `sm4-xts` feature: single-shot, full
ciphertext stealing, GB/T 17964-2021 (GM-T OID `1.2.156.10197.1.104.10`),
byte-identical to OpenSSL 3.x EVP `SM4-XTS` (`xts_standard=GB`). Design
rationale: [`docs/v0.12-scope.md`](docs/v0.12-scope.md); KAT sourcing:
[`docs/v0.12-xts-kat-sourcing.md`](docs/v0.12-xts-kat-sourcing.md).
- **Default-features users are unaffected** — additive, opt-in, **no new
dependency** (the XTS tweak doubling is a trivial bit-reflected
multiply-by-x, not GHASH, so no `gmcrypto-simd` dep).
- **GB/T 17964, not IEEE 1619** — the two standards differ in the GF(2¹²⁸)
tweak-doubling convention (GB is the bit-reflected / GHASH-style one), so
they produce different ciphertext for multi-block / non-aligned data. v0.12
targets GB (the SM4 national standard + OpenSSL's default for SM4-XTS).
- **Confidentiality only — no authentication.** XTS has no tag; callers needing
integrity use an AEAD mode (GCM/CCM). The per-data-unit tweak-uniqueness
contract is the caller's responsibility.
- 32-byte key (`Key1 ‖ Key2`) + raw 16-byte tweak; lengths `[16 B, 16 MiB]`;
single `None` failure mode. New dudect target `ct_sm4_xts_decrypt`. The
whole-block bulk rides the `Sm4Cipher::encrypt_blocks` batch API (picks up the SIMD
fanout under `sm4-bitsliced-simd`).
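
The difference between the two tweak-doubling conventions can be sketched in a few lines. This is an illustration, not the crate's `mode_xts` code: both function names are hypothetical, and the GB variant assumes that "bit-reflected / GHASH-style" means the NIST SP 800-38D bit order (big-endian value, right shift, reduction byte `0xE1`).

```rust
/// IEEE 1619 convention: the tweak is a little-endian 128-bit value;
/// multiply-by-x is a left shift, reduced with the constant 0x87.
fn double_ieee1619(tweak: [u8; 16]) -> [u8; 16] {
    let t = u128::from_le_bytes(tweak);
    let carry = t >> 127; // bit shifted out of the top
    let reduced = (t << 1) ^ (carry * 0x87); // branch-free conditional XOR
    reduced.to_le_bytes()
}

/// ASSUMED GB/T 17964 convention (GHASH-style, bit-reflected): the tweak is
/// a big-endian 128-bit value; multiply-by-x is a right shift, reduced with
/// 0xE1 in the first byte.
fn double_gb(tweak: [u8; 16]) -> [u8; 16] {
    let t = u128::from_be_bytes(tweak);
    let carry = t & 1; // bit shifted out of the bottom
    let reduced = (t >> 1) ^ (carry * (0xE1u128 << 120));
    reduced.to_be_bytes()
}
```

Feeding the same 16-byte tweak to both functions generally yields different second-block tweaks, which is why the two standards diverge once data spans more than one block.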
**Deferred to v0.13** (per [`docs/v0.12-scope.md`](docs/v0.12-scope.md) §5/§6):
C FFI for SM4-XTS, RustCrypto `aead` trait fit, pinned/noise-isolated dudect
runner, AVX-512 `sbox_x64`, CCM incremental input.
## v0.11 scope (shipped)
**RustCrypto trait-fit modernization.** v0.11 migrates the opt-in
`digest-traits` / `cipher-traits` impls from `digest 0.10` / `cipher 0.4` to
`digest 0.11` / `cipher 0.5` (the `crypto-common 0.2` / `hybrid-array`
generation), in-place. Design rationale:
[`docs/v0.11-scope.md`](docs/v0.11-scope.md).
- **Default-features users are unaffected** — the trait fit is opt-in;
`generic-array` / `hybrid-array` never enter the default dep graph, and every
SM2 / SM3 / SM4 / HMAC / AEAD output is byte-identical (validated against the
full KAT suite + gmssl 3.1.1 interop).
- **BREAKING for trait-fit consumers only:** code enabling
`digest-traits` / `cipher-traits` must bump its own `digest` / `cipher` deps
to `0.11` / `0.5`. HMAC construction via the `Mac` trait moves to
`digest::KeyInit::new_from_slice` (`digest 0.11`'s `Mac` dropped `KeyInit`);
the `cipher` block traits renamed `BlockEncrypt` / `BlockDecrypt` →
`BlockCipherEncrypt` / `BlockCipherDecrypt`.
- **MSRV stays 1.85.** The RustCrypto `aead 0.6` trait fit remains deferred
(still `0.6.0-rc.10`); v0.11 lands the `crypto-common 0.2` line it will need.
**Deferred to v0.12** (per [`docs/v0.11-scope.md`](docs/v0.11-scope.md) §5/§6):
RustCrypto `aead` trait fit, pinned/noise-isolated dudect runner, AVX-512
`sbox_x64`, SM4-XTS, CCM incremental input, Argon2-with-SM3.
## v0.10 scope (shipped)
**Streaming AEAD FFI.** v0.10 exposes the v0.9 incremental-input buffered
SM4-GCM encryptor/decryptor through the `gmcrypto-c` C ABI — the item
v0.9 deferred (Q9.6) now that the Rust streaming API is proven. Additive
behind the existing `sm4-aead` feature. Design rationale:
[`docs/v0.10-scope.md`](docs/v0.10-scope.md).
- **9 streaming AEAD C FFI symbols + 2 opaque handle types** —
`gmcrypto_sm4_gcm_encryptor_t` (output-streaming: `new` / `update` →
ciphertext per chunk / `finalize` + `finalize_with_tag_len` → tag /
`free`) and `gmcrypto_sm4_gcm_decryptor_t` (commit-on-verify: `new` /
`update` buffers and emits **nothing** / `finalize_verify` releases
plaintext only after the constant-time tag check / `free`).
`_finalize*` consume+free the handle; single `GMCRYPTO_ERR` on every
failure (no tag-/length-oracle across the boundary). Mirrors the v0.5
CBC-streaming lifecycle. C example:
[`examples/sm4_gcm_streaming.c`](crates/gmcrypto-c/examples/sm4_gcm_streaming.c).
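
The commit-on-verify lifecycle described above can be sketched in plain Rust. This is a toy model of the handle's shape only, not the crate's code or the C ABI: `ToyDecryptor` and `toy_encrypt` are hypothetical names, the "cipher" is a repeating-key XOR, and the "tag" is an XOR fold, so nothing here is cryptography.

```rust
/// Toy decryptor illustrating commit-on-verify. NOT cryptography.
struct ToyDecryptor {
    key: [u8; 16],
    buffered: Vec<u8>, // ciphertext held back until the tag verifies
    tag_acc: [u8; 16], // running toy tag over the ciphertext
}

impl ToyDecryptor {
    fn new(key: [u8; 16]) -> Self {
        Self { key, buffered: Vec::new(), tag_acc: key }
    }

    /// Buffers the chunk and folds it into the tag; emits nothing.
    fn update(&mut self, chunk: &[u8]) {
        for &byte in chunk {
            let pos = self.buffered.len();
            self.tag_acc[pos % 16] ^= byte.rotate_left((pos % 8) as u32);
            self.buffered.push(byte);
        }
    }

    /// Consumes the decryptor; releases plaintext only if the tag matches.
    fn finalize_verify(self, tag: &[u8; 16]) -> Option<Vec<u8>> {
        // Accumulate every byte difference so the compare never exits early.
        let diff = self.tag_acc.iter().zip(tag).fold(0u8, |d, (a, b)| d | (a ^ b));
        if diff != 0 {
            return None; // single failure mode, no partial plaintext
        }
        Some(self.buffered.iter().enumerate().map(|(i, c)| c ^ self.key[i % 16]).collect())
    }
}

/// Toy single-shot encrypt used to produce inputs: returns (ciphertext, tag).
fn toy_encrypt(key: [u8; 16], pt: &[u8]) -> (Vec<u8>, [u8; 16]) {
    let ct: Vec<u8> = pt.iter().enumerate().map(|(i, p)| p ^ key[i % 16]).collect();
    let mut tag = key;
    for (pos, &byte) in ct.iter().enumerate() {
        tag[pos % 16] ^= byte.rotate_left((pos % 8) as u32);
    }
    (ct, tag)
}
```

Because `finalize_verify` takes `self` by value, the decryptor cannot be reused after finalization, which mirrors the consume-and-free behavior of `_finalize*` on the C side.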
**No public API breakage — purely additive.** v0.9.0 callers can
`cargo update` to v0.10.0 without migration. No new `gmcrypto-core` API;
no new dudect target (the FFI is a thin wrapper over the v0.9
`ct_sm4_gcm_decrypt_buffered`-gated path).
**Deferred to v0.11** (per [`docs/v0.10-scope.md`](docs/v0.10-scope.md)
§5/§6): streaming/incremental CCM, RustCrypto `aead` trait fit (upstream
still `0.6.0-rc.10`), pinned dudect runner, AVX-512 `sbox_x64`, SM4-XTS,
Argon2-with-SM3.
## v0.9 scope (shipped)
**AEAD ergonomics.** v0.9 extends the v0.8 AEAD core with the three
items v0.8 deferred: GCM tag-length parameterization, incremental-input
buffered SM4-GCM, and single-shot AEAD C FFI. All additive behind the
existing `sm4-aead` flag. Design rationale: [`docs/v0.9-scope.md`](docs/v0.9-scope.md).
- **`sm4::GcmTagLen` + `mode_gcm::encrypt_with_tag_len` /
`decrypt_with_tag_len`** — W1. Caller-chosen GCM tag length per NIST
SP 800-38D §5.2.1.2 (`{4, 8, 12, 13, 14, 15, 16}` bytes; truncated
tag = `MSB_t(full_tag)`). `GcmTagLen::new(usize) -> Option<Self>`
centralizes the valid-length policy. The fixed-16-byte `encrypt` /
`decrypt` are unchanged.
- **`sm4::Sm4GcmEncryptor` / `Sm4GcmDecryptor`** — W2. Incremental-input
buffered SM4-GCM (deliberately NOT "streaming"). The
**encryptor** is output-streaming: `update(chunk) -> Option<Vec<u8>>`
emits each chunk's ciphertext (`None` once the cumulative plaintext
would exceed the NIST §5.2.1.1 ceiling `2^36 − 32` bytes);
`finalize()` / `finalize_with_tag_len()` emit the tag. The
**decryptor** is input-incremental but output-BUFFERED:
`update(chunk)` buffers ciphertext + folds GHASH, and
`finalize_verify(tag) -> Option<Vec<u8>>` releases the plaintext only
after the constant-time tag check (commit-on-verify — never leaks
pre-verify bytes). AAD is supplied at construction. Driven with any
chunking, both reproduce the single-shot path byte-for-byte.
- **6 single-shot AEAD C FFI entry points** — W4.
`gmcrypto_sm4_gcm_encrypt` / `_decrypt` / `_encrypt_with_tag_len` /
`_decrypt_with_tag_len` + `gmcrypto_sm4_ccm_encrypt` / `_decrypt`, behind a
new forwarding `sm4-aead` feature on `gmcrypto-c`. Every error path
returns `GMCRYPTO_ERR` (single failure code). Streaming AEAD FFI is
deferred to v0.10.
- **New dudect target `ct_sm4_gcm_decrypt_buffered`** — W3. Class-split
by master key, drives `Sm4GcmDecryptor`; `|tau| < 0.20` (5K-sample
smoke `|τ| ≈ 0.029`). No new CI matrix slot — rides the existing
`sm4-aead` entries.
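
The tag-length policy from the first bullet is small enough to sketch. The type below is a hypothetical stand-in (deliberately named `TagLen`, not `GcmTagLen`) showing how a validating constructor and `MSB_t` truncation could fit together; it assumes nothing about the crate's internals beyond the lengths listed above.

```rust
/// Hypothetical stand-in for a validated GCM tag length, in bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TagLen(usize);

impl TagLen {
    /// Accepts only the byte lengths 4, 8, 12, 13, 14, 15, 16.
    fn new(len: usize) -> Option<Self> {
        match len {
            4 | 8 | 12..=16 => Some(Self(len)),
            _ => None,
        }
    }

    /// MSB_t(full_tag): the leading `t` bytes of the 16-byte tag.
    fn truncate(self, full_tag: &[u8; 16]) -> Vec<u8> {
        full_tag[..self.0].to_vec()
    }
}
```

Centralizing the check in the constructor means every later use of a `TagLen` value can slice the tag without re-validating the length.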
**No public API breakage — purely additive.** v0.8.0 callers can
`cargo update` to v0.9.0 without migration; `sm4-aead` is opt-in.
**Deferred to v0.10** (per [`docs/v0.9-scope.md`](docs/v0.9-scope.md)
§5/§6): CCM incremental input, streaming AEAD FFI, RustCrypto `aead`
trait fit (upstream still on `0.6.0-rc`), pinned dudect runner,
AVX-512 `sbox_x64`, SM4-XTS, Argon2-with-SM3.
## v0.8 scope (shipped)
The **AEAD core**. v0.8 cashed in the cipher-mode surface that v0.7
opened up: SM4-GCM and SM4-CCM single-shot, plus a constant-time
GHASH primitive in `gmcrypto-simd`.
- **`sm4::mode_gcm::encrypt` / `decrypt`** — W2. Single-shot SM4-GCM
per NIST SP 800-38D / GM/T 0009 / RFC 8998. `encrypt(key, nonce,
aad, pt) -> (Vec<u8>, [u8; 16])` returns `(ciphertext, tag)`.
`decrypt(key, nonce, aad, ct, tag) -> Option<Vec<u8>>` —
`Some(plaintext)` only when the tag verifies (constant-time
compare via `subtle::ConstantTimeEq`). Both 12-byte canonical
and arbitrary-length nonce paths supported. Tag length fixed at
128 bits in v0.8 (parameterized in v0.9 via `GcmTagLen`).
**Byte-identical to gmssl 3.1.1 `sm4 -gcm`** — bidirectional
interop validated.
- **`sm4::mode_ccm::encrypt` / `decrypt`** — W3. Single-shot SM4-CCM
per NIST SP 800-38C / RFC 3610 / GM/T 0009 (OID
`1.2.156.10197.1.104.9`). `encrypt(key, nonce, aad, pt, tag_len)
-> Option<Vec<u8>>` (output: `ciphertext ‖ tag`). `tag_len ∈
{4, 6, 8, 10, 12, 14, 16}` per spec, validated at API entry.
`nonce.len() ∈ [7, 13]`. Pure-Rust CBC-MAC + CTR over the
existing `Sm4Cipher` path — no GHASH. **Byte-identical to OpenSSL
3.x EVP `SM4-CCM`** across 8 KAT scenarios (gmssl 3.1.1 doesn't
ship `sm4 -ccm` so the CCM reference oracle comes from OpenSSL;
see [`docs/v0.8-ccm-kat-sourcing.md`](docs/v0.8-ccm-kat-sourcing.md)).
- **`gmcrypto_simd::ghash::ghash_mul(h, x) -> [u8; 16]`** — W1.
Constant-time GHASH multiplication over `GF(2^128) /
(x^128 + x^7 + x^2 + x + 1)`. Single dispatch entry point:
- `ghash_mul_clmul` on `x86_64` (PCLMULQDQ + SSE2; runtime
cpufeatures detect; Intel Westmere+ / AMD Bulldozer+).
- `ghash_mul_pmull` on `aarch64` (ARMv8.0 AES extension
`vmull_p64`; runtime cpufeatures detect; Apple Silicon /
most modern ARM chips).
- `ghash_mul_software` (bit-serial mask-XOR; constant-time over
both inputs; available everywhere as fallback).
- **New `sm4-aead` feature flag** — default-off; opt-in.
`sm4-aead = ["dep:gmcrypto-simd"]` activates `mode_gcm` and
`mode_ccm`. Additive on the default-features build.
- **New dudect targets `ct_sm4_gcm_decrypt` + `ct_sm4_ccm_decrypt`**
— W4. Class-split by master key over a fixed 256-byte
plaintext + 16-byte AAD. Both classes' `(ct, tag)` pairs are
valid encrypts under their **own** keys, so both decrypt paths
reach the tag-compare via identical control flow. Same
`|tau| < 0.20` gate as the rest of the SM4 surface; new CI
matrix slot `sm4-bitsliced-simd,sm4-aead` exercises the
most-demanding cipher-stack combination.
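
For readers unfamiliar with the bit-serial mask-XOR approach, the following is a minimal sketch of a GF(2^128) multiply in the GHASH bit order from NIST SP 800-38D. It is not the `gmcrypto-simd` implementation: `ghash_mul_sketch` is a hypothetical name, and a compiler is free to reshape code like this, so it illustrates the technique rather than guaranteeing timing behavior.

```rust
/// Reduction constant R = 11100001 || 0^120 from NIST SP 800-38D.
const R: u128 = 0xE1 << 120;

/// Bit-serial GF(2^128) multiply in the GHASH bit order (bit 0 is the most
/// significant bit of the first byte). Every iteration does the same work;
/// input bits are only ever turned into all-ones / all-zeros masks.
fn ghash_mul_sketch(x: [u8; 16], y: [u8; 16]) -> [u8; 16] {
    let x = u128::from_be_bytes(x);
    let mut v = u128::from_be_bytes(y);
    let mut z = 0u128;
    for i in 0..128 {
        // All-ones when bit i of x (counting from the MSB) is set.
        let x_bit = (x >> (127 - i)) & 1;
        z ^= v & x_bit.wrapping_neg();
        // Multiply v by the polynomial x: shift, then conditionally reduce.
        let lsb = v & 1;
        v = (v >> 1) ^ (R & lsb.wrapping_neg());
    }
    z.to_be_bytes()
}
```

In this bit order the multiplicative identity is the block `0x80, 0x00, ...`, so multiplying any block by it returns the block unchanged, which makes a convenient sanity check.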
**No public API breakage — purely additive.** v0.7.0 callers can
`cargo update` to v0.8.0 without migration; `sm4-aead` is opt-in.
Everything v0.4 shipped (`wasm32-unknown-unknown` build, RustCrypto
trait fit behind `digest-traits` / `cipher-traits`, bitsliced SM4
S-box behind `sm4-bitsliced`, `gmcrypto-c` C ABI crate) is unchanged
— see the Roadmap row for the compact reference and `CHANGELOG.md`
`[0.4.0]` for detail.
Everything v0.3 shipped is unchanged:
- Reusable strict-canonical DER reader / writer subset
(`gmcrypto_core::asn1::{reader, writer, oid}`).
- PEM + encrypted PKCS#8 + X.509 SPKI + SEC1 codecs
(`gmcrypto_core::{pem, pkcs8, spki, sec1}`).
- Full bidirectional gmssl 3.1.1 interop (SM2 sign / verify, SM2
encrypt / decrypt, SM4-CBC). Gated on `GMCRYPTO_GMSSL=1`.
- Raw byte-concat SM2 ciphertext helpers
(`gmcrypto_core::sm2::raw_ciphertext`): `C1 || C3 || C2`
emit + decode; legacy `C1 || C2 || C3` decrypt-only.
- Streaming `HmacSm3` + `Sm4Cbc{En,De}cryptor`. In-crate
`Hash` / `Mac` / `BlockCipher` traits (`gmcrypto_core::traits`).
- Comb-table `mul_g` (~5× sign-side speedup). 64 sub-tables of 16
entries each, lazily built once per process via `spin::Once`.
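
The comb-table idea from the last bullet can be sketched with a toy group. The code below is an assumption-laden illustration, not the crate's `mul_g`: integers modulo `M` under addition stand in for curve points, a `u64` scalar uses 16 four-bit windows instead of 64, and all names are hypothetical. What carries over is the structure: one 16-entry sub-table per window, read with a linear scan so the lookup index never drives a memory address.

```rust
const WINDOWS: usize = 16; // 16 nibbles cover a u64 scalar
const M: u64 = 1_000_000_007; // toy modulus standing in for the group

/// table[i][j] = j * 16^i * g (mod M), the toy analogue of j * 16^i * G.
fn build_comb_table(g: u64) -> [[u64; 16]; WINDOWS] {
    let mut table = [[0u64; 16]; WINDOWS];
    let mut base = g % M;
    for sub in table.iter_mut() {
        for (j, entry) in sub.iter_mut().enumerate() {
            *entry = (base * j as u64) % M;
        }
        base = (base * 16) % M;
    }
    table
}

/// Linear-scan select: touches all 16 entries and keeps one via a mask.
fn ct_select(sub: &[u64; 16], idx: u8) -> u64 {
    let mut out = 0u64;
    for (j, &entry) in sub.iter().enumerate() {
        let diff = (j as u64) ^ (idx as u64);
        let nonzero = (diff | diff.wrapping_neg()) >> 63; // 1 if j != idx
        let mask = nonzero.wrapping_sub(1); // all-ones if j == idx
        out |= entry & mask;
    }
    out
}

/// Fixed-base "scalar multiplication": one table lookup and add per window.
fn mul_g_sketch(table: &[[u64; 16]; WINDOWS], k: u64) -> u64 {
    let mut acc = 0u64;
    for (i, sub) in table.iter().enumerate() {
        let nibble = ((k >> (4 * i)) & 0xF) as u8;
        acc = (acc + ct_select(sub, nibble)) % M;
    }
    acc
}
```

The table costs one up-front build, after which each multiplication is a fixed number of lookups and additions with no doublings, which is where a sign-side speedup of this kind comes from.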
Everything v0.2 shipped is unchanged:
- SM3 hash function (`#![no_std]` + `alloc`).
- SM2 sign / verify with custom signer ID (default `1234567812345678` per GM/T 0009).
- SM2 public-key encrypt / decrypt with GM/T 0009-2012 ciphertext DER
(`SEQUENCE { x, y, hash, ciphertext }`). Invalid-curve attack defense
via on-curve check on `C1` before scalar mult; non-branching
KDF-zero detection so a chosen-ciphertext attacker cannot distinguish
it from a normal MAC failure.
- SM4 block cipher (GB/T 32907-2016) and SM4-CBC (PKCS#7 padding,
caller-supplied unpredictable IV per NIST SP 800-38A Appendix C).
Constant-time-designed `subtle` linear-scan S-box (~1-2M blocks/s);
opt-in bitsliced (table-less, gate-only) S-box via the
`sm4-bitsliced` feature (v0.4 W3). PKCS#7 strip uses a
constant-time scan over the final block; `decrypt` collapses every
failure mode to a single `None` against padding-oracle attacks.
- HMAC-SM3 per RFC 2104, gmssl-cross-validated KAT vectors. Hash-first
long-key path. v0.3 adds the streaming `HmacSm3` shape alongside
single-shot `hmac_sm3`.
- PBKDF2-HMAC-SM3 per RFC 8018 §5.2. Caller-supplied output buffer
(no internal allocation, no iteration-count default).
- Constant-time-designed `Fp` and `Fn` field arithmetic via
`crypto-bigint = 0.7.3`.
- Renes-Costello-Batina complete addition formulas for the SM2 curve (a=-3 specialized).
- Fixed-base (v0.3 comb-table) and variable-base scalar multiplication,
both constant-time-designed with `subtle::ConditionallySelectable`
linear-scan table lookup.
- Fixed-K masked-select signing retry: the retry loop runs `K=2` iterations
unconditionally, regardless of which iteration produced a valid signature.
The constant-time contract holds for any RNG that respects `CryptoRng`;
pathological RNGs cannot leak the secret via observable retry count.
- Strict canonical ASN.1 DER for `SEQUENCE { r, s }` (signatures), the
GM/T 0009 SM2 ciphertext SEQUENCE, and all v0.3 PEM / PKCS#8 / SPKI
/ SEC1 wire formats. Rejects non-canonical leading-zero padding,
sign-bit-set first bytes, empty content, and (for ciphertext
coordinates) values `≥ p`.
- KAT vectors from GB/T 32905-2016 (SM3), GB/T 32918.2-2017 / .5-2017
(SM2), GB/T 32907-2016 Appendix A.1 (SM4 single-block + 1M-round),
GM/T 0042-2015 (HMAC-SM3), GM/T 0091-2020 (PBKDF2-HMAC-SM3).
- `gmssl` CLI cross-validation for HMAC-SM3, PBKDF2-HMAC-SM3, and
(new in v0.3) SM2 sign/verify, SM2 encrypt/decrypt, and SM4-CBC
in both directions. Gated on `GMCRYPTO_GMSSL=1`.
- `dudect-bencher` harness — 17 real `ct_*` targets (12 always-on + 2
cfg-gated under `sm4-bitsliced-simd` + 3 cfg-gated under `sm4-aead`)
plus a deliberately-leaky `negative_control` that proves the harness
can detect leaks. Matrix-run under `features=default`,
`sm4-bitsliced`, `sm4-bitsliced-simd`, and `sm4-bitsliced-simd,sm4-aead`
— PR-smoke 10⁴ samples; nightly 10⁵ samples (more samples = tighter
empirical confidence at the same threshold). Most real targets gate
at `|tau| < 0.20`; per-target policy in [`SECURITY.md`](SECURITY.md).
- Failure-mode invariant: every `Result`-returning public API uses
the workspace-wide `gmcrypto_core::Error` (single `Failed` variant,
`#[non_exhaustive]`); per-module aliases `sm2::Error`, `pem::Error`,
`pkcs8::Error` all point at the same type. `verify_with_id` returns
`bool`; DER decode returns `Option`. Defense against padding-oracle,
malleability, and invalid-curve attacks.
- Zeroization on private keys, SM4 round keys, HMAC `K'` /
`K' XOR ipad` / `K' XOR opad`, PBKDF2 intermediates, SM2 KDF
buffers, and PKCS#8 inner-key scratch.
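
As one illustration of the techniques in this list, here is a minimal sketch of a PKCS#7 strip that scans the whole final block and collapses every malformed case into a single `None`. It is not the crate's code: the function names are hypothetical, it uses hand-rolled masks rather than `subtle`, and an optimizing compiler is not obliged to preserve the branch-free shape.

```rust
/// Returns 0xFF when `a < b`, else 0x00, without branching on the inputs.
fn lt_mask(a: u8, b: u8) -> u8 {
    ((a as u16).wrapping_sub(b as u16) >> 8) as u8
}

/// Validates PKCS#7 padding on the final 16-byte block and returns the number
/// of plaintext bytes it carries. Every malformed case yields the same `None`.
fn pkcs7_payload_len(last_block: &[u8; 16]) -> Option<usize> {
    let pad = last_block[15];
    // pad must lie in 1..=16: flag pad == 0 (pad < 1) and pad > 16.
    let mut bad = lt_mask(pad, 1) | lt_mask(16, pad);
    // Scan all 16 bytes; only positions inside the padding contribute.
    for i in 0..16u8 {
        let in_pad = lt_mask(i, pad); // 0xFF for the last `pad` bytes
        bad |= in_pad & (last_block[15 - i as usize] ^ pad);
    }
    if bad == 0 {
        Some(16 - pad as usize)
    } else {
        None
    }
}
```

The only data-dependent branch is the final success-or-failure decision, so a caller cannot tell a bad pad length from a bad pad byte by the return value.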
## Roadmap
| Version | Scope |
|---|---|
| v0.2 (shipped) | SM4 + SM4-CBC, HMAC-SM3, PBKDF2-HMAC-SM3, SM2 encrypt/decrypt + GM/T 0009 ciphertext DER, dudect harness expansion to 11 targets. See [`CHANGELOG.md`](CHANGELOG.md) `[0.2.0]`. |
| v0.3 (shipped) | Reusable ASN.1 reader/writer subset; PEM, encrypted PKCS#8, X.509 SPKI, SEC1; full bidirectional gmssl interop (incl. SM2 sign/verify + SM2 encrypt/decrypt with PEM-wrapped keys + SM4-CBC); raw byte-concat ciphertext helpers (`C1\|\|C3\|\|C2` modern + legacy `C1\|\|C2\|\|C3` decrypt); streaming `HmacSm3` / `Sm4CbcEncryptor` / `Sm4CbcDecryptor` + in-crate `Hash`/`Mac`/`BlockCipher` traits; comb-table `mul_g` (~5× sign-side speedup); dudect harness expanded to 12 targets. See [`CHANGELOG.md`](CHANGELOG.md) `[0.3.0]`. |
| v0.4 (shipped) | `wasm32-unknown-unknown` build target; RustCrypto-trait fit (`digest::Digest` / `digest::Mac` / `cipher::BlockEncrypt`/`BlockDecrypt`) behind opt-in `digest-traits` / `cipher-traits` feature flags; bitsliced (table-less, gate-only) SM4 S-box behind the opt-in `sm4-bitsliced` feature; new `gmcrypto-c` workspace member exposing the SM2/SM3/SM4/HMAC/PBKDF2 surface as a C ABI (cdylib + staticlib + cbindgen-generated header). See [`CHANGELOG.md`](CHANGELOG.md) `[0.4.0]`. |
| v0.5.0 (shipped) | C-ABI completeness (streaming CBC + raw-byte SM2 ciphertext + caller-supplied RNG callback); `sm4-bitsliced-simd` feature-flag scaffolding — v0.5.0 ships no SIMD fast path (the feature transparently delegates to the v0.4 single-block bitslice); BREAKING ergonomic cleanup — workspace-wide `gmcrypto_core::Error`, `Sm2PrivateKey::new(U256)` → `from_scalar(U256)` (gated behind `crypto-bigint-scalar`) + always-on `from_bytes_be(&[u8; 32])` constructor, `std` feature removed. See [`CHANGELOG.md`](CHANGELOG.md) `[0.5.0]`. |
| v0.5.1 (shipped) | W4 phase 2 — new sibling crate `gmcrypto-simd` carrying an **AVX2 8-way packed bitsliced SM4 S-box** behind opt-in `sm4-bitsliced-simd`, with runtime CPU detection (`cpufeatures`) and silent scalar fallback on non-AVX2 hosts. v0.5.1's `tau` dispatch fed the AVX2 path with 7 wasted lanes; production throughput matched v0.4 single-block bitslice. Dudect calibration update — `ct_fn_invert` / `ct_fp_invert` moved to PR-smoke telemetry + 100K nightly gross-regression sentinel after a GH Actions `ubuntu-24.04` runner-image shift on 2026-05-12 raised the empirical noise floor; see `docs/v0.5-dudect-recalibration.md`. See [`CHANGELOG.md`](CHANGELOG.md) `[0.5.1]`. |
| v0.6.0 (shipped) | **W4 milestone close-out — the throughput-win release.** W4 phase 3: NEON 4-way bitsliced SM4 on `aarch64` (compile-time baseline) + AVX2 32-byte full-width packed S-box (`sbox_x32`) + `Sm4CbcDecryptor::process_chunk` SIMD fanout. Per round of the SM4 decrypt, batched blocks' `tau` inputs pack into one SIMD register (32 bytes on x86_64 / 8-block batch, 16 bytes on aarch64 / 4-block batch) — 32× fewer SIMD dispatches per 8-block batch than v0.5.1. CBC encryption stays single-block (chain-of-blocks defeats SIMD packing). New dudect target `ct_sm4_cbc_decrypt_fanout` (Q6.7) gates the fanout path at `\|tau\| < 0.20`. Exhaustive lane-position-shifted SIMD tests (8192 + 4096 cases) per Q6.8. **No public API changes; no breaking changes — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.6.0]` and `docs/v0.6-scope.md`. |
| v0.7.0 (shipped) | **Cipher-mode surface expansion.** First version where v0.6's SIMD machinery is callable from user code outside the CBC-decrypt internal path. New: public length-flexible `Sm4Cipher::encrypt_blocks` / `decrypt_blocks` (W1; Q7.7); single-shot `sm4::mode_ctr::encrypt` / `decrypt` (W2; GM/T 0002-2012 §5.4); streaming `sm4::ctr_streaming::Sm4CtrCipher` (W3); new dudect target `ct_sm4_ctr_encrypt` (gates `\|tau\| < 0.20` on every cipher path). Plus the v0.8 AEAD scope doc (`docs/v0.7-aead-scope.md`, Q8.1–Q8.8 sign-off + v0.9 candidate Q-list). **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.7.0]`. |
| v0.8.0 (shipped) | **AEAD core — SM4-GCM + SM4-CCM.** Per `docs/v0.7-aead-scope.md` Q8.1–Q8.8. New: `gmcrypto_simd::ghash::ghash_mul` constant-time GHASH primitive (CLMUL on `x86_64` / PMULL on `aarch64` / software Karatsuba fallback; W1); `sm4::mode_gcm::encrypt` / `decrypt` byte-identical to gmssl 3.1.1 `sm4 -gcm` with bidirectional interop (W2); `sm4::mode_ccm::encrypt` / `decrypt` byte-identical to OpenSSL 3.x EVP `SM4-CCM` across 8 KAT scenarios (W3; gmssl 3.1.1 lacks `sm4 -ccm` so OpenSSL is the oracle — see `docs/v0.8-ccm-kat-sourcing.md`); new dudect targets `ct_sm4_gcm_decrypt` + `ct_sm4_ccm_decrypt` + new CI matrix slot `sm4-bitsliced-simd,sm4-aead` (W4). Behind opt-in `sm4-aead` feature flag (additive; default-off). **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.8.0]`. |
| v0.9.0 (shipped) | **AEAD ergonomics.** Per `docs/v0.9-scope.md` Q9.1–Q9.10. New: `sm4::GcmTagLen` + `mode_gcm::encrypt_with_tag_len` / `decrypt_with_tag_len` (NIST SP 800-38D §5.2.1.2 truncated tags; W1); incremental-input buffered `sm4::Sm4GcmEncryptor` (output-streaming) / `Sm4GcmDecryptor` (output-buffered, commit-on-verify) — differential-KAT-equal to single-shot across arbitrary chunking (W2); new dudect target `ct_sm4_gcm_decrypt_buffered` (W3); 6 single-shot AEAD C FFI symbols (`gmcrypto_sm4_gcm_*` / `gmcrypto_sm4_ccm_*`) behind a forwarding `sm4-aead` feature on `gmcrypto-c` (W4). Behind the existing `sm4-aead` flag. **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.9.0]`. |
| v0.10.0 (shipped) | **Streaming AEAD FFI — SM4-GCM.** Per `docs/v0.10-scope.md` Q10.1–Q10.11. New: 9 `gmcrypto-c` FFI symbols + 2 opaque handle types exposing the v0.9 incremental-input buffered SM4-GCM encryptor (output-streaming) / decryptor (commit-on-verify) to C/C++/Go/Zig/Python — `gmcrypto_sm4_gcm_encryptor_{new,update,finalize,finalize_with_tag_len,free}` + `gmcrypto_sm4_gcm_decryptor_{new,update,finalize_verify,free}`, behind the existing `sm4-aead` feature on `gmcrypto-c`; `_finalize*` consume+free, single `GMCRYPTO_ERR`; C example `examples/sm4_gcm_streaming.c`. `regen-header` now implies `sm4-aead` (cbindgen drops cfg-gated opaque structs otherwise). No new `gmcrypto-core` API; no new dudect target. **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.10.0]`. |
| v0.11.0 (shipped) | **RustCrypto trait-fit modernization.** Per `docs/v0.11-scope.md` Q11.1–Q11.11. Migrates the opt-in `digest-traits` / `cipher-traits` impls from `digest 0.10` / `cipher 0.4` to `digest 0.11` / `cipher 0.5` (the `crypto-common 0.2` / `hybrid-array` generation), in-place: `cipher` block backend reshaped to cipher 0.5's separate `BlockCipherEncBackend` / `BlockCipherDecBackend`; HMAC construction via `digest::KeyInit::new_from_slice` (`digest 0.11` `Mac` dropped `KeyInit`). **BREAKING for trait-fit consumers only** (bump your own `digest`/`cipher`); default-features users unaffected, output byte-identical (full KAT + gmssl interop). MSRV stays 1.85; no new dudect target. See [`CHANGELOG.md`](CHANGELOG.md) `[0.11.0]`. |
| v0.12.0 (shipping) | **SM4-XTS — tweakable disk/sector mode.** Per `docs/v0.12-scope.md` Q12.1–Q12.13. New: `sm4::mode_xts::{encrypt, decrypt}` + `XTS_KEY_SIZE` behind the opt-in `sm4-xts` feature — GB/T 17964-2021 (GM-T OID `1.2.156.10197.1.104.10`), full ciphertext stealing, byte-identical to OpenSSL 3.x EVP `SM4-XTS` (`xts_standard=GB`; **not** IEEE 1619 — they differ in the GF(2¹²⁸) tweak doubling). 32-byte key (`Key1 ‖ Key2`) + raw 16-byte tweak, lengths `[16 B, 16 MiB]`, single `None` failure mode, confidentiality-only (no auth). Pure-core (**no new dependency**); rides the `Sm4Cipher::encrypt_blocks` batch API + SIMD fanout. New dudect target `ct_sm4_xts_decrypt`. Also fixes a latent CI bug where the feature-conditional dudect gates never fired. C FFI deferred to v0.13. **Additive — no public API breakage.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.12.0]`. |
| v0.13+ | Per `docs/v0.12-scope.md` §5/§6 (Q13.x): C FFI for SM4-XTS (`gmcrypto_sm4_xts_*`); RustCrypto `aead` trait fit (upstream still on `0.6.0-rc.10`; v0.11 landed the `crypto-common 0.2` line it needs); pinned / noise-isolated dudect runner; AVX-512 16-way `sbox_x64`; CCM incremental input (consumer-driven); Argon2-with-SM3 alternative KDF (research-only); `wasm-bindgen-test` KAT runner. Each lands behind its own scope-doc cycle. |
| v1.0 | API stabilization. |
## Quick-start
```rust
use gmcrypto_core::sm2::{
    sign_with_id, verify_with_id, Sm2PrivateKey, Sm2PublicKey, DEFAULT_SIGNER_ID,
};
use getrandom::SysRng;
use hex_literal::hex;
use rand_core::UnwrapErr;
// v0.5 W5 — `from_bytes_be` is the recommended public constructor
// (always-on, doesn't expose `crypto_bigint::U256` to callers).
let d_be: [u8; 32] = hex!(
    "3945208F7B2144B13F36E38AC6D39F95889393692860B51A42FB81EF4DF7C5B8"
);
let key = Sm2PrivateKey::from_bytes_be(&d_be).expect("d in [1, n-2]");
let public = Sm2PublicKey::from_point(key.public_key());
let mut rng = UnwrapErr(SysRng);
let sig = sign_with_id(&key, DEFAULT_SIGNER_ID, b"hello", &mut rng).unwrap();
assert!(verify_with_id(&public, DEFAULT_SIGNER_ID, b"hello", &sig));
```
## Threat model
See [`SECURITY.md`](SECURITY.md). Briefly: server-side use on a dedicated,
operator-trusted host. Network MITM is in scope; side-channel attacks beyond
what the dudect harness covers are NOT in scope.
## Build & test
```bash
cargo test --workspace # unit + integration
cargo bench --bench timing_leaks --features crypto-bigint-scalar # local timing harness (~75s)
DUDECT_SAMPLES=10000 cargo bench --bench timing_leaks --features crypto-bigint-scalar # match CI smoke budget
```
`gmssl` interop test (gated; install [`gmssl`](https://github.com/guanzhi/GmSSL)
v3.1.1 to enable):
```bash
GMCRYPTO_GMSSL=1 cargo test --test interop_gmssl
```
## wasm32 support
`gmcrypto-core` builds on `wasm32-unknown-unknown` as of v0.4. CI gates
both stable and MSRV (1.85) builds on the target.
```bash
rustup target add wasm32-unknown-unknown
cargo build -p gmcrypto-core --target wasm32-unknown-unknown --no-default-features
```
The crate is `no_std + alloc` only and does NOT pull `getrandom`'s
`wasm_js` backend or `wasm-bindgen` / `js-sys` into its default dep
graph. Wasm callers wire their own `rand_core::Rng` impl — typically
by enabling `getrandom`'s `wasm_js` feature in *their* `Cargo.toml`:
```toml
[dependencies]
gmcrypto-core = "0.7"
rand_core = { version = "0.10", default-features = false }
getrandom = { version = "0.4", default-features = false, features = ["wasm_js"] }
```
```rust
use gmcrypto_core::sm2::{sign_with_id, Sm2PrivateKey, DEFAULT_SIGNER_ID};
use rand_core::UnwrapErr;
use getrandom::SysRng;
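// `priv_key` is an `Sm2PrivateKey`, constructed as in the Quick-start
// (for example via `Sm2PrivateKey::from_bytes_be`).
let mut rng = UnwrapErr(SysRng); // wasm_js-backed when targeting wasm32
let sig = sign_with_id(&priv_key, DEFAULT_SIGNER_ID, b"msg", &mut rng).unwrap();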
```
A `wasm-bindgen-test`-driven test runner (running KAT vectors under
Node or a headless browser) is post-v0.4 — v0.4 ships the build-target
gate only.
## License
Apache-2.0. See [`LICENSE`](LICENSE).
Some reference outputs use the upstream [`gmssl`](https://github.com/guanzhi/GmSSL)
tool. This project is independent of that project.