gmcrypto-simd 1.2.0

SIMD backends for gmcrypto-core — AVX2 (x86_64) and NEON (aarch64) packed bitsliced SM4 S-box, quarantined to keep `gmcrypto-core` `unsafe_code = forbid`
# gm-crypto-rs

Constant-time-designed pure-Rust SM2 / SM3 / SM4 SDK for Chinese national
cryptography (GB/T 32905 / 32918 / 32907 / GM/T 0009). Sign / verify,
public-key encrypt / decrypt, SM4-CBC, SM4-CTR (single-shot + streaming),
length-flexible batched SM4 block encryption, HMAC-SM3, PBKDF2-HMAC-SM3 —
all secret-touching paths guarded by an in-CI `dudect-bencher`
detectable-leak regression harness.

[![Crates.io](https://img.shields.io/crates/v/gmcrypto-core.svg)](https://crates.io/crates/gmcrypto-core)
[![Documentation](https://docs.rs/gmcrypto-core/badge.svg)](https://docs.rs/gmcrypto-core)
[![License](https://img.shields.io/crates/l/gmcrypto-core.svg)](https://crates.io/crates/gmcrypto-core)

**Personal project notice:** not affiliated with, endorsed by, sponsored by, or
certified by any upstream cryptography project, payment gateway, standards body,
or vendor.

> ⚠️ **Not independently audited.** No third-party / external security audit has
> been performed. Assurance is internal: a multi-model adversarial pre-publish
> re-audit (see [`docs/v1.0-reaudit.md`](docs/v1.0-reaudit.md)), in-CI KAT vectors,
> maintainer-run gmssl 3.1.1 interop (11/11, gated on `GMCRYPTO_GMSSL` — not run in
> CI), an in-CI `dudect` timing-leak harness, and a 19-target `cargo-fuzz` suite.
> This is a solo-maintained, best-effort open-source project with no support SLA.
> Review the code and **use at your own risk.** See
> [`SECURITY.md`](SECURITY.md) for the threat model and disclosure process.

## What this is

A small, auditable, pure-Rust SM2 / SM3 / SM4 SDK whose central
differentiating commitment is that secret-touching code paths are
**constant-time-designed and guarded by an in-CI [`dudect-bencher`](https://docs.rs/dudect-bencher/)
detectable-leak regression harness**: 19 real `ct_*` targets (12
always-on + 2 cfg-gated under `sm4-bitsliced-simd` + 3 cfg-gated under
`sm4-aead` + 1 cfg-gated under `sm4-xts` + 1 cfg-gated under
`sm2-key-exchange`) plus a deliberately-leaky
`negative_control` that proves
the harness can detect leaks. Most real targets gate at `|tau| < 0.20`;
`ct_sign_k_class` and the direct `ct_fn_invert` / `ct_fp_invert` invert
diagnostics carry target-specific gate policy after the 2026-05-12
recalibration — see [`SECURITY.md`](SECURITY.md) and
[`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md).

The harness reports timing-leak detection events. **It does not prove
constant-time.** Low `|tau|` values mean the test could not detect a leak with
the budget given, not that no leak exists. This wording is taken directly from
`dudect-bencher`'s own docs.
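
To make the `|tau|` gate concrete, here is a minimal sketch of the statistic. It assumes the definition given in `dudect-bencher`'s documentation: Welch's t between the two timing classes, divided by the square root of the total number of measurements. The function names are illustrative and this is not the harness code; both classes are assumed to hold at least two samples.

```rust
/// Sample mean and unbiased sample variance of a slice of timings.
fn mean_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch's t statistic between two timing classes.
fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let (mean_a, var_a) = mean_var(a);
    let (mean_b, var_b) = mean_var(b);
    (mean_a - mean_b) / (var_a / a.len() as f64 + var_b / b.len() as f64).sqrt()
}

/// Assumed definition: tau = t / sqrt(total number of measurements).
fn tau(a: &[f64], b: &[f64]) -> f64 {
    welch_t(a, b) / ((a.len() + b.len()) as f64).sqrt()
}
```

Two classes with identical timings give `tau == 0.0`; two classes whose means differ by far more than their spread give a `|tau|` well above the `0.20` gate. A small `|tau|` still only says that this test, with this sample budget, did not separate the classes.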

The harness covers: SM2 sign (split by both private key `d` and nonce
`k` magnitude, with both retry nonces class-tied), SM2 decrypt (split
by recipient `d_B`), SM4 key schedule + single-block encrypt (split by
master key, under default linear-scan and `sm4-bitsliced` paths), the
v0.5 SIMD-packed dispatch (`ct_sm4_encrypt_block_bitsliced_simd`,
cfg-gated), v0.6's batched CBC-decrypt fanout
(`ct_sm4_cbc_decrypt_fanout`, cfg-gated), v0.7's SM4-CTR encrypt
(`ct_sm4_ctr_encrypt`, exercising the public batch path on every
cipher matrix entry), v0.8's SM4-GCM + SM4-CCM decrypt
(`ct_sm4_gcm_decrypt` and `ct_sm4_ccm_decrypt`, cfg-gated on
`sm4-aead`), v0.9's incremental-input buffered SM4-GCM decrypt
(`ct_sm4_gcm_decrypt_buffered`, cfg-gated on `sm4-aead`), v1.1's full
SM2 key-exchange initiator flow (`ct_sm2_key_exchange`, cfg-gated on
`sm2-key-exchange` — split by static `d_A` with per-class valid
responder transcripts), HMAC-SM3
(split by key), encrypted-PKCS#8
decrypt (split by password bytes — both classes' blobs valid for their
class's password so both succeed via identical control flow), plus
direct `Fn::invert` and `Fp::invert` diagnostics. The `ct_sign_k_class`
target closes v0.1's structural blind spot to nonce-only leaks.

The `crypto-bigint 0.6 → 0.7.3` upgrade resolved the v0.1-era
`ConstMontyForm::invert` leak directly: on the v0.2 W0 harness both
direct invert diagnostics measured under `|tau| ≈ 0.01`, two orders of
magnitude below the gate. Subsequent GH Actions runner-image drift on
2026-05-12 raised the empirical noise floor on `ct_fn_invert` /
`ct_fp_invert` — both targets moved to PR-smoke telemetry + a nightly
gross-regression sentinel at `|tau| ≥ 0.55`. See
[`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md)
for the data and posture. See [`SECURITY.md`](SECURITY.md) for the full
constant-time discipline.

The differentiator vs. existing Rust SM2 crates (notably
[`RustCrypto/sm2`](https://docs.rs/sm2/), which already aims for constant-time
secret-dependent operations in its design) is **the in-CI regression gate**, not
the design intent in isolation.

## What this isn't

- Not a TLS/TLCP implementation.
- Not SM9, ZUC, or post-quantum cryptography.
- Not an HSM/SDF/SKF integration.
- Not a certified cryptographic module.
- Not constant-time on CPUs with data-dependent multiply latencies (some older
  x86, some embedded).
- Not a comprehensive SM-crypto library yet — see the milestone roadmap.

## Stability & SemVer

The line graduated to **1.0 (stable)** with the **1.0.0** release; the current release is
**1.2.0** (the C FFI for SM2 key exchange; see the v1.2 scope below). crates.io history
goes **0.16.0 → 1.0.0 → 1.0.1 → 1.1.0 → 1.2.0**, skipping 0.17.0–0.23.0 (those were
non-publishing assurance + API-finalization milestones; their changes all shipped together
in the first stable `1.0.0`). Every post-1.0 release has been additive (SemVer-checked);
the only migration ever required is 0.16 → 1.0, a single major bump — no published 0.x
consumer ever saw an intermediate break. The public API had been stable in
practice since v0.5; the **v1.0 readiness audit** (v0.21) froze and tooling-guarded
it, the **v0.22 API-tightening cycle** decoupled it from `crypto-bigint 0.7`, and
the **v0.23 pre-1.0 re-audit remediation cycle** applied the API/ABI-finality +
hardening fixes from a multi-model adversarial re-audit
([`docs/v1.0-reaudit.md`](docs/v1.0-reaudit.md)) —
see [`docs/v1.0-readiness.md`](docs/v1.0-readiness.md).

**From 1.0, SemVer is enforced**: breaking changes to the covered surface require a
major bump, and `cargo-semver-checks` runs as the forward breaking-change gate in
CI (the three crates always release together at one lockstep version, with
intra-workspace deps pinned exactly — `=1.2.0`). The runtime wire output (SM2
signatures / ciphertexts, SM4 mode bytes) is byte-identical to 0.16.0.

- **What's covered by SemVer:** the public Rust API of `gmcrypto-core` (the
  surface snapshotted in [`docs/api-baseline/gmcrypto-core.txt`](docs/api-baseline/gmcrypto-core.txt),
  drift-checked in CI) and the `gmcrypto-c` **C ABI** (the committed
  `crates/gmcrypto-c/include/gmcrypto.h`, drift-checked in CI).
- **What's NOT covered:** anything `#[doc(hidden)]` — `sm2::sign_raw_with_id` (the
  dudect harness hook), `Sm4Cbc{Encryptor,Decryptor}::take_output` (FFI-shim drains),
  (v0.22) the low-level SM2 curve arithmetic `sm2::curve` / `sm2::scalar_mul` /
  `ProjectivePoint::to_affine`, and (v0.23) the raw EC point surface
  `sm2::point` / `ProjectivePoint` (the type + module + re-export) +
  `Sm2PublicKey::{from_point, point}`, the low-level `asn1::{reader, writer, oid}`
  modules, and the in-crate `traits::{Hash, Mac, BlockCipher}` module (all kept
  `pub` only for in-repo dev crates); and the entire **`gmcrypto-simd`** crate, which
  is an internal acceleration backend with **no stable Rust API** (use `gmcrypto-core`
  from Rust, `gmcrypto-c` from C). These may change or be removed in any release.
- **High-level key path speaks keys, not points (v0.23).**
  `Sm2PrivateKey::public_key()` returns `Sm2PublicKey` (not the now-internal
  `ProjectivePoint`); `Sm2PublicKey::from_sec1_bytes` is the on-curve-checked public
  point constructor. `spki::{encode, decode}` and `sec1::EcPrivateKey.public` speak
  `Sm2PublicKey`.
- **RNG bound (v0.23).** `sm2::{sign_with_id, encrypt}` name the **fallible**
  `rand_core::TryCryptoRng` bound — a deliberate, documented ecosystem coupling
  (`rand_core` is the RNG interop point, the RustCrypto-wide convention; unlike the
  v0.22 `crypto-bigint` decoupling, replacing it would hurt interop). An RNG failure
  collapses to the single `Failed`, never a panic.
- **Single-shot SM4-GCM `encrypt` is fallible (v0.23).**
  `mode_gcm::{encrypt, encrypt_with_tag_len}` return `Option<…>`, rejecting plaintext
  past the `2^36 − 32`-byte GCM counter ceiling (matching the streaming path and
  `decrypt`).
- **Features are additive** (`default = []`; all 8 are opt-in) and the build is
  `no_std` + `alloc`-only with `unsafe_code = "forbid"` on the core.
- **MSRV is 1.85** (edition 2024); an MSRV bump is treated as a minor, not a patch.
- **`crypto-bigint` decoupling (v0.22):** the **always-on** (default-features) public
  API names **no** `crypto-bigint` types — the byte-adjacent types
  (`asn1::{encode,decode}_sig`, `Sm2Ciphertext::{x,y}`) take/return `[u8; 32]`, and
  the curve/scalar arithmetic is `#[doc(hidden)]` (above). The **only** place a
  `crypto-bigint 0.7` type appears in the public API is the **opt-in**
  `crypto-bigint-scalar` feature's `Sm2PrivateKey::from_scalar(U256)` — enabling that
  feature is an explicit opt-in to the `crypto-bigint 0.7` type contract (a
  `crypto-bigint` major bump would be breaking for that feature). The recommended
  always-on path (`Sm2PrivateKey::from_bytes_be`) avoids it entirely. See
  [`docs/v1.0-readiness.md`](docs/v1.0-readiness.md) §3.A.
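
The `2^36 − 32`-byte GCM ceiling mentioned above follows from the 32-bit block counter: with a 96-bit nonce, `2^32 − 2` counter values remain for data blocks of 16 bytes each. The sketch below works that arithmetic out; `check_gcm_plaintext_len` is a hypothetical helper that only mirrors the shape of an `Option`-returning encrypt and is not part of `gmcrypto-core`.

```rust
/// Largest plaintext one GCM invocation can cover, in bytes:
/// (2^32 - 2) counter blocks of 16 bytes each, which equals 2^36 - 32.
const GCM_MAX_PLAINTEXT_BYTES: u64 = ((1u64 << 32) - 2) * 16;

/// Hypothetical length guard: `Some(len)` when the plaintext fits under
/// the counter ceiling, `None` when it is past it.
fn check_gcm_plaintext_len(len: u64) -> Option<u64> {
    if len <= GCM_MAX_PLAINTEXT_BYTES {
        Some(len)
    } else {
        None
    }
}
```

A caller would treat `None` as "split the message or use a different nonce", exactly as it would treat any other rejected input.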

## v1.2 scope — C FFI for SM2 key exchange

**Completes the core-in-vN / FFI-in-vN+1 cadence for v1.1**: the GM/T 0003.3
key exchange is now reachable from C / C++ / Python / Go / Zig through
`gmcrypto-c` (per `docs/v1.2-scope.md` Q2.1–Q2.10). **9 new symbols, 2 opaque
handle types, 1 new const** (`GMCRYPTO_SM2_KX_CONFIRM_SIZE` = 32) — 72 FFI
entry points total, always-on per the v0.23 posture (the committed
`gmcrypto.h` == a default build; `gmcrypto-core`'s own `sm2-key-exchange`
feature stays opt-in for Rust callers).

- **Handle shape:** the Rust consume-on-transition typestate collapses to two
  opaque handles. The **initiator** is born waiting —
  `gmcrypto_sm2_kx_initiator_new` samples the ephemeral internally and writes
  `R_A` immediately; `_confirm` verifies `S_B`, emits `K` + `S_A`, and
  **consumes + frees**. The **responder**: `_new` → `_respond` (takes `R_A`,
  emits `R_B` + `S_B`; a failed respond spends the handle, a stray second
  respond errors without disturbing the in-flight state) → `_finish` (verifies
  `S_A`, releases `K`, consumes + frees). Misuse ordering collapses to the
  single `GMCRYPTO_ERR`.
- **RNG:** OS (`getrandom::SysRng`) by default, plus `_with_rng` variants
  taking the v0.5 `gmcrypto_rng_callback` — which lets the test suite drive
  fixed standard ephemerals through the ABI.
- **Assurance:** the **GM/T 0003.5 recommended-curve KAT reproduced
  byte-for-byte through the C ABI** (`R_A`/`R_B`/`S_B`/`K`/`S_A` all
  asserted), FFI↔Rust cross-handshakes in both directions, tamper/misuse/null
  negative tests (c_smoke 65 → 76); `fuzz_c_abi` grows a KX op (attacker peer
  wire bytes, asserted spent-handle semantics) + a valid-transcript seed. **No
  new dudect target** (thin shim — core's `ct_sm2_key_exchange` covers the
  secret-dependent path; the v0.13/v0.16 precedent). Doc-only example
  [`crates/gmcrypto-c/examples/sm2_key_exchange.c`](crates/gmcrypto-c/examples/sm2_key_exchange.c)
  (full two-party handshake).
- **The caller owns wiping `key_out`** — the library zeroizes its internal
  copies only.
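
The responder handle rules described above (a failed respond spends the handle, a stray second respond errors without disturbing the in-flight state, finish consumes) can be modelled as a small state machine. This is a toy model in Rust, not the C ABI: `ToyResponder`, `ResponderState`, and `KxError` are hypothetical names, and the boolean arguments stand in for the real cryptographic checks.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ResponderState {
    Fresh,
    InFlight,
    Spent,
}

/// Stands in for the single opaque error code.
#[derive(Debug, PartialEq)]
struct KxError;

struct ToyResponder {
    state: ResponderState,
}

impl ToyResponder {
    fn new() -> Self {
        Self { state: ResponderState::Fresh }
    }

    /// `peer_ok` stands in for "the peer's R_A decoded and validated".
    fn respond(&mut self, peer_ok: bool) -> Result<(), KxError> {
        let state = self.state;
        match state {
            ResponderState::Fresh if peer_ok => {
                self.state = ResponderState::InFlight;
                Ok(())
            }
            // A failed first respond spends the handle.
            ResponderState::Fresh => {
                self.state = ResponderState::Spent;
                Err(KxError)
            }
            // A stray second respond errors and leaves the state untouched.
            ResponderState::InFlight | ResponderState::Spent => Err(KxError),
        }
    }

    /// Consumes the handle; `tag_ok` stands in for the S_A check.
    fn finish(self, tag_ok: bool) -> Result<[u8; 4], KxError> {
        if self.state == ResponderState::InFlight && tag_ok {
            Ok(*b"key!") // placeholder bytes, not a real key
        } else {
            Err(KxError)
        }
    }
}
```

Every misuse path returns the same `KxError`, so a caller learns nothing about which step went wrong.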

## v1.1 scope — SM2 key exchange (GM/T 0003.3)

**Completes the SM2 family**: GM/T 0003.2 sign + 0003.4/.5 encrypt shipped long
ago; v1.1 adds the missing third — **GM/T 0003.3 ≡ GB/T 32918.3-2016 key
agreement with key confirmation** — behind the opt-in **`sm2-key-exchange`**
feature (pure-core, **no new dependency**; the default-features build is
byte-identical).

- **API:** two role state-machines — `Sm2KxInitiator` → `produce_ephemeral` →
  `confirm` → `(Sm2SharedKey, S_A)`, and `Sm2KxResponder` → `respond` →
  `finish` → `Sm2SharedKey`. Each step consumes `self`: an ephemeral cannot be
  reused, and the key is unreachable before confirmation passes. The agreed key
  is `ZeroizeOnDrop`; every failure (off-curve peer `R`, bad tag, RNG failure,
  identity `U`, bad `klen`/`id`) collapses to the single `Error::Failed`.
- **Constant-time posture:** ephemeral via the existing fixed-budget masked
  sampler; `t = (d + x̄·r) mod n` and the scalar mults branch-free; confirmation
  tags compared with `subtle::ConstantTimeEq` only; `t`, the KDF input, and
  `x_U`/`y_U` wiped after use. New dudect target **`ct_sm2_key_exchange`**
  (10K smoke `|tau| ≈ 0.02`, gate `< 0.20`).
- **KAT:** byte-identical to the **GM/T 0003.5-2012 recommended-curve worked
  example** (`K`, `S_A`, `S_B`, all intermediate points) — note the example
  uses the default ID `1234567812345678` for both parties; see
  [`docs/v1.1-sm2kx-kat-sourcing.md`](docs/v1.1-sm2kx-kat-sourcing.md).
- **Assurance:** new fuzz target `fuzz_sm2_kx` (adversarial peer `R_B`/`S_B`
  bytes, no-panic invariant); `sm2-key-exchange` legs across the
  clippy/deny/MSRV/wasm32/dudect CI matrices.
- **C FFI shipped in v1.2** (the core-in-vN / FFI-in-vN+1 cadence — see the
  v1.2 scope section above).
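
The consume-on-transition shape described in the API bullet can be illustrated with a toy typestate. Everything here is hypothetical (`ToyInitiator`, `ToyAwaitingPeer`, `ToyError`, and the arithmetic, which is not cryptography); the point is only that each step takes `self` by value, so the compiler rejects reuse of an earlier state.

```rust
struct ToyInitiator {
    secret: u64,
}

struct ToyAwaitingPeer {
    ephemeral: u64,
}

#[derive(Debug, PartialEq)]
struct ToyError;

impl ToyInitiator {
    fn new(secret: u64) -> Self {
        Self { secret }
    }

    /// Consumes `self`: the only way forward is the returned state,
    /// so the same ephemeral cannot be produced twice.
    fn produce_ephemeral(self) -> (ToyAwaitingPeer, u64) {
        // Toy derivation standing in for ephemeral sampling.
        let ephemeral = self.secret.wrapping_mul(3).wrapping_add(1);
        (ToyAwaitingPeer { ephemeral }, ephemeral)
    }
}

impl ToyAwaitingPeer {
    /// Consumes `self`: the key is returned only if the peer tag matches.
    /// A real implementation would compare tags in constant time; `==`
    /// is used here only to keep the sketch short.
    fn confirm(self, peer_tag: u64) -> Result<u64, ToyError> {
        if peer_tag == self.ephemeral ^ 0xA5 {
            Ok(self.ephemeral.rotate_left(7))
        } else {
            Err(ToyError)
        }
    }
}
```

After `let (waiting, r_a) = ToyInitiator::new(5).produce_ephemeral();`, a second call on the initiator does not compile because the value has moved, and no key exists until `waiting.confirm(..)` returns `Ok`.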

## v1.0.1 scope (shipped)

**Readiness-cleanup patch — the first post-1.0 publish.** v1.0.1 ships the
GO-WITH-FOLLOWUP cleanup from a release-readiness synthesis of the prior audits
([`docs/audits/2026-06-02-release-readiness-synthesis.md`](docs/audits/2026-06-02-release-readiness-synthesis.md)):
0 blockers, all non-blocking polish.

- **Functional fix (the one behavior change):** the `gmcrypto-c` C ABI
  `gmcrypto_version()` returned a hardcoded `"0.4.0"` regardless of the built
  version — it now reports the real `CARGO_PKG_VERSION` (so a C caller linking
  1.0.1 reads `"1.0.1"`). This is the single reason 1.0.1 is a crates.io release
  rather than a docs-only update.
- **Doc improvements:** raw-block "not a cipher mode" ECB warnings on
  `Sm4Cipher::{encrypt,decrypt}_block` and the corresponding block FFI; cbindgen
  header pointer/length preconditions; FFI notes on the fallible RNG path and the
  XTS `start_sector` range; pre-1.0-stability caveats on the `digest-traits` /
  `cipher-traits` trait impls; and `SECURITY.md` / `README.md` / `deny.toml`
  corrections.
- **CI-health fixes:** `sm4-xts` added to the MSRV / wasm32 / `cargo deny` passes;
  the dudect path-allowlist gained `gmcrypto-simd/src/**`; `cargo generate-lockfile`
  runs before `cargo deny`; a new `simd-x86` job (`cargo test -p gmcrypto-simd` on
  `ubuntu-latest`) that immediately **caught a real latent bug** — the x86-only SIMD
  test files lacked `#![allow(unsafe_code)]`, so they had never compiled under CI's
  `-D warnings` (fixed); and the `pull_request` `paths-ignore` was removed from
  `ci.yml` so docs-only PRs are no longer permanently blocked by branch-protection
  required checks.

**No API or ABI change; runtime crypto wire output is byte-identical to 1.0.0** —
`cargo-semver-checks` runs enforced as the patch-non-breaking gate. 6 merged PRs
(#87–#92). Consumers move 1.0.0 → 1.0.1 with a plain `cargo update`.

## v0.16 scope (shipped)

**C FFI for the SM4-XTS multi-sector helper.** v0.16 exposes the v0.15
`sm4::mode_xts::{encrypt_sectors, decrypt_sectors}` through the `gmcrypto-c` C
ABI (behind the existing forwarding `sm4-xts` feature): two new symbols
`gmcrypto_sm4_xts_encrypt_sectors` / `gmcrypto_sm4_xts_decrypt_sectors` that
transform a contiguous run of equal-size sectors **in place** (`buf: *mut u8` +
`buf_len`), deriving sector `i`'s tweak as little-endian-128(`start_sector + i`)
— `start_sector` is a `uint64_t` LBA. Unlike the single-shot XTS FFI (uniformly
out-of-place), these are **in-place** — mirroring the core's `&mut [u8]` API so
disk callers never double-allocate. Byte-identical to the core helper; single
`GMCRYPTO_ERR` with `buf` untouched on error; confidentiality only (no auth).
The deferred FFI half of v0.15, on the established core-in-vN / FFI-in-vN+1
cadence — every cipher mode is now FFI-complete. **No new dependency, no new
feature flag, no new `gmcrypto-core` API, no new dudect target.** Design
rationale: [`docs/v0.16-scope.md`](docs/v0.16-scope.md) (Q16.1–Q16.12).

## v0.20 scope (infra-assurance, not a crates.io release) — streaming-decryptor differential fuzzing + coverage

**Two new differential fuzz targets + `cargo fuzz coverage` + a codified v1.0
constant-time baseline.** `fuzz_sm4_cbc_streaming_decrypt` and
`fuzz_sm4_gcm_streaming_decrypt` feed the ciphertext to the **streaming**
decryptors (`Sm4CbcDecryptor` / `Sm4GcmDecryptor`) in **arbitrary chunk
boundaries** and assert the result is **byte-identical** to the single-shot
`mode_{cbc,gcm}::decrypt` oracle — a *differential* invariant (catches the CBC
buffer-back-by-one PKCS#7 boundary and the GCM commit-on-verify GHASH
accumulator), stronger than v0.14's no-panic property. The nightly fuzz sweep
grows to **18 targets** (initial sweep: zero crashes, zero divergences) and gains
a **non-gating `cargo fuzz coverage`** job that renders per-target `llvm-cov`
TOTALS over the committed seed corpus and uploads them (the report is the
deliverable, not a coverage-% gate). v0.20 also **codifies the settled v1.0
constant-time baseline** in [`SECURITY.md`](SECURITY.md): composite dudect
targets stay gated `|tau| < 0.20`; the two single-inversion micro-diagnostics
remain telemetry + a `|tau| ≥ 0.55` sentinel (the v0.19 falsification is the
evidence), with a *narrow* revisit door (a class-split-twin without the inversion
op, or offline/dedicated hardware — never PR-executing public self-hosted CI).
The theme was chosen after a Codex + Grok strategy discussion (one more assurance
cycle that feeds v1.0 readiness, over a third dudect cycle or new features). A
*repository / infra-assurance* milestone — only the workspace-excluded `fuzz/`
crate + `fuzz-nightly.yml` + docs change (workspace stays `0.16.0`; crates.io
skips `0.20.0` per the v0.14/v0.17/v0.18/v0.19 precedent). Design + result:
[`docs/v0.20-scope.md`](docs/v0.20-scope.md) (Q20.1–Q20.5). **Next: v0.21 = the
v1.0 readiness audit**, with v0.20's harnesses + coverage as input evidence.
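
The differential invariant can be sketched with a toy chained transform in place of the real decryptors. `ToyStream` and `toy_single_shot` are hypothetical stand-ins, not `gmcrypto-core` types; the structure (streaming path fed fuzzer-chosen chunk sizes, compared byte for byte against a single-shot oracle) is what the paragraph above describes.

```rust
/// Toy chained transform standing in for a streaming decryptor:
/// each output byte is the input byte XOR the previous input byte.
struct ToyStream {
    prev: u8,
}

impl ToyStream {
    fn new(iv: u8) -> Self {
        Self { prev: iv }
    }

    fn update(&mut self, chunk: &[u8], out: &mut Vec<u8>) {
        for &c in chunk {
            out.push(c ^ self.prev);
            self.prev = c;
        }
    }
}

/// Single-shot oracle for the same transform.
fn toy_single_shot(iv: u8, data: &[u8]) -> Vec<u8> {
    let mut prev = iv;
    data.iter()
        .map(|&c| {
            let p = c ^ prev;
            prev = c;
            p
        })
        .collect()
}

/// Feed `data` in chunks whose sizes come from `cuts` (as a fuzzer would
/// supply them) and compare the streamed output against the oracle.
fn streaming_matches_oracle(iv: u8, data: &[u8], cuts: &[u8]) -> bool {
    let mut stream = ToyStream::new(iv);
    let mut out = Vec::new();
    let mut rest = data;
    for &cut in cuts {
        let n = (cut as usize).min(rest.len());
        let (head, tail) = rest.split_at(n);
        stream.update(head, &mut out);
        rest = tail;
    }
    // Flush whatever the cuts did not cover.
    stream.update(rest, &mut out);
    out == toy_single_shot(iv, data)
}
```

In a real fuzz target the boolean would be an `assert!`, so any chunk pattern that makes the two paths diverge becomes a reproducible crash input.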

## v0.19 scope (infra-assurance, not a crates.io release) — relative gate tested and falsified

**Self-calibrating relative dudect gate — TESTED and FALSIFIED → honest fallback.**
v0.19 set out to re-promote the two direct-invert diagnostics
(`ct_fn_invert` / `ct_fp_invert`) off the v0.18 telemetry/sentinel posture by
adding two **fix-vs-fix noise-floor probes** (`noise_floor_fn_invert` /
`noise_floor_fp_invert` — each runs the same `Fn`/`Fp` inversion as its suspect
but feeds both dudect classes one identical input, so its `|tau|` is pure
measurement noise) and gating each target *relatively*:
`median(target) ≤ max(0.20, 4·median(probe))` — a threshold that adapts to the
runner's own noise floor.

The 100K calibration on `main` **falsified the matched-sensitivity premise**: the
probes stay uniformly quiet (~0.005) while the real class-split targets spike
intermittently into [0.26–0.32] (`ct_fp_invert` reached a **median of 0.2606** on
the `sm4-bitsliced-simd` leg, roughly 50× the probe median). The runner noise lives in the **two-input
class-split difference** (`z_small` vs `z_large`), *not* the operation duration a
same-input probe can observe — so the probe cannot track it and the relative
threshold just pins at the `0.20` the noise already breaks. Per the pre-committed
honest-fallback path, the relative gate is demoted to non-blocking telemetry, the
two targets revert to telemetry (PR) / gross-regression **sentinel @0.55**
(nightly), and the probes are **kept as telemetry** — they are the evidence that
the noise is class-split-specific, the input to a v0.20 **class-split-aware
"noise-twin"** reference. A **repository / infra-assurance** milestone — the only
crate change is the dev-only bench harness (published library byte-unchanged;
workspace stays `0.16.0`; crates.io skips `0.19.0` per the v0.14 / v0.17 / v0.18
precedent). Design + result:
[`docs/v0.19-scope.md`](docs/v0.19-scope.md) (Q19.1–Q19.7) +
[`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md) (v0.19
resolution).
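
A short sketch of the relative gate shows why it pinned at `0.20`. It implements `median(target) ≤ max(0.20, 4·median(probe))` as written above; the function names are illustrative, the slices are assumed non-empty, and any per-run values passed in are hypothetical apart from the `~0.005` probe level and the `0.2606` target median quoted in the text.

```rust
/// Median of a non-empty slice of |tau| values.
fn median(values: &[f64]) -> f64 {
    let mut v = values.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    let mid = v.len() / 2;
    if v.len() % 2 == 1 {
        v[mid]
    } else {
        (v[mid - 1] + v[mid]) / 2.0
    }
}

/// Threshold that adapts to the probe's noise floor, never below 0.20.
fn relative_threshold(probe_taus: &[f64]) -> f64 {
    (4.0 * median(probe_taus)).max(0.20)
}

/// The relative gate: median(target) <= max(0.20, 4 * median(probe)).
fn relative_gate_passes(target_taus: &[f64], probe_taus: &[f64]) -> bool {
    median(target_taus) <= relative_threshold(probe_taus)
}
```

With a probe median near `0.005`, `4 · 0.005 = 0.02`, so the threshold is just the `0.20` floor, and a target median of `0.2606` fails it. The probe would have to read above `0.05` before the adaptive term did anything at all.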

**Deferred to v0.20+**: a class-split-aware "noise-twin" dudect reference (the
v0.19 successor that could finally re-promote the invert diagnostics);
round-trip / differential + streaming-decryptor parser fuzzing; RustCrypto `aead`
trait fit (still `0.6.0-rc.10`); `cargo fuzz coverage`; AVX-512 `sbox_x64`; CCM
buffered input; a v1.0 readiness pass.

## v0.18 scope (shipped — infra-assurance, not a crates.io release)

**dudect-gate hardening.** v0.18 pins the dudect CI workflows' drift axes
(`ubuntu-24.04` OS-label + exact `dtolnay/rust-toolchain@1.95.0`) and gates on a
**CI-level multi-run median** `|tau|` (PR 3 runs / nightly 5 runs; the
`required_low` gates + the nightly gross-regression sentinel use the **median**,
`negative_control` uses the **min**, and any required target not measured on
every run fails). The bench harness `timing_leaks.rs` is **byte-unchanged** — the
loop and median live entirely in CI. A 100K×5 calibration measured the
`ct_fn_invert`/`ct_fp_invert` diagnostics back near their ~0.006 baseline, but
they were **kept on the telemetry / sentinel posture (not re-promoted)**: the
noise that demoted them is runner-image-sensitive and would re-flake a tight gate
if it returns — robustness over a tighter gate. A **repository / infra-assurance**
milestone — no crate code change (workspace stays `0.16.0`; crates.io skips
`0.18.0` per the v0.14 / v0.17 precedent). Design rationale:
[`docs/v0.18-scope.md`](docs/v0.18-scope.md) (Q18.1–Q18.7) +
[`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md) (v0.18
resolution).
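
The multi-run aggregation can be sketched as follows. This is a model of the rules stated above (median for required targets, min for `negative_control`, fail if any run is missing a measurement), not the CI script itself; the `Run` type, the function names, and the negative-control floor passed by the caller are assumptions.

```rust
use std::collections::BTreeMap;

/// One CI run: target name -> measured |tau|.
type Run = BTreeMap<&'static str, f64>;

/// Collect a target's |tau| from every run; `None` if any run lacks it.
fn collect(runs: &[Run], target: &str) -> Option<Vec<f64>> {
    runs.iter().map(|r| r.get(target).copied()).collect()
}

fn median(mut v: Vec<f64>) -> f64 {
    v.sort_by(|a, b| a.total_cmp(b));
    let mid = v.len() / 2;
    if v.len() % 2 == 1 { v[mid] } else { (v[mid - 1] + v[mid]) / 2.0 }
}

/// Required target: measured on every run and median |tau| below the gate.
fn required_low_passes(runs: &[Run], target: &str, gate: f64) -> bool {
    match collect(runs, target) {
        Some(v) if !v.is_empty() => median(v) < gate,
        _ => false,
    }
}

/// Negative control: even its quietest run must still show the leak.
/// `floor` is a caller-supplied, hypothetical detection threshold.
fn negative_control_passes(runs: &[Run], floor: f64) -> bool {
    match collect(runs, "negative_control") {
        Some(v) if !v.is_empty() => {
            v.iter().copied().fold(f64::INFINITY, f64::min) > floor
        }
        _ => false,
    }
}
```

Using the median means a single noisy run out of three cannot fail a required target on its own, while using the min for the control means a single run in which the leak went undetected does fail it.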

**Deferred to v0.19+** (per [`docs/v0.18-scope.md`](docs/v0.18-scope.md) §5/§6):
a self-calibrating relative dudect gate (the change that could safely re-promote
the invert diagnostics); round-trip / differential + streaming-decryptor parser
fuzzing; RustCrypto `aead` trait fit (still `0.6.0-rc.10`); `cargo fuzz coverage`;
AVX-512 `sbox_x64`; CCM buffered input; a v1.0 readiness pass.

## v0.15 scope (shipped)

**SM4-XTS multi-sector (disk) helper.** v0.15 adds
`sm4::mode_xts::{encrypt_sectors, decrypt_sectors}` (opt-in `sm4-xts` feature):
encrypt/decrypt a contiguous run of equal-size disk sectors **in place**
(`&mut [u8] -> Option<()>`), deriving sector `i`'s tweak as the **little-endian
128-bit** encoding of `start_sector + i` (the standard disk-XTS data-unit
convention). It owns the sector-number → tweak encoding the single-shot v0.12 API
left to the caller, and is byte-identical to looping that API per sector. Single
`None` failure mode (`buf` untouched on validation failure); confidentiality
only (no authentication). **Pure-core: no new dependency, no new feature flag, no
new SIMD, no new dudect target.** Design rationale:
[`docs/v0.15-scope.md`](docs/v0.15-scope.md) (Q15.1–Q15.12). The C FFI for the
sector helper **shipped in v0.16** (above), on the established core-in-vN /
FFI-in-vN+1 cadence.
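
The sector-number to tweak encoding is small enough to sketch directly. The functions below are illustrative, not the `mode_xts` API: `sector_tweak` encodes `start_sector + i` as a little-endian 128-bit value, and `sectors_with_tweaks` pairs each equal-size sector with its tweak. Widening to `u128` before adding and rejecting a zero or non-dividing sector size are assumptions about validation, not documented behaviour.

```rust
/// Tweak for sector `start_sector + i`: the sector number encoded as a
/// little-endian 128-bit integer.
fn sector_tweak(start_sector: u64, i: u64) -> [u8; 16] {
    // Widen first so the addition cannot overflow (an assumption about
    // how out-of-range sector numbers would be treated).
    let sector = start_sector as u128 + i as u128;
    sector.to_le_bytes()
}

/// Split a buffer into equal-size sectors, each paired with its tweak.
/// Returns `None` if the buffer is not a whole number of sectors.
fn sectors_with_tweaks(
    buf: &mut [u8],
    sector_size: usize,
    start_sector: u64,
) -> Option<Vec<([u8; 16], &mut [u8])>> {
    if sector_size == 0 || buf.len() % sector_size != 0 {
        return None;
    }
    Some(
        buf.chunks_exact_mut(sector_size)
            .enumerate()
            .map(|(i, sector)| (sector_tweak(start_sector, i as u64), sector))
            .collect(),
    )
}
```

A disk caller would then run the single-shot XTS transform over each `(tweak, sector)` pair in place, which is the per-sector loop the helper is described as being byte-identical to.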

crates.io goes **0.13.0 → 0.15.0**: `0.14.0` names the unpublished
parser-fuzzing assurance cycle (below) and is intentionally never published.

## v0.14 — parser fuzzing (assurance; not a crates.io release)

**Pre-v1.0 hardening.** v0.14 adds a `cargo-fuzz` (libFuzzer) harness over the
**entire untrusted-input decode/decrypt surface** of `gmcrypto-core` — 16
targets covering PEM, PKCS#8 (incl. PBES2 decrypt), SPKI, SEC1, the DER reader
primitives, SM2 DER + raw ciphertext, SM2 decrypt + signature-verify, and the
SM4-CBC/GCM/CCM/XTS decrypts — proving the failure-mode invariant on adversarial
bytes: **no panic, no unbounded allocation, no hang.** A capped nightly job
(`.github/workflows/fuzz-nightly.yml`) runs them on a schedule.

The initial sweep found **zero crashes** across all 16 targets, so v0.14 makes
**no code change to the published crates** and is **not cut as a crates.io
release** (publishing byte-identical crypto is release noise) — it lands as an
assurance/infra change. The fuzz crate lives in a workspace-excluded `fuzz/`
(nightly-only; never enters the published dependency graph). Design rationale:
[`docs/v0.14-scope.md`](docs/v0.14-scope.md). Run it yourself:
[`fuzz/README.md`](fuzz/README.md).

**Deferred to v0.15+** (per [`docs/v0.14-scope.md`](docs/v0.14-scope.md) §5/§6):
the SM4-XTS per-sector helper (**shipped in v0.15**, above); round-trip /
differential parser fuzzing, streaming-decryptor fuzzing, RustCrypto `aead`
trait fit (still `0.6.0-rc.10`), pinned dudect runner, `cargo fuzz coverage` in
CI, AVX-512 `sbox_x64`, a v1.0 readiness pass (now v0.16+).

## v0.13 scope (shipped)

**C ABI for SM4-XTS.** v0.13 exposes the v0.12 `sm4::mode_xts` core through the
`gmcrypto-c` C ABI (`gmcrypto_sm4_xts_encrypt` / `_decrypt`) behind a new
forwarding `sm4-xts` feature — the deferred FFI half of v0.12, on the
established core-then-FFI cadence (SM4-GCM/CCM core in v0.8 → FFI in v0.10).
Design rationale: [`docs/v0.13-scope.md`](docs/v0.13-scope.md).

- **Additive only — no public API breakage, no new dependency.** The default
  build of both crates is byte-unchanged; `sm4-xts` forwards to the pure-core
  `gmcrypto-core/sm4-xts`.
- Single-shot, mirroring the single-shot SM4-GCM FFI shape minus nonce/AAD/tag:
  32-byte key (`Key1 ‖ Key2`), 16-byte tweak, length-preserving output via the
  `(out, out_capacity, out_actual_len)` convention. Byte-identical to
  `gmcrypto_core::sm4::mode_xts`. New `GMCRYPTO_SM4_XTS_KEY_SIZE` header
  constant; single `GMCRYPTO_ERR` failure mode. **Confidentiality only.**
- Doc-only C example `crates/gmcrypto-c/examples/sm4_xts_sector.c`; 5 new
  `c_smoke` Rust-equivalence tests. No new `gmcrypto-core` API, no new dudect
  target (the FFI is a thin shim over the v0.12 core path).

**Followed by v0.14** (per [`docs/v0.13-scope.md`](docs/v0.13-scope.md) §5/§6):
parser fuzzing — the recommended pre-v1.0 assurance gate — landed as the v0.14
assurance cycle above. RustCrypto `aead` trait fit (still `0.6.0-rc.10`),
pinned/noise-isolated dudect runner, and AVX-512 `sbox_x64` remain deferred.

## v0.12 scope (shipped)

**SM4-XTS — tweakable mode for disk/sector encryption.** v0.12 adds
`sm4::mode_xts` behind the new opt-in `sm4-xts` feature: single-shot, full
ciphertext stealing, GB/T 17964-2021 (GM/T OID `1.2.156.10197.1.104.10`),
byte-identical to OpenSSL 3.x EVP `SM4-XTS` (`xts_standard=GB`). Design
rationale: [`docs/v0.12-scope.md`](docs/v0.12-scope.md); KAT sourcing:
[`docs/v0.12-xts-kat-sourcing.md`](docs/v0.12-xts-kat-sourcing.md).

- **Default-features users are unaffected** — additive, opt-in, **no new
  dependency** (the XTS tweak doubling is a trivial bit-reflected
  multiply-by-x, not GHASH, so no `gmcrypto-simd` dep).
- **GB/T 17964, not IEEE 1619** — the two standards differ in the GF(2¹²⁸)
  tweak-doubling convention (GB is the bit-reflected / GHASH-style one), so
  they produce different ciphertext for multi-block / non-aligned data. v0.12
  targets GB (the SM4 national standard + OpenSSL's default for SM4-XTS).
- **Confidentiality only — no authentication.** XTS has no tag; callers needing
  integrity use an AEAD mode (GCM/CCM). The per-data-unit tweak-uniqueness
  contract is the caller's responsibility.
- 32-byte key (`Key1 ‖ Key2`) + raw 16-byte tweak; lengths `[16 B, 16 MiB]`;
  single `None` failure mode. New dudect target `ct_sm4_xts_decrypt`. The whole-
  block bulk rides the `Sm4Cipher::encrypt_blocks` batch API (picks up the SIMD
  fanout under `sm4-bitsliced-simd`).
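
To illustrate why the two tweak-doubling conventions yield different ciphertext, here is a sketch of both multiply-by-x operations on a 16-byte tweak. `double_ieee` is the IEEE 1619 form (left shift, `0x87` reduction). `double_reflected` is the textbook GHASH-style form (right shift, `0xE1` reduction); treating it as the GB/T 17964 layout is an assumption based on the "bit-reflected / GHASH-style" description above, and the exact byte order the standard uses is not restated here. Both functions branch on the carry for readability and are not written to be constant-time.

```rust
/// IEEE 1619 convention: the tweak is a little-endian polynomial;
/// multiply by x is a left shift, folding 0x87 into byte 0 on carry.
fn double_ieee(t: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    let mut carry = 0u8;
    for i in 0..16 {
        out[i] = (t[i] << 1) | carry;
        carry = t[i] >> 7;
    }
    if carry == 1 {
        out[0] ^= 0x87;
    }
    out
}

/// GHASH-style (bit-reflected) convention: multiply by x is a right shift
/// across the bytes, folding 0xE1 into byte 0 on carry.
fn double_reflected(t: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    let mut carry = 0u8;
    for i in 0..16 {
        out[i] = (t[i] >> 1) | (carry << 7);
        carry = t[i] & 1;
    }
    if carry == 1 {
        out[0] ^= 0xE1;
    }
    out
}
```

Starting from the same encrypted tweak, the two functions already disagree after one doubling, so every block after the first is masked differently. That is consistent with the statement above that the standards diverge for multi-block and non-aligned data.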

**Deferred to v0.13** (per [`docs/v0.12-scope.md`](docs/v0.12-scope.md) §5/§6):
C FFI for SM4-XTS, RustCrypto `aead` trait fit, pinned/noise-isolated dudect
runner, AVX-512 `sbox_x64`, CCM incremental input.

## v0.11 scope (shipped)

**RustCrypto trait-fit modernization.** v0.11 migrates the opt-in
`digest-traits` / `cipher-traits` impls from `digest 0.10` / `cipher 0.4` to
`digest 0.11` / `cipher 0.5` (the `crypto-common 0.2` / `hybrid-array`
generation), in-place. Design rationale:
[`docs/v0.11-scope.md`](docs/v0.11-scope.md).

- **Default-features users are unaffected** — the trait fit is opt-in;
  `generic-array` / `hybrid-array` never enter the default dep graph, and every
  SM2 / SM3 / SM4 / HMAC / AEAD output is byte-identical (validated against the
  full KAT suite + gmssl 3.1.1 interop).
- **BREAKING for trait-fit consumers only:** code enabling
  `digest-traits` / `cipher-traits` must bump its own `digest` / `cipher` deps
  to `0.11` / `0.5`. HMAC construction via the `Mac` trait moves to
  `digest::KeyInit::new_from_slice` (`digest 0.11`'s `Mac` dropped `KeyInit`);
  the `cipher` block traits renamed `BlockEncrypt` / `BlockDecrypt` →
  `BlockCipherEncrypt` / `BlockCipherDecrypt`.
- **MSRV stays 1.85.** The RustCrypto `aead 0.6` trait fit remains deferred
  (upstream is still at `0.6.0-rc.10`); v0.11 lands the `crypto-common 0.2`
  line it will need.

**Deferred to v0.12** (per [`docs/v0.11-scope.md`](docs/v0.11-scope.md) §5/§6):
RustCrypto `aead` trait fit, pinned/noise-isolated dudect runner, AVX-512
`sbox_x64`, SM4-XTS, CCM incremental input, Argon2-with-SM3.

## v0.10 scope (shipped)

**Streaming AEAD FFI.** v0.10 exposes the v0.9 incremental-input buffered
SM4-GCM encryptor/decryptor through the `gmcrypto-c` C ABI — the item
v0.9 deferred (Q9.6) now that the Rust streaming API is proven. Additive
behind the existing `sm4-aead` feature. Design rationale:
[`docs/v0.10-scope.md`](docs/v0.10-scope.md).

- **9 streaming AEAD C FFI symbols + 2 opaque handle types** —
  `gmcrypto_sm4_gcm_encryptor_t` (output-streaming: `new` / `update` →
  ciphertext per chunk / `finalize` + `finalize_with_tag_len` → tag /
  `free`) and `gmcrypto_sm4_gcm_decryptor_t` (commit-on-verify: `new` /
  `update` buffers and emits **nothing** / `finalize_verify` releases
  plaintext only after the constant-time tag check / `free`).
  `_finalize*` consume+free the handle; single `GMCRYPTO_ERR` on every
  failure (no tag-/length-oracle across the boundary). Mirrors the v0.5
  CBC-streaming lifecycle. C example:
  [`examples/sm4_gcm_streaming.c`](crates/gmcrypto-c/examples/sm4_gcm_streaming.c).
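
The commit-on-verify lifecycle described above can be sketched in Rust with toy primitives. Everything here is hypothetical and for illustration only: `ToyDecryptor` and `toy_tag` are not part of `gmcrypto-core` or `gmcrypto-c`, the "cipher" is a one-byte XOR, and the "tag" is a simple keyed checksum with no security value. Only the buffering and release discipline is the point.

```rust
/// Toy stand-in for an authenticated decryptor. NOT real cryptography.
struct ToyDecryptor {
    key: u8,
    buffered: Vec<u8>, // ciphertext held back until the tag verifies
}

impl ToyDecryptor {
    fn new(key: u8) -> Self {
        Self { key, buffered: Vec::new() }
    }

    /// Accepts a ciphertext chunk and emits nothing.
    fn update(&mut self, chunk: &[u8]) {
        self.buffered.extend_from_slice(chunk);
    }

    /// Consumes the decryptor; plaintext is released only if the tag matches.
    fn finalize_verify(self, tag: &[u8; 4]) -> Option<Vec<u8>> {
        let expected = toy_tag(self.key, &self.buffered);
        // Compare without an early exit: OR together every byte difference.
        let mut diff = 0u8;
        for (a, b) in expected.iter().zip(tag.iter()) {
            diff |= a ^ b;
        }
        if diff != 0 {
            return None; // single failure mode, no partial plaintext
        }
        Some(self.buffered.iter().map(|c| c ^ self.key).collect())
    }
}

/// Toy keyed checksum over the ciphertext (placeholder for a real MAC).
fn toy_tag(key: u8, ct: &[u8]) -> [u8; 4] {
    ct.iter()
        .fold(u32::from(key), |acc, &b| {
            acc.wrapping_mul(31).wrapping_add(u32::from(b))
        })
        .to_be_bytes()
}
```

Taking `self` by value in `finalize_verify` plays the role of the C ABI's consume-and-free rule: once finalized, the handle cannot be used again.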

**No public API breakage — purely additive.** v0.9.0 callers can
`cargo update` to v0.10.0 without migration. No new `gmcrypto-core` API;
no new dudect target (the FFI is a thin wrapper over the v0.9
`ct_sm4_gcm_decrypt_buffered`-gated path).

**Deferred to v0.11** (per [`docs/v0.10-scope.md`](docs/v0.10-scope.md)
§5/§6): streaming/incremental CCM, RustCrypto `aead` trait fit (upstream
still `0.6.0-rc.10`), pinned dudect runner, AVX-512 `sbox_x64`, SM4-XTS,
Argon2-with-SM3.

## v0.9 scope (shipped)

**AEAD ergonomics.** v0.9 extends the v0.8 AEAD core with the three
items v0.8 deferred: GCM tag-length parameterization, incremental-input
buffered SM4-GCM, and single-shot AEAD C FFI. All additive behind the
existing `sm4-aead` flag. Design rationale: [`docs/v0.9-scope.md`](docs/v0.9-scope.md).

- **`sm4::GcmTagLen` + `mode_gcm::encrypt_with_tag_len` /
  `decrypt_with_tag_len`** — W1. Caller-chosen GCM tag length per NIST
  SP 800-38D §5.2.1.2 (`{4, 8, 12, 13, 14, 15, 16}` bytes; truncated
  tag = `MSB_t(full_tag)`). `GcmTagLen::new(usize) -> Option<Self>`
  centralizes the valid-length policy. The fixed-16-byte `encrypt` /
  `decrypt` are unchanged.
- **`sm4::Sm4GcmEncryptor` / `Sm4GcmDecryptor`** — W2. Incremental-
  input buffered SM4-GCM (deliberately NOT "streaming"). The
  **encryptor** is output-streaming: `update(chunk) -> Option<Vec<u8>>`
  emits each chunk's ciphertext (`None` once the cumulative plaintext
  would exceed the NIST §5.2.1.1 ceiling `2^36 − 32` bytes);
  `finalize()` / `finalize_with_tag_len()` emit the tag. The
  **decryptor** is input-incremental but output-BUFFERED:
  `update(chunk)` buffers ciphertext + folds GHASH, and
  `finalize_verify(tag) -> Option<Vec<u8>>` releases the plaintext only
  after the constant-time tag check (commit-on-verify — never leaks
  pre-verify bytes). AAD is supplied at construction. Driven with any
  chunking, both reproduce the single-shot path byte-for-byte.
- **6 single-shot AEAD C FFI entry points** (W4).
  `gmcrypto_sm4_gcm_encrypt` / `_decrypt` / `_encrypt_with_tag_len` /
  `_decrypt_with_tag_len` + `gmcrypto_sm4_ccm_encrypt` / `_decrypt`, behind
  a new forwarding `sm4-aead` feature on `gmcrypto-c`. Every error path
  returns `GMCRYPTO_ERR` (single failure code). Streaming AEAD FFI is
  deferred to v0.10.
- **New dudect target `ct_sm4_gcm_decrypt_buffered`** — W3. Class-split
  by master key, drives `Sm4GcmDecryptor`; `|tau| < 0.20` (5K-sample
  smoke `|tau| ≈ 0.029`). No new CI matrix slot: it rides the existing
  `sm4-aead` entries.
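
The tag-length policy and the plaintext ceiling above reduce to a few lines of arithmetic. The sketch below is a hypothetical mirror, not the crate's source: `TagLen` and `GCM_MAX_PLAINTEXT_BYTES` are illustrative names, and the real `GcmTagLen` internals may differ.

```rust
/// SP 800-38D plaintext ceiling: (2^32 - 2) blocks of 16 bytes = 2^36 - 32.
const GCM_MAX_PLAINTEXT_BYTES: u64 = (1u64 << 36) - 32;

/// Hypothetical validated tag-length newtype.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TagLen(usize);

impl TagLen {
    /// Accepts only the byte lengths SP 800-38D allows for GCM tags.
    fn new(len: usize) -> Option<Self> {
        match len {
            4 | 8 | 12..=16 => Some(Self(len)),
            _ => None,
        }
    }

    /// MSB_t(full_tag): keep the leading `len` bytes of the 16-byte tag.
    fn truncate<'a>(self, full_tag: &'a [u8; 16]) -> &'a [u8] {
        &full_tag[..self.0]
    }
}
```

Centralizing validation in the constructor means every later use of a `TagLen` value can slice the full tag without a further bounds check.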

**No public API breakage — purely additive.** v0.8.0 callers can
`cargo update` to v0.9.0 without migration; `sm4-aead` is opt-in.

**Deferred to v0.10** (per [`docs/v0.9-scope.md`](docs/v0.9-scope.md)
§5/§6): CCM incremental input, streaming AEAD FFI, RustCrypto `aead`
trait fit (upstream still on `0.6.0-rc`), pinned dudect runner,
AVX-512 `sbox_x64`, SM4-XTS, Argon2-with-SM3.

## v0.8 scope (shipped)

**AEAD core.** v0.8 builds on the cipher-mode surface that v0.7 opened
up: SM4-GCM and SM4-CCM single-shot, plus a constant-time GHASH
primitive in `gmcrypto-simd`.

- **`sm4::mode_gcm::encrypt` / `decrypt`** — W2. Single-shot SM4-GCM
  per NIST SP 800-38D / GM/T 0009 / RFC 8998. `encrypt(key, nonce,
  aad, pt) -> (Vec<u8>, [u8; 16])` returns `(ciphertext, tag)`.
  `decrypt(key, nonce, aad, ct, tag) -> Option<Vec<u8>>` —
  `Some(plaintext)` only when the tag verifies (constant-time
  compare via `subtle::ConstantTimeEq`). Both 12-byte canonical
  and arbitrary-length nonce paths supported. Tag length fixed at
  128 bits in v0.8 (parameterized in v0.9 via `GcmTagLen`).
  **Byte-identical to gmssl 3.1.1 `sm4 -gcm`** — bidirectional
  interop validated.
- **`sm4::mode_ccm::encrypt` / `decrypt`** — W3. Single-shot SM4-CCM
  per NIST SP 800-38C / RFC 3610 / GM/T 0009 (OID
  `1.2.156.10197.1.104.9`). `encrypt(key, nonce, aad, pt, tag_len)
  -> Option<Vec<u8>>` (output: `ciphertext ‖ tag`). `tag_len ∈
  {4, 6, 8, 10, 12, 14, 16}` per spec, validated at API entry.
  `nonce.len() ∈ [7, 13]`. Pure-Rust CBC-MAC + CTR over the
  existing `Sm4Cipher` path — no GHASH. **Byte-identical to OpenSSL
  3.x EVP `SM4-CCM`** across 8 KAT scenarios (gmssl 3.1.1 doesn't
  ship `sm4 -ccm` so the CCM reference oracle comes from OpenSSL;
  see [`docs/v0.8-ccm-kat-sourcing.md`](docs/v0.8-ccm-kat-sourcing.md)).
- **`gmcrypto_simd::ghash::ghash_mul(h, x) -> [u8; 16]`** — W1.
  Constant-time GHASH multiplication over `GF(2^128) /
  (x^128 + x^7 + x^2 + x + 1)`. Single dispatch entry point:
  - `ghash_mul_clmul` on `x86_64` (PCLMULQDQ + SSE2; runtime
    cpufeatures detect; Intel Westmere+ / AMD Bulldozer+).
  - `ghash_mul_pmull` on `aarch64` (ARMv8.0 AES extension
    `vmull_p64`; runtime cpufeatures detect; Apple Silicon /
    most modern ARM chips).
  - `ghash_mul_software` (bit-serial mask-XOR; constant-time over
    both inputs; available everywhere as fallback).
- **New `sm4-aead` feature flag** — default-off; opt-in.
  `sm4-aead = ["dep:gmcrypto-simd"]` activates `mode_gcm` and
  `mode_ccm`. Additive on the default-features build.
- **New dudect targets `ct_sm4_gcm_decrypt` + `ct_sm4_ccm_decrypt`**
  — W4. Class-split by master key over a fixed 256-byte
  plaintext + 16-byte AAD. Both classes' `(ct, tag)` pairs are
  valid encrypts under their **own** keys, so both decrypt paths
  reach the tag-compare via identical control flow. Same
  `|tau| < 0.20` gate as the rest of the SM4 surface; new CI
  matrix slot `sm4-bitsliced-simd,sm4-aead` exercises the
  most-demanding cipher-stack combination.
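
For readers unfamiliar with the bit-serial mask-XOR fallback, a minimal sketch follows. It implements the GF(2^128) multiply from NIST SP 800-38D (Algorithm 1) with masks in place of data-dependent branches. The name `ghash_mul_sketch` is hypothetical and this is not the source of `ghash_mul_software`; whether the compiled output is actually constant-time depends on the compiler and target.

```rust
/// Bit-serial GF(2^128) multiply in GCM's bit order, branch-free on the data.
fn ghash_mul_sketch(h: [u8; 16], x: [u8; 16]) -> [u8; 16] {
    // R encodes x^7 + x^2 + x + 1 in GCM's reflected bit order.
    const R: u128 = 0xE1u128 << 120;
    let x = u128::from_be_bytes(x);
    let mut v = u128::from_be_bytes(h);
    let mut z = 0u128;
    for i in 0..128 {
        // Bit i of x, counting from the most significant bit.
        let bit = (x >> (127 - i)) & 1;
        // Conditionally accumulate v using an all-zeros / all-ones mask.
        z ^= v & 0u128.wrapping_sub(bit);
        // Multiply v by the field element x: shift right, then reduce
        // if a bit fell off the low end.
        let lsb = v & 1;
        v = (v >> 1) ^ (R & 0u128.wrapping_sub(lsb));
    }
    z.to_be_bytes()
}
```

In this representation the multiplicative identity is the block `80 00 … 00`, so multiplying any `h` by it returns `h` unchanged, which makes a convenient sanity check.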

**No public API breakage — purely additive.** v0.7.0 callers can
`cargo update` to v0.8.0 without migration; `sm4-aead` is opt-in.

Everything v0.4 shipped (`wasm32-unknown-unknown` build, RustCrypto
trait fit behind `digest-traits` / `cipher-traits`, bitsliced SM4
S-box behind `sm4-bitsliced`, `gmcrypto-c` C ABI crate) is unchanged
— see the Roadmap row for the compact reference and `CHANGELOG.md`
`[0.4.0]` for detail.

Everything v0.3 shipped is unchanged:

- Reusable strict-canonical DER reader / writer subset
  (`gmcrypto_core::asn1::{reader, writer, oid}`).
- PEM + encrypted PKCS#8 + X.509 SPKI + SEC1 codecs
  (`gmcrypto_core::{pem, pkcs8, spki, sec1}`).
- Full bidirectional gmssl 3.1.1 interop (SM2 sign / verify, SM2
  encrypt / decrypt, SM4-CBC). Gated on `GMCRYPTO_GMSSL=1`.
- Raw byte-concat SM2 ciphertext helpers
  (`gmcrypto_core::sm2::raw_ciphertext`): `C1 || C3 || C2`
  emit + decode; legacy `C1 || C2 || C3` decrypt-only.
- Streaming `HmacSm3` + `Sm4Cbc{En,De}cryptor`. In-crate
  `Hash` / `Mac` / `BlockCipher` traits (`gmcrypto_core::traits`).
- Comb-table `mul_g` (~5× sign-side speedup). 64 sub-tables of 16
  entries each, lazily built once per process via `spin::Once`.
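
The linear-scan lookup that a 16-entry sub-table relies on can be sketched with plain bit masks. This is a simplified illustration under stated assumptions: `select_entry` is a hypothetical name, entries are modelled as 32-byte arrays rather than curve points, and hand-rolled masks stand in for the `subtle` crate's conditional selection.

```rust
/// Returns `table[index]` by reading every entry, so the memory access
/// pattern does not depend on `index`. An out-of-range index yields zeros.
fn select_entry(table: &[[u8; 32]; 16], index: u8) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, entry) in table.iter().enumerate() {
        let diff = (i as u8) ^ index;
        // nonzero = 1 when diff != 0, else 0, computed without a branch.
        let nonzero = (diff | diff.wrapping_neg()) >> 7;
        // mask = 0xFF for the matching entry, 0x00 for all others.
        let mask = nonzero.wrapping_sub(1);
        for (o, e) in out.iter_mut().zip(entry.iter()) {
            *o |= e & mask;
        }
    }
    out
}
```

The cost is 16 reads per lookup instead of one, which is the usual price for hiding which entry a secret index selected.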

Everything v0.2 shipped is unchanged:

- SM3 hash function (`#![no_std]` + `alloc`).
- SM2 sign / verify with custom signer ID (default `1234567812345678` per GM/T 0009).
- SM2 public-key encrypt / decrypt with GM/T 0009-2012 ciphertext DER
  (`SEQUENCE { x, y, hash, ciphertext }`). Invalid-curve attack defense
  via on-curve check on `C1` before scalar mult; non-branching
  KDF-zero detection so a chosen-ciphertext attacker cannot distinguish
  it from a normal MAC failure.
- SM4 block cipher (GB/T 32907-2016) and SM4-CBC (PKCS#7 padding,
  caller-supplied unpredictable IV per NIST SP 800-38A Appendix C).
  Constant-time-designed `subtle` linear-scan S-box (~1-2M blocks/s);
  opt-in bitsliced (table-less, gate-only) S-box via the
  `sm4-bitsliced` feature (v0.4 W3). PKCS#7 strip uses a
  constant-time scan over the final block; `decrypt` collapses every
  failure mode to a single `None` against padding-oracle attacks.
- HMAC-SM3 per RFC 2104, gmssl-cross-validated KAT vectors. Hash-first
  long-key path. v0.3 adds the streaming `HmacSm3` shape alongside
  single-shot `hmac_sm3`.
- PBKDF2-HMAC-SM3 per RFC 8018 §5.2. Caller-supplied output buffer
  (no internal allocation, no iteration-count default).
- Constant-time-designed `Fp` and `Fn` field arithmetic via
  `crypto-bigint = 0.7.3`.
- Renes-Costello-Batina complete addition formulas for the SM2 curve (a=-3 specialized).
- Fixed-base (v0.3 comb-table) and variable-base scalar multiplication,
  both constant-time-designed with `subtle::ConditionallySelectable`
  linear-scan table lookup.
- Fixed-K masked-select signing retry: the retry loop runs `K=2` iterations
  unconditionally, regardless of which iteration produced a valid signature.
  The constant-time contract holds for any RNG that respects `CryptoRng`;
  pathological RNGs cannot leak the secret via observable retry count.
- Strict canonical ASN.1 DER for `SEQUENCE { r, s }` (signatures), the
  GM/T 0009 SM2 ciphertext SEQUENCE, and all v0.3 PEM / PKCS#8 / SPKI
  / SEC1 wire formats. Rejects non-canonical leading-zero padding,
  sign-bit-set first bytes, empty content, and (for ciphertext
  coordinates) values `≥ p`.
- KAT vectors from GB/T 32905-2016 (SM3), GB/T 32918.2-2016 / .5-2017
  (SM2), GB/T 32907-2016 Appendix A.1 (SM4 single-block + 1M-round),
  GM/T 0042-2015 (HMAC-SM3), GM/T 0091-2020 (PBKDF2-HMAC-SM3).
- `gmssl` CLI cross-validation for HMAC-SM3, PBKDF2-HMAC-SM3, and
  (new in v0.3) SM2 sign/verify, SM2 encrypt/decrypt, and SM4-CBC
  in both directions. Gated on `GMCRYPTO_GMSSL=1`.
- `dudect-bencher` harness — 19 real `ct_*` targets (12 always-on + 2
  cfg-gated under `sm4-bitsliced-simd` + 3 cfg-gated under `sm4-aead` + 1
  cfg-gated under `sm4-xts` + 1 cfg-gated under `sm2-key-exchange`) plus a
  deliberately-leaky `negative_control`
  that proves the harness can detect leaks. Matrix-run under
  `features=default`, `sm4-bitsliced`, `sm4-bitsliced-simd`, and
  `sm4-bitsliced-simd,sm4-aead,sm4-xts,sm2-key-exchange`
  — PR-smoke 10⁴ samples; nightly 10⁵ samples (more samples = tighter
  empirical confidence at the same threshold). Most real targets gate
  at `|tau| < 0.20`; per-target policy in [`SECURITY.md`](SECURITY.md).
- Failure-mode invariant: every `Result`-returning public API uses
  the workspace-wide `gmcrypto_core::Error` (single `Failed` variant,
  `#[non_exhaustive]`); per-module aliases `sm2::Error`, `pem::Error`,
  `pkcs8::Error` all point at the same type. `verify_with_id` returns
  `bool`; DER decode returns `Option`. Defense against padding-oracle,
  malleability, and invalid-curve attacks.
- Zeroization on private keys, SM4 round keys, HMAC `K'` /
  `K' XOR ipad` / `K' XOR opad`, PBKDF2 intermediates, SM2 KDF
  buffers, and PKCS#8 inner-key scratch.
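
The constant-time PKCS#7 strip mentioned in the SM4-CBC item can be sketched as follows. This is a minimal illustration, not the crate's implementation: `pkcs7_unpadded_len` and `lt_mask` are hypothetical names, and the sketch only shows the idea of scanning the whole final block and collapsing every defect into one `None`.

```rust
const BLOCK: usize = 16;

/// 0xFF when `a < b`, 0x00 otherwise, computed without a branch.
fn lt_mask(a: u8, b: u8) -> u8 {
    (u16::from(a).wrapping_sub(u16::from(b)) >> 8) as u8
}

/// Returns the unpadded length, or `None` if the padding is malformed.
/// Every byte of the final block is inspected whatever the padding value.
fn pkcs7_unpadded_len(buf: &[u8]) -> Option<usize> {
    // The buffer length is public, so branching on it leaks nothing secret.
    if buf.is_empty() || buf.len() % BLOCK != 0 {
        return None;
    }
    let last = &buf[buf.len() - BLOCK..];
    let pad = last[BLOCK - 1];
    // `bad` becomes nonzero if pad == 0 or pad > 16.
    let mut bad = lt_mask(pad, 1) | lt_mask(BLOCK as u8, pad);
    for (i, &b) in last.iter().enumerate() {
        // Distance from the end of the block, in 1..=16.
        let pos = (BLOCK - i) as u8;
        // 0xFF when this byte lies inside the claimed padding (pos <= pad).
        let in_pad = !lt_mask(pad, pos);
        bad |= in_pad & (b ^ pad);
    }
    if bad != 0 {
        return None; // one failure mode for every kind of defect
    }
    Some(buf.len() - usize::from(pad))
}
```

The final branch on `bad` reveals only whether the padding was valid, which the caller learns anyway from the `Option`.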

## Roadmap

| Version | Scope |
|---|---|
| v0.2 (shipped) | SM4 + SM4-CBC, HMAC-SM3, PBKDF2-HMAC-SM3, SM2 encrypt/decrypt + GM/T 0009 ciphertext DER, dudect harness expansion to 11 targets. See [`CHANGELOG.md`](CHANGELOG.md) `[0.2.0]`. |
| v0.3 (shipped) | Reusable ASN.1 reader/writer subset; PEM, encrypted PKCS#8, X.509 SPKI, SEC1; full bidirectional gmssl interop (incl. SM2 sign/verify + SM2 encrypt/decrypt with PEM-wrapped keys + SM4-CBC); raw byte-concat ciphertext helpers (`C1\|\|C3\|\|C2` modern + legacy `C1\|\|C2\|\|C3` decrypt); streaming `HmacSm3` / `Sm4CbcEncryptor` / `Sm4CbcDecryptor` + in-crate `Hash`/`Mac`/`BlockCipher` traits; comb-table `mul_g` (~5× sign-side speedup); dudect harness expanded to 12 targets. See [`CHANGELOG.md`](CHANGELOG.md) `[0.3.0]`. |
| v0.4 (shipped) | `wasm32-unknown-unknown` build target; RustCrypto-trait fit (`digest::Digest` / `digest::Mac` / `cipher::BlockEncrypt`/`BlockDecrypt`) behind opt-in `digest-traits` / `cipher-traits` feature flags; bitsliced (table-less, gate-only) SM4 S-box behind the opt-in `sm4-bitsliced` feature; new `gmcrypto-c` workspace member exposing the SM2/SM3/SM4/HMAC/PBKDF2 surface as a C ABI (cdylib + staticlib + cbindgen-generated header). See [`CHANGELOG.md`](CHANGELOG.md) `[0.4.0]`. |
| v0.5.0 (shipped) | C-ABI completeness (streaming CBC + raw-byte SM2 ciphertext + caller-supplied RNG callback); `sm4-bitsliced-simd` feature-flag scaffolding — v0.5.0 ships no SIMD fast path (the feature transparently delegates to the v0.4 single-block bitslice); BREAKING ergonomic cleanup — workspace-wide `gmcrypto_core::Error`, `Sm2PrivateKey::new(U256)` → `from_scalar(U256)` (gated behind `crypto-bigint-scalar`) + always-on `from_bytes_be(&[u8; 32])` constructor, `std` feature removed. See [`CHANGELOG.md`](CHANGELOG.md) `[0.5.0]`. |
| v0.5.1 (shipped) | W4 phase 2 — new sibling crate `gmcrypto-simd` carrying an **AVX2 8-way packed bitsliced SM4 S-box** behind opt-in `sm4-bitsliced-simd`, with runtime CPU detection (`cpufeatures`) and silent scalar fallback on non-AVX2 hosts. v0.5.1's `tau` dispatch fed the AVX2 path with 7 wasted lanes; production throughput matched v0.4 single-block bitslice. Dudect calibration update — `ct_fn_invert` / `ct_fp_invert` moved to PR-smoke telemetry + 100K nightly gross-regression sentinel after a GH Actions `ubuntu-24.04` runner-image shift on 2026-05-12 raised the empirical noise floor; see `docs/v0.5-dudect-recalibration.md`. See [`CHANGELOG.md`](CHANGELOG.md) `[0.5.1]`. |
| v0.6.0 (shipped) | **W4 milestone close-out — the throughput-win release.** W4 phase 3: NEON 4-way bitsliced SM4 on `aarch64` (compile-time baseline) + AVX2 32-byte full-width packed S-box (`sbox_x32`) + `Sm4CbcDecryptor::process_chunk` SIMD fanout. Per round of the SM4 decrypt, batched blocks' `tau` inputs pack into one SIMD register (32 bytes on x86_64 / 8-block batch, 16 bytes on aarch64 / 4-block batch) — 32× fewer SIMD dispatches per 8-block batch than v0.5.1. CBC encryption stays single-block (chain-of-blocks defeats SIMD packing). New dudect target `ct_sm4_cbc_decrypt_fanout` (Q6.7) gates the fanout path at `\|tau\| < 0.20`. Exhaustive lane-position-shifted SIMD tests (8192 + 4096 cases) per Q6.8. **No public API changes; no breaking changes — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.6.0]` and `docs/v0.6-scope.md`. |
| v0.7.0 (shipped) | **Cipher-mode surface expansion.** First version where v0.6's SIMD machinery is callable from user code outside the CBC-decrypt internal path. New: public length-flexible `Sm4Cipher::encrypt_blocks` / `decrypt_blocks` (W1; Q7.7); single-shot `sm4::mode_ctr::encrypt` / `decrypt` (W2; GM/T 0002-2012 §5.4); streaming `sm4::ctr_streaming::Sm4CtrCipher` (W3); new dudect target `ct_sm4_ctr_encrypt` (gates `\|tau\| < 0.20` on every cipher path). Plus the v0.8 AEAD scope doc (`docs/v0.7-aead-scope.md`, Q8.1–Q8.8 sign-off + v0.9 candidate Q-list). **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.7.0]`. |
| v0.8.0 (shipped) | **AEAD core — SM4-GCM + SM4-CCM.** Per `docs/v0.7-aead-scope.md` Q8.1–Q8.8. New: `gmcrypto_simd::ghash::ghash_mul` constant-time GHASH primitive (CLMUL on `x86_64` / PMULL on `aarch64` / software Karatsuba fallback; W1); `sm4::mode_gcm::encrypt` / `decrypt` byte-identical to gmssl 3.1.1 `sm4 -gcm` with bidirectional interop (W2); `sm4::mode_ccm::encrypt` / `decrypt` byte-identical to OpenSSL 3.x EVP `SM4-CCM` across 8 KAT scenarios (W3; gmssl 3.1.1 lacks `sm4 -ccm` so OpenSSL is the oracle — see `docs/v0.8-ccm-kat-sourcing.md`); new dudect targets `ct_sm4_gcm_decrypt` + `ct_sm4_ccm_decrypt` + new CI matrix slot `sm4-bitsliced-simd,sm4-aead` (W4). Behind opt-in `sm4-aead` feature flag (additive; default-off). **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.8.0]`. |
| v0.9.0 (shipped) | **AEAD ergonomics.** Per `docs/v0.9-scope.md` Q9.1–Q9.10. New: `sm4::GcmTagLen` + `mode_gcm::encrypt_with_tag_len` / `decrypt_with_tag_len` (NIST SP 800-38D §5.2.1.2 truncated tags; W1); incremental-input buffered `sm4::Sm4GcmEncryptor` (output-streaming) / `Sm4GcmDecryptor` (output-buffered, commit-on-verify) — differential-KAT-equal to single-shot across arbitrary chunking (W2); new dudect target `ct_sm4_gcm_decrypt_buffered` (W3); 6 single-shot AEAD C FFI symbols (`gmcrypto_sm4_gcm_*` / `gmcrypto_sm4_ccm_*`) behind a forwarding `sm4-aead` feature on `gmcrypto-c` (W4). Behind the existing `sm4-aead` flag. **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.9.0]`. |
| v0.10.0 (shipped) | **Streaming AEAD FFI — SM4-GCM.** Per `docs/v0.10-scope.md` Q10.1–Q10.11. New: 9 `gmcrypto-c` FFI symbols + 2 opaque handle types exposing the v0.9 incremental-input buffered SM4-GCM encryptor (output-streaming) / decryptor (commit-on-verify) to C/C++/Go/Zig/Python — `gmcrypto_sm4_gcm_encryptor_{new,update,finalize,finalize_with_tag_len,free}` + `gmcrypto_sm4_gcm_decryptor_{new,update,finalize_verify,free}`, behind the existing `sm4-aead` feature on `gmcrypto-c`; `_finalize*` consume+free, single `GMCRYPTO_ERR`; C example `examples/sm4_gcm_streaming.c`. `regen-header` now implies `sm4-aead` (cbindgen drops cfg-gated opaque structs otherwise). No new `gmcrypto-core` API; no new dudect target. **No public API breakage — additive only.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.10.0]`. |
| v0.11.0 (shipped) | **RustCrypto trait-fit modernization.** Per `docs/v0.11-scope.md` Q11.1–Q11.11. Migrates the opt-in `digest-traits` / `cipher-traits` impls from `digest 0.10` / `cipher 0.4` to `digest 0.11` / `cipher 0.5` (the `crypto-common 0.2` / `hybrid-array` generation), in-place: `cipher` block backend reshaped to cipher 0.5's separate `BlockCipherEncBackend` / `BlockCipherDecBackend`; HMAC construction via `digest::KeyInit::new_from_slice` (`digest 0.11` `Mac` dropped `KeyInit`). **BREAKING for trait-fit consumers only** (bump your own `digest`/`cipher`); default-features users unaffected, output byte-identical (full KAT + gmssl interop). MSRV stays 1.85; no new dudect target. See [`CHANGELOG.md`](CHANGELOG.md) `[0.11.0]`. |
| v0.12.0 (shipped) | **SM4-XTS — tweakable disk/sector mode.** Per `docs/v0.12-scope.md` Q12.1–Q12.13. New: `sm4::mode_xts::{encrypt, decrypt}` + `XTS_KEY_SIZE` behind the opt-in `sm4-xts` feature — GB/T 17964-2021 (GM-T OID `1.2.156.10197.1.104.10`), full ciphertext stealing, byte-identical to OpenSSL 3.x EVP `SM4-XTS` (`xts_standard=GB`; **not** IEEE 1619 — they differ in the GF(2¹²⁸) tweak doubling). 32-byte key (`Key1 ‖ Key2`) + raw 16-byte tweak, lengths `[16 B, 16 MiB]`, single `None` failure mode, confidentiality-only (no auth). Pure-core (**no new dependency**); rides the `Sm4Cipher::encrypt_blocks` batch API + SIMD fanout. New dudect target `ct_sm4_xts_decrypt`. Also fixes a latent CI bug where the feature-conditional dudect gates never fired. C FFI deferred to v0.13. **Additive — no public API breakage.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.12.0]`. |
| v0.13.0 (shipped) | **C ABI for SM4-XTS.** Per `docs/v0.13-scope.md` Q13.1–Q13.12. New: `gmcrypto_sm4_xts_encrypt` / `_decrypt` + `GMCRYPTO_SM4_XTS_KEY_SIZE` in `gmcrypto-c`, behind a forwarding `sm4-xts` feature — single-shot, mirroring the single-shot SM4-GCM FFI shape minus nonce/AAD/tag (32-byte key, 16-byte tweak, length-preserving `(out, out_capacity, out_actual_len)` output), byte-identical to `gmcrypto_core::sm4::mode_xts`, single `GMCRYPTO_ERR`, confidentiality-only. The deferred FFI half of v0.12 (the v0.8-core → v0.10-FFI cadence). 5 new `c_smoke` tests + doc-only C example `examples/sm4_xts_sector.c`; regenerated header (no `regen-header` change needed — free fns + always-on const). No new `gmcrypto-core` API, no new dudect target, **no new dependency**. **Additive — no public API breakage.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.13.0]`. |
| v0.14 (assurance; not published) | **Parser fuzzing.** Per `docs/v0.14-scope.md` Q14.1–Q14.12. A `cargo-fuzz` (libFuzzer) harness over the full untrusted-input decode/decrypt surface of `gmcrypto-core` (16 targets: PEM, PKCS#8 decode/decrypt, SPKI, SEC1, DER reader primitives, SM2 DER + raw ciphertext, SM2 decrypt + verify, SM4-CBC/GCM/CCM/XTS decrypt) proving the failure-mode invariant on adversarial bytes — no panic / no OOM / no hang. Workspace-excluded `fuzz/` crate (nightly-only; never in the published dep graph) + a capped nightly CI job (`.github/workflows/fuzz-nightly.yml`). Initial sweep: **zero crashes** → no published-crate change, **not a crates.io release** (assurance/infra only). See [`fuzz/README.md`](fuzz/README.md). |
| v0.15.0 (shipped) | **SM4-XTS multi-sector (disk) helper.** Per `docs/v0.15-scope.md` Q15.1–Q15.12. New: `sm4::mode_xts::{encrypt_sectors, decrypt_sectors}` (opt-in `sm4-xts`) — encrypt/decrypt a contiguous run of equal-size disk sectors **in place** (`&mut [u8] -> Option<()>`), sector `i` under tweak = little-endian-128(`start_sector + i`) (the standard disk-XTS data-unit convention; owns the encoding the single-shot v0.12 API left to the caller). Byte-identical to looping the single-shot per sector (transitively OpenSSL `xts_standard=GB`-pinned); whole-block sectors (no ciphertext stealing); ciphers built once + reused scratch (no per-sector allocation); single `None` for all validation with `buf` untouched; confidentiality-only. **Pure-core: no new dependency, no new feature flag, no new SIMD, no new dudect target** (the existing `ct_sm4_xts_decrypt` covers it). C FFI deferred to v0.16. crates.io skips `0.14.0` (the unpublished fuzzing cycle). **Additive — no public API breakage.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.15.0]`. |
| v0.16.0 (shipped) | **C ABI for the SM4-XTS multi-sector helper.** Per `docs/v0.16-scope.md` Q16.1–Q16.12. New: `gmcrypto_sm4_xts_encrypt_sectors` / `_decrypt_sectors` in `gmcrypto-c`, behind the existing forwarding `sm4-xts` feature — **in-place** over a contiguous run of equal-size sectors (`buf: *mut u8` + `buf_len`; no `out`/`out_capacity`/`out_actual_len`, mirroring the core's `&mut [u8]` so disk callers never double-allocate), `start_sector: uint64_t`, tweak = LE-128(`start_sector + i`). Byte-identical to `gmcrypto_core::sm4::mode_xts::{encrypt,decrypt}_sectors`; single `GMCRYPTO_ERR` with `buf` untouched on error; confidentiality-only. The deferred FFI half of v0.15 — every cipher mode is now FFI-complete. 11 new `c_smoke` tests + doc-only C example `examples/sm4_xts_multisector.c`; regenerated header (no `regen-header` change — free fns, no new opaque structs). No new `gmcrypto-core` API, no new dudect target, **no new dependency**. **Additive — no public API breakage.** See [`CHANGELOG.md`](CHANGELOG.md) `[0.16.0]`. |
| v0.17 (public release; not a crates.io release) | **Open-sourced the repository.** Flipped the GitHub repo private → public on the 0.x line; CI migrated off the self-hosted macOS runner to GitHub-hosted (`ci.yml` → `macos-14`, `fuzz-nightly.yml` → `ubuntu-latest`). A *repository* milestone — no crate code changes (workspace stays `0.16.0`; crates.io skips `0.17.0` per the v0.14 precedent); v1.0 reserved. Per [`docs/v0.17-scope.md`](docs/v0.17-scope.md). |
| v0.18 (infra-assurance; not a crates.io release) | **dudect-gate hardening.** Per `docs/v0.18-scope.md` Q18.1–Q18.7. Pinned the dudect CI workflows' drift axes (`ubuntu-24.04` OS-label + exact `dtolnay/rust-toolchain@1.95.0`) and gated on a CI-level multi-run median `\|tau\|` (PR 3 runs / nightly 5 runs; `required_low` + the nightly sentinel on the median, `negative_control` on the min, completeness gate on `< N` runs). `timing_leaks.rs` is byte-unchanged: the loop + median live in CI. A 100K×5 calibration showed `ct_fn_invert`/`ct_fp_invert` back near baseline (medians 0.006–0.028) but **kept on telemetry / sentinel, not re-promoted** (the noise is runner-image-sensitive; a tight gate would re-flake if it returns). Also a comma-free `rust-cache` `shared-key`. A *repository / infra-assurance* milestone: no crate code change (workspace stays `0.16.0`; crates.io skips `0.18.0` per the v0.14 / v0.17 precedent). See [`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md) (v0.18 resolution). |
| v0.19 (infra-assurance; not a crates.io release) | **Self-calibrating relative dudect gate — TESTED and FALSIFIED → honest fallback.** Per `docs/v0.19-scope.md` Q19.1–Q19.7. Added two fix-vs-fix noise-floor probes (`noise_floor_f{n,p}_invert`) + a relative gate `median(target) ≤ max(0.20, 4·median(probe))` to re-promote `ct_fn_invert`/`ct_fp_invert`. The 100K calibration disproved the matched-sensitivity premise: the probes stay quiet (~0.005) while the targets spike to [0.26–0.32] (`ct_fp_invert` median 0.2606, ratio 50) — the noise is in the two-input class split, not the operation, so a same-input probe can't track it. Reverted to telemetry / sentinel @0.55; probes kept as telemetry (evidence for a v0.21+ class-split-aware "noise-twin"). Only the dev-only bench harness changed (workspace stays `0.16.0`; crates.io skips `0.19.0`). See [`docs/v0.5-dudect-recalibration.md`](docs/v0.5-dudect-recalibration.md) (v0.19 resolution). |
| v0.20 (infra-assurance; not a crates.io release) | **Streaming-decryptor differential fuzzing + `cargo fuzz coverage` + codified v1.0 CT baseline.** Per `docs/v0.20-scope.md` Q20.1–Q20.5. Two new differential targets (`fuzz_sm4_{cbc,gcm}_streaming_decrypt`) assert the streaming decryptors fed in arbitrary chunks equal the single-shot oracle; fuzz sweep → 18 targets (zero crashes, zero divergences); a non-gating `cargo fuzz coverage` nightly job (llvm-cov TOTALS artifact). Codified the settled v1.0 CT baseline in `SECURITY.md` (composite targets gated <0.20; the two single-inversion diagnostics on telemetry/sentinel @0.55, narrow revisit door). Theme chosen after a Codex+Grok discussion. Only `fuzz/` + `fuzz-nightly.yml` + docs changed (workspace stays `0.16.0`; crates.io skips `0.20.0`). |
| v0.21 (infra-assurance; not a crates.io release) | **v1.0 readiness audit.** Per `docs/v0.21-scope.md` Q21.1–Q21.9. Froze + tooling-guarded the public API ahead of 1.0: committed `cargo-public-api` baselines + an enforced drift-check, `cargo-semver-checks` (informational pre-1.0), a `cargo doc -D warnings` gate, and a `--no-default-features`/`--all-features` matrix (new `.github/workflows/api-stability.yml`); finalized the `#[doc(hidden)]` surface (3 core items + the whole `gmcrypto-simd` internal backend) with "not public / not SemVer" notes + existence tests; froze the docs. Non-publishing (doc-attributes + tests only, no behavior change; workspace stays `0.16.0`, crates.io skips `0.21.0`). **Headline finding:** the always-on public API names `crypto-bigint 0.7` types — a decision to resolve before 1.0 ([`docs/v1.0-readiness.md`](docs/v1.0-readiness.md) §3.A). Deferred to post-1.0: class-split-aware "noise-twin" dudect reference; round-trip/differential parser fuzzing; `aead 0.6` (upstream `0.6.0-rc.10`); AVX-512 `sbox_x64`; CCM buffered input; the `dudect-nightly` leg-cancellation fix. |
| v0.22 (infra-assurance; not a crates.io release) | **API-tightening — decouple `crypto-bigint 0.7` from the 1.0 contract.** Per `docs/v0.22-scope.md` Q22.1–Q22.8 (resolves the v0.21 §3.A finding via Option 2). Group A: `#[doc(hidden)]` (kept `pub`) the low-level `sm2::curve` / `sm2::scalar_mul` / `ProjectivePoint::to_affine` surface. Group B: reshape `asn1::{encode,decode}_sig` + `Sm2Ciphertext::{x,y}` from `U256` to `[u8; 32]`, **byte-output-identical** (KAT + gmssl interop 11/11). Group C: `ProjectivePoint` stays public + unchanged. The always-on (default-features) public API now names **zero** `crypto-bigint` types; only the opt-in `crypto-bigint-scalar` `from_scalar(U256)` retains it (documented escape hatch). **BREAKING** for consumers that named `Fn`/`Fp`/`encode_sig`/`Sm2Ciphertext::x`; ships with 1.0 (non-publishing — workspace stays `0.16.0`, crates.io skips `0.22.0`). |
| v0.23 (infra-assurance; not a crates.io release) | **Pre-1.0 re-audit remediation.** Per `docs/v0.23-scope.md` Q23.1–Q23.9 + `docs/v1.0-reaudit.md`. A multi-model adversarial pre-publish re-audit (Codex `gpt-5.5` + Grok, source-verified) returned NO-GO as-is — core primitives sound, but 2 API/ABI BLOCKERs + API-finality / zeroize-on-failure / spec-ceiling / doc should-fixes. Remediated: **W1 (API)** `Sm2PrivateKey::public_key() -> Sm2PublicKey`, the raw `ProjectivePoint` surface + `asn1::{reader,writer,oid}` + `traits::*` made `#[doc(hidden)]`; **W2 (crypto)** single-shot SM4-GCM `encrypt` made fallible (`2^36−32` ceiling), the fallible `rand_core::TryCryptoRng` bound on SM2 sign/encrypt (no-panic RNG-failure path), a fixed-budget constant-time SM2 nonce sampler, sign-nonce / CCM-tentative-plaintext / `Sm3`-on-drop zeroization, SM2 KDF wrap guard; **W3 (C ABI)** the SM4-GCM/CCM/XTS FFI symbols made always-on so `gmcrypto.h` == the default build. **Runtime output byte-identical** (gmssl interop 11/11) except the deliberately-changed signatures; the breaking API/ABI changes ship with 1.0 (non-publishing — workspace stays `0.16.0`, crates.io skips `0.23.0`). |
| v1.0 | **API stabilization + crates.io publish** (the deliberate cut after the audit + tightening + re-audit: the `crypto-bigint`-exposure decision is **resolved** [v0.22] and the pre-publish re-audit findings **remediated** [v0.23], bump `0.16.0 → 1.0.0` with exact sibling pins, publish `gmcrypto-simd → core → c`, flip `cargo-semver-checks` to enforced — see the runbook in [`docs/v1.0-readiness.md`](docs/v1.0-readiness.md) §4). |
| v1.0.1 (shipped) | **Readiness-cleanup patch — first post-1.0 publish.** Per the release-readiness synthesis [`docs/audits/2026-06-02-release-readiness-synthesis.md`](docs/audits/2026-06-02-release-readiness-synthesis.md) (GO-WITH-FOLLOWUP, 0 blockers). **Functional fix:** the `gmcrypto-c` `gmcrypto_version()` returned a hardcoded `"0.4.0"` → now the real `CARGO_PKG_VERSION` (the one behavior change justifying a patch publish). Plus doc improvements (raw-block ECB warnings, cbindgen header preconditions, FFI RNG/XTS notes, trait-stability caveats) + CI-health fixes (`sm4-xts` in MSRV/wasm/deny; dudect allowlist; `generate-lockfile` before deny; a new `simd-x86` job that caught a latent `unsafe_code` compile bug; removed `pull_request` `paths-ignore` so docs PRs aren't blocked). **No API/ABI change; wire output byte-identical to 1.0.0** (enforced `cargo-semver-checks`). 6 merged PRs (#87–#92). See [`CHANGELOG.md`](CHANGELOG.md) `[1.0.1]`. |
| v1.1.0 | **SM2 key exchange (GM/T 0003.3) with key confirmation.** Per [`docs/v1.1-sm2-key-exchange-design.md`](docs/v1.1-sm2-key-exchange-design.md) + `docs/v1.1-scope.md`. New `sm2::key_exchange` module behind the opt-in `sm2-key-exchange` feature (pure-core, no new dependency): `Sm2KxInitiator`/`Sm2KxResponder` role state-machines with typestate-enforced single-use ephemerals and commit-on-confirm key release; byte-identical to the GM/T 0003.5-2012 recommended-curve worked example (K + S_A/S_B); new dudect target `ct_sm2_key_exchange` + fuzz target `fuzz_sm2_kx`. C FFI deferred to v1.2. **Additive — no public API breakage.** |
| v1.2.0 | **C FFI for SM2 key exchange.** Per `docs/v1.2-scope.md` Q2.1–Q2.10. 9 new `gmcrypto-c` symbols + 2 opaque handles (`gmcrypto_sm2_kx_{initiator,responder}_t`) project the v1.1 typestate into C: initiator born-waiting (`_new` emits `R_A`), `_confirm`/`_finish` consume + free, failed-respond spends the handle; SysRng defaults + `_with_rng` variants; always-on per the v0.23 posture (72 FFI entry points). The GM/T 0003.5 recommended-curve KAT reproduced **byte-for-byte through the C ABI**; FFI↔Rust cross-handshakes both directions; `fuzz_c_abi` KX op + seed; doc-only `sm2_key_exchange.c` example. No core API change. **Additive — no breakage.** |

## Quick-start

```rust
use gmcrypto_core::sm2::{
    sign_with_id, verify_with_id, Sm2PrivateKey, DEFAULT_SIGNER_ID,
};
use getrandom::SysRng;
use hex_literal::hex;

// v0.5 W5 — `from_bytes_be` is the recommended public constructor
// (always-on, doesn't expose `crypto_bigint::U256` to callers).
let d_be: [u8; 32] = hex!(
    "3945208F7B2144B13F36E38AC6D39F95889393692860B51A42FB81EF4DF7C5B8"
);
let key = Sm2PrivateKey::from_bytes_be(&d_be).expect("d in [1, n-2]");
// `public_key()` returns an `Sm2PublicKey` directly (v0.23).
let public = key.public_key();

// SM2 sign/encrypt take a fallible `rand_core::TryCryptoRng` (v0.23), so
// `getrandom::SysRng` is passed directly — no `UnwrapErr` wrapper.
let mut rng = SysRng;
let sig = sign_with_id(&key, DEFAULT_SIGNER_ID, b"hello", &mut rng).unwrap();
assert!(verify_with_id(&public, DEFAULT_SIGNER_ID, b"hello", &sig));
```

**SM2 key exchange** (v1.1, opt-in — `gmcrypto-core = { version = "1.2",
features = ["sm2-key-exchange"] }`): an authenticated two-party key agreement
with mandatory key confirmation. Each step consumes the state machine, so an
ephemeral cannot be reused and neither side sees the key before the peer's
confirmation tag verifies:

```rust
use gmcrypto_core::sm2::key_exchange::{Sm2KxInitiator, Sm2KxResponder};

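// A (initiator) and B (responder) hold each other's static public keys.
// `key_a`/`key_b` are the parties' static private keys, `pub_a`/`pub_b` the
// matching public keys, and `rng` is a `SysRng` as in the first example.
// The `?` operators assume an enclosing function returning a compatible `Result`.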
let init = Sm2KxInitiator::new(&key_a, &pub_b, b"A-id", b"B-id", 32)?;
let (r_a, init_waiting) = init.produce_ephemeral(&mut rng)?; // R_A -> B

let resp = Sm2KxResponder::new(&key_b, &pub_a, b"A-id", b"B-id", 32)?;
let (r_b, s_b, resp_waiting) = resp.respond(&r_a, &mut rng)?; // (R_B, S_B) -> A

let (k_a, s_a) = init_waiting.confirm(&r_b, &s_b)?; // verifies S_B; S_A -> B
let k_b = resp_waiting.finish(&s_a)?;               // verifies S_A
assert_eq!(k_a.as_bytes(), k_b.as_bytes());         // 32-byte agreed key
```

(From C, the same handshake is `gmcrypto_sm2_kx_*` — see the v1.2 scope above
and [`crates/gmcrypto-c/examples/sm2_key_exchange.c`](crates/gmcrypto-c/examples/sm2_key_exchange.c).)
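
The consume-on-each-step design described above can be illustrated without any cryptography. The following is a minimal sketch with hypothetical types (`Fresh`, `Waiting`, `SharedKey`, `toy_tag`), none of which come from `gmcrypto-core`. It shows only how taking `self` by value makes each state single-use, and how the key is produced only after the peer's tag checks out. The toy tag and the plain `==` comparison are placeholders, not secure constructions.

```rust
// Illustrative typestate sketch: hypothetical names, no real cryptography.

/// State 1: ready to emit an ephemeral value.
pub struct Fresh {
    secret: u64,
}

/// State 2: ephemeral sent, waiting for the peer's confirmation tag.
pub struct Waiting {
    secret: u64,
    ephemeral: u64,
}

/// Only constructed after the peer's tag has been checked.
#[derive(Debug, PartialEq, Eq)]
pub struct SharedKey(pub u64);

#[derive(Debug, PartialEq, Eq)]
pub struct ConfirmError;

/// Toy stand-in for a confirmation tag (NOT a secure construction).
pub fn toy_tag(secret: u64, ephemeral: u64) -> u64 {
    secret.rotate_left(13) ^ ephemeral
}

impl Fresh {
    pub fn new(secret: u64) -> Self {
        Fresh { secret }
    }

    /// Takes `self` by value: the `Fresh` value is moved by this call,
    /// so the compiler rejects a second `produce_ephemeral` on it.
    pub fn produce_ephemeral(self, nonce: u64) -> (u64, Waiting) {
        let ephemeral = self.secret.wrapping_mul(31).wrapping_add(nonce);
        (ephemeral, Waiting { secret: self.secret, ephemeral })
    }
}

impl Waiting {
    /// Also consumes `self`. The key exists only on the success path;
    /// on failure the caller gets an error and the state is gone.
    pub fn confirm(self, peer_tag: u64) -> Result<SharedKey, ConfirmError> {
        if peer_tag == toy_tag(self.secret, self.ephemeral) {
            Ok(SharedKey(self.secret ^ self.ephemeral))
        } else {
            Err(ConfirmError)
        }
    }
}
```

Calling `produce_ephemeral` twice on the same `Fresh` value, or `confirm` twice on the same `Waiting` value, is a compile-time use-after-move error, which is the kind of guarantee a typestate API provides.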

## Threat model

See [`SECURITY.md`](SECURITY.md). In brief, the model assumes server-side use
on a dedicated host with a trusted operator. Network MITM attacks are in scope;
side-channel attacks beyond what the dudect harness covers are NOT in scope.

## Build & test

```bash
cargo test --workspace                                                          # unit + integration
cargo bench --bench timing_leaks --features crypto-bigint-scalar                # local timing harness (~75s)
DUDECT_SAMPLES=10000 cargo bench --bench timing_leaks --features crypto-bigint-scalar  # match CI smoke budget
```

`gmssl` interop test (gated; install [`gmssl`](https://github.com/guanzhi/GmSSL)
v3.1.1 to enable):

```bash
GMCRYPTO_GMSSL=1 cargo test --test interop_gmssl
```
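
For readers unfamiliar with environment-gated tests, the general pattern can be sketched as follows. This is an assumption-laden illustration, not the project's actual test code: the helper names `gate_enabled` and `interop_enabled` are hypothetical, and treating only the value `1` as "enabled" is an assumption based on the command above.

```rust
use std::env;

/// Hypothetical helper: true only when the gate variable holds "1".
/// (The real test may accept other values; this is an assumption.)
fn gate_enabled(value: Option<&str>) -> bool {
    matches!(value, Some("1"))
}

/// Reads the process environment. A gated test would call this first
/// and return early (skipping the interop checks) when it is false.
fn interop_enabled() -> bool {
    gate_enabled(env::var("GMCRYPTO_GMSSL").ok().as_deref())
}
```

Keeping the decision in a pure function (`gate_enabled`) makes the gate itself testable without mutating the process environment.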

## wasm32 support

`gmcrypto-core` builds on `wasm32-unknown-unknown` as of v0.4. CI gates
both stable and MSRV (1.85) builds on the target.

```bash
rustup target add wasm32-unknown-unknown
cargo build -p gmcrypto-core --target wasm32-unknown-unknown --no-default-features
```

The crate is `no_std + alloc` only and does NOT pull `getrandom`'s
`wasm_js` backend or `wasm-bindgen` / `js-sys` into its default dep
graph. Wasm callers supply their own RNG (SM2 sign/encrypt take a fallible
`rand_core::TryCryptoRng`, as in the Quick-start), typically
by enabling `getrandom`'s `wasm_js` feature in *their* `Cargo.toml`:

```toml
[dependencies]
gmcrypto-core = "1.0"
rand_core = { version = "0.10", default-features = false }
getrandom = { version = "0.4", default-features = false, features = ["wasm_js"] }
```

```rust
use gmcrypto_core::sm2::{sign_with_id, Sm2PrivateKey, DEFAULT_SIGNER_ID};
use getrandom::SysRng;

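// `d_be` is the caller's 32-byte big-endian private scalar (see Quick-start).
let priv_key = Sm2PrivateKey::from_bytes_be(&d_be).expect("d in [1, n-2]");
let mut rng = SysRng; // wasm_js-backed when targeting wasm32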
let sig = sign_with_id(&priv_key, DEFAULT_SIGNER_ID, b"msg", &mut rng).unwrap();
```

A `wasm-bindgen-test`-driven test runner (running KAT vectors under
Node or a headless browser) is post-v0.4 — v0.4 ships the build-target
gate only.

## License

Apache-2.0. See [`LICENSE`](LICENSE).

Some reference outputs use the upstream [`gmssl`](https://github.com/guanzhi/GmSSL)
tool. This project is independent of that project.