krypteia-arcana 0.1.0

Pure-Rust classical cryptographic primitives: RSA (PKCS#1 v1.5, OAEP), ECC (NIST P-256/384/521, secp256k1), ECDSA, EdDSA (Ed25519), X25519, AES (128/192/256, GCM/CBC), DES/3DES, SHA-1/2/3, HMAC. Side-channel-aware (Montgomery ladder, branchless point_add_ct). Targets embedded (no_std), STM32 M0/M4/M33, ESP32-C3 RISC-V. Zero runtime dependencies.
//! X25519 Diffie-Hellman key agreement on Curve25519 (RFC 7748).
//!
//! X25519 is the elliptic-curve Diffie-Hellman (ECDH) function
//! defined over Curve25519, a Montgomery curve over the prime field
//! `p = 2^255 - 19`. It is the ECDH variant used by TLS 1.3, Noise,
//! Signal, WireGuard, and most modern protocols that want a fixed,
//! fast, audit-friendly ECDH.
//!
//! Unlike the short-Weierstrass curves exposed through the
//! [`super::curves::Curve`] trait, X25519 works entirely on the
//! **u-coordinate** (the x-coordinate of the Montgomery form). There
//! is no point addition, no compressed/uncompressed distinction, and
//! no branch on the y-coordinate — the Montgomery ladder operates on
//! projective (X:Z) pairs and returns a single 32-byte little-endian
//! u-coordinate.
//!
//! # Side-channel posture
//!
//! Per `arcana/doc/sca/countermeasures/x25519_x448.rst`:
//!
//! | Threat                                            | Status     | Roadmap item                                              |
//! |---------------------------------------------------|------------|-----------------------------------------------------------|
//! | Cache-timing on Montgomery ladder                 | partial    | `T1-G` — audit pass mirroring Weierstrass commit `76191c1`|
//! | SPA on Cortex-M0 (Weissbart-Picek-Batina 2021)    | vulnerable | `T1-G` + `T2-A` (Z-rerand defeats their template attack)  |
//! | DPA on field operations                           | vulnerable | `T2-A` — Z-rerandomization on `(X : Z)`                   |
//! | Template attacks                                  | vulnerable | `T2-A` (alignment break) + `T2-B` (scalar blinding)       |
//! | Invalid-curve attack on peer pubkey               | covered    | RFC 7748 twist security                                   |
//! | Small-subgroup contributory check                 | partial    | `T2-K` — confirm CT all-zero rejection                    |
//!
//! The X25519 ladder is constant-time (CT) **by construction**
//! (single u-coordinate, no special cases for the neutral element),
//! but the concrete Rust implementation can still leak through the
//! `cswap` mask pattern (see `super::field` `black_box` shielding)
//! and through unmodelled cache-line accesses. The
//! `weissbart2021_curve25519_ml_sca` paper demonstrated
//! deep-learning template attacks on Cortex-M0 even against
//! random-delay defences; Z-rerandomization is the standard
//! answer.
//!
//! # API
//!
//! ```rust,ignore
//! use arcana::ecc::x25519::{x25519_derive_public, x25519_ecdh};
//!
//! // Alice and Bob each draw 32 random bytes as their secret key.
//! let alice_sk: [u8; 32] = /* rng */;
//! let bob_sk:   [u8; 32] = /* rng */;
//!
//! // Derive public keys.
//! let alice_pk = x25519_derive_public(&alice_sk);
//! let bob_pk   = x25519_derive_public(&bob_sk);
//!
//! // Exchange public keys, then each derives the shared secret.
//! let s_ab = x25519_ecdh(&alice_sk, &bob_pk);
//! let s_ba = x25519_ecdh(&bob_sk,   &alice_pk);
//! assert_eq!(s_ab, s_ba);
//! ```
//!
//! # Test vectors
//!
//! The tests at the bottom of this file pin the two §5.2 primitive
//! vectors and the §6.1 full Diffie-Hellman vector directly from
//! RFC 7748. Any future regression in the ladder, the clamping, or
//! the LE byte encoding fails against those bytes immediately.

use super::field::*;

// ============================================================================
// Curve constants (RFC 7748 §4.1)
// ============================================================================

/// `a24 = (486662 - 2) / 4 = 121665` — the constant used inside the
/// Montgomery ladder's doubling step.
///
/// Declared as a `FieldElement<4>` so it can be fed directly into
/// `field_mul` without re-encoding on every call.
fn a24() -> FieldElement<4> {
    let mut fe = FieldElement::<4>::ZERO;
    fe.limbs[0] = 121_665;
    fe
}

/// X25519 base point u-coordinate = 9 (RFC 7748 §4.1).
const BASE_U: [u8; 32] = {
    let mut b = [0u8; 32];
    b[0] = 9;
    b
};

// ============================================================================
// Scalar clamping + u-coordinate decoding (RFC 7748 §5)
// ============================================================================

/// RFC 7748 §5 `decodeScalar25519`: clamps the 32-byte secret scalar
/// so that the resulting integer always lies in [2^254, 2^255) and
/// is a multiple of 8. This is the property that lets the Montgomery
/// ladder always iterate exactly 255 times with the top bit at
/// position 254 guaranteed to be 1.
///
/// Concretely:
/// - Clear the 3 low bits of byte 0 (= make the scalar a multiple of 8)
/// - Clear the high bit of byte 31 (= force bit 255 = 0)
/// - Set bit 6 of byte 31 (= force bit 254 = 1)
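///
/// The three steps above can be sketched in isolation. This is a
/// minimal illustration, not this module's API: `clamp_sketch` is a
/// hypothetical name introduced only for the example.
///
/// ```rust
/// // Hypothetical stand-alone copy of the three clamping steps.
/// fn clamp_sketch(mut k: [u8; 32]) -> [u8; 32] {
///     k[0] &= 0b1111_1000; // clear bits 0..=2: multiple of 8
///     k[31] &= 0b0111_1111; // clear bit 255
///     k[31] |= 0b0100_0000; // set bit 254
///     k
/// }
/// ```
///
/// Under this sketch an all-`0xff` input ends up with `0xf8` in byte 0
/// and `0x7f` in byte 31, and an all-zero input ends up with `0x40` in
/// byte 31; every other byte is left untouched.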
fn decode_scalar(scalar: &[u8; 32]) -> [u8; 32] {
    let mut k = *scalar;
    k[0] &= 248; // 0b11111000
    k[31] &= 127; // 0b01111111
    k[31] |= 64; // 0b01000000
    k
}

/// RFC 7748 §5 `decodeUCoordinate` (Curve25519 variant):
/// clear the most significant bit of the last byte, then interpret
/// as a little-endian integer in `Fp`.
///
/// Clearing the top bit matters because some peers (historically TLS)
/// don't mask it themselves; RFC 7748 §5 requires the receiver to
/// mask it rather than reject the input.
fn decode_u(u: &[u8; 32]) -> FieldElement<4> {
    let mut buf = *u;
    buf[31] &= 0x7f;
    FieldElement::<4>::from_bytes_le(&buf)
}

/// Encode a field element as 32 little-endian bytes.
fn encode_u(fe: &FieldElement<4>) -> [u8; 32] {
    let v = fe.to_bytes_le();
    let mut out = [0u8; 32];
    out.copy_from_slice(&v);
    out
}

// ============================================================================
// Constant-time conditional swap of two field elements
// ============================================================================

/// Constant-time swap of `a` and `b` iff `swap == 1`.
///
/// `swap` must be 0 or 1. It is expanded into an all-zeros or
/// all-ones word mask and applied with the XOR trick, so the same
/// instructions and memory accesses run for either value (essential:
/// the `swap` bit is derived from the secret scalar).
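///
/// The mask trick can be sketched on a single 64-bit word. This is an
/// illustration only: `ct_swap_u64` is a hypothetical helper, not part
/// of this module, and it assumes nothing beyond `core` integer
/// arithmetic.
///
/// ```rust
/// // Swap `a` and `b` when `swap == 1`; leave them alone when `swap == 0`.
/// fn ct_swap_u64(a: &mut u64, b: &mut u64, swap: u64) {
///     // 0 -> 0x0000_0000_0000_0000, 1 -> 0xffff_ffff_ffff_ffff
///     let mask = 0u64.wrapping_sub(swap);
///     // `t` is `a ^ b` when swapping and 0 otherwise.
///     let t = mask & (*a ^ *b);
///     *a ^= t;
///     *b ^= t;
/// }
/// ```
///
/// With `swap == 1`, `a ^ (a ^ b) == b` and `b ^ (a ^ b) == a`, so the
/// values trade places; with `swap == 0`, `t == 0` and both XORs are
/// no-ops.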
fn ct_swap_fe(a: &mut FieldElement<4>, b: &mut FieldElement<4>, swap: u64) {
    let mask = 0u64.wrapping_sub(swap);
    for i in 0..4 {
        let t = mask & (a.limbs[i] ^ b.limbs[i]);
        a.limbs[i] ^= t;
        b.limbs[i] ^= t;
    }
}

// ============================================================================
// The X25519 primitive — RFC 7748 §5, Montgomery ladder on (X:Z)
// ============================================================================

/// RFC 7748 §5 `X25519(scalar, u)`.
///
/// Takes a 32-byte little-endian scalar and a 32-byte little-endian
/// u-coordinate, and returns the 32-byte little-endian u-coordinate
/// of `scalar * (u, v)` on Curve25519 (where `v` is uniquely
/// determined by `u` up to sign — we never need it).
///
/// The ladder operates on projective `(X, Z)` pairs where the affine
/// u-coordinate is `X/Z`. 255 ladder steps are performed (bits 254..0
/// of the clamped scalar); the high bit 254 is always 1 after
/// clamping, so the cswap in the first iteration deterministically
/// turns the initial state into (x_2, x_3) = (u, 1), (z_2, z_3) = (1, 0).
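///
/// The ladder reads the clamped scalar one bit at a time, from bit 254
/// down to bit 0. A minimal sketch of that indexing follows;
/// `scalar_bit` is a hypothetical helper used only for illustration,
/// not part of this module.
///
/// ```rust
/// // Bit `t` of a 32-byte little-endian scalar, returned as 0 or 1.
/// fn scalar_bit(k: &[u8; 32], t: usize) -> u64 {
///     // `t >> 3` selects the byte, `t & 7` the bit inside that byte.
///     ((k[t >> 3] >> (t & 7)) & 1) as u64
/// }
/// ```
///
/// For any clamped scalar `k`, `scalar_bit(&k, 254)` is 1 and
/// `scalar_bit(&k, t)` is 0 for `t` in `0..=2`.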
pub fn x25519(scalar: &[u8; 32], u: &[u8; 32]) -> [u8; 32] {
    let k = decode_scalar(scalar);
    let x1 = decode_u(u);
    let a24 = a24();
    let p = &CURVE25519_P;

    // Ladder state: x_2 = 1, z_2 = 0, x_3 = x1, z_3 = 1.
    let mut x_2 = FieldElement::<4>::one();
    let mut z_2 = FieldElement::<4>::ZERO;
    let mut x_3 = x1;
    let mut z_3 = FieldElement::<4>::one();

    // `swap` tracks the conditional swap state between iterations.
    // See RFC 7748 §5 for the reference Python pseudocode.
    let mut swap: u64 = 0;

    for t in (0..=254).rev() {
        let k_t = ((k[t >> 3] >> (t & 7)) & 1) as u64;
        swap ^= k_t;
        ct_swap_fe(&mut x_2, &mut x_3, swap);
        ct_swap_fe(&mut z_2, &mut z_3, swap);
        swap = k_t;

        // A  = x_2 + z_2
        // AA = A^2
        // B  = x_2 - z_2
        // BB = B^2
        // E  = AA - BB
        // C  = x_3 + z_3
        // D  = x_3 - z_3
        // DA = D * A
        // CB = C * B
        // x_3 = (DA + CB)^2
        // z_3 = x_1 * (DA - CB)^2
        // x_2 = AA * BB
        // z_2 = E * (AA + a24 * E)
        let a = field_add(&x_2, &z_2, p);
        let aa = field_sqr(&a, p);
        let b = field_sub(&x_2, &z_2, p);
        let bb = field_sqr(&b, p);
        let e = field_sub(&aa, &bb, p);
        let c = field_add(&x_3, &z_3, p);
        let d = field_sub(&x_3, &z_3, p);
        let da = field_mul(&d, &a, p);
        let cb = field_mul(&c, &b, p);

        let da_plus_cb = field_add(&da, &cb, p);
        x_3 = field_sqr(&da_plus_cb, p);

        let da_minus_cb = field_sub(&da, &cb, p);
        let da_minus_cb_sq = field_sqr(&da_minus_cb, p);
        z_3 = field_mul(&x1, &da_minus_cb_sq, p);

        x_2 = field_mul(&aa, &bb, p);

        let a24_e = field_mul(&a24, &e, p);
        let aa_plus_a24e = field_add(&aa, &a24_e, p);
        z_2 = field_mul(&e, &aa_plus_a24e, p);
    }

    // Final swap based on the residual `swap` state.
    ct_swap_fe(&mut x_2, &mut x_3, swap);
    ct_swap_fe(&mut z_2, &mut z_3, swap);

    // Return x_2 / z_2 = x_2 * z_2^{p-2} mod p as 32 LE bytes.
    let z_inv = field_inv(&z_2, p);
    let result = field_mul(&x_2, &z_inv, p);
    encode_u(&result)
}

// ============================================================================
// Public API — keygen + ECDH convenience wrappers
// ============================================================================

/// Derive the X25519 **public key** from a 32-byte secret key.
///
/// The secret key is any 32 random bytes; the caller is responsible
/// for drawing them from a suitable CSPRNG. The returned public key
/// is the 32-byte u-coordinate of `sk * base` on Curve25519.
///
/// This is `X25519(sk, 9)` per RFC 7748 §5.
pub fn x25519_derive_public(sk: &[u8; 32]) -> [u8; 32] {
    x25519(sk, &BASE_U)
}

/// X25519 Diffie-Hellman: derive a shared secret from our secret key
/// and the peer's public key.
///
/// Returns the 32-byte u-coordinate of `sk * peer_pk`. This is the
/// raw shared secret (NIST SP 800-56A "Z"); pass it to an HKDF or
/// similar KDF before using it for symmetric keying.
///
/// # Small-subgroup attack note
///
/// RFC 7748 §6.1 notes that X25519 accepts low-order public keys
/// (points in small subgroups), which make the exchange
/// non-contributory: the shared secret collapses to the all-zero
/// value regardless of our secret key. A defensive implementation
/// may want to reject the result if it is all-zero, which signals
/// the peer sent a low-order point. We deliberately do **not** reject
/// here because (a) the spec allows it, and (b) a downstream KDF with
/// context binding (TLS 1.3 `transcript_hash`, Noise `HKDF`, ...) is
/// the standard mitigation. Callers who want the low-order check can
/// run it themselves, without an early exit so that the comparison
/// does not leak timing:
///
/// ```rust,ignore
/// let shared = x25519_ecdh(&sk, &peer_pk);
/// // OR all bytes together instead of stopping at the first non-zero one.
/// if shared.iter().fold(0u8, |acc, &b| acc | b) == 0 {
///     return Err("non-contributory shared secret");
/// }
/// ```
pub fn x25519_ecdh(sk: &[u8; 32], peer_pk: &[u8; 32]) -> [u8; 32] {
    x25519(sk, peer_pk)
}

// ============================================================================
// Tests (RFC 7748 pinned vectors)
// ============================================================================

#[cfg(test)]
mod tests {
    use super::*;

    fn hex32(h: &str) -> [u8; 32] {
        assert_eq!(h.len(), 64);
        let mut out = [0u8; 32];
        for i in 0..32 {
            out[i] = u8::from_str_radix(&h[2 * i..2 * i + 2], 16).unwrap();
        }
        out
    }

    // ----- RFC 7748 §5.2 primitive test vector #1 -----
    //
    // Scalar:      a546e36bf0527c9d3b16154b82465edd62144c0ac1fc5a18506a2244ba449ac4
    // u:           e6db6867583030db3594c1a424b15f7c726624ec26b3353b10a903a6d0ab1c4c
    // X25519(k,u): c3da55379de9c6908e94ea4df28d084f32eccf03491c71f754b4075577a28552
    #[test]
    fn rfc7748_section_5_2_vector_1() {
        let scalar = hex32("a546e36bf0527c9d3b16154b82465edd62144c0ac1fc5a18506a2244ba449ac4");
        let u = hex32("e6db6867583030db3594c1a424b15f7c726624ec26b3353b10a903a6d0ab1c4c");
        let expected = hex32("c3da55379de9c6908e94ea4df28d084f32eccf03491c71f754b4075577a28552");
        let got = x25519(&scalar, &u);
        assert_eq!(got, expected);
    }

    // ----- RFC 7748 §5.2 primitive test vector #2 -----
    //
    // Scalar:      4b66e9d4d1b4673c5ad22691957d6af5c11b6421e0ea01d42ca4169e7918ba0d
    // u:           e5210f12786811d3f4b7959d0538ae2c31dbe7106fc03c3efc4cd549c715a493
    // X25519(k,u): 95cbde9476e8907d7aade45cb4b873f88b595a68799fa152e6f8f7647aac7957
    #[test]
    fn rfc7748_section_5_2_vector_2() {
        let scalar = hex32("4b66e9d4d1b4673c5ad22691957d6af5c11b6421e0ea01d42ca4169e7918ba0d");
        let u = hex32("e5210f12786811d3f4b7959d0538ae2c31dbe7106fc03c3efc4cd549c715a493");
        let expected = hex32("95cbde9476e8907d7aade45cb4b873f88b595a68799fa152e6f8f7647aac7957");
        let got = x25519(&scalar, &u);
        assert_eq!(got, expected);
    }

    // ----- RFC 7748 §6.1 full Diffie-Hellman test vector -----
    //
    // Alice's private key: 77076d0a7318a57d3c16c17251b26645df4c2f87ebc0992ab177fba51db92c2a
    // Alice's public key:  8520f0098930a754748b7ddcb43ef75a0dbf3a0d26381af4eba4a98eaa9b4e6a
    // Bob's private key:   5dab087e624a8a4b79e17f8b83800ee66f3bb1292618b6fd1c2f8b27ff88e0eb
    // Bob's public key:    de9edb7d7b7dc1b4d35b61c2ece435373f8343c85b78674dadfc7e146f882b4f
    // Shared secret (K):   4a5d9d5ba4ce2de1728e3bf480350f25e07e21c947d19e3376f09b3c1e161742
    #[test]
    fn rfc7748_section_6_1_alice_pk() {
        let alice_sk = hex32("77076d0a7318a57d3c16c17251b26645df4c2f87ebc0992ab177fba51db92c2a");
        let expected = hex32("8520f0098930a754748b7ddcb43ef75a0dbf3a0d26381af4eba4a98eaa9b4e6a");
        assert_eq!(x25519_derive_public(&alice_sk), expected);
    }

    #[test]
    fn rfc7748_section_6_1_bob_pk() {
        let bob_sk = hex32("5dab087e624a8a4b79e17f8b83800ee66f3bb1292618b6fd1c2f8b27ff88e0eb");
        let expected = hex32("de9edb7d7b7dc1b4d35b61c2ece435373f8343c85b78674dadfc7e146f882b4f");
        assert_eq!(x25519_derive_public(&bob_sk), expected);
    }

    #[test]
    fn rfc7748_section_6_1_shared_secret() {
        let alice_sk = hex32("77076d0a7318a57d3c16c17251b26645df4c2f87ebc0992ab177fba51db92c2a");
        let bob_sk = hex32("5dab087e624a8a4b79e17f8b83800ee66f3bb1292618b6fd1c2f8b27ff88e0eb");
        let alice_pk = x25519_derive_public(&alice_sk);
        let bob_pk = x25519_derive_public(&bob_sk);
        let expected = hex32("4a5d9d5ba4ce2de1728e3bf480350f25e07e21c947d19e3376f09b3c1e161742");
        assert_eq!(x25519_ecdh(&alice_sk, &bob_pk), expected);
        assert_eq!(x25519_ecdh(&bob_sk, &alice_pk), expected);
    }

    // ----- Roundtrip on an arbitrary key pair -----
    #[test]
    fn x25519_roundtrip_custom_keys() {
        let alice_sk = hex32("0101010101010101010101010101010101010101010101010101010101010101");
        let bob_sk = hex32("0202020202020202020202020202020202020202020202020202020202020202");
        let alice_pk = x25519_derive_public(&alice_sk);
        let bob_pk = x25519_derive_public(&bob_sk);
        let s_ab = x25519_ecdh(&alice_sk, &bob_pk);
        let s_ba = x25519_ecdh(&bob_sk, &alice_pk);
        assert_eq!(s_ab, s_ba);
        // Sanity: the shared secret is non-zero (an all-zero output
        // would signal a low-order input point; public keys derived
        // from the base point lie in the prime-order subgroup).
        assert!(s_ab.iter().any(|&b| b != 0));
    }

    // ----- Clamping test -----
    //
    // Changing any of the bits that `decode_scalar` forces (low 3
    // bits of byte 0, top 2 bits of byte 31) must not change the
    // output. This exercises the clamping path in isolation.
    #[test]
    fn clamping_is_idempotent() {
        let base = hex32("a546e36bf0527c9d3b16154b82465edd62144c0ac1fc5a18506a2244ba449ac4");
        let u = hex32("e6db6867583030db3594c1a424b15f7c726624ec26b3353b10a903a6d0ab1c4c");
        let ref_out = x25519(&base, &u);

        // Force-dirty the bits that clamping will clear anyway.
        let mut dirty = base;
        dirty[0] |= 0b0000_0111; // set low 3 bits (will be cleared)
        dirty[31] |= 0b1000_0000; // set high bit (will be cleared)
        dirty[31] &= !0b0100_0000; // clear bit 6 (will be set)
        let dirty_out = x25519(&dirty, &u);
        assert_eq!(ref_out, dirty_out);
    }

    // ----- High-bit-of-u decoding test -----
    //
    // decode_u clears the top bit of byte 31 per RFC 7748, so toggling
    // that bit on the input u must not change the output.
    #[test]
    fn u_high_bit_is_ignored() {
        let scalar = hex32("a546e36bf0527c9d3b16154b82465edd62144c0ac1fc5a18506a2244ba449ac4");
        let u = hex32("e6db6867583030db3594c1a424b15f7c726624ec26b3353b10a903a6d0ab1c4c");
        let ref_out = x25519(&scalar, &u);

        let mut u_dirty = u;
        u_dirty[31] |= 0x80;
        let dirty_out = x25519(&scalar, &u_dirty);
        assert_eq!(ref_out, dirty_out);
    }
}