cnfy-uint
Pure Rust 256-bit modular arithmetic. Zero dependencies. Zero unsafe code.
For cryptographic applications. No heap allocations.
Part of the cnfy suite -- cryptographic primitives built from scratch to eliminate supply chain attacks.
Performance
Full benchmarks and methodology: cnfy-bench. Security analysis: SECURITY.md.
We benchmark against crypto-bigint, ruint, ethnum, num-bigint, and uint -- all excellent libraries that we've learned from.
Why It's Fast
Sparse prime optimization. secp256k1's field prime P = 2^256 - 0x1000003D1 has a tiny complement. Reduction folds excess bits instead of running full division -- every modular operation benefits.
Cross-crate inlining. All 47 hot-path methods are #[inline]. cnfy-secp256k1 gets fully inlined arithmetic without LTO.
Algorithm selection:
- Modular inverse: Lehmer's extended GCD with two-phase inner loop (u64/u128)
- Division: Knuth's Algorithm D with MG10 reciprocal-based quotient estimation
- Batch inverse: Montgomery's trick (one inverse + 3N multiplications for N elements)
Quick Start
use U256;
// secp256k1 field prime
let p = U256from_be_limbs;
let a = U256from_be_limbs;
let b = U256from_be_limbs;
let sum = a.add_mod; // (42 + 17) mod P
let product = a.mul_mod; // (42 * 17) mod P
let inverse = a.mod_inv; // 42^-1 mod P
Types
| Type | Limbs | Bits | Description |
|---|---|---|---|
U256 |
4 | 256 | Primary type -- modular arithmetic, crypto operations |
U320 |
5 | 320 | Intermediate: U256 * u64 product |
U384 |
6 | 384 | Intermediate: U256 * u128 product |
U512 |
8 | 512 | Widening multiply output, U256 * U256 product |
U256Batch<N> |
-- | -- | Batch modular inverse via Montgomery's trick |
All types use [u64; N] in little-endian order: [w0_lsb, ..., wN_msb].
Internal Algorithm Structs
| Type | Description |
|---|---|
GcdState |
Lehmer's extended GCD state machine |
KnuthD |
Knuth's Algorithm D long division |
LehmerStep |
GCD inner loop (u64/u128 phases) |
API Reference
Modular Arithmetic
U256-only. These are the core cryptographic operations.
| Method | Description | CT feature |
|---|---|---|
add_mod(&b, &m) |
(a + b) mod m |
ct-field |
sub_mod(&b, &m) |
(a - b) mod m |
ct-field |
neg_mod(&m) |
(-a) mod m -- modular negation |
ct-field |
mul_mod(&b, &m) |
(a * b) mod m -- auto-selects sparse/generic |
ct-field |
square_mod(&m) |
a^2 mod m -- auto-selects sparse/generic |
ct-field |
pow_mod(&exp, &m) |
a^exp mod m -- square-and-multiply |
ct-pow |
mod_inv(&m) |
a^-1 mod m -- Lehmer GCD or SafeGCD (ct-inv) |
ct-inv |
mod_inv_fermat(&m) |
a^(m-2) mod m -- Fermat's little theorem |
-- |
div_mod(&b, &m) |
a * b^-1 mod m |
-- |
sqrt_mod(&p) |
Tonelli-Shanks modular square root | -- |
modulo(&m) |
a mod m -- reduction to [0, m) |
ct-field |
Wrapping Arithmetic
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
overflowing_add(&b) |
Add with carry flag | x | x | x | x |
overflowing_sub(&b) |
Subtract with borrow flag | x | x | x | x |
checked_add(&b) |
Add, None on overflow |
x | x | x | x |
checked_sub(&b) |
Subtract, None on underflow |
x | x | x | x |
negate() |
Two's complement negation | x | x | x | x |
Multiplication
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
widening_mul(&b) -> U512 |
Full 512-bit product | x | |||
square() -> U512 |
Full 512-bit square | x | |||
mul_u64(b) -> U320 |
U256 * u64 -> 320-bit | x | |||
mul_u128(b) -> U384 |
U256 * u128 -> 384-bit | x |
Division
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
div_rem(&b) |
Quotient and remainder | x | |||
div_u64(b) |
Divide by u64 | x | x | x | |
div_u256(b) |
Divide by U256 (Knuth D) | x |
Reduction (U512 only)
| Method | Description | CT feature |
|---|---|---|
sparse_reduce(&compl) |
Fold using sparse prime structure | ct-field |
Bit Operations
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
bit(i) |
Test bit at position i |
x | |||
shl(n) / shr(n) |
Shift left/right | x | x | x | x |
leading_zeros() |
Count leading zero bits | x | x | x | x |
trailing_zeros() |
Count trailing zero bits | x | x | x | x |
bit_len() |
Number of significant bits | x | x | x | x |
sig_limbs() |
Number of significant limbs | x | x | x | x |
is_zero() |
Test for zero | x | x | x | x |
is_even() / is_odd() |
Parity test | x | x | x | x |
is_power_of_two() |
Test if exactly one bit is set | x | |||
count_ones() |
Population count (Hamming weight) | x |
Morton / Z-Order
Bit dilation, undilation, interleaving, and deinterleaving for spatial hashing and Z-order curves.
| Method | Description | Level |
|---|---|---|
U256::from_u128_dilated_even(v) |
Dilate u128 → even positions (0,2,4,...,254) | u128→U256 |
U256::from_u128_dilated_odd(v) |
Dilate u128 → odd positions (1,3,5,...,255) | u128→U256 |
U256::interleave(x, y) |
Interleave two u128 → U256 Morton code | u128→U256 |
.undilate_even() |
Extract even positions → u128 | U256→u128 |
.undilate_odd() |
Extract odd positions → u128 | U256→u128 |
.deinterleave() |
Split into (u128, u128) | U256→u128 |
.dilate_even() |
Dilate U256 → even positions of U512 | U256→U512 |
.dilate_odd() |
Dilate U256 → odd positions of U512 | U256→U512 |
U512::interleave(&x, &y) |
Interleave two U256 → U512 Morton code | U256→U512 |
.undilate_even() |
Extract even positions → U256 | U512→U256 |
.undilate_odd() |
Extract odd positions → U256 | U512→U256 |
.deinterleave() |
Split into (U256, U256) | U512→U256 |
Constant-Time Primitives
U256-only.
| Method | Description |
|---|---|
ct_select(&other, flag) |
Branchless conditional select -- returns self when flag is true |
has_u64_complement(&m) |
Test if 2^256 - m fits in u64 (sparse prime detection) |
Conversion
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
from_be_limbs(arr) |
From big-endian limbs | x | x | x | x |
from_le_limbs(arr) |
From little-endian limbs | x | x | x | x |
from_be_bytes(arr) |
From big-endian bytes | x | x | x | x |
from_le_bytes(arr) |
From little-endian bytes | x | x | x | x |
to_be_limbs() |
To big-endian limbs | x | x | x | x |
to_le_limbs() |
To little-endian limbs | x | x | x | x |
from_be_bits(arr) |
From big-endian [bool; 256] (MSB first) |
x | |||
from_le_bits(arr) |
From little-endian [bool; 256] (LSB first) |
x | |||
to_be_bits() |
To big-endian [bool; N] (MSB first) |
x | |||
to_le_bits() |
To little-endian [bool; N] (LSB first) |
x | |||
to_be_bytes() |
To big-endian bytes | x | x | x | x |
to_le_bytes() |
To little-endian bytes | x | x | x | x |
Into<[u8; N]> |
To big-endian bytes (trait) | x | x | x | x |
from(u64) |
From u64 | x | x | x | x |
from(u128) |
From u128 | x | x | x | x |
from(U256) |
Widen from U256 | x | x | x | |
from(U320) |
Widen from U320 | x | x | ||
from(U384) |
Widen from U384 | x | |||
low_u256() |
Lower 256 bits | x | x | x | |
high_u256() |
Upper 256 bits | x | |||
low_u128() / high_u128() |
Lower/upper 128 bits | x | |||
non_zero() |
Option<Self> for early returns |
x | x | x | x |
const_eq(&other) |
Const-compatible equality | x | x | x | x |
Standard Traits
| Trait | U256 | U320 | U384 | U512 |
|---|---|---|---|---|
Add / Sub |
x | x | x | x |
Div / Rem |
x | |||
AddAssign |
x | |||
BitAnd / BitOr / BitXor / Not |
x | x | x | x |
Shl / Shr |
x | x | x | x |
Display |
x | x | x | x |
LowerHex / UpperHex |
x | x | x | x |
FromStr |
x | x | x | x |
Default |
x | x | x | x |
Ord / PartialOrd |
x | x | x | x |
Hash |
x |
U256Batch<N>
Const-generic batch container for N values of [U256]. Wraps a [U256; N] array with batch operations that amortize expensive computations across all elements.
The primary use case is Montgomery batch inversion: computing N modular inverses using a single mod_inv call plus 3(N-1) modular multiplications. At large N, this reduces per-element inversion cost from ~802 ns to ~46 ns (17x speedup). Used by cnfy-secp256k1 for batch Jacobian-to-affine conversion and batch public key generation.
| Method | Description |
|---|---|
U256Batch(arr) |
Construct from a [U256; N] array |
zeroed() |
Create a batch with all elements set to U256::ZERO |
mod_inv(&m) |
Montgomery batch inversion -- returns Option<[U256; N]> (None if any element is zero) |
mod_inv_vt(&m) |
Variable-time batch inversion (always Lehmer GCD, ignores ct-inv) |
.0[i] |
Direct element access (public inner array) |
Constant-Time Feature Flags
cnfy-uint provides granular compile-time feature flags that switch methods to constant-time implementations:
| Feature | What it covers |
|---|---|
ct-field |
add_mod, sub_mod, mul_mod_sparse, square_mod_sparse, sparse_reduce, modulo |
ct-inv |
mod_inv (SafeGCD / Bernstein-Yang divsteps); implies ct-field |
ct-pow |
pow_mod (always 256 iterations); implies ct-field |
ct |
Umbrella enabling all of the above |
Constant-time overhead: 30-49% on field ops, 10.8x on mod_inv, 8% on pow_mod. See SECURITY.md for details.
What This Powers
cnfy-secp256k1 -- scalar multiplication 2.3x faster than the C FFI, public key addition 3.7x faster.
Testing
100% test coverage. Validated against known test vectors.
Ethics
See ETHICS.md for the ethical principles that guide this project.
Links
License
Licensed under the Shariah-Informed Open License (SIOL-1.0).