cnfy-uint
Pure Rust 256-bit modular arithmetic. Zero dependencies. Zero unsafe code.
For cryptographic applications. No heap allocations.
Part of the cnfy suite -- cryptographic primitives built from scratch to eliminate supply chain attacks.
Performance
Full benchmarks and methodology: cnfy-bench. Security analysis: SECURITY.md.
We benchmark against crypto-bigint, ruint, ethnum, num-bigint, and uint -- all excellent libraries that we've learned from.
Why It's Fast
Sparse prime optimization. secp256k1's field prime P = 2^256 - 0x1000003D1 has a tiny complement. Reduction folds excess bits instead of running full division -- every modular operation benefits.
Cross-crate inlining. All 47 hot-path methods are #[inline]. cnfy-secp256k1 gets fully inlined arithmetic without LTO.
Algorithm selection:
- Modular inverse: Lehmer's extended GCD with two-phase inner loop (u64/u128)
- Division: Knuth's Algorithm D with MG10 reciprocal-based quotient estimation
- Batch inverse: Montgomery's trick (one inverse + 3N multiplications for N elements)
Quick Start
use U256;
// secp256k1 field prime
let p = U256from_be_limbs;
let a = U256from_be_limbs;
let b = U256from_be_limbs;
let sum = a.add_mod; // (42 + 17) mod P
let product = a.mul_mod; // (42 * 17) mod P
let inverse = a.mod_inv; // 42^-1 mod P
Types
| Type | Limbs | Bits | Description |
|---|---|---|---|
U256 |
4 | 256 | Primary type -- modular arithmetic, crypto operations |
U320 |
5 | 320 | Intermediate: U256 * u64 product |
U384 |
6 | 384 | Intermediate: U256 * u128 product |
U512 |
8 | 512 | Widening multiply output, U256 * U256 product |
U256Batch<N> |
-- | -- | Batch modular inverse via Montgomery's trick |
All types use [u64; N] in little-endian order: [w0_lsb, ..., wN_msb].
Internal Algorithm Structs
| Type | Description |
|---|---|
GcdState |
Lehmer's extended GCD state machine |
KnuthD |
Knuth's Algorithm D long division |
LehmerStep |
GCD inner loop (u64/u128 phases) |
API Reference
Modular Arithmetic
U256-only. These are the core cryptographic operations.
| Method | Description | CT feature |
|---|---|---|
add_mod(&b, &m) |
(a + b) mod m |
ct-field |
sub_mod(&b, &m) |
(a - b) mod m |
ct-field |
mul_mod(&b, &m) |
(a * b) mod m -- auto-selects sparse/generic |
ct-field |
square_mod(&m) |
a^2 mod m -- auto-selects sparse/generic |
ct-field |
pow_mod(&exp, &m) |
a^exp mod m -- square-and-multiply |
ct-pow |
mod_inv(&m) |
a^-1 mod m -- Lehmer GCD or SafeGCD (ct-inv) |
ct-inv |
mod_inv_fermat(&m) |
a^(m-2) mod m -- Fermat's little theorem |
-- |
div_mod(&b, &m) |
a * b^-1 mod m |
-- |
sqrt_mod(&p) |
Tonelli-Shanks modular square root | -- |
modulo(&m) |
a mod m -- reduction to [0, m) |
ct-field |
Wrapping Arithmetic
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
overflowing_add(&b) |
Add with carry flag | x | x | x | x |
overflowing_sub(&b) |
Subtract with borrow flag | x | x | x | x |
checked_add(&b) |
Add, None on overflow |
x | x | x | x |
checked_sub(&b) |
Subtract, None on underflow |
x | x | x | x |
negate() |
Two's complement negation | x | x | x | x |
Multiplication
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
widening_mul(&b) -> U512 |
Full 512-bit product | x | |||
square() -> U512 |
Full 512-bit square | x | |||
mul_u64(b) -> U320 |
U256 * u64 -> 320-bit | x | |||
mul_u128(b) -> U384 |
U256 * u128 -> 384-bit | x |
Division
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
div_rem(&b) |
Quotient and remainder | x | |||
div_u64(b) |
Divide by u64 | x | x | x | |
div_u256(b) |
Divide by U256 (Knuth D) | x |
Reduction (U512 only)
| Method | Description | CT feature |
|---|---|---|
sparse_reduce(&compl) |
Fold using sparse prime structure | ct-field |
Bit Operations
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
bit(i) |
Test bit at position i |
x | |||
shl(n) / shr(n) |
Shift left/right | x | x | x | x |
leading_zeros() |
Count leading zero bits | x | x | x | x |
trailing_zeros() |
Count trailing zero bits | x | x | x | x |
bit_len() |
Number of significant bits | x | x | x | x |
sig_limbs() |
Number of significant limbs | x | x | x | x |
is_zero() |
Test for zero | x | x | x | x |
is_even() / is_odd() |
Parity test | x | x | x | x |
Morton / Z-Order
Bit dilation, undilation, interleaving, and deinterleaving for spatial hashing and Z-order curves.
| Method | Description | Level |
|---|---|---|
U256::from_u128_dilated_even(v) |
Dilate u128 → even positions (0,2,4,...,254) | u128→U256 |
U256::from_u128_dilated_odd(v) |
Dilate u128 → odd positions (1,3,5,...,255) | u128→U256 |
U256::interleave(x, y) |
Interleave two u128 → U256 Morton code | u128→U256 |
.undilate_even() |
Extract even positions → u128 | U256→u128 |
.undilate_odd() |
Extract odd positions → u128 | U256→u128 |
.deinterleave() |
Split into (u128, u128) | U256→u128 |
.dilate_even() |
Dilate U256 → even positions of U512 | U256→U512 |
.dilate_odd() |
Dilate U256 → odd positions of U512 | U256→U512 |
U512::interleave(&x, &y) |
Interleave two U256 → U512 Morton code | U256→U512 |
.undilate_even() |
Extract even positions → U256 | U512→U256 |
.undilate_odd() |
Extract odd positions → U256 | U512→U256 |
.deinterleave() |
Split into (U256, U256) | U512→U256 |
Constant-Time Primitives
U256-only.
| Method | Description |
|---|---|
ct_select(&other, flag) |
Branchless conditional select -- returns self when flag is true |
has_u64_complement(&m) |
Test if 2^256 - m fits in u64 (sparse prime detection) |
Conversion
| Method | Description | U256 | U320 | U384 | U512 |
|---|---|---|---|---|---|
from_be_limbs(arr) |
From big-endian limbs | x | x | x | x |
from_le_limbs(arr) |
From little-endian limbs | x | x | x | x |
from_be_bytes(arr) |
From big-endian bytes | x | x | x | x |
from_le_bytes(arr) |
From little-endian bytes | x | x | x | x |
to_be_limbs() |
To big-endian limbs | x | x | x | x |
to_le_limbs() |
To little-endian limbs | x | x | x | x |
to_be_bytes() |
To big-endian bytes | x | x | x | x |
to_le_bytes() |
To little-endian bytes | x | x | x | x |
Into<[u8; N]> |
To big-endian bytes (trait) | x | x | x | x |
from(u64) |
From u64 | x | x | x | x |
from(u128) |
From u128 | x | x | x | x |
from(U256) |
Widen from U256 | x | x | x | |
from(U320) |
Widen from U320 | x | x | ||
from(U384) |
Widen from U384 | x | |||
low_u256() |
Lower 256 bits | x | x | x | |
high_u256() |
Upper 256 bits | x | |||
low_u128() / high_u128() |
Lower/upper 128 bits | x | |||
non_zero() |
Option<Self> for early returns |
x | x | x | x |
const_eq(&other) |
Const-compatible equality | x | x | x | x |
Standard Traits
| Trait | U256 | U320 | U384 | U512 |
|---|---|---|---|---|
Add / Sub |
x | x | x | x |
Div / Rem |
x | |||
AddAssign |
x | |||
BitAnd / BitOr / BitXor / Not |
x | x | x | x |
Shl / Shr |
x | x | x | x |
Display |
x | x | x | x |
LowerHex / UpperHex |
x | x | x | x |
FromStr |
x | x | x | x |
Default |
x | x | x | x |
Ord / PartialOrd |
x | x | x | x |
Hash |
x |
U256Batch<N>
Const-generic batch container for N values of [U256]. Wraps a [U256; N] array with batch operations that amortize expensive computations across all elements.
The primary use case is Montgomery batch inversion: computing N modular inverses using a single mod_inv call plus 3(N-1) modular multiplications. At large N, this reduces per-element inversion cost from ~802 ns to ~46 ns (17x speedup). Used by cnfy-secp256k1 for batch Jacobian-to-affine conversion and batch public key generation.
| Method | Description |
|---|---|
U256Batch(arr) |
Construct from a [U256; N] array |
zeroed() |
Create a batch with all elements set to U256::ZERO |
mod_inv(&m) |
Montgomery batch inversion -- returns Option<[U256; N]> (None if any element is zero) |
.0[i] |
Direct element access (public inner array) |
Constant-Time Feature Flags
cnfy-uint provides granular compile-time feature flags that switch methods to constant-time implementations:
| Feature | What it covers |
|---|---|
ct-field |
add_mod, sub_mod, mul_mod_sparse, square_mod_sparse, sparse_reduce, modulo |
ct-inv |
mod_inv (SafeGCD / Bernstein-Yang divsteps); implies ct-field |
ct-pow |
pow_mod (always 256 iterations); implies ct-field |
ct |
Umbrella enabling all of the above |
Constant-time overhead: 30-49% on field ops, 10.8x on mod_inv, 8% on pow_mod. See SECURITY.md for details.
What This Powers
cnfy-secp256k1 -- scalar multiplication 2.3x faster than the C FFI, public key addition 3.7x faster.
Testing
100% test coverage. Validated against known test vectors.
Built With
This library was built with the accompaniment of Claude Code (Claude Opus 4.6, Max plan).
Links
License
Licensed under either of
at your option.