1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
// SPDX-FileCopyrightText: 2026 John Moxley
// SPDX-License-Identifier: MIT OR Apache-2.0
//! `sqrt_newton_with_table_seed` — bespoke square-root kernel for the
//! `(D57, SCALE == 20)` cell.
//!
//! At `SCALE = 20` the radicand `r · 10^20` is bounded by
//! `D57<20>::MAX · 10^20 ≤ 10^57 · 10^20 = 10^77`, which fits
//! comfortably in `Int<4>` (~`10^77`) instead of the generic Newton
//! kernel's `Int<6>` work integer. Running `isqrt` on a 256-bit value
//! rather than a 384-bit value cuts the wide-int big-number arithmetic
//! cost roughly in half — the per-iteration `isqrt` step is dominated by
//! limb-array `mul_sub` whose cost scales `O(L²)` where `L` is the limb
//! count (`L = 4` for `Int<4>`, `L = 6` for `Int<6>`).
//!
//! # `f64`-bridge Newton seed
//!
//! The generic `isqrt` seeds Newton at `2^⌈bits/2⌉` — an upper bound
//! correct to 1 bit. Each Newton step doubles the correct-bit count, so
//! reaching the 128-bit precision needed for a 256-bit radicand takes
//! ~7 iterations, each performing a 256-bit `div_rem`.
//!
//! When `std` is available we instead seed with `f64::sqrt(n.as_f64())`.
//! `as_f64` rounds the radicand to an `f64` (~53 bits of mantissa
//! accuracy), `f64::sqrt` is correctly rounded so the seed lands within
//! ~2⁻⁵² of the true `√n` in relative terms. From a 53-bit seed Newton
//! needs only 2 iterations to reach 128-bit precision, dropping the
//! dominant `div_rem` cost ~3×.
//!
//! The composition `f64::sqrt(n.as_f64())` can under- OR over-shoot the
//! true `√n` depending on how `n.as_f64()` rounded. A single
//! unconditional Newton step from any positive seed lands at `≥ √n` by
//! the AM-GM inequality (`(x + n/x) / 2 ≥ √(x · n/x) = √n`), so we always
//! perform one pre-iter before entering the standard `y ≥ x` monotone-
//! decrease loop. Cost is one extra `div_rem` versus the 4-5 saved by
//! the tighter seed.
//!
//! Result is bit-for-bit identical to [`crate::algos::sqrt::sqrt_newton`]
//! under all six [`RoundingMode`] values; only the integer width and the
//! seed source change.
//!
//! NOT feature-gated: referenced by the feature-independent `sqrt` policy (as
//! a kept reference seam), so it compiles in every build.
use crateBigInt;
use crateInt;
use crateRoundingMode;
/// `D57<20>` square-root kernel. The floor square root is taken via the
/// integer wide-kernel surface ([`Int::isqrt`] → the int `isqrt` policy);
/// the `f64::sqrt`-vs-classical seed std/no_std choice is encapsulated in
/// the seed leaf the kernel calls, so this body is cfg-free. The
/// result `⌊√(raw·10^20)⌋` is bit-identical either way; only the
/// iteration count differs.
pub