1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
// SPDX-FileCopyrightText: 2026 John Moxley
// SPDX-License-Identifier: MIT OR Apache-2.0
//! Integer cubing policy — the sqr-then-multiply algorithm matcher.
//!
//! `Uint<N>::cube` / `Uint<N>::wrapping_cube` and the `Int<N>` siblings
//! delegate to [`dispatch`], which follows the canonical policy shape (see
//! `docs/ARCHITECTURE.md` → "Policy file structure"):
//!
//! 1. an [`Algorithm`] enum — the real cubing algorithm(s), no `Default`
//! variant;
//! 2. a [`Select`] verdict — a settled algorithm or "the value decides";
//! 3. a `const fn` [`select`] keyed on `N`, total over the key;
//! 4. dispatch via an inline `const { select::<N>() }` block, then an
//! **exhaustive** `match algo` — no `_`, no panic.
//!
//! Because `select` is `const` and keyed only on the const generic `N`,
//! the `const { … }` block folds per monomorphisation and the unchosen arm
//! is dead-arm-eliminated in release: each concrete `Uint<N>` compiles to a
//! direct call to the sqr-then-multiply sequence, no runtime branch.
//!
//! # Algorithm
//!
//! The optimal form of `x³` is `x²·x` — two limb operations rather than
//! three sequential multiplies; no cheaper form exists below two
//! multiplications. The algorithm fn
//! [`crate::int::algos::cube::cube_schoolbook::cube_schoolbook`] computes
//! the square with the const half-product kernel
//! [`crate::int::algos::sqr::sqr_low_fixed::sqr_low_fixed`] and the final multiply with
//! the const truncated kernel [`crate::int::algos::mul::mul_schoolbook::mul_low_fixed`],
//! so the total limb-multiply count is `N(N+1)/2 + N²`. The layering points
//! DOWN — the algorithm calls the kernels, never a cube/sqr/mul method on
//! `Uint<N>`.
//!
//! **Fused product-scanning candidate (`Comba`).**
//! [`crate::int::algos::cube::cube_fused_comba::cube_fused_comba`] forms `x³`
//! in a SINGLE product-scanning pass (the cube analogue of the symmetric comba
//! square) rather than materialising `x²` and re-multiplying. It is also
//! `const fn`, so it can be wired without de-const-ing the public API. The
//! `int_cube_eq_ab` N-way A/B shows it beats `x²·x` only at the narrow
//! `N == 2` tier (~1.11x, reproducible); at `N == 1` it ties, and from
//! `N >= 3` its `~N³/6` triple-product count loses to the schoolbook
//! `N(N+1)/2 + N²` and the gap GROWS with width (1.2x at N=3 → 13x at N=64).
//! `select` therefore routes `N == 2` to `Comba` and every other width to
//! `Schoolbook`.
//!
//! The `LimbSize` u128 packing used by [`crate::int::policy::sqr_low`] /
//! [`crate::int::policy::mul_low`] is NOT a cube candidate: those kernels are
//! **not `const fn`** (the `Limb` trait methods are not const), and `cube`'s
//! dispatch MUST stay `const fn` because `Int<N>::wrapping_cube` is `const fn`
//! and is reached from `const` contexts crate-wide.
//!
//! The `ByValue` arm of [`Select`] is present for canonical-shape
//! uniformity; `select` never returns it.
//!
//! # Const-ness
//!
//! `dispatch` IS `const fn`: the algorithm fn computes via const kernels,
//! so the type's `const fn` `wrapping_cube` can delegate through it. The
//! `ByValue` arm returns the default algorithm tag without invoking the fn
//! pointer (calling a fn pointer is not permitted in `const fn`; merely
//! matching the variant is fine).
use cratecube_fused_comba;
use cratecube_schoolbook;
use crateUint;
// ── 1. the real cubing algorithm — NAMED, no `Default` ───────────────
/// The cubing algorithms this policy chooses between. The single variant
/// is the CamelCase of the algorithm fn's name minus the `cube_` function
/// prefix (`cube_schoolbook` → `Schoolbook`) — strict 1:1 with the fn.
// ── 2. the verdict ────────────────────────────────────────────────────
/// A settled algorithm, or "the value decides". The cube picker always
/// returns `ByAlgorithm`: the choice is fully determined by `N` (which is
/// constant, and the same algorithm wins at every `N`). `ByValue` is part
/// of the canonical shape for uniformity across functions; `select` never
/// returns it.
// ── 3. the matcher: const, keyed on `N`, total over the key ──────────
/// Pick the cubing algorithm for storage limb count `N`. Total over the key.
///
/// - `N == 2` → [`Algorithm::Comba`] (the fused product-scanning pass wins
/// ~1.11x at this narrow tier per the `int_cube_eq_ab` A/B).
/// - every other `N` (the `_` arm) → [`Algorithm::Schoolbook`] (`x²·x`; comba
/// ties at N=1 and loses by a widening margin from N>=3).
const
// ── 4. the dispatcher: fold the verdict, then dispatch ────────────────
/// Integer cubing dispatcher for `Uint<N>`.
///
/// Resolves the compile-time algorithm verdict via
/// `const { select::<N>() }` (folds per monomorphisation; dead arms are
/// eliminated in release) then dispatches exhaustively over [`Algorithm`].
///
/// Must be `const fn`: `Int<N>::wrapping_cube` is itself `const fn`. The
/// `ByValue` arm returns the default algorithm tag without invoking the fn
/// pointer, satisfying the `const fn` constraint.
pub const