1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
//! Streaming / out-of-core corpus driver for the SAE term (#973).
//!
//! This module is the **driver** that lets a sparse-autoencoder term fit on a
//! corpus of activations far larger than RAM, without ever materializing the
//! activations as a dense `f64` matrix. It sits *behind* the SAE term: the
//! term consumes the seam this module exposes, this module owns the streaming,
//! warm-state, scheduling and mixed-precision machinery.
//!
//! # The four pieces
//!
//! * [`shard_reader`] — an mmap-backed, bounded-prefetch reader over one or
//! many on-disk activation shards. It defines the `v1` shard format
//! (fixed 32-byte header + row-major `f32` payload) and yields `f64`-upcast
//! row batches in a **deterministic global order** with stable `row_id`s,
//! independent of OS readahead.
//! * [`warm_state`] — a disk-backed (mmap/LRU) per-row warm-state cache keyed
//! by `(row_id, TermCollectionSpec structural hash)`. It persists each row's
//! inner-solve seed (latent coords + active set) so re-solving the same row
//! across outer ρ passes (or across a SIGKILL-resume) costs ~3 inner
//! iterations instead of ~30. The structural hash is computed the same way
//! the existing warm-start cache does (#869,
//! `TermCollectionSpec::write_structural_shape_hash`), so distinct topologies
//! never cross-seed.
//! * [`rho_cascade`] — a subsample-converge-then-full-pass ρ schedule that
//! carries **importance weights**. Every outer ρ step is a full corpus pass
//! in expectation; early steps run on a deterministic hashed-`row_id`
//! subsample with each included row reweighted by `1/inclusion_probability`
//! (the subsample-honesty contract), and the trailing steps are honest full
//! passes.
//! * [`kernels`] — fused mixed-precision kernels (`dot`, `gram`, `gemv`,
//! `gemv_t`, `cross`) that **read `f32` rows and accumulate in `f64`**, the
//! numerical contract that keeps the streaming sums deterministic and
//! precise despite `f32` on-disk storage.
//! * [`object_store`] (#987) — the same `v1` shards streamed out of **object
//! storage** through a two-method [`object_store::ObjectStore`] trait (no
//! cloud SDK in-tree), with a bounded prefetch window and the identical
//! deterministic `(row_id, row)` sequence as the mmap reader. Also carries
//! the frontier predicate
//! [`object_store::designed_sampling_mandatory`]: above 10⁸ rows the fit
//! must see a designed, honesty-weighted subsample
//! ([`gam_solve::row_sampling_measure::RowSamplingMeasure::designed_subsample`]), not
//! a full exact pass.
//! * [`designed_target`] (#991) — the consumer bridge: design a row sample
//! from a [`gam_solve::row_sampling_measure::RowSamplingMeasure`] (uniform cold start,
//! or a previous harvest's lifted Fisher measure), stream-collect exactly
//! those rows into the `budget × p` fit target, and hand the `1/π` honesty
//! weights to `SaeManifoldTerm::set_row_loss_weights`. Full budget ⇒ the
//! exact pass, bit-for-bit (weights collapse to the unweighted path).
//!
//! # The seam
//!
//! The SAE term (owned by another track, `sae_manifold.rs`) consumes exactly
//! two traits from here — re-exported below as the public seam:
//!
//! * [`CorpusRowSource`] — "give me the next deterministic batch of rows /
//! rewind for the next ρ pass", and
//! * [`RowWarmCache`] — "give me / take back this row's inner-solve warm
//! start".
//!
//! Together with [`rho_cascade::RhoCascadeSchedule`] (which step's subsample +
//! importance weights to apply) and the [`kernels`] (how to accumulate a
//! batch's contribution), these let the term run a full streaming, warm-started,
//! mixed-precision REML fit over an out-of-core corpus while keeping the
//! determinism and crash-resume guarantees the rest of #973 established.
//!
//! Nothing in this module references `sae_manifold.rs`; the term wires these
//! pieces in on its side of the seam.
// ---------------------------------------------------------------------------
// The driver seam consumed by the SAE term.
// ---------------------------------------------------------------------------
/// Deterministic, restartable source of activation row batches (seam half 1).
pub use ;
/// Per-row inner-solve warm-state cache (seam half 2).
pub use ;
/// Subsample → full-pass ρ schedule with importance weights.
pub use ;
/// Designed corpus target collection (#991): stream → designed sample +
/// honesty weights, the row set the term actually fits.
pub use ;
/// Mixed-precision fused kernels (read `f32`, accumulate `f64`).
pub use ;