//! Model architecture save/load (gated on `rocm-hip`).
//!
//! Phase 2.10 of the 0.7B MoE training project. The
//! `crate::checkpoint` module saves *parameter values*; this
//! module saves the *architecture* (depth, width, expert count,
//! etc.) as `arch.json`, together with a SHA-256 fingerprint of
//! the topology for tamper detection.
//!
//! - `arch` — the architecture descriptor (depth, width, etc.).
//! - `save` / `load` — JSON + SHA-256 round-trip.
//! - `dense` — the dense-MLP architecture specialization.
//! - `fingerprint` — the topology fingerprint.
//!
// Phase 2.10 model architecture save/load.
//
// `crate::checkpoint` saves *parameter values*; this module saves the
// *architecture* (size variant, dimensions, expert count, seed,
// model family, router kernel) so that a model can be reconstructed
// from scratch. The two files are complementary: the checkpoint
// records the trained weights, and the arch file records the shape
// the network must have to hold them.
//
// Five submodules:
// - `arch` — `ModelArch` struct, `ModelKind` / `RouterKind`
//     enums, arch<->model conversion.
// - `dense` — `DenseModel` (the 0.7B Dense MLP) and
//     `QualityModel` (the enum that wraps either
//     an `MoEModel` or a `DenseModel`).
// - `save` — write a `ModelArch` to a JSON file.
// - `load` — read a `ModelArch` from a JSON file.
// - `fingerprint` — stable short hash for cache keys / model registry.
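//
// A minimal sketch of what a stable fingerprint over the arch fields
// could look like. This is an illustration under stated assumptions,
// not the code in `fingerprint`: `ArchSketch`, its fields, `fnv1a64`
// and `sketch_fingerprint` are hypothetical names, and FNV-1a is
// swapped in for SHA-256 only to keep the example dependency-free.
//
// ```rust
// /// Hypothetical stand-in for the real architecture descriptor.
// struct ArchSketch {
//     depth: u32,
//     width: u32,
//     num_experts: u32,
// }
//
// /// FNV-1a (64-bit). Unlike `DefaultHasher`, whose algorithm is
// /// unspecified and may change between Rust releases, this is fixed.
// fn fnv1a64(bytes: &[u8]) -> u64 {
//     let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
//     for &b in bytes {
//         hash ^= u64::from(b);
//         hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
//     }
//     hash
// }
//
// /// Hash a fixed-order textual encoding of the fields, so the result
// /// depends only on the field values.
// fn sketch_fingerprint(arch: &ArchSketch) -> String {
//     let canonical = format!(
//         "depth={};width={};num_experts={}",
//         arch.depth, arch.width, arch.num_experts
//     );
//     format!("{:016x}", fnv1a64(canonical.as_bytes()))
// }
//
// fn main() {
//     let a = ArchSketch { depth: 12, width: 1024, num_experts: 8 };
//     let b = ArchSketch { depth: 12, width: 1024, num_experts: 16 };
//     println!("{}", sketch_fingerprint(&a));
//     println!("{}", sketch_fingerprint(&b));
// }
// ```
//
// Encoding the fields in a fixed order before hashing is what makes
// the value usable as a cache key: two descriptors with equal fields
// always map to the same string.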
//
// The on-disk schema is intentionally trivial: a single JSON object
// with the arch fields, written via `serde_json::to_writer_pretty`.
// There are no magic bytes and no version field yet; if the schema
// needs to change, the file extension and the fingerprint will serve
// for version negotiation.
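pub mod arch;
pub mod dense;
pub mod fingerprint;
pub mod load;
pub mod save;

pub use arch::{ModelArch, ModelKind, RouterKind};
pub use dense::{DenseModel, QualityModel};
pub use fingerprint::arch_fingerprint;
pub use load::load_arch;
pub use save::save_arch;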