# anamnesis — Flagship Example (V2)
**The problem in one sentence:** Mistral ships Ministral 3B Instruct as FP8 safetensors. No Rust ML framework can load it.
*Note: Tensor counts and sizes in CLI output examples below are illustrative.
Actual values will be confirmed when the test models are downloaded in Phase 1.*
---
## Before anamnesis
```rust
// candle
let vb = VarBuilder::from_mmaped_safetensors(&[path], DType::BF16, &device)?;
// -> Error: unsupported safetensor dtype F8_E4M3
```
This blocks the entire Rust ML ecosystem: candle, candle-mi, burn, tch. The only workaround is a Python script.
## After anamnesis
### Step 1 — Parse (make contact with the weights)
Everything begins with parsing. You cannot remember what you have not first parsed.
```bash
$ amn parse model-fp8.safetensors
277 tensors parsed
224 quantized FP8 E4M3, per-tensor scale
53 passthrough BF16 (norms, embeddings, lm_head)
File: 4.35 GB
```
### Step 2 — Inspect (what did Lethe take?)
```bash
$ amn inspect model-fp8.safetensors
Format: Per-tensor FP8 (E4M3), single scale factor per tensor
Quantized: 224 tensors (weights)
Passthrough: 53 tensors (norms, embeddings, lm_head)
Size: 4.35 GB (FP8) -> 7.70 GB (BF16)
Lethe took: ~3.35 GB of precision
```
### Step 3 — Remember (dequantize)
```bash
$ amn remember model-fp8.safetensors --to bf16
Parsing... 277 tensors, per-tensor FP8 (E4M3)
Recalling... 224 tensors [====================] 2.1s
Output: model-bf16.safetensors (7.70 GB)
```
### Step 4 — Use anywhere
```rust
// candle — now works
let vb = VarBuilder::from_mmaped_safetensors(&["model-bf16.safetensors"], DType::BF16, &device)?;
// candle-mi — logit lens, circuit tracing, CLT analysis
let model = MIModel::from_pretrained("./model-bf16/")?;
```
---
## Fine-grained FP8 (128x128 block scales)
Not all FP8 is the same. EXAONE-4.0-1.2B-FP8 uses fine-grained quantization
with 128x128 block scale factors — more precision, more metadata:
```bash
$ amn inspect exaone-fp8.safetensors
Format: Fine-grained FP8 (E4M3), 128x128 blocks
Quantized: 180 tensors (weights) + 180 scale tensors (F32)
Passthrough: 31 tensors (norms, embeddings)
Size: 1.22 GB (FP8) -> 2.40 GB (BF16)
Lethe took: ~1.18 GB of precision
$ amn remember exaone-fp8.safetensors --to bf16
Parsing... 391 tensors, fine-grained FP8 (E4M3), 128x128 blocks
Recalling... 180 tensors [====================] 0.4s
Output: exaone-bf16.safetensors (2.40 GB)
```
---
## NPZ parsing (nothing was forgotten)
SAE weights (Gemma Scope) ship as NPZ archives — a foreign container, not a
quantized format. Parsing alone suffices. There is nothing to remember because
nothing was forgotten:
```bash
$ amn parse params.npz
5 tensors parsed
W_dec F32 [16384, 2304]
W_enc F32 [2304, 16384]
b_dec F32 [2304]
b_enc F32 [16384]
threshold F32 [16384]
File: 302 MB
```
No `remember` step needed — the data is already full-precision. anamnesis
extracted it from the NPZ container; any Rust framework can now consume the
raw arrays.
---
## The same in Rust (library API)
The library API reflects the parse-first architecture: parse once, then act.
```rust
use anamnesis::{parse, TargetDtype};
// Parse — the foundation of every operation
let model = parse("model-fp8.safetensors")?;
// Inspect — built on parse, no re-read
let info = model.inspect();
println!("{}", info.format); // "Per-tensor FP8 (E4M3)"
println!("{}", info.quantized); // 224
println!("{}", info.passthrough); // 53
// Remember — built on parse, no re-read
model.remember("model-bf16.safetensors", TargetDtype::BF16)?;
// Output: standard BF16 safetensors, loadable by any Rust ML framework.
```
### NPZ in Rust
```rust
use anamnesis::parse_npz;
// Parse — nothing to remember, the data is intact
let tensors = parse_npz("params.npz")?;
let w_enc = &tensors["W_enc"];
assert_eq!(w_enc.shape, &[2304, 16384]);
// w_enc.data is raw F32 bytes, ready for any framework
```
---
## The opposite direction (future: lethe module)
```rust
use anamnesis::lethe::{QuantScheme};
let model = anamnesis::parse("model-bf16.safetensors")?;
// The weights forget some precision
model.forget("model-fp8.safetensors", QuantScheme::FineGrainedFp8)?;
```
```bash
$ amn forget model-bf16.safetensors --scheme fp8-fine-grained
Parsing... 277 tensors, BF16
Forgetting... 224 tensors [====================] 1.8s
Output: model-fp8.safetensors (4.35 GB)
Lethe distance: 0.0012 mean absolute error
```
---
## CLI verbs and synonyms
Every verb has a technical synonym. Use whichever vocabulary you prefer:
| `amn parse` | — | Decode + validate format | Make contact with the weights |
| `amn inspect` | `amn info` | Show format, sizes, what Lethe took | What was forgotten? |
| `amn remember` | `amn dequantize` | FP8/GPTQ/AWQ -> BF16/F16/F32 | Anamnesis — recollection of the forms |
| `amn forget` | `amn quantize` | BF16/F32 -> FP8/INT8/INT4 | Lethe — drinking from the river |
## Integration with hf-fetch-model
```bash
# Download + remember in one step
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --dequantize bf16
```
hf-fetch-model calls anamnesis under the hood. The user never touches FP8 files.