# hf-fetch-model
A Rust library and CLI for downloading and inspecting HuggingFace models. Multi-connection parallel downloads, file filtering, checksum verification, and retry — plus safetensors header inspection and tensor layout comparison between models, all without downloading any weight data.
## Table of contents
- Install
- Commands
- Try it
- Inspect & compare
- Disk usage
- Library quick start
- Documentation
- Used by
- License
- Development
## Install
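No explicit install steps are listed here; assuming the crate is published to crates.io under the package name `hf-fetch-model` (providing the `hf-fm` binary), a typical install would be:

```sh
# From crates.io (assumed package name)
cargo install hf-fetch-model

# Or from a source checkout
cargo install --path .
```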
## Commands
| Command | Description |
|---|---|
| `hf-fm <REPO_ID>` | Download a model (multi-connection, auto-tuned) |
| `hf-fm diff <REPO_A> <REPO_B>` | Compare tensor layouts between two models |
| `hf-fm discover` | Find new model families on the Hub |
| `hf-fm download-file <REPO_ID> <FILE>` | Download a single file (or glob pattern) |
| `hf-fm du [REPO_ID]` | Show cache disk usage |
| `hf-fm inspect <REPO_ID> [FILE]` | Inspect safetensors headers (tensor names, shapes, dtypes) |
| `hf-fm list-families` | List model families in the local cache |
| `hf-fm list-files <REPO_ID>` | List remote files (sizes, SHA256) without downloading |
| `hf-fm search <QUERY>` | Search the HuggingFace Hub for models |
| `hf-fm status [REPO_ID]` | Show download status (complete / partial / missing) |
See CLI Reference for all flags and output examples.
## Try it

```
$ hf-fm search mistral,3B,instruct
Models matching "mistral,3B,instruct" (by downloads):
  hf-fm mistralai/Ministral-3-3B-Instruct-2512        (159,700 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-BF16   (62,600 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-GGUF   (32,700 downloads)
  ...
```
```
$ hf-fm search mistralai/Ministral-3-3B-Instruct-2512 --exact
Exact match:
  hf-fm mistralai/Ministral-3-3B-Instruct-2512        (159,700 downloads)
  License:   apache-2.0
  Pipeline:  text-generation
  Library:   vllm
  Languages: en, fr, es, de, it, pt, nl, zh, ja, ko, ar
```
```
$ hf-fm list-files mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
File                              Size      SHA256
model-00001-of-00002.safetensors  3.68 GiB  a1b2c3d4e5f6
model-00002-of-00002.safetensors  2.88 GiB  f6e5d4c3b2a1
config.json                       856 B     —
...
7 files, 6.57 GiB total
```
```
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --dry-run
Repo:     mistralai/Ministral-3-3B-Instruct-2512
Revision: main

File                              Size      Status
model-00001-of-00002.safetensors  3.68 GiB  to download
model-00002-of-00002.safetensors  2.88 GiB  to download
...
Total: 6.57 GiB (7 files, 0 cached, 7 to download)

Recommended config:
  concurrency:      2
  connections/file: 8
  chunk threshold:  100 MiB
```
```
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
Downloaded to: ~/.cache/huggingface/hub/models--mistralai--Ministral-3-3B.../snapshots/...
6.57 GiB in 18.2s (369.1 MiB/s)
```
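Under the hood, a multi-connection downloader like this splits each file into contiguous byte ranges and fetches them concurrently via HTTP Range requests. A std-only sketch of the range-splitting arithmetic — the function name and exact distribution are illustrative, not hf-fetch-model's actual API:

```rust
/// Split a file of `size` bytes into up to `n` contiguous byte ranges
/// (start inclusive, end exclusive), one per connection. Early ranges
/// absorb the remainder so all ranges differ in length by at most 1.
fn split_ranges(size: u64, n: u64) -> Vec<(u64, u64)> {
    let n = n.max(1).min(size.max(1)); // never more ranges than bytes
    let base = size / n;
    let rem = size % n;
    let mut ranges = Vec::new();
    let mut start = 0;
    for i in 0..n {
        let len = base + if i < rem { 1 } else { 0 };
        if len == 0 {
            break; // empty file: no ranges at all
        }
        ranges.push((start, start + len));
        start += len;
    }
    ranges
}

fn main() {
    // A 3.68 GiB shard fetched over 8 connections:
    let size = 3_951_369_912u64;
    let ranges = split_ranges(size, 8);
    assert_eq!(ranges.len(), 8);
    assert_eq!(ranges.first().unwrap().0, 0);
    assert_eq!(ranges.last().unwrap().1, size); // ranges tile the whole file
    println!("{} ranges, ~{} MiB each", ranges.len(), (ranges[0].1 - ranges[0].0) >> 20);
}
```

Each `(start, end)` pair maps directly onto a `Range: bytes=start-{end-1}` request header, and the chunk threshold seen in `--dry-run` decides when a file is worth splitting at all.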
```
# Download to flat layout (files directly in ./models/)
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --flat --output-dir ./models

# Download sharded PyTorch files by glob
$ hf-fm download-file org/model "pytorch_model-*.bin"
```
## Inspect & compare

```
$ hf-fm inspect EleutherAI/pythia-1.4b model.safetensors --cached --filter "layers.0."
Repo:   EleutherAI/pythia-1.4b
File:   model.safetensors
Source: cached

Tensor                                      Dtype  Shape         Size       Params
gpt_neox.layers.0.attention.dense.weight    F16    [2048, 2048]   8.00 MiB   4.2M
gpt_neox.layers.0.mlp.dense_h_to_4h.weight  F16    [8192, 2048]  32.00 MiB  16.8M
...
────────────────────────────────────────────────────────────────────────────────────────────────
15/364 tensors, 54.6M/1.52B params (filter: "layers.0.")
```
```
$ hf-fm diff RedHatAI/Llama-3.2-1B-Instruct-FP8 casperhansen/llama-3.2-1b-instruct-awq --cached --summary
A: RedHatAI/Llama-3.2-1B-Instruct-FP8
B: casperhansen/llama-3.2-1b-instruct-awq
──────────────────────────────────────────────────────────────────────────────────────────────
A: 371 tensors | B: 370 tensors | only-A: 337 | only-B: 336 | differ: 34 | match: 0
```
`inspect` reads tensor metadata via HTTP Range requests (two requests per file), so no weight data is downloaded. `diff` compares tensor names, dtypes, and shapes between any two models (remote or cached).
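Two range requests suffice because of how safetensors files are laid out on disk: an 8-byte little-endian header length, then that many bytes of JSON metadata (names, dtypes, shapes, offsets), then the raw tensor data. A std-only sketch of the header parse — a local byte buffer stands in for the remote file, and the function names are illustrative:

```rust
/// Extract the JSON metadata header from a safetensors byte layout:
/// bytes 0..8 hold the header length (little-endian u64), and
/// bytes 8..8+len hold the UTF-8 JSON header. The tensor data that
/// follows is never touched — which is why a remote inspect only
/// needs two small HTTP Range requests per file.
fn header_json(file: &[u8]) -> Option<&str> {
    let len_bytes: [u8; 8] = file.get(..8)?.try_into().ok()?;
    let len = u64::from_le_bytes(len_bytes) as usize;
    std::str::from_utf8(file.get(8..8 + len)?).ok()
}

fn main() {
    // Build a tiny fake safetensors file: length prefix + JSON + "weights".
    let json = br#"{"w":{"dtype":"F16","shape":[2048,2048],"data_offsets":[0,8388608]}}"#;
    let mut file = (json.len() as u64).to_le_bytes().to_vec();
    file.extend_from_slice(json);
    file.extend_from_slice(&[0u8; 16]); // stand-in for weight bytes we never read

    let hdr = header_json(&file).unwrap();
    assert!(hdr.contains("\"shape\":[2048,2048]"));
    println!("header: {hdr}");
}
```

In the remote case, the first Range request fetches bytes 0-7 to learn the header length, and the second fetches exactly the JSON span.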
## Disk usage

```
$ hf-fm du
5.10 GiB  google/gemma-2-2b-it          (8 files)
2.80 GiB  EleutherAI/pythia-1.4b        (12 files)
1.20 GiB  google/gemma-scope-2b-pt-res  (3 files)
──────────────────────────────────────────────────
9.10 GiB total (3 repos, 23 files)
```
## Library quick start

```rust
let outcome = download.await?;
println!("{outcome:?}");
```
Filter, progress, auth, and more via the builder — see Configuration.
## Documentation
| Topic | Description |
|---|---|
| CLI Reference | All subcommands, flags, and output examples |
| Search | Comma filtering, --exact, model card metadata |
| Configuration | Builder API, presets, progress callbacks |
| Architecture | How hf-fetch-model relates to hf-hub and candle-mi |
| Diagnostics | --verbose output, tracing setup for library users |
| Changelog | Release history and migration notes |
## Used by
- candle-mi — Mechanistic interpretability toolkit for transformer models
## License
Licensed under either of Apache License, Version 2.0 or MIT License at your option.
## Development
- Exclusively developed with Claude Code (dev) and Augment Code (review)
- Git workflow managed with Fork
- All code follows CONVENTIONS.md, derived from Amphigraphic-Strict's Grit — a strict Rust subset designed to improve AI coding accuracy.