hf-fetch-model 0.9.2

Fast HuggingFace model downloads for Rust: an embeddable library for pulling models from the Hub with maximum throughput

A Rust library and CLI for downloading and inspecting HuggingFace models. Multi-connection parallel downloads, file filtering, checksum verification, and automatic retries, plus safetensors header inspection and tensor layout comparison between models, all without downloading weight data.

Install

cargo install hf-fetch-model --features cli

Commands

Command                                    Description
hf-fm <REPO_ID> Download a model (multi-connection, auto-tuned)
hf-fm diff <REPO_A> <REPO_B> Compare tensor layouts between two models
hf-fm discover Find new model families on the Hub
hf-fm download-file <REPO_ID> <FILE> Download a single file (or glob pattern)
hf-fm du [REPO_ID] Show cache disk usage
hf-fm inspect <REPO_ID> [FILE] Inspect safetensors headers (tensor names, shapes, dtypes)
hf-fm list-families List model families in local cache
hf-fm list-files <REPO_ID> List remote files (sizes, SHA256) without downloading
hf-fm search <QUERY> Search the HuggingFace Hub for models
hf-fm status [REPO_ID] Show download status (complete / partial / missing)

See CLI Reference for all flags and output examples.

Try it

$ hf-fm search mistral,3B,instruct
Models matching "mistral,3B,instruct" (by downloads):

  hf-fm mistralai/Ministral-3-3B-Instruct-2512           (159,700 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-BF16      (62,600 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-GGUF      (32,700 downloads)
  ...

$ hf-fm search mistralai/Ministral-3-3B-Instruct-2512 --exact
Exact match:

  hf-fm mistralai/Ministral-3-3B-Instruct-2512           (159,700 downloads)

  License:      apache-2.0
  Pipeline:     text-generation
  Library:      vllm
  Languages:    en, fr, es, de, it, pt, nl, zh, ja, ko, ar

$ hf-fm list-files mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
  File                                               Size      SHA256
  model-00001-of-00002.safetensors                 3.68 GiB    a1b2c3d4e5f6
  model-00002-of-00002.safetensors                 2.88 GiB    f6e5d4c3b2a1
  config.json                                        856 B     —
  ...
  7 files, 6.57 GiB total

$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --dry-run
  Repo:     mistralai/Ministral-3-3B-Instruct-2512
  Revision: main

  File                                               Size      Status
  model-00001-of-00002.safetensors                 3.68 GiB    to download
  model-00002-of-00002.safetensors                 2.88 GiB    to download
  ...
  Total: 6.57 GiB (7 files, 0 cached, 7 to download)

  Recommended config:
    concurrency:        2
    connections/file:   8
    chunk threshold:  100 MiB
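
The "connections/file" figure above means each large file is fetched over several parallel HTTP Range requests. As an illustrative sketch (an assumption about the general technique, not the crate's internals), the byte ranges for one file can be computed like this:

```rust
// Sketch (not hf-fetch-model's actual code): split a file of `size` bytes
// into `parts` contiguous byte ranges for parallel HTTP Range requests.
// Each pair maps to a "bytes=start-end" header (inclusive on both ends).
fn byte_ranges(size: u64, parts: u64) -> Vec<(u64, u64)> {
    let chunk = size.div_ceil(parts);
    (0..parts)
        .map(|i| (i * chunk, ((i + 1) * chunk).min(size) - 1))
        .filter(|(start, end)| start <= end) // drop empty trailing ranges
        .collect()
}

fn main() {
    // A 1000-byte file over 4 connections -> four 250-byte ranges.
    let ranges = byte_ranges(1000, 4);
    assert_eq!(ranges, vec![(0, 249), (250, 499), (500, 749), (750, 999)]);
    println!("{ranges:?}");
}
```

The chunk-threshold setting would then decide whether a file is worth splitting at all: files below it are fetched in a single request.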

$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
Downloaded to: ~/.cache/huggingface/hub/models--mistralai--Ministral-3-3B.../snapshots/...
  6.57 GiB in 18.2s (369.1 MiB/s)

# Download to flat layout (files directly in ./models/)
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --flat --output-dir ./models

# Download sharded PyTorch files by glob
$ hf-fm download-file org/model "pytorch_model-*.bin"
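
A glob like the one above can be reduced to a small wildcard matcher. This standalone sketch (assumed behavior, not the crate's matcher) supports `*` only and assumes ASCII file names:

```rust
// Minimal wildcard matcher (illustrative, not hf-fetch-model's code):
// `*` matches any run of characters; everything else matches literally.
// Byte-indexed slicing, so ASCII file names are assumed.
fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == name,
        Some((prefix, rest)) => name.strip_prefix(prefix).is_some_and(|tail| {
            // Let `*` absorb 0..=tail.len() characters, then recurse.
            (0..=tail.len()).any(|i| glob_match(rest, &tail[i..]))
        }),
    }
}

fn main() {
    assert!(glob_match("pytorch_model-*.bin", "pytorch_model-00001-of-00002.bin"));
    assert!(!glob_match("pytorch_model-*.bin", "model.safetensors"));
    println!("shard pattern matched");
}
```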

Inspect & compare

$ hf-fm inspect EleutherAI/pythia-1.4b model.safetensors --cached --filter "layers.0."
  Repo:     EleutherAI/pythia-1.4b
  File:     model.safetensors
  Source:   cached

  Tensor                                             Dtype    Shape                  Size     Params
  gpt_neox.layers.0.attention.dense.weight           F16      [2048, 2048]       8.00 MiB       4.2M
  gpt_neox.layers.0.mlp.dense_h_to_4h.weight         F16      [8192, 2048]      32.00 MiB      16.8M
  ...
  ────────────────────────────────────────────────────────────────────────────────────────────────
  15/364 tensors, 54.6M/1.52B params (filter: "layers.0.")

$ hf-fm diff RedHatAI/Llama-3.2-1B-Instruct-FP8 casperhansen/llama-3.2-1b-instruct-awq --cached --summary
  A: RedHatAI/Llama-3.2-1B-Instruct-FP8
  B: casperhansen/llama-3.2-1b-instruct-awq
  ──────────────────────────────────────────────────────────────────────────────────────────────
  A: 371 tensors | B: 370 tensors | only-A: 337 | only-B: 336 | differ: 34 | match: 0

Inspect reads tensor metadata via HTTP Range requests (two per file); no weight data is downloaded. Diff compares tensor names, dtypes, and shapes between any two models (remote or cached).
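
The safetensors layout is what makes this cheap: a file begins with an 8-byte little-endian header length followed by that many bytes of JSON metadata, so one small request for the length and one for the JSON yield every tensor name, dtype, and shape. A minimal parser over those leading bytes (a sketch of the format, not the crate's implementation):

```rust
// Sketch of safetensors header parsing (not hf-fetch-model's internals).
// File layout: 8-byte little-endian header length, then that many bytes of
// JSON tensor metadata (names, dtypes, shapes, data offsets), then the raw
// tensor data -- which never needs to be fetched for inspection.
fn header_json(bytes: &[u8]) -> Option<&str> {
    let len = u64::from_le_bytes(bytes.get(..8)?.try_into().ok()?) as usize;
    std::str::from_utf8(bytes.get(8..8 + len)?).ok()
}

fn main() {
    // Build a tiny in-memory "file" with one F16 tensor entry.
    let json = br#"{"w":{"dtype":"F16","shape":[2048,2048],"data_offsets":[0,8388608]}}"#;
    let mut file = (json.len() as u64).to_le_bytes().to_vec();
    file.extend_from_slice(json);
    file.extend_from_slice(&[0u8; 16]); // stand-in for weight bytes
    let header = header_json(&file).unwrap();
    assert!(header.contains(r#""shape":[2048,2048]"#));
    println!("{header}");
}
```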

Disk usage

$ hf-fm du
    5.10 GiB  google/gemma-2-2b-it          (8 files)
    2.80 GiB  EleutherAI/pythia-1.4b        (12 files)
    1.20 GiB  google/gemma-scope-2b-pt-res  (3 files)
    ──────────────────────────────────────────────────
    9.10 GiB  total (3 repos, 23 files)

Library quick start

// Inside an async context (e.g. a #[tokio::main] function);
// `?` requires the caller to return a Result.
let outcome = hf_fetch_model::download(
    "google/gemma-2-2b-it".to_owned(),
).await?;

println!("Model at: {}", outcome.inner().display());

Filter, progress, auth, and more via the builder — see Configuration.

Documentation

Topic           Description
CLI Reference All subcommands, flags, and output examples
Search Comma filtering, --exact, model card metadata
Configuration Builder API, presets, progress callbacks
Architecture How hf-fetch-model relates to hf-hub and candle-mi
Diagnostics --verbose output, tracing setup for library users
Changelog Release history and migration notes

Used by

  • candle-mi — Mechanistic interpretability toolkit for transformer models

License

Licensed under either the Apache License, Version 2.0 or the MIT License, at your option.

Development