hf-fetch-model 0.9.7


A Rust library and CLI for downloading and inspecting HuggingFace models. Multi-connection parallel downloads, file filtering, checksum verification, and retries — plus safetensors header inspection and tensor layout comparison between models, all without downloading weight data.


Install

cargo install hf-fetch-model --features cli

Commands

Command Description
hf-fm <REPO_ID> (default) Download a model (multi-connection, auto-tuned)
hf-fm cache clean-partial Remove .chunked.part files from interrupted downloads
hf-fm cache delete <REPO_ID|N> Delete a cached model
hf-fm cache path <REPO_ID|N> Print snapshot directory path (for scripting)
hf-fm diff <REPO_A> <REPO_B> Compare tensor layouts between two models
hf-fm discover Find new model families on the Hub
hf-fm download-file <REPO_ID> <FILE> Download a single file (or glob pattern)
hf-fm du [REPO_ID|N] Show cache disk usage (by name or # index)
hf-fm inspect <REPO_ID> [FILE] Inspect safetensors headers (tensor names, shapes, dtypes) without downloading weights
hf-fm list-families List model families in local cache
hf-fm list-files <REPO_ID> List remote files (sizes, SHA256) without downloading
hf-fm search <QUERY> Search the HuggingFace Hub for models
hf-fm status [REPO_ID] Show download status (complete / partial / missing)

See CLI Reference for all flags and output examples.

Try it

$ hf-fm search mistral,3B,instruct
Models matching "mistral,3B,instruct" (by downloads):

  hf-fm mistralai/Ministral-3-3B-Instruct-2512           (159,700 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-BF16      (62,600 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-GGUF      (32,700 downloads)
  ...

$ hf-fm search llama --tag gguf --limit 3
Models matching "llama" (by downloads):

  hf-fm bartowski/Llama-3.2-3B-Instruct-GGUF             (489,856 downloads)  [text-generation]
  hf-fm bartowski/Meta-Llama-3.1-8B-Instruct-GGUF        (237,791 downloads)  [text-generation]
  hf-fm MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF    (184,847 downloads)  [text-generation]

$ hf-fm search mistralai/Ministral-3-3B-Instruct-2512 --exact
Exact match:

  hf-fm mistralai/Ministral-3-3B-Instruct-2512           (159,700 downloads)

  License:      apache-2.0
  Pipeline:     text-generation
  Library:      vllm
  Languages:    en, fr, es, de, it, pt, nl, zh, ja, ko, ar

$ hf-fm list-files mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
  File                                               Size      SHA256
  model-00001-of-00002.safetensors                 3.68 GiB    a1b2c3d4e5f6
  model-00002-of-00002.safetensors                 2.88 GiB    f6e5d4c3b2a1
  config.json                                        856 B     —
  ...
  7 files, 6.57 GiB total

$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --dry-run
  Repo:     mistralai/Ministral-3-3B-Instruct-2512
  Revision: main

  File                                               Size      Status
  model-00001-of-00002.safetensors                 3.68 GiB    to download
  model-00002-of-00002.safetensors                 2.88 GiB    to download
  ...
  Total: 6.57 GiB (7 files, 0 cached, 7 to download)

  Recommended config:
    concurrency:        2
    connections/file:   8
    chunk threshold:  100 MiB
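The recommended config splits each large file into byte ranges fetched over parallel connections. The actual auto-tuning logic is internal to hf-fm; as a minimal sketch (function name and signature are illustrative, not the crate's API), a file above the chunk threshold might be divided like this:

```rust
/// Split a file of `size` bytes into `n` contiguous half-open byte
/// ranges suitable for HTTP Range requests. Files at or below
/// `threshold` are fetched in a single request.
fn split_ranges(size: u64, threshold: u64, n: u64) -> Vec<(u64, u64)> {
    if size <= threshold || n <= 1 {
        return vec![(0, size)];
    }
    let base = size / n;
    let rem = size % n; // the first `rem` chunks carry one extra byte
    let mut ranges = Vec::with_capacity(n as usize);
    let mut start = 0;
    for i in 0..n {
        let len = base + if i < rem { 1 } else { 0 };
        ranges.push((start, start + len));
        start += len;
    }
    ranges
}

fn main() {
    // A 1 GiB file split across 8 connections, 100 MiB threshold.
    let gib = 1u64 << 30;
    let ranges = split_ranges(gib, 100 << 20, 8);
    assert_eq!(ranges.len(), 8);
    assert_eq!(ranges.first().unwrap().0, 0);
    assert_eq!(ranges.last().unwrap().1, gib);
    // A small file stays in one piece.
    assert_eq!(split_ranges(5 << 20, 100 << 20, 8), vec![(0, 5 << 20)]);
    println!("ok");
}
```

The invariant worth checking in any such scheme is that the ranges are contiguous and cover the whole file, so every byte is requested exactly once.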

$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
Downloaded to: ~/.cache/huggingface/hub/models--mistralai--Ministral-3-3B.../snapshots/...
  6.57 GiB in 18.2s (369.1 MiB/s)

# Download to flat layout (files directly in ./models/)
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --flat --output-dir ./models

# Download sharded PyTorch files by glob
$ hf-fm download-file org/model "pytorch_model-*.bin"

Inspect & compare

$ hf-fm inspect EleutherAI/pythia-1.4b model.safetensors --cached --filter "layers.0."
  Repo:     EleutherAI/pythia-1.4b
  File:     model.safetensors
  Source:   cached

  Tensor                                             Dtype    Shape                  Size     Params
  gpt_neox.layers.0.attention.dense.weight           F16      [2048, 2048]       8.00 MiB       4.2M
  gpt_neox.layers.0.mlp.dense_h_to_4h.weight         F16      [8192, 2048]      32.00 MiB      16.8M
  ...
  ────────────────────────────────────────────────────────────────────────────────────────────────
  15/364 tensors, 54.6M/1.52B params (filter: "layers.0.")

$ hf-fm inspect google/gemma-4-E2B-it model.safetensors --tree --filter "embed"
  Repo:     google/gemma-4-E2B-it
  File:     model.safetensors
  Source:   remote (2 HTTP requests)

  └── model.
      ├── embed_audio.embedding_projection.weight   BF16  [1536, 1536]   4.50 MiB
      ├── embed_vision.embedding_projection.weight  BF16  [1536, 768]    2.25 MiB
      ├── language_model.
      │   ├── embed_tokens.weight            BF16  [262144, 1536]      768.00 MiB
      │   └── embed_tokens_per_layer.weight  BF16  [262144, 8960]        4.38 GiB
      └── vision_tower.patch_embedder.
          ├── input_proj.weight         BF16  [768, 768]        1.12 MiB
          └── position_embedding_table  BF16  [2, 10240, 768]  30.00 MiB
  6/2011 tensors, 2.77B/5.12B params (filter: "embed")
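The "2 HTTP requests" reported above follow from the safetensors on-disk layout: the file begins with an 8-byte little-endian u64 giving the length of a JSON header that describes every tensor's name, dtype, shape, and data offsets. One Range request fetches those 8 bytes, a second fetches the header itself, and the weights are never touched. A sketch of the prefix decoding (function name is illustrative, not part of hf-fetch-model's API):

```rust
/// A safetensors file starts with an 8-byte little-endian u64:
/// the byte length of the JSON header that follows. Decoding it
/// tells a client exactly which byte range holds the header.
fn header_len(prefix: [u8; 8]) -> u64 {
    u64::from_le_bytes(prefix)
}

fn main() {
    // e.g. a header of 4660 bytes, as it would appear on the wire
    let prefix = 4660u64.to_le_bytes();
    assert_eq!(header_len(prefix), 4660);
    // The JSON header then occupies bytes 8 .. 8 + header_len;
    // tensor data begins immediately after it.
    let range = (8, 8 + header_len(prefix));
    assert_eq!(range, (8, 4668));
    println!("ok");
}
```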

$ hf-fm diff RedHatAI/Llama-3.2-1B-Instruct-FP8 casperhansen/llama-3.2-1b-instruct-awq --cached --summary
  A: RedHatAI/Llama-3.2-1B-Instruct-FP8
  B: casperhansen/llama-3.2-1b-instruct-awq
  ──────────────────────────────────────────────────────────────────────────────────────────────
  A: 371 tensors | B: 370 tensors | only-A: 337 | only-B: 336 | differ: 34 | match: 0

Inspect reads tensor metadata via HTTP Range requests (2 requests per file) — no weight data downloaded. The --tree flag shows the hierarchical namespace with numeric sibling groups auto-collapsed to [0..N] for structural discovery. Diff compares tensor names, dtypes, and shapes between any two models (remote or cached).
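Conceptually, a diff like the one above builds a name → (dtype, shape) map per model and walks the two maps; a minimal standalone sketch of that comparison (all types and names are illustrative, not the crate's internals):

```rust
use std::collections::BTreeMap;

/// Tensor layout: name -> (dtype, shape).
type Layout = BTreeMap<String, (String, Vec<u64>)>;

#[derive(Debug, Default, PartialEq)]
struct DiffSummary {
    only_a: usize,
    only_b: usize,
    differ: usize,
    matching: usize,
}

fn diff(a: &Layout, b: &Layout) -> DiffSummary {
    let mut s = DiffSummary::default();
    for (name, meta) in a {
        match b.get(name) {
            None => s.only_a += 1,                  // tensor missing from B
            Some(m) if m == meta => s.matching += 1, // same dtype and shape
            Some(_) => s.differ += 1,                // present, but differs
        }
    }
    s.only_b = b.keys().filter(|k| !a.contains_key(*k)).count();
    s
}

fn main() {
    let mut a = Layout::new();
    a.insert("embed.weight".into(), ("BF16".into(), vec![32000, 2048]));
    a.insert("lm_head.weight".into(), ("F8_E4M3".into(), vec![2048, 32000]));
    let mut b = Layout::new();
    b.insert("embed.weight".into(), ("BF16".into(), vec![32000, 2048]));
    b.insert("lm_head.qweight".into(), ("I32".into(), vec![256, 32000]));
    let s = diff(&a, &b);
    assert_eq!((s.matching, s.differ, s.only_a, s.only_b), (1, 0, 1, 1));
    println!("ok");
}
```

This is why the FP8-vs-AWQ diff above reports almost no matches: quantization schemes rename and reshape most weight tensors, so the same architecture can have nearly disjoint layouts.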

Disk usage

$ hf-fm du
   #        SIZE  REPO                                             FILES
   1    5.10 GiB  google/gemma-2-2b-it                                 8
   2    2.80 GiB  EleutherAI/pythia-1.4b                              12  ●
   3    1.20 GiB  google/gemma-scope-2b-pt-res                         3
  ─────────────────────────────────────────────────────────────────────────────
   9.10 GiB  total (3 repos, 23 files)
  ● = partial downloads

$ hf-fm du 2
  EleutherAI/pythia-1.4b:

   #        SIZE  FILE
   1    2.50 GiB  model-00001-of-00002.safetensors
   2    0.26 GiB  model-00002-of-00002.safetensors
   ...
  ──────────────────────────────────────────────────────────────────
   2.80 GiB  total (12 files)

  ● partial downloads — run `hf-fm status EleutherAI/pythia-1.4b` for details

$ hf-fm du --age
   #        SIZE  REPO                                             FILES  AGE
   1    5.10 GiB  google/gemma-2-2b-it                                 8  2 days ago
   2    2.80 GiB  EleutherAI/pythia-1.4b                              12  45 days ago     ●
   3    1.20 GiB  google/gemma-scope-2b-pt-res                         3  3 months ago
  ─────────────────────────────────────────────────────────────────────────────────────────
   9.10 GiB  total (3 repos, 23 files)
  ● = partial downloads

$ hf-fm cache path google/gemma-2-2b-it
/home/user/.cache/huggingface/hub/models--google--gemma-2-2b-it/snapshots/abc1234

Library quick start

let outcome = hf_fetch_model::download(
    "google/gemma-2-2b-it".to_owned(),
).await?;

println!("Model at: {}", outcome.inner().display());

Filter, progress, auth, and more via the builder — see Configuration.

Documentation

Topic Description
CLI Reference All subcommands, flags, and output examples
FAQ Common questions — installation, auth, cache location, discovery, errors
Search Comma filtering, --exact, model card metadata
Configuration Builder API, presets, progress callbacks
Architecture How hf-fetch-model relates to hf-hub and candle-mi
Diagnostics --verbose output, tracing setup for library users
Upstream differences Where hf-fetch-model diverges from Python huggingface_hub/hf_transfer
Candle example Inspect tensor layouts before downloading — for candle users
Changelog Release history and migration notes

Used by

  • candle-mi — Mechanistic interpretability toolkit for language models

License

Licensed under either the Apache License, Version 2.0 or the MIT License, at your option.

Development