hf-fetch-model 0.9.2

Fast HuggingFace model downloads for Rust: an embeddable library for pulling models from the Hub with maximum throughput

A Rust library and CLI for downloading and inspecting HuggingFace models. Multi-connection parallel downloads, file filtering, checksum verification, and automatic retries, plus safetensors header inspection and tensor layout comparison between models, all without downloading weight data.

Install

cargo install hf-fetch-model --features cli

Commands

Command                                    Description
hf-fm <REPO_ID> Download a model (multi-connection, auto-tuned)
hf-fm diff <REPO_A> <REPO_B> Compare tensor layouts between two models
hf-fm discover Find new model families on the Hub
hf-fm download-file <REPO_ID> <FILE> Download a single file (or glob pattern)
hf-fm du [REPO_ID] Show cache disk usage
hf-fm inspect <REPO_ID> [FILE] Inspect safetensors headers (tensor names, shapes, dtypes)
hf-fm list-families List model families in local cache
hf-fm list-files <REPO_ID> List remote files (sizes, SHA256) without downloading
hf-fm search <QUERY> Search the HuggingFace Hub for models
hf-fm status [REPO_ID] Show download status (complete / partial / missing)

See CLI Reference for all flags and output examples.

Try it

$ hf-fm search mistral,3B,instruct
Models matching "mistral,3B,instruct" (by downloads):

  hf-fm mistralai/Ministral-3-3B-Instruct-2512           (159,700 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-BF16      (62,600 downloads)
  hf-fm mistralai/Ministral-3-3B-Instruct-2512-GGUF      (32,700 downloads)
  ...

$ hf-fm search mistralai/Ministral-3-3B-Instruct-2512 --exact
Exact match:

  hf-fm mistralai/Ministral-3-3B-Instruct-2512           (159,700 downloads)

  License:      apache-2.0
  Pipeline:     text-generation
  Library:      vllm
  Languages:    en, fr, es, de, it, pt, nl, zh, ja, ko, ar

$ hf-fm list-files mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
  File                                               Size      SHA256
  model-00001-of-00002.safetensors                 3.68 GiB    a1b2c3d4e5f6
  model-00002-of-00002.safetensors                 2.88 GiB    f6e5d4c3b2a1
  config.json                                        856 B     —
  ...
  7 files, 6.57 GiB total

$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --dry-run
  Repo:     mistralai/Ministral-3-3B-Instruct-2512
  Revision: main

  File                                               Size      Status
  model-00001-of-00002.safetensors                 3.68 GiB    to download
  model-00002-of-00002.safetensors                 2.88 GiB    to download
  ...
  Total: 6.57 GiB (7 files, 0 cached, 7 to download)

  Recommended config:
    concurrency:        2
    connections/file:   8
    chunk threshold:  100 MiB
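
The "connections/file" figure above means each large file is fetched over several parallel HTTP Range requests. As an illustrative sketch (an assumption about the general technique, not the crate's internals), the byte ranges for one file can be computed like this:

```rust
// Sketch (not hf-fetch-model's actual code): split a file of `size` bytes
// into `parts` contiguous byte ranges for parallel HTTP Range requests.
// Each pair maps to a "bytes=start-end" header (inclusive on both ends).
fn byte_ranges(size: u64, parts: u64) -> Vec<(u64, u64)> {
    let chunk = size.div_ceil(parts);
    (0..parts)
        .map(|i| (i * chunk, ((i + 1) * chunk).min(size) - 1))
        .filter(|(start, end)| start <= end) // drop empty trailing ranges
        .collect()
}

fn main() {
    // A 1000-byte file over 4 connections -> four 250-byte ranges.
    let ranges = byte_ranges(1000, 4);
    assert_eq!(ranges, vec![(0, 249), (250, 499), (500, 749), (750, 999)]);
    println!("{ranges:?}");
}
```

The chunk-threshold setting would then decide whether a file is worth splitting at all: files below it are fetched in a single request.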

$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
Downloaded to: ~/.cache/huggingface/hub/models--mistralai--Ministral-3-3B.../snapshots/...
  6.57 GiB in 18.2s (369.1 MiB/s)

# Download to flat layout (files directly in ./models/)
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --flat --output-dir ./models

# Download sharded PyTorch files by glob
$ hf-fm download-file org/model "pytorch_model-*.bin"
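
A glob like the one above can be reduced to a small wildcard matcher. This standalone sketch (assumed behavior, not the crate's matcher) supports `*` only and assumes ASCII file names:

```rust
// Minimal wildcard matcher (illustrative, not hf-fetch-model's code):
// `*` matches any run of characters; everything else matches literally.
// Byte-indexed slicing, so ASCII file names are assumed.
fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == name,
        Some((prefix, rest)) => name.strip_prefix(prefix).is_some_and(|tail| {
            // Let `*` absorb 0..=tail.len() characters, then recurse.
            (0..=tail.len()).any(|i| glob_match(rest, &tail[i..]))
        }),
    }
}

fn main() {
    assert!(glob_match("pytorch_model-*.bin", "pytorch_model-00001-of-00002.bin"));
    assert!(!glob_match("pytorch_model-*.bin", "model.safetensors"));
    println!("shard pattern matched");
}
```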

Inspect & compare

$ hf-fm inspect EleutherAI/pythia-1.4b model.safetensors --cached --filter "layers.0."
  Repo:     EleutherAI/pythia-1.4b
  File:     model.safetensors
  Source:   cached

  Tensor                                             Dtype    Shape                  Size     Params
  gpt_neox.layers.0.attention.dense.weight           F16      [2048, 2048]       8.00 MiB       4.2M
  gpt_neox.layers.0.mlp.dense_h_to_4h.weight         F16      [8192, 2048]      32.00 MiB      16.8M
  ...
  ────────────────────────────────────────────────────────────────────────────────────────────────
  15/364 tensors, 54.6M/1.52B params (filter: "layers.0.")

$ hf-fm diff RedHatAI/Llama-3.2-1B-Instruct-FP8 casperhansen/llama-3.2-1b-instruct-awq --cached --summary
  A: RedHatAI/Llama-3.2-1B-Instruct-FP8
  B: casperhansen/llama-3.2-1b-instruct-awq
  ──────────────────────────────────────────────────────────────────────────────────────────────
  A: 371 tensors | B: 370 tensors | only-A: 337 | only-B: 336 | differ: 34 | match: 0

Inspect reads tensor metadata via HTTP Range requests (two per file); no weight data is downloaded. Diff compares tensor names, dtypes, and shapes between any two models (remote or cached).
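
The safetensors layout is what makes this cheap: a file begins with an 8-byte little-endian header length followed by that many bytes of JSON metadata, so one small request for the length and one for the JSON yield every tensor name, dtype, and shape. A minimal parser over those leading bytes (a sketch of the format, not the crate's implementation):

```rust
// Sketch of safetensors header parsing (not hf-fetch-model's internals).
// File layout: 8-byte little-endian header length, then that many bytes of
// JSON tensor metadata (names, dtypes, shapes, data offsets), then the raw
// tensor data -- which never needs to be fetched for inspection.
fn header_json(bytes: &[u8]) -> Option<&str> {
    let len = u64::from_le_bytes(bytes.get(..8)?.try_into().ok()?) as usize;
    std::str::from_utf8(bytes.get(8..8 + len)?).ok()
}

fn main() {
    // Build a tiny in-memory "file" with one F16 tensor entry.
    let json = br#"{"w":{"dtype":"F16","shape":[2048,2048],"data_offsets":[0,8388608]}}"#;
    let mut file = (json.len() as u64).to_le_bytes().to_vec();
    file.extend_from_slice(json);
    file.extend_from_slice(&[0u8; 16]); // stand-in for weight bytes
    let header = header_json(&file).unwrap();
    assert!(header.contains(r#""shape":[2048,2048]"#));
    println!("{header}");
}
```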

Disk usage

$ hf-fm du
    5.10 GiB  google/gemma-2-2b-it          (8 files)
    2.80 GiB  EleutherAI/pythia-1.4b        (12 files)
    1.20 GiB  google/gemma-scope-2b-pt-res  (3 files)
    ──────────────────────────────────────────────────
    9.10 GiB  total (3 repos, 23 files)

Library quick start

// Inside an async context (e.g. a #[tokio::main] function);
// `?` requires the caller to return a Result.
let outcome = hf_fetch_model::download(
    "google/gemma-2-2b-it".to_owned(),
).await?;

println!("Model at: {}", outcome.inner().display());

Filter, progress, auth, and more via the builder — see Configuration.

Documentation

Topic           Description
CLI Reference All subcommands, flags, and output examples
Search Comma filtering, --exact, model card metadata
Configuration Builder API, presets, progress callbacks
Architecture How hf-fetch-model relates to hf-hub and candle-mi
Diagnostics --verbose output, tracing setup for library users
Changelog Release history and migration notes

Used by

  • candle-mi — Mechanistic interpretability toolkit for transformer models

License

Licensed under either the Apache License, Version 2.0 or the MIT License, at your option.

Development