§hf-fetch-model
Fast HuggingFace model downloads for Rust.
An embeddable library for downloading HuggingFace model repositories
with maximum throughput. Wraps hf_hub and adds repo-level orchestration.
§Quick Start
let outcome = hf_fetch_model::download("julien-c/dummy-unknown".to_owned()).await?;
println!("Model at: {}", outcome.inner().display());
§Configured Download
use hf_fetch_model::FetchConfig;
let config = FetchConfig::builder()
    .filter("*.safetensors")
    .filter("*.json")
    .on_progress(|e| {
        println!("{}: {:.1}%", e.filename, e.percent);
    })
    .build()?;
let outcome = hf_fetch_model::download_with_config(
    "google/gemma-2-2b".to_owned(),
    &config,
).await?;
// outcome.is_cached() tells you if it came from local cache
let path = outcome.into_inner();
§Inspect Before Downloading
Read tensor metadata from .safetensors headers via HTTP Range requests —
no weight data downloaded. Sharded repos (those with
model.safetensors.index.json) work transparently —
inspect::inspect_repo_safetensors reads every shard’s header in parallel
and returns a flat per-file result list. See
examples/candle_inspect.rs
for a runnable example, or the
Inspect tutorial
for a narrative walkthrough.
let results = hf_fetch_model::inspect::inspect_repo_safetensors(
    "EleutherAI/pythia-1.4b", None, None,
).await?;
for (filename, header, _source) in &results {
    println!("{filename}: {} tensors", header.tensors.len());
}
The CLI also exposes hf-fm inspect <repo> [FILE] --check-gpu [N] (v0.10.1)
to print a one-line GPU-fit verdict against device N (default 0) using
the hypomnesis crate (NVML on Linux/Windows, DXGI on Windows). The
verdict is a binary-only feature today; no library equivalent is exposed
— depend on hypomnesis directly if you need the device-info numbers
from library code.
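The header-only read described in this section works because a safetensors file begins with an 8-byte little-endian length prefix, followed by a JSON header of exactly that length. The sketch below shows the two Range requests such a reader could issue. It is a minimal, std-only illustration, not the crate's implementation: the helper names header_len and header_range are hypothetical and are not part of hf-fetch-model.

```rust
/// Range header value for the 8-byte length prefix that starts a safetensors file.
const LENGTH_PREFIX_RANGE: &str = "bytes=0-7";

/// Decodes the little-endian u64 that gives the JSON header length.
/// (Hypothetical helper, for illustration only.)
fn header_len(prefix: [u8; 8]) -> u64 {
    u64::from_le_bytes(prefix)
}

/// Range header value covering the JSON header that follows the prefix.
/// Returns None for an empty header or on arithmetic overflow.
fn header_range(len: u64) -> Option<String> {
    if len == 0 {
        return None;
    }
    // HTTP byte ranges are inclusive; the header starts right after byte 7.
    let last = 8u64.checked_add(len)? - 1;
    Some(format!("bytes=8-{last}"))
}

fn main() {
    // Pretend the first request returned a prefix announcing a 1024-byte header.
    let prefix = 1024u64.to_le_bytes();
    let len = header_len(prefix);
    println!("first request:  {LENGTH_PREFIX_RANGE}");
    println!("second request: {:?}", header_range(len));
}
```

Only these two small ranges are fetched per file, which is why no weight data needs to be transferred to list tensor names, dtypes, and shapes.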
§HuggingFace Cache
Downloaded files are stored in the standard HuggingFace cache directory
(~/.cache/huggingface/hub/), ensuring compatibility with Python tooling.
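For orientation, the standard HuggingFace hub cache stores each model repo under a models--{owner}--{name} folder, with files exposed under snapshots/{revision}/. The sketch below builds such a path by hand. The function names are hypothetical, and the revision "abc123" is a placeholder rather than a real commit hash; in library code, prefer the crate's own cache and cache_layout modules.

```rust
use std::path::{Path, PathBuf};

/// Folder name the hub cache uses for a model repo, e.g.
/// "google/gemma-2-2b" becomes "models--google--gemma-2-2b".
/// (Hypothetical helper, for illustration only.)
fn repo_folder_name(repo_id: &str) -> String {
    format!("models--{}", repo_id.replace('/', "--"))
}

/// Path of one file inside a snapshot of a cached repo.
fn snapshot_file(cache_root: &Path, repo_id: &str, revision: &str, filename: &str) -> PathBuf {
    cache_root
        .join(repo_folder_name(repo_id))
        .join("snapshots")
        .join(revision)
        .join(filename)
}

fn main() {
    // "abc123" stands in for a commit hash; real snapshot folders use the full hash.
    let root = Path::new("/home/user/.cache/huggingface/hub");
    let path = snapshot_file(root, "google/gemma-2-2b", "abc123", "config.json");
    println!("{}", path.display());
}
```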
§Cache Management
v0.10.0 adds library APIs for inspecting, verifying, and pruning the local
cache. cache::cache_summary enumerates every cached repo with size and
file counts; cache::repo_status gives a per-file Complete / Partial /
Missing breakdown for one repo; cache::verify_cache re-checks SHA256
digests of cached files against HuggingFace LFS metadata; and
cache::find_partial_files locates .chunked.part orphans from
interrupted downloads.
For long verifications (multi-GiB safetensors files), drive
cache::verify_cache_with_progress with an Fn callback that receives
cache::VerifyEvents so a CLI or GUI can render a spinner or progress
bar without polling.
use hf_fetch_model::cache::{self, VerifyStatus};
let results = cache::verify_cache("google/gemma-2-2b-it", None, None).await?;
let ok = results
    .iter()
    .filter(|r| matches!(r.status, VerifyStatus::Ok))
    .count();
let mismatch = results
    .iter()
    .filter(|r| matches!(r.status, VerifyStatus::Mismatch { .. }))
    .count();
println!("{}/{} files verified, {} mismatches", ok, results.len(), mismatch);
§Download Durability
Multi-connection downloads survive interruption. When a download is
aborted by FetchConfigBuilder::timeout_per_file (default 300 s),
Ctrl-C, panic, or a transient chunk error, the partial .chunked.part
file plus a small per-chunk progress sidecar are kept on disk. The next
call to download_with_config for the same file picks up where it
stopped — each parallel chunk sends a fresh Range request that skips
the bytes it already has — provided the upstream etag still matches.
On etag change, schema-version mismatch, or a different
FetchConfigBuilder::connections_per_file count, the partial is
discarded and a fresh download starts.
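The resume behaviour described above can be pictured with a small std-only sketch. It assumes each parallel chunk owns a contiguous byte span and records how many bytes it has already written; the ChunkProgress struct, its fields, and the three functions are hypothetical and do not reflect the crate's actual sidecar format. The can_resume check covers only two of the discard conditions named above (etag and connection count).

```rust
/// One parallel chunk: the inclusive byte span it owns and the bytes already on disk.
/// (Hypothetical type; the real sidecar format is not documented here.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct ChunkProgress {
    start: u64,
    end: u64, // inclusive
    done: u64,
}

/// Splits `total` bytes into at most `n` contiguous chunks of equal size (last one may be shorter).
fn plan_chunks(total: u64, n: u64) -> Vec<ChunkProgress> {
    if total == 0 || n == 0 {
        return Vec::new();
    }
    let size = (total + n - 1) / n; // ceiling division
    (0..n)
        .map(|i| i * size)
        .take_while(|&start| start < total)
        .map(|start| ChunkProgress { start, end: (start + size).min(total) - 1, done: 0 })
        .collect()
}

/// Range header for the bytes a chunk still needs, or None if the chunk is complete.
fn resume_range(chunk: &ChunkProgress) -> Option<String> {
    let next = chunk.start + chunk.done;
    (next <= chunk.end).then(|| format!("bytes={next}-{}", chunk.end))
}

/// A partial is only reusable if the upstream etag and the chunk layout are unchanged.
fn can_resume(saved_etag: &str, current_etag: &str, saved_connections: u64, connections: u64) -> bool {
    saved_etag == current_etag && saved_connections == connections
}

fn main() {
    let mut chunks = plan_chunks(100, 4);
    chunks[1].done = 10; // pretend chunk 1 wrote 10 bytes before the interruption
    for chunk in &chunks {
        println!("{:?} -> {:?}", chunk, resume_range(chunk));
    }
    println!("resumable: {}", can_resume("etag-a", "etag-a", 4, 4));
}
```

Because the chunk boundaries depend on the connection count, changing connections_per_file invalidates the saved offsets, which is consistent with the partial being discarded in that case.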
For slow connections on multi-GiB files, raise the per-file budget to match real throughput:
use std::time::Duration;
use hf_fetch_model::FetchConfig;
let config = FetchConfig::builder()
    .timeout_per_file(Duration::from_secs(1800))
    .build()?;
§Authentication
Set the HF_TOKEN environment variable to access private or gated models,
or use FetchConfig::builder().token().
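As a rough illustration of the two token sources, the sketch below picks a token and formats the Authorization header value that the HuggingFace Hub expects (Bearer scheme). The precedence shown, an explicit token winning over HF_TOKEN, is an assumption and is not confirmed by this page; resolve_token and authorization_header are hypothetical helpers, not crate APIs.

```rust
/// Trims a candidate token and drops it if it is empty.
fn clean(token: Option<&str>) -> Option<String> {
    token.map(str::trim).filter(|t| !t.is_empty()).map(str::to_owned)
}

/// Chooses the token to use. ASSUMPTION: an explicitly configured token
/// takes precedence over the HF_TOKEN environment value.
fn resolve_token(explicit: Option<&str>, env_value: Option<&str>) -> Option<String> {
    clean(explicit).or_else(|| clean(env_value))
}

/// Value for the HTTP Authorization header.
fn authorization_header(token: &str) -> String {
    format!("Bearer {token}")
}

fn main() {
    let env_value = std::env::var("HF_TOKEN").ok();
    match resolve_token(None, env_value.as_deref()) {
        // Print only the length so the secret never reaches the terminal.
        Some(token) => println!("token found ({} chars), header scheme: Bearer", token.len()),
        None => println!("no token set; only public repositories are reachable"),
    }
}
```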
Re-exports§
pub use config::compile_glob_patterns;
pub use config::file_matches;
pub use config::has_glob_chars;
pub use config::FetchConfig;
pub use config::FetchConfigBuilder;
pub use config::Filter;
pub use discover::DiscoveredFamily;
pub use discover::GateStatus;
pub use discover::ModelCardMetadata;
pub use discover::SearchResult;
pub use download::DownloadOutcome;
pub use error::FetchError;
pub use error::FileFailure;
pub use inspect::AdapterConfig;
pub use plan::download_plan;
pub use plan::DownloadPlan;
pub use plan::FilePlan;
pub use progress::ProgressEvent;
pub use progress::ProgressReceiver;
Modules§
- cache: HuggingFace cache directory resolution, model family scanning, disk usage, and integrity verification.
- cache_layout: Centralized hf-hub cache path construction.
- checksum: SHA256 checksum verification for downloaded files.
- config: Configuration for model downloads.
- discover: Model family discovery and search via the HuggingFace Hub API.
- download: Download orchestration for HuggingFace model repositories.
- error: Error types for hf-fetch-model.
- inspect: Safetensors header inspection (local and remote).
- plan: Download plan: metadata-only analysis of what needs downloading.
- progress: Progress reporting for model downloads.
- repo: Repository file listing via the HuggingFace API.
Functions§
- build_client: Builds a reqwest::Client with auth token, user-agent, and 30-second TCP connect timeout.
- download: Downloads all files from a HuggingFace model repository.
- download_blocking: Blocking version of download() for non-async callers.
- download_file: Downloads a single file from a HuggingFace model repository.
- download_file_blocking: Blocking version of download_file() for non-async callers.
- download_files: Downloads all files from a HuggingFace model repository and returns a filename → path map.
- download_files_blocking: Blocking version of download_files() for non-async callers.
- download_files_with_config: Downloads files from a HuggingFace model repository using the given configuration and returns a filename → path map.
- download_files_with_config_blocking: Blocking version of download_files_with_config() for non-async callers.
- download_with_config: Downloads files from a HuggingFace model repository using the given configuration.
- download_with_config_blocking: Blocking version of download_with_config() for non-async callers.
- download_with_plan: Downloads files according to an existing DownloadPlan.
- download_with_plan_blocking: Blocking version of download_with_plan() for non-async callers.