Crate hf_fetch_model


§hf-fetch-model

Fast HuggingFace model downloads for Rust.

An embeddable library for downloading HuggingFace model repositories with maximum throughput. Wraps hf_hub and adds repo-level orchestration.

§Quick Start

// Call from an async function whose return type supports `?`.
let outcome = hf_fetch_model::download("julien-c/dummy-unknown".to_owned()).await?;
println!("Model at: {}", outcome.inner().display());

§Configured Download

use hf_fetch_model::FetchConfig;

let config = FetchConfig::builder()
    .filter("*.safetensors")
    .filter("*.json")
    .on_progress(|e| {
        println!("{}: {:.1}%", e.filename, e.percent);
    })
    .build()?;

let outcome = hf_fetch_model::download_with_config(
    "google/gemma-2-2b".to_owned(),
    &config,
).await?;
// outcome.is_cached() tells you if it came from local cache
let path = outcome.into_inner();
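
To make the effect of stacked `.filter(...)` calls concrete, here is a minimal, std-only sketch of wildcard filtering. It is not the crate's implementation (the crate exposes its own `file_matches` and `compile_glob_patterns`); the names `wildcard_match` and `keep_file` are hypothetical, and two semantics are assumptions: a file is kept if it matches any one pattern, and an empty pattern list keeps everything.

```rust
/// Returns true when `name` matches `pattern`, where `*` matches any run of
/// characters (including none) and `?` matches exactly one character.
/// Hypothetical helper; the crate's real glob rules may differ.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Pattern index of the most recent `*` and the name index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star, and first try letting it match nothing.
            backtrack = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((star_pi, star_ni)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ni = star_ni + 1;
            backtrack = Some((star_pi, star_ni + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Assumed semantics: keep a file if it matches at least one pattern;
/// an empty pattern list keeps every file.
fn keep_file(patterns: &[&str], name: &str) -> bool {
    patterns.is_empty() || patterns.iter().any(|p| wildcard_match(p, name))
}
```

Under these assumptions, the two filters in the example above would keep `config.json` and every `.safetensors` shard while skipping files such as `README.md`.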

§Inspect Before Downloading

Read tensor metadata from .safetensors headers via HTTP Range requests; no weight data is downloaded. Sharded repos (those with model.safetensors.index.json) work transparently: inspect::inspect_repo_safetensors reads every shard’s header in parallel and returns a flat per-file result list. See examples/candle_inspect.rs for a runnable example, or the Inspect tutorial for a narrative walkthrough.

let results = hf_fetch_model::inspect::inspect_repo_safetensors(
    "EleutherAI/pythia-1.4b", None, None,
).await?;

for (filename, header, _source) in &results {
    println!("{filename}: {} tensors", header.tensors.len());
}
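
Header-only inspection is possible because of the safetensors file layout: the first 8 bytes are a little-endian `u64` giving the length of a JSON header, and that header follows immediately, ahead of any tensor data. The sketch below only computes the byte ranges such a reader would request. The helper names are hypothetical, and whether the crate issues two requests or one speculative larger one is not stated here, so treat the two-step flow as an assumption.

```rust
/// Size of the little-endian length prefix at the start of a safetensors file.
const LEN_PREFIX_BYTES: u64 = 8;

/// Decodes the JSON header length from the first 8 bytes of a file.
fn header_len(prefix: [u8; 8]) -> u64 {
    u64::from_le_bytes(prefix)
}

/// HTTP `Range` header value covering bytes `start..=end_inclusive`.
fn range_header(start: u64, end_inclusive: u64) -> String {
    format!("bytes={start}-{end_inclusive}")
}

/// Range covering the JSON header itself, once its length is known.
/// Returns `None` for an empty header or if the arithmetic would overflow.
fn json_header_range(len: u64) -> Option<String> {
    if len == 0 {
        return None;
    }
    let end = LEN_PREFIX_BYTES.checked_add(len)?.checked_sub(1)?;
    Some(range_header(LEN_PREFIX_BYTES, end))
}
```

A reader following this sketch would first request `range_header(0, 7)`, decode the length with `header_len`, then request `json_header_range(len)` and parse the returned JSON. The tensor payload is never touched.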

The CLI also exposes hf-fm inspect <repo> [FILE] --check-gpu [N] (added in v0.10.1), which prints a one-line GPU-fit verdict against device N (default 0) using the hypomnesis crate (NVML on Linux/Windows, DXGI on Windows). The verdict is available only in the CLI binary today; there is no library equivalent. If you need the device-info numbers from library code, depend on hypomnesis directly.

§HuggingFace Cache

Downloaded files are stored in the standard HuggingFace cache directory (~/.cache/huggingface/hub/), ensuring compatibility with Python tooling.
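
For orientation, the following sketch shows how Python's huggingface_hub lays out that directory: the cache root is taken from `HF_HUB_CACHE`, else `$HF_HOME/hub`, else `~/.cache/huggingface/hub`, and each model repo gets a folder named `models--<org>--<name>`. Whether this crate honours the same environment variables is an assumption, and both function names are hypothetical. Environment values are passed in as arguments so the logic stays easy to test.

```rust
use std::path::{Path, PathBuf};

/// Resolves the hub cache root following the huggingface_hub convention.
/// Assumption: shown for illustration; the crate may resolve it differently.
fn hub_cache_dir(hf_hub_cache: Option<&str>, hf_home: Option<&str>, home: &Path) -> PathBuf {
    if let Some(dir) = hf_hub_cache {
        return PathBuf::from(dir);
    }
    if let Some(dir) = hf_home {
        return Path::new(dir).join("hub");
    }
    home.join(".cache").join("huggingface").join("hub")
}

/// Folder name used for a model repo inside the cache root,
/// e.g. "google/gemma-2-2b" becomes "models--google--gemma-2-2b".
fn repo_folder_name(repo_id: &str) -> String {
    format!("models--{}", repo_id.replace('/', "--"))
}
```

In real code the first two arguments would typically come from `std::env::var("HF_HUB_CACHE").ok()` and `std::env::var("HF_HOME").ok()`.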

§Cache Management

v0.10.0 adds library APIs for inspecting, verifying, and pruning the local cache. cache::cache_summary enumerates every cached repo with size and file counts; cache::repo_status gives a per-file Complete / Partial / Missing breakdown for one repo; cache::verify_cache re-checks SHA256 digests of cached files against HuggingFace LFS metadata; and cache::find_partial_files locates .chunked.part orphans from interrupted downloads.

For long verifications (multi-GiB safetensors files), drive cache::verify_cache_with_progress with an Fn callback that receives cache::VerifyEvent values, so a CLI or GUI can render a spinner or progress bar without polling.

use hf_fetch_model::cache::{self, VerifyStatus};

let results = cache::verify_cache("google/gemma-2-2b-it", None, None).await?;
let ok = results
    .iter()
    .filter(|r| matches!(r.status, VerifyStatus::Ok))
    .count();
let mismatch = results
    .iter()
    .filter(|r| matches!(r.status, VerifyStatus::Mismatch { .. }))
    .count();
println!("{}/{} files verified, {} mismatches", ok, results.len(), mismatch);
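
One practical consequence of the progress callback being an `Fn` (rather than `FnMut`) is that it cannot mutate captured variables directly; counters need interior mutability such as atomics. The sketch below shows that pattern with stand-in types. `DemoEvent` and `verify_demo` are hypothetical and do not mirror the crate's event variants or the signature of `verify_cache_with_progress`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a verification event; not the crate's type.
enum DemoEvent {
    FileStarted { total_bytes: u64 },
    BytesHashed(u64),
    FileDone,
}

/// Hypothetical driver that reports progress through an `Fn` callback.
/// It pretends to hash each file in steps of at most 4 MiB.
fn verify_demo<F: Fn(&DemoEvent)>(file_sizes: &[u64], on_event: F) {
    const STEP: u64 = 4 * 1024 * 1024;
    for &size in file_sizes {
        on_event(&DemoEvent::FileStarted { total_bytes: size });
        let mut done = 0;
        while done < size {
            let step = (size - done).min(STEP);
            done += step;
            on_event(&DemoEvent::BytesHashed(step));
        }
        on_event(&DemoEvent::FileDone);
    }
}

/// Accumulates progress from inside an `Fn` closure by using an atomic
/// counter instead of a `let mut` variable.
fn total_bytes_hashed(file_sizes: &[u64]) -> u64 {
    let hashed = AtomicU64::new(0);
    verify_demo(file_sizes, |event| {
        if let DemoEvent::BytesHashed(n) = event {
            hashed.fetch_add(*n, Ordering::Relaxed);
        }
    });
    hashed.load(Ordering::Relaxed)
}
```

A progress bar would read the same counter from its render loop; a `Mutex` works equally well when the shared state is richer than a single number.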

§Download Durability

Multi-connection downloads survive interruption. When a download is aborted by FetchConfigBuilder::timeout_per_file (default 300 s), Ctrl-C, a panic, or a transient chunk error, the partial .chunked.part file and a small per-chunk progress sidecar are both kept on disk. The next call to download_with_config for the same file resumes where it stopped, provided the upstream etag still matches: each parallel chunk sends a fresh Range request that skips the bytes it already has. On an etag change, a schema-version mismatch, or a different FetchConfigBuilder::connections_per_file count, the partial is discarded and a fresh download starts.

For slow connections on multi-GiB files, raise the per-file budget to match real throughput:

use std::time::Duration;
use hf_fetch_model::FetchConfig;

let config = FetchConfig::builder()
    .timeout_per_file(Duration::from_secs(1800))
    .build()?;
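
The resume behaviour described in this section can be pictured with a little range arithmetic. This is a sketch under stated assumptions, not the crate's code: the even split in `chunk_ranges` and both function names are hypothetical, and the on-disk sidecar format is not modelled at all.

```rust
/// Splits `total` bytes into at most `connections` contiguous chunks,
/// returned as inclusive `(start, end)` byte offsets. The last chunk takes
/// the remainder. Assumed splitting scheme, for illustration only.
fn chunk_ranges(total: u64, connections: u64) -> Vec<(u64, u64)> {
    if total == 0 || connections == 0 {
        return Vec::new();
    }
    let n = connections.min(total);
    let base = total / n;
    (0..n)
        .map(|i| {
            let start = i * base;
            let end = if i == n - 1 { total - 1 } else { start + base - 1 };
            (start, end)
        })
        .collect()
}

/// `Range` header value for the part of a chunk that is still missing, given
/// how many of its bytes are already on disk. `None` means nothing is left.
fn resume_range(chunk: (u64, u64), bytes_done: u64) -> Option<String> {
    let (start, end) = chunk;
    let next = start.checked_add(bytes_done)?;
    if next > end {
        None
    } else {
        Some(format!("bytes={next}-{end}"))
    }
}
```

The sketch also suggests why a changed connection count forces a restart: the chunk boundaries move, so per-chunk byte counts recorded for the old layout no longer describe the new chunks.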

§Authentication

Set the HF_TOKEN environment variable to access private or gated models, or use FetchConfig::builder().token().
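
On the wire, a Hub token is sent as an `Authorization: Bearer <token>` header. The helper below is a hypothetical sketch of that final step only; trimming whitespace and treating an empty value as "no token" are assumptions, and it says nothing about how the crate chooses between `HF_TOKEN` and a token set on the builder.

```rust
/// Builds the `Authorization` header value for a Hub token.
/// Returns `None` when no token is set or it is blank (assumed behaviour).
fn bearer_header(token: Option<&str>) -> Option<String> {
    let trimmed = token?.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(format!("Bearer {trimmed}"))
    }
}
```

With the environment variable as the source, a call could look like `bearer_header(std::env::var("HF_TOKEN").ok().as_deref())`.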

Re-exports§

pub use config::compile_glob_patterns;
pub use config::file_matches;
pub use config::has_glob_chars;
pub use config::FetchConfig;
pub use config::FetchConfigBuilder;
pub use config::Filter;
pub use discover::DiscoveredFamily;
pub use discover::GateStatus;
pub use discover::ModelCardMetadata;
pub use discover::SearchResult;
pub use download::DownloadOutcome;
pub use error::FetchError;
pub use error::FileFailure;
pub use inspect::AdapterConfig;
pub use plan::download_plan;
pub use plan::DownloadPlan;
pub use plan::FilePlan;
pub use progress::ProgressEvent;
pub use progress::ProgressReceiver;

Modules§

cache
HuggingFace cache directory resolution, model family scanning, disk usage, and integrity verification.
cache_layout
Centralized hf-hub cache path construction.
checksum
SHA256 checksum verification for downloaded files.
config
Configuration for model downloads.
discover
Model family discovery and search via the HuggingFace Hub API.
download
Download orchestration for HuggingFace model repositories.
error
Error types for hf-fetch-model.
inspect
Safetensors header inspection (local and remote).
plan
Download plan: metadata-only analysis of what needs downloading.
progress
Progress reporting for model downloads.
repo
Repository file listing via the HuggingFace API.

Functions§

build_client
Builds a reqwest::Client with auth token, user-agent, and 30-second TCP connect timeout.
download
Downloads all files from a HuggingFace model repository.
download_blocking
Blocking version of download() for non-async callers.
download_file
Downloads a single file from a HuggingFace model repository.
download_file_blocking
Blocking version of download_file() for non-async callers.
download_files
Downloads all files from a HuggingFace model repository and returns a filename → path map.
download_files_blocking
Blocking version of download_files() for non-async callers.
download_files_with_config
Downloads files from a HuggingFace model repository using the given configuration and returns a filename → path map.
download_files_with_config_blocking
Blocking version of download_files_with_config() for non-async callers.
download_with_config
Downloads files from a HuggingFace model repository using the given configuration.
download_with_config_blocking
Blocking version of download_with_config() for non-async callers.
download_with_plan
Downloads files according to an existing DownloadPlan.
download_with_plan_blocking
Blocking version of download_with_plan() for non-async callers.