Crate hf_fetch_model

§hf-fetch-model

Fast HuggingFace model downloads for Rust.

An embeddable library for downloading HuggingFace model repositories with maximum throughput. Wraps hf_hub and adds repo-level orchestration.

§Quick Start

let outcome = hf_fetch_model::download("julien-c/dummy-unknown".to_owned()).await?;
println!("Model at: {}", outcome.inner().display());

§Configured Download

use hf_fetch_model::FetchConfig;

let config = FetchConfig::builder()
    .filter("*.safetensors")
    .filter("*.json")
    .on_progress(|e| {
        println!("{}: {:.1}%", e.filename, e.percent);
    })
    .build()?;

let outcome = hf_fetch_model::download_with_config(
    "google/gemma-2-2b".to_owned(),
    &config,
).await?;
// outcome.is_cached() tells you if it came from local cache
let path = outcome.into_inner();
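
To make the filter semantics concrete, here is a minimal std-only sketch of glob matching. It assumes that a file is kept when it matches at least one pattern and that an empty pattern list keeps everything. The names glob_match and keep_file are hypothetical and are not the crate's API; the crate's own matching (see config::file_matches) may differ in details such as path separators or character classes.

```rust
/// Returns true when `name` matches `pattern`, where `*` matches any run of
/// characters (including none) and `?` matches exactly one character.
/// Illustrative only: not the crate's implementation.
fn glob_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Pattern index of the most recent `*` and the name index it was tried at.
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try matching it against nothing.
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Keeps a file when it matches any pattern (assumed OR semantics);
/// an empty pattern list keeps every file.
fn keep_file(patterns: &[&str], name: &str) -> bool {
    patterns.is_empty() || patterns.iter().any(|p| glob_match(p, name))
}
```

Under these assumptions, the two filters in the example above would keep config.json and every .safetensors shard while skipping files such as README.md.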

§Inspect Before Downloading

Read tensor metadata from .safetensors headers via HTTP Range requests; no weight data is downloaded. Sharded repos (those with model.safetensors.index.json) work transparently: inspect::inspect_repo_safetensors reads every shard’s header in parallel and returns a flat per-file result list. See examples/candle_inspect.rs for a runnable example, or the Inspect tutorial for a narrative walkthrough.

let results = hf_fetch_model::inspect::inspect_repo_safetensors(
    "EleutherAI/pythia-1.4b", None, None,
).await?;

for (filename, header, _source) in &results {
    println!("{filename}: {} tensors", header.tensors.len());
}
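
Header-only inspection is possible because a .safetensors file starts with an 8-byte little-endian length, followed by a JSON header of that many bytes, followed by the tensor data. The std-only sketch below shows the byte arithmetic behind the Range requests. The helper names are hypothetical, and whether the crate issues two requests or a single speculative one is not stated here.

```rust
/// Size of the little-endian length prefix at the start of a .safetensors file.
const PREFIX_LEN: u64 = 8;

/// Decodes the 8-byte prefix into the length of the JSON header that follows.
fn header_len(prefix: [u8; 8]) -> u64 {
    u64::from_le_bytes(prefix)
}

/// Formats an HTTP Range header value covering `len` bytes from `start`.
/// Range end offsets are inclusive. Returns None for an empty or overflowing span.
fn range_value(start: u64, len: u64) -> Option<String> {
    if len == 0 {
        return None;
    }
    let end = start.checked_add(len - 1)?;
    Some(format!("bytes={start}-{end}"))
}

/// Offset of the first tensor byte, i.e. where the weight data begins.
fn data_offset(prefix: [u8; 8]) -> Option<u64> {
    PREFIX_LEN.checked_add(header_len(prefix))
}
```

A client would first request range_value(0, PREFIX_LEN), decode the prefix, then request range_value(PREFIX_LEN, header_len(prefix)) to obtain the JSON header without touching any bytes at or beyond data_offset.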

The CLI also exposes hf-fm inspect <repo> [FILE] --check-gpu [N] (v0.10.1) to print a one-line GPU-fit verdict against device N (default 0) using the hypomnesis crate (NVML on Linux/Windows, DXGI on Windows). The verdict is currently available only in the binary, with no library equivalent; depend on hypomnesis directly if you need the device-info numbers from library code.

§HuggingFace Cache

Downloaded files are stored in the standard HuggingFace cache directory (~/.cache/huggingface/hub/), ensuring compatibility with Python tooling.
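
For orientation, the sketch below builds a path in the standard hub cache layout, where a repo such as org/name lives under models--org--name and each revision has its own directory under snapshots/. The function names are hypothetical and do not mirror the crate's cache_layout module; the hub root is passed in rather than resolved from the environment.

```rust
use std::path::{Path, PathBuf};

/// Folder name used for a model repo: "org/name" becomes "models--org--name".
fn repo_folder_name(repo_id: &str) -> String {
    format!("models--{}", repo_id.replace('/', "--"))
}

/// Path of `filename` inside the snapshot directory for `commit`.
/// `hub_root` is the hub cache directory, e.g. ~/.cache/huggingface/hub.
fn snapshot_file_path(hub_root: &Path, repo_id: &str, commit: &str, filename: &str) -> PathBuf {
    hub_root
        .join(repo_folder_name(repo_id))
        .join("snapshots")
        .join(commit)
        .join(filename)
}
```

Because Python tooling uses the same layout, a file placed at such a path can be found by either side without copying.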

§Cache Management

v0.10.0 adds library APIs for inspecting, verifying, and pruning the local cache. cache::cache_summary enumerates every cached repo with size and file counts; cache::repo_status gives a per-file Complete / Partial / Missing breakdown for one repo; cache::verify_cache re-checks SHA256 digests of cached files against HuggingFace LFS metadata; and cache::find_partial_files locates .chunked.part orphans from interrupted downloads.

For long verifications (multi-GiB safetensors files), drive cache::verify_cache_with_progress with an Fn callback that receives cache::VerifyEvent values, so a CLI or GUI can render a spinner or progress bar without polling.

use hf_fetch_model::cache::{self, VerifyStatus};

let results = cache::verify_cache("google/gemma-2-2b-it", None, None).await?;
let ok = results
    .iter()
    .filter(|r| matches!(r.status, VerifyStatus::Ok))
    .count();
let mismatch = results
    .iter()
    .filter(|r| matches!(r.status, VerifyStatus::Mismatch { .. }))
    .count();
println!("{}/{} files verified, {} mismatches", ok, results.len(), mismatch);

§Download Durability

Multi-connection downloads survive interruption. When a download is aborted by FetchConfigBuilder::timeout_per_file (default 300 s), Ctrl-C, a panic, or a transient chunk error, the partial .chunked.part file and a small per-chunk progress sidecar are kept on disk. The next call to download_with_config for the same file picks up where it stopped, provided the upstream etag still matches: each parallel chunk sends a fresh Range request that skips the bytes it already has. If the etag has changed, the sidecar schema version differs, or FetchConfigBuilder::connections_per_file is set to a different count, the partial is discarded and a fresh download starts.
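
The resume decision can be sketched with std-only Rust. Everything here is an assumption made for illustration: the Sidecar struct, its field names, the schema version value, and the even split of the file across connections are hypothetical and are not the crate's on-disk format or chunking strategy.

```rust
/// Hypothetical stand-in for the per-chunk progress sidecar.
#[derive(Debug, Clone, PartialEq)]
struct Sidecar {
    schema_version: u32,
    etag: String,
    connections: usize,
    /// Bytes already on disk for each chunk, in chunk order.
    done: Vec<u64>,
}

/// Assumed current sidecar schema version.
const SCHEMA_VERSION: u32 = 1;

/// Splits `total` bytes into contiguous (start, len) chunks, one per connection.
fn chunk_bounds(total: u64, connections: usize) -> Vec<(u64, u64)> {
    let n = connections.max(1) as u64;
    let (base, rem) = (total / n, total % n);
    let mut start = 0u64;
    (0..n)
        .map(|i| {
            // The first `rem` chunks take one extra byte each.
            let len = base + u64::from(i < rem);
            let bounds = (start, len);
            start += len;
            bounds
        })
        .collect()
}

/// Returns the Range header values still to fetch, or None when the
/// partial file must be discarded and the download restarted.
fn resume_ranges(side: &Sidecar, total: u64, etag: &str, connections: usize) -> Option<Vec<String>> {
    let stale = connections == 0
        || side.schema_version != SCHEMA_VERSION
        || side.etag != etag
        || side.connections != connections
        || side.done.len() != connections;
    if stale {
        return None;
    }
    let bounds = chunk_bounds(total, connections);
    // A chunk claiming more bytes than its length means the sidecar is corrupt.
    if bounds.iter().zip(&side.done).any(|(&(_, len), &done)| done > len) {
        return None;
    }
    let ranges: Vec<String> = bounds
        .iter()
        .zip(&side.done)
        // Fully downloaded chunks need no further request.
        .filter(|&(&(_, len), &done)| done < len)
        .map(|(&(start, len), &done)| format!("bytes={}-{}", start + done, start + len - 1))
        .collect();
    Some(ranges)
}
```

Each returned value starts at the first missing byte of its chunk, which is the property that lets a resumed chunk skip the bytes it already has.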

For slow connections on multi-GiB files, raise the per-file budget to match real throughput:

use std::time::Duration;
use hf_fetch_model::FetchConfig;

let config = FetchConfig::builder()
    .timeout_per_file(Duration::from_secs(1800))
    .build()?;
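
One way to pick that value is to derive it from the file size and a measured throughput. The helper below is a hypothetical std-only sketch, not part of the crate: it rounds the transfer time up to whole seconds, adds a percentage margin, and never goes below a floor such as the 300 s default.

```rust
use std::time::Duration;

/// Estimates a per-file timeout from the file size and a measured throughput.
/// Returns None when the throughput is zero or the arithmetic overflows.
fn timeout_budget(
    size_bytes: u64,
    bytes_per_sec: u64,
    margin_percent: u64,
    floor: Duration,
) -> Option<Duration> {
    if bytes_per_sec == 0 {
        return None;
    }
    // Ceiling division: a partial second still counts as a full second.
    let transfer_secs = size_bytes / bytes_per_sec + u64::from(size_bytes % bytes_per_sec != 0);
    let factor = 100u64.checked_add(margin_percent)?;
    let padded_secs = transfer_secs.checked_mul(factor)? / 100;
    // Never return less than the floor, even for tiny files.
    Some(Duration::from_secs(padded_secs).max(floor))
}
```

For example, a 5 GiB file at 4 MiB/s needs 1280 s of transfer time, so a 50% margin yields a budget of 1920 s, which could then be passed to timeout_per_file.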

§Authentication

Set the HF_TOKEN environment variable to access private or gated models, or use FetchConfig::builder().token().
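
As a rough illustration of how the two token sources might combine, the std-only sketch below assumes that an explicitly configured token takes precedence over the environment value and that blank values are ignored. Both rules are assumptions, and the function names are hypothetical rather than the crate's API. The Hub accepts user access tokens as a Bearer credential in the Authorization header.

```rust
/// Picks the token to use: the explicit value wins over the environment
/// value (assumed precedence). Whitespace is trimmed and blanks are skipped.
fn resolve_token(explicit: Option<&str>, env_value: Option<&str>) -> Option<String> {
    explicit
        .into_iter()
        .chain(env_value)
        .map(str::trim)
        .find(|t| !t.is_empty())
        .map(str::to_owned)
}

/// Formats the Authorization header value for a user access token.
fn authorization_value(token: &str) -> String {
    format!("Bearer {token}")
}
```

A caller could feed the environment side with std::env::var("HF_TOKEN").ok().as_deref() and attach the result of authorization_value to each request.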

§Re-exports

pub use config::compile_glob_patterns;
pub use config::file_matches;
pub use config::has_glob_chars;
pub use config::FetchConfig;
pub use config::FetchConfigBuilder;
pub use config::Filter;
pub use discover::DiscoveredFamily;
pub use discover::GateStatus;
pub use discover::ModelCardMetadata;
pub use discover::SearchResult;
pub use download::DownloadOutcome;
pub use error::FetchError;
pub use error::FileFailure;
pub use inspect::AdapterConfig;
pub use plan::download_plan;
pub use plan::DownloadPlan;
pub use plan::FilePlan;
pub use progress::ProgressEvent;
pub use progress::ProgressReceiver;

§Modules

cache: HuggingFace cache directory resolution, model family scanning, disk usage, and integrity verification.
cache_layout: Centralized hf-hub cache path construction.
checksum: SHA256 checksum verification for downloaded files.
config: Configuration for model downloads.
discover: Model family discovery and search via the HuggingFace Hub API.
download: Download orchestration for HuggingFace model repositories.
error: Error types for hf-fetch-model.
inspect: Safetensors header inspection (local and remote).
plan: Download plan: metadata-only analysis of what needs downloading.
progress: Progress reporting for model downloads.
repo: Repository file listing via the HuggingFace API.

§Functions

build_client: Builds a reqwest::Client with auth token, user-agent, and 30-second TCP connect timeout.
download: Downloads all files from a HuggingFace model repository.
download_blocking: Blocking version of download() for non-async callers.
download_file: Downloads a single file from a HuggingFace model repository.
download_file_blocking: Blocking version of download_file() for non-async callers.
download_files: Downloads all files from a HuggingFace model repository and returns a filename → path map.
download_files_blocking: Blocking version of download_files() for non-async callers.
download_files_with_config: Downloads files from a HuggingFace model repository using the given configuration and returns a filename → path map.
download_files_with_config_blocking: Blocking version of download_files_with_config() for non-async callers.
download_with_config: Downloads files from a HuggingFace model repository using the given configuration.
download_with_config_blocking: Blocking version of download_with_config() for non-async callers.
download_with_plan: Downloads files according to an existing DownloadPlan.
download_with_plan_blocking: Blocking version of download_with_plan() for non-async callers.