Crate fetch_source


Declare external source dependencies in Cargo.toml and fetch them programmatically.

Sources (git repositories and tar archives) are declared under [package.metadata.fetch-source] in your Cargo.toml. The crate is intended for build scripts that generate Rust bindings from one or more external sources.

Inspired by CMake’s FetchContent module.

§Core Features

  • Define sources directly in your project metadata.
  • Cache fetched sources for efficient sharing between projects.
  • Clone git repositories (possibly recursively) by branch, tag, or specific commit (requires git to be installed and available on PATH).

§Optional Features

  • tar: Download and extract .tar.gz archives. This feature is optional because it depends on the reqwest crate, which pulls in a substantial number of additional dependencies.
  • rayon: Fetch sources in parallel with rayon.

§Basic Usage

Parse external sources declared in your Cargo.toml like so:

// Imagine this is in your Cargo.toml:
let cargo_toml = r#"
[package.metadata.fetch-source]
my-repo = { git = "https://github.com/user/repo.git", recursive = true }
other-repo = { git = "https://github.com/user/project.git", branch = "the-feature" }
my-data = { tar = "https://example.com/data.tar.gz" }
"#;

for (name, source) in fetch_source::try_parse_toml(cargo_toml)? {
    println!("{name}: {source}");
}

Fetch all sources into a directory:

use std::path::PathBuf;

let cargo_toml = r#"
[package.metadata.fetch-source]
"syn::latest" = { git = "https://github.com/dtolnay/syn.git" }
"syn::1.0.0" = { tar = "https://github.com/dtolnay/syn/archive/refs/tags/1.0.0.tar.gz" }
"#;

let out_dir: PathBuf = std::env::temp_dir();
for err in fetch_source::try_parse_toml(cargo_toml)?.into_iter()
    .map(|(_, source)| source.fetch(&out_dir))
    .filter_map(Result::err) {
    eprintln!("{err}");
}

With rayon, it’s trivial to fetch sources in parallel:

use rayon::prelude::*;
use std::path::PathBuf;

let cargo_toml = r#"
[package.metadata.fetch-source]
"syn::latest" = { git = "https://github.com/dtolnay/syn.git" }
"syn::1.0.0" = { tar = "https://github.com/dtolnay/syn/archive/refs/tags/1.0.0.tar.gz" }
"#;

let out_dir: PathBuf = std::env::temp_dir();
fetch_source::try_parse_toml(cargo_toml)?.into_par_iter()
    .map(|(_, source)| source.fetch(&out_dir))
    .filter_map(Result::err)
    .for_each(|err| eprintln!("{err}"));

§Caching Sources

Cache sources for efficient sharing across repeated builds. Refer to the same source across different builds or projects by using the same source definition in Cargo.toml.

use fetch_source::Cache;

let cache = Cache::load_or_create(std::env::temp_dir())?;

let project1 = r#"
[package.metadata.fetch-source]
"syn::latest" = { git = "https://github.com/dtolnay/syn.git" }
"#;

let sources1 = fetch_source::try_parse_toml(project1)?;
// Check where this source would be cached
let cache_latest = cache.cached_path(&sources1.get("syn::latest").unwrap());

// Note the re-use of 'syn::latest' with a different definition!
let project2 = r#"
[package.metadata.fetch-source]
"syn::greatest" = { git = "https://github.com/dtolnay/syn.git" }
"syn::latest" = { git = "https://github.com/dtolnay/syn.git", branch = "dev" }
"#;

let sources2 = fetch_source::try_parse_toml(project2)?;
let cache_greatest = cache.cached_path(&sources2.get("syn::greatest").unwrap());
let cache_dev = cache.cached_path(&sources2.get("syn::latest").unwrap());

// The same definition under a different name, in a different project, maps to the same cache path
assert_eq!(cache_latest, cache_greatest);

// The name doesn't uniquely identify a source - only the definition of the source matters
assert_ne!(cache_latest, cache_dev);
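
The behaviour asserted above, where the cache location depends on the definition and not on the name, can be sketched with the standard library alone. Everything in this sketch is an assumption made for illustration: GitDef and cache_path are hypothetical names, and hashing with DefaultHasher merely stands in for whatever the crate's Digest type actually computes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a git source definition.
/// This is NOT the crate's `Git` type.
#[derive(Hash)]
struct GitDef<'a> {
    url: &'a str,
    branch: Option<&'a str>,
}

/// Derive a cache subdirectory from the definition alone.
/// The name used in Cargo.toml is deliberately not an input.
fn cache_path(root: &Path, def: &GitDef<'_>) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    def.hash(&mut hasher);
    // Use the 64-bit hash, printed as hex, as the directory name.
    root.join(format!("{:016x}", hasher.finish()))
}
```

Two GitDef values with the same url and branch produce the same path regardless of what they are called in Cargo.toml, while adding branch = "dev" changes the path. The standard library does not guarantee that DefaultHasher output is stable across Rust releases, so a cache that persists on disk would more plausibly use a stable digest; that choice is outside what this sketch shows.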

§Declaring sources

Each key in the package.metadata.fetch-source table names a remote source. A name can include any path character and zero or more ‘::’ separators. Each ::-separated component of a name maps to a subdirectory of the output directory.
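
As a rough illustration of the name-to-directory mapping, the sketch below turns a name such as "syn::latest" into a nested path under an output directory. The helper source_subdir is hypothetical and is not part of the crate's API; it only assumes that each ::-separated component becomes one path component.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: map a source name such as "syn::latest" to a
/// directory beneath `out_dir`, one path component per `::`-separated part.
fn source_subdir(out_dir: &Path, name: &str) -> PathBuf {
    name.split("::")
        .fold(out_dir.to_path_buf(), |dir, part| dir.join(part))
}
```

Under this assumption, "syn::latest" and "syn::1.0.0" would land in sibling directories beneath a shared syn directory, and a name without separators such as "my-repo" would map to a single subdirectory.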

Each value in the package.metadata.fetch-source table must be a table which identifies the remote source it represents:

Tar archives

  • The tar key gives the URL of the archive.

Git repos

  • The git key gives the SSH or HTTPS upstream URL.
  • One of the branch, tag, or rev keys selects what to clone. If none is given, the default branch is cloned.
  • Use recursive = true to recursively clone submodules.
  • All clones are shallow, i.e. with a depth of 1.
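
To make the git options above concrete, the following sketch builds the argument list of an equivalent shallow git clone invocation. It is an assumption about how such a clone could be expressed on the command line, not a description of what the crate runs internally; shallow_clone_args and its parameters are hypothetical names. It covers only branch and tag references, because git clone --branch does not accept a commit hash.

```rust
/// Hypothetical builder for the arguments of a shallow `git clone`.
/// `reference` is a branch or tag name; `None` clones the default branch.
fn shallow_clone_args(
    url: &str,
    reference: Option<&str>,
    recursive: bool,
    dest: &str,
) -> Vec<String> {
    // A depth of 1 fetches only the tip commit.
    let mut args: Vec<String> = vec!["clone".into(), "--depth".into(), "1".into()];
    if let Some(name) = reference {
        // `--branch` accepts both branch names and tag names.
        args.push("--branch".into());
        args.push(name.into());
    }
    if recursive {
        // Clone submodules too, keeping them shallow as well.
        args.push("--recurse-submodules".into());
        args.push("--shallow-submodules".into());
    }
    args.push(url.into());
    args.push(dest.into());
    args
}
```

The resulting vector could be passed to std::process::Command::new("git").args(...), which is consistent with the requirement that git be available on PATH. Checking out a specific rev would need a different command sequence, which this sketch does not attempt.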

Structs§

  • Artefact: Represents a source that has been fetched from a remote location.
  • Cache: Owns data about cached sources and is responsible for its persistence.
  • CacheDir: The absolute path to a cached artefact.
  • CacheItems: Records data about the cached sources and where their artefacts are within a Cache.
  • CacheRoot: The root directory of a cache.
  • Digest: The digest associated with the definition of a Source.
  • Error: The main error type for this crate.
  • FetchError: Errors that occur during fetching.
  • Git: Represents a remote git repository to be cloned.
  • RelCacheDir: The path of a cached artefact relative to the cache root.
  • Tar: Represents a remote tar archive to be downloaded and extracted.

Enums§

  • ErrorKind: The different kinds of error that can be emitted by this crate.
  • Source: Represents an entry in the package.metadata.fetch-source table.
  • SourceParseError: Errors encountered when parsing sources from Cargo.toml.

Functions§

  • cache_all_par: Convenience function to update the given cache with all missing sources in parallel. Returns any errors that occurred when fetching the missing sources.
  • fetch_all: Convenience function to fetch all sources serially.
  • fetch_all_par: Convenience function to fetch all sources in parallel.
  • load_sources: Convenience function to load sources from Cargo.toml in the given directory.
  • try_parse_toml: Parse the contents of a Cargo.toml file containing the package.metadata.fetch-source table into a SourcesTable map.

Type Aliases§

  • FetchResult: Represents the result of a fetch operation.
  • SourceName: The name of a source.
  • SourcesTable: Represents the contents of the package.metadata.fetch-source table in a Cargo.toml file.