Crate globmatch

This crate provides cross-platform matching for globs with relative path prefixes.

CLI utilities commonly operate on a set of files. Such a set is provided either directly, as parameters to the tool, or via configuration files. A configuration file makes it easier to determine the location of a file, since the path can be specified relative to the configuration file. Consider, e.g., the following .json input:

{
  "globs": [
    "../../../some/text-files/**/*.txt",
    "other/inputs/*.md",
    "paths/from/dir[0-9]/*.*"
  ]
}

Specifying these paths in a dedicated configuration file allows them to be resolved independently of where the script operating on these files is invoked: the location of the configuration file is used as the base directory.
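To illustrate the idea of using the configuration file's location as the base directory, here is a minimal sketch that relies only on the standard library. The helpers normalize, base_dir and resolve_prefix are hypothetical names introduced for this illustration and are not part of this crate's API. The normalization is purely lexical (no file system access, no symlink resolution), and only the literal path prefix of a glob is resolved, not the pattern itself.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: lexically normalizes a path. `.` is dropped and `..`
/// cancels the preceding normal component. The file system is not consulted.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                // Pop a normal component if there is one; otherwise keep the `..`.
                let can_pop =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                if can_pop {
                    out.pop();
                } else {
                    out.push("..");
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Hypothetical helper: the directory that relative paths in `config_file`
/// are resolved against. Falls back to `.` for a bare file name.
fn base_dir(config_file: &Path) -> PathBuf {
    match config_file.parent() {
        Some(dir) if !dir.as_os_str().is_empty() => dir.to_path_buf(),
        _ => PathBuf::from("."),
    }
}

/// Hypothetical helper: resolves the literal prefix of a glob relative to
/// the location of the configuration file.
fn resolve_prefix(config_file: &Path, relative_prefix: &str) -> PathBuf {
    normalize(&base_dir(config_file).join(relative_prefix))
}
```

With a configuration file at the (made-up) location /home/user/project/tools/cfg/globs.json, the prefix ../../../some/text-files resolves to /home/user/some/text-files, regardless of the directory the tool was started from.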

This crate combines the features of the existing crates globset and walkdir to implement a relative glob matcher:

  • A Builder is created for each glob in the same style as in globset::Glob.
  • A Matcher is created from the Builder using Builder::build. This call resolves the relative path components within the glob by “moving” it to the specified root directory.
  • The Matcher is then transformed into an iterator yielding path::PathBuf.

For the previous example it would be sufficient to use one builder per glob and to specify the root folder when building the pattern (see examples below).
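The "moving" of relative path components can be pictured as splitting a glob into a literal path prefix and the remaining pattern. The following sketch shows one way to do that with the standard library only. It is not this crate's implementation: split_glob and GLOB_META are hypothetical names, and the set of metacharacters is an assumption chosen for the example.

```rust
/// Characters treated as the start of glob syntax in this sketch (assumption).
const GLOB_META: [char; 4] = ['*', '?', '[', '{'];

/// Hypothetical helper: splits `glob` into (literal prefix, pattern).
/// The pattern starts at the first path segment that contains a
/// metacharacter; without any metacharacter, the last segment is the pattern.
fn split_glob(glob: &str) -> (&str, &str) {
    // Byte index of the first metacharacter, or the end of the string.
    let first_meta = glob
        .find(|c: char| GLOB_META.contains(&c))
        .unwrap_or(glob.len());

    // The literal prefix ends at the last `/` before that index.
    match glob[..first_meta].rfind('/') {
        Some(slash) => (&glob[..slash], &glob[slash + 1..]),
        None => ("", glob),
    }
}
```

Applied to the globs of the .json input above, this yields the pairs (../../../some/text-files, **/*.txt), (other/inputs, *.md) and (paths/from, dir[0-9]/*.*). The prefix is the part that can be joined with the root directory, while the pattern is what remains to be matched below that directory.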

Globs

Please check the documentation of globset for the available glob format.

Example: A simple match.

The following example uses the files stored in the test-files/c-simple folder. We’re trying to match all the .txt files using the glob test-files/c-simple/**/*.txt (where test-files/c-simple is the only relative path component).

/*
    Example files:
    globmatch/test-files/c-simple/.hidden
    globmatch/test-files/c-simple/.hidden/h_1.txt
    globmatch/test-files/c-simple/.hidden/h_0.txt
    globmatch/test-files/c-simple/a/a2/a2_0.txt
    globmatch/test-files/c-simple/a/a0/a0_0.txt
    globmatch/test-files/c-simple/a/a0/a0_1.txt
    globmatch/test-files/c-simple/a/a0/A0_3.txt
    globmatch/test-files/c-simple/a/a0/a0_2.md
    globmatch/test-files/c-simple/a/a1/a1_0.txt
    globmatch/test-files/c-simple/some_file.txt
    globmatch/test-files/c-simple/b/b_0.txt
 */

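use globmatch;

// Build a matcher for the glob, resolved relative to the crate root.
let matcher = globmatch::Builder::new("test-files/c-simple/**/*.txt")
    .build(env!("CARGO_MANIFEST_DIR"))?;

// flatten() keeps only the successfully yielded paths.
let paths: Vec<_> = matcher.into_iter()
    .flatten()
    .collect();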

println!(
    "paths:\n{}",
    paths
        .iter()
        .map(|p| format!("{}", p.to_string_lossy()))
        .collect::<Vec<_>>()
        .join("\n")
);

// The listing above contains 9 .txt files, including the 2 in .hidden.
assert_eq!(6 + 2 + 1, paths.len());

Example: Specifying options and using .filter_entry.

Similar to the builder pattern of globset::GlobBuilder, this crate allows options (currently just case sensitivity) to be passed to the builder.

In addition, the filter_entry function from walkdir is accessible, but only as a single call (this crate does not implement a recursive iterator). This function filters files and folders before they are matched against the provided glob, which makes it possible to exclude files and folders efficiently, e.g., hidden folders:

use globmatch;

let root = env!("CARGO_MANIFEST_DIR");
let pattern = "test-files/c-simple/**/[ah]*.txt";

let matcher = globmatch::Builder::new(pattern)
    .case_sensitive(true)
    .build(root)?;

// Hidden entries are filtered out before the glob is matched.
let paths: Vec<_> = matcher
    .into_iter()
    .filter_entry(|p| !globmatch::is_hidden_entry(p))
    .flatten()
    .collect();

assert_eq!(4, paths.len());
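Filtering entries before matching is efficient because a rejected folder is never descended into. The following standard-library sketch illustrates that pruning behaviour. It is not how this crate or walkdir is implemented: is_hidden and walk_pruned are hypothetical helpers, and is_hidden only approximates a hidden-entry check by testing for a leading dot in the last path component.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: true if the last path component starts with a dot.
fn is_hidden(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.starts_with('.'))
        .unwrap_or(false)
}

/// Hypothetical helper: collects all files below `dir` into `out`.
/// Entries rejected by `keep` are skipped, and rejected directories are
/// not read at all, so nothing below them is ever visited.
fn walk_pruned(
    dir: &Path,
    keep: &dyn Fn(&Path) -> bool,
    out: &mut Vec<PathBuf>,
) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if !keep(&path) {
            continue; // pruned: no recursion into this entry
        }
        if path.is_dir() {
            walk_pruned(&path, keep, out)?;
        } else {
            out.push(path);
        }
    }
    Ok(())
}
```

Calling walk_pruned with a predicate such as |p: &Path| !is_hidden(p) on a folder laid out like test-files/c-simple would skip the .hidden folder as a whole, so its two .txt files are never compared against any glob.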

Example: Filtering with .build_glob.

The above examples demonstrated how to search for paths using this crate. Two more builder functions are available for additional matching on the paths yielded by the iterator, e.g., to further limit the files based on a global blacklist.

  • Builder::build_glob to create a single Glob (caution: the builder only checks that the pattern is not empty, but allows absolute paths).
  • Builder::build_glob_set to create a Glob matcher that contains two globs [glob, **/glob] out of the specified glob parameter of Builder::new. The pattern must not be an absolute path.

use globmatch;

let root = env!("CARGO_MANIFEST_DIR");
let pattern = "test-files/c-simple/**/a*.*";

let matcher = globmatch::Builder::new(pattern)
    .case_sensitive(true)
    .build(root)?;

// A second matcher, applied to the paths yielded by the iterator.
let glob = globmatch::Builder::new("*.txt").build_glob_set()?;

let paths: Vec<_> = matcher
    .into_iter()
    .filter_entry(|p| !globmatch::is_hidden_entry(p))
    .flatten()
    .filter(|p| glob.is_match(p))
    .collect();

assert_eq!(4, paths.len());
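The [glob, **/glob] pair described for Builder::build_glob_set can be pictured with the following sketch, which only produces the two pattern strings and applies the two documented restrictions (non-empty, not absolute). glob_pair is a hypothetical function written for this illustration. How the crate itself derives and compiles the two globs is not shown here.

```rust
use std::path::Path;

/// Hypothetical helper: returns the patterns `[glob, **/glob]` for a
/// relative, non-empty glob, and an error message otherwise.
fn glob_pair(glob: &str) -> Result<[String; 2], String> {
    if glob.is_empty() {
        return Err("the pattern must not be empty".to_string());
    }
    // Treat a leading `/` as absolute on every platform (assumption).
    if glob.starts_with('/') || Path::new(glob).is_absolute() {
        return Err(format!("the pattern must not be absolute: {}", glob));
    }
    Ok([glob.to_string(), format!("**/{}", glob)])
}
```

For the pattern *.txt used above, this gives *.txt and **/*.txt, i.e. one pattern for the name as written and one for the same name at any depth.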

Modules

  • This module implements common use cases.

Structs

Functions