Crate glob_set


§glob-set

A globset-compatible glob matcher – no_std, no regex, built on glob-matcher

§Usage

use glob_set::{Glob, GlobSet, GlobSetBuilder};

let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("*.rs").unwrap());
builder.add(Glob::new("*.toml").unwrap());
let set = builder.build().unwrap();

assert!(set.is_match("main.rs"));
assert!(set.is_match("Cargo.toml"));
assert!(!set.is_match("index.js"));

§Features

  • no_std compatible (uses alloc)
  • No regex dependency – built on glob-matcher
  • Aho-Corasick pre-filter for efficient multi-pattern matching
  • API-compatible with globset: Glob, GlobSet, GlobBuilder, Candidate
  • Supports *, **, ?, [...], and {a,b} patterns
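The pre-filter idea from the list above can be illustrated with a small, dependency-free sketch. It is not glob-set's implementation: `required_literal` and `PrefilterSet` are hypothetical names, and a plain `str::contains` scan per pattern stands in for a real Aho-Corasick automaton, which would find all literals in a single pass over the path. The sketch only narrows down which patterns are worth a full match.

```rust
/// Longest literal run that every matching path must contain.
/// Returns "" (no usable literal) for patterns with classes, braces or escapes.
fn required_literal(pattern: &str) -> &str {
    // Assumption: bail out on syntax whose literals are optional or escaped.
    if pattern.contains(|c: char| matches!(c, '[' | '{' | '\\')) {
        return "";
    }
    // Split on wildcards and on '/', so a literal never spans a `**/` boundary.
    pattern
        .split(|c: char| matches!(c, '*' | '?' | '/'))
        .max_by_key(|s| s.len())
        .unwrap_or("")
}

/// Hypothetical pre-filtered pattern set: (pattern, required literal) pairs.
struct PrefilterSet {
    entries: Vec<(String, String)>,
}

impl PrefilterSet {
    fn new(patterns: &[&str]) -> Self {
        let entries = patterns
            .iter()
            .map(|p| (p.to_string(), required_literal(p).to_string()))
            .collect();
        PrefilterSet { entries }
    }

    /// Patterns that might match `path`; the rest are ruled out cheaply.
    fn candidates(&self, path: &str) -> Vec<&str> {
        self.entries
            .iter()
            .filter(|(_, lit)| lit.is_empty() || path.contains(lit.as_str()))
            .map(|(pat, _)| pat.as_str())
            .collect()
    }
}
```

A surviving candidate is not yet a match: for the path `src/a/mod.rs`, the pattern `*.rs` passes this filter because the path contains `.rs`, and only the full matcher would then decide whether it matches.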

§Benchmarks

All benchmarks use Criterion and compare glob-set against the original globset crate. The single-pattern benchmarks (ext, short, long, many_short) are sourced from the upstream ripgrep globset benchmarks. The multi-pattern benchmarks (glob_set vs globset) test 8 patterns against 10 paths, representative of typical schema-catalog file matching.

Run them with:

cargo bench -p glob-set

§Multi-pattern GlobSet (8 patterns x 10 paths)

| Benchmark | Time | vs globset |
| --- | --- | --- |
| glob_set | 289 ns | 1.3x faster |
| globset | 379 ns | baseline |
| tiny_glob_set | 313 ns | 1.2x faster |

§Build time (8 patterns)

| Benchmark | Time | vs globset |
| --- | --- | --- |
| glob_set_build | 5.5 µs | 8.4x faster |
| globset_build | 46 µs | baseline |
| tiny_glob_set_build | 2.2 µs | 21x faster |

The build-time advantage comes from avoiding regex compilation entirely. This matters in applications that recompile pattern sets frequently (e.g. re-reading a schema catalog on file change).

§Single-pattern matching (upstream ripgrep benchmarks)

| Benchmark | glob-set | globset | Notes |
| --- | --- | --- | --- |
| ext (*.txt) | 53 ns | 53 ns | Tied |
| short (some/**/needle.txt) | 51 ns | 23 ns | Regex wins on ** |
| long (some/**/needle.txt, deep path) | 285 ns | 53 ns | Regex wins on ** |
| many_short (14-pattern set) | 249 ns | 102 ns | Regex wins on set matching |

For single-pattern ** matching, and for the upstream many_short set benchmark, globset’s regex backend is faster. glob-set's advantage shows in the 8-pattern GlobSet benchmark above (where the Aho-Corasick pre-filter and strategy engine apply) and, most of all, in build times.

§Differences from globset

glob-set follows POSIX/gitignore semantics where * does not match path separators (/). In globset, * crosses / by default.

| Pattern | Path | glob-set | globset |
| --- | --- | --- | --- |
| *.c | foo.c | match | match |
| *.c | src/foo.c | no match | match |
| **/*.c | src/foo.c | match | match |

If you need recursive matching, use ** explicitly (e.g. **/*.c instead of *.c). This is consistent with how .gitignore and most POSIX glob implementations behave.
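The separator rule can be made concrete with a toy matcher. This is a minimal sketch, not the implementation used by glob-set or glob-matcher: `glob_match` and `match_bytes` are hypothetical names, and only literals, `?`, `*`, `**/` and a trailing `**` are handled (no classes or braces).

```rust
/// Illustrative matcher: `*` and `?` stop at '/', while `**` may cross it.
fn match_bytes(pat: &[u8], path: &[u8]) -> bool {
    match pat {
        // An exhausted pattern matches only an exhausted path.
        [] => path.is_empty(),
        // `**/` matches zero or more whole directory components.
        [b'*', b'*', b'/', rest @ ..] => {
            if match_bytes(rest, path) {
                return true;
            }
            for (i, &b) in path.iter().enumerate() {
                if b == b'/' && match_bytes(rest, &path[i + 1..]) {
                    return true;
                }
            }
            false
        }
        // A trailing `**` matches whatever is left, separators included.
        [b'*', b'*'] => true,
        // A single `*` consumes any run of bytes that contains no '/'.
        [b'*', rest @ ..] => {
            for i in 0..=path.len() {
                if match_bytes(rest, &path[i..]) {
                    return true;
                }
                if i < path.len() && path[i] == b'/' {
                    break; // `*` must not step over a separator
                }
            }
            false
        }
        // `?` matches exactly one byte other than '/'.
        [b'?', rest @ ..] => match path {
            [c, tail @ ..] if *c != b'/' => match_bytes(rest, tail),
            _ => false,
        },
        // Any other byte must match literally.
        [p, rest @ ..] => match path {
            [c, tail @ ..] if c == p => match_bytes(rest, tail),
            _ => false,
        },
    }
}

/// Convenience wrapper over string slices.
fn glob_match(pattern: &str, path: &str) -> bool {
    match_bytes(pattern.as_bytes(), path.as_bytes())
}
```

With this sketch, `glob_match("*.c", "src/foo.c")` is false while `glob_match("**/*.c", "src/foo.c")` is true, mirroring the glob-set column of the table. The recursion backtracks naively and is meant for illustration only.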

§Acknowledgments

This crate is based on the API and design of globset by Andrew Gallant (BurntSushi), part of the ripgrep project. The original globset crate is an excellent, battle-tested library; glob-set reimplements its API surface with a no_std-compatible, regex-free backend built on glob-matcher.

§License

Apache-2.0

§Structs

Candidate
A pre-processed path for matching against multiple patterns.
Error
An error that occurs when parsing a glob pattern.
Glob
A single glob pattern.
GlobBuilder
A builder for configuring a glob pattern.
GlobMap
A map from glob patterns to values.
GlobMapBuilder
A builder for constructing a GlobMap.
GlobMatcher
A compiled matcher for a single glob pattern.
GlobSet
A set of glob patterns that can be matched against paths efficiently.
GlobSetBuilder
A builder for constructing a GlobSet.
TinyGlobSet
A lightweight glob set that trades query speed for fast construction.
TinyGlobSetBuilder
A builder for constructing a TinyGlobSet.

§Enums

ErrorKind
The kind of error that can occur when parsing a glob pattern.

§Functions

escape
Escape all special glob characters in the given string.
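To show what escaping means in practice, here is a minimal sketch of one common strategy: wrapping each metacharacter in a one-character class so that it matches only itself. The function name `escape_sketch` is hypothetical, and both the escaping style and the exact set of characters are assumptions; the crate's `escape` may produce different output (for example, backslash escapes).

```rust
/// Hypothetical escape: wrap each glob metacharacter in a one-character class.
/// Assumption: the special characters are `*`, `?`, `[`, `]`, `{` and `}`.
fn escape_sketch(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if matches!(c, '*' | '?' | '[' | ']' | '{' | '}') {
            // `[*]` is a class containing only `*`, so it matches a literal `*`.
            out.push('[');
            out.push(c);
            out.push(']');
        } else {
            out.push(c);
        }
    }
    out
}
```

Escaping of this kind is useful when a file name that comes from user input has to be embedded in a larger pattern without its characters being read as wildcards.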