glob-set 0.0.1

A globset-compatible glob matcher -- `no_std`, no regex, built on glob-matcher

Usage

use glob_set::{Glob, GlobSet, GlobSetBuilder};

// Collect the patterns, then compile them into a single set.
let mut builder = GlobSetBuilder::new();
builder.add(Glob::new("*.rs").unwrap());
builder.add(Glob::new("*.toml").unwrap());
let set: GlobSet = builder.build().unwrap();

// A path matches the set if it matches at least one pattern.
assert!(set.is_match("main.rs"));
assert!(set.is_match("Cargo.toml"));
assert!(!set.is_match("index.js"));

Features

  • no_std compatible (uses alloc)
  • No regex dependency -- built on glob-matcher
  • Aho-Corasick pre-filter for efficient multi-pattern matching
  • API-compatible with globset: Glob, GlobSet, GlobBuilder, Candidate
  • Supports *, **, ?, [...], and {a,b} patterns
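The pre-filter idea can be illustrated without any dependencies. The sketch below is not glob-set's implementation. It substitutes one plain substring check per pattern for a real Aho-Corasick automaton, which would locate all literals in a single pass over the path. It only extracts literals from patterns built from *, ** and ?, and never filters out anything containing [ or {. The names required_literal and candidates are hypothetical.

```rust
/// Longest wildcard-free run in `pat`. Any path matching `pat` must contain it.
/// Patterns using `[` or `{` yield "" so they are never filtered out (conservative).
fn required_literal(pat: &str) -> &str {
    if pat.contains(|c: char| c == '[' || c == '{') {
        return "";
    }
    pat.split(|c: char| c == '*' || c == '?')
        .max_by_key(|s| s.len())
        .unwrap_or("")
}

/// Keep only the patterns whose required literal occurs in `path`.
/// Only these survivors would need a full (more expensive) glob match.
fn candidates<'a>(patterns: &[&'a str], path: &str) -> Vec<&'a str> {
    patterns
        .iter()
        .copied()
        .filter(|p| path.contains(required_literal(p)))
        .collect()
}
```

For the path Cargo.toml and the patterns *.rs, *.toml, src/**/mod.rs and [ab]*.js, only *.toml and [ab]*.js survive the filter. The second is kept because no literal is extracted from it, not because it matches.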

Benchmarks

All benchmarks use Criterion and compare glob-set against the original globset crate. The single-pattern benchmarks (ext, short, long, many_short) are sourced from the upstream ripgrep globset benchmarks. The multi-pattern benchmarks (glob_set vs globset) test 8 patterns against 10 paths, representative of typical schema-catalog file matching.

Run them with:

cargo bench -p glob-set

Multi-pattern GlobSet (8 patterns x 10 paths)

| Benchmark     | Time   | vs globset  |
|---------------|--------|-------------|
| glob_set      | 289 ns | 1.3x faster |
| globset       | 379 ns | baseline    |
| tiny_glob_set | 313 ns | 1.2x faster |

Build time (8 patterns)

| Benchmark           | Time   | vs globset  |
|---------------------|--------|-------------|
| glob_set_build      | 5.5 µs | 8.4x faster |
| globset_build       | 46 µs  | baseline    |
| tiny_glob_set_build | 2.2 µs | 21x faster  |

The build-time advantage comes from avoiding regex compilation entirely. This matters in applications that recompile pattern sets frequently (e.g. re-reading a schema catalog on file change).

Single-pattern matching (upstream ripgrep benchmarks)

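| Benchmark                            | glob-set | globset | Notes                      |
|--------------------------------------|----------|---------|----------------------------|
| ext (*.txt)                          | 53 ns    | 53 ns   | Tied                       |
| short (some/**/needle.txt)           | 51 ns    | 23 ns   | Regex wins on **           |
| long (some/**/needle.txt, deep path) | 285 ns   | 53 ns   | Regex wins on **           |
| many_short (14-pattern set)          | 249 ns   | 102 ns  | Regex wins on set matching |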

For single-pattern ** matching and for the upstream many_short set benchmark, globset's regex backend is faster. glob-set's advantage shows in the 8-pattern GlobSet benchmark above (where the Aho-Corasick pre-filter and strategy engine apply) and, most of all, in build time, where it is 8.4x faster for 8 patterns.

Differences from globset

glob-set follows POSIX/gitignore semantics where * does not match path separators (/). In globset, * crosses / by default.

| Pattern | Path      | glob-set | globset |
|---------|-----------|----------|---------|
| *.c     | foo.c     | match    | match   |
| *.c     | src/foo.c | no match | match   |
| **/*.c  | src/foo.c | match    | match   |

If you need recursive matching, use ** explicitly (e.g. **/*.c instead of *.c). This is consistent with how * behaves in .gitignore patterns and most POSIX glob implementations, where it never matches a /.
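The two behaviours can be reproduced with a small standalone matcher. This is a minimal sketch for illustration only, not the code of either crate. It handles just *, ** and ? on bytes, and the star_crosses_sep flag is a hypothetical switch: false stands for the glob-set behaviour described above, true for the globset default.

```rust
/// Backtracking matcher for `*`, `**` and `?` only (no `[...]` or `{a,b}`).
/// When `star_crosses_sep` is false, `*` and `?` never match '/'.
fn glob_match(pat: &[u8], path: &[u8], star_crosses_sep: bool) -> bool {
    if pat.is_empty() {
        return path.is_empty();
    }
    if pat.starts_with(b"**") {
        let tail = &pat[2..];
        // `**/` may match zero directories, so `**/*.c` also matches `foo.c`.
        if tail.starts_with(b"/") && glob_match(&tail[1..], path, star_crosses_sep) {
            return true;
        }
        // Otherwise `**` may consume any prefix, separators included.
        return (0..=path.len()).any(|i| glob_match(tail, &path[i..], star_crosses_sep));
    }
    match pat[0] {
        b'*' => {
            let rest = &pat[1..];
            let mut i = 0;
            loop {
                if glob_match(rest, &path[i..], star_crosses_sep) {
                    return true;
                }
                // Stop at the end of the path, or at '/' when `*` must not cross it.
                if i == path.len() || (path[i] == b'/' && !star_crosses_sep) {
                    return false;
                }
                i += 1;
            }
        }
        b'?' => match path.first() {
            Some(&c) if star_crosses_sep || c != b'/' => {
                glob_match(&pat[1..], &path[1..], star_crosses_sep)
            }
            _ => false,
        },
        // Any other byte must match literally.
        c => path.first() == Some(&c) && glob_match(&pat[1..], &path[1..], star_crosses_sep),
    }
}

/// Convenience wrapper over string slices.
fn matches(pat: &str, path: &str, star_crosses_sep: bool) -> bool {
    glob_match(pat.as_bytes(), path.as_bytes(), star_crosses_sep)
}
```

With star_crosses_sep set to false, matches("*.c", "src/foo.c", false) is false while matches("**/*.c", "src/foo.c", false) is true. With it set to true, both calls return true.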

Acknowledgments

This crate is based on the API and design of globset by Andrew Gallant (BurntSushi), part of the ripgrep project. The original globset crate is an excellent, battle-tested library — glob-set reimplements its API surface with a no_std-compatible, regex-free backend built on glob-matcher.

License

Apache-2.0