Crate simplematch

§simplematch

The simplematch library provides a fast and efficient way to match wildcard patterns on strings and bytes. It includes two primary functions, dowild and dowild_with, along with an Options struct to customize the behavior of the dowild_with function.

§Usage

To use the simplematch library, include it in your Cargo.toml:

[dependencies]
simplematch = "0.3.1"

§Functions

§dowild

This function is the most performant but has no customization options.

pub fn dowild<T>(pattern: &[T], haystack: &[T]) -> bool
where
    T: Wildcard

Matches the given haystack against the specified pattern using simple wildcard rules. The * character matches any sequence of characters, while the ? character matches a single character.

Wildcard is natively implemented for u8 and char.

Parameters:

  • pattern: A bytes or char slice representing the wildcard pattern to match against.
  • haystack: A bytes or char slice representing the text to be matched.

Returns:

  • true if the pattern matches the haystack, otherwise false.
§Examples
use simplematch::dowild;

assert_eq!(dowild("foo*".as_bytes(), "foobar".as_bytes()), true);
assert_eq!(dowild("foo?".as_bytes(), "fooa".as_bytes()), true);

Alternatively, bringing the DoWild trait into scope gives more convenient access to this function without any performance loss:

use simplematch::DoWild;

assert_eq!("foo*".dowild("foobar"), true);

A possible usage with char:

use simplematch::DoWild;

let pattern = "foo*".chars().collect::<Vec<char>>();
let haystack = "foobar".chars().collect::<Vec<char>>();

assert_eq!(pattern.dowild(haystack), true);

§dowild_with

use simplematch::Options;

pub fn dowild_with<T>(pattern: &[T], haystack: &[T], options: Options<T>) -> bool
where
   T: Wildcard + Ord,

Matches the given haystack against the specified pattern with customizable Options. This function supports case-insensitive matching, custom wildcard characters, escaping of special characters, and character classes including ranges.

Parameters:

  • pattern: A bytes or char slice representing the wildcard pattern to match against.
  • haystack: A bytes or char slice representing the text to be matched.
  • options: An instance of the Options struct to customize the matching behavior.

Returns:

  • true if the pattern matches the haystack according to the specified options, otherwise false.
§Examples
use simplematch::{dowild_with, Options};

let options = Options::default()
    .case_insensitive(true)
    .wildcard_any_with(b'%');

assert_eq!(
    dowild_with("foo%".as_bytes(), "FOOBAR".as_bytes(), options),
    true
);

Like dowild, the dowild_with function can be accessed directly on the string or u8 slice, …:

use simplematch::{DoWild, Options};

assert_eq!(
    "foo*".dowild_with("FOObar", Options::default().case_insensitive(true)),
    true
);

§Character classes

An expression [...] whose first character after the leading [ is not ! matches a single character, namely any of the characters enclosed by the brackets. The contents of the brackets must not be empty; otherwise the brackets are interpreted literally (the pattern a[]c matches a[]c exactly). However, a ] can be included as the first character within the brackets. For example, [][!] matches the three characters [, ], and !.

§Ranges

By convention, two characters separated by - represent a range. For instance, [A-Fa-f0-9] is equivalent to [ABCDEFabcdef0123456789]. To include - as a literal character, it must be placed as the first or last character within the brackets. For example, []-] matches the two characters ] and -. Unlike in regex, a range may be written in reverse: [F-A] has the same meaning as [A-F].
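The order-independent range rule can be pictured with a small stand-alone check. This is only a sketch: in_range is a hypothetical helper, not part of the simplematch API, and the crate's actual implementation may differ. It does suggest one plausible reason why dowild_with requires T: Ord.

```rust
/// Hypothetical helper (not part of the simplematch API): returns true if
/// `c` lies within the inclusive range spanned by `a` and `b`, regardless of
/// the order in which the two bounds were written.
fn in_range<T: Ord + Copy>(a: T, b: T, c: T) -> bool {
    // Normalize the bounds so that `F-A` behaves exactly like `A-F`.
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    lo <= c && c <= hi
}
```

With this helper, in_range(b'F', b'A', b'C') and in_range(b'A', b'F', b'C') give the same answer.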

§Complementation

An expression [!...] matches any single character that is not included in the expression formed by removing the first !. For example, [!]a-] matches any character except ], a, and -.
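The class rules described so far (a leading ] as a literal member, ranges, a literal - at the edges, and ! for complementation) can be summarized in a short stand-alone sketch. match_class is a hypothetical helper written for illustration, not the crate's implementation, and it deliberately ignores any escape character.

```rust
/// Hypothetical helper, not part of the simplematch API.
/// Tries to read a character class starting at `pattern[0] == b'['`.
/// Returns `Some((matched, consumed))`, where `consumed` is the length of the
/// class in bytes including both brackets, or `None` if there is no valid
/// class (the `[` would then be taken literally).
fn match_class(pattern: &[u8], c: u8) -> Option<(bool, usize)> {
    if pattern.first() != Some(&b'[') {
        return None;
    }
    let mut i = 1;
    let negate = pattern.get(i) == Some(&b'!');
    if negate {
        i += 1;
    }
    let start = i;
    // A `]` directly after `[` or `[!` is a literal member, so skip it
    // before searching for the closing bracket.
    if pattern.get(i) == Some(&b']') {
        i += 1;
    }
    let end = i + pattern.get(i..)?.iter().position(|&b| b == b']')?;
    let members = &pattern[start..end];

    let mut found = false;
    let mut k = 0;
    while k < members.len() {
        // `x-y` is a range only when `-` has a character on both sides.
        if k + 2 < members.len() && members[k + 1] == b'-' {
            let (a, b) = (members[k], members[k + 2]);
            let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
            found |= lo <= c && c <= hi;
            k += 3;
        } else {
            found |= members[k] == c;
            k += 1;
        }
    }
    // Complementation simply flips the membership result.
    Some((found != negate, end + 1))
}
```

For the pattern fragment []c the sketch returns None, which mirrors the rule that a[]c is matched literally.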

To remove the special meanings of ?, *, and [, you can precede them with the escape character (per default the backslash character \). Within brackets, these characters represent themselves. For instance, [[?*\\] matches the four characters [, ?, *, and \.
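When a pattern is built from untrusted or arbitrary text, the escaping rule can be applied mechanically. The following is a minimal sketch under stated assumptions: escape_literal is a hypothetical helper (not part of the simplematch API), it assumes that escaping is active in the options being used, and it assumes that a doubled escape character stands for a literal escape character.

```rust
/// Hypothetical helper, not part of the simplematch API: prefixes `?`, `*`,
/// `[` and the escape byte itself with `escape`, so that `input` would be
/// read as literal text by a matcher that honors the escape character.
fn escape_literal(input: &[u8], escape: u8) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    for &b in input {
        if b == b'?' || b == b'*' || b == b'[' || b == escape {
            out.push(escape);
        }
        out.push(b);
    }
    out
}
```

The closing ] is left untouched because it only has a special meaning after an unescaped [.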

§Credits

This linear-time wildcard matching algorithm is derived from the one presented in Russ Cox’s great article about simple and performant glob matching (https://research.swtch.com/glob). Furthermore, the optimizations for the ? handling are based on the article Matching Wildcards: An Improved Algorithm for Big Data written by Kirk J. Krauss.

The simplematch algorithm is an improved version that generally uses about 2-6x fewer instructions than the original algorithm, as tested with random small and large inputs.
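For readers unfamiliar with the approach from the glob article, the core idea is to remember a single restart point at the most recent * instead of backtracking recursively. The following is an illustrative sketch of that idea for bytes with the default * and ? wildcards only. sketch_match is a hypothetical name, and this is not the optimized code that simplematch ships.

```rust
/// Illustrative sketch only: a single-restart wildcard matcher in the style
/// of the glob algorithm from Russ Cox's article. `sketch_match` is a
/// hypothetical name and is not part of the simplematch API.
fn sketch_match(pattern: &[u8], haystack: &[u8]) -> bool {
    let (mut p, mut h) = (0usize, 0usize);
    // Restart point recorded at the most recent `*`:
    // (pattern index of the `*`, haystack index to resume from).
    let mut restart: Option<(usize, usize)> = None;

    while p < pattern.len() || h < haystack.len() {
        if p < pattern.len() {
            match pattern[p] {
                b'*' => {
                    // First try to match `*` against nothing; if that fails
                    // later, resume here with one more haystack byte consumed.
                    restart = Some((p, h + 1));
                    p += 1;
                    continue;
                }
                b'?' if h < haystack.len() => {
                    p += 1;
                    h += 1;
                    continue;
                }
                c if h < haystack.len() && haystack[h] == c => {
                    p += 1;
                    h += 1;
                    continue;
                }
                _ => {}
            }
        }
        // Mismatch: fall back to the last `*`, if there is one and the
        // haystack still has room for it to absorb another byte.
        match restart {
            Some((rp, rh)) if rh <= haystack.len() => {
                p = rp;
                h = rh;
            }
            _ => return false,
        }
    }
    true
}
```

Because only the latest * is ever revisited, the matcher never explores an exponential number of alternatives, even for patterns with many * characters.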

Structs§

Options
Customize the matching behavior of the dowild_with function

Enums§

SimpleMatchError
The error type of the simplematch crate

Traits§

DoWild
A convenience trait to use dowild and dowild_with directly for this type
Wildcard
The trait for types that can be matched against a wildcard pattern

Functions§

dowild
Returns true if the wildcard pattern matches the haystack.
dowild_with
Returns true if the wildcard pattern matches the haystack. This method can be customized with Options.