Struct grex::RegExpBuilder

pub struct RegExpBuilder { /* private fields */ }

This struct builds regular expressions from user-provided test cases.

Implementations

impl RegExpBuilder

pub fn from<T: Clone + Into<String>>(test_cases: &[T]) -> Self

Specifies the test cases to build the regular expression from.

The test cases need not be sorted because RegExpBuilder sorts them internally.

⚠ Panics if test_cases is empty.

pub fn from_file<T: Into<PathBuf>>(file_path: T) -> Self

Specifies a text file containing test cases to build the regular expression from.

The test cases need not be sorted because RegExpBuilder sorts them internally.

Each test case must be on a separate line. Lines may end with either a line feed (\n) or a carriage return followed by a line feed (\r\n). The final line ending is optional.

⚠ Panics if:

- the file cannot be found
- the file's content is not valid UTF-8
- the file cannot be opened because of insufficient permissions
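
The line-ending rules and panic conditions above can be illustrated with the standard library alone. The following is a minimal sketch, not the crate's actual implementation: `read_test_cases` and `split_test_cases` are hypothetical helper names, and the sketch assumes that `str::lines` matches the described behavior (it accepts both \n and \r\n and does not require a final line ending).

```rust
use std::fs;
use std::path::PathBuf;

/// Hypothetical helper: reads a file and returns one test case per line.
/// `fs::read_to_string` fails if the file is missing, unreadable, or not
/// valid UTF-8, so this sketch panics in those cases.
fn read_test_cases<T: Into<PathBuf>>(file_path: T) -> Vec<String> {
    let path: PathBuf = file_path.into();
    let content = fs::read_to_string(&path)
        .unwrap_or_else(|err| panic!("cannot read {}: {}", path.display(), err));
    split_test_cases(&content)
}

/// Hypothetical helper: splits text into lines. `str::lines` strips a
/// trailing "\n" or "\r\n" from each line and ignores a missing final
/// line ending.
fn split_test_cases(content: &str) -> Vec<String> {
    content.lines().map(String::from).collect()
}
```

With these helpers, the inputs "a\r\nb\nc" and "a\nb\nc\n" both yield the three test cases a, b, and c.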

pub fn with_conversion_of_digits(&mut self) -> &mut Self

Converts any Unicode decimal digit to character class \d.

This method takes precedence over with_conversion_of_words if both are set. Decimal digits are converted to \d, the remaining word characters to \w.

This method takes precedence over with_conversion_of_non_whitespace if both are set. Decimal digits are converted to \d, the remaining non-whitespace characters to \S.
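
The precedence rules above amount to checking the most specific enabled class first. The sketch below illustrates that idea with the standard library only; it is not how the crate is implemented. `Conversions`, `classify`, and `convert` are hypothetical names, and the `char` predicates only approximate the Unicode classes behind \d and \w.

```rust
/// Hypothetical flags mirroring three of the builder's conversion options.
#[derive(Clone, Copy)]
struct Conversions {
    digits: bool,
    words: bool,
    non_whitespace: bool,
}

/// Maps one character to a character class. The most specific enabled
/// class is checked first, so `\d` wins over `\w`, which wins over `\S`.
fn classify(c: char, conv: Conversions) -> String {
    // Assumption: `is_numeric` and `is_alphanumeric` stand in for the exact
    // Unicode definitions of decimal digits and word characters.
    if conv.digits && c.is_numeric() {
        r"\d".to_string()
    } else if conv.words && (c.is_alphanumeric() || c == '_') {
        r"\w".to_string()
    } else if conv.non_whitespace && !c.is_whitespace() {
        r"\S".to_string()
    } else {
        // No enabled class applies: keep the character as-is
        // (regex escaping is omitted in this sketch).
        c.to_string()
    }
}

/// Applies `classify` to every character of the input.
fn convert(input: &str, conv: Conversions) -> String {
    input.chars().map(|c| classify(c, conv)).collect()
}
```

With all three flags enabled, the input "a1 _!" becomes \w\d \w\S: the digit is claimed by \d before \w is considered, and only the exclamation mark falls through to \S.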

pub fn with_conversion_of_non_digits(&mut self) -> &mut Self

Converts any character which is not a Unicode decimal digit to character class \D.

This method takes precedence over with_conversion_of_non_words if both are set. Non-digits which are also non-word characters are converted to \D.

This method takes precedence over with_conversion_of_non_whitespace if both are set. Non-digits which are also non-whitespace characters are converted to \D.

pub fn with_conversion_of_whitespace(&mut self) -> &mut Self

Converts any Unicode whitespace character to character class \s.

This method takes precedence over with_conversion_of_non_digits if both are set. Whitespace characters are converted to \s, the remaining non-digit characters to \D.

This method takes precedence over with_conversion_of_non_words if both are set. Whitespace characters are converted to \s, the remaining non-word characters to \W.

pub fn with_conversion_of_non_whitespace(&mut self) -> &mut Self

Converts any character which is not a Unicode whitespace character to character class \S.

pub fn with_conversion_of_words(&mut self) -> &mut Self

Converts any Unicode word character to character class \w.

This method takes precedence over with_conversion_of_non_digits if both are set. Word characters are converted to \w, the remaining non-digit characters to \D.

This method takes precedence over with_conversion_of_non_whitespace if both are set. Word characters are converted to \w, the remaining non-whitespace characters to \S.

pub fn with_conversion_of_non_words(&mut self) -> &mut Self

Converts any character which is not a Unicode word character to character class \W.

This method takes precedence over with_conversion_of_non_whitespace if both are set. Non-word characters which are also non-whitespace characters are converted to \W.

pub fn with_conversion_of_repetitions(&mut self) -> &mut Self

Detects repeated non-overlapping substrings and converts them to {min,max} quantifier notation.
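
To make the idea of collapsing repeated substrings concrete, here is a simplified sketch using only the standard library. It is an illustration under stated assumptions, not the crate's algorithm: `collapse_repetitions` is a hypothetical function, it only finds back-to-back repetitions within a single string, it emits exact counts ({n}) rather than {min,max} ranges, and it performs no regex escaping. Its `min_len` parameter is loosely analogous to with_minimum_substring_length.

```rust
/// Hypothetical helper: rewrites back-to-back repetitions of a substring
/// as `unit{n}` (single character) or `(?:unit){n}` (longer substring).
/// Panics if `min_len` is zero.
fn collapse_repetitions(input: &str, min_len: usize) -> String {
    assert!(min_len > 0, "minimum substring length must be greater than zero");
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let mut matched = false;
        // A unit can repeat only if it fits at least twice in the remainder.
        let max_len = (chars.len() - i) / 2;
        for len in min_len..=max_len {
            let unit = &chars[i..i + len];
            // Count how often `unit` occurs consecutively, starting at `i`.
            let mut count = 1;
            while i + (count + 1) * len <= chars.len()
                && &chars[i + count * len..i + (count + 1) * len] == unit
            {
                count += 1;
            }
            if count >= 2 {
                let unit: String = unit.iter().collect();
                if len == 1 {
                    out.push_str(&format!("{}{{{}}}", unit, count));
                } else {
                    out.push_str(&format!("(?:{}){{{}}}", unit, count));
                }
                // Skip past all repetitions so matches never overlap.
                i += count * len;
                matched = true;
                break;
            }
        }
        if !matched {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}
```

For example, "aaab" with a minimum length of 1 becomes a{3}b, and "abababc" with a minimum length of 2 becomes (?:ab){3}c.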

pub fn with_case_insensitive_matching(&mut self) -> &mut Self

Enables case-insensitive matching of test cases so that letters match both upper and lower case.

pub fn with_capturing_groups(&mut self) -> &mut Self

Replaces non-capturing groups with capturing ones.

pub fn with_minimum_repetitions(&mut self, quantity: u32) -> &mut Self

Specifies the minimum quantity of substring repetitions to be converted if with_conversion_of_repetitions is set.

If the quantity is not explicitly set with this method, a default value of 1 will be used.

⚠ Panics if quantity is zero.

pub fn with_minimum_substring_length(&mut self, length: u32) -> &mut Self

Specifies the minimum length a repeated substring must have in order to be converted if with_conversion_of_repetitions is set.

If the length is not explicitly set with this method, a default value of 1 will be used.

⚠ Panics if length is zero.

pub fn with_escaping_of_non_ascii_chars(&mut self, use_surrogate_pairs: bool) -> &mut Self

Converts non-ASCII characters to Unicode escape sequences. The parameter use_surrogate_pairs specifies whether to convert characters from the astral code planes (range U+010000 to U+10FFFF) to surrogate pairs.
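
The difference between the two settings can be shown with the standard library's UTF-16 encoder. This is a minimal sketch rather than the crate's implementation: `escape_non_ascii` is a hypothetical function, and the \u{...} output format with lowercase hexadecimal digits is an assumption made for illustration.

```rust
/// Hypothetical helper: escapes every non-ASCII character as `\u{...}`.
/// ASCII characters pass through unchanged.
fn escape_non_ascii(input: &str, use_surrogate_pairs: bool) -> String {
    let mut out = String::new();
    for c in input.chars() {
        if c.is_ascii() {
            out.push(c);
        } else if use_surrogate_pairs && c.len_utf16() == 2 {
            // Characters above U+FFFF need two UTF-16 code units,
            // which are emitted as a high and a low surrogate.
            let mut buf = [0u16; 2];
            for unit in c.encode_utf16(&mut buf).iter() {
                out.push_str(&format!("\\u{{{:x}}}", unit));
            }
        } else {
            // Emit the code point itself as a single escape sequence.
            out.push_str(&format!("\\u{{{:x}}}", c as u32));
        }
    }
    out
}
```

For U+1F4A9 this yields \u{1f4a9} without surrogate pairs and \u{d83d}\u{dca9} with them. Characters in the Basic Multilingual Plane, such as U+2665, are escaped identically in both modes.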

pub fn with_verbose_mode(&mut self) -> &mut Self

Produces a more readable regular expression by formatting it in verbose mode.

pub fn without_start_anchor(&mut self) -> &mut Self

Removes the caret anchor ‘^’ from the resulting regular expression, so that the test cases also match when they do not occur at the start of a string.

pub fn without_end_anchor(&mut self) -> &mut Self

Removes the dollar sign anchor ‘$’ from the resulting regular expression, so that the test cases also match when they do not occur at the end of a string.

pub fn without_anchors(&mut self) -> &mut Self

Removes the caret and dollar sign anchors from the resulting regular expression, so that the test cases also match when they occur within a larger string that contains other content.
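
The effect of the three anchor options can be pictured for the simplest case, a pattern consisting of a single literal string. The sketch below uses only the standard library and does not involve a regex engine; `matches_literal` is a hypothetical function whose two flags stand for whether each anchor is still present.

```rust
/// Hypothetical helper: mimics how `^literal$`, `^literal`, `literal$`,
/// and `literal` would match against `haystack`.
fn matches_literal(haystack: &str, literal: &str, start_anchor: bool, end_anchor: bool) -> bool {
    match (start_anchor, end_anchor) {
        // Both anchors: the whole string must equal the literal.
        (true, true) => haystack == literal,
        // Only `^`: the literal must occur at the start.
        (true, false) => haystack.starts_with(literal),
        // Only `$`: the literal must occur at the end.
        (false, true) => haystack.ends_with(literal),
        // No anchors: the literal may occur anywhere.
        (false, false) => haystack.contains(literal),
    }
}
```

With both anchors in place, "xabcx" does not match the literal "abc". Once both anchors are removed, it does.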
