Module regular_expressions

Source
Expand description

Regular expressions

This module defines the abstract syntax of regular expressions BaseRegLan and the regular expression type RE. Regular expressions are built using an ReManager, which provides hash consing.

Important: Because of hash consing, regular expressions created with different ReManagers are not compatible. It’s recommended to use only one ReManager.

Input to the manager’s methods are static references to RE objects (see type RegLan). The manager also returns objects of type RegLan when producing regular expressions.

ReManager also implements the derivative operation. The derivative of a regular expression R with respect to a character c is another regular expression S that defines all the strings that can follow c in the language of R. For example, the derivative of regular expression ‘(abc + cd)*’ with respect to ‘a’ is ‘bc(abc + cd)*’: all words of ‘(abc + cd)*’ that start with ‘a’ are of the form ‘a.w’ where ‘w’ is a word ‘bc(abc + cd)*’.

For a regular expression R, we use a CharPartition that divides the alphabet into equivalent derivative classes. If two characters c1 and c2 are in the same derivative class, then the derivative of R with respect to c1 and the derivative of R with respect to c2 are equal. The ReManager implements derivative of R with respect to one of its derivative class. More generally, the derivative of R with respect to a character set C is well defined if C is included in a derivative class of R.

Derivatives allows one to convert REs to deterministic automata and support other operations such as checking whether a string matches an RE.

Structs§

DerivativeIterator
Iterator to enumerate all the derivatives of a regular expression
RE
Regular expression structure
ReManager
A store for constructing regular expressions using hash-consing.

Enums§

BaseRegLan
Abstract syntax for regular expressions

Functions§

leaves
Iterator that enumerates the leaves of r
sub_terms
Iterator for the sub-terms of r

Type Aliases§

RegLan
Reference to a Regular Expression descriptor