Module character_sets

Module character_sets 

Source
Expand description

Character sets and alphabet partitions

An SMT-LIB character is an unsigned integer in the range [0..MAX_CHAR]. We represent characters by u32 integers in this range. The constant MAX_CHAR is defined in smt_strings.

A character set is either a single character (e.g., 'A') or a range of characters (e.g., ['0'..'9']). We represent both as CharSets. A CharSet is a pair of two integers, start and end, where start <= end and end <= MAX_CHAR. This denotes the character interval [start, end].

A partition is a collection of disjoint character sets, sorted in increasing order. A partition is then a list of intervals:

[a0, b0], [a1, b1], …, [ak, bk] where ai <= bi and bi < ai+1.

A partition defines an equivalence relation over characters: two characters are equivalent either if they belong to the same class [ai, bi] or if they’re outside of all the intervals.

A partition with n intervals defines then (n+1) classes: C0, C1, …, Cn-1 and D.

  • For i=0,…, n-1, class Ci is the interval [ai, bi].
  • Class D is the complementary class, that is, the complement of Union(C0, …, Cn-1).

Note: the complementary class D may be empty.

Each class in a partition can be identified by its ClassId:

  • ClassId::Interval(i) denotes the class Ci, that is, the interval [ai, bi].
  • ClassId::Complement denotes the complementary class D.

Partitions are used to decompose the SMT-LIB alphabet (of 196608 characters) into a typically much smaller number of equivalent classes.

  • For regular expressions, a partition divides the alphabet into derivative classes. See ReManager.
  • For a finite-state automaton, a partition divides the alphabet into classes that have the same transitions. A character partition is attached to every state s in an automaton and the successors of s are defined for every class in this partition. See Automaton.

Structs§

CharPartition
Collection of disjoint character sets.
CharSet
Interval [start, end] where start <= end <= MAX_CHAR.
ClassIdIterator
Iterator to go through all valid ClassId’s in a partition
PickIterator
Iterator to pick a character in each class of a partition

Enums§

ClassId
ClassId
CoverResult
Class that covers an interval in a CharPartition

Functions§

merge_partition_list
Merge a list of partitions
merge_partitions
Merge two partitions p1 and p2