[][src]Module egg::tutorials::_01_background

Concepts: e-graphs and equality saturation

An e-graph is a data structure that powers the equality saturation optimization technique.

Both e-graphs (Gregory Nelson's PhD Thesis, 1980) and equality saturation (Tate et. al., 2009) were invented before. egg aims to make them easy, extensible, and efficient.

This tutorial will approach these concepts at a high level. More detail can be found in our paper.

Why any of this?

When building pretty much any tool relating to programming languages, you'll typically be working with terms from your language (we will use the words term, expression, and program interchangeably). You may be trying to:

  • Convert a term to a "better", equivalent term (optimization);
  • Build a term from scratch according to a specification (synthesis);
  • Or show that two terms are equivalent (verification).

Term rewriting is common technique for working with terms toward any of these goals. You start with a term t and a set of rewrites, each of which take the form l → r. For each rewrite, you try to match the pattern l against the term t, generating a substitution σ at some subterm, and then you apply that substitution to the right-hand pattern r and replace the matching subterm.

For example, consider the term 42 × (7 + 7) and the rewrite x + x → 2 × x:

  • The left-hand side would match at the subterm 7 + 7, generating the substitution σ = {x: 7}.
  • Applying the substitution gives r[σ] = 2 × 7. (r[σ] means apply the substitution σ to r)
  • Replacing the matched subterm 7 + 7 with r[σ] = 2 × 7 gives the result: 42 × (2 × 7).

One issue with term rewriting (and other programming language techniques that involve manipulating terms) is the matter of choice. Rewriting is destructive, meaning that once you do a rewrite, the initial term is gone. If you have many rewrites to choose from, picking one is like choosing a fork in the road: going back and choosing another way is expensive or impossible, so you're committed to the path you've chosen. Furthermore, picking the right rewrite at the right time is very difficult in general; sometimes, a choice that seems good locally may be wrong globally.

As an example, a C optimizer would like to rewrite a * 2 to the cheaper a << 1. But what about (a * 2) / 2? A greedy, local approach will take a "wrong turn" resulting in (a << 1) / 2, which is better than the input expression but hides the fact that we could have canceled the multiplication and division.

There are some techniques to mitigate this (backtracking and value-numbering), but fundamentally, the only way around it is to take all choices at the same time. But rewriting is destructive, so applying n rewrites to a term results in n terms, and then applying n rewrites to each of those n terms yields n2 more terms, and so on. If you want to explore m "layers" deep in this tree, you'll have to store nm terms, which is prohibitively expensive.

E-graphs can solve this representation problem. When rewriting a bunch of terms, the results are often structurally similar. E-graphs can store large sets of similar expressions very compactly. Using e-graphs, you can apply many rewrites simultaneously without a multiplicative growth in space. Equality saturation is a technique to do this kind of rewriting for program optimization.

What's an e-graph?

An e-graph (/'igraf/) is a data structure to maintain an equivalence relation (really a congruence relation, see the next section) over expressions. An e-graph is a set of equivalence classes (e-classes), each of which contains equivalent e-nodes. An e-node is an operator with children, but instead of children being other operators or values, the children are e-classes. In egg, these are represented by the EGraph, EClass, and Language (e-nodes) types.

Even small e-graphs can represent a large number of expressions, exponential in the number of e-nodes. This compactness is what makes e-graphs a compelling data structure. We can define what it means for represent (or contain) a term as follows:

  • An e-graph represents a term if any of its e-classes do.
  • An e-class represents a term if any of its e-nodes do.
  • An e-node f(n1, n2, ...) represents a term f(t1, t2, ...) if e-node ni represents term ti.

Here are some e-graphs. We picture e-classes as dotted boxes surrounding the equivalent e-nodes. Note that edges go from e-nodes to e-classes (not to other e-nodes), since e-nodes have e-classes as children. Don't worry about how they are made just yet, we'll get to rewriting soon.

An e-graph as it undergoes a series of rewrites. Gray indicates what didn't change from the previous image.

An e-graph can be queried for patterns through a procedure called e-matching (matching up to equivalence), which searches the e-graph for e-classes that represent terms that match a given pattern. egg's e-matching (inspired by De Moura and Bjørner) is done by the Searcher and Pattern APIs. There are two core operations that modify the e-graph: add which adds e-nodes to the e-graph, and union which merges two e-classes.

These operations can be combined to do rewriting over the e-graph. Critically, this rewriting process is only additive, meaning that the e-graph never forgets old versions of terms. Additionally, rewrites add entire classes of terms at a time. To perform a rewrite l → r over an e-graph, you first search l in the e-graph, yielding pairs of (c, σ), where c is the matching e-class and σ is the substitution. Then, you add the term r[σ] to the e-graph and union it with the matching e-class c.

Let's put it all together with an example referring to the four e-graphs in the image above.

  1. The initial e-graph represents the term (a × 2) / 2. Since each e-class only has one e-node, the e-graph is basically an abstract syntax tree with sharing (the 2 is not duplicated).
  2. Applying the rewrite x × 2 → x << 1 has recorded the fact that a × 2 = a << 1 without forgetting about a × 2. Note how the newly added a << 1 refers to the existing "a" e-node, and the "<<" e-node has been unioned into the same e-class as the equivalent "×" e-node where the pattern x × 2 matched.
  3. Applying rewrite (x × y) / z → x × (y / z) realizes that division associates with multiplication. This rewrite is critical to discovering the cancellation of 2s that we are looking for, and it still works despite the fact that we applied the "wrong" rewrite previously.
  4. Applying rewrites x / x → 1 and 1 × x → x doesn't add any new e-nodes, since all the e-nodes were already present in the e-graph. The result only unions e-classes, meaning that e-graph actually got smaller from applying these rewrites, even though it now represents more terms. In fact, observe that that the top-right "×" e-node's left child is itself; this cycle means the e-class represents the infinite (!) set of terms a, a × 1, a × 1 × 1, and so on.

Invariants and Rebuilding

An e-graph has two core operations that modify the e-graph: add which adds e-nodes to the e-graph, and union which merges two e-classes. These operations maintains two key (related) invariants:

  1. Congruence

    An e-graph maintains not just an equivalence relation over expressions, but a congruence relation. Congruence basically states that if x is equivalent to y, f(x) must be equivalent to f(y). So as the user calls union, many e-classes other than the given two may need to merge to maintain congruence.

    For example, suppose terms a + x and a + y are represented in e-classes 1 and 2, respectively. At some later point, x and y become equivalent (perhaps the user called union on their containing e-classes). E-classes 1 and 2 must merge, because now the two "+" operators have equivalent arguments, making them equivalent.

  2. Uniqueness of e-nodes

    There do not exist two distinct e-nodes with the same operators and equivalent children in the e-class, either in the same e-class or different e-classes. This is maintained in part by the hashconsing performed by add, and by deduplication performed by union and rebuild.

egg takes a delayed approach to maintaining these invariants. Specifically, the effects of calling union (or applying a rewrite, which calls union) may not be reflected immediately. To restore the e-graph invariants and make these effects visible, the user must call the rebuild method.

egg's choice here allows for a higher performance implementation Maintaining the congruence relation complicates the core e-graph data structure and requires an expensive traversal through the e-graph on every union. egg chooses to relax these invariants for better performance, only restoring the invariants on a call to rebuild. The paper on egg goes into greater detail on why this design choice makes things faster. See the rebuild documentation for more information on what it does and when you have to call it. Note also that Runners take care of this for you, calling rebuild between rewrite iterations.

Equality Saturation

Equality saturation (Tate et. al., 2009) is a technique for program optimization. egg implements a variant of equality saturation in the Runner API that looks like the following pseudocode:

This example is not tested
fn equality_saturation(expr: Expression, rewrites: Vec<Rewrite>) -> Expression {
    let mut egraph = make_initial_egraph(expr);

    while !egraph.is_saturated_or_timeout() {
        let mut matches = vec![];

        // read-only phase, invariants are preserved
        for rw in rewrites {
            for (subst, eclass) in egraph.search(rw.lhs) {
                matches.push((rw, subst, eclass));
            }
        }

        // write-only phase, temporarily break invariants
        for (rw, subst, eclass) in matches {
            eclass2 = egraph.add(rw.rhs.subst(subst));
            egraph.union(eclass, eclass2);
        }

        // restore the invariants once per iteration
        egraph.rebuild();
    }

    return egraph.extract_best();
}

Most of this was covered above, but we need to define two new terms:

  • Saturation occurs when an e-graph detects that rewrites no longer add new information. Consider the commutative rewrite x + y → y + x. After applying it once, the second time adds no new information since the e-graph didn't forget about the initial x + y terms. If all the rewrites are in this state, we say the e-graph is saturated, meaning that the e-graph encodes all possible equivalences derivable from the given rewrites.
  • Extraction is a procedure for picking a single represented from an e-class that is optimal according to some cost function. egg's Extractors provide this functionality.

Put together, equality saturation explores all possible variants of a program that can be derived from a set of rewrites, and then it extracts the best one. This solves the problem of choice describes above, as equality saturation essentially applies every rewrite every iteration, using the e-graph to avoid exponential explosion.

It sounds a little too good to be true, and while it definitely has its caveats, egg addresses many of the shortcomings from this approach, making it feasible to apply on real-world problems. We've already discussed how rebuilding makes egg's e-graphs fast, and later tutorials will discuss how Analysis makes this approach flexible and able to handle more than just syntactic rewriting.