midenc_hir/ir/traits/
canonicalization.rs

1use alloc::rc::Rc;
2
3use crate::{patterns::RewritePatternSet, Context, Op};
4
5/// This trait represents an [Op] that has a canonical/normal form.
6///
7/// Canonicalization patterns are applied via a single canonicalization pass, which iteratively
8/// applies canonicalization patterns of all operations until either fixpoint is reached, or some
9/// maximum number of iterations is reached. As a result, canonicalization is performed on a best-
10/// effort basis, and the IR overall is not guaranteed to be in perfect canonical form.
11///
12/// Canonicalization is intended to simplify subsequent analysis and optimization, by allowing them
13/// to assume that all operations are in canonical form, and thus not needing to handle all the
14/// many different variations of the same op that might occur in practice. This reduces the amount
15/// of redundant work that must be done, and improves the performance of compilation overall. It
16/// is important to stress though, that canonicalizations must not be required for correctness -
17/// if an operation is not in canonical form, compilation should still produce correct output,
18/// albeit less optimal output.
19///
20///
21/// Operations which have a canonical/normal form must be able to define what it means for two
22/// instances of the op to be equivalent. This is because the mathematical properties of
23/// canonicalization are defined in terms of equivalence relations over all forms of an op in the
24/// same _equivalence class_. An intuitive way of thinking about this in terms of operations, are
25/// that two instances of an operation are in the same equivalence class if they are unique from
26/// each other, but would produce identical results, i.e. they represent the same computation in
27/// different forms.
28///
29/// For operations which meet the above requirement, and do provide a set of canonicalization
30/// patterns, those patterns must uphold the following properties:
31///
32/// 1. _Idempotence_. After applying canonicalization pattern(s) to the op, applying again to the
33///    resulting op will have no effect, i.e. `canonicalize(op) = canonicalize(canonicalize(op))`.
34/// 2. _Decisiveness_. Applying the canonicalization patterns to any two instances of the operation
35///    in the same equivalence class, produce an identical result. The result must be the canonical
36///    form, must be unique for the equivalence class, and must be itself a member of the
37///    equivalence class (i.e. the output is equivalent to the inputs). In other words, given a
38///    equivalence class `{a, b, c}`, where `c` is the canonical representation for that equivalence
39///    class, then: `canon(a) = canon(b) = c`.
40/// 3. _Convergence_. Each canonicalization rewrite must either leave the IR unchanged, or rewrite
41///    it such that the output is strictly _more_ canonical than the input. A rewrite which makes
42///    the input less canonical "temporarily" so that another rewrite will apply, can easily result
43///    in unstable or cyclic rewrites, causing canonicalization to never reach fixpoint.
44///
45/// Additionally, there are some general rules that should be followed:
46///
47/// * Canonicalization rewrites should be simple, and focused purely on reaching canonical form.
48///   They should not be used to do unrelated optimizations/rewrites that do not pertain to the
49///   task of canonicalization.
50///
51/// * Canonicalization rewrites should never rewrite other operations with canonical forms. However,
52///   it is fine to add or remove operations in order to reach canonical form. For example, if we
53///   are canonicalizing an `if` expression, we might want to simplify the condition expression such
54///   that a condition of the form `!x` is rewritten as just `x`, swapping the then/else blocks so
55///   that the resulting `if` is equivalent to the original, but without the unnecessary inversion
56///   of the condition. In this case, we removed an operation that became dead as the result of
57///   canonicalizing the `if`. It would not, however, have been a good idea for us to try and do
58///   more complex analysis/rewrite of the `x` expression itself - instead, the canonicalization
59///   should assume that `x` is already in its canonical form.
60///
61/// * It is best to canonicalize towards fewer uses of a value when operands are duplicated, as
62///   some rewrite patterns only match when a value has a single use. For example, canonicalizing
63///   `x + x` as `x * 2`, since this reduces the number of uses of `x` by one.
64pub trait Canonicalizable {
65    /// Populate `rewrites` with the set of rewrite patterns that should be applied to canonicalize
66    /// this operation.
67    ///
68    /// NOTE: This should not be used to register _all_ possible rewrites for this operation, only
69    /// those used for canonicalization.
70    ///
71    /// An operation that has no canonicalization patterns may simply return without adding any
72    /// patterns to the set. This is the default behavior implemented for all [Op] impls.
73    fn get_canonicalization_patterns(rewrites: &mut RewritePatternSet, context: Rc<Context>);
74}
75
76impl<T: Op> Canonicalizable for T {
77    default fn get_canonicalization_patterns(
78        _rewrites: &mut RewritePatternSet,
79        _context: Rc<Context>,
80    ) {
81    }
82}