midenc-hir-transform 0.4.1

Transformation passes and utilities for Miden HIR
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
//! This code is an implementation of the algorithm described in _Perfect Reconstructability of
//! Control Flow from Demand Dependence Graphs_, by Bahmann, Reismann, Jahre, and Meyer. 2015.
//! See https://doi.org/10.1145/2693261.
//!
//! It defines an algorithm to translate any control flow graph with a single entry and single exit
//! block into structured control flow operations consisting of regions of do-while loops and
//! operations conditionally dispatching to one out of multiple regions before continuing after the
//! operation. This includes control flow graphs containing irreducible control flow.
//!
//! The implementation here additionally supports the transformation on regions with multiple exit
//! blocks. This is implemented by first transforming all occurrences of return-like operations to
//! branch to a single exit block containing an instance of that return-like operation. If there are
//! multiple kinds of return-like operations, multiple exit blocks are created. In that case the
//! transformation leaves behind a conditional control flow graph operation that dispatches to the
//! given regions terminating with different kinds of return-like operations each.
//!
//! If the function only contains a single kind of return-like operations, it is guaranteed that all
//! control flow graph ops will be lifted to structured control flow, and that no more control flow
//! graph ops remain after the operation.
//!
//! The algorithm to lift CFGs consists of two transformations applied after each other on any
//! single-entry, single-exit region:
//!
//! 1. Lifting cycles to structured control flow loops
//! 2. Lifting conditional branches to structured control flow branches
//!
//! These are then applied recursively on any new single-entry single-exit regions created by the
//! transformation until no more CFG operations remain.
//!
//! The first part of cycle lifting is to detect any cycles in the CFG. This is done using an
//! algorithm for iterating over SCCs. Every SCC representing a cycle is then transformed into a
//! structured loop with a single entry block and a single latch containing the only back edge to
//! the entry block and the only edge to an exit block outside the loop. Rerouting control flow to
//! create single entry and exit blocks is achieved via a multiplexer construct that can be
//! visualized as follows:
//!
//! ```text,ignore
//! +-----+ +-----+   +-----+
//! | bb0 | | bb1 |...| bbN |
//! +--+--+ +--+--+   +-+---+
//!    |       |        |
//!    |       v        |
//!    |  +------+      |
//!    | ++      ++<----+
//!    | | Region |
//!    +>|        |<----+
//!      ++      ++     |
//!       +------+------+
//! ```
//!
//! The above transforms to:
//!
//! ```text,ignore
//! +-----+ +-----+   +-----+
//! | bb0 | | bb1 |...| bbN |
//! +-----+ +--|--+   ++----+
//!      |     v       |
//!      +->+-----+<---+
//!         | bbM |<-------+
//!         +---+-+        |
//!     +---+   | +----+   |
//!     |       v      |   |
//!     |   +------+   |   |
//!     |  ++      ++<-+   |
//!     +->| Region |      |
//!        ++      ++      |
//!         +------+-------+
//! ```
//!
//! bbM in the above is the multiplexer block, and any block previously branching to an entry block
//! of the region are redirected to it. This includes any branches from within the region. Using a
//! block argument, bbM then dispatches to the correct entry block of the region dependent on the
//! predecessor.
//!
//! A similar transformation is done to create the latch block with the single back edge and loop
//! exit edge.
//!
//! The above form has the advantage that bbM now acts as the loop header of the loop body. After
//! the transformation on the latch, this results in a structured loop that can then be lifted to
//! structured control flow. The conditional branches created in bbM are later lifted to conditional
//! branches.
//!
//! Lifting conditional branches is done by analyzing the *first* conditional branch encountered in
//! the entry region. The algorithm then identifies all blocks that are dominated by a specific
//! control flow edge and the region where control flow continues:
//!
//! ```text,ignore
//!                  +-----+
//!            +-----+ bb0 +----+
//!            v     +-----+    v
//! Region 1 +-+-+    ...     +-+-+ Region n
//!          +---+            +---+
//!           ...              ...
//!            |                |
//!            |      +---+     |
//!            +---->++   ++<---+
//!                  |     |
//!                  ++   ++ Region T
//!                   +---+
//! ```
//!
//! Every region following bb0 consists of 0 or more blocks that eventually branch to Region T. If
//! there are multiple entry blocks into Region T, a single entry block is created using a
//! multiplexer block as shown above. Region 1 to Region n are then lifted together with the
//! conditional control flow operation terminating bb0 into a structured conditional operation
//! followed by the operations of the entry block of Region T.
mod edges;
mod transform;

use midenc_hir::{
    adt::SmallSet, dominance::DominanceInfo, traits::BranchOpInterface, BlockRef, Builder,
    OpBuilder, Operation, OperationRef, Region, RegionRef, Report, SmallVec, SourceSpan, Type,
    Value, ValueRange, ValueRef, WalkResult,
};

use self::transform::TransformationContext;

/// This trait is used to abstract over the dialect-specific aspects of the control flow lifting
/// transformation performed by [transform_cfg_to_scf].
///
/// Implementations must be able to create switch-like control flow operations in order to
/// facilitate intermediate transformations; as well as the various structured control flow ops
/// represented by each method (e.g. `scf.if`, `scf.while`).
pub trait CFGToSCFInterface {
    /// Creates a structured control flow operation branching to one of `regions`.
    ///
    /// It replaces `control_flow_cond_op` and must produce `result_types` as results.
    ///
    /// `regions` contains the list of branch regions corresponding to each successor of
    /// `control_flow_cond_op`. Their bodies must simply be taken and left as is.
    ///
    /// Returns `Err` if incapable of converting the control flow graph operation.
    fn create_structured_branch_region_op(
        &self,
        builder: &mut OpBuilder,
        control_flow_cond_op: OperationRef,
        result_types: &[Type],
        regions: &mut SmallVec<[RegionRef; 2]>,
    ) -> Result<OperationRef, Report>;

    /// Creates a return-like terminator for a branch region of the op returned by
    /// [CFGToSCFInterface::create_structured_branch_region_op].
    ///
    /// * `branch_region_op` is the operation returned by `create_structured_branch_region_op`.
    /// * `replaced_control_flow_op` is the control flow op being replaced by the terminator or
    ///   `None` if the terminator is not replacing any existing control flow op.
    /// * `results` are the values that should be returned by the branch region.
    fn create_structured_branch_region_terminator_op(
        &self,
        span: SourceSpan,
        builder: &mut OpBuilder,
        branch_region_op: OperationRef,
        replaced_control_flow_op: Option<OperationRef>,
        results: ValueRange<'_, 2>,
    ) -> Result<(), Report>;

    /// Creates a structured control flow operation representing a do-while loop.
    ///
    /// The do-while loop is expected to have the exact same result types as the types of the
    /// iteration values. `loop_body` is the body of the loop.
    ///
    /// Implementations must create a suitable terminator op at the end of the last block in
    /// `loop_body` which continues the loop if `condition` is 1, and exits the loop if 0.
    ///
    /// `loop_values_next_iter` are the values that have to be passed as the iteration values for
    /// the next iteration if continuing, or the result of the loop if exiting.
    ///
    /// `condition` is guaranteed to be of the same type as values returned by
    /// `get_cfg_switch_value` with either 0 or 1 as value.
    ///
    /// `loop_values_init` are the values used to initialize the iteration values of the loop.
    ///
    /// Returns `Err` if incapable of creating a loop op.
    fn create_structured_do_while_loop_op(
        &self,
        builder: &mut OpBuilder,
        replaced_op: OperationRef,
        loop_values_init: ValueRange<'_, 2>,
        condition: ValueRef,
        loop_values_next_iter: ValueRange<'_, 2>,
        loop_body: RegionRef,
    ) -> Result<OperationRef, Report>;

    /// Creates a constant operation with a result representing `value` that is suitable as flag
    /// for `create_cfg_switch_op`.
    fn get_cfg_switch_value(
        &self,
        span: SourceSpan,
        builder: &mut OpBuilder,
        value: u32,
    ) -> ValueRef;

    /// Creates a switch-like unstructured branch operation, branching to one of `case_destinations`
    /// or `default_dest`.
    ///
    /// This is used by [transform_cfg_to_scfg] for intermediate transformations before lifting to
    /// structured control flow.
    ///
    /// The switch op branches based on `flag` which is guaranteed to be of the same type as values
    /// returned by `get_cfg_switch_value`. The insertion block of the builder is guaranteed to have
    /// its predecessors already set to create an equivalent CFG after this operation.
    ///
    /// NOTE: `case_values` and other related slices may be empty to represent an unconditional
    /// branch.
    #[allow(clippy::too_many_arguments)]
    fn create_cfg_switch_op(
        &self,
        span: SourceSpan,
        builder: &mut OpBuilder,
        flag: ValueRef,
        case_values: &[u32],
        case_destinations: &[BlockRef],
        case_arguments: &[ValueRange<'_, 2>],
        default_dest: BlockRef,
        default_args: ValueRange<'_, 2>,
    ) -> Result<(), Report>;

    /// Creates a constant operation returning an undefined instance of `type`.
    ///
    /// This is required by the transformation as the lifting process might create control flow
    /// paths where an SSA-value is undefined.
    fn get_undef_value(&self, span: SourceSpan, builder: &mut OpBuilder, ty: Type) -> ValueRef;

    /// Creates a return-like terminator indicating unreachable.
    ///
    /// This is required when the transformation encounters a statically known infinite loop. Since
    /// structured control flow ops are not terminators, after lifting an infinite loop, a
    /// terminator has to be placed after to possibly satisfy the terminator requirement of the
    /// region originally passed to [transform_cfg_to_scf].
    ///
    /// `region` is guaranteed to be the region originally passed to [transform_cfg_to_scf] and the
    /// op is guaranteed to always be an op in a block directly nested under `region` after the
    /// transformation.
    ///
    /// Returns `Err` if incapable of creating an unreachable terminator.
    fn create_unreachable_terminator(
        &self,
        span: SourceSpan,
        builder: &mut OpBuilder,
        region: RegionRef,
    ) -> Result<OperationRef, Report>;

    /// Helper function to create an unconditional branch using [create_cfg_switch_op].
    fn create_single_destination_branch(
        &self,
        span: SourceSpan,
        builder: &mut OpBuilder,
        dummy_flag: ValueRef,
        destination: BlockRef,
        arguments: ValueRange<'_, 2>,
    ) -> Result<(), Report> {
        self.create_cfg_switch_op(span, builder, dummy_flag, &[], &[], &[], destination, arguments)
    }

    /// Helper function to create a conditional branch using [create_cfg_switch_op].
    #[allow(clippy::too_many_arguments)]
    fn create_conditional_branch(
        &self,
        span: SourceSpan,
        builder: &mut OpBuilder,
        condition: ValueRef,
        true_dest: BlockRef,
        true_args: ValueRange<'_, 2>,
        false_dest: BlockRef,
        false_args: ValueRange<'_, 2>,
    ) -> Result<(), Report> {
        self.create_cfg_switch_op(
            span,
            builder,
            condition,
            &[0],
            &[false_dest],
            &[false_args],
            true_dest,
            true_args,
        )
    }
}

/// This function transforms all unstructured control flow operations within `region`, to equivalent
/// structured control flow operations.
///
/// This transformation is dialect-agnostic, facilitated by delegating the dialect-specific aspects
/// of lifting operations, to the implementation of [CFGToSCFInterface] that is provided.
///
/// If the region contains only a single kind of return-like operation, all control flow graph
/// operations will be converted successfully. Otherwise a single control flow graph operation
/// branching to one block per return-like operation kind remains.
///
/// The transformation currently requires that all control flow graph operations have no side
/// effects, implement the [crate::traits::BranchOpInterface], and do not have any operation-
/// produced successor operands.
///
/// Returns `Err` if any of the preconditions are violated or if any of the methods of `interface`
/// failed. The IR is left in an unspecified state in such cases.
///
/// If successful, returns a boolean indicating whether the IR was changed.
pub fn transform_cfg_to_scf(
    region: RegionRef,
    interface: &mut dyn CFGToSCFInterface,
    dominance_info: &mut DominanceInfo,
) -> Result<bool, Report> {
    {
        let region = region.borrow();
        if region.is_empty() || region.has_one_block() {
            return Ok(false);
        }

        check_transformation_preconditions(&region)?;
    }

    let mut transform_ctx = TransformationContext::new(region, interface, dominance_info)?;

    let mut worklist = SmallVec::<[BlockRef; 4]>::from_slice(&[transform_ctx.entry()]);
    while let Some(current) = worklist.pop() {
        // Turn all top-level cycles in the CFG to structured control flow first.
        // After this transformation, the remaining CFG ops form a DAG.
        let mut new_regions = transform_ctx.transform_cycles_to_scf_loops(current)?;

        // Add the newly created subregions to the worklist. These are the bodies of the loops.
        worklist.extend(new_regions.iter().copied());
        // Invalidate the dominance tree as blocks have been moved, created and added during the
        // cycle to structured loop transformation.
        if !new_regions.is_empty() {
            let parent_region = current.parent().unwrap();
            transform_ctx.invalidate_dominance_info_for_region(parent_region);
        }
        new_regions = transform_ctx.transform_to_structured_cf_branches(current)?;

        // Invalidating the dominance tree is generally not required by the transformation above as
        // the new region entries correspond to unaffected subtrees in the dominator tree. Only its
        // parent nodes have changed but won't be visited again.
        worklist.extend(new_regions);
    }

    // Clean up garbage we may have created during the transformation
    //
    // NOTE: This is not guaranteed to clean up _everything_ that may be garbage, only things we
    // have accounted for. Canonicalization and other optimization passes can take care of anything
    // else that may remain
    transform_ctx.garbage_collect();

    Ok(true)
}

/// Checks all preconditions of the transformation prior to any transformations.
///
/// Returns failure if any precondition is violated.
fn check_transformation_preconditions(region: &Region) -> Result<(), Report> {
    use midenc_hir::SuccessorOperands;

    for block in region.body() {
        if !block.has_predecessors() && !block.is_entry_block() {
            return Err(Report::msg("transformation does not support unreachable blocks"));
        }
    }

    let walk_result = region.prewalk(|op: &Operation| {
        if !op.has_successors() {
            return WalkResult::Skip;
        }

        // This transformation requires all ops with successors to implement the branch op interface.
        // It is impossible to adjust their block arguments otherwise.
        let branch_op_interface = match op.as_trait::<dyn BranchOpInterface>().ok_or_else(|| {
            Report::msg(
                "transformation does not support terminators with successors not implementing \
                 BranchOpInterface",
            )
        }) {
            Ok(boi) => boi,
            Err(err) => return WalkResult::Break(err),
        };

        // Branch operations must have no side effects. Replacing them would not be valid otherwise.
        if !op.is_memory_effect_free() {
            return WalkResult::Break(Report::msg(
                "transformation does not support terminators with side effects",
            ));
        }

        for index in 0..op.num_successors() {
            let succ_ops = branch_op_interface.get_successor_operands(index);

            // We cannot support operations with operation-produced successor operands as it is
            // currently not possible to pass them to any block arguments other than the first. This
            // breaks creating multiplexer blocks and would likely need special handling elsewhere
            // too.
            if succ_ops.num_produced() == 0 {
                continue;
            }

            return WalkResult::Break(Report::msg(
                "transformation does not support operations with operation-produced successor \
                 operands",
            ));
        }

        WalkResult::Continue(())
    });

    walk_result.into_result()
}