midenc_hir_transform/
cfg_to_scf.rs

1//! This code is an implementation of the algorithm described in _Perfect Reconstructability of
2//! Control Flow from Demand Dependence Graphs_, by Bahmann, Reismann, Jahre, and Meyer. 2015.
3//! See https://doi.org/10.1145/2693261.
4//!
5//! It defines an algorithm to translate any control flow graph with a single entry and single exit
6//! block into structured control flow operations consisting of regions of do-while loops and
7//! operations conditionally dispatching to one out of multiple regions before continuing after the
8//! operation. This includes control flow graphs containing irreducible control flow.
9//!
10//! The implementation here additionally supports the transformation on regions with multiple exit
11//! blocks. This is implemented by first transforming all occurrences of return-like operations to
12//! branch to a single exit block containing an instance of that return-like operation. If there are
13//! multiple kinds of return-like operations, multiple exit blocks are created. In that case the
14//! transformation leaves behind a conditional control flow graph operation that dispatches to the
15//! given regions terminating with different kinds of return-like operations each.
16//!
17//! If the function only contains a single kind of return-like operations, it is guaranteed that all
18//! control flow graph ops will be lifted to structured control flow, and that no more control flow
19//! graph ops remain after the operation.
20//!
21//! The algorithm to lift CFGs consists of two transformations applied after each other on any
22//! single-entry, single-exit region:
23//!
24//! 1. Lifting cycles to structured control flow loops
25//! 2. Lifting conditional branches to structured control flow branches
26//!
27//! These are then applied recursively on any new single-entry single-exit regions created by the
28//! transformation until no more CFG operations remain.
29//!
30//! The first part of cycle lifting is to detect any cycles in the CFG. This is done using an
31//! algorithm for iterating over SCCs. Every SCC representing a cycle is then transformed into a
32//! structured loop with a single entry block and a single latch containing the only back edge to
33//! the entry block and the only edge to an exit block outside the loop. Rerouting control flow to
34//! create single entry and exit blocks is achieved via a multiplexer construct that can be
35//! visualized as follows:
36//!
37//! ```text,ignore
38//! +-----+ +-----+   +-----+
39//! | bb0 | | bb1 |...| bbN |
40//! +--+--+ +--+--+   +-+---+
41//!    |       |        |
42//!    |       v        |
43//!    |  +------+      |
44//!    | ++      ++<----+
45//!    | | Region |
46//!    +>|        |<----+
47//!      ++      ++     |
48//!       +------+------+
49//! ```
50//!
51//! The above transforms to:
52//!
53//! ```text,ignore
54//! +-----+ +-----+   +-----+
55//! | bb0 | | bb1 |...| bbN |
56//! +-----+ +--|--+   ++----+
57//!      |     v       |
58//!      +->+-----+<---+
59//!         | bbM |<-------+
60//!         +---+-+        |
61//!     +---+   | +----+   |
62//!     |       v      |   |
63//!     |   +------+   |   |
64//!     |  ++      ++<-+   |
65//!     +->| Region |      |
66//!        ++      ++      |
67//!         +------+-------+
68//! ```
69//!
70//! bbM in the above is the multiplexer block, and any block previously branching to an entry block
71//! of the region are redirected to it. This includes any branches from within the region. Using a
72//! block argument, bbM then dispatches to the correct entry block of the region dependent on the
73//! predecessor.
74//!
75//! A similar transformation is done to create the latch block with the single back edge and loop
76//! exit edge.
77//!
78//! The above form has the advantage that bbM now acts as the loop header of the loop body. After
79//! the transformation on the latch, this results in a structured loop that can then be lifted to
80//! structured control flow. The conditional branches created in bbM are later lifted to conditional
81//! branches.
82//!
83//! Lifting conditional branches is done by analyzing the *first* conditional branch encountered in
84//! the entry region. The algorithm then identifies all blocks that are dominated by a specific
85//! control flow edge and the region where control flow continues:
86//!
87//! ```text,ignore
88//!                  +-----+
89//!            +-----+ bb0 +----+
90//!            v     +-----+    v
91//! Region 1 +-+-+    ...     +-+-+ Region n
92//!          +---+            +---+
93//!           ...              ...
94//!            |                |
95//!            |      +---+     |
96//!            +---->++   ++<---+
97//!                  |     |
98//!                  ++   ++ Region T
99//!                   +---+
100//! ```
101//!
102//! Every region following bb0 consists of 0 or more blocks that eventually branch to Region T. If
103//! there are multiple entry blocks into Region T, a single entry block is created using a
104//! multiplexer block as shown above. Region 1 to Region n are then lifted together with the
105//! conditional control flow operation terminating bb0 into a structured conditional operation
106//! followed by the operations of the entry block of Region T.
107mod edges;
108mod transform;
109
110use midenc_hir::{
111    adt::SmallSet, dominance::DominanceInfo, traits::BranchOpInterface, BlockRef, Builder,
112    OpBuilder, Operation, OperationRef, Region, RegionRef, Report, SmallVec, SourceSpan, Type,
113    Value, ValueRange, ValueRef, WalkResult,
114};
115
116use self::transform::TransformationContext;
117
118/// This trait is used to abstract over the dialect-specific aspects of the control flow lifting
119/// transformation performed by [transform_cfg_to_scf].
120///
121/// Implementations must be able to create switch-like control flow operations in order to
122/// facilitate intermediate transformations; as well as the various structured control flow ops
123/// represented by each method (e.g. `scf.if`, `scf.while`).
124pub trait CFGToSCFInterface {
125    /// Creates a structured control flow operation branching to one of `regions`.
126    ///
127    /// It replaces `control_flow_cond_op` and must produce `result_types` as results.
128    ///
129    /// `regions` contains the list of branch regions corresponding to each successor of
130    /// `control_flow_cond_op`. Their bodies must simply be taken and left as is.
131    ///
132    /// Returns `Err` if incapable of converting the control flow graph operation.
133    fn create_structured_branch_region_op(
134        &self,
135        builder: &mut OpBuilder,
136        control_flow_cond_op: OperationRef,
137        result_types: &[Type],
138        regions: &mut SmallVec<[RegionRef; 2]>,
139    ) -> Result<OperationRef, Report>;
140
141    /// Creates a return-like terminator for a branch region of the op returned by
142    /// [CFGToSCFInterface::create_structured_branch_region_op].
143    ///
144    /// * `branch_region_op` is the operation returned by `create_structured_branch_region_op`.
145    /// * `replaced_control_flow_op` is the control flow op being replaced by the terminator or
146    ///   `None` if the terminator is not replacing any existing control flow op.
147    /// * `results` are the values that should be returned by the branch region.
148    fn create_structured_branch_region_terminator_op(
149        &self,
150        span: SourceSpan,
151        builder: &mut OpBuilder,
152        branch_region_op: OperationRef,
153        replaced_control_flow_op: Option<OperationRef>,
154        results: ValueRange<'_, 2>,
155    ) -> Result<(), Report>;
156
157    /// Creates a structured control flow operation representing a do-while loop.
158    ///
159    /// The do-while loop is expected to have the exact same result types as the types of the
160    /// iteration values. `loop_body` is the body of the loop.
161    ///
162    /// Implementations must create a suitable terminator op at the end of the last block in
163    /// `loop_body` which continues the loop if `condition` is 1, and exits the loop if 0.
164    ///
165    /// `loop_values_next_iter` are the values that have to be passed as the iteration values for
166    /// the next iteration if continuing, or the result of the loop if exiting.
167    ///
168    /// `condition` is guaranteed to be of the same type as values returned by
169    /// `get_cfg_switch_value` with either 0 or 1 as value.
170    ///
171    /// `loop_values_init` are the values used to initialize the iteration values of the loop.
172    ///
173    /// Returns `Err` if incapable of creating a loop op.
174    fn create_structured_do_while_loop_op(
175        &self,
176        builder: &mut OpBuilder,
177        replaced_op: OperationRef,
178        loop_values_init: ValueRange<'_, 2>,
179        condition: ValueRef,
180        loop_values_next_iter: ValueRange<'_, 2>,
181        loop_body: RegionRef,
182    ) -> Result<OperationRef, Report>;
183
184    /// Creates a constant operation with a result representing `value` that is suitable as flag
185    /// for `create_cfg_switch_op`.
186    fn get_cfg_switch_value(
187        &self,
188        span: SourceSpan,
189        builder: &mut OpBuilder,
190        value: u32,
191    ) -> ValueRef;
192
193    /// Creates a switch-like unstructured branch operation, branching to one of `case_destinations`
194    /// or `default_dest`.
195    ///
196    /// This is used by [transform_cfg_to_scfg] for intermediate transformations before lifting to
197    /// structured control flow.
198    ///
199    /// The switch op branches based on `flag` which is guaranteed to be of the same type as values
200    /// returned by `get_cfg_switch_value`. The insertion block of the builder is guaranteed to have
201    /// its predecessors already set to create an equivalent CFG after this operation.
202    ///
203    /// NOTE: `case_values` and other related slices may be empty to represent an unconditional
204    /// branch.
205    #[allow(clippy::too_many_arguments)]
206    fn create_cfg_switch_op(
207        &self,
208        span: SourceSpan,
209        builder: &mut OpBuilder,
210        flag: ValueRef,
211        case_values: &[u32],
212        case_destinations: &[BlockRef],
213        case_arguments: &[ValueRange<'_, 2>],
214        default_dest: BlockRef,
215        default_args: ValueRange<'_, 2>,
216    ) -> Result<(), Report>;
217
218    /// Creates a constant operation returning an undefined instance of `type`.
219    ///
220    /// This is required by the transformation as the lifting process might create control flow
221    /// paths where an SSA-value is undefined.
222    fn get_undef_value(&self, span: SourceSpan, builder: &mut OpBuilder, ty: Type) -> ValueRef;
223
224    /// Creates a return-like terminator indicating unreachable.
225    ///
226    /// This is required when the transformation encounters a statically known infinite loop. Since
227    /// structured control flow ops are not terminators, after lifting an infinite loop, a
228    /// terminator has to be placed after to possibly satisfy the terminator requirement of the
229    /// region originally passed to [transform_cfg_to_scf].
230    ///
231    /// `region` is guaranteed to be the region originally passed to [transform_cfg_to_scf] and the
232    /// op is guaranteed to always be an op in a block directly nested under `region` after the
233    /// transformation.
234    ///
235    /// Returns `Err` if incapable of creating an unreachable terminator.
236    fn create_unreachable_terminator(
237        &self,
238        span: SourceSpan,
239        builder: &mut OpBuilder,
240        region: RegionRef,
241    ) -> Result<OperationRef, Report>;
242
243    /// Helper function to create an unconditional branch using [create_cfg_switch_op].
244    fn create_single_destination_branch(
245        &self,
246        span: SourceSpan,
247        builder: &mut OpBuilder,
248        dummy_flag: ValueRef,
249        destination: BlockRef,
250        arguments: ValueRange<'_, 2>,
251    ) -> Result<(), Report> {
252        self.create_cfg_switch_op(span, builder, dummy_flag, &[], &[], &[], destination, arguments)
253    }
254
255    /// Helper function to create a conditional branch using [create_cfg_switch_op].
256    #[allow(clippy::too_many_arguments)]
257    fn create_conditional_branch(
258        &self,
259        span: SourceSpan,
260        builder: &mut OpBuilder,
261        condition: ValueRef,
262        true_dest: BlockRef,
263        true_args: ValueRange<'_, 2>,
264        false_dest: BlockRef,
265        false_args: ValueRange<'_, 2>,
266    ) -> Result<(), Report> {
267        self.create_cfg_switch_op(
268            span,
269            builder,
270            condition,
271            &[0],
272            &[false_dest],
273            &[false_args],
274            true_dest,
275            true_args,
276        )
277    }
278}
279
280/// This function transforms all unstructured control flow operations within `region`, to equivalent
281/// structured control flow operations.
282///
283/// This transformation is dialect-agnostic, facilitated by delegating the dialect-specific aspects
284/// of lifting operations, to the implementation of [CFGToSCFInterface] that is provided.
285///
286/// If the region contains only a single kind of return-like operation, all control flow graph
287/// operations will be converted successfully. Otherwise a single control flow graph operation
288/// branching to one block per return-like operation kind remains.
289///
290/// The transformation currently requires that all control flow graph operations have no side
291/// effects, implement the [crate::traits::BranchOpInterface], and do not have any operation-
292/// produced successor operands.
293///
294/// Returns `Err` if any of the preconditions are violated or if any of the methods of `interface`
295/// failed. The IR is left in an unspecified state in such cases.
296///
297/// If successful, returns a boolean indicating whether the IR was changed.
298pub fn transform_cfg_to_scf(
299    region: RegionRef,
300    interface: &mut dyn CFGToSCFInterface,
301    dominance_info: &mut DominanceInfo,
302) -> Result<bool, Report> {
303    {
304        let region = region.borrow();
305        if region.is_empty() || region.has_one_block() {
306            return Ok(false);
307        }
308
309        check_transformation_preconditions(&region)?;
310    }
311
312    let mut transform_ctx = TransformationContext::new(region, interface, dominance_info)?;
313
314    let mut worklist = SmallVec::<[BlockRef; 4]>::from_slice(&[transform_ctx.entry()]);
315    while let Some(current) = worklist.pop() {
316        // Turn all top-level cycles in the CFG to structured control flow first.
317        // After this transformation, the remaining CFG ops form a DAG.
318        let mut new_regions = transform_ctx.transform_cycles_to_scf_loops(current)?;
319
320        // Add the newly created subregions to the worklist. These are the bodies of the loops.
321        worklist.extend(new_regions.iter().copied());
322        // Invalidate the dominance tree as blocks have been moved, created and added during the
323        // cycle to structured loop transformation.
324        if !new_regions.is_empty() {
325            let parent_region = current.parent().unwrap();
326            transform_ctx.invalidate_dominance_info_for_region(parent_region);
327        }
328        new_regions = transform_ctx.transform_to_structured_cf_branches(current)?;
329
330        // Invalidating the dominance tree is generally not required by the transformation above as
331        // the new region entries correspond to unaffected subtrees in the dominator tree. Only its
332        // parent nodes have changed but won't be visited again.
333        worklist.extend(new_regions);
334    }
335
336    // Clean up garbage we may have created during the transformation
337    //
338    // NOTE: This is not guaranteed to clean up _everything_ that may be garbage, only things we
339    // have accounted for. Canonicalization and other optimization passes can take care of anything
340    // else that may remain
341    transform_ctx.garbage_collect();
342
343    Ok(true)
344}
345
346/// Checks all preconditions of the transformation prior to any transformations.
347///
348/// Returns failure if any precondition is violated.
349fn check_transformation_preconditions(region: &Region) -> Result<(), Report> {
350    use midenc_hir::SuccessorOperands;
351
352    for block in region.body() {
353        if !block.has_predecessors() && !block.is_entry_block() {
354            return Err(Report::msg("transformation does not support unreachable blocks"));
355        }
356    }
357
358    let walk_result = region.prewalk(|op: &Operation| {
359        if !op.has_successors() {
360            return WalkResult::Skip;
361        }
362
363        // This transformation requires all ops with successors to implement the branch op interface.
364        // It is impossible to adjust their block arguments otherwise.
365        let branch_op_interface = match op.as_trait::<dyn BranchOpInterface>().ok_or_else(|| {
366            Report::msg(
367                "transformation does not support terminators with successors not implementing \
368                 BranchOpInterface",
369            )
370        }) {
371            Ok(boi) => boi,
372            Err(err) => return WalkResult::Break(err),
373        };
374
375        // Branch operations must have no side effects. Replacing them would not be valid otherwise.
376        if !op.is_memory_effect_free() {
377            return WalkResult::Break(Report::msg(
378                "transformation does not support terminators with side effects",
379            ));
380        }
381
382        for index in 0..op.num_successors() {
383            let succ_ops = branch_op_interface.get_successor_operands(index);
384
385            // We cannot support operations with operation-produced successor operands as it is
386            // currently not possible to pass them to any block arguments other than the first. This
387            // breaks creating multiplexer blocks and would likely need special handling elsewhere
388            // too.
389            if succ_ops.num_produced() == 0 {
390                continue;
391            }
392
393            return WalkResult::Break(Report::msg(
394                "transformation does not support operations with operation-produced successor \
395                 operands",
396            ));
397        }
398
399        WalkResult::Continue(())
400    });
401
402    walk_result.into_result()
403}