midenc_hir_transform/cfg_to_scf.rs
1//! This code is an implementation of the algorithm described in _Perfect Reconstructability of
2//! Control Flow from Demand Dependence Graphs_, by Bahmann, Reismann, Jahre, and Meyer. 2015.
3//! See https://doi.org/10.1145/2693261.
4//!
5//! It defines an algorithm to translate any control flow graph with a single entry and single exit
6//! block into structured control flow operations consisting of regions of do-while loops and
7//! operations conditionally dispatching to one out of multiple regions before continuing after the
8//! operation. This includes control flow graphs containing irreducible control flow.
9//!
10//! The implementation here additionally supports the transformation on regions with multiple exit
11//! blocks. This is implemented by first transforming all occurrences of return-like operations to
12//! branch to a single exit block containing an instance of that return-like operation. If there are
13//! multiple kinds of return-like operations, multiple exit blocks are created. In that case the
14//! transformation leaves behind a conditional control flow graph operation that dispatches to the
15//! given regions terminating with different kinds of return-like operations each.
16//!
17//! If the function only contains a single kind of return-like operations, it is guaranteed that all
18//! control flow graph ops will be lifted to structured control flow, and that no more control flow
19//! graph ops remain after the operation.
20//!
21//! The algorithm to lift CFGs consists of two transformations applied after each other on any
22//! single-entry, single-exit region:
23//!
24//! 1. Lifting cycles to structured control flow loops
25//! 2. Lifting conditional branches to structured control flow branches
26//!
27//! These are then applied recursively on any new single-entry single-exit regions created by the
28//! transformation until no more CFG operations remain.
29//!
30//! The first part of cycle lifting is to detect any cycles in the CFG. This is done using an
31//! algorithm for iterating over SCCs. Every SCC representing a cycle is then transformed into a
32//! structured loop with a single entry block and a single latch containing the only back edge to
33//! the entry block and the only edge to an exit block outside the loop. Rerouting control flow to
34//! create single entry and exit blocks is achieved via a multiplexer construct that can be
35//! visualized as follows:
36//!
37//! ```text,ignore
38//! +-----+ +-----+ +-----+
39//! | bb0 | | bb1 |...| bbN |
40//! +--+--+ +--+--+ +-+---+
41//! | | |
42//! | v |
43//! | +------+ |
44//! | ++ ++<----+
45//! | | Region |
46//! +>| |<----+
47//! ++ ++ |
48//! +------+------+
49//! ```
50//!
51//! The above transforms to:
52//!
53//! ```text,ignore
54//! +-----+ +-----+ +-----+
55//! | bb0 | | bb1 |...| bbN |
56//! +-----+ +--|--+ ++----+
57//! | v |
58//! +->+-----+<---+
59//! | bbM |<-------+
60//! +---+-+ |
61//! +---+ | +----+ |
62//! | v | |
63//! | +------+ | |
64//! | ++ ++<-+ |
65//! +->| Region | |
66//! ++ ++ |
67//! +------+-------+
68//! ```
69//!
70//! bbM in the above is the multiplexer block, and any block previously branching to an entry block
71//! of the region are redirected to it. This includes any branches from within the region. Using a
72//! block argument, bbM then dispatches to the correct entry block of the region dependent on the
73//! predecessor.
74//!
75//! A similar transformation is done to create the latch block with the single back edge and loop
76//! exit edge.
77//!
78//! The above form has the advantage that bbM now acts as the loop header of the loop body. After
79//! the transformation on the latch, this results in a structured loop that can then be lifted to
80//! structured control flow. The conditional branches created in bbM are later lifted to conditional
81//! branches.
82//!
83//! Lifting conditional branches is done by analyzing the *first* conditional branch encountered in
84//! the entry region. The algorithm then identifies all blocks that are dominated by a specific
85//! control flow edge and the region where control flow continues:
86//!
87//! ```text,ignore
88//! +-----+
89//! +-----+ bb0 +----+
90//! v +-----+ v
91//! Region 1 +-+-+ ... +-+-+ Region n
92//! +---+ +---+
93//! ... ...
94//! | |
95//! | +---+ |
96//! +---->++ ++<---+
97//! | |
98//! ++ ++ Region T
99//! +---+
100//! ```
101//!
102//! Every region following bb0 consists of 0 or more blocks that eventually branch to Region T. If
103//! there are multiple entry blocks into Region T, a single entry block is created using a
104//! multiplexer block as shown above. Region 1 to Region n are then lifted together with the
105//! conditional control flow operation terminating bb0 into a structured conditional operation
106//! followed by the operations of the entry block of Region T.
107mod edges;
108mod transform;
109
110use midenc_hir::{
111 adt::SmallSet, dominance::DominanceInfo, traits::BranchOpInterface, BlockRef, Builder,
112 OpBuilder, Operation, OperationRef, Region, RegionRef, Report, SmallVec, SourceSpan, Type,
113 Value, ValueRange, ValueRef, WalkResult,
114};
115
116use self::transform::TransformationContext;
117
118/// This trait is used to abstract over the dialect-specific aspects of the control flow lifting
119/// transformation performed by [transform_cfg_to_scf].
120///
121/// Implementations must be able to create switch-like control flow operations in order to
122/// facilitate intermediate transformations; as well as the various structured control flow ops
123/// represented by each method (e.g. `scf.if`, `scf.while`).
124pub trait CFGToSCFInterface {
125 /// Creates a structured control flow operation branching to one of `regions`.
126 ///
127 /// It replaces `control_flow_cond_op` and must produce `result_types` as results.
128 ///
129 /// `regions` contains the list of branch regions corresponding to each successor of
130 /// `control_flow_cond_op`. Their bodies must simply be taken and left as is.
131 ///
132 /// Returns `Err` if incapable of converting the control flow graph operation.
133 fn create_structured_branch_region_op(
134 &self,
135 builder: &mut OpBuilder,
136 control_flow_cond_op: OperationRef,
137 result_types: &[Type],
138 regions: &mut SmallVec<[RegionRef; 2]>,
139 ) -> Result<OperationRef, Report>;
140
141 /// Creates a return-like terminator for a branch region of the op returned by
142 /// [CFGToSCFInterface::create_structured_branch_region_op].
143 ///
144 /// * `branch_region_op` is the operation returned by `create_structured_branch_region_op`.
145 /// * `replaced_control_flow_op` is the control flow op being replaced by the terminator or
146 /// `None` if the terminator is not replacing any existing control flow op.
147 /// * `results` are the values that should be returned by the branch region.
148 fn create_structured_branch_region_terminator_op(
149 &self,
150 span: SourceSpan,
151 builder: &mut OpBuilder,
152 branch_region_op: OperationRef,
153 replaced_control_flow_op: Option<OperationRef>,
154 results: ValueRange<'_, 2>,
155 ) -> Result<(), Report>;
156
157 /// Creates a structured control flow operation representing a do-while loop.
158 ///
159 /// The do-while loop is expected to have the exact same result types as the types of the
160 /// iteration values. `loop_body` is the body of the loop.
161 ///
162 /// Implementations must create a suitable terminator op at the end of the last block in
163 /// `loop_body` which continues the loop if `condition` is 1, and exits the loop if 0.
164 ///
165 /// `loop_values_next_iter` are the values that have to be passed as the iteration values for
166 /// the next iteration if continuing, or the result of the loop if exiting.
167 ///
168 /// `condition` is guaranteed to be of the same type as values returned by
169 /// `get_cfg_switch_value` with either 0 or 1 as value.
170 ///
171 /// `loop_values_init` are the values used to initialize the iteration values of the loop.
172 ///
173 /// Returns `Err` if incapable of creating a loop op.
174 fn create_structured_do_while_loop_op(
175 &self,
176 builder: &mut OpBuilder,
177 replaced_op: OperationRef,
178 loop_values_init: ValueRange<'_, 2>,
179 condition: ValueRef,
180 loop_values_next_iter: ValueRange<'_, 2>,
181 loop_body: RegionRef,
182 ) -> Result<OperationRef, Report>;
183
184 /// Creates a constant operation with a result representing `value` that is suitable as flag
185 /// for `create_cfg_switch_op`.
186 fn get_cfg_switch_value(
187 &self,
188 span: SourceSpan,
189 builder: &mut OpBuilder,
190 value: u32,
191 ) -> ValueRef;
192
193 /// Creates a switch-like unstructured branch operation, branching to one of `case_destinations`
194 /// or `default_dest`.
195 ///
196 /// This is used by [transform_cfg_to_scfg] for intermediate transformations before lifting to
197 /// structured control flow.
198 ///
199 /// The switch op branches based on `flag` which is guaranteed to be of the same type as values
200 /// returned by `get_cfg_switch_value`. The insertion block of the builder is guaranteed to have
201 /// its predecessors already set to create an equivalent CFG after this operation.
202 ///
203 /// NOTE: `case_values` and other related slices may be empty to represent an unconditional
204 /// branch.
205 #[allow(clippy::too_many_arguments)]
206 fn create_cfg_switch_op(
207 &self,
208 span: SourceSpan,
209 builder: &mut OpBuilder,
210 flag: ValueRef,
211 case_values: &[u32],
212 case_destinations: &[BlockRef],
213 case_arguments: &[ValueRange<'_, 2>],
214 default_dest: BlockRef,
215 default_args: ValueRange<'_, 2>,
216 ) -> Result<(), Report>;
217
218 /// Creates a constant operation returning an undefined instance of `type`.
219 ///
220 /// This is required by the transformation as the lifting process might create control flow
221 /// paths where an SSA-value is undefined.
222 fn get_undef_value(&self, span: SourceSpan, builder: &mut OpBuilder, ty: Type) -> ValueRef;
223
224 /// Creates a return-like terminator indicating unreachable.
225 ///
226 /// This is required when the transformation encounters a statically known infinite loop. Since
227 /// structured control flow ops are not terminators, after lifting an infinite loop, a
228 /// terminator has to be placed after to possibly satisfy the terminator requirement of the
229 /// region originally passed to [transform_cfg_to_scf].
230 ///
231 /// `region` is guaranteed to be the region originally passed to [transform_cfg_to_scf] and the
232 /// op is guaranteed to always be an op in a block directly nested under `region` after the
233 /// transformation.
234 ///
235 /// Returns `Err` if incapable of creating an unreachable terminator.
236 fn create_unreachable_terminator(
237 &self,
238 span: SourceSpan,
239 builder: &mut OpBuilder,
240 region: RegionRef,
241 ) -> Result<OperationRef, Report>;
242
243 /// Helper function to create an unconditional branch using [create_cfg_switch_op].
244 fn create_single_destination_branch(
245 &self,
246 span: SourceSpan,
247 builder: &mut OpBuilder,
248 dummy_flag: ValueRef,
249 destination: BlockRef,
250 arguments: ValueRange<'_, 2>,
251 ) -> Result<(), Report> {
252 self.create_cfg_switch_op(span, builder, dummy_flag, &[], &[], &[], destination, arguments)
253 }
254
255 /// Helper function to create a conditional branch using [create_cfg_switch_op].
256 #[allow(clippy::too_many_arguments)]
257 fn create_conditional_branch(
258 &self,
259 span: SourceSpan,
260 builder: &mut OpBuilder,
261 condition: ValueRef,
262 true_dest: BlockRef,
263 true_args: ValueRange<'_, 2>,
264 false_dest: BlockRef,
265 false_args: ValueRange<'_, 2>,
266 ) -> Result<(), Report> {
267 self.create_cfg_switch_op(
268 span,
269 builder,
270 condition,
271 &[0],
272 &[false_dest],
273 &[false_args],
274 true_dest,
275 true_args,
276 )
277 }
278}
279
280/// This function transforms all unstructured control flow operations within `region`, to equivalent
281/// structured control flow operations.
282///
283/// This transformation is dialect-agnostic, facilitated by delegating the dialect-specific aspects
284/// of lifting operations, to the implementation of [CFGToSCFInterface] that is provided.
285///
286/// If the region contains only a single kind of return-like operation, all control flow graph
287/// operations will be converted successfully. Otherwise a single control flow graph operation
288/// branching to one block per return-like operation kind remains.
289///
290/// The transformation currently requires that all control flow graph operations have no side
291/// effects, implement the [crate::traits::BranchOpInterface], and do not have any operation-
292/// produced successor operands.
293///
294/// Returns `Err` if any of the preconditions are violated or if any of the methods of `interface`
295/// failed. The IR is left in an unspecified state in such cases.
296///
297/// If successful, returns a boolean indicating whether the IR was changed.
298pub fn transform_cfg_to_scf(
299 region: RegionRef,
300 interface: &mut dyn CFGToSCFInterface,
301 dominance_info: &mut DominanceInfo,
302) -> Result<bool, Report> {
303 {
304 let region = region.borrow();
305 if region.is_empty() || region.has_one_block() {
306 return Ok(false);
307 }
308
309 check_transformation_preconditions(®ion)?;
310 }
311
312 let mut transform_ctx = TransformationContext::new(region, interface, dominance_info)?;
313
314 let mut worklist = SmallVec::<[BlockRef; 4]>::from_slice(&[transform_ctx.entry()]);
315 while let Some(current) = worklist.pop() {
316 // Turn all top-level cycles in the CFG to structured control flow first.
317 // After this transformation, the remaining CFG ops form a DAG.
318 let mut new_regions = transform_ctx.transform_cycles_to_scf_loops(current)?;
319
320 // Add the newly created subregions to the worklist. These are the bodies of the loops.
321 worklist.extend(new_regions.iter().copied());
322 // Invalidate the dominance tree as blocks have been moved, created and added during the
323 // cycle to structured loop transformation.
324 if !new_regions.is_empty() {
325 let parent_region = current.parent().unwrap();
326 transform_ctx.invalidate_dominance_info_for_region(parent_region);
327 }
328 new_regions = transform_ctx.transform_to_structured_cf_branches(current)?;
329
330 // Invalidating the dominance tree is generally not required by the transformation above as
331 // the new region entries correspond to unaffected subtrees in the dominator tree. Only its
332 // parent nodes have changed but won't be visited again.
333 worklist.extend(new_regions);
334 }
335
336 // Clean up garbage we may have created during the transformation
337 //
338 // NOTE: This is not guaranteed to clean up _everything_ that may be garbage, only things we
339 // have accounted for. Canonicalization and other optimization passes can take care of anything
340 // else that may remain
341 transform_ctx.garbage_collect();
342
343 Ok(true)
344}
345
346/// Checks all preconditions of the transformation prior to any transformations.
347///
348/// Returns failure if any precondition is violated.
349fn check_transformation_preconditions(region: &Region) -> Result<(), Report> {
350 use midenc_hir::SuccessorOperands;
351
352 for block in region.body() {
353 if !block.has_predecessors() && !block.is_entry_block() {
354 return Err(Report::msg("transformation does not support unreachable blocks"));
355 }
356 }
357
358 let walk_result = region.prewalk(|op: &Operation| {
359 if !op.has_successors() {
360 return WalkResult::Skip;
361 }
362
363 // This transformation requires all ops with successors to implement the branch op interface.
364 // It is impossible to adjust their block arguments otherwise.
365 let branch_op_interface = match op.as_trait::<dyn BranchOpInterface>().ok_or_else(|| {
366 Report::msg(
367 "transformation does not support terminators with successors not implementing \
368 BranchOpInterface",
369 )
370 }) {
371 Ok(boi) => boi,
372 Err(err) => return WalkResult::Break(err),
373 };
374
375 // Branch operations must have no side effects. Replacing them would not be valid otherwise.
376 if !op.is_memory_effect_free() {
377 return WalkResult::Break(Report::msg(
378 "transformation does not support terminators with side effects",
379 ));
380 }
381
382 for index in 0..op.num_successors() {
383 let succ_ops = branch_op_interface.get_successor_operands(index);
384
385 // We cannot support operations with operation-produced successor operands as it is
386 // currently not possible to pass them to any block arguments other than the first. This
387 // breaks creating multiplexer blocks and would likely need special handling elsewhere
388 // too.
389 if succ_ops.num_produced() == 0 {
390 continue;
391 }
392
393 return WalkResult::Break(Report::msg(
394 "transformation does not support operations with operation-produced successor \
395 operands",
396 ));
397 }
398
399 WalkResult::Continue(())
400 });
401
402 walk_result.into_result()
403}