cranelift_codegen/machinst/buffer.rs
1//! In-memory representation of compiled machine code, with labels and fixups to
2//! refer to those labels. Handles constant-pool island insertion and also
3//! veneer insertion for out-of-range jumps.
4//!
5//! This code exists to solve three problems:
6//!
7//! - Branch targets for forward branches are not known until later, when we
8//! emit code in a single pass through the instruction structs.
9//!
10//! - On many architectures, address references or offsets have limited range.
11//! For example, on AArch64, conditional branches can only target code +/- 1MB
12//! from the branch itself.
13//!
14//! - The lowering of control flow from the CFG-with-edges produced by
15//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16//! edge blocks when the register allocator does not need to insert any
17//! spills/reloads/moves in edge blocks, results in many suboptimal branch
18//! patterns. The lowering also pays no attention to block order, and so
19//! two-target conditional forms (cond-br followed by uncond-br) can often be
20//! avoided because one of the targets is the fallthrough. There are several
21//! cases here where we can simplify to use fewer branches.
22//!
23//! This "buffer" implements a single-pass code emission strategy (with a later
24//! "fixup" pass, but only through recorded fixups, not all instructions). The
25//! basic idea is:
26//!
27//! - Emit branches as they are, including two-target (cond/uncond) compound
28//! forms, but with zero offsets and optimistically assuming the target will be
29//! in range. Record the "fixup" for later. Targets are denoted instead by
30//! symbolic "labels" that are then bound to certain offsets in the buffer as
31//! we emit code. (Nominally, there is a label at the start of every basic
32//! block.)
33//!
34//! - As we do this, track the offset in the buffer at which the first label
35//! reference "goes out of range". We call this the "deadline". If we reach the
36//! deadline and we still have not bound the label to which an unresolved branch
37//! refers, we have a problem!
38//!
39//! - To solve this problem, we emit "islands" full of "veneers". An island is
40//! simply a chunk of code inserted in the middle of the code actually produced
41//! by the emitter (e.g., vcode iterating over instruction structs). The emitter
42//! has some awareness of this: it either asks for an island between blocks, so
43//! it is not accidentally executed, or else it emits a branch around the island
44//! when all other options fail (see `Inst::EmitIsland` meta-instruction).
45//!
46//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47//! that implements a longer-range reference to a label. The idea is that, for
48//! example, a branch with a limited range can branch to a "veneer" instead,
49//! which is simply a branch in a form that can use a longer-range reference. On
50//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52//! conditional branch's label reference can be fixed up with a "veneer" to
53//! achieve a longer range.
54//!
55//! - To implement all of this, we require the backend to provide a `LabelUse`
56//! type that implements a trait. This is nominally an enum that records one of
57//! several kinds of references to an offset in code -- basically, a relocation
58//! type -- and will usually correspond to different instruction formats. The
59//! `LabelUse` implementation specifies the maximum range, how to patch in the
60//! actual label location when known, and how to generate a veneer to extend the
61//! range.
62//!
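//! For concreteness, the intended single-pass flow looks roughly like the
//! sketch below. This is a minimal sketch only: the backend type `I`, the
//! label-use `kind`, the zero "encoding", `upcoming_code_size`, and
//! `ctrl_plane` are illustrative stand-ins for whatever the real emitter
//! supplies.
//!
//! ```ignore
//! let mut buf = MachBuffer::<I>::new();
//! let target = buf.get_label();
//!
//! // Emit a forward branch optimistically, recording a fixup for `target`.
//! let off = buf.cur_offset();
//! buf.use_label_at_offset(off, target, kind);
//! buf.put4(0); // branch bytes with a zero (placeholder) offset
//!
//! // ... emit more code; if a label-use deadline approaches, emit an island ...
//! if buf.island_needed(upcoming_code_size) {
//!     buf.emit_island(upcoming_code_size, ctrl_plane);
//! }
//!
//! // Eventually bind the label; the recorded fixup is patched once the
//! // label's offset is known (at island emission or when emission finishes).
//! buf.bind_label(target, ctrl_plane);
//! ```
//!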
63//! That satisfies label references, but we still may have suboptimal branch
64//! patterns. To clean up the branches, we do a simple "peephole"-style
65//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66//! informs the buffer of branches in the code and, in the case of conditionals,
67//! the code that would have been emitted to invert this branch's condition. We
68//! track the "latest branches": these are branches that are contiguous up to
69//! the current offset. (If any code is emitted after a branch, that branch or
70//! run of contiguous branches is no longer "latest".) The latest branches are
71//! those that we can edit by simply truncating the buffer and doing something
72//! else instead.
73//!
74//! To optimize branches, we implement several simple rules, and try to apply
75//! them to the "latest branches" when possible:
76//!
77//! - A branch with a label target, when that label is bound to the ending
78//! offset of the branch (the fallthrough location), can be removed altogether,
79//! because the branch would have no effect.
80//!
81//! - An unconditional branch that starts at a label location, and branches to
82//! another label, results in a "label alias": all references to the label bound
83//! *to* this branch instruction are instead resolved to the *target* of the
84//! branch instruction. This effectively removes empty blocks that just
85//! unconditionally branch to the next block. We call this "branch threading"; see the schematic example after this list.
86//!
87//! - A conditional followed by an unconditional, when the conditional branches
88//! to the unconditional's fallthrough, results in (i) the truncation of the
89//! unconditional, (ii) the inversion of the conditional's condition, and (iii)
90//! replacement of the conditional's target (using the original target of the
91//! unconditional). This is a fancy way of saying "we can flip a two-target
92//! conditional branch's taken/not-taken targets if it works better with our
93//! fallthrough". To make this work, the emitter actually gives the buffer
94//! *both* forms of every conditional branch: the true form is emitted into the
95//! buffer, and the "inverted" machine-code bytes are provided as part of the
96//! branch-fixup metadata.
97//!
98//! - An unconditional branch B preceded by another unconditional branch P, when B's
99//! label(s) have been redirected to target(B), can be removed entirely. This is an extension
100//! of the branch-threading optimization, and is valid because if we know there
101//! will be no fallthrough into this branch instruction (the prior instruction
102//! is an unconditional jump), and if we know we have successfully redirected
103//! all labels, then this branch instruction is unreachable. Note that this
104//! works because the redirection happens before the label is ever resolved
105//! (fixups happen at island emission time, at which point latest-branches are
106//! cleared, or at the end of emission), so we are sure to catch and redirect
107//! all possible paths to this instruction.
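//!
//! As a schematic example of the "branch threading" rule above (layout only;
//! not any particular ISA's syntax):
//!
//! ```text
//! before:
//!   label1:
//!     b label2
//!   label2:
//!     ...
//!
//! after: label1 becomes an alias of label2, and the branch itself is
//! removed once it targets its own fallthrough or becomes unreachable:
//!   label2:
//!     ...
//! ```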
108//!
109//! # Branch-optimization Correctness
110//!
111//! The branch-optimization mechanism depends on a few data structures with
112//! invariants, which are always held outside the scope of top-level public
113//! methods:
114//!
115//! - The latest-branches list. Each entry describes a span of the buffer
116//! (start/end offsets), the label target, the corresponding fixup-list entry
117//! index, and the bytes (must be the same length) for the inverted form, if
118//! conditional. The list of labels that are bound to the start-offset of this
119//! branch is *complete* (if any label has a resolved offset equal to `start`
120//! and is not an alias, it must appear in this list) and *precise* (no label
121//! in this list can be bound to another offset). No label in this list should
122//! be an alias. No two branch ranges can overlap, and branches are in
123//! ascending-offset order.
124//!
125//! - The labels-at-tail list. This contains all MachLabels that have been bound
126//! to (whose resolved offsets are equal to) the tail offset of the buffer.
127//! No label in this list should be an alias.
128//!
129//! - The label_offsets array, containing the bound offset of a label or
130//! UNKNOWN. No label can be bound at an offset greater than the current
131//! buffer tail.
132//!
133//! - The label_aliases array, containing another label to which a label is
134//! bound or UNKNOWN. A label's resolved offset is the resolved offset
135//! of the label it is aliased to, if this is set.
136//!
137//! We argue below, at each method, how the invariants in these data structures
138//! are maintained (grep for "Post-invariant").
139//!
140//! Given these invariants, we argue why each optimization preserves execution
141//! semantics below (grep for "Preserves execution semantics").
142//!
143//! # Avoiding Quadratic Behavior
144//!
145//! There are two cases where we've had to take some care to avoid
146//! quadratic worst-case behavior:
147//!
148//! - The "labels at this branch" list can grow unboundedly if the
149//! code generator binds many labels at one location. If the count
150//! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
151//! simply abort an optimization early in a way that is always correct
152//! but is conservative.
153//!
154//! - The fixup list can interact with island emission to create
155//! "quadratic island behvior". In a little more detail, one can hit
156//! this behavior by having some pending fixups (forward label
157//! references) with long-range label-use kinds, and some others
158//! with shorter-range references that nonetheless still are pending
159//! long enough to trigger island generation. In such a case, we
160//! process the fixup list, generate veneers to extend some forward
161//! references' ranges, but leave the other (longer-range) ones
162//! alone. The way this was implemented put them back on a list and
163//! resulted in quadratic behavior.
164//!
165//! To avoid this, fixups are split into two lists: one "pending" list and one
166//! final list. The pending list is kept around for handling fixups related to
167//! branches so it can be edited/truncated. When an island is reached, which
168//! starts processing fixups, all pending fixups are flushed into the final
169//! list. The final list is a `BinaryHeap`, which lets island emission process
170//! only the fixups whose deadlines actually require action, deferring all
171//! longer-range fixups until later.
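//!
//! The final list must yield fixups in deadline order. `BinaryHeap` is a
//! max-heap, so one way to achieve that -- a sketch of the general technique,
//! not necessarily the exact `Ord` implementation used here -- is to reverse
//! the comparison on each fixup's deadline:
//!
//! ```ignore
//! impl<I: VCodeInst> Ord for MachLabelFixup<I> {
//!     fn cmp(&self, other: &Self) -> Ordering {
//!         // Reversed: the smallest deadline (most urgent fixup) rises to
//!         // the top of the heap, so `peek()`/`pop()` see it first.
//!         other.deadline().cmp(&self.deadline())
//!     }
//! }
//! ```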
172
173use crate::binemit::{Addend, CodeOffset, Reloc, StackMap};
174use crate::ir::function::FunctionParameters;
175use crate::ir::{ExternalName, RelSourceLoc, SourceLoc, TrapCode};
176use crate::isa::unwind::UnwindInst;
177use crate::machinst::{
178 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
179};
180use crate::trace;
181use crate::{ir, MachInstEmitState};
182use crate::{timing, VCodeConstantData};
183use cranelift_control::ControlPlane;
184use cranelift_entity::{entity_impl, PrimaryMap};
185use smallvec::SmallVec;
186use std::cmp::Ordering;
187use std::collections::BinaryHeap;
188use std::mem;
189use std::string::String;
190use std::vec::Vec;
191
192#[cfg(feature = "enable-serde")]
193use serde::{Deserialize, Serialize};
194
195#[cfg(feature = "enable-serde")]
196pub trait CompilePhase {
197 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
198 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
199}
200
201#[cfg(not(feature = "enable-serde"))]
202pub trait CompilePhase {
203 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
204 type SourceLocType: core::fmt::Debug + PartialEq + Clone;
205}
206
207/// Status of a compiled artifact that needs patching before being used.
208#[derive(Clone, Debug, PartialEq)]
209#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
210pub struct Stencil;
211
212/// Status of a compiled artifact ready to use.
213#[derive(Clone, Debug, PartialEq)]
214pub struct Final;
215
216impl CompilePhase for Stencil {
217 type MachSrcLocType = MachSrcLoc<Stencil>;
218 type SourceLocType = RelSourceLoc;
219}
220
221impl CompilePhase for Final {
222 type MachSrcLocType = MachSrcLoc<Final>;
223 type SourceLocType = SourceLoc;
224}
225
226#[derive(Clone, Copy, Debug, PartialEq, Eq)]
227enum ForceVeneers {
228 Yes,
229 No,
230}
231
232/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
233/// in bulk.
234///
235/// This struct uses `SmallVec`s to support small-ish function bodies without
236/// any heap allocation. As such, it will be several kilobytes large. This is
237/// likely fine as long as it is stack-allocated for function emission then
238/// thrown away; but beware if many buffer objects are retained persistently.
239pub struct MachBuffer<I: VCodeInst> {
240 /// The buffer contents, as raw bytes.
241 data: SmallVec<[u8; 1024]>,
242 /// Any relocations referring to this code. Note that only *external*
243 /// relocations are tracked here; references to labels within the buffer are
244 /// resolved before emission.
245 relocs: SmallVec<[MachReloc; 16]>,
246 /// Any trap records referring to this code.
247 traps: SmallVec<[MachTrap; 16]>,
248 /// Any call site records referring to this code.
249 call_sites: SmallVec<[MachCallSite; 16]>,
250 /// Any source location mappings referring to this code.
251 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
252 /// Any stack maps referring to this code.
253 stack_maps: SmallVec<[MachStackMap; 8]>,
254 /// Any user stack maps for this code.
255 ///
256 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
257 /// by code offset, and each stack map covers `span` bytes on the stack.
258 user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
259 /// Any unwind info at a given location.
260 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
261 /// The current source location in progress (after `start_srcloc()` and
262 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
263 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
264 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
265 label_offsets: SmallVec<[CodeOffset; 16]>,
266 /// Label aliases: when one label points to an unconditional jump, and that
267 /// jump points to another label, we can redirect references to the first
268 /// label immediately to the second.
269 ///
270 /// Invariant: we don't have label-alias cycles. We ensure this by,
271 /// before setting label A to alias label B, resolving B's alias
272 /// target (iteratively until a non-aliased label); if B is already
273 /// aliased to A, then we cannot alias A back to B.
274 label_aliases: SmallVec<[MachLabel; 16]>,
275 /// Constants that must be emitted at some point.
276 pending_constants: SmallVec<[VCodeConstant; 16]>,
277 /// Byte size of all constants in `pending_constants`.
278 pending_constants_size: CodeOffset,
279 /// Traps that must be emitted at some point.
280 pending_traps: SmallVec<[MachLabelTrap; 16]>,
281 /// Fixups that haven't yet been flushed into `fixup_records` below and may
282 /// be related to branches that are chomped. These all get added to
283 /// `fixup_records` during island emission.
284 pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
285 /// The nearest upcoming deadline for entries in `pending_fixup_records`.
286 pending_fixup_deadline: CodeOffset,
287 /// Fixups that must be performed after all code is emitted.
288 fixup_records: BinaryHeap<MachLabelFixup<I>>,
289 /// Latest branches, to facilitate in-place editing for better fallthrough
290 /// behavior and empty-block removal.
291 latest_branches: SmallVec<[MachBranch; 4]>,
292 /// All labels at the current offset (emission tail). This is lazily
293 /// cleared: it is actually accurate as long as the current offset is
294 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
295 /// be considered as empty.
296 ///
297 /// For correctness, this *must* be complete (i.e., the vector must contain
298 /// all labels whose offsets are resolved to the current tail), because we
299 /// rely on it to update labels when we truncate branches.
300 labels_at_tail: SmallVec<[MachLabel; 4]>,
301 /// The last offset at which `labels_at_tail` is valid. It is conceptually
302 /// always describing the tail of the buffer, but we do not clear
303 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
304 /// when the offset has grown past this (`labels_at_tail_off`) point.
305 /// Always <= `cur_offset()`.
306 labels_at_tail_off: CodeOffset,
307 /// Metadata about all constants that this function has access to.
308 ///
309 /// This records the size/alignment of all constants (not the actual data)
310 /// along with the last available label generated for the constant. This map
311 /// is consulted when constants are referred to and the label assigned to a
312 /// constant may change over time as well.
313 constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
314 /// All recorded usages of constants as pairs of the constant and where the
315 /// constant needs to be placed within `self.data`. Note that the same
316 /// constant may appear in this array multiple times if it was emitted
317 /// multiple times.
318 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
319 /// Indicates when a patchable region is currently open, to guard that it's
320 /// not possible to nest patchable regions.
321 open_patchable: bool,
322}
323
324impl MachBufferFinalized<Stencil> {
325 /// Get a finalized machine buffer by applying the function's base source location.
326 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
327 MachBufferFinalized {
328 data: self.data,
329 relocs: self.relocs,
330 traps: self.traps,
331 call_sites: self.call_sites,
332 srclocs: self
333 .srclocs
334 .into_iter()
335 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
336 .collect(),
337 stack_maps: self.stack_maps,
338 user_stack_maps: self.user_stack_maps,
339 unwind_info: self.unwind_info,
340 alignment: self.alignment,
341 }
342 }
343}
344
345/// A `MachBuffer` once emission is completed: holds generated code and records,
346/// without fixups. This allows the type to be independent of the backend.
347#[derive(PartialEq, Debug, Clone)]
348#[cfg_attr(
349 feature = "enable-serde",
350 derive(serde_derive::Serialize, serde_derive::Deserialize)
351)]
352pub struct MachBufferFinalized<T: CompilePhase> {
353 /// The buffer contents, as raw bytes.
354 pub(crate) data: SmallVec<[u8; 1024]>,
355 /// Any relocations referring to this code. Note that only *external*
356 /// relocations are tracked here; references to labels within the buffer are
357 /// resolved before emission.
358 pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
359 /// Any trap records referring to this code.
360 pub(crate) traps: SmallVec<[MachTrap; 16]>,
361 /// Any call site records referring to this code.
362 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
363 /// Any source location mappings referring to this code.
364 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
365 /// Any stack maps referring to this code.
366 pub(crate) stack_maps: SmallVec<[MachStackMap; 8]>,
367 /// Any user stack maps for this code.
368 ///
369 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
370 /// by code offset, and each stack map covers `span` bytes on the stack.
371 pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
372 /// Any unwind info at a given location.
373 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
374 /// The required alignment of this buffer.
375 pub alignment: u32,
376}
377
378const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
379const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
380
381/// Threshold on max length of `labels_at_this_branch` list to avoid
382/// unbounded quadratic behavior (see comment below at use-site).
383const LABEL_LIST_THRESHOLD: usize = 100;
384
385/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
386/// the point at which it is used by emitted code; the buffer records "fixups"
387/// for references to the label, and will come back and patch the code
388/// appropriately when the label's location is eventually known.
389#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
390pub struct MachLabel(u32);
391entity_impl!(MachLabel);
392
393impl MachLabel {
394 /// Get a label for a block. (The first N MachLabels are always reserved for
395 /// the N blocks in the vcode.)
396 pub fn from_block(bindex: BlockIndex) -> MachLabel {
397 MachLabel(bindex.index() as u32)
398 }
399
400 /// Get the numeric label index.
401 pub fn get(self) -> u32 {
402 self.0
403 }
404
405 /// Creates a string representing this label, for convenience.
406 pub fn to_string(&self) -> String {
407 format!("label{}", self.0)
408 }
409}
410
411impl Default for MachLabel {
412 fn default() -> Self {
413 UNKNOWN_LABEL
414 }
415}
416
417/// A stack map extent, when creating a stack map.
418pub enum StackMapExtent {
419 /// The stack map starts at this instruction, and ends after the number of upcoming bytes
420 /// (note: this is a code offset diff).
421 UpcomingBytes(CodeOffset),
422
423 /// The stack map started at the given offset and ends at the current one. This helps
424 /// architectures where instructions do not have a fixed length.
425 StartedAtOffset(CodeOffset),
426}
427
428/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
429/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
430/// the [`OpenPatchRegion`] token in the process.
431pub struct OpenPatchRegion(usize);
432
433/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
434/// of where you might want to use this is for patching instructions that mention constants that
435/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
436/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
437/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
438/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
439 /// bytes, and the uses of those constants can be updated directly.
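///
/// A sketch of that flow (the zero placeholder and `real_value` are purely
/// illustrative):
///
/// ```ignore
/// let open = buf.start_patchable();
/// buf.put4(0); // placeholder bytes, patched below
/// // ... emit the rest of the patchable code ...
/// let region = buf.end_patchable(open);
///
/// // Later, once the real value is known:
/// region.patch(&mut buf)[..4].copy_from_slice(&real_value.to_le_bytes());
/// ```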
440pub struct PatchRegion {
441 range: std::ops::Range<usize>,
442}
443
444impl PatchRegion {
445 /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
446 pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
447 &mut buffer.data[self.range]
448 }
449}
450
451impl<I: VCodeInst> MachBuffer<I> {
452 /// Create a new, empty `MachBuffer`.
454 pub fn new() -> MachBuffer<I> {
455 MachBuffer {
456 data: SmallVec::new(),
457 relocs: SmallVec::new(),
458 traps: SmallVec::new(),
459 call_sites: SmallVec::new(),
460 srclocs: SmallVec::new(),
461 stack_maps: SmallVec::new(),
462 user_stack_maps: SmallVec::new(),
463 unwind_info: SmallVec::new(),
464 cur_srcloc: None,
465 label_offsets: SmallVec::new(),
466 label_aliases: SmallVec::new(),
467 pending_constants: SmallVec::new(),
468 pending_constants_size: 0,
469 pending_traps: SmallVec::new(),
470 pending_fixup_records: SmallVec::new(),
471 pending_fixup_deadline: u32::MAX,
472 fixup_records: Default::default(),
473 latest_branches: SmallVec::new(),
474 labels_at_tail: SmallVec::new(),
475 labels_at_tail_off: 0,
476 constants: Default::default(),
477 used_constants: Default::default(),
478 open_patchable: false,
479 }
480 }
481
482 /// Current offset from start of buffer.
483 pub fn cur_offset(&self) -> CodeOffset {
484 self.data.len() as CodeOffset
485 }
486
487 /// Add a byte.
488 pub fn put1(&mut self, value: u8) {
489 self.data.push(value);
490
491 // Post-invariant: conceptual-labels_at_tail contains a complete and
492 // precise list of labels bound at `cur_offset()`. We have advanced
493 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
494 // before, it is not anymore (and it cannot become equal, because
495 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
496 // conceptually empty (even though it is only lazily cleared). No labels
497 // can be bound at this new offset (by invariant on `label_offsets`).
498 // Hence the invariant holds.
499 }
500
501 /// Add 2 bytes.
502 pub fn put2(&mut self, value: u16) {
503 let bytes = value.to_le_bytes();
504 self.data.extend_from_slice(&bytes[..]);
505
506 // Post-invariant: as for `put1()`.
507 }
508
509 /// Add 4 bytes.
510 pub fn put4(&mut self, value: u32) {
511 let bytes = value.to_le_bytes();
512 self.data.extend_from_slice(&bytes[..]);
513
514 // Post-invariant: as for `put1()`.
515 }
516
517 /// Add 8 bytes.
518 pub fn put8(&mut self, value: u64) {
519 let bytes = value.to_le_bytes();
520 self.data.extend_from_slice(&bytes[..]);
521
522 // Post-invariant: as for `put1()`.
523 }
524
525 /// Add a slice of bytes.
526 pub fn put_data(&mut self, data: &[u8]) {
527 self.data.extend_from_slice(data);
528
529 // Post-invariant: as for `put1()`.
530 }
531
532 /// Reserve appended space and return a mutable slice referring to it.
533 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
534 let off = self.data.len();
535 let new_len = self.data.len() + len;
536 self.data.resize(new_len, 0);
537 &mut self.data[off..]
538
539 // Post-invariant: as for `put1()`.
540 }
541
542 /// Align up to the given alignment.
543 pub fn align_to(&mut self, align_to: CodeOffset) {
544 trace!("MachBuffer: align to {}", align_to);
545 assert!(
546 align_to.is_power_of_two(),
547 "{} is not a power of two",
548 align_to
549 );
550 while self.cur_offset() & (align_to - 1) != 0 {
551 self.put1(0);
552 }
553
554 // Post-invariant: as for `put1()`.
555 }
556
557 /// Begin a region of patchable code. There is one requirement for the
558 /// code that is emitted: it must not introduce any instructions that
559 /// could be chomped (branches are an example of this). In other words,
560 /// you must not call [`MachBuffer::add_cond_branch`] or
561 /// [`MachBuffer::add_uncond_branch`] between calls to this method and
562 /// [`MachBuffer::end_patchable`].
563 pub fn start_patchable(&mut self) -> OpenPatchRegion {
564 assert!(!self.open_patchable, "Patchable regions may not be nested");
565 self.open_patchable = true;
566 OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
567 }
568
569 /// End a region of patchable code, yielding a [`PatchRegion`] value that
570 /// can be consumed later to produce a one-off mutable slice to the
571 /// associated region of the data buffer.
572 pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
573 // No need to assert the state of `open_patchable` here, as we take
574 // ownership of the only `OpenPatchRegion` value.
575 self.open_patchable = false;
576 let end = usize::try_from(self.cur_offset()).unwrap();
577 PatchRegion { range: open.0..end }
578 }
579
580 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
581 /// offset yet.
582 pub fn get_label(&mut self) -> MachLabel {
583 let l = self.label_offsets.len() as u32;
584 self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
585 self.label_aliases.push(UNKNOWN_LABEL);
586 trace!("MachBuffer: new label -> {:?}", MachLabel(l));
587 MachLabel(l)
588
589 // Post-invariant: the only mutation is to add a new label; it has no
590 // bound offset yet, so it trivially satisfies all invariants.
591 }
592
593 /// Reserve the first N MachLabels for blocks.
594 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
595 trace!("MachBuffer: first {} labels are for blocks", blocks);
596 debug_assert!(self.label_offsets.is_empty());
597 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
598 self.label_aliases.resize(blocks, UNKNOWN_LABEL);
599
600 // Post-invariant: as for `get_label()`.
601 }
602
603 /// Registers metadata in this `MachBuffer` about the `constants` provided.
604 ///
605 /// This will record the size/alignment of all constants which will prepare
606 /// them for emission later on.
607 pub fn register_constants(&mut self, constants: &VCodeConstants) {
608 for (c, val) in constants.iter() {
609 self.register_constant(&c, val);
610 }
611 }
612
613 /// Similar to [`MachBuffer::register_constants`] but registers a
614 /// single constant's metadata. This function is useful in
615 /// situations where not all constants are known at the time of
616 /// emission.
617 pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
618 let c2 = self.constants.push(MachBufferConstant {
619 upcoming_label: None,
620 align: data.alignment(),
621 size: data.as_slice().len(),
622 });
623 assert_eq!(*constant, c2);
624 }
625
626 /// Completes constant emission by iterating over `self.used_constants` and
627 /// filling in the "holes" with the constant values provided by `constants`.
628 ///
629 /// Returns the alignment required for this entire buffer. Alignment starts
630 /// at the ISA's minimum function alignment and can be increased due to
631 /// constant requirements.
632 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
633 let mut alignment = I::function_alignment().minimum;
634 for (constant, offset) in mem::take(&mut self.used_constants) {
635 let constant = constants.get(constant);
636 let data = constant.as_slice();
637 self.data[offset as usize..][..data.len()].copy_from_slice(data);
638 alignment = constant.alignment().max(alignment);
639 }
640 alignment
641 }
642
643 /// Returns a label that can be used to refer to the `constant` provided.
644 ///
645 /// This will automatically defer a new constant to be emitted for
646 /// `constant` if it has not been previously emitted. Note that this
647 /// function may return a different label for the same constant at
648 /// different points in time. The label is valid to use only from the
649 /// current location; the MachBuffer takes care to emit the same constant
650 /// multiple times if needed so the constant is always in range.
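///
/// A sketch of typical use (the label-use `kind` is a backend-specific
/// placeholder):
///
/// ```ignore
/// let label = buf.get_label_for_constant(constant);
/// buf.use_label_at_offset(buf.cur_offset(), label, kind);
/// // ... emit the PC-relative load that refers to the constant ...
/// ```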
651 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
652 let MachBufferConstant {
653 align,
654 size,
655 upcoming_label,
656 } = self.constants[constant];
657 if let Some(label) = upcoming_label {
658 return label;
659 }
660
661 let label = self.get_label();
662 trace!(
663 "defer constant: eventually emit {size} bytes aligned \
664 to {align} at label {label:?}",
665 );
666 self.pending_constants.push(constant);
667 self.pending_constants_size += size as u32;
668 self.constants[constant].upcoming_label = Some(label);
669 label
670 }
671
672 /// Bind a label to the current offset. A label can only be bound once.
673 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
674 trace!(
675 "MachBuffer: bind label {:?} at offset {}",
676 label,
677 self.cur_offset()
678 );
679 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
680 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
681 let offset = self.cur_offset();
682 self.label_offsets[label.0 as usize] = offset;
683 self.lazily_clear_labels_at_tail();
684 self.labels_at_tail.push(label);
685
686 // Invariants hold: bound offset of label is <= cur_offset (in fact it
687 // is equal). If the `labels_at_tail` list was complete and precise
688 // before, it is still, because we have bound this label to the current
689 // offset and added it to the list (which contains all labels at the
690 // current offset).
691
692 self.optimize_branches(ctrl_plane);
693
694 // Post-invariant: by `optimize_branches()` (see argument there).
695 }
696
697 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
698 /// offset that it applies to.
699 fn lazily_clear_labels_at_tail(&mut self) {
700 let offset = self.cur_offset();
701 if offset > self.labels_at_tail_off {
702 self.labels_at_tail_off = offset;
703 self.labels_at_tail.clear();
704 }
705
706 // Post-invariant: either labels_at_tail_off was at cur_offset, and
707 // state is untouched, or was less than cur_offset, in which case the
708 // labels_at_tail list was conceptually empty, and is now actually
709 // empty.
710 }
711
712 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
713 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
714 let mut iters = 0;
715 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
716 label = self.label_aliases[label.0 as usize];
717 // To protect against an infinite loop (despite our assurances to
718 // ourselves that the invariants make this impossible), assert out
719 // after 1M iterations. The number of basic blocks is limited
720 // in most contexts anyway so this should be impossible to hit with
721 // a legitimate input.
722 iters += 1;
723 assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
724 }
725 self.label_offsets[label.0 as usize]
726
727 // Post-invariant: no mutations.
728 }
729
730 /// Emit a reference to the given label with the given reference type (i.e.,
731 /// branch-instruction format) at the current offset. This is like a
732 /// relocation, but handled internally.
733 ///
734 /// This can be called before the branch is actually emitted; fixups will
735 /// not happen until an island is emitted or the buffer is finished.
736 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
737 trace!(
738 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
739 offset,
740 label,
741 kind
742 );
743
744 // Add the fixup, and update the worst-case island size based on a
745 // veneer for this label use.
746 let fixup = MachLabelFixup {
747 label,
748 offset,
749 kind,
750 };
751 self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
752 self.pending_fixup_records.push(fixup);
753
754 // Post-invariant: no mutations to branches/labels data structures.
755 }
756
757 /// Inform the buffer of an unconditional branch at the given offset,
758 /// targeting the given label. May be used to optimize branches.
759 /// The last added label-use must correspond to this branch.
760 /// This must be called when the current offset is equal to `start`; i.e.,
761 /// before actually emitting the branch. This implies that for a branch that
762 /// uses a label and is eligible for optimizations by the MachBuffer, the
763 /// proper sequence is:
764 ///
765 /// - Call `use_label_at_offset()` to emit the fixup record.
766 /// - Call `add_uncond_branch()` to make note of the branch.
767 /// - Emit the bytes for the branch's machine code.
768 ///
769 /// Additional requirement: no labels may be bound between `start` and `end`
770 /// (exclusive on both ends).
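///
/// A sketch of that sequence (the label-use `kind` and the zero "encoding"
/// are backend-specific placeholders):
///
/// ```ignore
/// let start = buf.cur_offset();
/// buf.use_label_at_offset(start, target, kind);
/// buf.add_uncond_branch(start, start + 4, target);
/// buf.put4(0); // the branch's machine code, with a zero offset
/// ```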
771 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
772 debug_assert!(
773 !self.open_patchable,
774 "Branch instruction inserted within a patchable region"
775 );
776 assert!(self.cur_offset() == start);
777 debug_assert!(end > start);
778 assert!(!self.pending_fixup_records.is_empty());
779 let fixup = self.pending_fixup_records.len() - 1;
780 self.lazily_clear_labels_at_tail();
781 self.latest_branches.push(MachBranch {
782 start,
783 end,
784 target,
785 fixup,
786 inverted: None,
787 labels_at_this_branch: self.labels_at_tail.clone(),
788 });
789
790 // Post-invariant: we asserted branch start is current tail; the list of
791 // labels at branch is cloned from list of labels at current tail.
792 }
793
794 /// Inform the buffer of a conditional branch at the given offset,
795 /// targeting the given label. May be used to optimize branches.
796 /// The last added label-use must correspond to this branch.
797 ///
798 /// Additional requirement: no labels may be bound between `start` and `end`
799 /// (exclusive on both ends).
800 pub fn add_cond_branch(
801 &mut self,
802 start: CodeOffset,
803 end: CodeOffset,
804 target: MachLabel,
805 inverted: &[u8],
806 ) {
807 debug_assert!(
808 !self.open_patchable,
809 "Branch instruction inserted within a patchable region"
810 );
811 assert!(self.cur_offset() == start);
812 debug_assert!(end > start);
813 assert!(!self.pending_fixup_records.is_empty());
814 debug_assert!(inverted.len() == (end - start) as usize);
815 let fixup = self.pending_fixup_records.len() - 1;
816 let inverted = Some(SmallVec::from(inverted));
817 self.lazily_clear_labels_at_tail();
818 self.latest_branches.push(MachBranch {
819 start,
820 end,
821 target,
822 fixup,
823 inverted,
824 labels_at_this_branch: self.labels_at_tail.clone(),
825 });
826
827 // Post-invariant: we asserted branch start is current tail; labels at
828 // branch list is cloned from list of labels at current tail.
829 }
830
831 fn truncate_last_branch(&mut self) {
832 debug_assert!(
833 !self.open_patchable,
834 "Branch instruction truncated within a patchable region"
835 );
836
837 self.lazily_clear_labels_at_tail();
838 // Invariants hold at this point.
839
840 let b = self.latest_branches.pop().unwrap();
841 assert!(b.end == self.cur_offset());
842
843 // State:
844 // [PRE CODE]
845 // Offset b.start, b.labels_at_this_branch:
846 // [BRANCH CODE]
847 // cur_off, self.labels_at_tail -->
848 // (end of buffer)
849 self.data.truncate(b.start as usize);
850 self.pending_fixup_records.truncate(b.fixup);
851 while let Some(last_srcloc) = self.srclocs.last_mut() {
852 if last_srcloc.end <= b.start {
853 break;
854 }
855 if last_srcloc.start < b.start {
856 last_srcloc.end = b.start;
857 break;
858 }
859 self.srclocs.pop();
860 }
861 // State:
862 // [PRE CODE]
863 // cur_off, Offset b.start, b.labels_at_this_branch:
864 // (end of buffer)
865 //
866 // self.labels_at_tail --> (past end of buffer)
867 let cur_off = self.cur_offset();
868 self.labels_at_tail_off = cur_off;
869 // State:
870 // [PRE CODE]
871 // cur_off, Offset b.start, b.labels_at_this_branch,
872 // self.labels_at_tail:
873 // (end of buffer)
874 //
875 // resolve_label_offset(l) for l in labels_at_tail:
876 // (past end of buffer)
877
878 trace!(
879 "truncate_last_branch: truncated {:?}; off now {}",
880 b,
881 cur_off
882 );
883
884 // Fix up resolved label offsets for labels at tail.
885 for &l in &self.labels_at_tail {
886 self.label_offsets[l.0 as usize] = cur_off;
887 }
888 // Old labels_at_this_branch are now at cur_off.
889 self.labels_at_tail
890 .extend(b.labels_at_this_branch.into_iter());
891
892 // Post-invariant: this operation is defined to truncate the buffer,
893 // which moves cur_off backward, and to move labels at the end of the
894 // buffer back to the start-of-branch offset.
895 //
896 // latest_branches satisfies all invariants:
897 // - it has no branches past the end of the buffer (branches are in
898 // order, we removed the last one, and we truncated the buffer to just
899 // before the start of that branch)
900 // - no labels were moved to lower offsets than the (new) cur_off, so
901 // the labels_at_this_branch list for any other branch need not change.
902 //
903 // labels_at_tail satisfies all invariants:
904 // - all labels that were at the tail after the truncated branch are
905 // moved backward to just before the branch, which becomes the new tail;
906 // thus every element in the list should remain (ensured by `.extend()`
907 // above).
908 // - all labels that refer to the new tail, which is the start-offset of
909 // the truncated branch, must be present. The `labels_at_this_branch`
910 // list in the truncated branch's record is a complete and precise list
911 // of exactly these labels; we append these to labels_at_tail.
912 // - labels_at_tail_off is at cur_off after truncation occurs, so the
913 // list is valid (not to be lazily cleared).
914 //
915 // The stated operation was performed:
916 // - For each label at the end of the buffer prior to this method, it
917 // now resolves to the new (truncated) end of the buffer: it must have
918 // been in `labels_at_tail` (this list is precise and complete, and
919 // the tail was at the end of the truncated branch on entry), and we
920 // iterate over this list and set `label_offsets` to the new tail.
921 // None of these labels could have been an alias (by invariant), so
922 // `label_offsets` is authoritative for each.
923 // - No other labels will be past the end of the buffer, because of the
924 // requirement that no labels be bound to the middle of branch ranges
925 // (see comments to `add_{cond,uncond}_branch()`).
926 // - The buffer is truncated to just before the last branch, and the
927 // fixup record referring to that last branch is removed.
928 }
929
930 /// Performs various optimizations on branches pointing at the current label.
931 pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
// Chaos mode: the control plane may arbitrarily request that we skip
// branch optimization entirely; the optimization is always optional for
// correctness.
932 if ctrl_plane.get_decision() {
933 return;
934 }
935
936 self.lazily_clear_labels_at_tail();
937 // Invariants valid at this point.
938
939 trace!(
940 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
941 self.latest_branches,
942 self.labels_at_tail,
943 self.pending_fixup_records
944 );
945
946 // We continue to munch on branches at the tail of the buffer until no
947 // more rules apply. Note that the loop only continues if a branch is
948 // actually truncated (or if labels are redirected away from a branch),
949 // so this always makes progress.
950 while let Some(b) = self.latest_branches.last() {
951 let cur_off = self.cur_offset();
952 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
953 // If there has been any code emission since the end of the last branch or
954 // label definition, then there's nothing we can edit (because we
955 // don't move code once placed, only back up and overwrite), so
956 // clear the records and finish.
957 if b.end < cur_off {
958 break;
959 }
960
961 // If the "labels at this branch" list on this branch is
962 // longer than a threshold, don't do any simplification,
963 // and let the branch remain to separate those labels from
964 // the current tail. This avoids quadratic behavior (see
965 // #3468): otherwise, if a long string of "goto next;
966 // next:" patterns are emitted, all of the labels will
967 // coalesce into a long list of aliases for the current
968 // buffer tail. We must track all aliases of the current
969 // tail for correctness, but we are also allowed to skip
970 // optimization (removal) of any branch, so we take the
971 // escape hatch here and let it stand. In effect this
972 // "spreads" the many thousands of labels in the
973 // pathological case among an actual (harmless but
974 // suboptimal) instruction once per N labels.
975 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
976 break;
977 }
978
979 // Invariant: we are looking at a branch that ends at the tail of
980 // the buffer.
981
982 // For any branch, conditional or unconditional:
983 // - If the target is a label at the current offset, then remove
984 // the conditional branch, and reset all labels that targeted
985 // the current offset (end of branch) to the truncated
986 // end-of-code.
987 //
988 // Preserves execution semantics: a branch to its own fallthrough
989 // address is equivalent to a no-op; in both cases, nextPC is the
990 // fallthrough.
991 if self.resolve_label_offset(b.target) == cur_off {
992 trace!("branch with target == cur off; truncating");
993 self.truncate_last_branch();
994 continue;
995 }
996
997 // If latest is an unconditional branch:
998 //
999 // - If the branch's target is not its own start address, then for
1000 // each label at the start of branch, make the label an alias of the
1001 // branch target, and remove the label from the "labels at this
1002 // branch" list.
1003 //
1004 // - Preserves execution semantics: an unconditional branch's
1005 // only effect is to set PC to a new PC; this change simply
1006 // collapses one step in the step-semantics.
1007 //
1008 // - Post-invariant: the labels that were bound to the start of
1009 // this branch become aliases, so they must not be present in any
1010 // labels-at-this-branch list or the labels-at-tail list. The
1011 // labels are removed from the latest-branch record's
1012 // labels-at-this-branch list, and are never placed in the
1013 // labels-at-tail list. Furthermore, it is correct that they are
1014 // not in either list, because they are now aliases, and labels
1015 // that are aliases remain aliases forever.
1016 //
1017 // - If there is a prior unconditional branch that ends just before
1018 // this one begins, and this branch has no labels bound to its
1019 // start, then we can truncate this branch, because it is entirely
1020 // unreachable (we have redirected all labels that make it
1021 // reachable otherwise). Do so and continue around the loop.
1022 //
1023 // - Preserves execution semantics: the branch is unreachable,
1024 // because execution can only flow into an instruction from the
1025 // prior instruction's fallthrough or from a branch bound to that
1026 // instruction's start offset. Unconditional branches have no
1027 // fallthrough, so if the prior instruction is an unconditional
1028 // branch, no fallthrough entry can happen. The
1029 // labels-at-this-branch list is complete (by invariant), so if it
1030 // is empty, then the instruction is entirely unreachable. Thus,
1031 // it can be removed.
1032 //
1033 // - Post-invariant: ensured by truncate_last_branch().
1034 //
1035 // - If there is a prior conditional branch whose target label
1036 // resolves to the current offset (branches around the
1037 // unconditional branch), then remove the unconditional branch,
1038 // and make the target of the unconditional the target of the
1039 // conditional instead.
1040 //
1041 // - Preserves execution semantics: previously we had:
1042 //
1043 // L1:
1044 // cond_br L2
1045 // br L3
1046 // L2:
1047 // (end of buffer)
1048 //
1049 // by removing the last branch, we have:
1050 //
1051 // L1:
1052 // cond_br L2
1053 // L2:
1054 // (end of buffer)
1055 //
1056 // we then fix up the records for the conditional branch to
1057 // have:
1058 //
1059 // L1:
1060 // cond_br.inverted L3
1061 // L2:
1062 //
1063 // In the original code, control flow reaches L2 when the
1064 // conditional branch's predicate is true, and L3 otherwise. In
1065 // the optimized code, the same is true.
1066 //
1067 // - Post-invariant: all edits to latest_branches and
1068 // labels_at_tail are performed by `truncate_last_branch()`,
1069 // which maintains the invariants at each step.
1070
1071 if b.is_uncond() {
1072 // Set any label equal to current branch's start as an alias of
1073 // the branch's target, if the target is not the branch itself
1074 // (i.e., an infinite loop).
1075 //
1076 // We cannot perform this aliasing if the target of this branch
1077 // ultimately aliases back here; if so, we need to keep this
1078 // branch, so break out of this loop entirely (and clear the
1079 // latest-branches list below).
1080 //
1081 // Note that this check is what prevents cycles from forming in
1082 // `self.label_aliases`. To see why, consider an arbitrary start
1083 // state:
1084 //
1085 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1086 // Ln, which is not aliased.
1087 //
1088 // We would create a cycle if we assigned label_aliases[Ln]
1089 // = L1. Note that the below assignment is the only write
1090 // to label_aliases.
1091 //
1092 // By our other invariants, we have that Ln (`l` below)
1093 // resolves to the offset `b.start`, because it is in the
1094 // set `b.labels_at_this_branch`.
1095 //
1096 // If L1 were already aliased, through some arbitrarily deep
1097 // chain, to Ln, then it must also resolve to this offset
1098 // `b.start`.
1099 //
1100 // By checking the resolution of `L1` against this offset,
1101 // and aborting this branch-simplification if they are
1102 // equal, we prevent the below assignment from ever creating
1103 // a cycle.
1104 if self.resolve_label_offset(b.target) != b.start {
1105 let redirected = b.labels_at_this_branch.len();
1106 for &l in &b.labels_at_this_branch {
1107 trace!(
1108 " -> label at start of branch {:?} redirected to target {:?}",
1109 l,
1110 b.target
1111 );
1112 self.label_aliases[l.0 as usize] = b.target;
1113 // NOTE: we continue to ensure the invariant that labels
1114 // pointing to tail of buffer are in `labels_at_tail`
1115 // because we already ensured above that the last branch
1116 // cannot have a target of `cur_off`; so we never have
1117 // to put the label into `labels_at_tail` when moving it
1118 // here.
1119 }
1120 // Maintain invariant: all branches have been redirected
1121 // and are no longer pointing at the start of this branch.
1122 let mut_b = self.latest_branches.last_mut().unwrap();
1123 mut_b.labels_at_this_branch.clear();
1124
1125 if redirected > 0 {
1126 trace!(" -> after label redirects, restarting loop");
1127 continue;
1128 }
1129 } else {
1130 break;
1131 }
1132
1133 let b = self.latest_branches.last().unwrap();
1134
1135 // Examine any immediately preceding branch.
1136 if self.latest_branches.len() > 1 {
1137 let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1138 trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1139 // This uncond is immediately after another uncond; we
1140 // should have already redirected labels to this uncond away
1141 // (but check to be sure); so we can truncate this uncond.
1142 if prev_b.is_uncond()
1143 && prev_b.end == b.start
1144 && b.labels_at_this_branch.is_empty()
1145 {
1146 trace!(" -> uncond follows another uncond; truncating");
1147 self.truncate_last_branch();
1148 continue;
1149 }
1150
1151 // This uncond is immediately after a conditional, and the
1152 // conditional's target is the end of this uncond, and we've
1153 // already redirected labels to this uncond away; so we can
1154 // truncate this uncond, flip the sense of the conditional, and
1155 // set the conditional's target (in `latest_branches` and in
1156 // `fixup_records`) to the uncond's target.
1157 if prev_b.is_cond()
1158 && prev_b.end == b.start
1159 && self.resolve_label_offset(prev_b.target) == cur_off
1160 {
1161 trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
1162 // Save the target of the uncond (this becomes the
1163 // target of the cond), and truncate the uncond.
1164 let target = b.target;
1165 let data = prev_b.inverted.clone().unwrap();
1166 self.truncate_last_branch();
1167
1168 // Mutate the code and cond branch.
1169 let off_before_edit = self.cur_offset();
1170 let prev_b = self.latest_branches.last_mut().unwrap();
1171 let not_inverted = SmallVec::from(
1172 &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1173 );
1174
1175 // Low-level edit: replaces bytes of branch with
1176 // inverted form. cur_off remains the same afterward, so
1177 // we do not need to modify label data structures.
1178 self.data.truncate(prev_b.start as usize);
1179 self.data.extend_from_slice(&data[..]);
1180
1181 // Save the original code as the inversion of the
1182 // inverted branch, in case we later edit this branch
1183 // again.
1184 prev_b.inverted = Some(not_inverted);
1185 self.pending_fixup_records[prev_b.fixup].label = target;
1186 trace!(" -> reassigning target of condbr to {:?}", target);
1187 prev_b.target = target;
1188 debug_assert_eq!(off_before_edit, self.cur_offset());
1189 continue;
1190 }
1191 }
1192 }
1193
1194 // If we couldn't do anything with the last branch, then break.
1195 break;
1196 }
1197
1198 self.purge_latest_branches();
1199
1200 trace!(
1201 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1202 self.latest_branches,
1203 self.labels_at_tail,
1204 self.pending_fixup_records
1205 );
1206 }
1207
1208 fn purge_latest_branches(&mut self) {
1209 // All of our branch simplification rules work only if a branch ends at
1210 // the tail of the buffer, with no following code; and branches are in
1211 // order in latest_branches; so if the last entry ends prior to
1212 // cur_offset, then clear all entries.
1213 let cur_off = self.cur_offset();
1214 if let Some(l) = self.latest_branches.last() {
1215 if l.end < cur_off {
1216 trace!("purge_latest_branches: removing branch {:?}", l);
1217 self.latest_branches.clear();
1218 }
1219 }
1220
1221 // Post-invariant: no invariant requires any branch to appear in
1222 // `latest_branches`; it is always optional. The list-clear above thus
1223 // preserves all semantics.
1224 }
1225
1226 /// Emit a trap at some point in the future with the specified code.
1228 ///
1229 /// This function returns a [`MachLabel`] which will be the future address
1230 /// of the trap. Jumps should refer to this label, likely by using the
1231 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1232 /// patched in once the address of the trap is known.
1233 ///
1234 /// This will batch all traps into the end of the function.
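///
/// A sketch of typical use (the label-use `kind` is a backend-specific
/// placeholder):
///
/// ```ignore
/// let trap_label = buf.defer_trap(code);
/// // Branch to `trap_label` when the guard condition fails:
/// buf.use_label_at_offset(buf.cur_offset(), trap_label, kind);
/// // ... emit the conditional-branch bytes ...
/// ```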
1235 pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1236 let label = self.get_label();
1237 self.pending_traps.push(MachLabelTrap {
1238 label,
1239 code,
1240 loc: self.cur_srcloc.map(|(_start, loc)| loc),
1241 });
1242 label
1243 }
1244
1245 /// Is an island needed within the next N bytes?
1246 pub fn island_needed(&self, distance: CodeOffset) -> bool {
1247 let deadline = match self.fixup_records.peek() {
1248 Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1249 None => self.pending_fixup_deadline,
1250 };
1251 deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1252 }
1253
1254 /// Returns the maximal offset that islands can reach if `distance` more
1255 /// bytes are appended.
1256 ///
1257 /// This is used to determine if veneers need insertions since jumps that
1258 /// can't reach past this point must get a veneer of some form.
1259 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1260 // Assume that all fixups will require veneers and that the veneers are
1261 // the worst-case size for each platform. This is an over-approximation
1262 // to avoid iterating over the `fixup_records` list or maintaining
1263 // information about it as we go along.
1264 let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1265 as u32)
1266 * (I::LabelUse::worst_case_veneer_size())
1267 + self.pending_constants_size
1268 + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1269 self.cur_offset()
1270 .saturating_add(distance)
1271 .saturating_add(island_worst_case_size)
1272 }
1273
1274 /// Emit all pending constants and required pending veneers.
1275 ///
1276 /// Should only be called if `island_needed()` returns true, i.e., if we
1277 /// actually reach a deadline. It's not necessarily a problem to do so
1278 /// otherwise but it may result in unnecessary work during emission.
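///
/// A typical call site in an emitter loop looks roughly like the following
/// (`upcoming_code_size` is an illustrative estimate of how much code will
/// be emitted before the next opportunity to place an island):
///
/// ```ignore
/// if buf.island_needed(upcoming_code_size) {
///     buf.emit_island(upcoming_code_size, ctrl_plane);
/// }
/// ```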
1279 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1280 self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1281 }
1282
1283 /// Same as `emit_island`, but an internal API with a `force_veneers`
1284 /// argument to force all veneers to always get emitted for debugging.
1285 fn emit_island_maybe_forced(
1286 &mut self,
1287 force_veneers: ForceVeneers,
1288 distance: CodeOffset,
1289 ctrl_plane: &mut ControlPlane,
1290 ) {
1291 // We're going to purge fixups, so no latest-branch editing can happen
1292 // anymore.
1293 self.latest_branches.clear();
1294
1295 // End the current location tracking since anything emitted during this
1296 // function shouldn't be attributed to whatever the current source
1297 // location is.
1298 //
1299 // Note that the current source location, if it's set right now, will be
1300 // restored at the end of this island emission.
1301 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1302 if cur_loc.is_some() {
1303 self.end_srcloc();
1304 }
1305
1306 let forced_threshold = self.worst_case_end_of_island(distance);
1307
1308 // First flush out all traps/constants so we have more labels in case
1309 // fixups are applied against these labels.
1310 //
1311 // Note that traps are placed first since this typically happens at the
1312 // end of the function and for disassemblers we try to keep all the code
1313 // contiguously together.
1314 for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1315 // If this trap has source information associated with it then
1316 // emit this information for the trap instruction going out now too.
1317 if let Some(loc) = loc {
1318 self.start_srcloc(loc);
1319 }
1320 self.align_to(I::LabelUse::ALIGN);
1321 self.bind_label(label, ctrl_plane);
1322 self.add_trap(code);
1323 self.put_data(I::TRAP_OPCODE);
1324 if loc.is_some() {
1325 self.end_srcloc();
1326 }
1327 }
1328
1329 for constant in mem::take(&mut self.pending_constants) {
1330 let MachBufferConstant { align, size, .. } = self.constants[constant];
1331 let label = self.constants[constant].upcoming_label.take().unwrap();
1332 self.align_to(align);
1333 self.bind_label(label, ctrl_plane);
1334 self.used_constants.push((constant, self.cur_offset()));
1335 self.get_appended_space(size);
1336 }
1337
1338 // Either handle all pending fixups because they're ready or move them
1339 // onto the `BinaryHeap` tracking all pending fixups if they aren't
1340 // ready.
1341 assert!(self.latest_branches.is_empty());
1342 for fixup in mem::take(&mut self.pending_fixup_records) {
1343 if self.should_apply_fixup(&fixup, forced_threshold) {
1344 self.handle_fixup(fixup, force_veneers, forced_threshold);
1345 } else {
1346 self.fixup_records.push(fixup);
1347 }
1348 }
1349 self.pending_fixup_deadline = u32::MAX;
1350 while let Some(fixup) = self.fixup_records.peek() {
1351 trace!("emit_island: fixup {:?}", fixup);
1352
1353 // If this fixup shouldn't be applied, that means its label isn't
1354 // defined yet and there will still be room to apply a veneer if
1355 // necessary after this island. In that situation, because
1356 // `fixup_records` is sorted by deadline, this loop can exit
1357 // early.
1358 if !self.should_apply_fixup(fixup, forced_threshold) {
1359 break;
1360 }
1361
1362 let fixup = self.fixup_records.pop().unwrap();
1363 self.handle_fixup(fixup, force_veneers, forced_threshold);
1364 }
1365
1366 if let Some(loc) = cur_loc {
1367 self.start_srcloc(loc);
1368 }
1369 }
1370
1371 fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
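// Apply this fixup now if either its label is already bound (so we can
// patch it, possibly via a veneer, immediately) or its deadline falls
// before the worst-case end of the island being emitted (so it must be
// handled now, while a veneer can still be reached).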
1372 let label_offset = self.resolve_label_offset(fixup.label);
1373 label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1374 }
1375
1376 fn handle_fixup(
1377 &mut self,
1378 fixup: MachLabelFixup<I>,
1379 force_veneers: ForceVeneers,
1380 forced_threshold: CodeOffset,
1381 ) {
1382 let MachLabelFixup {
1383 label,
1384 offset,
1385 kind,
1386 } = fixup;
1387 let start = offset as usize;
1388 let end = (offset + kind.patch_size()) as usize;
1389 let label_offset = self.resolve_label_offset(label);
1390
1391 if label_offset != UNKNOWN_LABEL_OFFSET {
1392 // If the offset of the label for this fixup is known then
1393 // we're going to do something here-and-now. We're either going
1394 // to patch the original offset because it's an in-bounds jump,
1395 // or we're going to generate a veneer, patch the fixup to jump
1396 // to the veneer, and then keep going.
1397 //
1398 // If the label comes after the original fixup, then we should
1399 // be guaranteed that the jump is in-bounds. Otherwise there's
1400 // a bug somewhere because this method wasn't called soon
1401 // enough. All forward-jumps are tracked and should get veneers
1402 // before their deadline comes and they're unable to jump
1403 // further.
1404 //
1405 // Otherwise, if the label is before the fixup, then that's a
1406 // backwards jump. If it's past the maximum negative range
1407 // then we'll emit a veneer that we can jump forward to, which
1408 // can then jump backwards to the actual target.
1409 let veneer_required = if label_offset >= offset {
1410 assert!((label_offset - offset) <= kind.max_pos_range());
1411 false
1412 } else {
1413 (offset - label_offset) > kind.max_neg_range()
1414 };
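// For example (hypothetical numbers): with a kind whose range is
// +/- 1MB, a fixup at offset 0x1000 targeting a label bound at
// 0x8_0000 (forward, in range) needs no veneer, while a fixup at
// offset 0x20_0000 targeting a label bound at offset 0 (backward,
// out of range) does.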
1415 trace!(
1416 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1417 label_offset,
1418 veneer_required,
1419 kind.max_pos_range(),
1420 kind.max_neg_range()
1421 );
1422
1423 if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1424 self.emit_veneer(label, offset, kind);
1425 } else {
1426 let slice = &mut self.data[start..end];
1427 trace!("patching in-range!");
1428 kind.patch(slice, offset, label_offset);
1429 }
1430 } else {
1431 // If the offset of this label is not known at this time, then a
1432 // veneer is required because after this island the original
1433 // fixup will no longer be able to reach its target.
1434 assert!(forced_threshold - offset > kind.max_pos_range());
1435 self.emit_veneer(label, offset, kind);
1436 }
1437 }
1438
1439 /// Emits a "veneer" for the `kind` of label-use at `offset` so that it can reach `label`.
1440 ///
1441 /// This will generate extra machine code, using `kind`, to get a
1442 /// longer-range jump than `kind` itself allows. The code at `offset` is then
1443 /// patched to jump to our new code, and then the new code is enqueued for
1444 /// a fixup to get processed at some later time.
1445 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1446 // If this `kind` doesn't support a veneer then that's a bug in the
1447 // backend because we need to implement support for such a veneer.
1448 assert!(
1449 kind.supports_veneer(),
1450 "jump beyond the range of {:?} but a veneer isn't supported",
1451 kind,
1452 );
1453
1454 // Allocate space for a veneer in the island.
1455 self.align_to(I::LabelUse::ALIGN);
1456 let veneer_offset = self.cur_offset();
1457 trace!("making a veneer at {}", veneer_offset);
1458 let start = offset as usize;
1459 let end = (offset + kind.patch_size()) as usize;
1460 let slice = &mut self.data[start..end];
1461 // Patch the original label use to refer to the veneer.
1462 trace!(
1463 "patching original at offset {} to veneer offset {}",
1464 offset,
1465 veneer_offset
1466 );
1467 kind.patch(slice, offset, veneer_offset);
1468 // Generate the veneer.
1469 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1470 let (veneer_fixup_off, veneer_label_use) =
1471 kind.generate_veneer(veneer_slice, veneer_offset);
1472 trace!(
1473 "generated veneer; fixup offset {}, label_use {:?}",
1474 veneer_fixup_off,
1475 veneer_label_use
1476 );
1477 // Register a new use of `label` with our new veneer fixup and
1478 // offset. This'll recalculate deadlines accordingly and
1479 // enqueue this fixup to get processed at some later
1480 // time.
1481 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1482 }
1483
1484 fn finish_emission_maybe_forcing_veneers(
1485 &mut self,
1486 force_veneers: ForceVeneers,
1487 ctrl_plane: &mut ControlPlane,
1488 ) {
1489 while !self.pending_constants.is_empty()
1490 || !self.pending_traps.is_empty()
1491 || !self.fixup_records.is_empty()
1492 || !self.pending_fixup_records.is_empty()
1493 {
1494 // `emit_island()` will emit any pending veneers and constants, and
1495 // as a side-effect, will also take care of any fixups with resolved
1496 // labels eagerly.
1497 self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1498 }
1499
1500 // Ensure that all labels have been fixed up after the last island is emitted. This is a
1501 // full (release-mode) assert because an unresolved label means the emitted code is
1502 // incorrect.
1503 assert!(self.fixup_records.is_empty());
1504 assert!(self.pending_fixup_records.is_empty());
1505 }
1506
1507 /// Finish any deferred emissions and/or fixups.
1508 pub fn finish(
1509 mut self,
1510 constants: &VCodeConstants,
1511 ctrl_plane: &mut ControlPlane,
1512 ) -> MachBufferFinalized<Stencil> {
1513 let _tt = timing::vcode_emit_finish();
1514
1515 self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1516
1517 let alignment = self.finish_constants(constants);
1518
1519 // Resolve all labels to their offsets.
1520 let finalized_relocs = self
1521 .relocs
1522 .iter()
1523 .map(|reloc| FinalizedMachReloc {
1524 offset: reloc.offset,
1525 kind: reloc.kind,
1526 addend: reloc.addend,
1527 target: match &reloc.target {
1528 RelocTarget::ExternalName(name) => {
1529 FinalizedRelocTarget::ExternalName(name.clone())
1530 }
1531 RelocTarget::Label(label) => {
1532 FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1533 }
1534 },
1535 })
1536 .collect();
1537
1538 let mut srclocs = self.srclocs;
1539 srclocs.sort_by_key(|entry| entry.start);
1540
1541 MachBufferFinalized {
1542 data: self.data,
1543 relocs: finalized_relocs,
1544 traps: self.traps,
1545 call_sites: self.call_sites,
1546 srclocs,
1547 stack_maps: self.stack_maps,
1548 user_stack_maps: self.user_stack_maps,
1549 unwind_info: self.unwind_info,
1550 alignment,
1551 }
1552 }
1553
1554 /// Add an external relocation at the given offset from the current offset.
1555 pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1556 &mut self,
1557 offset: CodeOffset,
1558 kind: Reloc,
1559 target: &T,
1560 addend: Addend,
1561 ) {
1562 let target: RelocTarget = target.clone().into();
1563 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1564 // generate a label-use statement to track whether an island is possibly
1565 // needed to escape this function to actually get to the external name.
1566 // This is most likely to come up on AArch64 where calls between
1567 // functions use a 26-bit signed offset which gives +/- 64MB. This means
1568 // that if a function is 128MB in size and there's a call in the middle
1569 // it's impossible to reach the actual target. Also, while it's
1570 // technically possible to jump to the start of a function and then jump
1571 // further, island insertion below always inserts islands after
1572 // previously appended code, so for Cranelift's own implementation this
1573 // is also a problem for 64MB functions on AArch64 that start with a
1574 // call instruction: those won't be able to escape.
1575 //
1576 // Ideally what needs to happen here is that a `LabelUse` is
1577 // transparently generated (or call-sites of this function are audited
1578 // to generate a `LabelUse` instead) and tracked internally. The actual
1579 // relocation would then change over time if and when a veneer is
1580 // inserted, where the relocation here would be patched by this
1581 // `MachBuffer` to jump to the veneer. The problem, though, is that all
1582 // this still needs to end up, in the case of a singular function,
1583 // generating a final relocation pointing either to this particular
1584 // relocation or to the veneer inserted. Additionally
1585 // `MachBuffer` needs the concept of a label which will never be
1586 // resolved, so `emit_island` doesn't trip over not actually ever
1587 // knowing what some labels are. Currently the loop in
1588 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1589 // loop.
1590 //
1591 // For now this means that, because relocs aren't tracked at all,
1592 // AArch64 functions have a rough size limit of 64MB. For now that's
1593 // somewhat reasonable, and the failure mode is a panic in `MachBuffer`
1594 // when a relocation can't otherwise be resolved later, so it shouldn't
1595 // actually result in any memory unsafety or anything like that.
1596 self.relocs.push(MachReloc {
1597 offset: self.data.len() as CodeOffset + offset,
1598 kind,
1599 target,
1600 addend,
1601 });
1602 }
1603
1604 /// Add an external relocation at the current offset.
1605 pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1606 &mut self,
1607 kind: Reloc,
1608 target: &T,
1609 addend: Addend,
1610 ) {
1611 self.add_reloc_at_offset(0, kind, target, addend);
1612 }
1613
1614 /// Add a trap record at the current offset.
1615 pub fn add_trap(&mut self, code: TrapCode) {
1616 self.traps.push(MachTrap {
1617 offset: self.data.len() as CodeOffset,
1618 code,
1619 });
1620 }
1621
1622 /// Add a call-site record at the current offset.
1623 pub fn add_call_site(&mut self) {
1624 self.call_sites.push(MachCallSite {
1625 ret_addr: self.data.len() as CodeOffset,
1626 });
1627 }
1628
1629 /// Add an unwind record at the current offset.
1630 pub fn add_unwind(&mut self, unwind: UnwindInst) {
1631 self.unwind_info.push((self.cur_offset(), unwind));
1632 }
1633
1634 /// Set the `SourceLoc` for code from this offset until the offset at the
1635 /// next call to `end_srcloc()`.
1636 /// Returns the current [CodeOffset] and [RelSourceLoc].
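///
/// Calls are expected to come in matched pairs; a minimal usage sketch
/// (the `loc` value here is illustrative):
///
/// ```ignore
/// buf.start_srcloc(loc);
/// // ... emit instructions attributed to `loc` ...
/// buf.end_srcloc();
/// ```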
1637 pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1638 let cur = (self.cur_offset(), loc);
1639 self.cur_srcloc = Some(cur);
1640 cur
1641 }
1642
1643 /// Mark the end of the `SourceLoc` segment started at the last
1644 /// `start_srcloc()` call.
1645 pub fn end_srcloc(&mut self) {
1646 let (start, loc) = self
1647 .cur_srcloc
1648 .take()
1649 .expect("end_srcloc() called without start_srcloc()");
1650 let end = self.cur_offset();
1651 // Skip zero-length srcloc ranges.
1652 debug_assert!(end >= start);
1653 if end > start {
1654 self.srclocs.push(MachSrcLoc { start, end, loc });
1655 }
1656 }
1657
1658 /// Add stack map metadata for this program point: a set of stack offsets
1659 /// (from SP upward) that contain live references.
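///
/// For instance, an emitter that knows the upcoming safepoint instruction is
/// `insn_len` bytes long might record its map as in this sketch (`insn_len`
/// and `stack_map` are illustrative values):
///
/// ```ignore
/// buf.add_stack_map(StackMapExtent::UpcomingBytes(insn_len), stack_map);
/// ```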
1660 pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) {
1661 let (start, end) = match extent {
1662 StackMapExtent::UpcomingBytes(insn_len) => {
1663 let start_offset = self.cur_offset();
1664 (start_offset, start_offset + insn_len)
1665 }
1666 StackMapExtent::StartedAtOffset(start_offset) => {
1667 let end_offset = self.cur_offset();
1668 debug_assert!(end_offset >= start_offset);
1669 (start_offset, end_offset)
1670 }
1671 };
1672 trace!("Adding stack map for offsets {start:#x}..{end:#x}: {stack_map:?}");
1673 self.stack_maps.push(MachStackMap {
1674 offset: start,
1675 offset_end: end,
1676 stack_map,
1677 });
1678 }
1679
1680 /// Push a user stack map onto this buffer.
1681 ///
1682 /// The stack map is associated with the given `return_addr` code
1683 /// offset. This must be the PC for the instruction just *after* this stack
1684 /// map's associated instruction. For example in the sequence `call $foo;
1685 /// add r8, rax`, the `return_addr` must be the offset of the start of the
1686 /// `add` instruction.
1687 ///
1688 /// Stack maps must be pushed in sorted `return_addr` order.
1689 pub fn push_user_stack_map(
1690 &mut self,
1691 emit_state: &I::State,
1692 return_addr: CodeOffset,
1693 stack_map: ir::UserStackMap,
1694 ) {
1695 let span = emit_state.frame_layout().active_size();
1696 trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1697
1698 debug_assert!(
1699 self.user_stack_maps
1700 .last()
1701 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1702 "pushed stack maps out of order: {} is not less than {}",
1703 self.user_stack_maps.last().unwrap().0,
1704 return_addr,
1705 );
1706
1707 self.user_stack_maps.push((return_addr, span, stack_map));
1708 }
1709}
1710
1711impl<T: CompilePhase> MachBufferFinalized<T> {
1712 /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1713 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1714 &self.srclocs[..]
1715 }
1716
1717 /// Get the total required size for the code.
1718 pub fn total_size(&self) -> CodeOffset {
1719 self.data.len() as CodeOffset
1720 }
1721
1722 /// Return the code in this mach buffer as a hex string for testing purposes.
1723 pub fn stringify_code_bytes(&self) -> String {
1724 // This is pretty lame, but whatever ..
1725 use std::fmt::Write;
1726 let mut s = String::with_capacity(self.data.len() * 2);
1727 for b in &self.data {
1728 write!(&mut s, "{:02X}", b).unwrap();
1729 }
1730 s
1731 }
1732
1733 /// Get the code bytes.
1734 pub fn data(&self) -> &[u8] {
1735 // N.B.: we emit every section into the .text section as far as
1736 // the `CodeSink` is concerned; we do not bother to segregate
1737 // the contents into the actual program text, the jumptable and the
1738 // rodata (constant pool). This allows us to generate code assuming
1739 // that these will not be relocated relative to each other, and avoids
1740 // having to designate each section as belonging in one of the three
1741 // fixed categories defined by `CodeSink`. If this becomes a problem
1742 // later (e.g. because of memory permissions or similar), we can
1743 // add this designation and segregate the output; take care, however,
1744 // to add the appropriate relocations in this case.
1745
1746 &self.data[..]
1747 }
1748
1749 /// Get the list of external relocations for this code.
1750 pub fn relocs(&self) -> &[FinalizedMachReloc] {
1751 &self.relocs[..]
1752 }
1753
1754 /// Get the list of trap records for this code.
1755 pub fn traps(&self) -> &[MachTrap] {
1756 &self.traps[..]
1757 }
1758
1759 /// Get the stack map metadata for this code.
1760 pub fn stack_maps(&self) -> &[MachStackMap] {
1761 &self.stack_maps[..]
1762 }
1763
1764 /// Take this buffer's stack map metadata.
1765 pub fn take_stack_maps(&mut self) -> SmallVec<[MachStackMap; 8]> {
1766 mem::take(&mut self.stack_maps)
1767 }
1768
1769 /// Get the list of call sites for this code.
1770 pub fn call_sites(&self) -> &[MachCallSite] {
1771 &self.call_sites[..]
1772 }
1773}
1774
1775/// Metadata about a constant.
1776struct MachBufferConstant {
1777 /// A label which has not yet been bound which can be used for this
1778 /// constant.
1779 ///
1780 /// This is lazily created when a label is requested for a constant and is
1781 /// cleared when a constant is emitted.
1782 upcoming_label: Option<MachLabel>,
1783 /// Required alignment.
1784 align: CodeOffset,
1785 /// The byte size of this constant.
1786 size: usize,
1787}
1788
1789/// A trap that is deferred to the next time an island is emitted for either
1790/// traps, constants, or fixups.
1791struct MachLabelTrap {
1792 /// This label will refer to the trap's offset.
1793 label: MachLabel,
1794 /// The code associated with this trap.
1795 code: TrapCode,
1796 /// An optional source location to assign for this trap.
1797 loc: Option<RelSourceLoc>,
1798}
1799
1800/// A fixup to perform on the buffer once code is emitted. Fixups always refer
1801/// to labels and patch the code based on label offsets. Hence, they are like
1802/// relocations, but internal to one buffer.
1803#[derive(Debug)]
1804struct MachLabelFixup<I: VCodeInst> {
1805 /// The label whose offset controls this fixup.
1806 label: MachLabel,
1807 /// The offset to fix up / patch to refer to this label.
1808 offset: CodeOffset,
1809 /// The kind of fixup. This is architecture-specific; each architecture may have,
1810 /// e.g., several types of branch instructions, each with differently-sized
1811 /// offset fields and different places within the instruction to place the
1812 /// bits.
1813 kind: I::LabelUse,
1814}
1815
1816impl<I: VCodeInst> MachLabelFixup<I> {
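/// The last code offset this fixup's instruction can still reach with its
/// (architecture-specific) positive range; if code grows past this point
/// before the label is bound, a veneer is required.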
1817 fn deadline(&self) -> CodeOffset {
1818 self.offset.saturating_add(self.kind.max_pos_range())
1819 }
1820}
1821
1822impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1823 fn eq(&self, other: &Self) -> bool {
1824 self.deadline() == other.deadline()
1825 }
1826}
1827
1828impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1829
1830impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1831 fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1832 Some(self.cmp(other))
1833 }
1834}
1835
1836impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1837 fn cmp(&self, other: &Self) -> Ordering {
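// Note: the comparison is intentionally reversed so that the fixup with
// the *earliest* deadline compares as the greatest element, making the
// max-heap `BinaryHeap` of fixups pop nearest-deadline fixups first.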
1838 other.deadline().cmp(&self.deadline())
1839 }
1840}
1841
1842/// A relocation resulting from a compilation.
1843#[derive(Clone, Debug, PartialEq)]
1844#[cfg_attr(
1845 feature = "enable-serde",
1846 derive(serde_derive::Serialize, serde_derive::Deserialize)
1847)]
1848pub struct MachRelocBase<T> {
1849 /// The offset at which the relocation applies, *relative to the
1850 /// containing section*.
1851 pub offset: CodeOffset,
1852 /// The kind of relocation.
1853 pub kind: Reloc,
1854 /// The external symbol / name to which this relocation refers.
1855 pub target: T,
1856 /// The addend to add to the symbol value.
1857 pub addend: i64,
1858}
1859
1860type MachReloc = MachRelocBase<RelocTarget>;
1861
1862/// A relocation resulting from a compilation.
1863pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1864
1865/// A Relocation target
1866#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1867pub enum RelocTarget {
1868 /// Points to an [ExternalName] outside the current function.
1869 ExternalName(ExternalName),
1870 /// Points to a [MachLabel] inside this function.
1871 /// This is different from [MachLabelFixup] in that both the relocation and the
1872 /// label will be emitted and are only resolved at link time.
1873 ///
1874 /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1875 Label(MachLabel),
1876}
1877
1878impl From<ExternalName> for RelocTarget {
1879 fn from(name: ExternalName) -> Self {
1880 Self::ExternalName(name)
1881 }
1882}
1883
1884impl From<MachLabel> for RelocTarget {
1885 fn from(label: MachLabel) -> Self {
1886 Self::Label(label)
1887 }
1888}
1889
1890/// A Relocation target
1891#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1892#[cfg_attr(
1893 feature = "enable-serde",
1894 derive(serde_derive::Serialize, serde_derive::Deserialize)
1895)]
1896pub enum FinalizedRelocTarget {
1897 /// Points to an [ExternalName] outside the current function.
1898 ExternalName(ExternalName),
1899 /// Points to a [CodeOffset] from the start of the current function.
1900 Func(CodeOffset),
1901}
1902
1903impl FinalizedRelocTarget {
1904 /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1905 /// output.
1906 pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1907 match self {
1908 FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1909 FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1910 }
1911 }
1912}
1913
1914/// A trap record resulting from a compilation.
1915#[derive(Clone, Debug, PartialEq)]
1916#[cfg_attr(
1917 feature = "enable-serde",
1918 derive(serde_derive::Serialize, serde_derive::Deserialize)
1919)]
1920pub struct MachTrap {
1921 /// The offset at which the trap instruction occurs, *relative to the
1922 /// containing section*.
1923 pub offset: CodeOffset,
1924 /// The trap code.
1925 pub code: TrapCode,
1926}
1927
1928/// A call site record resulting from a compilation.
1929#[derive(Clone, Debug, PartialEq)]
1930#[cfg_attr(
1931 feature = "enable-serde",
1932 derive(serde_derive::Serialize, serde_derive::Deserialize)
1933)]
1934pub struct MachCallSite {
1935 /// The offset of the call's return address, *relative to the containing section*.
1936 pub ret_addr: CodeOffset,
1937}
1938
1939/// A source-location mapping resulting from a compilation.
1940#[derive(PartialEq, Debug, Clone)]
1941#[cfg_attr(
1942 feature = "enable-serde",
1943 derive(serde_derive::Serialize, serde_derive::Deserialize)
1944)]
1945pub struct MachSrcLoc<T: CompilePhase> {
1946 /// The start of the region of code corresponding to a source location.
1947 /// This is relative to the start of the function, not to the start of the
1948 /// section.
1949 pub start: CodeOffset,
1950 /// The end of the region of code corresponding to a source location.
1951 /// This is relative to the start of the function, not to the start of the
1952 /// section.
1953 pub end: CodeOffset,
1954 /// The source location.
1955 pub loc: T::SourceLocType,
1956}
1957
1958impl MachSrcLoc<Stencil> {
1959 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
1960 MachSrcLoc {
1961 start: self.start,
1962 end: self.end,
1963 loc: self.loc.expand(base_srcloc),
1964 }
1965 }
1966}
1967
1968/// Record of stack map metadata: stack offsets containing references.
1969#[derive(Clone, Debug, PartialEq)]
1970#[cfg_attr(
1971 feature = "enable-serde",
1972 derive(serde_derive::Serialize, serde_derive::Deserialize)
1973)]
1974pub struct MachStackMap {
1975 /// The code offset at which this stack map applies.
1976 pub offset: CodeOffset,
1977 /// The code offset just past the "end" of the instruction: that is, the
1978 /// offset of the first byte of the following instruction, or equivalently,
1979 /// the start offset plus the instruction length.
1980 pub offset_end: CodeOffset,
1981 /// The stack map itself.
1982 pub stack_map: StackMap,
1983}
1984
1985/// Record of branch instruction in the buffer, to facilitate editing.
1986#[derive(Clone, Debug)]
1987struct MachBranch {
1988 start: CodeOffset,
1989 end: CodeOffset,
1990 target: MachLabel,
1991 fixup: usize,
1992 inverted: Option<SmallVec<[u8; 8]>>,
1993 /// All labels pointing to the start of this branch. For correctness, this
1994 /// *must* be complete (i.e., must contain all labels whose resolved offsets
1995 /// are at the start of this branch): we rely on being able to redirect all
1996 /// labels that could jump to this branch before removing it, if it is
1997 /// otherwise unreachable.
1998 labels_at_this_branch: SmallVec<[MachLabel; 4]>,
1999}
2000
2001impl MachBranch {
2002 fn is_cond(&self) -> bool {
2003 self.inverted.is_some()
2004 }
2005 fn is_uncond(&self) -> bool {
2006 self.inverted.is_none()
2007 }
2008}
2009
2010/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2011///
2012/// Note that `MachBuffer` was primarily written for intra-function references
2013/// of jumps between basic blocks, but it's also quite usable for entire text
2014/// sections and resolving references between functions themselves. This
2015/// builder interprets "blocks" as labeled functions for the purposes of
2016/// resolving labels internally in the buffer.
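///
/// A rough usage sketch (names and values here are illustrative): create the
/// builder with the number of functions, `append` each function's machine
/// code (optionally labeled), resolve inter-function relocations as label
/// uses, and `finish` to obtain the final text-section bytes.
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(num_funcs);
/// let offset = builder.append(true, &func_body, 4, ctrl_plane);
/// builder.resolve_reloc(reloc_offset, reloc, addend, target_func_index);
/// let text = builder.finish(ctrl_plane);
/// ```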
2017pub struct MachTextSectionBuilder<I: VCodeInst> {
2018 buf: MachBuffer<I>,
2019 next_func: usize,
2020 force_veneers: ForceVeneers,
2021}
2022
2023impl<I: VCodeInst> MachTextSectionBuilder<I> {
2024 /// Creates a new text section builder which will have `num_funcs` functions
2025 /// pushed into it.
2026 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2027 let mut buf = MachBuffer::new();
2028 buf.reserve_labels_for_blocks(num_funcs);
2029 MachTextSectionBuilder {
2030 buf,
2031 next_func: 0,
2032 force_veneers: ForceVeneers::No,
2033 }
2034 }
2035}
2036
2037impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2038 fn append(
2039 &mut self,
2040 labeled: bool,
2041 func: &[u8],
2042 align: u32,
2043 ctrl_plane: &mut ControlPlane,
2044 ) -> u64 {
2045 // Conditionally emit an island if it's necessary to resolve jumps
2046 // between functions which are too far away.
2047 let size = func.len() as u32;
2048 if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2049 self.buf
2050 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2051 }
2052
2053 self.buf.align_to(align);
2054 let pos = self.buf.cur_offset();
2055 if labeled {
2056 self.buf.bind_label(
2057 MachLabel::from_block(BlockIndex::new(self.next_func)),
2058 ctrl_plane,
2059 );
2060 self.next_func += 1;
2061 }
2062 self.buf.put_data(func);
2063 u64::from(pos)
2064 }
2065
2066 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2067 crate::trace!(
2068 "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2069 );
2070 let label = MachLabel::from_block(BlockIndex::new(target));
2071 let offset = u32::try_from(offset).unwrap();
2072 match I::LabelUse::from_reloc(reloc, addend) {
2073 Some(label_use) => {
2074 self.buf.use_label_at_offset(offset, label, label_use);
2075 true
2076 }
2077 None => false,
2078 }
2079 }
2080
2081 fn force_veneers(&mut self) {
2082 self.force_veneers = ForceVeneers::Yes;
2083 }
2084
2085 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2086 // Double-check all functions were pushed.
2087 assert_eq!(self.next_func, self.buf.label_offsets.len());
2088
2089 // Finish up any veneers, if necessary.
2090 self.buf
2091 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2092
2093 // We don't need the data any more, so return it to the caller.
2094 mem::take(&mut self.buf.data).into_vec()
2095 }
2096}
2097
2098// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2099#[cfg(all(test, feature = "arm64"))]
2100mod test {
2101 use cranelift_entity::EntityRef as _;
2102
2103 use super::*;
2104 use crate::ir::UserExternalNameRef;
2105 use crate::isa::aarch64::inst::xreg;
2106 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2107 use crate::machinst::{MachInstEmit, MachInstEmitState};
2108 use crate::settings;
2109
2110 fn label(n: u32) -> MachLabel {
2111 MachLabel::from_block(BlockIndex::new(n as usize))
2112 }
2113 fn target(n: u32) -> BranchTarget {
2114 BranchTarget::Label(label(n))
2115 }
2116
2117 #[test]
2118 fn test_elide_jump_to_next() {
2119 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2120 let mut buf = MachBuffer::new();
2121 let mut state = <Inst as MachInstEmit>::State::default();
2122 let constants = Default::default();
2123
2124 buf.reserve_labels_for_blocks(2);
2125 buf.bind_label(label(0), state.ctrl_plane_mut());
2126 let inst = Inst::Jump { dest: target(1) };
2127 inst.emit(&mut buf, &info, &mut state);
2128 buf.bind_label(label(1), state.ctrl_plane_mut());
2129 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2130 assert_eq!(0, buf.total_size());
2131 }
2132
2133 #[test]
2134 fn test_elide_trivial_jump_blocks() {
2135 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2136 let mut buf = MachBuffer::new();
2137 let mut state = <Inst as MachInstEmit>::State::default();
2138 let constants = Default::default();
2139
2140 buf.reserve_labels_for_blocks(4);
2141
2142 buf.bind_label(label(0), state.ctrl_plane_mut());
2143 let inst = Inst::CondBr {
2144 kind: CondBrKind::NotZero(xreg(0)),
2145 taken: target(1),
2146 not_taken: target(2),
2147 };
2148 inst.emit(&mut buf, &info, &mut state);
2149
2150 buf.bind_label(label(1), state.ctrl_plane_mut());
2151 let inst = Inst::Jump { dest: target(3) };
2152 inst.emit(&mut buf, &info, &mut state);
2153
2154 buf.bind_label(label(2), state.ctrl_plane_mut());
2155 let inst = Inst::Jump { dest: target(3) };
2156 inst.emit(&mut buf, &info, &mut state);
2157
2158 buf.bind_label(label(3), state.ctrl_plane_mut());
2159
2160 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2161 assert_eq!(0, buf.total_size());
2162 }
2163
2164 #[test]
2165 fn test_flip_cond() {
2166 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2167 let mut buf = MachBuffer::new();
2168 let mut state = <Inst as MachInstEmit>::State::default();
2169 let constants = Default::default();
2170
2171 buf.reserve_labels_for_blocks(4);
2172
2173 buf.bind_label(label(0), state.ctrl_plane_mut());
2174 let inst = Inst::CondBr {
2175 kind: CondBrKind::Zero(xreg(0)),
2176 taken: target(1),
2177 not_taken: target(2),
2178 };
2179 inst.emit(&mut buf, &info, &mut state);
2180
2181 buf.bind_label(label(1), state.ctrl_plane_mut());
2182 let inst = Inst::Nop4;
2183 inst.emit(&mut buf, &info, &mut state);
2184
2185 buf.bind_label(label(2), state.ctrl_plane_mut());
2186 let inst = Inst::Udf {
2187 trap_code: TrapCode::Interrupt,
2188 };
2189 inst.emit(&mut buf, &info, &mut state);
2190
2191 buf.bind_label(label(3), state.ctrl_plane_mut());
2192
2193 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2194
2195 let mut buf2 = MachBuffer::new();
2196 let mut state = Default::default();
2197 let inst = Inst::TrapIf {
2198 kind: CondBrKind::NotZero(xreg(0)),
2199 trap_code: TrapCode::Interrupt,
2200 };
2201 inst.emit(&mut buf2, &info, &mut state);
2202 let inst = Inst::Nop4;
2203 inst.emit(&mut buf2, &info, &mut state);
2204
2205 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2206
2207 assert_eq!(buf.data, buf2.data);
2208 }
2209
2210 #[test]
2211 fn test_island() {
2212 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2213 let mut buf = MachBuffer::new();
2214 let mut state = <Inst as MachInstEmit>::State::default();
2215 let constants = Default::default();
2216
2217 buf.reserve_labels_for_blocks(4);
2218
2219 buf.bind_label(label(0), state.ctrl_plane_mut());
2220 let inst = Inst::CondBr {
2221 kind: CondBrKind::NotZero(xreg(0)),
2222 taken: target(2),
2223 not_taken: target(3),
2224 };
2225 inst.emit(&mut buf, &info, &mut state);
2226
2227 buf.bind_label(label(1), state.ctrl_plane_mut());
2228 while buf.cur_offset() < 2000000 {
2229 if buf.island_needed(0) {
2230 buf.emit_island(0, state.ctrl_plane_mut());
2231 }
2232 let inst = Inst::Nop4;
2233 inst.emit(&mut buf, &info, &mut state);
2234 }
2235
2236 buf.bind_label(label(2), state.ctrl_plane_mut());
2237 let inst = Inst::Nop4;
2238 inst.emit(&mut buf, &info, &mut state);
2239
2240 buf.bind_label(label(3), state.ctrl_plane_mut());
2241 let inst = Inst::Nop4;
2242 inst.emit(&mut buf, &info, &mut state);
2243
2244 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2245
2246 assert_eq!(2000000 + 8, buf.total_size());
2247
2248 let mut buf2 = MachBuffer::new();
2249 let mut state = Default::default();
2250 let inst = Inst::CondBr {
2251 kind: CondBrKind::NotZero(xreg(0)),
2252
2253 // This conditionally taken branch has a 19-bit constant, shifted
2254 // to the left by two, giving us a 21-bit range in total. Half of
2255 // this range is positive, so we should be around 1 << 20 bytes
2256 // away from our jump target.
2257 //
2258 // There are two pending fixups by the time we reach this point,
2259 // one for this 19-bit jump and one for the unconditional 26-bit
2260 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2261 // veneer is 20 bytes large, so we pessimistically assume we'll
2262 // need two veneers. Currently each veneer is pessimistically
2263 // assumed to be the maximal size, which means we need 40 bytes of
2264 // extra space, meaning that the actual island should come
2265 // 40 bytes before the deadline.
2266 taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2267
2268 // This branch is in-range so no veneers should be needed, it should
2269 // go directly to the target.
2270 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2271 };
2272 inst.emit(&mut buf2, &info, &mut state);
2273
2274 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2275
2276 assert_eq!(&buf.data[0..8], &buf2.data[..]);
2277 }
2278
2279 #[test]
2280 fn test_island_backward() {
2281 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2282 let mut buf = MachBuffer::new();
2283 let mut state = <Inst as MachInstEmit>::State::default();
2284 let constants = Default::default();
2285
2286 buf.reserve_labels_for_blocks(4);
2287
2288 buf.bind_label(label(0), state.ctrl_plane_mut());
2289 let inst = Inst::Nop4;
2290 inst.emit(&mut buf, &info, &mut state);
2291
2292 buf.bind_label(label(1), state.ctrl_plane_mut());
2293 let inst = Inst::Nop4;
2294 inst.emit(&mut buf, &info, &mut state);
2295
2296 buf.bind_label(label(2), state.ctrl_plane_mut());
2297 while buf.cur_offset() < 2000000 {
2298 let inst = Inst::Nop4;
2299 inst.emit(&mut buf, &info, &mut state);
2300 }
2301
2302 buf.bind_label(label(3), state.ctrl_plane_mut());
2303 let inst = Inst::CondBr {
2304 kind: CondBrKind::NotZero(xreg(0)),
2305 taken: target(0),
2306 not_taken: target(1),
2307 };
2308 inst.emit(&mut buf, &info, &mut state);
2309
2310 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2311
2312 assert_eq!(2000000 + 12, buf.total_size());
2313
2314 let mut buf2 = MachBuffer::new();
2315 let mut state = Default::default();
2316 let inst = Inst::CondBr {
2317 kind: CondBrKind::NotZero(xreg(0)),
2318 taken: BranchTarget::ResolvedOffset(8),
2319 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2320 };
2321 inst.emit(&mut buf2, &info, &mut state);
2322 let inst = Inst::Jump {
2323 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2324 };
2325 inst.emit(&mut buf2, &info, &mut state);
2326
2327 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2328
2329 assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2330 }
2331
2332 #[test]
2333 fn test_multiple_redirect() {
2334 // label0:
2335 // cbz x0, label1
2336 // b label2
2337 // label1:
2338 // b label3
2339 // label2:
2340 // nop
2341 // nop
2342 // b label0
2343 // label3:
2344 // b label4
2345 // label4:
2346 // b label5
2347 // label5:
2348 // b label7
2349 // label6:
2350 // nop
2351 // label7:
2352 // ret
2353 //
2354 // -- should become:
2355 //
2356 // label0:
2357 // cbz x0, label7
2358 // label2:
2359 // nop
2360 // nop
2361 // b label0
2362 // label6:
2363 // nop
2364 // label7:
2365 // ret
2366
2367 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2368 let mut buf = MachBuffer::new();
2369 let mut state = <Inst as MachInstEmit>::State::default();
2370 let constants = Default::default();
2371
2372 buf.reserve_labels_for_blocks(8);
2373
2374 buf.bind_label(label(0), state.ctrl_plane_mut());
2375 let inst = Inst::CondBr {
2376 kind: CondBrKind::Zero(xreg(0)),
2377 taken: target(1),
2378 not_taken: target(2),
2379 };
2380 inst.emit(&mut buf, &info, &mut state);
2381
2382 buf.bind_label(label(1), state.ctrl_plane_mut());
2383 let inst = Inst::Jump { dest: target(3) };
2384 inst.emit(&mut buf, &info, &mut state);
2385
2386 buf.bind_label(label(2), state.ctrl_plane_mut());
2387 let inst = Inst::Nop4;
2388 inst.emit(&mut buf, &info, &mut state);
2389 inst.emit(&mut buf, &info, &mut state);
2390 let inst = Inst::Jump { dest: target(0) };
2391 inst.emit(&mut buf, &info, &mut state);
2392
2393 buf.bind_label(label(3), state.ctrl_plane_mut());
2394 let inst = Inst::Jump { dest: target(4) };
2395 inst.emit(&mut buf, &info, &mut state);
2396
2397 buf.bind_label(label(4), state.ctrl_plane_mut());
2398 let inst = Inst::Jump { dest: target(5) };
2399 inst.emit(&mut buf, &info, &mut state);
2400
2401 buf.bind_label(label(5), state.ctrl_plane_mut());
2402 let inst = Inst::Jump { dest: target(7) };
2403 inst.emit(&mut buf, &info, &mut state);
2404
2405 buf.bind_label(label(6), state.ctrl_plane_mut());
2406 let inst = Inst::Nop4;
2407 inst.emit(&mut buf, &info, &mut state);
2408
2409 buf.bind_label(label(7), state.ctrl_plane_mut());
2410 let inst = Inst::Ret {};
2411 inst.emit(&mut buf, &info, &mut state);
2412
2413 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2414
2415 let golden_data = vec![
2416 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2417 0x1f, 0x20, 0x03, 0xd5, // nop
2418 0x1f, 0x20, 0x03, 0xd5, // nop
2419 0xfd, 0xff, 0xff, 0x17, // b 0
2420 0x1f, 0x20, 0x03, 0xd5, // nop
2421 0xc0, 0x03, 0x5f, 0xd6, // ret
2422 ];
2423
2424 assert_eq!(&golden_data[..], &buf.data[..]);
2425 }
2426
2427 #[test]
2428 fn test_handle_branch_cycle() {
2429 // label0:
2430 // b label1
2431 // label1:
2432 // b label2
2433 // label2:
2434 // b label3
2435 // label3:
2436 // b label4
2437 // label4:
2438 // b label1 // note: not label0 (to make it interesting).
2439 //
2440 // -- should become:
2441 //
2442 // label0, label1, ..., label4:
2443 // b label0
2444 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2445 let mut buf = MachBuffer::new();
2446 let mut state = <Inst as MachInstEmit>::State::default();
2447 let constants = Default::default();
2448
2449 buf.reserve_labels_for_blocks(5);
2450
2451 buf.bind_label(label(0), state.ctrl_plane_mut());
2452 let inst = Inst::Jump { dest: target(1) };
2453 inst.emit(&mut buf, &info, &mut state);
2454
2455 buf.bind_label(label(1), state.ctrl_plane_mut());
2456 let inst = Inst::Jump { dest: target(2) };
2457 inst.emit(&mut buf, &info, &mut state);
2458
2459 buf.bind_label(label(2), state.ctrl_plane_mut());
2460 let inst = Inst::Jump { dest: target(3) };
2461 inst.emit(&mut buf, &info, &mut state);
2462
2463 buf.bind_label(label(3), state.ctrl_plane_mut());
2464 let inst = Inst::Jump { dest: target(4) };
2465 inst.emit(&mut buf, &info, &mut state);
2466
2467 buf.bind_label(label(4), state.ctrl_plane_mut());
2468 let inst = Inst::Jump { dest: target(1) };
2469 inst.emit(&mut buf, &info, &mut state);
2470
2471 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2472
2473 let golden_data = vec![
2474 0x00, 0x00, 0x00, 0x14, // b 0
2475 ];
2476
2477 assert_eq!(&golden_data[..], &buf.data[..]);
2478 }
2479
2480 #[test]
2481 fn metadata_records() {
2482 let mut buf = MachBuffer::<Inst>::new();
2483 let ctrl_plane = &mut Default::default();
2484 let constants = Default::default();
2485
2486 buf.reserve_labels_for_blocks(1);
2487
2488 buf.bind_label(label(0), ctrl_plane);
2489 buf.put1(1);
2490 buf.add_trap(TrapCode::HeapOutOfBounds);
2491 buf.put1(2);
2492 buf.add_trap(TrapCode::IntegerOverflow);
2493 buf.add_trap(TrapCode::IntegerDivisionByZero);
2494 buf.add_call_site();
2495 buf.add_reloc(
2496 Reloc::Abs4,
2497 &ExternalName::User(UserExternalNameRef::new(0)),
2498 0,
2499 );
2500 buf.put1(3);
2501 buf.add_reloc(
2502 Reloc::Abs8,
2503 &ExternalName::User(UserExternalNameRef::new(1)),
2504 1,
2505 );
2506 buf.put1(4);
2507
2508 let buf = buf.finish(&constants, ctrl_plane);
2509
2510 assert_eq!(buf.data(), &[1, 2, 3, 4]);
2511 assert_eq!(
2512 buf.traps()
2513 .iter()
2514 .map(|trap| (trap.offset, trap.code))
2515 .collect::<Vec<_>>(),
2516 vec![
2517 (1, TrapCode::HeapOutOfBounds),
2518 (2, TrapCode::IntegerOverflow),
2519 (2, TrapCode::IntegerDivisionByZero)
2520 ]
2521 );
2522 assert_eq!(
2523 buf.call_sites()
2524 .iter()
2525 .map(|call_site| call_site.ret_addr)
2526 .collect::<Vec<_>>(),
2527 vec![2]
2528 );
2529 assert_eq!(
2530 buf.relocs()
2531 .iter()
2532 .map(|reloc| (reloc.offset, reloc.kind))
2533 .collect::<Vec<_>>(),
2534 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2535 );
2536 }
2537}