ARM64 backend, part 3 / 11: MachInst infrastructure.

This patch adds the MachInst, or Machine Instruction, infrastructure.
This is the machine-independent portion of the new backend design. It
contains the implementation of the "vcode" (virtual-registerized code)
container, the top-level lowering algorithm and compilation pipeline,
and the trait definitions that the machine backends will fill in.

This backend infrastructure is included in the compilation of the
`codegen` crate, but it is not yet tied into the public APIs; that patch
will come last, after all the other pieces are filled in.

This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.

Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
Author: Chris Fallin
Date: 2020-04-09 12:27:26 -07:00
Commit: d83574261c (parent: f80fe949c6)
14 changed files with 2662 additions and 2 deletions


@@ -0,0 +1,738 @@
//! This implements the VCode container: a CFG of Insts that have been lowered.
//!
//! VCode is virtual-register code. An instruction in VCode is almost a machine
//! instruction; however, its register slots can refer to virtual registers in
//! addition to real machine registers.
//!
//! VCode is structured with traditional basic blocks, and
//! each block must be terminated by an unconditional branch (one target), a
//! conditional branch (two targets), an indirect branch (many targets, e.g.
//! for a jump table), or a return (no targets). Note that this
//! slightly differs from the machine code of most ISAs: in most ISAs, a
//! conditional branch has one target (and the not-taken case falls through).
//! However, we expect that machine backends will elide branches to the following
//! block (i.e., zero-offset jumps), and will be able to codegen a branch-cond /
//! branch-uncond pair if *both* targets are not fallthrough. This allows us to
//! play with layout prior to final binary emission, as well, if we want.
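//!
//! As a sketch of that convention (mnemonics here are illustrative, not a
//! concrete ISA): a VCode conditional branch names both targets, and the
//! emission stage drops whichever side is the fallthrough block in the
//! final layout:
//!
//! ```text
//! block0:
//!   ...
//!   condbr taken=block2, not_taken=block1
//! block1:  ;; laid out next, so the not_taken branch is elided
//!   ...
//! block2:
//!   ...
//! ```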
//!
//! See the main module comment in `mod.rs` for more details on the VCode-based
//! backend pipeline.
use crate::binemit::Reloc;
use crate::ir;
use crate::machinst::*;
use crate::settings;
use regalloc::Function as RegallocFunction;
use regalloc::Set as RegallocSet;
use regalloc::{BlockIx, InstIx, Range, RegAllocResult, RegClass, RegUsageCollector};
use alloc::boxed::Box;
use alloc::vec::Vec;
use log::debug;
use smallvec::SmallVec;
use std::fmt;
use std::iter;
use std::ops::Index;
use std::string::String;
/// Index referring to an instruction in VCode.
pub type InsnIndex = u32;
/// Index referring to a basic block in VCode.
pub type BlockIndex = u32;
/// VCodeInst wraps all requirements for a MachInst to be in VCode: it must be
/// a `MachInst` and it must be able to emit itself both to a `MachSection`
/// (for binary emission) and to a `MachSectionSize` (for size computation).
pub trait VCodeInst: MachInst + MachInstEmit<MachSection> + MachInstEmit<MachSectionSize> {}
impl<I: MachInst + MachInstEmit<MachSection> + MachInstEmit<MachSectionSize>> VCodeInst for I {}
/// A function in "VCode" (virtualized-register code) form, after lowering.
/// This is essentially a standard CFG of basic blocks, where each basic block
/// consists of lowered instructions produced by the machine-specific backend.
pub struct VCode<I: VCodeInst> {
/// Function liveins.
liveins: RegallocSet<RealReg>,
/// Function liveouts.
liveouts: RegallocSet<RealReg>,
/// VReg IR-level types.
vreg_types: Vec<Type>,
/// Lowered machine instructions in order corresponding to the original IR.
pub insts: Vec<I>,
/// Entry block.
entry: BlockIndex,
/// Block instruction indices.
pub block_ranges: Vec<(InsnIndex, InsnIndex)>,
/// Block successors: index range in the successor-list below.
block_succ_range: Vec<(usize, usize)>,
/// Block successor lists, concatenated into one Vec. The `block_succ_range`
/// list of tuples above gives (start, end) ranges within this list that
/// correspond to each basic block's successors.
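/// For example (a worked case, not from any particular input): four blocks
/// with successor lists `[1]`, `[2, 3]`, `[]`, and `[1]` are stored as
/// `block_succs = [1, 2, 3, 1]` with
/// `block_succ_range = [(0, 1), (1, 3), (3, 3), (3, 4)]`.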
block_succs: Vec<BlockIndex>,
/// Block indices by IR block.
block_by_bb: SecondaryMap<ir::Block, BlockIndex>,
/// IR block for each VCode Block. The length of this Vec will likely be
/// less than the total number of Blocks, because new Blocks (for edge
/// splits, for example) are appended during lowering.
bb_by_block: Vec<ir::Block>,
/// Order of block IDs in final generated code.
final_block_order: Vec<BlockIndex>,
/// Final block offsets. Computed during branch finalization and used
/// during emission.
final_block_offsets: Vec<CodeOffset>,
/// Size of code, accounting for block layout / alignment.
code_size: CodeOffset,
/// ABI object.
abi: Box<dyn ABIBody<I>>,
}
/// A builder for a VCode function body. This builder is designed for the
/// lowering approach that we take: we traverse basic blocks in forward
/// (original IR) order, but within each basic block, we generate code from
/// bottom to top; and within each IR instruction that we visit in this reverse
/// order, we emit machine instructions in *forward* order again.
///
/// Hence, to produce the final instructions in proper order, we perform two
/// reversals. First, the machine instructions (`I` instances) are produced in
/// forward order for an individual IR instruction. Then these are *reversed*
/// and concatenated to `bb_insns` at the end of the IR instruction lowering.
/// The `bb_insns` vec will thus contain all machine instructions for a basic
/// block, in reverse order. Finally, when we're done with a basic block, we
/// reverse the whole block's vec of instructions again, and concatenate onto
/// the VCode's insts.
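///
/// As a worked example: suppose an IR block contains insts `A` then `B`, and
/// lowering produces machine insts `a1, a2` for `A` and `b1, b2` for `B`.
/// Visiting bottom-to-top, `b1, b2` are pushed in forward order and reversed
/// into `bb_insns` as `b2, b1`; after `A` is lowered, `bb_insns` holds
/// `b2, b1, a2, a1`. The final reversal in `end_bb` restores the intended
/// order `a1, a2, b1, b2`.
///
/// A condensed usage sketch (the values `abi`, `blocks`, `entry_bb`, and the
/// lowering loop are placeholders, not the concrete lowering driver):
///
/// ```ignore
/// let mut b = VCodeBuilder::new(abi);
/// b.init_bb_map(&blocks);
/// b.set_entry(b.bb_to_bindex(entry_bb));
/// for ir_inst in ir_insts_bottom_to_top {
///     for mach_inst in lower(ir_inst) {
///         b.push(mach_inst); // forward order within one IR inst
///     }
///     b.end_ir_inst();
/// }
/// let block_index = b.end_bb();
/// let vcode = b.build();
/// ```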
pub struct VCodeBuilder<I: VCodeInst> {
/// In-progress VCode.
vcode: VCode<I>,
/// Current basic block instructions, in reverse order (because blocks are
/// built bottom-to-top).
bb_insns: SmallVec<[I; 32]>,
/// Current IR-inst instructions, in forward order.
ir_inst_insns: SmallVec<[I; 4]>,
/// Start of succs for the current block in the concatenated succs list.
succ_start: usize,
}
impl<I: VCodeInst> VCodeBuilder<I> {
/// Create a new VCodeBuilder.
pub fn new(abi: Box<dyn ABIBody<I>>) -> VCodeBuilder<I> {
let vcode = VCode::new(abi);
VCodeBuilder {
vcode,
bb_insns: SmallVec::new(),
ir_inst_insns: SmallVec::new(),
succ_start: 0,
}
}
/// Access the ABI object.
pub fn abi(&mut self) -> &mut dyn ABIBody<I> {
&mut *self.vcode.abi
}
/// Set the type of a VReg.
pub fn set_vreg_type(&mut self, vreg: VirtualReg, ty: Type) {
while self.vcode.vreg_types.len() <= vreg.get_index() {
self.vcode.vreg_types.push(ir::types::I8); // Default type.
}
self.vcode.vreg_types[vreg.get_index()] = ty;
}
/// Return the underlying bb-to-BlockIndex map.
pub fn blocks_by_bb(&self) -> &SecondaryMap<ir::Block, BlockIndex> {
&self.vcode.block_by_bb
}
/// Initialize the bb-to-BlockIndex map. Returns the first free
/// BlockIndex.
pub fn init_bb_map(&mut self, blocks: &[ir::Block]) -> BlockIndex {
let mut bindex: BlockIndex = 0;
for bb in blocks.iter() {
self.vcode.block_by_bb[*bb] = bindex;
self.vcode.bb_by_block.push(*bb);
bindex += 1;
}
bindex
}
/// Get the BlockIndex for an IR block.
pub fn bb_to_bindex(&self, bb: ir::Block) -> BlockIndex {
self.vcode.block_by_bb[bb]
}
/// Set the current block as the entry block.
pub fn set_entry(&mut self, block: BlockIndex) {
self.vcode.entry = block;
}
/// End the current IR instruction. Must be called after pushing any
/// instructions and prior to ending the basic block.
pub fn end_ir_inst(&mut self) {
while let Some(i) = self.ir_inst_insns.pop() {
self.bb_insns.push(i);
}
}
/// End the current basic block. Must be called after emitting vcode insts
/// for IR insts and prior to ending the function (building the VCode).
pub fn end_bb(&mut self) -> BlockIndex {
assert!(self.ir_inst_insns.is_empty());
let block_num = self.vcode.block_ranges.len() as BlockIndex;
// Push the instructions.
let start_idx = self.vcode.insts.len() as InsnIndex;
while let Some(i) = self.bb_insns.pop() {
self.vcode.insts.push(i);
}
let end_idx = self.vcode.insts.len() as InsnIndex;
// Add the instruction index range to the list of blocks.
self.vcode.block_ranges.push((start_idx, end_idx));
// End the successors list.
let succ_end = self.vcode.block_succs.len();
self.vcode
.block_succ_range
.push((self.succ_start, succ_end));
self.succ_start = succ_end;
block_num
}
/// Push an instruction for the current BB and current IR inst within the BB.
pub fn push(&mut self, insn: I) {
match insn.is_term() {
MachTerminator::None | MachTerminator::Ret => {}
MachTerminator::Uncond(target) => {
self.vcode.block_succs.push(target);
}
MachTerminator::Cond(true_branch, false_branch) => {
self.vcode.block_succs.push(true_branch);
self.vcode.block_succs.push(false_branch);
}
MachTerminator::Indirect(targets) => {
for target in targets {
self.vcode.block_succs.push(*target);
}
}
}
self.ir_inst_insns.push(insn);
}
/// Build the final VCode.
pub fn build(self) -> VCode<I> {
assert!(self.ir_inst_insns.is_empty());
assert!(self.bb_insns.is_empty());
self.vcode
}
}
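// Converts the regalloc's per-block instruction start indices into
// (start, end) ranges. A worked example: start indices `[0, 2, 5]` with
// `len = 7` yield `[(0, 2), (2, 5), (5, 7)]`.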
fn block_ranges(indices: &[InstIx], len: usize) -> Vec<(usize, usize)> {
let v = indices
.iter()
.map(|iix| iix.get() as usize)
.chain(iter::once(len))
.collect::<Vec<usize>>();
v.windows(2).map(|p| (p[0], p[1])).collect()
}
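// A move is redundant once the register allocator has assigned both operands
// to the same register: e.g. (register names illustrative), a move `v5 <- v7`
// whose vregs both land in `x3` becomes `x3 <- x3` and can be dropped.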
fn is_redundant_move<I: VCodeInst>(insn: &I) -> bool {
if let Some((to, from)) = insn.is_move() {
to.to_reg() == from
} else {
false
}
}
fn is_trivial_jump_block<I: VCodeInst>(vcode: &VCode<I>, block: BlockIndex) -> Option<BlockIndex> {
let range = vcode.block_insns(BlockIx::new(block));
debug!(
"is_trivial_jump_block: block {} has len {}",
block,
range.len()
);
if range.len() != 1 {
return None;
}
let insn = range.first();
debug!(
" -> only insn is: {:?} with terminator {:?}",
vcode.get_insn(insn),
vcode.get_insn(insn).is_term()
);
match vcode.get_insn(insn).is_term() {
MachTerminator::Uncond(target) => Some(target),
_ => None,
}
}
impl<I: VCodeInst> VCode<I> {
/// New empty VCode.
fn new(abi: Box<dyn ABIBody<I>>) -> VCode<I> {
VCode {
liveins: abi.liveins(),
liveouts: abi.liveouts(),
vreg_types: vec![],
insts: vec![],
entry: 0,
block_ranges: vec![],
block_succ_range: vec![],
block_succs: vec![],
block_by_bb: SecondaryMap::with_default(0),
bb_by_block: vec![],
final_block_order: vec![],
final_block_offsets: vec![],
code_size: 0,
abi,
}
}
/// Get the IR-level type of a VReg.
pub fn vreg_type(&self, vreg: VirtualReg) -> Type {
self.vreg_types[vreg.get_index()]
}
/// Get the entry block.
pub fn entry(&self) -> BlockIndex {
self.entry
}
/// Get the number of blocks. Block indices will be in the range `0 ..
/// self.num_blocks()`.
pub fn num_blocks(&self) -> usize {
self.block_ranges.len()
}
/// Stack frame size for the full function's body.
pub fn frame_size(&self) -> u32 {
self.abi.frame_size()
}
/// Get the successors for a block.
pub fn succs(&self, block: BlockIndex) -> &[BlockIndex] {
let (start, end) = self.block_succ_range[block as usize];
&self.block_succs[start..end]
}
/// Take the results of register allocation, whose instruction sequence
/// includes spliced-in spill/reload/move instructions, and replace this
/// VCode's instructions with it.
pub fn replace_insns_from_regalloc(
&mut self,
result: RegAllocResult<Self>,
flags: &settings::Flags,
) {
self.final_block_order = compute_final_block_order(self);
// Record the spillslot count and clobbered registers for the ABI/stack
// setup code.
self.abi.set_num_spillslots(result.num_spill_slots as usize);
self.abi
.set_clobbered(result.clobbered_registers.map(|r| Writable::from_reg(*r)));
// We want to move instructions over in final block order, using the new
// block-start map given by the regalloc.
let block_ranges: Vec<(usize, usize)> =
block_ranges(result.target_map.elems(), result.insns.len());
let mut final_insns = vec![];
let mut final_block_ranges = vec![(0, 0); self.num_blocks()];
for block in &self.final_block_order {
let (start, end) = block_ranges[*block as usize];
let final_start = final_insns.len() as InsnIndex;
if *block == self.entry {
// Start with the prologue.
final_insns.extend(self.abi.gen_prologue(flags).into_iter());
}
for i in start..end {
let insn = &result.insns[i];
// Elide redundant moves at this point (we only know what is
// redundant once registers are allocated).
if is_redundant_move(insn) {
continue;
}
// When we encounter a return instruction, replace it with
// the epilogue.
let is_ret = insn.is_term() == MachTerminator::Ret;
if is_ret {
final_insns.extend(self.abi.gen_epilogue(flags).into_iter());
} else {
final_insns.push(insn.clone());
}
}
let final_end = final_insns.len() as InsnIndex;
final_block_ranges[*block as usize] = (final_start, final_end);
}
self.insts = final_insns;
self.block_ranges = final_block_ranges;
}
/// Removes redundant branches, rewriting targets to point directly to the
/// ultimate block at the end of a chain of trivial one-target jumps.
pub fn remove_redundant_branches(&mut self) {
// For each block, compute the actual target block, looking through up to one
// block that consists only of a single-target jump (this removes the empty
// edge blocks inserted by phi-lowering).
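// An illustrative shape (not a specific input): if block 3 consists solely
// of `jump block5`, every branch targeting block 3 is rewritten to target
// block 5 directly; block 3's refcount then drops to zero and it is removed
// from the final block order.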
let block_rewrites: Vec<BlockIndex> = (0..self.num_blocks() as u32)
.map(|bix| is_trivial_jump_block(self, bix).unwrap_or(bix))
.collect();
let mut refcounts: Vec<usize> = vec![0; self.num_blocks()];
debug!(
"remove_redundant_branches: block_rewrites = {:?}",
block_rewrites
);
refcounts[self.entry as usize] = 1;
for block in 0..self.num_blocks() as u32 {
for insn in self.block_insns(BlockIx::new(block)) {
self.get_insn_mut(insn)
.with_block_rewrites(&block_rewrites[..]);
match self.get_insn(insn).is_term() {
MachTerminator::Uncond(bix) => {
refcounts[bix as usize] += 1;
}
MachTerminator::Cond(bix1, bix2) => {
refcounts[bix1 as usize] += 1;
refcounts[bix2 as usize] += 1;
}
MachTerminator::Indirect(blocks) => {
for block in blocks {
refcounts[*block as usize] += 1;
}
}
_ => {}
}
}
}
let deleted: Vec<bool> = refcounts.iter().map(|r| *r == 0).collect();
let block_order = std::mem::replace(&mut self.final_block_order, vec![]);
self.final_block_order = block_order
.into_iter()
.filter(|b| !deleted[*b as usize])
.collect();
// Rewrite successor information based on the block-rewrite map.
for succ in &mut self.block_succs {
let new_succ = block_rewrites[*succ as usize];
*succ = new_succ;
}
}
/// Mutate branch instructions to (i) lower two-way condbrs to one-way,
/// depending on fallthrough; and (ii) use concrete offsets.
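/// For example (illustrative): if `block1` immediately follows `block0` in
/// the final order, a two-target conditional branch in `block0` with
/// not-taken target `block1` is rewritten into a one-way conditional branch,
/// with `block1` reached by fallthrough; its remaining target is then
/// resolved to a concrete code offset.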
pub fn finalize_branches(&mut self)
where
I: MachInstEmit<MachSectionSize>,
{
// Compute fallthrough block, indexed by block.
let num_final_blocks = self.final_block_order.len();
let mut block_fallthrough: Vec<Option<BlockIndex>> = vec![None; self.num_blocks()];
for i in 0..(num_final_blocks - 1) {
let from = self.final_block_order[i];
let to = self.final_block_order[i + 1];
block_fallthrough[from as usize] = Some(to);
}
// Pass over VCode instructions and finalize two-way branches into
// one-way branches with fallthrough.
for block in 0..self.num_blocks() {
let next_block = block_fallthrough[block];
let (start, end) = self.block_ranges[block];
for iix in start..end {
let insn = &mut self.insts[iix as usize];
insn.with_fallthrough_block(next_block);
}
}
// Compute block offsets.
let mut code_section = MachSectionSize::new(0);
let mut block_offsets = vec![0; self.num_blocks()];
for block in &self.final_block_order {
code_section.offset = I::align_basic_block(code_section.offset);
block_offsets[*block as usize] = code_section.offset;
let (start, end) = self.block_ranges[*block as usize];
for iix in start..end {
self.insts[iix as usize].emit(&mut code_section);
}
}
// We now have the section layout.
self.final_block_offsets = block_offsets;
self.code_size = code_section.size();
// Update branches with known block offsets. This looks like the
// traversal above, but (i) does not update block_offsets, rather uses
// it (so forward references are now possible), and (ii) mutates the
// instructions.
let mut code_section = MachSectionSize::new(0);
for block in &self.final_block_order {
code_section.offset = I::align_basic_block(code_section.offset);
let (start, end) = self.block_ranges[*block as usize];
for iix in start..end {
self.insts[iix as usize]
.with_block_offsets(code_section.offset, &self.final_block_offsets[..]);
self.insts[iix as usize].emit(&mut code_section);
}
}
}
/// Emit the instructions to a list of sections.
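/// Alignment padding is materialized here: e.g. (illustrative numbers; the
/// alignment itself comes from `I::align_basic_block`), a block aligned to
/// 16 bytes starting while the section is at offset 0x104 is preceded by
/// NOPs covering 12 bytes, bringing the offset to 0x110.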
pub fn emit(&self) -> MachSections
where
I: MachInstEmit<MachSection>,
{
let mut sections = MachSections::new();
let code_idx = sections.add_section(0, self.code_size);
let code_section = sections.get_section(code_idx);
for block in &self.final_block_order {
let new_offset = I::align_basic_block(code_section.cur_offset_from_start());
while new_offset > code_section.cur_offset_from_start() {
// Pad with NOPs up to the aligned block offset.
let nop = I::gen_nop((new_offset - code_section.cur_offset_from_start()) as usize);
nop.emit(code_section);
}
assert_eq!(code_section.cur_offset_from_start(), new_offset);
let (start, end) = self.block_ranges[*block as usize];
for iix in start..end {
self.insts[iix as usize].emit(code_section);
}
}
sections
}
/// Get the IR block for a BlockIndex, if one exists.
pub fn bindex_to_bb(&self, block: BlockIndex) -> Option<ir::Block> {
if (block as usize) < self.bb_by_block.len() {
Some(self.bb_by_block[block as usize])
} else {
None
}
}
}
impl<I: VCodeInst> RegallocFunction for VCode<I> {
type Inst = I;
fn insns(&self) -> &[I] {
&self.insts[..]
}
fn insns_mut(&mut self) -> &mut [I] {
&mut self.insts[..]
}
fn get_insn(&self, insn: InstIx) -> &I {
&self.insts[insn.get() as usize]
}
fn get_insn_mut(&mut self, insn: InstIx) -> &mut I {
&mut self.insts[insn.get() as usize]
}
fn blocks(&self) -> Range<BlockIx> {
Range::new(BlockIx::new(0), self.block_ranges.len())
}
fn entry_block(&self) -> BlockIx {
BlockIx::new(self.entry)
}
fn block_insns(&self, block: BlockIx) -> Range<InstIx> {
let (start, end) = self.block_ranges[block.get() as usize];
Range::new(InstIx::new(start), (end - start) as usize)
}
fn block_succs(&self, block: BlockIx) -> Vec<BlockIx> {
let (start, end) = self.block_succ_range[block.get() as usize];
self.block_succs[start..end]
.iter()
.cloned()
.map(BlockIx::new)
.collect()
}
fn is_ret(&self, insn: InstIx) -> bool {
match self.insts[insn.get() as usize].is_term() {
MachTerminator::Ret => true,
_ => false,
}
}
fn get_regs(insn: &I, collector: &mut RegUsageCollector) {
insn.get_regs(collector)
}
fn map_regs(
insn: &mut I,
pre_map: &RegallocMap<VirtualReg, RealReg>,
post_map: &RegallocMap<VirtualReg, RealReg>,
) {
insn.map_regs(pre_map, post_map);
}
fn is_move(&self, insn: &I) -> Option<(Writable<Reg>, Reg)> {
insn.is_move()
}
fn get_spillslot_size(&self, regclass: RegClass, vreg: VirtualReg) -> u32 {
let ty = self.vreg_type(vreg);
self.abi.get_spillslot_size(regclass, ty)
}
fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg, vreg: VirtualReg) -> I {
let ty = self.vreg_type(vreg);
self.abi.gen_spill(to_slot, from_reg, ty)
}
fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot, vreg: VirtualReg) -> I {
let ty = self.vreg_type(vreg);
self.abi.gen_reload(to_reg, from_slot, ty)
}
fn gen_move(&self, to_reg: Writable<RealReg>, from_reg: RealReg, vreg: VirtualReg) -> I {
let ty = self.vreg_type(vreg);
I::gen_move(to_reg.map(|r| r.to_reg()), from_reg.to_reg(), ty)
}
fn gen_zero_len_nop(&self) -> I {
I::gen_zero_len_nop()
}
fn maybe_direct_reload(&self, insn: &I, reg: VirtualReg, slot: SpillSlot) -> Option<I> {
insn.maybe_direct_reload(reg, slot)
}
fn func_liveins(&self) -> RegallocSet<RealReg> {
self.liveins.clone()
}
fn func_liveouts(&self) -> RegallocSet<RealReg> {
self.liveouts.clone()
}
}
// N.B.: Debug impl assumes that VCode has already been through all compilation
// passes, and so has a final block order and offsets.
impl<I: VCodeInst> fmt::Debug for VCode<I> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
writeln!(f, "VCode_Debug {{")?;
writeln!(f, " Entry block: {}", self.entry)?;
writeln!(f, " Final block order: {:?}", self.final_block_order)?;
for block in 0..self.num_blocks() {
writeln!(f, "Block {}:", block,)?;
for succ in self.succs(block as BlockIndex) {
writeln!(f, " (successor: Block {})", succ)?;
}
let (start, end) = self.block_ranges[block];
writeln!(f, " (instruction range: {} .. {})", start, end)?;
for inst in start..end {
writeln!(f, " Inst {}: {:?}", inst, self.insts[inst as usize])?;
}
}
writeln!(f, "}}")?;
Ok(())
}
}
// Pretty-printing with `RealRegUniverse` context.
impl<I: VCodeInst + ShowWithRRU> ShowWithRRU for VCode<I> {
fn show_rru(&self, mb_rru: Option<&RealRegUniverse>) -> String {
use std::fmt::Write;
// Calculate an order in which to display the blocks. This is the same
// as final_block_order, but also includes blocks which are in the
// representation but not in final_block_order.
let mut display_order = Vec::<usize>::new();
// First display blocks in `final_block_order`.
for bix in &self.final_block_order {
assert!((*bix as usize) < self.num_blocks());
display_order.push(*bix as usize);
}
// Now also take care of those not listed in `final_block_order`.
// This is quadratic, but it's also debug-only code.
for bix in 0..self.num_blocks() {
if display_order.contains(&bix) {
continue;
}
display_order.push(bix);
}
let mut s = String::new();
writeln!(s, "VCode_ShowWithRRU {{{{").unwrap();
writeln!(s, " Entry block: {}", self.entry).unwrap();
writeln!(s, " Final block order: {:?}", self.final_block_order).unwrap();
for i in 0..self.num_blocks() {
let block = display_order[i];
let omitted = if !self.final_block_order.is_empty() && i >= self.final_block_order.len() {
"** OMITTED **"
} else {
""
};
writeln!(s, "Block {}: {}", block, omitted).unwrap();
if let Some(bb) = self.bindex_to_bb(block as BlockIndex) {
writeln!(s, " (original IR block: {})", bb).unwrap();
}
for succ in self.succs(block as BlockIndex) {
writeln!(s, " (successor: Block {})", succ).unwrap();
}
let (start, end) = self.block_ranges[block];
writeln!(s, " (instruction range: {} .. {})", start, end).unwrap();
for inst in start..end {
writeln!(
s,
" Inst {}: {}",
inst,
self.insts[inst as usize].show_rru(mb_rru)
)
.unwrap();
}
}
writeln!(s, "}}}}").unwrap();
s
}
}