Compare commits


10 Commits

Author SHA1 Message Date
Trevor Elliott
f0e9cde328 Use drain instead of clear (#123) 2023-04-06 14:21:42 -07:00
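A minimal, hypothetical sketch of the pattern this commit applies (function and variable names are illustrative, not from the codebase): `drain(..)` yields owned elements and leaves the vector empty, so a separate `clear()` becomes unnecessary.

```rust
// Before: iterate by reference, then clear in a separate step.
fn flush(pending: &mut Vec<u32>) {
    for item in pending.iter() {
        println!("handling {item}");
    }
    pending.clear();
}

// After: drain(..) yields owned items and empties the vec in one pass.
fn flush_draining(pending: &mut Vec<u32>) {
    for item in pending.drain(..) {
        println!("handling {item}");
    }
}
```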
Johan Milanov
9c6d6dc9aa Slightly more efficient vec initialization (#120)
This will, at least on x86_64, compile down to simpler, shorter assembly that uses a zeroed allocation instead of a regular allocation, a memset, and various `raw_vec` methods.
2023-04-03 13:49:36 -07:00
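The exact call site isn't shown in this listing, so the following is only an illustrative sketch of the kind of change described above: `vec![0; n]` can lower to a single zeroed allocation (calloc-like) rather than a regular allocation followed by a memset.

```rust
// Hypothetical "before": build a zero-filled buffer step by step.
fn zeros_stepwise(n: usize) -> Vec<u64> {
    let mut v = Vec::with_capacity(n);
    v.resize(n, 0);
    v
}

// Hypothetical "after": the macro form can use a zeroed allocation directly,
// avoiding the separate allocation + memset through raw_vec internals.
fn zeros_direct(n: usize) -> Vec<u64> {
    vec![0; n]
}
```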
Amanieu d'Antras
2bd03256b3 Make regalloc2 #![no_std] (#119)
* Make regalloc2 `#![no_std]`

This crate doesn't require any features from the standard library, so it
can be made `no_std` to allow it to be used in environments that can't
use the Rust standard library.

This PR mainly performs the following mechanical changes:
- `std::collections` is replaced with `alloc::collections`.
- `std::*` is replaced with `core::*`.
- `Vec`, `vec!`, `format!` and `ToString` are imported when needed since
  they are no longer in the prelude.
- `HashSet` and `HashMap` are taken from the `hashbrown` crate, which is
  the same implementation that the standard library uses.
- `FxHashSet` and `FxHashMap` are typedefs in `lib.rs` that are based on
  the `hashbrown` types.

The only functional change is that `RegAllocError` no longer implements
the `Error` trait since that is not available in `core`.

Dependencies were adjusted to not require `std` and this is tested in CI
by building against the `thumbv6m-none-eabi` target that doesn't have
`std`.

* Add the Error trait impl back under a "std" feature
2023-03-09 11:25:59 -08:00
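For reference, the crate-root shape described by the last few bullets above (it also appears in the `lib.rs` hunk further down in this diff):

```rust
#![no_std]

// std is only linked when the optional "std" feature is enabled.
#[cfg(feature = "std")]
extern crate std;
extern crate alloc;

use core::hash::BuildHasherDefault;
use rustc_hash::FxHasher;

// hashbrown is the same hash-map implementation the standard library uses,
// so these typedefs keep the FxHashMap/FxHashSet names without requiring std.
type FxHashMap<K, V> = hashbrown::HashMap<K, V, BuildHasherDefault<FxHasher>>;
type FxHashSet<V> = hashbrown::HashSet<V, BuildHasherDefault<FxHasher>>;
```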
Amanieu d'Antras
7354cfedde Remove support for program moves (#118) 2023-03-04 16:38:05 -08:00
Amanieu d'Antras
54f074e507 Re-introduce optional dedicated scratch registers (#117)
* Re-introduce optional dedicated scratch registers

Dedicated scratch registers used for resolving move cycles were removed
in #51 and replaced with an algorithm to automatically allocate a
scratch register as needed.

However, in many cases a client will already have a non-allocatable
scratch register available for things like extended jumps (see #91). It
makes sense to reuse this register for regalloc rather than potentially
spilling an existing register.

* Clarify comment
2023-03-04 14:49:10 -08:00
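A hypothetical client-side sketch of what this enables, assuming the public `MachineEnv` fields shown in the `lib.rs` hunk later in this diff (register numbers and class split are made up for illustration):

```rust
use regalloc2::{MachineEnv, PReg, RegClass};

// p24 is reserved as a dedicated int-class scratch register, so it appears
// in neither the preferred nor the non-preferred allocatable list.
fn machine_env_with_scratch() -> MachineEnv {
    let int_regs = |r: core::ops::Range<usize>| -> Vec<PReg> {
        r.map(|i| PReg::new(i, RegClass::Int)).collect()
    };
    MachineEnv {
        preferred_regs_by_class: [int_regs(0..16), vec![]],
        non_preferred_regs_by_class: [int_regs(16..24), vec![]],
        // Used by regalloc2 only to break move cycles; if left as None,
        // the allocator falls back to allocating a scratch register itself.
        scratch_by_class: [Some(PReg::new(24, RegClass::Int)), None],
        fixed_stack_slots: vec![],
    }
}
```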
Trevor Elliott
34a9ae7379 Misc refactorings (#116)
* Use a while-let instead of checking is_empty and popping

* This conditional should always be true, as we expect the input to be in SSA form

* Use iter_mut instead of iterating the index

* We don't support multiple defs of the same vreg anymore

* Drain instead of clear
2023-02-28 10:42:13 -08:00
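The first bullet above is the usual `while let` refactor; a minimal sketch with hypothetical names:

```rust
fn run(mut queue: Vec<usize>) {
    // Before: explicit emptiness check plus an unwrap on pop().
    // while !queue.is_empty() {
    //     let block = queue.pop().unwrap();
    //     handle(block);
    // }

    // After: the loop condition and the pop are a single pattern match.
    while let Some(block) = queue.pop() {
        handle(block);
    }
}

fn handle(block: usize) {
    println!("processing block {block}");
}
```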
Chris Fallin
1e8da4f99b Bump version to 0.6.1. (#115) 2023-02-16 02:06:40 +00:00
Jamey Sharp
7bb83a3361 checker: Use a couple of Rust idioms (#114)
This should have no functional change; it just makes the source slightly
easier to read and reason about.
2023-02-16 01:59:20 +00:00
Chris Fallin
c3e513c4cb Fix checker when empty blocks result in unchanged-from-Top entry state. (#113)
The checker works by keeping a worklist of blocks to process, and adds a
block to the worklist when its entry state changes. Every entry state is
initially `Top` (in a lattice). The entry block is explicitly added to
the worklist to kick off the processing.

In ordinary cases, the entry block has some instructions that change
state from `Top` to something else (lower in the lattice), and this is
propagated to its successors; its successors are added to the worklist;
and so on. No other state is `Top` from then on (because of
monotonicity), so every reachable block is processed.

However, if the entry block is completely empty except for the
terminating branch, the state remains `Top`; then the entry state of its
successors, even when updated, is still `Top`; and since the state didn't
change, the blocks are not added to the worklist. (Never mind that they
were not processed in the first place!) The bug is that the invariant
"has been processed already with current state" is not true initially,
when the current state is set to `Top` but nothing has been processed.

This PR makes a simple fix: it adds every block to the worklist
initially to be processed, in input order (which is usually RPO order in
practice) as a good first heuristic; then, if a block's input state
changes again after processing, it is reprocessed until fixpoint as usual.

Fixes bytecodealliance/wasmtime#5791.
2023-02-15 17:42:51 -08:00
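A minimal sketch of the fixed worklist loop, under a simplifying assumption (block state modeled here as a `u64` bitset with all-ones as `Top` and bitwise intersection as the meet; the real checker state is richer):

```rust
use std::collections::HashSet;

fn analyze(
    num_blocks: usize,
    succs: &dyn Fn(usize) -> Vec<usize>,
    transfer: &dyn Fn(usize, u64) -> u64,
) -> Vec<u64> {
    let mut state = vec![u64::MAX; num_blocks]; // every entry state starts at Top
    let mut queue: Vec<usize> = Vec::new();
    let mut queue_set: HashSet<usize> = HashSet::new();

    // The fix: seed the worklist with *every* block, pushed in reverse so
    // that popping from the back visits blocks in input (roughly RPO) order.
    for block in (0..num_blocks).rev() {
        queue.push(block);
        queue_set.insert(block);
    }

    while let Some(block) = queue.pop() {
        queue_set.remove(&block);
        let out = transfer(block, state[block]);
        for succ in succs(block) {
            let merged = state[succ] & out; // meet = intersection
            if merged != state[succ] {
                // Entry state changed: schedule the successor for reprocessing.
                state[succ] = merged;
                if queue_set.insert(succ) {
                    queue.push(succ);
                }
            }
        }
    }
    state
}
```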
Trevor Elliott
50b9cf8fe2 Bump to version 0.6.0 (#112)
Bump the version to 0.6.0 for the next publish to crates.io.

This version removes pinned registers and mod operands, which requires the bump to 0.6.0.
2023-02-07 14:57:22 -08:00
24 changed files with 236 additions and 555 deletions

View File

@@ -37,6 +37,16 @@ jobs:
- name: Check with all features
run: cargo check --all-features
# Make sure the code and its dependencies compile without std.
no_std:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install thumbv6m-none-eabi target
run: rustup target add thumbv6m-none-eabi
- name: Check no_std build
run: cargo check --target thumbv6m-none-eabi --no-default-features --features trace-log,checker,enable-serde
# Lint dependency graph for security advisories, duplicate versions, and
# incompatible licences.
cargo_deny:

View File

@@ -1,6 +1,6 @@
[package]
name = "regalloc2"
version = "0.5.1"
version = "0.6.1"
authors = [
"Chris Fallin <chris@cfallin.org>",
"Mozilla SpiderMonkey Developers",
@@ -13,11 +13,15 @@ repository = "https://github.com/bytecodealliance/regalloc2"
[dependencies]
log = { version = "0.4.8", default-features = false }
smallvec = { version = "1.6.1", features = ["union"] }
fxhash = "0.2.1"
slice-group-by = "0.3.0"
rustc-hash = { version = "1.1.0", default-features = false }
slice-group-by = { version = "0.3.0", default-features = false }
hashbrown = "0.13.2"
# Optional serde support, enabled by feature below.
serde = { version = "1.0.136", features = ["derive"], optional = true }
serde = { version = "1.0.136", features = [
"derive",
"alloc",
], default-features = false, optional = true }
# The below are only needed for fuzzing.
libfuzzer-sys = { version = "0.4.2", optional = true }
@@ -29,7 +33,10 @@ debug-assertions = true
overflow-checks = true
[features]
default = []
default = ["std"]
# Enables std-specific features such as the Error trait for RegAllocError.
std = []
# Enables generation of DefAlloc edits for the checker.
checker = []

View File

@@ -3,10 +3,7 @@
This is a register allocator that started life as, and is about 50%
still, a port of IonMonkey's backtracking register allocator to
Rust. In many regards, it has been generalized, optimized, and
improved since the initial port, and now supports both SSA and non-SSA
use-cases. (However, non-SSA should be considered deprecated; we want to
move to SSA-only in the future, to enable some performance improvements.
See #4.)
improved since the initial port.
In addition, it contains substantial amounts of testing infrastructure
(fuzzing harnesses and checkers) that does not exist in the original

View File

@@ -6,6 +6,8 @@
//! Lightweight CFG analyses.
use crate::{domtree, postorder, Block, Function, Inst, ProgPoint, RegAllocError};
use alloc::vec;
use alloc::vec::Vec;
use smallvec::{smallvec, SmallVec};
#[derive(Clone, Debug)]

View File

@@ -96,14 +96,16 @@
#![allow(dead_code)]
use crate::{
Allocation, AllocationKind, Block, Edit, Function, Inst, InstOrEdit, InstPosition, MachineEnv,
Operand, OperandConstraint, OperandKind, OperandPos, Output, PReg, PRegSet, VReg,
Allocation, AllocationKind, Block, Edit, Function, FxHashMap, FxHashSet, Inst, InstOrEdit,
InstPosition, MachineEnv, Operand, OperandConstraint, OperandKind, OperandPos, Output, PReg,
PRegSet, VReg,
};
use fxhash::{FxHashMap, FxHashSet};
use alloc::vec::Vec;
use alloc::{format, vec};
use core::default::Default;
use core::hash::Hash;
use core::result::Result;
use smallvec::{smallvec, SmallVec};
use std::default::Default;
use std::hash::Hash;
use std::result::Result;
/// A set of errors detected by the regalloc checker.
#[derive(Clone, Debug)]
@@ -230,7 +232,7 @@ impl CheckerValue {
}
fn from_reg(reg: VReg) -> CheckerValue {
CheckerValue::VRegs(std::iter::once(reg).collect())
CheckerValue::VRegs(core::iter::once(reg).collect())
}
fn remove_vreg(&mut self, reg: VReg) {
@@ -269,10 +271,6 @@ fn visit_all_vregs<F: Function, V: FnMut(VReg)>(f: &F, mut v: V) {
for op in f.inst_operands(inst) {
v(op.vreg());
}
if let Some((src, dst)) = f.is_move(inst) {
v(src.vreg());
v(dst.vreg());
}
if f.is_branch(inst) {
for succ_idx in 0..f.block_succs(block).len() {
for &param in f.branch_blockparams(block, inst, succ_idx) {
@@ -377,8 +375,8 @@ impl Default for CheckerState {
}
}
impl std::fmt::Display for CheckerValue {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for CheckerValue {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
match self {
CheckerValue::Universe => {
write!(f, "top")
@@ -565,25 +563,6 @@ impl CheckerState {
// according to the move semantics in the step
// function below.
}
&CheckerInst::ProgramMove { inst, src, dst: _ } => {
// Validate that the fixed-reg constraint, if any, on
// `src` is satisfied.
if let OperandConstraint::FixedReg(preg) = src.constraint() {
let alloc = Allocation::reg(preg);
let val = self.get_value(&alloc).unwrap_or(&default_val);
trace!(
"checker: checkinst {:?}: cheker value in {:?} is {:?}",
checkinst,
alloc,
val
);
self.check_val(inst, src, alloc, val, &[alloc], checker)?;
}
// Note that we don't do anything with `dst`
// here. That is implicitly checked whenever `dst` is
// used; the `update()` step below adds the symbolic
// vreg for `dst` into wherever `src` may be stored.
}
}
Ok(())
}
@@ -686,15 +665,6 @@ impl CheckerState {
}
}
}
&CheckerInst::ProgramMove { inst: _, src, dst } => {
// Remove all earlier instances of `dst`: this vreg is
// now stale (it is being overwritten).
self.remove_vreg(dst.vreg());
// Define `dst` wherever `src` occurs.
for (_, value) in self.get_mappings_mut() {
value.copy_vreg(src.vreg(), dst.vreg());
}
}
}
}
@@ -786,23 +756,6 @@ pub(crate) enum CheckerInst {
/// A safepoint, with the given Allocations specified as containing
/// reftyped values. All other reftyped values become invalid.
Safepoint { inst: Inst, allocs: Vec<Allocation> },
/// An op with one source operand, and one dest operand, that
/// copies any symbolic values from the source to the dest, in
/// addition to adding the symbolic value of the dest vreg to the
/// set. This "program move" is distinguished from the above
/// `Move` by being semantically relevant in the original
/// (pre-regalloc) program.
///
/// We transform checker values as follows: for any vreg-set that
/// contains `dst`'s vreg, we first delete that vreg (because it
/// is being redefined). Then, for any vreg-set with `src`
/// present, we add `dst`.
ProgramMove {
inst: Inst,
src: Operand,
dst: Operand,
},
}
#[derive(Debug)]
@@ -903,35 +856,10 @@ impl<'a, F: Function> Checker<'a, F> {
self.bb_insts.get_mut(&block).unwrap().push(checkinst);
}
// If this is a move, handle specially. Note that the
// regalloc2-inserted moves are not semantically present in
// the original program and so do not modify the sets of
// symbolic values at all, but rather just move them around;
// but "program moves" *are* present, and have the following
// semantics: they define the destination vreg, but also
// retain any symbolic values in the source.
//
// regalloc2 reifies all moves into edits in its unified
// move/edit framework, so we don't get allocs for these moves
// in the post-regalloc output, and the embedder is not
// supposed to emit the moves. But we *do* want to check the
// semantic implications, namely definition of new vregs. So
// we emit `ProgramMove` ops that do just this.
if let Some((src, dst)) = self.f.is_move(inst) {
let src_op = Operand::any_use(src.vreg());
let dst_op = Operand::any_def(dst.vreg());
let checkinst = CheckerInst::ProgramMove {
inst,
src: src_op,
dst: dst_op,
};
trace!("checker: adding inst {:?}", checkinst);
self.bb_insts.get_mut(&block).unwrap().push(checkinst);
}
// Skip normal checks if this is a branch: the blockparams do
// not exist in post-regalloc code, and the edge-moves have to
// be inserted before the branch rather than after.
else if !self.f.is_branch(inst) {
if !self.f.is_branch(inst) {
let operands: Vec<_> = self.f.inst_operands(inst).iter().cloned().collect();
let allocs: Vec<_> = out.inst_allocs(inst).iter().cloned().collect();
let clobbers: Vec<_> = self.f.inst_clobbers(inst).into_iter().collect();
@@ -987,11 +915,21 @@ impl<'a, F: Function> Checker<'a, F> {
let mut queue = Vec::new();
let mut queue_set = FxHashSet::default();
queue.push(self.f.entry_block());
queue_set.insert(self.f.entry_block());
// Put every block in the queue to start with, to ensure
// everything is visited even if the initial state remains
// `Top` after preds update it.
//
// We add blocks in reverse order so that when we process
// back-to-front below, we do our initial pass in input block
// order, which is (usually) RPO order or at least a
// reasonable visit order.
for block in (0..self.f.num_blocks()).rev() {
let block = Block::new(block);
queue.push(block);
queue_set.insert(block);
}
while !queue.is_empty() {
let block = queue.pop().unwrap();
while let Some(block) = queue.pop() {
queue_set.remove(&block);
let mut state = self.bb_in.get(&block).cloned().unwrap();
trace!("analyze: block {} has state {:?}", block.index(), state);
@@ -1032,9 +970,8 @@ impl<'a, F: Function> Checker<'a, F> {
new_state
);
self.bb_in.insert(succ, new_state);
if !queue_set.contains(&succ) {
if queue_set.insert(succ) {
queue.push(succ);
queue_set.insert(succ);
}
}
}
@@ -1119,9 +1056,6 @@ impl<'a, F: Function> Checker<'a, F> {
}
trace!(" safepoint: {}", slotargs.join(", "));
}
&CheckerInst::ProgramMove { inst, src, dst } => {
trace!(" inst{}: prog_move {} -> {}", inst.index(), src, dst);
}
&CheckerInst::ParallelMove { .. } => {
panic!("unexpected parallel_move in body (non-edge)")
}

View File

@@ -12,6 +12,9 @@
// TR-06-33870
// https://www.cs.rice.edu/~keith/EMBED/dom.pdf
use alloc::vec;
use alloc::vec::Vec;
use crate::Block;
// Helper

View File

@@ -8,6 +8,9 @@ use crate::{
OperandConstraint, OperandKind, OperandPos, PReg, PRegSet, RegClass, VReg,
};
use alloc::vec::Vec;
use alloc::{format, vec};
use super::arbitrary::Result as ArbitraryResult;
use super::arbitrary::{Arbitrary, Unstructured};
@@ -124,10 +127,6 @@ impl Function for Func {
&self.debug_value_labels[..]
}
fn is_move(&self, _: Inst) -> Option<(Operand, Operand)> {
None
}
fn inst_operands(&self, insn: Inst) -> &[Operand] {
&self.insts[insn.index()].operands[..]
}
@@ -279,7 +278,7 @@ pub struct Options {
pub reftypes: bool,
}
impl std::default::Default for Options {
impl core::default::Default for Options {
fn default() -> Self {
Options {
reused_inputs: false,
@@ -408,7 +407,7 @@ impl Func {
}
vregs_by_block.push(vregs.clone());
vregs_by_block_to_be_defined.push(vec![]);
let mut max_block_params = u.int_in_range(0..=std::cmp::min(3, vregs.len() / 3))?;
let mut max_block_params = u.int_in_range(0..=core::cmp::min(3, vregs.len() / 3))?;
for &vreg in &vregs {
if block > 0 && opts.block_params && bool::arbitrary(u)? && max_block_params > 0 {
block_params[block].push(vreg);
@@ -595,8 +594,8 @@ impl Func {
}
}
impl std::fmt::Debug for Func {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Debug for Func {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(f, "{{\n")?;
for vreg in self.reftype_vregs() {
write!(f, " REF: {}\n", vreg)?;
@@ -657,16 +656,18 @@ impl std::fmt::Debug for Func {
}
pub fn machine_env() -> MachineEnv {
fn regs(r: std::ops::Range<usize>) -> Vec<PReg> {
fn regs(r: core::ops::Range<usize>) -> Vec<PReg> {
r.map(|i| PReg::new(i, RegClass::Int)).collect()
}
let preferred_regs_by_class: [Vec<PReg>; 2] = [regs(0..24), vec![]];
let non_preferred_regs_by_class: [Vec<PReg>; 2] = [regs(24..32), vec![]];
let scratch_by_class: [Option<PReg>; 2] = [None, None];
let fixed_stack_slots = regs(32..63);
// Register 63 is reserved for use as a fixed non-allocatable register.
MachineEnv {
preferred_regs_by_class,
non_preferred_regs_by_class,
scratch_by_class,
fixed_stack_slots,
}
}

View File

@@ -50,11 +50,11 @@ macro_rules! define_index {
};
}
pub trait ContainerIndex: Clone + Copy + std::fmt::Debug + PartialEq + Eq {}
pub trait ContainerIndex: Clone + Copy + core::fmt::Debug + PartialEq + Eq {}
pub trait ContainerComparator {
type Ix: ContainerIndex;
fn compare(&self, a: Self::Ix, b: Self::Ix) -> std::cmp::Ordering;
fn compare(&self, a: Self::Ix, b: Self::Ix) -> core::cmp::Ordering;
}
define_index!(Inst);
@@ -146,6 +146,9 @@ impl Iterator for InstRangeIter {
#[cfg(test)]
mod test {
use alloc::vec;
use alloc::vec::Vec;
use super::*;
#[test]

View File

@@ -5,8 +5,10 @@
//! Index sets: sets of integers that represent indices into a space.
use fxhash::FxHashMap;
use std::cell::Cell;
use alloc::vec::Vec;
use core::cell::Cell;
use crate::FxHashMap;
const SMALL_ELEMS: usize = 12;
@@ -151,10 +153,10 @@ impl AdaptiveMap {
enum AdaptiveMapIter<'a> {
Small(&'a [u32], &'a [u64]),
Large(std::collections::hash_map::Iter<'a, u32, u64>),
Large(hashbrown::hash_map::Iter<'a, u32, u64>),
}
impl<'a> std::iter::Iterator for AdaptiveMapIter<'a> {
impl<'a> core::iter::Iterator for AdaptiveMapIter<'a> {
type Item = (u32, u64);
#[inline]
@@ -292,7 +294,7 @@ impl Iterator for SetBitsIter {
// Build an `Option<NonZeroU64>` so that on the nonzero path,
// the compiler can optimize the trailing-zeroes operator
// using that knowledge.
std::num::NonZeroU64::new(self.0).map(|nz| {
core::num::NonZeroU64::new(self.0).map(|nz| {
let bitidx = nz.trailing_zeros();
self.0 &= self.0 - 1; // clear highest set bit
bitidx as usize
@@ -300,8 +302,8 @@ impl Iterator for SetBitsIter {
}
}
impl std::fmt::Debug for IndexSet {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Debug for IndexSet {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
let vals = self.iter().collect::<Vec<_>>();
write!(f, "{:?}", vals)
}

View File

@@ -17,14 +17,16 @@ use crate::cfg::CFGInfo;
use crate::index::ContainerComparator;
use crate::indexset::IndexSet;
use crate::{
define_index, Allocation, Block, Edit, Function, Inst, MachineEnv, Operand, PReg, ProgPoint,
RegClass, VReg,
define_index, Allocation, Block, Edit, Function, FxHashSet, Inst, MachineEnv, Operand, PReg,
ProgPoint, RegClass, VReg,
};
use fxhash::FxHashSet;
use alloc::collections::BTreeMap;
use alloc::string::String;
use alloc::vec::Vec;
use core::cmp::Ordering;
use core::fmt::Debug;
use hashbrown::{HashMap, HashSet};
use smallvec::SmallVec;
use std::cmp::Ordering;
use std::collections::{BTreeMap, HashMap, HashSet};
use std::fmt::Debug;
/// A range from `from` (inclusive) to `to` (exclusive).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
@@ -64,13 +66,13 @@ impl CodeRange {
}
}
impl std::cmp::PartialOrd for CodeRange {
impl core::cmp::PartialOrd for CodeRange {
#[inline(always)]
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(other))
}
}
impl std::cmp::Ord for CodeRange {
impl core::cmp::Ord for CodeRange {
#[inline(always)]
fn cmp(&self, other: &Self) -> Ordering {
if self.to <= other.from {
@@ -278,7 +280,7 @@ const fn no_bloat_capacity<T>() -> usize {
//
// So if `size_of([T; N]) == size_of(pointer) + size_of(capacity)` then we
// get the maximum inline capacity without bloat.
std::mem::size_of::<usize>() * 2 / std::mem::size_of::<T>()
core::mem::size_of::<usize>() * 2 / core::mem::size_of::<T>()
}
#[derive(Clone, Debug)]
@@ -405,21 +407,6 @@ pub struct Env<'a, F: Function> {
pub extra_spillslots_by_class: [SmallVec<[Allocation; 2]>; 2],
pub preferred_victim_by_class: [PReg; 2],
// Program moves: these are moves in the provided program that we
// handle with our internal machinery, in order to avoid the
// overhead of ordinary operand processing. We expect the client
// to not generate any code for instructions that return
// `Some(..)` for `.is_move()`, and instead use the edits that we
// provide to implement those moves (or some simplified version of
// them) post-regalloc.
//
// (from-vreg, inst, from-alloc), sorted by (from-vreg, inst)
pub prog_move_srcs: Vec<((VRegIndex, Inst), Allocation)>,
// (to-vreg, inst, to-alloc), sorted by (to-vreg, inst)
pub prog_move_dsts: Vec<((VRegIndex, Inst), Allocation)>,
// (from-vreg, to-vreg) for bundle-merging.
pub prog_move_merges: Vec<(LiveRangeIndex, LiveRangeIndex)>,
// When multiple fixed-register constraints are present on a
// single VReg at a single program point (this can happen for,
// e.g., call args that use the same value multiple times), we
@@ -446,7 +433,7 @@ pub struct Env<'a, F: Function> {
// For debug output only: a list of textual annotations at every
// ProgPoint to insert into the final allocated program listing.
pub debug_annotations: std::collections::HashMap<ProgPoint, Vec<String>>,
pub debug_annotations: hashbrown::HashMap<ProgPoint, Vec<String>>,
pub annotations_enabled: bool,
// Cached allocation for `try_to_allocate_bundle_to_reg` to avoid allocating
@@ -507,7 +494,7 @@ impl SpillSlotList {
#[derive(Clone, Debug)]
pub struct PrioQueue {
pub heap: std::collections::BinaryHeap<PrioQueueEntry>,
pub heap: alloc::collections::BinaryHeap<PrioQueueEntry>,
}
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
@@ -546,28 +533,28 @@ impl LiveRangeKey {
}
}
impl std::cmp::PartialEq for LiveRangeKey {
impl core::cmp::PartialEq for LiveRangeKey {
#[inline(always)]
fn eq(&self, other: &Self) -> bool {
self.to > other.from && self.from < other.to
}
}
impl std::cmp::Eq for LiveRangeKey {}
impl std::cmp::PartialOrd for LiveRangeKey {
impl core::cmp::Eq for LiveRangeKey {}
impl core::cmp::PartialOrd for LiveRangeKey {
#[inline(always)]
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
fn partial_cmp(&self, other: &Self) -> Option<core::cmp::Ordering> {
Some(self.cmp(other))
}
}
impl std::cmp::Ord for LiveRangeKey {
impl core::cmp::Ord for LiveRangeKey {
#[inline(always)]
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
fn cmp(&self, other: &Self) -> core::cmp::Ordering {
if self.to <= other.from {
std::cmp::Ordering::Less
core::cmp::Ordering::Less
} else if self.from >= other.to {
std::cmp::Ordering::Greater
core::cmp::Ordering::Greater
} else {
std::cmp::Ordering::Equal
core::cmp::Ordering::Equal
}
}
}
@@ -577,7 +564,7 @@ pub struct PrioQueueComparator<'a> {
}
impl<'a> ContainerComparator for PrioQueueComparator<'a> {
type Ix = LiveBundleIndex;
fn compare(&self, a: Self::Ix, b: Self::Ix) -> std::cmp::Ordering {
fn compare(&self, a: Self::Ix, b: Self::Ix) -> core::cmp::Ordering {
self.prios[a.index()].cmp(&self.prios[b.index()])
}
}
@@ -585,7 +572,7 @@ impl<'a> ContainerComparator for PrioQueueComparator<'a> {
impl PrioQueue {
pub fn new() -> Self {
PrioQueue {
heap: std::collections::BinaryHeap::new(),
heap: alloc::collections::BinaryHeap::new(),
}
}
@@ -628,9 +615,7 @@ pub struct InsertedMove {
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum InsertMovePrio {
InEdgeMoves,
BlockParam,
Regular,
PostRegular,
MultiFixedRegInitial,
MultiFixedRegSecondary,
ReusedInput,
@@ -660,10 +645,6 @@ pub struct Stats {
pub livein_iterations: usize,
pub initial_liverange_count: usize,
pub merged_bundle_count: usize,
pub prog_moves: usize,
pub prog_moves_dead_src: usize,
pub prog_move_merge_attempt: usize,
pub prog_move_merge_success: usize,
pub process_bundle_count: usize,
pub process_bundle_reg_probes_fixed: usize,
pub process_bundle_reg_success_fixed: usize,

View File

@@ -1,5 +1,9 @@
//! Debugging output.
use alloc::string::ToString;
use alloc::{format, vec};
use alloc::{string::String, vec::Vec};
use super::Env;
use crate::{Block, Function, ProgPoint};

View File

@@ -22,13 +22,15 @@ use crate::ion::data_structures::{
BlockparamIn, BlockparamOut, FixedRegFixupLevel, MultiFixedRegFixup,
};
use crate::{
Allocation, Block, Function, Inst, InstPosition, Operand, OperandConstraint, OperandKind,
OperandPos, PReg, ProgPoint, RegAllocError, VReg,
Allocation, Block, Function, FxHashMap, FxHashSet, Inst, InstPosition, Operand,
OperandConstraint, OperandKind, OperandPos, PReg, ProgPoint, RegAllocError, VReg,
};
use fxhash::{FxHashMap, FxHashSet};
use alloc::collections::VecDeque;
use alloc::vec;
use alloc::vec::Vec;
use hashbrown::HashSet;
use slice_group_by::GroupByMut;
use smallvec::{smallvec, SmallVec};
use std::collections::{HashSet, VecDeque};
/// A spill weight computed for a certain Use.
#[derive(Clone, Copy, Debug)]
@@ -42,7 +44,7 @@ pub fn spill_weight_from_constraint(
) -> SpillWeight {
// A bonus of 1000 for one loop level, 4000 for two loop levels,
// 16000 for three loop levels, etc. Avoids exponentiation.
let loop_depth = std::cmp::min(10, loop_depth);
let loop_depth = core::cmp::min(10, loop_depth);
let hot_bonus: f32 = (0..loop_depth).fold(1000.0, |a, _| a * 4.0);
let def_bonus: f32 = if is_def { 2000.0 } else { 0.0 };
let constraint_bonus: f32 = match constraint {
@@ -91,7 +93,7 @@ impl SpillWeight {
}
}
impl std::ops::Add<SpillWeight> for SpillWeight {
impl core::ops::Add<SpillWeight> for SpillWeight {
type Output = SpillWeight;
fn add(self, other: SpillWeight) -> Self {
SpillWeight(self.0 + other.0)
@@ -334,8 +336,7 @@ impl<'a, F: Function> Env<'a, F> {
workqueue_set.insert(block);
}
while !workqueue.is_empty() {
let block = workqueue.pop_front().unwrap();
while let Some(block) = workqueue.pop_front() {
workqueue_set.remove(&block);
let insns = self.func.block_insns(block);
@@ -357,13 +358,6 @@ impl<'a, F: Function> Env<'a, F> {
}
for inst in insns.rev().iter() {
if let Some((src, dst)) = self.func.is_move(inst) {
live.set(dst.vreg().vreg(), false);
live.set(src.vreg().vreg(), true);
self.observe_vreg_class(src.vreg());
self.observe_vreg_class(dst.vreg());
}
for pos in &[OperandPos::Late, OperandPos::Early] {
for op in self.func.inst_operands(inst) {
if op.as_fixed_nonallocatable().is_some() {
@@ -520,147 +514,6 @@ impl<'a, F: Function> Env<'a, F> {
}
}
// If this is a move, handle specially.
if let Some((src, dst)) = self.func.is_move(inst) {
// We can completely skip the move if it is
// trivial (vreg to same vreg).
if src.vreg() != dst.vreg() {
trace!(" -> move inst{}: src {} -> dst {}", inst.index(), src, dst);
debug_assert_eq!(src.class(), dst.class());
debug_assert_eq!(src.kind(), OperandKind::Use);
debug_assert_eq!(src.pos(), OperandPos::Early);
debug_assert_eq!(dst.kind(), OperandKind::Def);
debug_assert_eq!(dst.pos(), OperandPos::Late);
// Redefine src and dst operands to have
// positions of After and Before respectively
// (see note below), and to have Any
// constraints if they were originally Reg.
let src_constraint = match src.constraint() {
OperandConstraint::Reg => OperandConstraint::Any,
x => x,
};
let dst_constraint = match dst.constraint() {
OperandConstraint::Reg => OperandConstraint::Any,
x => x,
};
let src = Operand::new(
src.vreg(),
src_constraint,
OperandKind::Use,
OperandPos::Late,
);
let dst = Operand::new(
dst.vreg(),
dst_constraint,
OperandKind::Def,
OperandPos::Early,
);
if self.annotations_enabled {
self.annotate(
ProgPoint::after(inst),
format!(
" prog-move v{} ({:?}) -> v{} ({:?})",
src.vreg().vreg(),
src_constraint,
dst.vreg().vreg(),
dst_constraint,
),
);
}
// N.B.: in order to integrate with the move
// resolution that joins LRs in general, we
// conceptually treat the move as happening
// between the move inst's After and the next
// inst's Before. Thus the src LR goes up to
// (exclusive) next-inst-pre, and the dst LR
// starts at next-inst-pre. We have to take
// care in our move insertion to handle this
// like other inter-inst moves, i.e., at
// `Regular` priority, so it properly happens
// in parallel with other inter-LR moves.
//
// Why the progpoint between move and next
// inst, and not the progpoint between prev
// inst and move? Because a move can be the
// first inst in a block, but cannot be the
// last; so the following progpoint is always
// within the same block, while the previous
// one may be an inter-block point (and the
// After of the prev inst in a different
// block).
// Handle the def w.r.t. liveranges: trim the
// start of the range and mark it dead at this
// point in our backward scan.
let pos = ProgPoint::before(inst.next());
let mut dst_lr = vreg_ranges[dst.vreg().vreg()];
if !live.get(dst.vreg().vreg()) {
let from = pos;
let to = pos.next();
dst_lr = self.add_liverange_to_vreg(
VRegIndex::new(dst.vreg().vreg()),
CodeRange { from, to },
);
trace!(" -> invalid LR for def; created {:?}", dst_lr);
}
trace!(" -> has existing LR {:?}", dst_lr);
// Trim the LR to start here.
if self.ranges[dst_lr.index()].range.from
== self.cfginfo.block_entry[block.index()]
{
trace!(" -> started at block start; trimming to {:?}", pos);
self.ranges[dst_lr.index()].range.from = pos;
}
self.ranges[dst_lr.index()].set_flag(LiveRangeFlag::StartsAtDef);
live.set(dst.vreg().vreg(), false);
vreg_ranges[dst.vreg().vreg()] = LiveRangeIndex::invalid();
// Handle the use w.r.t. liveranges: make it live
// and create an initial LR back to the start of
// the block.
let pos = ProgPoint::after(inst);
let src_lr = if !live.get(src.vreg().vreg()) {
let range = CodeRange {
from: self.cfginfo.block_entry[block.index()],
to: pos.next(),
};
let src_lr = self
.add_liverange_to_vreg(VRegIndex::new(src.vreg().vreg()), range);
vreg_ranges[src.vreg().vreg()] = src_lr;
src_lr
} else {
vreg_ranges[src.vreg().vreg()]
};
trace!(" -> src LR {:?}", src_lr);
// Add to live-set.
let src_is_dead_after_move = !live.get(src.vreg().vreg());
live.set(src.vreg().vreg(), true);
// Add to program-moves lists.
self.prog_move_srcs.push((
(VRegIndex::new(src.vreg().vreg()), inst),
Allocation::none(),
));
self.prog_move_dsts.push((
(VRegIndex::new(dst.vreg().vreg()), inst.next()),
Allocation::none(),
));
self.stats.prog_moves += 1;
if src_is_dead_after_move {
self.stats.prog_moves_dead_src += 1;
self.prog_move_merges.push((src_lr, dst_lr));
}
}
continue;
}
// Preprocess defs and uses. Specifically, if there
// are any fixed-reg-constrained defs at Late position
// and fixed-reg-constrained uses at Early position
@@ -959,12 +812,9 @@ impl<'a, F: Function> Env<'a, F> {
}
}
for range in 0..self.ranges.len() {
self.ranges[range].uses.reverse();
debug_assert!(self.ranges[range]
.uses
.windows(2)
.all(|win| win[0].pos <= win[1].pos));
for range in &mut self.ranges {
range.uses.reverse();
debug_assert!(range.uses.windows(2).all(|win| win[0].pos <= win[1].pos));
}
// Insert safepoint virtual stack uses, if needed.
@@ -1019,11 +869,6 @@ impl<'a, F: Function> Env<'a, F> {
self.blockparam_ins.sort_unstable_by_key(|x| x.key());
self.blockparam_outs.sort_unstable_by_key(|x| x.key());
self.prog_move_srcs.sort_unstable_by_key(|(pos, _)| *pos);
self.prog_move_dsts.sort_unstable_by_key(|(pos, _)| *pos);
trace!("prog_move_srcs = {:?}", self.prog_move_srcs);
trace!("prog_move_dsts = {:?}", self.prog_move_dsts);
self.stats.initial_liverange_count = self.ranges.len();
self.stats.blockparam_ins_count = self.blockparam_ins.len();
@@ -1032,7 +877,7 @@ impl<'a, F: Function> Env<'a, F> {
pub fn fixup_multi_fixed_vregs(&mut self) {
// Do a fixed-reg cleanup pass: if there are any LiveRanges with
// multiple uses (or defs) at the same ProgPoint and there is
// multiple uses at the same ProgPoint and there is
// more than one FixedReg constraint at that ProgPoint, we
// need to record all but one of them in a special fixup list
// and handle them later; otherwise, bundle-splitting to
@@ -1154,15 +999,13 @@ impl<'a, F: Function> Env<'a, F> {
}
}
for &(clobber, pos) in &extra_clobbers {
for (clobber, pos) in extra_clobbers.drain(..) {
let range = CodeRange {
from: pos,
to: pos.next(),
};
self.add_liverange_to_preg(range, clobber);
}
extra_clobbers.clear();
}
}
}

View File

@@ -18,6 +18,7 @@ use super::{
use crate::{
ion::data_structures::BlockparamOut, Function, Inst, OperandConstraint, OperandKind, PReg,
};
use alloc::format;
use smallvec::smallvec;
impl<'a, F: Function> Env<'a, F> {
@@ -132,7 +133,7 @@ impl<'a, F: Function> Env<'a, F> {
// `to` bundle is empty -- just move the list over from
// `from` and set `bundle` up-link on all ranges.
trace!(" -> to bundle{} is empty; trivial merge", to.index());
let list = std::mem::replace(&mut self.bundles[from.index()].ranges, smallvec![]);
let list = core::mem::replace(&mut self.bundles[from.index()].ranges, smallvec![]);
for entry in &list {
self.ranges[entry.index.index()].bundle = to;
@@ -170,7 +171,7 @@ impl<'a, F: Function> Env<'a, F> {
// Two non-empty lists of LiveRanges: concatenate and
// sort. This is faster than a mergesort-like merge into a new
// list, empirically.
let from_list = std::mem::replace(&mut self.bundles[from.index()].ranges, smallvec![]);
let from_list = core::mem::replace(&mut self.bundles[from.index()].ranges, smallvec![]);
for entry in &from_list {
self.ranges[entry.index.index()].bundle = to;
}
@@ -213,7 +214,7 @@ impl<'a, F: Function> Env<'a, F> {
}
if self.bundles[from.index()].spillset != self.bundles[to.index()].spillset {
let from_vregs = std::mem::replace(
let from_vregs = core::mem::replace(
&mut self.spillsets[self.bundles[from.index()].spillset.index()].vregs,
smallvec![],
);
@@ -351,28 +352,6 @@ impl<'a, F: Function> Env<'a, F> {
self.merge_bundles(from_bundle, to_bundle);
}
// Attempt to merge move srcs/dsts.
for i in 0..self.prog_move_merges.len() {
let (src, dst) = self.prog_move_merges[i];
trace!("trying to merge move src LR {:?} to dst LR {:?}", src, dst);
let src = self.resolve_merged_lr(src);
let dst = self.resolve_merged_lr(dst);
trace!(
"resolved LR-construction merging chains: move-merge is now src LR {:?} to dst LR {:?}",
src,
dst
);
let src_bundle = self.ranges[src.index()].bundle;
debug_assert!(src_bundle.is_valid());
let dest_bundle = self.ranges[dst.index()].bundle;
debug_assert!(dest_bundle.is_valid());
self.stats.prog_move_merge_attempt += 1;
if self.merge_bundles(/* from */ dest_bundle, /* to */ src_bundle) {
self.stats.prog_move_merge_success += 1;
}
}
trace!("done merging bundles");
}

View File

@@ -16,7 +16,9 @@
use crate::cfg::CFGInfo;
use crate::ssa::validate_ssa;
use crate::{Function, MachineEnv, Output, PReg, ProgPoint, RegAllocError, RegClass};
use std::collections::HashMap;
use alloc::vec;
use alloc::vec::Vec;
use hashbrown::HashMap;
pub(crate) mod data_structures;
pub use data_structures::Stats;
@@ -71,10 +73,6 @@ impl<'a, F: Function> Env<'a, F> {
extra_spillslots_by_class: [smallvec![], smallvec![]],
preferred_victim_by_class: [PReg::invalid(), PReg::invalid()],
prog_move_srcs: Vec::with_capacity(n / 2),
prog_move_dsts: Vec::with_capacity(n / 2),
prog_move_merges: Vec::with_capacity(n / 2),
multi_fixed_reg_fixups: vec![],
inserted_moves: vec![],
edits: Vec::with_capacity(n),
@@ -86,7 +84,7 @@ impl<'a, F: Function> Env<'a, F> {
stats: Stats::default(),
debug_annotations: std::collections::HashMap::new(),
debug_annotations: hashbrown::HashMap::new(),
annotations_enabled,
conflict_set: Default::default(),

View File

@@ -22,12 +22,13 @@ use crate::ion::data_structures::{
use crate::ion::reg_traversal::RegTraversalIter;
use crate::moves::{MoveAndScratchResolver, ParallelMoves};
use crate::{
Allocation, Block, Edit, Function, Inst, InstPosition, OperandConstraint, OperandKind,
OperandPos, PReg, ProgPoint, RegClass, SpillSlot, VReg,
Allocation, Block, Edit, Function, FxHashMap, Inst, InstPosition, OperandConstraint,
OperandKind, OperandPos, PReg, ProgPoint, RegClass, SpillSlot, VReg,
};
use fxhash::FxHashMap;
use alloc::vec::Vec;
use alloc::{format, vec};
use core::fmt::Debug;
use smallvec::{smallvec, SmallVec};
use std::fmt::Debug;
impl<'a, F: Function> Env<'a, F> {
pub fn is_start_of_block(&self, pos: ProgPoint) -> bool {
@@ -179,8 +180,6 @@ impl<'a, F: Function> Env<'a, F> {
let mut blockparam_in_idx = 0;
let mut blockparam_out_idx = 0;
let mut prog_move_src_idx = 0;
let mut prog_move_dst_idx = 0;
for vreg in 0..self.vregs.len() {
let vreg = VRegIndex::new(vreg);
if !self.is_vreg_used(vreg) {
@@ -190,7 +189,7 @@ impl<'a, F: Function> Env<'a, F> {
// For each range in each vreg, insert moves or
// half-moves. We also scan over `blockparam_ins` and
// `blockparam_outs`, which are sorted by (block, vreg),
// and over program-move srcs/dsts to fill in allocations.
// to fill in allocations.
let mut prev = LiveRangeIndex::invalid();
for range_idx in 0..self.vregs[vreg.index()].ranges.len() {
let entry = self.vregs[vreg.index()].ranges[range_idx];
@@ -496,9 +495,9 @@ impl<'a, F: Function> Env<'a, F> {
// this case returns the index of the first
// entry that is greater as an `Err`.
if label_vreg.vreg() < vreg.index() {
std::cmp::Ordering::Less
core::cmp::Ordering::Less
} else {
std::cmp::Ordering::Greater
core::cmp::Ordering::Greater
}
})
.unwrap_err();
@@ -517,96 +516,13 @@ impl<'a, F: Function> Env<'a, F> {
continue;
}
let from = std::cmp::max(label_from, range.from);
let to = std::cmp::min(label_to, range.to);
let from = core::cmp::max(label_from, range.from);
let to = core::cmp::min(label_to, range.to);
self.debug_locations.push((label, from, to, alloc));
}
}
// Scan over program move srcs/dsts to fill in allocations.
// Move srcs happen at `After` of a given
// inst. Compute [from, to) semi-inclusive range of
// inst indices for which we should fill in the source
// with this LR's allocation.
//
// range from inst-Before or inst-After covers cur
// inst's After; so includes move srcs from inst.
let move_src_start = (vreg, range.from.inst());
// range to (exclusive) inst-Before or inst-After
// covers only prev inst's After; so includes move
// srcs to (exclusive) inst.
let move_src_end = (vreg, range.to.inst());
trace!(
"vreg {:?} range {:?}: looking for program-move sources from {:?} to {:?}",
vreg,
range,
move_src_start,
move_src_end
);
while prog_move_src_idx < self.prog_move_srcs.len()
&& self.prog_move_srcs[prog_move_src_idx].0 < move_src_start
{
trace!(" -> skipping idx {}", prog_move_src_idx);
prog_move_src_idx += 1;
}
while prog_move_src_idx < self.prog_move_srcs.len()
&& self.prog_move_srcs[prog_move_src_idx].0 < move_src_end
{
trace!(
" -> setting idx {} ({:?}) to alloc {:?}",
prog_move_src_idx,
self.prog_move_srcs[prog_move_src_idx].0,
alloc
);
self.prog_move_srcs[prog_move_src_idx].1 = alloc;
prog_move_src_idx += 1;
}
// move dsts happen at Before point.
//
// Range from inst-Before includes cur inst, while inst-After includes only next inst.
let move_dst_start = if range.from.pos() == InstPosition::Before {
(vreg, range.from.inst())
} else {
(vreg, range.from.inst().next())
};
// Range to (exclusive) inst-Before includes prev
// inst, so to (exclusive) cur inst; range to
// (exclusive) inst-After includes cur inst, so to
// (exclusive) next inst.
let move_dst_end = if range.to.pos() == InstPosition::Before {
(vreg, range.to.inst())
} else {
(vreg, range.to.inst().next())
};
trace!(
"vreg {:?} range {:?}: looking for program-move dests from {:?} to {:?}",
vreg,
range,
move_dst_start,
move_dst_end
);
while prog_move_dst_idx < self.prog_move_dsts.len()
&& self.prog_move_dsts[prog_move_dst_idx].0 < move_dst_start
{
trace!(" -> skipping idx {}", prog_move_dst_idx);
prog_move_dst_idx += 1;
}
while prog_move_dst_idx < self.prog_move_dsts.len()
&& self.prog_move_dsts[prog_move_dst_idx].0 < move_dst_end
{
trace!(
" -> setting idx {} ({:?}) to alloc {:?}",
prog_move_dst_idx,
self.prog_move_dsts[prog_move_dst_idx].0,
alloc
);
self.prog_move_dsts[prog_move_dst_idx].1 = alloc;
prog_move_dst_idx += 1;
}
prev = entry.index;
}
}
@@ -714,7 +630,7 @@ impl<'a, F: Function> Env<'a, F> {
}
// Handle multi-fixed-reg constraints by copying.
for fixup in std::mem::replace(&mut self.multi_fixed_reg_fixups, vec![]) {
for fixup in core::mem::replace(&mut self.multi_fixed_reg_fixups, vec![]) {
let from_alloc = self.get_alloc(fixup.pos.inst(), fixup.from_slot as usize);
let to_alloc = Allocation::reg(PReg::from_index(fixup.to_preg.index()));
trace!(
@@ -820,42 +736,6 @@ impl<'a, F: Function> Env<'a, F> {
}
}
// Sort the prog-moves lists and insert moves to reify the
// input program's move operations.
self.prog_move_srcs
.sort_unstable_by_key(|((_, inst), _)| *inst);
self.prog_move_dsts
.sort_unstable_by_key(|((_, inst), _)| inst.prev());
let prog_move_srcs = std::mem::replace(&mut self.prog_move_srcs, vec![]);
let prog_move_dsts = std::mem::replace(&mut self.prog_move_dsts, vec![]);
debug_assert_eq!(prog_move_srcs.len(), prog_move_dsts.len());
for (&((_, from_inst), from_alloc), &((to_vreg, to_inst), to_alloc)) in
prog_move_srcs.iter().zip(prog_move_dsts.iter())
{
trace!(
"program move at inst {:?}: alloc {:?} -> {:?} (v{})",
from_inst,
from_alloc,
to_alloc,
to_vreg.index(),
);
debug_assert!(from_alloc.is_some());
debug_assert!(to_alloc.is_some());
debug_assert_eq!(from_inst, to_inst.prev());
// N.B.: these moves happen with the *same* priority as
// LR-to-LR moves, because they work just like them: they
// connect a use at one progpoint (move-After) with a def
// at an adjacent progpoint (move+1-Before), so they must
// happen in parallel with all other LR-to-LR moves.
self.insert_move(
ProgPoint::before(to_inst),
InsertMovePrio::Regular,
from_alloc,
to_alloc,
self.vreg(to_vreg),
);
}
// Sort the debug-locations vector; we provide this
// invariant to the client.
self.debug_locations.sort_unstable();
@@ -986,6 +866,9 @@ impl<'a, F: Function> Env<'a, F> {
to: pos_prio.pos.next(),
});
let get_reg = || {
if let Some(reg) = self.env.scratch_by_class[regclass as usize] {
return Some(Allocation::reg(reg));
}
while let Some(preg) = scratch_iter.next() {
if !self.pregs[preg.index()]
.allocations

View File

@@ -22,12 +22,11 @@ use crate::{
CodeRange, BUNDLE_MAX_NORMAL_SPILL_WEIGHT, MAX_SPLITS_PER_SPILLSET,
MINIMAL_BUNDLE_SPILL_WEIGHT, MINIMAL_FIXED_BUNDLE_SPILL_WEIGHT,
},
Allocation, Function, Inst, InstPosition, OperandConstraint, OperandKind, PReg, ProgPoint,
RegAllocError,
Allocation, Function, FxHashSet, Inst, InstPosition, OperandConstraint, OperandKind, PReg,
ProgPoint, RegAllocError,
};
use fxhash::FxHashSet;
use core::fmt::Debug;
use smallvec::{smallvec, SmallVec};
use std::fmt::Debug;
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum AllocRegResult {
@@ -159,7 +158,7 @@ impl<'a, F: Function> Env<'a, F> {
trace!(" -> conflict bundle {:?}", conflict_bundle);
if self.conflict_set.insert(conflict_bundle) {
conflicts.push(conflict_bundle);
max_conflict_weight = std::cmp::max(
max_conflict_weight = core::cmp::max(
max_conflict_weight,
self.bundles[conflict_bundle.index()].cached_spill_weight(),
);
@@ -172,7 +171,7 @@ impl<'a, F: Function> Env<'a, F> {
}
if first_conflict.is_none() {
first_conflict = Some(ProgPoint::from_index(std::cmp::max(
first_conflict = Some(ProgPoint::from_index(core::cmp::max(
preg_key.from,
key.from,
)));
@@ -334,7 +333,7 @@ impl<'a, F: Function> Env<'a, F> {
self.bundles[bundle.index()].prio,
final_weight
);
std::cmp::min(BUNDLE_MAX_NORMAL_SPILL_WEIGHT, final_weight)
core::cmp::min(BUNDLE_MAX_NORMAL_SPILL_WEIGHT, final_weight)
} else {
0
}
@@ -824,7 +823,7 @@ impl<'a, F: Function> Env<'a, F> {
// (up to the Before of the next inst), *unless*
// the original LR was only over the Before (up to
// the After) of this inst.
let to = std::cmp::min(ProgPoint::before(u.pos.inst().next()), lr_to);
let to = core::cmp::min(ProgPoint::before(u.pos.inst().next()), lr_to);
// If the last bundle was at the same inst, add a new
// LR to the same bundle; otherwise, create a LR and a
@@ -863,7 +862,7 @@ impl<'a, F: Function> Env<'a, F> {
// Otherwise, create a new LR.
let pos = ProgPoint::before(u.pos.inst());
let pos = std::cmp::max(lr_from, pos);
let pos = core::cmp::max(lr_from, pos);
let cr = CodeRange { from: pos, to };
let lr = self.create_liverange(cr);
new_lrs.push((vreg, lr));
@@ -1036,7 +1035,7 @@ impl<'a, F: Function> Env<'a, F> {
self.get_or_create_spill_bundle(bundle, /* create_if_absent = */ false)
{
let mut list =
std::mem::replace(&mut self.bundles[bundle.index()].ranges, smallvec![]);
core::mem::replace(&mut self.bundles[bundle.index()].ranges, smallvec![]);
for entry in &list {
self.ranges[entry.index.index()].bundle = spill;
}
@@ -1107,7 +1106,7 @@ impl<'a, F: Function> Env<'a, F> {
lowest_cost_evict_conflict_cost,
lowest_cost_split_conflict_cost,
) {
(Some(a), Some(b)) => Some(std::cmp::max(a, b)),
(Some(a), Some(b)) => Some(core::cmp::max(a, b)),
_ => None,
};
match self.try_to_allocate_bundle_to_reg(bundle, preg_idx, scan_limit_cost) {
@@ -1291,7 +1290,7 @@ impl<'a, F: Function> Env<'a, F> {
);
let bundle_start = self.bundles[bundle.index()].ranges[0].range.from;
let mut split_at_point =
std::cmp::max(lowest_cost_split_conflict_point, bundle_start);
core::cmp::max(lowest_cost_split_conflict_point, bundle_start);
let requeue_with_reg = lowest_cost_split_conflict_reg;
// Adjust `split_at_point` if it is within a deeper loop

View File

@@ -1,7 +1,6 @@
//! Redundant-move elimination.
use crate::{Allocation, VReg};
use fxhash::FxHashMap;
use crate::{Allocation, FxHashMap, VReg};
use smallvec::{smallvec, SmallVec};
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
@@ -112,9 +111,9 @@ impl RedundantMoveEliminator {
pub fn clear_alloc(&mut self, alloc: Allocation) {
trace!(" redundant move eliminator: clear {:?}", alloc);
if let Some(ref mut existing_copies) = self.reverse_allocs.get_mut(&alloc) {
for to_inval in existing_copies.iter() {
for to_inval in existing_copies.drain(..) {
trace!(" -> clear existing copy: {:?}", to_inval);
if let Some(val) = self.allocs.get_mut(to_inval) {
if let Some(val) = self.allocs.get_mut(&to_inval) {
match val {
RedundantMoveState::Copy(_, Some(vreg)) => {
*val = RedundantMoveState::Orig(*vreg);
@@ -122,9 +121,8 @@ impl RedundantMoveEliminator {
_ => *val = RedundantMoveState::None,
}
}
self.allocs.remove(to_inval);
self.allocs.remove(&to_inval);
}
existing_copies.clear();
}
self.allocs.remove(&alloc);
}

View File

@@ -78,7 +78,7 @@ impl<'a> RegTraversalIter<'a> {
}
}
impl<'a> std::iter::Iterator for RegTraversalIter<'a> {
impl<'a> core::iter::Iterator for RegTraversalIter<'a> {
type Item = PReg;
fn next(&mut self) -> Option<PReg> {

View File

@@ -138,7 +138,7 @@ impl<'a, F: Function> Env<'a, F> {
let mut success = false;
// Never probe the same element more than once: limit the
// attempt count to the number of slots in existence.
for _attempt in 0..std::cmp::min(self.slots_by_size[size].slots.len(), MAX_ATTEMPTS) {
for _attempt in 0..core::cmp::min(self.slots_by_size[size].slots.len(), MAX_ATTEMPTS) {
// Note: this indexing of `slots` is always valid
// because either the `slots` list is empty and the
// iteration limit above consequently means we don't

View File

@@ -12,6 +12,8 @@
//! Stackmap computation.
use alloc::vec::Vec;
use super::{Env, ProgPoint, VRegIndex};
use crate::{ion::data_structures::u64_key, Function};

View File

@@ -11,6 +11,12 @@
*/
#![allow(dead_code)]
#![no_std]
#[cfg(feature = "std")]
extern crate std;
extern crate alloc;
// Even when trace logging is disabled, the trace macro has a significant
// performance cost so we disable it in release builds.
@@ -28,6 +34,11 @@ macro_rules! trace_enabled {
};
}
use core::hash::BuildHasherDefault;
use rustc_hash::FxHasher;
type FxHashMap<K, V> = hashbrown::HashMap<K, V, BuildHasherDefault<FxHasher>>;
type FxHashSet<V> = hashbrown::HashSet<V, BuildHasherDefault<FxHasher>>;
pub(crate) mod cfg;
pub(crate) mod domtree;
pub mod indexset;
@@ -38,6 +49,8 @@ pub mod ssa;
#[macro_use]
mod index;
use alloc::vec::Vec;
pub use index::{Block, Inst, InstRange, InstRangeIter};
pub mod checker;
@@ -142,8 +155,8 @@ impl PReg {
}
}
impl std::fmt::Debug for PReg {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Debug for PReg {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(
f,
"PReg(hw = {}, class = {:?}, index = {})",
@@ -154,8 +167,8 @@ impl std::fmt::Debug for PReg {
}
}
impl std::fmt::Display for PReg {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for PReg {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
let class = match self.class() {
RegClass::Int => "i",
RegClass::Float => "f",
@@ -266,8 +279,7 @@ impl From<&MachineEnv> for PRegSet {
/// A virtual register. Contains a virtual register number and a
/// class.
///
/// A virtual register ("vreg") corresponds to an SSA value for SSA
/// input, or just a register when we allow for non-SSA input. All
/// A virtual register ("vreg") corresponds to an SSA value. All
/// dataflow in the input program is specified via flow through a
/// virtual register; even uses of specially-constrained locations,
/// such as fixed physical registers, are done by using vregs, because
@@ -312,8 +324,8 @@ impl VReg {
}
}
impl std::fmt::Debug for VReg {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Debug for VReg {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(
f,
"VReg(vreg = {}, class = {:?})",
@@ -323,8 +335,8 @@ impl std::fmt::Debug for VReg {
}
}
impl std::fmt::Display for VReg {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for VReg {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(f, "v{}", self.vreg())
}
}
@@ -383,8 +395,8 @@ impl SpillSlot {
}
}
impl std::fmt::Display for SpillSlot {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for SpillSlot {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(f, "stack{}", self.index())
}
}
@@ -414,8 +426,8 @@ pub enum OperandConstraint {
Reuse(usize),
}
impl std::fmt::Display for OperandConstraint {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for OperandConstraint {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
match self {
Self::Any => write!(f, "any"),
Self::Reg => write!(f, "reg"),
@@ -797,14 +809,14 @@ impl Operand {
}
}
impl std::fmt::Debug for Operand {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
std::fmt::Display::fmt(self, f)
impl core::fmt::Debug for Operand {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
core::fmt::Display::fmt(self, f)
}
}
impl std::fmt::Display for Operand {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for Operand {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
match (self.kind(), self.pos()) {
(OperandKind::Def, OperandPos::Late) | (OperandKind::Use, OperandPos::Early) => {
write!(f, "{:?}", self.kind())?;
@@ -837,14 +849,14 @@ pub struct Allocation {
bits: u32,
}
impl std::fmt::Debug for Allocation {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
std::fmt::Display::fmt(self, f)
impl core::fmt::Debug for Allocation {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
core::fmt::Display::fmt(self, f)
}
}
impl std::fmt::Display for Allocation {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for Allocation {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
match self.kind() {
AllocationKind::None => write!(f, "none"),
AllocationKind::Reg => write!(f, "{}", self.as_reg().unwrap()),
@@ -1029,10 +1041,6 @@ pub trait Function {
false
}
/// Determine whether an instruction is a move; if so, return the
/// Operands for (src, dst).
fn is_move(&self, insn: Inst) -> Option<(Operand, Operand)>;
// --------------------------
// Instruction register slots
// --------------------------
@@ -1178,8 +1186,8 @@ pub struct ProgPoint {
bits: u32,
}
impl std::fmt::Debug for ProgPoint {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Debug for ProgPoint {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(
f,
"progpoint{}{}",
@@ -1326,19 +1334,43 @@ impl<'a> Iterator for OutputIter<'a> {
pub struct MachineEnv {
/// Preferred physical registers for each class. These are the
/// registers that will be allocated first, if free.
///
/// If an explicit scratch register is provided in `scratch_by_class` then
/// it must not appear in this list.
pub preferred_regs_by_class: [Vec<PReg>; 2],
/// Non-preferred physical registers for each class. These are the
/// registers that will be allocated if a preferred register is
/// not available; using one of these is considered suboptimal,
/// but still better than spilling.
///
/// If an explicit scratch register is provided in `scratch_by_class` then
/// it must not appear in this list.
pub non_preferred_regs_by_class: [Vec<PReg>; 2],
/// Optional dedicated scratch register per class. This is needed to perform
/// moves between registers when cyclic move patterns occur. The
/// register should not be placed in either the preferred or
/// non-preferred list (i.e., it is not otherwise allocatable).
///
/// Note that the register allocator will freely use this register
/// between instructions, but *within* the machine code generated
/// by a single (regalloc-level) instruction, the client is free
/// to use the scratch register. E.g., if one "instruction" causes
/// the emission of two machine-code instructions, this lowering
/// can use the scratch register between them.
///
/// If a scratch register is not provided then the register allocator will
/// automatically allocate one as needed, spilling a value to the stack if
/// necessary.
pub scratch_by_class: [Option<PReg>; 2],
/// Some `PReg`s can be designated as locations on the stack rather than
/// actual registers. These can be used to tell the register allocator about
/// pre-defined stack slots used for function arguments and return values.
///
/// `PReg`s in this list cannot be used as an allocatable register.
/// `PReg`s in this list cannot be used as an allocatable or scratch
/// register.
pub fixed_stack_slots: Vec<PReg>,
}
@@ -1403,9 +1435,9 @@ impl Output {
// binary_search_by returns the index of where it would have
// been inserted in Err.
if pos < ProgPoint::before(inst_range.first()) {
std::cmp::Ordering::Less
core::cmp::Ordering::Less
} else {
std::cmp::Ordering::Greater
core::cmp::Ordering::Greater
}
})
.unwrap_err();
@@ -1444,12 +1476,13 @@ pub enum RegAllocError {
TooManyLiveRegs,
}
impl std::fmt::Display for RegAllocError {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
impl core::fmt::Display for RegAllocError {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
write!(f, "{:?}", self)
}
}
#[cfg(feature = "std")]
impl std::error::Error for RegAllocError {}
/// Run the allocator.

View File

@@ -4,8 +4,8 @@
*/
use crate::{ion::data_structures::u64_key, Allocation, PReg};
use core::fmt::Debug;
use smallvec::{smallvec, SmallVec};
use std::fmt::Debug;
/// A list of moves to be performed in sequence, with auxiliary data
/// attached to each.

View File

@@ -6,6 +6,8 @@
//! Fast postorder computation.
use crate::Block;
use alloc::vec;
use alloc::vec::Vec;
use smallvec::{smallvec, SmallVec};
pub fn calculate<'a, SuccFn: Fn(Block) -> &'a [Block]>(
@@ -16,8 +18,7 @@ pub fn calculate<'a, SuccFn: Fn(Block) -> &'a [Block]>(
let mut ret = vec![];
// State: visited-block map, and explicit DFS stack.
let mut visited = vec![];
visited.resize(num_blocks, false);
let mut visited = vec![false; num_blocks];
struct State<'a> {
block: Block,

View File

@@ -5,7 +5,8 @@
//! SSA-related utilities.
use std::collections::HashSet;
use alloc::vec;
use hashbrown::HashSet;
use crate::cfg::CFGInfo;
use crate::{Block, Function, Inst, OperandKind, RegAllocError, VReg};