This PR adds a basic *alias analysis*, and optimizations that use it. This is a "mid-end optimization": it operates on CLIF, the machine-independent IR, before lowering occurs. The alias analysis (or perhaps more properly, a sort of memory-value analysis) determines when it can prove that the value at a particular memory location is equal to a given SSA value, and when it can, it replaces any loads of that location. This subsumes two common optimizations:

* *Redundant load elimination*: when the same memory address is loaded twice, and it can be proven that no intervening operation writes to that memory, the second load is *redundant*: its result must be the same as the first's. We can use the first load's result and remove the second load.

* *Store-to-load forwarding*: when a load can be proven to access exactly the memory written by a preceding store, we can replace the load's result with the store's data operand, and remove the load.

Both of these optimizations rely on a "last store" analysis that is a sort of coloring mechanism, split across disjoint categories of abstract state. The basic idea is that every memory-accessing operation is put into one of N disjoint categories; it is disallowed for memory to ever be accessed by an op in one category and later accessed by an op in another category. (The frontend must ensure this.) Given this, we scan the code and determine, for each memory-accessing op, the single prior instruction that is the last store to the same category. This "colors" the instruction: the last store is, in a sense, a static name for that version of memory.

This analysis provides an important invariant: if two operations access memory with the same last store, then *no other store can alias* in the time between that last store and these operations. This must-not-alias property, together with a check that the accessed address is *exactly the same* (same SSA value and offset) and that the other attributes of the access (type, extension mode) are the same, lets us prove that the results are the same.

Given last-store info, we scan the instructions and build a table from "memory location" key (last store, address, offset, type, extension) to the SSA value known to be stored in that location. A store inserts a new mapping. A load may also insert a new mapping, if we didn't already have one. Then, when a load occurs and an entry already exists for its "location", we can reuse the value. This is either redundant load elimination or store-to-load forwarding, depending on where the value came from. Both the analysis state and the table scan are sketched below.

Note that this *does* work across basic blocks: the last-store analysis is a full iterative dataflow pass, and we are careful to check that a previously defined value dominates a potentially redundant load before aliasing to it. So we will do the right thing if we have only a "partially redundant" load (loaded already, but only in one predecessor block), but we will also correctly reuse a value if there is a store or load above a loop and a redundant load of that value within the loop, as long as no potentially aliasing store happens within the loop.
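As a first illustration, here is a minimal, hedged sketch of the last-store state: one color slot per disjoint category, with a meet operation for the iterative dataflow. All names here (`LastStores`, `N_CATEGORIES`, plain `u32` instruction IDs) are illustrative stand-ins, not the PR's actual types:

```rust
/// Illustrative last-store state: one "color" per disjoint memory
/// category; `None` means no store has been seen since function entry.
#[derive(Clone, PartialEq, Debug)]
struct LastStores {
    last_store: [Option<u32>; N_CATEGORIES],
}

const N_CATEGORIES: usize = 3; // e.g. heap, table, other

impl LastStores {
    /// A store recolors only its own category.
    fn handle_store(&mut self, category: usize, inst: u32) {
        self.last_store[category] = Some(inst);
    }

    /// A fence or call conservatively recolors every category.
    fn handle_fence(&mut self, inst: u32) {
        for slot in &mut self.last_store {
            *slot = Some(inst);
        }
    }

    /// Dataflow meet at a block entry: where predecessors disagree, the
    /// merge point itself becomes the color (modeled here by `block_id`).
    fn meet_from(&mut self, other: &LastStores, block_id: u32) {
        for (a, b) in self.last_store.iter_mut().zip(other.last_store.iter()) {
            if *a != *b {
                *a = Some(block_id);
            }
        }
    }
}
```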
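Second, a sketch of the table-driven scan that implements both optimizations at once, again with hypothetical names (`MemoryLoc`, `Access`) and plain `u32` IDs in place of the real entity types:

```rust
use std::collections::HashMap;

type Inst = u32;
type Value = u32;

/// Illustrative "memory location" key: the last-store color plus the
/// exact address/offset and the access attributes that must all match.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct MemoryLoc {
    last_store: Option<Inst>,
    address: Value,
    offset: i32,
    ty: u8,     // access type tag
    extend: u8, // extension mode tag
}

enum Access {
    Store { data: Value },
    Load { result: Value },
}

/// One step of the scan. Returns `Some(known)` when a load can reuse a
/// previously known value: redundant load elimination if it came from a
/// load, store-to-load forwarding if it came from a store.
fn step(table: &mut HashMap<MemoryLoc, Value>, loc: MemoryLoc, access: Access) -> Option<Value> {
    match access {
        Access::Store { data } => {
            // The store names `data` as the value at `loc`.
            table.insert(loc, data);
            None
        }
        Access::Load { result } => match table.get(&loc).copied() {
            Some(known) => Some(known), // rewrite the load's uses to `known`
            None => {
                // First access: remember this load's result for later loads.
                table.insert(loc, result);
                None
            }
        },
    }
}
```

The new `inst_predicates.rs` module below supplies the instruction-level queries (side effects, fence semantics, address/offset extraction) that such a pass needs.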
//! Instruction predicates/properties, shared by various analyses.

use crate::ir::immediates::Offset32;
use crate::ir::instructions::BranchInfo;
use crate::ir::{Block, DataFlowGraph, Function, Inst, InstructionData, Opcode, Type, Value};
use crate::machinst::ty_bits;
use cranelift_entity::EntityRef;

/// Preserve instructions with used result values.
pub fn any_inst_results_used(inst: Inst, live: &[bool], dfg: &DataFlowGraph) -> bool {
    dfg.inst_results(inst).iter().any(|v| live[v.index()])
}

/// Test whether the given opcode is unsafe to even consider as side-effect-free.
fn trivially_has_side_effects(opcode: Opcode) -> bool {
    opcode.is_call()
        || opcode.is_branch()
        || opcode.is_terminator()
        || opcode.is_return()
        || opcode.can_trap()
        || opcode.other_side_effects()
        || opcode.can_store()
}

/// Load instructions without the `notrap` flag are defined to trap when
/// operating on inaccessible memory, so we can't treat them as side-effect-free
/// even if the loaded value is unused.
fn is_load_with_defined_trapping(opcode: Opcode, data: &InstructionData) -> bool {
    if !opcode.can_load() {
        return false;
    }
    match *data {
        InstructionData::StackLoad { .. } => false,
        InstructionData::Load { flags, .. } => !flags.notrap(),
        _ => true,
    }
}

/// Does the given instruction have any side-effect that would preclude it from
/// being removed when its value is unused?
pub fn has_side_effect(func: &Function, inst: Inst) -> bool {
    let data = &func.dfg[inst];
    let opcode = data.opcode();
    trivially_has_side_effects(opcode) || is_load_with_defined_trapping(opcode, data)
}

/// Does the given instruction have any side-effect as per [has_side_effect], or
/// is it a load? The `get_pinned_reg` opcode is exempt in either case.
pub fn has_lowering_side_effect(func: &Function, inst: Inst) -> bool {
    let op = func.dfg[inst].opcode();
    op != Opcode::GetPinnedReg && (has_side_effect(func, inst) || op.can_load())
}

/// Is the given instruction a constant value (`iconst`, `fconst`, `bconst`,
/// `null`) that can be represented in 64 bits?
pub fn is_constant_64bit(func: &Function, inst: Inst) -> Option<u64> {
    let data = &func.dfg[inst];
    if data.opcode() == Opcode::Null {
        return Some(0);
    }
    match data {
        &InstructionData::UnaryImm { imm, .. } => Some(imm.bits() as u64),
        &InstructionData::UnaryIeee32 { imm, .. } => Some(imm.bits() as u64),
        &InstructionData::UnaryIeee64 { imm, .. } => Some(imm.bits()),
        &InstructionData::UnaryBool { imm, .. } => {
            let imm = if imm {
                let bits = ty_bits(func.dfg.value_type(func.dfg.inst_results(inst)[0]));
                // A true `bconst` is all-ones at the width of its result
                // type, e.g. 0xff for `b8`.
                if bits < 64 {
                    (1u64 << bits) - 1
                } else {
                    u64::MAX
                }
            } else {
                0
            };
            Some(imm)
        }
        _ => None,
    }
}

/// Get the address, offset, and access type from the given instruction, if any.
pub fn inst_addr_offset_type(func: &Function, inst: Inst) -> Option<(Value, Offset32, Type)> {
    let data = &func.dfg[inst];
    match data {
        InstructionData::Load { arg, offset, .. } => {
            let ty = func.dfg.value_type(func.dfg.inst_results(inst)[0]);
            Some((*arg, *offset, ty))
        }
        InstructionData::LoadNoOffset { arg, .. } => {
            let ty = func.dfg.value_type(func.dfg.inst_results(inst)[0]);
            Some((*arg, 0.into(), ty))
        }
        // For stores, `args[0]` is the stored value and `args[1]` the address;
        // the access type is the type of the stored value.
        InstructionData::Store { args, offset, .. } => {
            let ty = func.dfg.value_type(args[0]);
            Some((args[1], *offset, ty))
        }
        InstructionData::StoreNoOffset { args, .. } => {
            let ty = func.dfg.value_type(args[0]);
            Some((args[1], 0.into(), ty))
        }
        _ => None,
    }
}

/// Get the store data, if any, from an instruction.
pub fn inst_store_data(func: &Function, inst: Inst) -> Option<Value> {
    let data = &func.dfg[inst];
    match data {
        InstructionData::Store { args, .. } | InstructionData::StoreNoOffset { args, .. } => {
            Some(args[0])
        }
        _ => None,
    }
}

/// Determine whether this opcode behaves as a memory fence, i.e.,
/// prohibits any moving of memory accesses across it.
pub fn has_memory_fence_semantics(op: Opcode) -> bool {
    match op {
        Opcode::AtomicRmw
        | Opcode::AtomicCas
        | Opcode::AtomicLoad
        | Opcode::AtomicStore
        | Opcode::Fence => true,
        // Calls may access (and synchronize on) arbitrary memory, so treat
        // them conservatively as fences too.
        Opcode::Call | Opcode::CallIndirect => true,
        _ => false,
    }
}

/// Visit all successors of a block with a given visitor closure.
pub(crate) fn visit_block_succs<F: FnMut(Inst, Block)>(f: &Function, block: Block, mut visit: F) {
    for inst in f.layout.block_likely_branches(block) {
        if f.dfg[inst].opcode().is_branch() {
            visit_branch_targets(f, inst, &mut visit);
        }
    }
}

fn visit_branch_targets<F: FnMut(Inst, Block)>(f: &Function, inst: Inst, visit: &mut F) {
    match f.dfg[inst].analyze_branch(&f.dfg.value_lists) {
        BranchInfo::NotABranch => {}
        BranchInfo::SingleDest(dest, _) => {
            visit(inst, dest);
        }
        BranchInfo::Table(table, maybe_dest) => {
            // Visit the default destination (if any), then every table entry.
            if let Some(dest) = maybe_dest {
                visit(inst, dest);
            }
            for &dest in f.jump_tables[table].as_slice() {
                visit(inst, dest);
            }
        }
    }
}
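
// What follows is an illustrative sketch, not part of this module: how the
// predicates above might drive the last-store transfer function described in
// the PR text. `SketchState` and `sketch_update` are hypothetical names, and
// the state is collapsed to a single category for brevity.
#[allow(dead_code)]
struct SketchState {
    /// Last instruction known to have stored to (or clobbered) memory.
    last_store: Option<Inst>,
}

#[allow(dead_code)]
fn sketch_update(state: &mut SketchState, func: &Function, inst: Inst) {
    let op = func.dfg[inst].opcode();
    // Fences and calls conservatively clobber all tracked memory; plain
    // stores become the new "color" for subsequent accesses.
    if has_memory_fence_semantics(op) || op.can_store() {
        state.last_store = Some(inst);
    }
    // For a load or store, (address, offset, type) plus the current color
    // would form the memory-location key; `inst_store_data` supplies the
    // value that store-to-load forwarding would reuse.
    if let Some((_addr, _offset, _ty)) = inst_addr_offset_type(func, inst) {
        let _data = inst_store_data(func, inst);
    }
}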