egraph support: rewrite to work in terms of CLIF data structures. (#5382)
* egraph support: rewrite to work in terms of CLIF data structures.

  This work rewrites the "egraph"-based optimization framework in Cranelift to operate on aegraphs (acyclic egraphs) represented in the CLIF itself, rather than as a separate data structure to and from which we translate the CLIF. The basic idea is to add a new kind of value, a "union", that is like an alias but refers to two other values rather than one. This allows us to represent an eclass of enodes (values) as a tree. The union node allows a value to have *multiple representations*: either constituent value could be used, and (in well-formed CLIF produced by correct optimization rules) they must be equivalent.

  Like the old egraph infrastructure, we take advantage of acyclicity and eager rule application to do optimization in a single pass. Like before, we integrate GVN (during the optimization pass) and LICM (during elaboration).

  Unlike the old egraph infrastructure, everything stays in the DataFlowGraph. "Pure" enodes are represented as instructions that have values attached, but that are not placed into the function layout. When entering "egraph" form, we remove them from the layout while optimizing. When leaving "egraph" form, during elaboration, we place an instruction back into the layout the first time we elaborate the enode; if we elaborate it more than once, we clone the instruction.

  The implementation performs two passes overall:

  - One, a forward pass in RPO (to see defs before uses), that (i) removes "pure" instructions from the layout and (ii) optimizes as it goes. As before, we eagerly optimize, so we form the entire union of optimized forms of a value before we see any uses of that value. This lets us rewrite uses to use the most "up-to-date" form of the value and canonicalize and optimize that form. The eager rewriting and the acyclic representation make each other work: we could not eagerly rewrite if there were cycles, and acyclicity does not miss optimization opportunities only because the first time we introduce a value, we immediately produce its "best" form. This design choice is also what allows us to avoid the "parent pointers" and fixpoint loop of traditional egraphs.

    This forward optimization pass keeps a scoped hashmap to "intern" nodes (thus performing GVN), and also interleaves on a per-instruction level with alias analysis. The interleaving with alias analysis allows alias analysis to see the most optimized form of each address (so it can see equivalences), and allows the next value to see any equivalences (reuses of loads or stored values) that alias analysis uncovers.

  - Two, a forward pass in domtree preorder that "elaborates" pure enodes back into the layout, possibly in multiple places if needed. This tracks the loop nest and hoists nodes as needed, performing LICM as it goes. Note that by doing this in forward order, we avoid the "fixpoint" that traditional LICM needs: we hoist a def before its uses, so when we place a node, we place it in the right place the first time rather than moving it later.

  This PR replaces the old (a)egraph implementation. It removes both the cranelift-egraph crate and the logic in cranelift-codegen that uses it.

  On `spidermonkey.wasm` running a simple recursive Fibonacci microbenchmark, this work shows a 5.5% compile-time reduction and a 7.7% runtime improvement (speedup).

  Most of this implementation was done in (very productive) pair programming sessions with Jamey Sharp, thus:

  Co-authored-by: Jamey Sharp <jsharp@fastly.com>

* Review feedback.

* Review feedback.

* Review feedback.

* Bugfix: cprop rule: `(x + k1) - k2` becomes `x - (k2 - k1)`, not `x - (k1 - k2)`.

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
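To make the union-value representation described in the commit message concrete, here is a minimal standalone sketch. It is not the actual Cranelift API: the `ValueDef` enum and `eclass_members` helper below are simplified stand-ins that only mirror the idea that an eclass is a tree of union values whose leaves are ordinary instruction results, so enumerating its members is a small depth-first walk (much like `InstDataEtorIter` in the diff below).

```rust
#[derive(Clone, Copy)]
enum ValueDef {
    /// Result of a (possibly pure, not-yet-placed) instruction.
    Result(u32),
    /// A union of two equivalent values: either one may be used.
    Union(u32, u32),
}

/// Enumerate the instruction results reachable from the eclass rooted
/// at `root` by walking the tree of union values depth-first.
fn eclass_members(defs: &[ValueDef], root: u32) -> Vec<u32> {
    let mut stack = vec![root];
    let mut members = Vec::new();
    while let Some(v) = stack.pop() {
        match defs[v as usize] {
            ValueDef::Union(x, y) => {
                stack.push(x);
                stack.push(y);
            }
            ValueDef::Result(inst) => members.push(inst),
        }
    }
    members
}

fn main() {
    // v0 and v1 are two equivalent representations; v2 unions them.
    let defs = [
        ValueDef::Result(10),
        ValueDef::Result(11),
        ValueDef::Union(0, 1),
    ];
    assert_eq!(eclass_members(&defs, 2), vec![11, 10]);
}
```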
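And a quick sanity check of the constant-propagation identity from the bugfix bullet, using wrapping `i64` arithmetic as a stand-in for CLIF's fixed-width integer semantics (the concrete constants are arbitrary):

```rust
fn main() {
    let (x, k1, k2): (i64, i64, i64) = (37, 5, 11);
    let original = x.wrapping_add(k1).wrapping_sub(k2); // (x + k1) - k2
    let fixed = x.wrapping_sub(k2.wrapping_sub(k1)); // x - (k2 - k1): equivalent
    let buggy = x.wrapping_sub(k1.wrapping_sub(k2)); // x - (k1 - k2): not equivalent
    assert_eq!(original, fixed);
    assert_ne!(original, buggy); // 31 vs. 43 for these constants
}
```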
@@ -1,308 +1,131 @@
//! Optimization driver using ISLE rewrite rules on an egraph.

use crate::egraph::Analysis;
use crate::egraph::FuncEGraph;
pub use crate::egraph::{Node, NodeCtx};
use crate::egraph::{NewOrExistingInst, OptimizeCtx};
use crate::ir::condcodes;
pub use crate::ir::condcodes::{FloatCC, IntCC};
use crate::ir::dfg::ValueDef;
pub use crate::ir::immediates::{Ieee32, Ieee64, Imm64, Offset32, Uimm32, Uimm64, Uimm8};
pub use crate::ir::types::*;
pub use crate::ir::{
    dynamic_to_fixed, AtomicRmwOp, Block, Constant, DynamicStackSlot, FuncRef, GlobalValue, Heap,
    HeapImm, Immediate, InstructionImms, JumpTable, MemFlags, Opcode, StackSlot, Table, TrapCode,
    Type, Value,
    dynamic_to_fixed, AtomicRmwOp, Block, Constant, DataFlowGraph, DynamicStackSlot, FuncRef,
    GlobalValue, Heap, HeapImm, Immediate, InstructionData, JumpTable, MemFlags, Opcode, StackSlot,
    Table, TrapCode, Type, Value,
};
use crate::isle_common_prelude_methods;
use crate::machinst::isle::*;
use crate::trace;
pub use cranelift_egraph::{Id, NewOrExisting, NodeIter};
use cranelift_entity::{EntityList, EntityRef};
use smallvec::SmallVec;
use cranelift_entity::packed_option::ReservedValue;
use smallvec::{smallvec, SmallVec};
use std::marker::PhantomData;

pub type IdArray = EntityList<Id>;
#[allow(dead_code)]
pub type Unit = ();
pub type Range = (usize, usize);
pub type ValueArray2 = [Value; 2];
pub type ValueArray3 = [Value; 3];

pub type ConstructorVec<T> = SmallVec<[T; 8]>;

mod generated_code;
pub(crate) mod generated_code;
use generated_code::ContextIter;

struct IsleContext<'a, 'b> {
    egraph: &'a mut FuncEGraph<'b>,
pub(crate) struct IsleContext<'a, 'b, 'c> {
    pub(crate) ctx: &'a mut OptimizeCtx<'b, 'c>,
}

const REWRITE_LIMIT: usize = 5;

pub fn optimize_eclass<'a>(id: Id, egraph: &mut FuncEGraph<'a>) -> Id {
    trace!("running rules on eclass {}", id.index());
    egraph.stats.rewrite_rule_invoked += 1;

    if egraph.rewrite_depth > REWRITE_LIMIT {
        egraph.stats.rewrite_depth_limit += 1;
        return id;
    }
    egraph.rewrite_depth += 1;

    // Find all possible rewrites and union them in, returning the
    // union.
    let mut ctx = IsleContext { egraph };
    let optimized_ids = generated_code::constructor_simplify(&mut ctx, id);
    let mut union_id = id;
    if let Some(mut ids) = optimized_ids {
        while let Some(new_id) = ids.next(&mut ctx) {
            if ctx.egraph.subsume_ids.contains(&new_id) {
                trace!(" -> eclass {} subsumes {}", new_id, id);
                ctx.egraph.stats.node_subsume += 1;
                // Merge in the unionfind so canonicalization still
                // works, but take *only* the subsuming ID, and break
                // now.
                ctx.egraph.egraph.unionfind.union(union_id, new_id);
                union_id = new_id;
                break;
            }
            ctx.egraph.stats.node_union += 1;
            let old_union_id = union_id;
            union_id = ctx
                .egraph
                .egraph
                .union(&ctx.egraph.node_ctx, union_id, new_id);
            trace!(
                " -> union eclass {} with {} to get {}",
                new_id,
                old_union_id,
                union_id
            );
        }
    }
    trace!(" -> optimize {} got {}", id, union_id);
    ctx.egraph.rewrite_depth -= 1;
    union_id
}

pub(crate) fn store_to_load<'a>(id: Id, egraph: &mut FuncEGraph<'a>) -> Id {
    // Note that we only examine the latest enode in the eclass: opts
    // are invoked for every new enode added to an eclass, so
    // traversing the whole eclass would be redundant.
    let load_key = egraph.egraph.classes[id].get_node().unwrap();
    if let Node::Load {
        op:
            InstructionImms::Load {
                opcode: Opcode::Load,
                offset: load_offset,
                ..
            },
        ty: load_ty,
        addr: load_addr,
        mem_state,
        ..
    } = load_key.node(&egraph.egraph.nodes)
    {
        if let Some(store_inst) = mem_state.as_store() {
            trace!(" -> got load op for id {}", id);
            if let Some((store_ty, store_id)) = egraph.store_nodes.get(&store_inst) {
                trace!(" -> got store id: {} ty: {}", store_id, store_ty);
                let store_key = egraph.egraph.classes[*store_id].get_node().unwrap();
                if let Node::Inst {
                    op:
                        InstructionImms::Store {
                            opcode: Opcode::Store,
                            offset: store_offset,
                            ..
                        },
                    args: store_args,
                    ..
                } = store_key.node(&egraph.egraph.nodes)
                {
                    let store_args = store_args.as_slice(&egraph.node_ctx.args);
                    let store_data = store_args[0];
                    let store_addr = store_args[1];
                    if *load_offset == *store_offset
                        && *load_ty == *store_ty
                        && egraph.egraph.unionfind.equiv_id_mut(*load_addr, store_addr)
                    {
                        trace!(" -> same offset, type, address; forwarding");
                        egraph.stats.store_to_load_forward += 1;
                        return store_data;
                    }
                }
            }
        }
    }

    id
}

struct NodesEtorIter<'a, 'b>
where
    'b: 'a,
{
    root: Id,
    iter: NodeIter<NodeCtx, Analysis>,
pub(crate) struct InstDataEtorIter<'a, 'b, 'c> {
    stack: SmallVec<[Value; 8]>,
    _phantom1: PhantomData<&'a ()>,
    _phantom2: PhantomData<&'b ()>,
    _phantom3: PhantomData<&'c ()>,
}

impl<'a, 'b> generated_code::ContextIter for NodesEtorIter<'a, 'b>
where
    'b: 'a,
{
    type Context = IsleContext<'a, 'b>;
    type Output = (Type, InstructionImms, IdArray);

    fn next(&mut self, ctx: &mut IsleContext<'a, 'b>) -> Option<Self::Output> {
        while let Some(node) = self.iter.next(&ctx.egraph.egraph) {
            trace!("iter from root {}: node {:?}", self.root, node);
            match node {
                Node::Pure {
                    op,
                    args,
                    ty,
                    arity,
                }
                | Node::Inst {
                    op,
                    args,
                    ty,
                    arity,
                    ..
                } if *arity == 1 => {
                    return Some((*ty, op.clone(), args.clone()));
                }
                _ => {}
            }
        }
        None
    }
}

impl<'a, 'b> generated_code::Context for IsleContext<'a, 'b> {
    isle_common_prelude_methods!();

    fn eclass_type(&mut self, eclass: Id) -> Option<Type> {
        let mut iter = self.egraph.egraph.enodes(eclass);
        while let Some(node) = iter.next(&self.egraph.egraph) {
            match node {
                &Node::Pure { ty, arity, .. } | &Node::Inst { ty, arity, .. } if arity == 1 => {
                    return Some(ty);
                }
                &Node::Load { ty, .. } => return Some(ty),
                &Node::Result { ty, .. } => return Some(ty),
                &Node::Param { ty, .. } => return Some(ty),
                _ => {}
            }
        }
        None
    }

    fn at_loop_level(&mut self, eclass: Id) -> (u8, Id) {
        (
            self.egraph.egraph.analysis_value(eclass).loop_level.level() as u8,
            eclass,
        )
    }

    type enodes_etor_iter = NodesEtorIter<'a, 'b>;

    fn enodes_etor(&mut self, eclass: Id) -> Option<NodesEtorIter<'a, 'b>> {
        Some(NodesEtorIter {
            root: eclass,
            iter: self.egraph.egraph.enodes(eclass),
impl<'a, 'b, 'c> InstDataEtorIter<'a, 'b, 'c> {
    fn new(root: Value) -> Self {
        debug_assert_ne!(root, Value::reserved_value());
        Self {
            stack: smallvec![root],
            _phantom1: PhantomData,
            _phantom2: PhantomData,
        })
    }

    fn pure_enode_ctor(&mut self, ty: Type, op: &InstructionImms, args: IdArray) -> Id {
        let op = op.clone();
        match self.egraph.egraph.add(
            Node::Pure {
                op,
                args,
                ty,
                arity: 1,
            },
            &mut self.egraph.node_ctx,
        ) {
            NewOrExisting::New(id) => {
                self.egraph.stats.node_created += 1;
                self.egraph.stats.node_pure += 1;
                self.egraph.stats.node_ctor_created += 1;
                optimize_eclass(id, self.egraph)
            }
            NewOrExisting::Existing(id) => {
                self.egraph.stats.node_ctor_deduped += 1;
                id
            }
            _phantom3: PhantomData,
        }
    }

    fn id_array_0_etor(&mut self, arg0: IdArray) -> Option<()> {
        let values = arg0.as_slice(&self.egraph.node_ctx.args);
        if values.len() == 0 {
            Some(())
        } else {
            None
        }
    }

    fn id_array_0_ctor(&mut self) -> IdArray {
        EntityList::default()
    }

    fn id_array_1_etor(&mut self, arg0: IdArray) -> Option<Id> {
        let values = arg0.as_slice(&self.egraph.node_ctx.args);
        if values.len() == 1 {
            Some(values[0])
        } else {
            None
        }
    }

    fn id_array_1_ctor(&mut self, arg0: Id) -> IdArray {
        EntityList::from_iter([arg0].into_iter(), &mut self.egraph.node_ctx.args)
    }

    fn id_array_2_etor(&mut self, arg0: IdArray) -> Option<(Id, Id)> {
        let values = arg0.as_slice(&self.egraph.node_ctx.args);
        if values.len() == 2 {
            Some((values[0], values[1]))
        } else {
            None
        }
    }

    fn id_array_2_ctor(&mut self, arg0: Id, arg1: Id) -> IdArray {
        EntityList::from_iter([arg0, arg1].into_iter(), &mut self.egraph.node_ctx.args)
    }

    fn id_array_3_etor(&mut self, arg0: IdArray) -> Option<(Id, Id, Id)> {
        let values = arg0.as_slice(&self.egraph.node_ctx.args);
        if values.len() == 3 {
            Some((values[0], values[1], values[2]))
        } else {
            None
        }
    }

    fn id_array_3_ctor(&mut self, arg0: Id, arg1: Id, arg2: Id) -> IdArray {
        EntityList::from_iter(
            [arg0, arg1, arg2].into_iter(),
            &mut self.egraph.node_ctx.args,
        )
    }

    fn remat(&mut self, id: Id) -> Id {
        trace!("remat: {}", id);
        self.egraph.remat_ids.insert(id);
        id
    }

    fn subsume(&mut self, id: Id) -> Id {
        trace!("subsume: {}", id);
        self.egraph.subsume_ids.insert(id);
        id
    }
}

impl<'a, 'b, 'c> ContextIter for InstDataEtorIter<'a, 'b, 'c>
where
    'b: 'a,
    'c: 'b,
{
    type Context = IsleContext<'a, 'b, 'c>;
    type Output = (Type, InstructionData);

    fn next(&mut self, ctx: &mut IsleContext<'a, 'b, 'c>) -> Option<Self::Output> {
        while let Some(value) = self.stack.pop() {
            debug_assert_ne!(value, Value::reserved_value());
            let value = ctx.ctx.func.dfg.resolve_aliases(value);
            trace!("iter: value {:?}", value);
            match ctx.ctx.func.dfg.value_def(value) {
                ValueDef::Union(x, y) => {
                    debug_assert_ne!(x, Value::reserved_value());
                    debug_assert_ne!(y, Value::reserved_value());
                    trace!(" -> {}, {}", x, y);
                    self.stack.push(x);
                    self.stack.push(y);
                    continue;
                }
                ValueDef::Result(inst, _) if ctx.ctx.func.dfg.inst_results(inst).len() == 1 => {
                    let ty = ctx.ctx.func.dfg.value_type(value);
                    trace!(" -> value of type {}", ty);
                    return Some((ty, ctx.ctx.func.dfg[inst].clone()));
                }
                _ => {}
            }
        }
        None
    }
}

impl<'a, 'b, 'c> generated_code::Context for IsleContext<'a, 'b, 'c> {
    isle_common_prelude_methods!();

    type inst_data_etor_iter = InstDataEtorIter<'a, 'b, 'c>;

    fn inst_data_etor(&mut self, eclass: Value) -> Option<InstDataEtorIter<'a, 'b, 'c>> {
        Some(InstDataEtorIter::new(eclass))
    }

    fn make_inst_ctor(&mut self, ty: Type, op: &InstructionData) -> Value {
        let value = self
            .ctx
            .insert_pure_enode(NewOrExistingInst::New(op.clone(), ty));
        trace!("make_inst_ctor: {:?} -> {}", op, value);
        value
    }

    fn value_array_2_ctor(&mut self, arg0: Value, arg1: Value) -> ValueArray2 {
        [arg0, arg1]
    }

    fn value_array_3_ctor(&mut self, arg0: Value, arg1: Value, arg2: Value) -> ValueArray3 {
        [arg0, arg1, arg2]
    }

    #[inline]
    fn value_type(&mut self, val: Value) -> Type {
        self.ctx.func.dfg.value_type(val)
    }

    fn remat(&mut self, value: Value) -> Value {
        trace!("remat: {}", value);
        self.ctx.remat_values.insert(value);
        self.ctx.stats.remat += 1;
        value
    }

    fn subsume(&mut self, value: Value) -> Value {
        trace!("subsume: {}", value);
        self.ctx.subsume_values.insert(value);
        self.ctx.stats.subsume += 1;
        value
    }
}