egraphs: fix accidental remat of call. (#5726)

In the provided test case in #5716, the result of a call was then
added to 0. We have a rewrite rule that sets the remat-bit on any add
of a value and a constant, because these frequently appear (e.g. from
address offset calculations) and this can frequently reduce register
pressure (one long-lived base vs. many long-lived base+offset values).
Separately, we have an algebraic rule that `x+0` rewrites to `x`.

The result of this was that we had an eclass with the remat bit set on
the add, but the add was also union'd into the call. We pick the
latter during extraction, because it's cheaper not to do the add at
all; but we still get the remat bit, and try to remat a call (!),
which blows up later.

This PR fixes the logic to look up the "best value" for a value (i.e.,
whatever extraction determined), and look up the remat bit on *that*
node, not the canonical node.

(Why did the canonical node become the iadd and not the call? Because
the former had a lower value-number, as an accident of IR
construction; we don't impose any requirements on the input CLIF's
value-number ordering, and I don't think this breaks any of the
important acyclic properties, even though there is technically a
dependence from a lower-numbered to a higher-numbered node. In essence
one can think of them as having "virtual numbers" in any true
topologically-sorted order, and the only place the actual integer
indices matter should be in choosing the "canonical ID", which is just
used for dedup'ing, modulo this bug.)

Fixes #5716.
This commit is contained in:
Chris Fallin
2023-02-06 15:36:16 -08:00
committed by GitHub
parent 16afefdab1
commit 75ae976adc
2 changed files with 51 additions and 10 deletions

View File

@@ -278,6 +278,14 @@ impl<'a> Elaborator<'a> {
before before
); );
// Get the best option; we use `value` (latest
// value) here so we have a full view of the
// eclass.
trace!("looking up best value for {}", value);
let (_, best_value) = self.value_to_best_value[value];
debug_assert_ne!(best_value, Value::reserved_value());
trace!("elaborate: value {} -> best {}", value, best_value);
let remat = if let Some(elab_val) = let remat = if let Some(elab_val) =
self.value_to_elaborated_value.get(&canonical_value) self.value_to_elaborated_value.get(&canonical_value)
{ {
@@ -287,7 +295,7 @@ impl<'a> Elaborator<'a> {
// from another block. If so, ignore the hit // from another block. If so, ignore the hit
// and recompute below. // and recompute below.
let remat = elab_val.in_block != self.cur_block let remat = elab_val.in_block != self.cur_block
&& self.remat_values.contains(&canonical_value); && self.remat_values.contains(&best_value);
if !remat { if !remat {
trace!("elaborate: value {} -> {:?}", value, elab_val); trace!("elaborate: value {} -> {:?}", value, elab_val);
self.stats.elaborate_memoize_hit += 1; self.stats.elaborate_memoize_hit += 1;
@@ -304,20 +312,12 @@ impl<'a> Elaborator<'a> {
// Value not available; but still look up // Value not available; but still look up
// whether it's been flagged for remat because // whether it's been flagged for remat because
// this affects placement. // this affects placement.
let remat = self.remat_values.contains(&canonical_value); let remat = self.remat_values.contains(&best_value);
trace!(" -> not present in map; remat = {}", remat); trace!(" -> not present in map; remat = {}", remat);
remat remat
}; };
self.stats.elaborate_memoize_miss += 1; self.stats.elaborate_memoize_miss += 1;
// Get the best option; we use `value` (latest
// value) here so we have a full view of the
// eclass.
trace!("looking up best value for {}", value);
let (_, best_value) = self.value_to_best_value[value];
debug_assert_ne!(best_value, Value::reserved_value());
trace!("elaborate: value {} -> best {}", value, best_value,);
// Now resolve the value to its definition to see // Now resolve the value to its definition to see
// how we can compute it. // how we can compute it.
let (inst, result_idx) = match self.func.dfg.value_def(best_value) { let (inst, result_idx) = match self.func.dfg.value_def(best_value) {

View File

@@ -0,0 +1,41 @@
;; This tests that a call does not get rematerialized, even if a remat flag is
;; set on a different node in its eclass.
;;
;; Below, `v97` is an add of `v238` (the call's first return value) and a
;; constant 0; a mid-end rule rewrites this to just `v238` (i.e., `v97` is unioned
;; in). Separately, a rule states that an add of a value and a constant always
;; gets rematerialized at use. When `v97` is used in a later block, it would have
;; rematerialized the add; except, if we instead use the result of the call
;; directly, we should *not* remat the call. If we do, a compile error results
;; later.
test compile
set opt_level=speed_and_size
target aarch64
function u0:33() system_v {
ss0 = explicit_slot 32
sig0 = (i64, i64, i64, i64, i64) -> i64, i64 system_v
fn0 = colocated u0:0 sig0
jt0 = jump_table [block36, block38]
block0:
v80 = iconst.i32 0
v91 = iconst.i64 0
v92 = iconst.i64 0
v96 = iconst.i64 0
v235 = iconst.i64 0
v236 = iconst.i64 0
v237 = iconst.i64 0
v238, v239 = call fn0(v236, v237, v91, v92, v235) ; v236 = 0, v237 = 0, v91 = 0, v92 = 0, v235 = 0
v97 = iadd v238, v96 ; v96 = 0
br_table v80, block37, jt0 ; v80 = 0
block36:
trap user0
block37:
trap unreachable
block38:
v98 = load.i8 notrap v97
v99 = fcvt_from_uint.f64 v98
stack_store v99, ss0
trap user0
}