In the provided test case in #5716, the result of a call was then added to 0. We have a rewrite rule that sets the remat-bit on any add of a value and a constant, because these frequently appear (e.g. from address offset calculations) and this can frequently reduce register pressure (one long-lived base vs. many long-lived base+offset values). Separately, we have an algebraic rule that `x+0` rewrites to `x`.

The result of this was that we had an eclass with the remat bit set on the add, but the add was also union'd into the call. We pick the latter during extraction, because it's cheaper not to do the add at all; but we still get the remat bit, and try to remat a call (!), which blows up later.

This PR fixes the logic to look up the "best value" for a value (i.e., whatever extraction determined), and look up the remat bit on *that* node, not the canonical node.

(Why did the canonical node become the iadd and not the call? Because the former had a lower value-number, as an accident of IR construction; we don't impose any requirements on the input CLIF's value-number ordering, and I don't think this breaks any of the important acyclic properties, even though there is technically a dependence from a lower-numbered to a higher-numbered node. In essence one can think of them as having "virtual numbers" in any true topologically-sorted order, and the only place the actual integer indices matter should be in choosing the "canonical ID", which is just used for dedup'ing, modulo this bug.)

Fixes #5716.
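
The difference between the old and new lookup can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyEgraph`, its fields, and both `should_remat_*` functions are hypothetical names invented here, not Cranelift's actual data structures or API. The model assumes the canonical ID of an eclass is its lowest value number and that extraction has already recorded a best node per eclass.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical value id; stands in for a CLIF value number such as `v97`.
type Value = u32;

/// Toy model of the state relevant to this fix. None of these names are
/// Cranelift's real API; they are assumptions made for illustration only.
struct ToyEgraph {
    /// Maps each value to the canonical id of its eclass.
    canonical: HashMap<Value, Value>,
    /// Nodes on which a rewrite rule set the remat bit.
    remat: HashSet<Value>,
    /// Extraction result: the node chosen for each canonical id.
    best: HashMap<Value, Value>,
}

impl ToyEgraph {
    /// Pre-fix behavior: consult the remat bit on the canonical node.
    fn should_remat_buggy(&self, v: Value) -> bool {
        self.remat.contains(&self.canonical[&v])
    }

    /// Post-fix behavior: consult the remat bit on the node extraction chose.
    fn should_remat_fixed(&self, v: Value) -> bool {
        let best = self.best[&self.canonical[&v]];
        self.remat.contains(&best)
    }
}

/// Builds the situation from the test case: `v97 = iadd v238, 0` is unioned
/// with the call result `v238`, and the remat bit is set on the add only.
fn issue_scenario() -> ToyEgraph {
    let (add, call_result) = (97, 238);

    // The add has the lower value number, so it becomes the canonical id.
    let mut canonical = HashMap::new();
    canonical.insert(add, add);
    canonical.insert(call_result, add);

    // Only the add carries the remat bit.
    let mut remat = HashSet::new();
    remat.insert(add);

    // Extraction prefers the call result, since skipping the add is cheaper.
    let mut best = HashMap::new();
    best.insert(add, call_result);

    ToyEgraph { canonical, remat, best }
}
```

With `let g = issue_scenario();`, the old lookup `g.should_remat_buggy(97)` reports that the use should be rematerialized even though the node that will actually be emitted is the call, while `g.should_remat_fixed(97)` checks the chosen node and reports that it should not.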
;; This tests that a call does not get rematerialized, even if a remat flag is
;; set on a different node in its eclass.
;;
;; Below, `v97` is an add of `v238` (the call's first return value) and a
;; constant 0; a mid-end rule rewrites this to just `v238` (i.e., `v97` is unioned
;; in). Separately, a rule states that an add of a value and a constant always
;; gets rematerialized at use. When `v97` is used in a later block, it would have
;; rematerialized the add; except, if we instead use the result of the call
;; directly, we should *not* remat the call. If we do, a compile error results
;; later.

test compile
set opt_level=speed_and_size
target aarch64

function u0:33() system_v {
    ss0 = explicit_slot 32
    sig0 = (i64, i64, i64, i64, i64) -> i64, i64 system_v
    fn0 = colocated u0:0 sig0
    jt0 = jump_table [block36, block38]
block0:
    v80 = iconst.i32 0
    v91 = iconst.i64 0
    v92 = iconst.i64 0
    v96 = iconst.i64 0
    v235 = iconst.i64 0
    v236 = iconst.i64 0
    v237 = iconst.i64 0
    v238, v239 = call fn0(v236, v237, v91, v92, v235) ; v236 = 0, v237 = 0, v91 = 0, v92 = 0, v235 = 0
    v97 = iadd v238, v96 ; v96 = 0
    br_table v80, block37, jt0 ; v80 = 0
block36:
    trap user0
block37:
    trap unreachable
block38:
    v98 = load.i8 notrap v97
    v99 = fcvt_from_uint.f64 v98
    stack_store v99, ss0
    trap user0
}