wasmtime/cranelift/entity
Chris Fallin 7b8854f803 egraphs: fix handling of effectful-but-idempotent ops and GVN. (#5800)
* Revert "egraphs: disable GVN of effectful idempotent ops (temporarily). (#5808)"

This reverts commit c7e2571866.

* egraphs: fix handling of effectful-but-idempotent ops and GVN.

This PR addresses #5796: currently, ops that are effectful, i.e., remain
in the side-effecting skeleton (which we keep in the `Layout` while the
egraph exists), but are idempotent and thus mergeable by a GVN pass, are
not handled properly.

GVN is still possible on effectful but idempotent ops precisely because
our GVN does not create partial redundancies: it removes an instruction
only when it is dominated by an identical instruction. An instruction
will not be "hoisted" to a point where it could execute in the optimized
code but not in the original.

However, there are really two parts to the egraph implementation that
produce this effect: the deduplication on insertion into the egraph, and
the elaboration with a scoped hashmap. The deduplication lets us give a
single name (value ID) to all copies of an identical instruction, and
then elaboration will re-create duplicates if GVN should not hoist or
merge some of them.

Because deduplication need not worry about dominance or scopes, we use a
simple (non-scoped) hashmap to dedup/intern ops as "egraph nodes".

When we added support for GVN'ing effectful but idempotent ops (#5594),
we kept the use of this simple dedup'ing hashmap, but these ops do not
get elaborated; instead they stay in the side-effecting skeleton. Thus,
we inadvertently created potential for weird code-motion effects.

The proposal in #5796 would solve this in a clean way by treating these
ops as pure again, and keeping them out of the skeleton, instead putting
"force" pseudo-ops in the skeleton. However, this is a little more
complex than I would like, and I've realized that @jameysharp's earlier
suggestion is much simpler: we can keep an actual scoped hashmap
separately just for the effectful-but-idempotent ops, and use it to GVN
while we build the egraph. In effect, we're fusing a separate GVN pass
with the egraph pass (but letting it interact corecursively with
egraph rewrites). This is in principle similar to how we keep a separate
map for loads and fuse this pass with the egraph rewrite pass as well.
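
As a rough illustration of the scoped-hashmap idea described above (not the
actual Cranelift code), the sketch below walks a dominator tree and merges an
op only into an identical op that dominates it. `InstKey`, `ScopedMap`,
`Block`, and `gvn` are hypothetical names invented for this example, and
argument rewriting after a merge is left out for brevity.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical instruction key: an opcode plus inline argument value IDs.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct InstKey {
    pub opcode: &'static str,
    pub args: Vec<u32>,
}

/// Minimal scoped map: lookups see every enclosing scope, inserts go to the
/// innermost scope, and popping a scope forgets everything inserted in it.
pub struct ScopedMap<K, V> {
    scopes: Vec<HashMap<K, V>>,
}

impl<K: Hash + Eq, V: Copy> ScopedMap<K, V> {
    pub fn new() -> Self {
        Self { scopes: vec![HashMap::new()] }
    }
    pub fn push_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }
    pub fn pop_scope(&mut self) {
        self.scopes.pop();
    }
    pub fn get(&self, key: &K) -> Option<V> {
        self.scopes.iter().rev().find_map(|s| s.get(key).copied())
    }
    pub fn insert(&mut self, key: K, value: V) {
        self.scopes.last_mut().expect("at least one scope").insert(key, value);
    }
}

/// A block in a dominator tree: its (value ID, instruction) pairs and the
/// indices of the blocks it immediately dominates.
pub struct Block {
    pub insts: Vec<(u32, InstKey)>,
    pub children: Vec<usize>,
}

/// Records `(redundant value, dominating value)` pairs in `merged`.
pub fn gvn(
    blocks: &[Block],
    block: usize,
    map: &mut ScopedMap<InstKey, u32>,
    merged: &mut Vec<(u32, u32)>,
) {
    map.push_scope();
    for (value, key) in &blocks[block].insts {
        match map.get(key) {
            // An identical instruction dominates this one: safe to merge.
            Some(dominating) => merged.push((*value, dominating)),
            // First occurrence on this dominator-tree path: remember it.
            None => map.insert(key.clone(), *value),
        }
    }
    for &child in &blocks[block].children {
        gvn(blocks, child, map, merged);
    }
    // Leaving the subtree: entries from this block must not leak to siblings.
    map.pop_scope();
}
```

Because each block's entries are dropped when its scope is popped, identical
ops in sibling blocks are never merged with each other, which is the
"no partial redundancy" property the message relies on.
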

Note that we can use a `ScopedHashMap` here without the "context" (as
needed by `CtxHashMap`) because, as noted by @jameysharp, in practice
the ops we want to GVN have all their args inline. Equality on the
`InstructionData` itself is conservative: two insts whose struct
contents compare shallowly equal are definitely identical, but identical
insts in a deep-equality sense may not compare shallowly equal, due to
list indirection. This is fine for GVN, because it is still sound to
skip any given GVN opportunity (and keep the original instructions).

Fixes #5796.

* Add comments from review.
2023-03-02 02:10:42 +00:00

This crate contains array-based data structures used by the core Cranelift code generator which use densely numbered entity references as mapping keys.

One major difference between this crate and crates like slotmap, slab, and generational-arena is that this crate currently provides no way to delete entities. This limits its use to situations where deleting isn't important, but it also makes the crate more efficient: it needs no extra bookkeeping state to reuse the storage of deleted objects, or to ensure that new objects always have unique keys (e.g. slotmap's and generational-arena's versioning).

Another major difference is that this crate protects against using a key from one map to access an element in another. Where SlotMap, Slab, and Arena have a value type parameter, PrimaryMap has a key type parameter and a value type parameter. The crate also provides the entity_impl macro which makes it easy to declare new unique types for use as keys. Any attempt to use a key in a map it's not intended for is diagnosed with a type error.
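
The key-typing idea can be sketched without the crate itself. The following is a minimal, self-contained illustration, not the crate's implementation: `Key`, `TinyPrimaryMap`, `Block`, `Inst`, and `demo_typed_keys` are hypothetical names made up for this example, and the hand-written `impl Key` blocks stand in for the kind of boilerplate the entity_impl macro is meant to save.

```rust
use std::marker::PhantomData;

/// Illustrative trait for index-like keys (an assumption of this sketch).
pub trait Key: Copy {
    fn new(index: usize) -> Self;
    fn index(self) -> usize;
}

/// Two distinct key types wrapping the same underlying integer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Block(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Inst(u32);

impl Key for Block {
    fn new(index: usize) -> Self { Block(index as u32) }
    fn index(self) -> usize { self.0 as usize }
}

impl Key for Inst {
    fn new(index: usize) -> Self { Inst(index as u32) }
    fn index(self) -> usize { self.0 as usize }
}

/// A map parameterized over both its key type and its value type.
pub struct TinyPrimaryMap<K: Key, V> {
    elems: Vec<V>,
    _key: PhantomData<K>,
}

impl<K: Key, V> TinyPrimaryMap<K, V> {
    pub fn new() -> Self {
        Self { elems: Vec::new(), _key: PhantomData }
    }
    /// Appends a value and returns the freshly created key for it.
    pub fn push(&mut self, value: V) -> K {
        let key = K::new(self.elems.len());
        self.elems.push(value);
        key
    }
    pub fn get(&self, key: K) -> Option<&V> {
        self.elems.get(key.index())
    }
}

pub fn demo_typed_keys() -> (Block, Inst) {
    let mut blocks: TinyPrimaryMap<Block, &'static str> = TinyPrimaryMap::new();
    let mut insts: TinyPrimaryMap<Inst, &'static str> = TinyPrimaryMap::new();
    let b0 = blocks.push("entry");
    let i0 = insts.push("iadd");
    assert_eq!(blocks.get(b0), Some(&"entry"));
    assert_eq!(insts.get(i0), Some(&"iadd"));
    // blocks.get(i0); // rejected by the compiler: expected `Block`, found `Inst`
    (b0, i0)
}
```

Both keys hold the index 0, yet they cannot be confused, because the mismatch is caught at compile time rather than at run time.
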

Another is that this crate has two core map types, PrimaryMap and SecondaryMap, which serve complementary purposes. A PrimaryMap creates its own keys when elements are inserted, while a SecondaryMap reuses the key values of a PrimaryMap, conceptually storing additional data in the same index space. SecondaryMap's values must implement Default, and all elements in a SecondaryMap initially have the value of default().
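
The division of labour between the two map types might look roughly like the sketch below. It is a simplified stand-in under stated assumptions, not the crate's code: `TinyPrimary`, `TinySecondary`, `Value`, and `demo_secondary` are hypothetical names, and the key type is fixed rather than generic to keep the example short.

```rust
/// Hypothetical key type shared by both maps.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Value(u32);

/// Primary map: owns the index space and hands out new keys.
pub struct TinyPrimary<V> {
    elems: Vec<V>,
}

impl<V> TinyPrimary<V> {
    pub fn new() -> Self {
        Self { elems: Vec::new() }
    }
    pub fn push(&mut self, value: V) -> Value {
        let key = Value(self.elems.len() as u32);
        self.elems.push(value);
        key
    }
    pub fn get(&self, key: Value) -> Option<&V> {
        self.elems.get(key.0 as usize)
    }
}

/// Secondary map: attaches extra data to keys created elsewhere.
/// Keys that were never written read back as the default value.
pub struct TinySecondary<V: Clone + Default> {
    elems: Vec<V>,
    default: V,
}

impl<V: Clone + Default> TinySecondary<V> {
    pub fn new() -> Self {
        Self { elems: Vec::new(), default: V::default() }
    }
    pub fn get(&self, key: Value) -> &V {
        self.elems.get(key.0 as usize).unwrap_or(&self.default)
    }
    pub fn set(&mut self, key: Value, value: V) {
        let index = key.0 as usize;
        if index >= self.elems.len() {
            // Grow lazily, filling the gap with default values.
            self.elems.resize(index + 1, self.default.clone());
        }
        self.elems[index] = value;
    }
}

pub fn demo_secondary() -> (u32, u32) {
    let mut types: TinyPrimary<&'static str> = TinyPrimary::new();
    let v0 = types.push("i32");
    let v1 = types.push("i64");
    assert_eq!(types.get(v0), Some(&"i32"));

    // Side table keyed by the same `Value` keys.
    let mut use_counts: TinySecondary<u32> = TinySecondary::new();
    use_counts.set(v1, 3);
    (*use_counts.get(v0), *use_counts.get(v1))
}
```

Here `v0` was never written in the side table, so it reads back as `u32::default()`, while `v1` returns the stored count.
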

A common way to implement Default is to wrap a type in Option; however, this crate also provides the PackedOption utility, which can use less memory in some cases.
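
One way such a memory saving can arise is by reserving a single key value to mean "none", so no separate discriminant is stored. The sketch below illustrates that general technique only; `PackedInst`, `Inst`, `RESERVED`, and `sizes` are hypothetical names, and choosing `u32::MAX` as the reserved value is an assumption of this example.

```rust
use std::mem::size_of;

/// Hypothetical entity reference wrapping a dense index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Inst(u32);

/// Index value set aside to encode "no entity" (an assumption of this sketch).
const RESERVED: u32 = u32::MAX;

/// Optional `Inst` stored in the same 4 bytes as a plain `Inst`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PackedInst(u32);

impl PackedInst {
    pub fn none() -> Self {
        PackedInst(RESERVED)
    }
    pub fn some(inst: Inst) -> Self {
        // The reserved index can no longer be used as a real key.
        assert!(inst.0 != RESERVED, "reserved value used as a key");
        PackedInst(inst.0)
    }
    /// Converts back to an ordinary `Option` for pattern matching.
    pub fn expand(self) -> Option<Inst> {
        if self.0 == RESERVED { None } else { Some(Inst(self.0)) }
    }
}

/// Returns `(size of Option<Inst>, size of PackedInst)` in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Option<Inst>>(), size_of::<PackedInst>())
}
```

The packed form occupies 4 bytes, whereas `Option<Inst>` typically needs more because the wrapped `u32` leaves no spare bit pattern for `None`. The cost is one unusable index and an explicit `expand()` call at each use.
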

Additional utilities provided by this crate include:

  • EntityList, for allocating many small arrays (such as instruction operand lists in a compiler code generator).
  • SparseMap: an alternative to SecondaryMap which can use less memory in some situations.
  • EntitySet: a specialized form of SecondaryMap using a bitvector to record which entities are members of the set.
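
To make the bitvector idea behind a set of entities concrete, here is a minimal sketch that stores one bit per entity index. It is illustrative only and not the crate's implementation: `TinyEntitySet` is a hypothetical name, and it takes raw `usize` indices instead of typed keys to stay short.

```rust
/// One bit per entity index, packed into 64-bit words.
pub struct TinyEntitySet {
    words: Vec<u64>,
}

impl TinyEntitySet {
    pub fn new() -> Self {
        Self { words: Vec::new() }
    }

    /// Returns true if `index` is a member; indices beyond the stored
    /// words are simply absent.
    pub fn contains(&self, index: usize) -> bool {
        self.words
            .get(index / 64)
            .map_or(false, |word| (word & (1u64 << (index % 64))) != 0)
    }

    /// Adds `index` to the set and returns true if it was not already present.
    pub fn insert(&mut self, index: usize) -> bool {
        let (word, bit) = (index / 64, index % 64);
        if word >= self.words.len() {
            // Grow on demand; new words start with no members.
            self.words.resize(word + 1, 0);
        }
        let mask = 1u64 << bit;
        let newly_inserted = (self.words[word] & mask) == 0;
        self.words[word] |= mask;
        newly_inserted
    }
}
```

Compared with a map of `bool` values, this layout uses one bit per entity instead of one byte.
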