* Cranelift: remove non-egraphs optimization pipeline and `use_egraphs` option.
This PR removes the LICM, GVN, and preopt passes, and associated support
pieces, from `cranelift-codegen`. Not to worry, we still have
optimizations: the egraph framework subsumes all of these, and has been
on by default since #5181.
A few decision points:
- Filetests for the legacy LICM, GVN, and simple_preopt passes were
  removed too. As we built optimizations in the egraph framework we
  wrote new tests for the equivalent functionality, and many of the old
  tests were testing specific behaviors of the old implementations that
  may no longer be relevant. However, if folks prefer, I could take a
  different approach here and try to port over all of the tests.
- The corresponding filetest modes (commands) were deleted too. The
  `test alias_analysis` mode remains, but no longer invokes a separate
  GVN first (since there is no separate GVN that will not also do alias
  analysis), so the tests were tweaked slightly to work with that. The
  egraph testsuite also covers alias analysis.
- The `divconst_magic_numbers` module is removed since it's unused
  without `simple_preopt`, though the division-by-constant transform is
  the one remaining optimization we still need to build in the egraphs
  framework, pending #5908. The magic numbers will live forever in git
  history, so removing this in the meantime is not a major issue IMHO.
- The `use_egraphs` setting itself was removed at both the Cranelift
  and Wasmtime levels. It has been marked deprecated for a few releases
  now (Wasmtime 6.0, 7.0, the upcoming 8.0, and corresponding Cranelift
  versions), so I think this is probably OK. As an alternative, if
  anyone feels strongly, we could leave the setting and make it a
  no-op.
* Update test outputs for remaining test differences.
This PR adds a basic *alias analysis*, and optimizations that use it.
This is a "mid-end optimization": it operates on CLIF, the
machine-independent IR, before lowering occurs.
The alias analysis (or maybe more properly, a sort of memory-value
analysis) determines when it can prove that a particular memory
location holds a given SSA value, and when it can, it replaces any
loads of that location with that value.
This subsumes two common optimizations:
* Redundant load elimination: when the same memory address is loaded
  twice, and it can be proven that no intervening operation writes to
  that memory, then the second load is *redundant* and its result must
  be the same as the first. We can use the first load's result and
  remove the second load.
* Store-to-load forwarding: when a load can be proven to access exactly
  the memory written by a preceding store, we can replace the load's
  result with the store's data operand and remove the load. Both cases
  are shown in the sketch below.
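
A minimal hand-written CLIF sketch of both transforms (illustrative
names, not taken from the test suite):

```
function %rle_and_fwd(i64, i32) -> i32, i32 {
block0(v0: i64, v1: i32):
    v2 = load.i32 v0+8    ; first load of [v0+8]
    v3 = load.i32 v0+8    ; redundant: no intervening store, so uses of
                          ; v3 are replaced with v2
    store v1, v0+16       ; writes v1 to [v0+16]
    v4 = load.i32 v0+16   ; reads exactly what was just stored, so uses
                          ; of v4 are replaced with v1 (forwarding)
    return v3, v4
}
```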
Both of these optimizations rely on a "last store" analysis that is a
sort of coloring mechanism, split across disjoint categories of abstract
state. The basic idea is that every memory-accessing operation is put
into one of N disjoint categories; it is disallowed for memory to ever
be accessed by an op in one category and later accessed by an op in
another category. (The frontend must ensure this.)
Then, given this, we scan the code and determine, for each
memory-accessing op, which single prior instruction (if any) is the
last store to the same category. This "colors" the op: the last store
is, in a sense, a static name for that version of memory.
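
For example, assuming the alias-region memory flags (`heap`, `table`)
that the frontend can attach to accesses, a store in one category
leaves the coloring of the other category untouched. This is a
hand-written illustration, not a real filetest:

```
function %categories(i64, i64, i32) -> i32 {
block0(v0: i64, v1: i64, v2: i32):
    store heap v2, v0+8      ; becomes the last store for "heap" accesses
    store table v2, v1       ; a "table" access: by construction it
                             ; cannot alias any heap access, so the heap
                             ; coloring is unchanged
    v3 = load.i32 heap v0+8  ; still colored by the first store, so it
                             ; can be forwarded to v2
    return v3
}
```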
This analysis provides an important invariant: if two operations access
memory with the same last-store, then *no other store can alias* in the
time between that last store and these operations. This must-not-alias
property, together with a check that the accessed address is *exactly
the same* (same SSA value and offset) and that other attributes of the
access (type, extension mode) are the same, lets us prove that the
results are the same.
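
Conversely, any intervening store in the same category changes the
color and conservatively blocks reuse, even when the addresses look
unrelated (again a hand-written sketch):

```
function %recolored(i64, i32) -> i32 {
block0(v0: i64, v1: i32):
    v2 = load.i32 v0   ; colored by "no store yet" at function entry
    store v1, v0+64    ; a new last store for this category
    v3 = load.i32 v0   ; same address as v2, but a different last-store
                       ; color, so the analysis does not reuse v2
    return v3
}
```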
Given last-store info, we scan the instructions and build a table from
a "memory location" key (last store, address, offset, type, extension)
to the known SSA value stored in that location. A store inserts a new
mapping. A load may also insert a new mapping, if we didn't already
have one. Then when a load occurs and an entry already exists for its
"location", we can reuse the value. This is either redundant load
elimination or store-to-load forwarding, depending on where the value
came from.
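
Annotating a sketch with the keys it would generate (extension omitted;
the key spelling in the comments is descriptive, not the actual in-tree
representation):

```
function %table_keys(i64, i32) -> i32, i32 {
block0(v0: i64, v1: i32):
    store v1, v0+8       ; insert (store0, v0, +8, i32) -> v1
    v2 = load.i32 v0+8   ; hit on (store0, v0, +8, i32): reuse v1
                         ; (store-to-load forwarding)
    v3 = load.i32 v0+12  ; miss (different offset): insert
                         ; (store0, v0, +12, i32) -> v3
    v4 = load.i32 v0+12  ; hit: reuse v3 (redundant load elimination)
    return v2, v4
}
```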
Note that this *does* work across basic blocks: the last-store analysis
is a full iterative dataflow pass, and we are careful to check that a
previously-defined value dominates a potentially redundant load before
reusing it there. So we will do the right thing if we only have a
"partially redundant" load (loaded already, but only in one predecessor
block), and we will also correctly reuse a value if there is a store or
load above a loop and a redundant load of that value within the loop, as
long as no potentially-aliasing stores happen within the loop.
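
For instance, in this hand-written loop (using the current `brif`
branch syntax), the in-loop load can reuse the value loaded above the
loop: no store in the loop body can alias it, and the definition
dominates the loop body:

```
function %across_blocks(i64, i32) -> i32 {
block0(v0: i64, v1: i32):
    v2 = load.i32 v0   ; load above the loop
    jump block1(v1)

block1(v3: i32):
    v4 = load.i32 v0   ; same last-store color as v2, and v2 dominates
                       ; this block, so uses of v4 are replaced by v2
    v5 = isub v3, v4
    brif v5, block1(v5), block2

block2:
    return v2
}
```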