This patch implements, for aarch64, the following wasm SIMD extensions
i32x4.dot_i16x8_s instruction
https://github.com/WebAssembly/simd/pull/127
It also updates dependencies as follows, in order that the new instruction can
be parsed, decoded, etc:
wat to 1.0.27
wast to 26.0.1
wasmparser to 0.65.0
wasmprinter to 0.2.12
The changes are straightforward:
* new CLIF instruction `widening_pairwise_dot_product_s`
* translation from wasm into `widening_pairwise_dot_product_s`
* new AArch64 instructions `smull`, `smull2` (part of the `VecRRR` group)
* translation from `widening_pairwise_dot_product_s` to `smull ; smull2 ; addv`
There is no testcase in this commit, because that is a separate repo. The
implementation has been tested, nevertheless.
Rather than using paths from the root instruction to the instruction we are
matching against or checking if it is constant or whatever, use temporary
variables. When we successfully match an instruction's opcode, we simultaneously
define these temporaries for the instruction's operands. This is similar to how
open-coding these matches in Rust would use `match` expressions with pattern
matching to bind the operands to variables at the same time.
This saves about 1.8% of instructions retired when Peepmatic is enabled.
This commit splits "increments" in two; they previously contained both the
linearized left- and right-hand sides. But only the first increment ever had any
actions, so it was confusing (and space wasting) that all increments had an
"actions" vector. No more!
This commit separates the linearized left-hand side ("matches") from the
linearized right-hand side ("actions").
This lets us avoid the cost of `cranelift_codegen::ir::Opcode` to
`peepmatic_runtime::Operator` conversion overhead, and paves the way for
allowing Peepmatic to support non-clif optimizations (e.g. vcode optimizations).
Rather than defining our own `peepmatic::Operator` type like we used to, now the
whole `peepmatic` crate is effectively generic over a `TOperator` type
parameter. For the Cranelift integration, we use `cranelift_codegen::ir::Opcode`
as the concrete type for our `TOperator` type parameter. For testing, we also
define a `TestOperator` type, so that we can test Peepmatic code without
building all of Cranelift, and we can keep them somewhat isolated from each
other.
The methods that `peepmatic::Operator` had are now translated into trait bounds
on the `TOperator` type. These traits need to be shared between all of
`peepmatic`, `peepmatic-runtime`, and `cranelift-codegen`'s Peepmatic
integration. Therefore, these new traits live in a new crate:
`peepmatic-traits`. This crate acts as a header file of sorts for shared
trait/type/macro definitions.
Additionally, the `peepmatic-runtime` crate no longer depends on the
`peepmatic-macro` procedural macro crate, which should lead to faster build
times for Cranelift when it is using pre-built peephole optimizers.
Also add configuration to CI to fail doc generation if any links are
broken. Unfortunately we can't blanket deny all warnings in rustdoc
since some are unconditional warnings, but for now this is hopefully
good enough.
Closes#1947
The `peepmatic-runtime` crate contains everything required to use a
`peepmatic`-generated peephole optimizer.
In short: build times and code size.
If you are just using a peephole optimizer, you shouldn't need the functions
to construct it from scratch from the DSL (and the implied code size and
compilation time), let alone even build it at all. You should just
deserialize an already-built peephole optimizer, and then use it.
That's all that is contained here in this crate.