Commit Graph

1001 Commits

Author SHA1 Message Date
Alex Crichton
9e87e45745 Update wasmparser, wast, and spec test suite (#2264)
This brings in a number of SIMD opcode renames, various other test suite
updates, as well as some new proposed SIMD opcodes too.
2020-10-05 13:51:16 -05:00
Benjamin Bouvier
df8f85f4bc machinst x64: remove non_camel_case_types; 2020-10-05 17:44:31 +02:00
Benjamin Bouvier
4a10a78e33 machinst x64: remove non_snake_case; 2020-10-05 17:44:31 +02:00
Johnnie Birch
7b4d173b90 Adds packed floating point min/max for X64 for the new backend
Allows for simd_f32x4 and simd_f64x2 spec tests
2020-10-02 16:20:10 -07:00
Chris Fallin
3ca173e4bc Fix arm32 build after some ABI framework changes.
It turns out that while we don't have the partial/experimental arm32
backend tested on our CI yet, the Firefox build *does* at least rely on
the backend to build, because it specifies the `arm32` feature to
`cranelift-codegen`, even if it will never invoke the backend.
Our previous old-framework arm32 stub at least compiled, so it didn't
break Firefox.

We should probably add a CI build check to ensure we don't bitrot what
we have here, but this is the immediate fix to get us back to sanity.
2020-10-02 11:55:46 -07:00
Chris Fallin
b2f52910fb Merge pull request #2224 from jgouly/sp_adjust
arm64: Use SignedOffset rather than PreIndexed addressing mode for ca…
2020-10-02 09:18:00 -07:00
Andrew Brown
ca1b76421a [machinst x64]: remove duplicate code to insert a lane 2020-10-02 08:29:31 -07:00
Andrew Brown
c42a097a0c [machinst x64]: use is64 instead of w_bit 2020-10-02 08:29:31 -07:00
Andrew Brown
16a2538ecd [machinst x64]: rename Inst::XmmUninitializedValue and document
This approach is not the best but avoids an extra instruction; perhaps at some point, as mentioned in https://github.com/bytecodealliance/wasmtime/pull/2248, we will add the extra instruction or refactor things in such a way that this `Inst` variant is unnecessary.
2020-10-02 08:29:31 -07:00
Andrew Brown
50b9399006 [machinst x64]: lower remaining lane operations--any_true, all_true, splat 2020-10-02 08:29:31 -07:00
Andrew Brown
4565582f02 [machinst x64]: clarify parameter name of Inst::xmm_rm_r_imm 2020-10-02 08:29:31 -07:00
Andrew Brown
0579e9f9de [machinst x64]: add packed OR 2020-10-02 08:29:31 -07:00
Andrew Brown
74226d6781 [machinst x64]: add integer comparisons 2020-10-02 08:29:31 -07:00
Joey Gouly
eec60c9b06 arm64: Use SignedOffset rather than PreIndexed addressing mode for callee-saved registers
This also passes `fixed_frame_storage_size` (previously `total_sp_adjust`)
into `gen_clobber_save` so that it can be combined with other stack
adjustments.

Copyright (c) 2020, Arm Limited.
2020-10-02 16:22:55 +01:00
Chris Fallin
b8f0dc429f Merge pull request #2223 from cfallin/baldrdash-2020
Support for SpiderMonkey's "Wasm ABI 2020" in general and on AArch64.
2020-09-30 15:33:05 -07:00
Chris Fallin
835db11bea Support for SpiderMonkey's "Wasm ABI 2020".
As part of a Wasm JIT update, SpiderMonkey is changing its internal
WebAssembly function ABI. The new ABI's frame format includes "caller
TLS" and "callee TLS" slots. The details of where these come from are
not important; from Cranelift's point of view, the only relevant
requirement is that we have two on-stack args that are always present
(offsetting other on-stack args), and that we define special argument
purposes so that we can supply values for these slots.

Note that this adds a *new* ABI (a variant of the Baldrdash ABI) because
we do not want to tightly couple the landing of this PR to the landing
of the changes in SpiderMonkey; it's better if both the old and new
behavior remain available in Cranelift, so SpiderMonkey can continue to
vendor Cranelift even if it does not land (or backs out) the ABI change.

Furthermore, note that this needs to be a Cranelift-level change (i.e.
cannot be done purely from the translator environment implementation)
because the special TLS arguments must always go on the stack, which
would not otherwise happen with the usual argument-placement logic; and
there is no primitive to push a value directly in CLIF code (the notion
of a stack frame is a lower-level concept).
2020-09-30 14:55:56 -07:00
Andrew Brown
4484a00ea5 [machinst x64]: calculate extension modes in one place 2020-09-29 14:48:59 -07:00
Andrew Brown
715be68101 [machinst x64]: assert lane is correct size for extractlane
This change applies a good suggestion @bjorn3 made in #2230 that I forgot to implement there.
2020-09-29 09:34:22 -07:00
Andrew Brown
f50d905152 [machinst x64]: refactor using added RegMem::from(Writable<Reg>) 2020-09-29 08:45:12 -07:00
Andrew Brown
e3eb098c99 [machinst x64]: add swizzle implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
050f078f86 [machinst x64]: add saturating addition implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
a64abf9b76 [machinst x64]: add shuffle implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
f4836f9ca9 [machinst x64]: add extractlane implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
29fa894790 [machinst x64]: add insertlane implementation 2020-09-29 08:45:12 -07:00
Pat Hickey
b10beeee01 dep gardening (#2233)
* wasmtime-profiling: latest object dep is 0.21.1

* latest gimli is 0.22

* bump cargo.lock
2020-09-26 00:49:28 -05:00
Andrew Brown
48cf45491d [machinst x64]: inform the register allocator of more types of packed moves 2020-09-25 18:59:01 -07:00
Andrew Brown
ac2bf9d246 [machinst x64]: add packed min/max implementations 2020-09-23 15:40:46 -07:00
Andrew Brown
7546d98844 [machinst x64]: add avg_round implementation 2020-09-23 15:40:46 -07:00
Andrew Brown
b202464fa0 [machinst x64]: add iabs implementation 2020-09-23 15:40:46 -07:00
Alex Crichton
5e08eb3b83 Bump wasmtime to 0.20.0 (#2222)
At the same time bump cranelift crates to 0.67.0
2020-09-23 13:54:02 -05:00
Benjamin Bouvier
79cff73da5 machinst x64: implement loads/stores for v128 SIMD types;
This made it possible to enable more SIMD tests from the spec test suite
too.
2020-09-23 16:42:03 +02:00
Jakub Krauz
bab3c73100 Put arm32 backend behind experimental_arm32 flag 2020-09-22 12:53:14 +02:00
Jakub Krauz
f6a140a662 arm32 codegen
This commit adds arm32 code generation for some IR insts.
Floating-point instructions are not supported, because regalloc
does not allow to represent overlapping register classes,
which are needed by VFP/Neon.

There is also no support for big-endianness, I64 and I128 types.
2020-09-22 12:49:42 +02:00
bjorn3
45ccc6940e Fix Switch for 128bit integers 2020-09-21 14:50:59 +02:00
Chris Fallin
1c7fa7f785 Merge pull request #2181 from jgouly/madd-opt
arm64: Combine mul + add into madd
2020-09-15 11:52:33 -07:00
Joshua Nelson
d28abad441 Upgrade to target-lexicon 0.11
This allows downstream library users to use `CDataModel` without having
to install two different versions of target-lexicon.
2020-09-15 11:40:09 -07:00
Nick Fitzgerald
e1c8878b33 cranelift_codegen::souper_harvest: Move preopt out of Context, into clif-util
This allows for more flexibility of when/where to harvest LHS candidates. For
example, we could choose to harvest candidates that overlap with and supercede
our current preopt peepholes.

This commit also makes sure that we compute the CFG before running preopt, when
harvesting LHS candidates via `clif-util souper-harvest`.
2020-09-14 16:27:47 -07:00
Nick Fitzgerald
c87aaeeece cranelift_codegen::souper_harvest: Update TODOs to include more instructions 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
b2acec1164 Harvest integer comparisons into Souper left-hand side candidates 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
5a87171121 Do not use the matches! macro so we work with older rustc versions 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
89f1e02f1f Remove executable bits from a few Rust source files 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
3a6dd832c0 Harvest left-hand side superoptimization candidates.
Given a clif function, harvest all its integer subexpressions, so that they can
be fed into [Souper](https://github.com/google/souper) as candidates for
superoptimization. For some of these candidates, Souper will successfully
synthesize a right-hand side that is equivalent but has lower cost than the
left-hand side. Then, we can combine these left- and right-hand sides into a
complete optimization, and add it to our peephole passes.

To harvest the expression that produced a given value `x`, we do a post-order
traversal of the dataflow graph starting from `x`. As we do this traversal, we
maintain a map from clif values to their translated Souper values. We stop
traversing when we reach anything that can't be translated into Souper IR: a
memory load, a float-to-int conversion, a block parameter, etc. For values
produced by these instructions, we create a Souper `var`, which is an input
variable to the optimization. For instructions that have a direct mapping into
Souper IR, we get the Souper version of each of its operands and then create the
Souper version of the instruction itself. It should now be clear why we do a
post-order traversal: we need an instruction's translated operands in order to
translate the instruction itself. Once this instruction is translated, we update
the clif-to-souper map with this new translation so that any other instruction
that uses this result as an operand has access to the translated value. When the
traversal is complete we return the translation of `x` as the root of left-hand
side candidate.
2020-09-14 16:27:47 -07:00
Johnnie Birch
07d0d32b69 Adds i64x2.mul for the new backend targeting x64 2020-09-11 13:17:42 -07:00
Joey Gouly
22369cfa0d arm64: Combine mul + add into madd
Copyright (c) 2020, Arm Limited.
2020-09-11 18:06:19 +01:00
Benjamin Bouvier
3849dc18b1 machinst x64: revamp integer immediate emission;
In particular:

- try to optimize the integer emission into a 32-bit emission, when the
high bits are all zero, and stop relying on the caller of `imm_r` to
ensure this.
- rename `Inst::imm_r`/`Inst::Imm_R` to `Inst::imm`/`Inst::Imm`.
- generate a sign-extending mov 32-bit immediate to 64-bits, whenever
possible.
- fix a few places where the previous commit did introduce the
generation of zero-constants with xor, when calling `put_input_to_reg`,
thus clobbering the flags before they were read.
2020-09-11 18:13:30 +02:00
Benjamin Bouvier
d9052d0a9c machinst x64: generate copies of constants during lowering; 2020-09-11 17:41:44 +02:00
Benjamin Bouvier
cace32746f machinst x64: pattern-match addresses that are base+cst index; 2020-09-11 17:41:44 +02:00
Benjamin Bouvier
a1bdf11602 machinst x64: fix gen_store_base_offset for multi-value returns;
The previous method assumed that this could be used only for I64 values,
but this is actually used for multi-value returns, which can have any
type.
2020-09-10 11:17:41 +02:00
Chris Fallin
bd3ba0a774 Merge pull request #2189 from bnjbvr/x64-refactor-sub
machinst x64: a few small refactorings/renamings
2020-09-09 12:40:59 -07:00
Benjamin Bouvier
b4a2dd37a4 machinst x64: rename input_to_reg to put_input_to_reg;
Eventually, we should be able to unify this function's implementation
with the aarch64 one; but the latter does much more, and this would
require abstractions brought up in another pending PR#2142.
2020-09-09 18:03:59 +02:00