986 Commits

Author SHA1 Message Date
Nick Fitzgerald
c015d69eb8 peepmatic: Do not use paths in linear IR
Rather than using paths from the root instruction to the instruction we are
matching against or checking if it is constant or whatever, use temporary
variables. When we successfully match an instruction's opcode, we simultaneously
define these temporaries for the instruction's operands. This is similar to how
open-coding these matches in Rust would use `match` expressions with pattern
matching to bind the operands to variables at the same time.

This saves about 1.8% of instructions retired when Peepmatic is enabled.
2020-10-13 11:03:48 -07:00
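The difference between path-based lookup and operand temporaries can be sketched in plain Rust. This is only an illustration of the idea described in the commit message above: the `Inst` type, `at_path`, `is_mul_by_one_paths`, and `mul_by_one_temps` are hypothetical names invented for this sketch and are not Peepmatic's actual linear IR or API.

```rust
// Hypothetical, simplified instruction tree (an assumption for this sketch;
// not Peepmatic's real data structures).
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    Const(i64),
    Add(Box<Inst>, Box<Inst>),
    Mul(Box<Inst>, Box<Inst>),
}

/// Path-based access: every query re-walks the tree from the root,
/// following a list of operand indices.
fn at_path<'a>(root: &'a Inst, path: &[usize]) -> Option<&'a Inst> {
    let mut cur = root;
    for &i in path {
        cur = match (cur, i) {
            (Inst::Add(a, _), 0) | (Inst::Mul(a, _), 0) => &**a,
            (Inst::Add(_, b), 1) | (Inst::Mul(_, b), 1) => &**b,
            _ => return None,
        };
    }
    Some(cur)
}

/// Recognize `x * 1` using paths: one check for the opcode, then a
/// separate walk from the root to inspect operand 1.
fn is_mul_by_one_paths(root: &Inst) -> bool {
    let is_mul = match root {
        Inst::Mul(..) => true,
        _ => false,
    };
    is_mul && at_path(root, &[1]) == Some(&Inst::Const(1))
}

/// Recognize `x * 1` using temporaries: matching the opcode binds the
/// operands (`lhs`, `rhs`) at the same time, so no re-walk is needed.
/// Returns the `x` operand on success.
fn mul_by_one_temps(root: &Inst) -> Option<&Inst> {
    match root {
        Inst::Mul(lhs, rhs) => match **rhs {
            Inst::Const(1) => Some(&**lhs),
            _ => None,
        },
        _ => None,
    }
}
```

In the second version the operands are available as soon as the opcode check succeeds, which is the property the commit message attributes to temporaries.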
Andrew Brown
4484a00ea5 [machinst x64]: calculate extension modes in one place 2020-09-29 14:48:59 -07:00
Andrew Brown
715be68101 [machinst x64]: assert lane is correct size for extractlane
This change applies a good suggestion @bjorn3 made in #2230 that I forgot to implement there.
2020-09-29 09:34:22 -07:00
Andrew Brown
f50d905152 [machinst x64]: refactor using added RegMem::from(Writable<Reg>) 2020-09-29 08:45:12 -07:00
Andrew Brown
e3eb098c99 [machinst x64]: add swizzle implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
050f078f86 [machinst x64]: add saturating addition implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
a64abf9b76 [machinst x64]: add shuffle implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
f4836f9ca9 [machinst x64]: add extractlane implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
29fa894790 [machinst x64]: add insertlane implementation 2020-09-29 08:45:12 -07:00
Pat Hickey
b10beeee01 dep gardening (#2233)
* wasmtime-profiling: latest object dep is 0.21.1

* latest gimli is 0.22

* bump cargo.lock
2020-09-26 00:49:28 -05:00
Andrew Brown
48cf45491d [machinst x64]: inform the register allocator of more types of packed moves 2020-09-25 18:59:01 -07:00
Andrew Brown
ac2bf9d246 [machinst x64]: add packed min/max implementations 2020-09-23 15:40:46 -07:00
Andrew Brown
7546d98844 [machinst x64]: add avg_round implementation 2020-09-23 15:40:46 -07:00
Andrew Brown
b202464fa0 [machinst x64]: add iabs implementation 2020-09-23 15:40:46 -07:00
Alex Crichton
5e08eb3b83 Bump wasmtime to 0.20.0 (#2222)
At the same time bump cranelift crates to 0.67.0
2020-09-23 13:54:02 -05:00
Benjamin Bouvier
79cff73da5 machinst x64: implement loads/stores for v128 SIMD types;
This made it possible to enable more SIMD tests from the spec test suite
too.
2020-09-23 16:42:03 +02:00
Jakub Krauz
bab3c73100 Put arm32 backend behind experimental_arm32 flag 2020-09-22 12:53:14 +02:00
Jakub Krauz
f6a140a662 arm32 codegen
This commit adds arm32 code generation for some IR insts.
Floating-point instructions are not supported, because regalloc
cannot represent overlapping register classes,
which are needed by VFP/Neon.

There is also no support for big-endianness, I64 and I128 types.
2020-09-22 12:49:42 +02:00
bjorn3
45ccc6940e Fix Switch for 128bit integers 2020-09-21 14:50:59 +02:00
Chris Fallin
1c7fa7f785 Merge pull request #2181 from jgouly/madd-opt
arm64: Combine mul + add into madd
2020-09-15 11:52:33 -07:00
Joshua Nelson
d28abad441 Upgrade to target-lexicon 0.11
This allows downstream library users to use `CDataModel` without having
to install two different versions of target-lexicon.
2020-09-15 11:40:09 -07:00
Nick Fitzgerald
e1c8878b33 cranelift_codegen::souper_harvest: Move preopt out of Context, into clif-util
This allows for more flexibility of when/where to harvest LHS candidates. For
example, we could choose to harvest candidates that overlap with and supersede
our current preopt peepholes.

This commit also makes sure that we compute the CFG before running preopt, when
harvesting LHS candidates via `clif-util souper-harvest`.
2020-09-14 16:27:47 -07:00
Nick Fitzgerald
c87aaeeece cranelift_codegen::souper_harvest: Update TODOs to include more instructions 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
b2acec1164 Harvest integer comparisons into Souper left-hand side candidates 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
5a87171121 Do not use the matches! macro so we work with older rustc versions 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
89f1e02f1f Remove executable bits from a few Rust source files 2020-09-14 16:27:47 -07:00
Nick Fitzgerald
3a6dd832c0 Harvest left-hand side superoptimization candidates.
Given a clif function, harvest all its integer subexpressions, so that they can
be fed into [Souper](https://github.com/google/souper) as candidates for
superoptimization. For some of these candidates, Souper will successfully
synthesize a right-hand side that is equivalent but has lower cost than the
left-hand side. Then, we can combine these left- and right-hand sides into a
complete optimization, and add it to our peephole passes.

To harvest the expression that produced a given value `x`, we do a post-order
traversal of the dataflow graph starting from `x`. As we do this traversal, we
maintain a map from clif values to their translated Souper values. We stop
traversing when we reach anything that can't be translated into Souper IR: a
memory load, a float-to-int conversion, a block parameter, etc. For values
produced by these instructions, we create a Souper `var`, which is an input
variable to the optimization. For instructions that have a direct mapping into
Souper IR, we get the Souper version of each of its operands and then create the
Souper version of the instruction itself. It should now be clear why we do a
post-order traversal: we need an instruction's translated operands in order to
translate the instruction itself. Once this instruction is translated, we update
the clif-to-souper map with this new translation so that any other instruction
that uses this result as an operand has access to the translated value. When the
traversal is complete, we return the translation of `x` as the root of the
left-hand side candidate.
2020-09-14 16:27:47 -07:00
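The post-order harvesting described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from `cranelift_codegen::souper_harvest`: the `Def` enum, `Harvester`, and `harvest` are hypothetical, every value is assumed to be `i32`, the output is Souper-like text rather than a real Souper AST, and recursion stands in for whatever traversal the actual implementation uses.

```rust
use std::collections::HashMap;

type Value = usize;

// Hypothetical mini data-flow graph: `defs[v]` says how value `v` is defined.
// This is an assumption for the sketch, not Cranelift's `DataFlowGraph`.
enum Def {
    Iconst(i64),
    Iadd(Value, Value),
    Imul(Value, Value),
    Load,  // memory load: no Souper equivalent, becomes an input `var`
    Param, // block parameter: likewise becomes an input `var`
}

struct Harvester<'a> {
    defs: &'a [Def],
    map: HashMap<Value, String>, // clif value -> translated Souper operand
    lines: Vec<String>,          // emitted Souper-like statements
}

impl<'a> Harvester<'a> {
    fn new(defs: &'a [Def]) -> Self {
        Harvester { defs, map: HashMap::new(), lines: Vec::new() }
    }

    // Emit a new statement and return the name it defines.
    fn fresh(&mut self, rhs: String) -> String {
        let name = format!("%{}", self.lines.len());
        self.lines.push(format!("{}:i32 = {}", name, rhs));
        name
    }

    // Post-order: operands are translated before the instruction itself.
    fn translate(&mut self, v: Value) -> String {
        if let Some(s) = self.map.get(&v) {
            return s.clone(); // already translated; reuse it
        }
        let defs = self.defs;
        let s = match defs[v] {
            Def::Iconst(c) => format!("{}:i32", c),
            Def::Iadd(a, b) => {
                let (a, b) = (self.translate(a), self.translate(b));
                self.fresh(format!("add {}, {}", a, b))
            }
            Def::Imul(a, b) => {
                let (a, b) = (self.translate(a), self.translate(b));
                self.fresh(format!("mul {}, {}", a, b))
            }
            // Traversal stops here: untranslatable values become inputs.
            Def::Load | Def::Param => self.fresh("var".to_string()),
        };
        self.map.insert(v, s.clone());
        s
    }
}

// Harvest the expression that produced `root` as a left-hand side candidate.
fn harvest(defs: &[Def], root: Value) -> Vec<String> {
    let mut h = Harvester::new(defs);
    let r = h.translate(root);
    h.lines.push(format!("infer {}", r));
    h.lines
}
```

For a graph where `v0` is a block parameter, `v1 = iconst 1`, `v2 = iadd v0, v1`, and `v3 = imul v2, v2`, harvesting `v3` translates `v2` once and reuses it from the map for the second operand.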
Johnnie Birch
07d0d32b69 Adds i64x2.mul for the new backend targeting x64 2020-09-11 13:17:42 -07:00
Joey Gouly
22369cfa0d arm64: Combine mul + add into madd
Copyright (c) 2020, Arm Limited.
2020-09-11 18:06:19 +01:00
Benjamin Bouvier
3849dc18b1 machinst x64: revamp integer immediate emission;
In particular:

- try to optimize the integer emission into a 32-bit emission, when the
high bits are all zero, and stop relying on the caller of `imm_r` to
ensure this.
- rename `Inst::imm_r`/`Inst::Imm_R` to `Inst::imm`/`Inst::Imm`.
- generate a sign-extending mov 32-bit immediate to 64-bits, whenever
possible.
- fix a few places where the previous commit introduced the
generation of zero-constants with xor when calling `put_input_to_reg`,
thus clobbering the flags before they were read.
2020-09-11 18:13:30 +02:00
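The first and third bullets of the commit message above amount to picking the narrowest encoding for a 64-bit immediate. A rough sketch of that decision, assuming the usual x86-64 behavior that a 32-bit register write zero-extends into the upper half and that a 64-bit `mov` with a 32-bit immediate sign-extends it, might look like this. `ImmEncoding` and `choose_imm_encoding` are hypothetical names for illustration; they are not the backend's `Inst::imm` API, and the order of the checks is an assumption.

```rust
// Hypothetical classification of how a 64-bit constant could be materialized.
#[derive(Debug, PartialEq)]
enum ImmEncoding {
    /// High 32 bits are all zero: a 32-bit move suffices, because writing a
    /// 32-bit register zero-extends into the full 64-bit register.
    Mov32ZeroExtended(u32),
    /// Value equals its low 32 bits sign-extended to 64 bits.
    Mov64SignExtended(i32),
    /// Anything else needs the full 64-bit immediate.
    Mov64Full(u64),
}

fn choose_imm_encoding(value: u64) -> ImmEncoding {
    if value <= u64::from(u32::max_value()) {
        ImmEncoding::Mov32ZeroExtended(value as u32)
    } else if (value as i64) == i64::from(value as i64 as i32) {
        // Truncating to 32 bits and sign-extending back loses nothing.
        ImmEncoding::Mov64SignExtended(value as i64 as i32)
    } else {
        ImmEncoding::Mov64Full(value)
    }
}
```

Doing this check inside the emission helper, rather than at each call site, matches the commit's stated goal of no longer relying on the caller to ensure the high bits are zero.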
Benjamin Bouvier
d9052d0a9c machinst x64: generate copies of constants during lowering; 2020-09-11 17:41:44 +02:00
Benjamin Bouvier
cace32746f machinst x64: pattern-match addresses that are base+cst index; 2020-09-11 17:41:44 +02:00
Benjamin Bouvier
a1bdf11602 machinst x64: fix gen_store_base_offset for multi-value returns;
The previous method assumed that this could be used only for I64 values,
but this is actually used for multi-value returns, which can have any
type.
2020-09-10 11:17:41 +02:00
Chris Fallin
bd3ba0a774 Merge pull request #2189 from bnjbvr/x64-refactor-sub
machinst x64: a few small refactorings/renamings
2020-09-09 12:40:59 -07:00
Benjamin Bouvier
b4a2dd37a4 machinst x64: rename input_to_reg to put_input_to_reg;
Eventually, we should be able to unify this function's implementation
with the aarch64 one; but the latter does much more, and this would
require abstractions brought up in another pending PR, #2142.
2020-09-09 18:03:59 +02:00
Benjamin Bouvier
cb96d16ac7 machinst x64: inline helper used only once; 2020-09-09 18:03:59 +02:00
Benjamin Bouvier
7a833f442a machinst: common up some instruction data helpers; 2020-09-09 18:03:59 +02:00
Benjamin Bouvier
a835c247c0 machinst: make get_output_reg target independent; 2020-09-09 18:03:59 +02:00
Benjamin Bouvier
6a3c4fb54e machinst x64: rename output_to_reg to get_output_reg; 2020-09-09 18:03:59 +02:00
Benjamin Bouvier
9620ce6bdf machinst x64: mask shift count too; 2020-09-09 18:03:59 +02:00
Benjamin Bouvier
9c328cc64b machinst x64: Remove unfinished comment; 2020-09-09 18:03:59 +02:00
Anton Kirilov
f612e8e7b2 AArch64: Add various missing SIMD bits
In addition, improve the code for stack pointer manipulation.

Copyright (c) 2020, Arm Limited.
2020-09-09 13:37:50 +01:00
Chris Fallin
e8f772c1ac x64 new backend: port ABI implementation to shared infrastructure with AArch64.
Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

This also changes some register allocations in the aarch64 code because
call support in the shared ABI infra now passes a temp vreg in, rather
than requiring use of a fixed, non-allocable temp; tests have been
updated, and the runtime behavior is unchanged.
2020-09-08 17:59:01 -07:00
Chris Fallin
3d6c4d312f Merge pull request #2187 from akirilov-arm/ALUOp3
AArch64: Introduce an enum for ternary integer operations
2020-09-08 12:57:59 -07:00
Chris Fallin
e913bcb26a Merge pull request #2179 from jgouly/mvn
arm64: Don't always materialise a 64-bit constant
2020-09-08 09:17:08 -07:00
bjorn3
9428480230 Merge SignExtendAlAh and SignExtendRaxRdx 2020-09-08 15:00:24 +02:00
bjorn3
3dcda164dc Fix nits 2020-09-08 15:00:24 +02:00
bjorn3
9999913a31 Fix sign extension
Co-authored-by: Max Graey <maxgraey@gmail.com>
2020-09-08 15:00:24 +02:00
bjorn3
067255ef45 x64: Implement rotl and rotr for small integers 2020-09-08 15:00:24 +02:00
bjorn3
4251a950ba x64: Implement ishl, ushr and sshr for small integers 2020-09-08 15:00:24 +02:00