Commit Graph

56 Commits

Author SHA1 Message Date
Andrew Brown
c8cce5d2d7 [machinst x64]: enable packed saturated arithmetic 2020-10-08 08:46:20 -07:00
Benjamin Bouvier
a470f1e0cd machinst x64: remove dead code and allow(dead_code) annotation;
The BranchTarget is always used as a label, so just use a plain
MachLabel in this case.
2020-10-08 10:05:57 +02:00
Chris Fallin
71768bb6cf Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs.
This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Fixes #2254.
2020-10-06 14:44:02 -07:00
Andrew Brown
50b9399006 [machinst x64]: lower remaining lane operations--any_true, all_true, splat 2020-10-02 08:29:31 -07:00
Andrew Brown
0579e9f9de [machinst x64]: add packed OR 2020-10-02 08:29:31 -07:00
Andrew Brown
74226d6781 [machinst x64]: add integer comparisons 2020-10-02 08:29:31 -07:00
Andrew Brown
4484a00ea5 [machinst x64]: calculate extension modes in one place 2020-09-29 14:48:59 -07:00
Andrew Brown
f50d905152 [machinst x64]: refactor using added RegMem::from(Writable<Reg>) 2020-09-29 08:45:12 -07:00
Andrew Brown
050f078f86 [machinst x64]: add saturating addition implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
a64abf9b76 [machinst x64]: add shuffle implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
f4836f9ca9 [machinst x64]: add extractlane implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
29fa894790 [machinst x64]: add insertlane implementation 2020-09-29 08:45:12 -07:00
Andrew Brown
ac2bf9d246 [machinst x64]: add packed min/max implementations 2020-09-23 15:40:46 -07:00
Andrew Brown
7546d98844 [machinst x64]: add avg_round implementation 2020-09-23 15:40:46 -07:00
Andrew Brown
b202464fa0 [machinst x64]: add iabs implementation 2020-09-23 15:40:46 -07:00
Benjamin Bouvier
3849dc18b1 machinst x64: revamp integer immediate emission;
In particular:

- try to optimize the integer emission into a 32-bit emission, when the
high bits are all zero, and stop relying on the caller of `imm_r` to
ensure this.
- rename `Inst::imm_r`/`Inst::Imm_R` to `Inst::imm`/`Inst::Imm`.
- generate a sign-extending mov 32-bit immediate to 64-bits, whenever
possible.
- fix a few places where the previous commit did introduce the
generation of zero-constants with xor, when calling `put_input_to_reg`,
thus clobbering the flags before they were read.
2020-09-11 18:13:30 +02:00
Johnnie Birch
a64af55cda Adds x64 packed negation for the new backend 2020-09-07 11:56:05 -07:00
Benjamin Bouvier
cca10b87cb machinst x64: optimize select/brz/brnz when the input is a comparison; 2020-08-24 17:00:30 +02:00
Julian Seward
620e4b4e82 This patch fills in the missing pieces needed to support wasm atomics on newBE/x64.
It does this by providing an implementation of the CLIF instructions `AtomicRmw`, `AtomicCas`,
`AtomicLoad`, `AtomicStore` and `Fence`.

The translation is straightforward.  `AtomicCas` is translated into x64 `cmpxchg`, `AtomicLoad`
becomes a normal load because x64-TSO provides adequate sequencing, `AtomicStore` becomes a
normal store followed by `mfence`, and `Fence` becomes `mfence`.  `AtomicRmw` is the only
complex case: it becomes a normal load, followed by a loop which computes an updated value,
tries to `cmpxchg` it back to memory, and repeats if necessary.

This is a minimum-effort initial implementation.  `AtomicRmw` could be implemented more
efficiently using LOCK-prefixed integer read-modify-write instructions in the case where the old
value in memory is not required.  Subsequent work could add that, if required.

The x64 emitter has been updated to emit the new instructions, obviously.  The `LegacyPrefix`
mechanism has been revised to handle multiple prefix bytes, not just one, since it is now
sometimes necessary to emit both 0x66 (Operand Size Override) and F0 (Lock).

In the aarch64 implementation of atomics, there has been some minor renaming for the sake of
clarity, and for consistency with this x64 implementation.
2020-08-24 11:50:06 +02:00
Andrew Brown
2767b2efc6 machinst x64: add Inst::[move|load|store] for choosing the correct x86 instruction
This change primarily adds the ability to lower packed `[move|load|store]` instructions (the vector types were previously unimplemented), but with the addition of the utility `Inst::[move|load|store]` functions it became possible to remove duplicated code (e.g. `stack_load` and `stack_store`) and use these utility functions elsewhere (though not exhaustively).
2020-08-20 12:37:22 -07:00
Andrew Brown
cf598dc35b machinst x64: add packed moves for different vector types 2020-08-20 12:37:22 -07:00
Johnnie Birch
a31336996c Add support for some packed multiplication for new x64 backend
Adds support for i32x4, and i16x8 and lowering for pmuludq in
preperation for i64x2.
2020-08-19 10:24:14 -07:00
Johnnie Birch
38ef98700f Adds packed integer subtraction 2020-08-12 09:41:20 -07:00
Johnnie Birch
2eadc6e2a8 Add packed integer add opcodes (v128) to instruction set enum 2020-08-06 22:25:18 -07:00
Andrew Brown
999e04a2c4 machinst x64: refactor imports to use rustfmt convention
This change is a pure refactoring--no change to functionality. It removes newlines between the `use ...` statements in the x64 backend so that rustfmt can format them according to its convention. I noticed some files had followed a manual convention but subsequent additions did not seem to fit; this change fixes that and lightly coalesces some of the occurrences of `use a::b; use a::c;` into `use::{b, c}`.
2020-08-04 09:17:54 -07:00
Benjamin Bouvier
e108f14620 machinst x64: use xor/xorpss/xorpd to generate zero constants; 2020-07-31 13:17:52 -07:00
Andrew Brown
c74a9d1225 machinst x64: add packed shifts 2020-07-30 14:16:12 -07:00
Andrew Brown
0398033447 machinst x64: add packed FP comparisons
Re-orders the SseOpcode variants alphabetically.
2020-07-30 14:16:12 -07:00
Andrew Brown
e3bd8d696b machinst x64: add basic packed FP arithmetic
Includes instruction definition of packed min/max.
2020-07-30 14:16:12 -07:00
Andrew Brown
77cc2f69c1 machinst x64: allow use of vector-length types 2020-07-30 14:16:12 -07:00
Benjamin Bouvier
987c616bf5 machinst x64: implement support for dynamic heaps and explicit bound checks; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
aa103698d4 machinst x64: extend Copysign to work for f64 inputs too; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
48ec806a9d machinst x64: implement Fabs/Fneg in terms of other instructions; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
03b9e1e86a machinst x64: implement float min/max with the right semantics; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
cd54f05efd machinst x64: implement float-to-int and int-to-float conversions; 2020-07-24 19:29:12 +02:00
Johnnie Birch
a7cedf3100 Add support for 32 bit and 64 bit fcmp for the new backend
Implements commiss and commisd.
2020-07-17 13:46:54 -07:00
Benjamin Bouvier
ead8a835c4 machinst x64: add more FP support 2020-07-17 15:56:44 +02:00
Benjamin Bouvier
bab337fc32 Address review comments; 2020-07-16 18:21:06 +02:00
Benjamin Bouvier
5a55646fc3 machinst x64: support out-of-bounds memory accesses; 2020-07-16 18:21:06 +02:00
Benjamin Bouvier
d9310e8d90 machinst x64: fix checked div sequence
- it should mark as clobbering (def) rdx, not modifying it
- the signed-div check requires a temporary to compare against int64_min
2020-07-16 18:21:06 +02:00
Benjamin Bouvier
6f5403a94b machinst x64: lower Ctz using the Bsf x86 instruction 2020-07-16 18:21:06 +02:00
Benjamin Bouvier
ec2209665a machinst x64: implement bsr and lower Clz; 2020-07-16 18:21:06 +02:00
Benjamin Bouvier
571061fe4c machinst x64: add support for rotations; 2020-07-16 18:21:06 +02:00
Benjamin Bouvier
9d5be00de4 Address review comments
- put the division in the synthetic instruction as well,
- put the branch table check in the inst's emission code,
- replace OneWayCondJmp by TrapIf vcode instruction,
- add comments describing code generated by the synthetic instructions
2020-07-03 14:33:52 +02:00
Benjamin Bouvier
2115e70acb machinst x64: implement enough to support branch tables; 2020-07-03 14:33:52 +02:00
Benjamin Bouvier
f86ecdcb86 machinst x64: lower and implement div/idiv; ADD TESTS 2020-07-03 14:33:52 +02:00
Benjamin Bouvier
c9a3f05afd machinst x64: implement calls and int cmp/store/loads;
This makes it possible to run a simple recursive fibonacci function in
wasmtime.
2020-06-25 16:20:33 +02:00
Johnnie Birch
2d364f75bd Remove xmm_r_r inst data structure and cases after related refactoring
Removes unneeded data structure that was holding instructions for
xmm based move instructions. These instructions can should be categorized
as rm not just r. This change is intended to simplify organization and
cases when lowering.
2020-06-25 14:31:51 +02:00
Johnnie Birch
f2f7706265 Implements vcode lowering for f32.copysign.
This patch implements the required but not already available
x64 instructions for copysign as well as the actual lowering sequence
and tests for the newly implemented x64 instructions.
Those instructions include:

andps,
andnps,
movaps,
movd,
orps,

The lowering sequence is based on the lowering for f32.copysign
in the current cranelift backend. movd does not have a test yet
due to some logic needed express a 32-bit register as a source
for xmm_rm_r instructions. This code also begins some
rethinking/refactoring of how the sse move instuctions
are written and so also includes new emit cases that will replace
current ones that match a different enum used to describe sse moves.
2020-06-24 11:47:26 -07:00
Johnnie Birch
043571fee0 Adds f32.mul, f32.div for vcode backend for x64.
Adds support for lowering clif instructions Fdiv and Fmul
for new vcode backend. Misc adds lowering and test for
sqrtss and removes a redundant to_string() func for the
SseOpcode struct.
2020-06-17 17:19:57 -07:00