Commit Graph

1391 Commits

Author SHA1 Message Date
bjorn3
9428480230 Merge SignExtendAlAh and SignExtendRaxRdx 2020-09-08 15:00:24 +02:00
bjorn3
3dcda164dc Fix nits 2020-09-08 15:00:24 +02:00
bjorn3
9999913a31 Fix sign extension
Co-authored-by: Max Graey <maxgraey@gmail.com>
2020-09-08 15:00:24 +02:00
bjorn3
067255ef45 x64: Implement rotl and rotr for small integers 2020-09-08 15:00:24 +02:00
bjorn3
4251a950ba x64: Implement ishl, ushr and sshr for small integers 2020-09-08 15:00:24 +02:00
bjorn3
cc35f1e9bb x64: Misc small integer fixes 2020-09-08 15:00:24 +02:00
bjorn3
ce033f2a0c x64: Fix udiv and sdiv for 8bit integers 2020-09-08 15:00:24 +02:00
bjorn3
74642b166f x64: Implement ineg and bnot 2020-09-08 15:00:24 +02:00
Anton Kirilov
e92f949663 AArch64: Introduce an enum for ternary integer operations
This commit performs a small cleanup in the AArch64 backend - after
the MAdd and MSub variants have been extracted, the ALUOp enum is
used purely for binary integer operations.

Also, Inst::Mov has been renamed to Inst::Mov64 for consistency.

Copyright (c) 2020, Arm Limited.
2020-09-08 13:22:22 +01:00
Johnnie Birch
a64af55cda Adds x64 packed negation for the new backend 2020-09-07 11:56:05 -07:00
Joey Gouly
650d48cd84 arm64: Don't always materialise a 64-bit constant
This improves the mov/movk/movn sequnce when the high half of the
64-bit value is all zero.

Copyright (c) 2020, Arm Limited.
2020-09-01 13:29:01 +01:00
Benjamin Bouvier
a7f7c23bf9 machinst aarch64: in baldrdash, allow returning only one value across register classes;
Baldrdash's API requires that there is at most one result in a register,
across all the possible register classes: in particular, it's not
possible to return an i64 value in a register while returning an v128
value in another register.

This patch adds a notion of "remaining register values", so this is
properly taking into account when choosing whether a return value may be
put into a register or not.
2020-08-31 12:36:26 +02:00
Julian Seward
8ac4bd1d0d CL/newBE/x64: Lowering of scalar shifts: fix shift-by-imm generation
The logic for generation of shifts-by-immediate was not quite right.  The result was that even
shifts by an amount known at compile time were being done by moving the shift immediate into %cl
and then doing a variable shift by %cl.  The effect is worse than it sounds, because all of
those shift constants are small and often used in multiple places, so they were GVN'd up and
often ended up at the entry block of the function.  Hence these were connected to the use points
by long live ranges which got spilled.  So all in all, most of the win here comes from avoiding
spilling.

The problem was caused by this line, in the `Opcode::Ishl | Opcode::Ushr ..` case:
```
   let (count, rhs) = if let Some(cst) = ctx.get_constant(inputs[1].insn) {
```
`inputs[]` appears to refer to this CLIF instruction's inputs, and bizarrely `inputs[].insn` all
refer to the instruction (the shift) itself.  Hence `ctx.get_constant(inputs[1].insn)` asks
"does this shift instruction produce a constant" to which the answer is always "no", so the
shift-by-unknown amount code is always generated.  The fix here is to change that expression to
```
   let (count, rhs) = if let Some(cst) = ctx.get_input(insn, 1).constant {
```
`get_input`'s result conveniently includes a `constant` field of type `Option<u64>`, so we just
use that instead.
2020-08-27 11:48:35 +02:00
Benjamin Bouvier
7c85654285 Address review comments. 2020-08-24 17:00:30 +02:00
Benjamin Bouvier
ee76e01efc machinst: fix the pinned reg hack;
The pinned register hack didn't work because the GetPinnedReg is marked
as having side-effects, so that GVN wouldn't try to common it out.

This commit tweaks the function used during lowering to vcode, so that
the GetPinnedReg opcode is specially handled. It's a bit lame, but it
makes the hack work again.

Also, use_input needs to be a no-op for real registers.
2020-08-24 17:00:30 +02:00
Benjamin Bouvier
efff43e769 machinst x64: fold address modes on loads/stores; 2020-08-24 17:00:30 +02:00
Benjamin Bouvier
b830ee79de machinst x64: commute operands of integer operations if one input is an immediate; 2020-08-24 17:00:30 +02:00
Benjamin Bouvier
cca10b87cb machinst x64: optimize select/brz/brnz when the input is a comparison; 2020-08-24 17:00:30 +02:00
Julian Seward
620e4b4e82 This patch fills in the missing pieces needed to support wasm atomics on newBE/x64.
It does this by providing an implementation of the CLIF instructions `AtomicRmw`, `AtomicCas`,
`AtomicLoad`, `AtomicStore` and `Fence`.

The translation is straightforward.  `AtomicCas` is translated into x64 `cmpxchg`, `AtomicLoad`
becomes a normal load because x64-TSO provides adequate sequencing, `AtomicStore` becomes a
normal store followed by `mfence`, and `Fence` becomes `mfence`.  `AtomicRmw` is the only
complex case: it becomes a normal load, followed by a loop which computes an updated value,
tries to `cmpxchg` it back to memory, and repeats if necessary.

This is a minimum-effort initial implementation.  `AtomicRmw` could be implemented more
efficiently using LOCK-prefixed integer read-modify-write instructions in the case where the old
value in memory is not required.  Subsequent work could add that, if required.

The x64 emitter has been updated to emit the new instructions, obviously.  The `LegacyPrefix`
mechanism has been revised to handle multiple prefix bytes, not just one, since it is now
sometimes necessary to emit both 0x66 (Operand Size Override) and F0 (Lock).

In the aarch64 implementation of atomics, there has been some minor renaming for the sake of
clarity, and for consistency with this x64 implementation.
2020-08-24 11:50:06 +02:00
Anton Kirilov
b895ac0e40 AArch64: Implement SIMD conversions
Copyright (c) 2020, Arm Limited.
2020-08-21 18:03:50 +01:00
Andrew Brown
2767b2efc6 machinst x64: add Inst::[move|load|store] for choosing the correct x86 instruction
This change primarily adds the ability to lower packed `[move|load|store]` instructions (the vector types were previously unimplemented), but with the addition of the utility `Inst::[move|load|store]` functions it became possible to remove duplicated code (e.g. `stack_load` and `stack_store`) and use these utility functions elsewhere (though not exhaustively).
2020-08-20 12:37:22 -07:00
Andrew Brown
cf598dc35b machinst x64: add packed moves for different vector types 2020-08-20 12:37:22 -07:00
Chris Fallin
debacec1c5 Merge pull request #2150 from jgouly/mul64s
arm64: Implement SIMD i64x2 multiply
2020-08-20 11:57:56 -07:00
Chris Fallin
051feaad75 Merge pull request #2148 from bjorn3/aarch64_fix_put_input_in_rsa
Fix put_input_in_reg
2020-08-20 11:41:35 -07:00
Chris Fallin
775dfa9df2 Merge pull request #1520 from bjorn3/aarch64-lower-small-fcvt_from_int
Lower fcvt_from_{u,s}int for 8 and 16 bit ints
2020-08-20 11:35:06 -07:00
Joey Gouly
a518c10141 arm64: Implement SIMD i64x2 multiply
Copyright (c) 2020, Arm Limited.
2020-08-20 13:26:03 +01:00
bjorn3
957eb9eeba Less unnecessary zero and sign extensions 2020-08-20 10:17:04 +02:00
bjorn3
ba48b9aef1 Fix put_input_in_reg 2020-08-19 19:38:47 +02:00
Johnnie Birch
a31336996c Add support for some packed multiplication for new x64 backend
Adds support for i32x4, and i16x8 and lowering for pmuludq in
preperation for i64x2.
2020-08-19 10:24:14 -07:00
bjorn3
4a84f3f073 Lower fcvt_from_{u,s}int for 8 and 16 bit ints 2020-08-19 18:07:12 +02:00
Chris Fallin
22181d0819 Use regalloc 0.0.30.
This upgrade pulls in one memory-allocation reduction improvement
(bytecodealliance/regalloc.rs#95). There should be no change in behavior
as a result of this.
2020-08-18 09:51:35 -07:00
Chris Fallin
5fa0be3515 AArch64 ABI: properly store full 64-bit width of extended args/retvals.
When storing an argument to a stack location for consumption by a
callee, or storing a return value to an on-stack return slot for
consumption by the caller, the ABI implementation was properly extending
the value but was then performing a store with only the original width.
This fixes the issue by always performing a 64-bit store of the extended
value.

Issue reported by @uweigand (thanks!).
2020-08-17 15:00:04 -07:00
Chris Fallin
ac6539abd7 Merge pull request #2128 from cfallin/machinst-abi-refactor-2
Refactor AArch64 ABI support to extract common bits for shared impl with x64.
2020-08-14 17:08:07 -07:00
Chris Fallin
5cf3fba3da Refactor AArch64 ABI support to extract common bits for shared impl with x64.
We have observed that the ABI implementations for AArch64 and x64 are
very similar; in fact, x64's implementation started as a modified copy
of AArch64's implementation. This is an artifact of both a similar ABI
(both machines pass args and return values in registers first, then the
stack, and both machines give considerable freedom with stack-frame
layout) and a too-low-level ABI abstraction in the existing design. For
machines that fit the mainstream or most common ABI-design idioms, we
should be able to do much better.

This commit factors AArch64 into machine-specific and
machine-independent parts, but does not yet modify x64; that will come
next.

This should be completely neutral with respect to compile time and
generated code performance.
2020-08-14 16:27:39 -07:00
Chris Fallin
3b007dd6a2 Upgrade to regalloc 0.0.29.
This upgrade pulls in several recent changes in regalloc that should
improve compile-time performance in the AArch64 and new x64 backends.
2020-08-13 11:40:34 -07:00
Johnnie Birch
38ef98700f Adds packed integer subtraction 2020-08-12 09:41:20 -07:00
Nick Fitzgerald
5af47dc4cd cranelift: Only emit stack maps when a function actually uses reference types
This fix avoids a small slow down in scenarios where reference types are enabled
but a given function doesn't actually use them.

Fixes #1883
2020-08-07 16:54:51 -07:00
Nick Fitzgerald
00de2a6ab6 Merge pull request #2110 from fitzgen/peepmatic-parse-nesting-depth
Peepmatic: Implement maximum nesting level in parser
2020-08-07 10:49:23 -07:00
Nick Fitzgerald
fdbc9e351f Merge pull request #2111 from fitzgen/rename-stackmap-to-stack-map
Rename "Stackmap" to "StackMap"
2020-08-07 10:46:38 -07:00
Nick Fitzgerald
174159a552 Bump wast to version 22.0.0 in peepmatic crates 2020-08-07 10:12:11 -07:00
Nick Fitzgerald
05bf9ea3f3 Rename "Stackmap" to "StackMap"
And "stackmap" to "stack_map".

This commit is purely mechanical.
2020-08-07 10:08:44 -07:00
Johnnie Birch
e60a6f2ad2 Fixup packed integer add lowering
Remove stray print statement
Fix bug in match statement causing unreachable code.
2020-08-06 22:25:18 -07:00
Johnnie Birch
f5909b37c3 Add emit tests for packed integer add instructions 2020-08-06 22:25:18 -07:00
Johnnie Birch
dd6ba5f9d7 Lower packed integer add instructions (v128)
Adds lowering support for packed integer add instructions and helper
function for determining if a type for an instruction indicates it is
packed.
2020-08-06 22:25:18 -07:00
Johnnie Birch
2eadc6e2a8 Add packed integer add opcodes (v128) to instruction set enum 2020-08-06 22:25:18 -07:00
Anton Kirilov
1ec6930005 Enable the spec::simd::simd_lane test for AArch64
Copyright (c) 2020, Arm Limited.
2020-08-06 11:14:15 +01:00
Andrew Brown
4cb36afd7b machinst x64: refactor to use types::[type] everywhere
This change is a pure refactoring--no change to functionality. It removes `use crate::ir::types::*` imports and uses instead `types::I32`, e.g., throughout the x64 code. Though it increases code verbosity, this change makes it more clear where the type identifiers come from (they are generated by `cranelif-codegen-meta` so without a prefix it is difficult to find their origin), avoids IDE confusion (e.g. CLion flags the un-prefixed identifiers as errors), and avoids importing unwanted identifiers into the namespace.
2020-08-05 10:45:45 -07:00
Andrew Brown
8cfff26957 machinst x64: implement floating point comparisons
Note that this fixes an encoding issue in which the packed single and packed double prefixes were flipped.
2020-08-04 13:24:38 -07:00
Andrew Brown
c21fe0eb73 machinst x64: use assert_eq! when possible 2020-08-04 09:18:45 -07:00
Andrew Brown
999e04a2c4 machinst x64: refactor imports to use rustfmt convention
This change is a pure refactoring--no change to functionality. It removes newlines between the `use ...` statements in the x64 backend so that rustfmt can format them according to its convention. I noticed some files had followed a manual convention but subsequent additions did not seem to fit; this change fixes that and lightly coalesces some of the occurrences of `use a::b; use a::c;` into `use::{b, c}`.
2020-08-04 09:17:54 -07:00