Commit Graph

177 Commits

Author SHA1 Message Date
Darin Morrison
d68437e1e6 Update SIMD tests to use hex literals 2020-03-02 08:28:59 -08:00
bjorn3
0a1bb3ba6c Add TLS support for ELF and MachO (#1174)
* Add TLS support
* Add binemit and legalize tests
* Spill all caller-saved registers when necessary
2020-02-25 17:50:04 -08:00
Andrew Brown
032e81fd6f Add x86 SIMD average rounding 2020-02-24 09:48:38 -08:00
Andrew Brown
1a9dc743d1 Infer REX prefix for SIMD load instruction 2020-02-19 09:24:05 -08:00
Andrew Brown
936120dcf9 Infer REX prefix for SIMD store and vconst instructions 2020-02-19 09:24:05 -08:00
Peter Delevoryas
18b40d1101 Add ineg legalization for scalar integer types (#1385) 2020-02-14 13:16:02 -08:00
Ryan Hunt
832666c45e Mass rename Ebb and relatives to Block (#1365)
* Manually rename BasicBlock to BlockPredecessor

BasicBlock is a pair of (Ebb, Inst) that is used to represent the
basic block subcomponent of an Ebb that is a predecessor to an Ebb.

Eventually we will be able to remove this struct, but for now it
makes sense to give it a non-conflicting name so that we can start
to transition Ebb to represent a basic block.

I have not updated any comments that refer to BasicBlock, as
eventually we will remove BlockPredecessor and replace with Block,
which is a basic block, so the comments will become correct.

* Manually rename SSABuilder block types to avoid conflict

SSABuilder has its own Block and BlockData types. These along with
associated identifier will cause conflicts in a later commit, so
they are renamed to be more verbose here.

* Automatically rename 'Ebb' to 'Block' in *.rs

* Automatically rename 'EBB' to 'block' in *.rs

* Automatically rename 'ebb' to 'block' in *.rs

* Automatically rename 'extended basic block' to 'basic block' in *.rs

* Automatically rename 'an basic block' to 'a basic block' in *.rs

* Manually update comment for `Block`

`Block`'s wikipedia article required an update.

* Automatically rename 'an `Block`' to 'a `Block`' in *.rs

* Automatically rename 'extended_basic_block' to 'basic_block' in *.rs

* Automatically rename 'ebb' to 'block' in *.clif

* Manually rename clif constant that contains 'ebb' as substring to avoid conflict

* Automatically rename filecheck uses of 'EBB' to 'BB'

'regex: EBB' -> 'regex: BB'
'$EBB' -> '$BB'

* Automatically rename 'EBB' 'Ebb' to 'block' in *.clif

* Automatically rename 'an block' to 'a block' in *.clif

* Fix broken testcase when function name length increases

Test function names are limited to 16 characters. This causes
the new longer name to be truncated and fail a filecheck test. An
outdated comment was also fixed.
2020-02-07 10:46:47 -06:00
Ryan Hunt
c360007b19 Drop 'basic-blocks' feature (#1363)
* All: Drop 'basic-blocks' feature

This makes it so that 'basic-blocks' cannot be disabled and we can
start assuming it everywhere.

* Tests: Replace non-bb filetests with bb version

* Tests: Adapt solver-fixedconflict filetests to use basic blocks
2020-01-23 22:36:06 -07:00
Ryan Hunt
fc58dd6aff Tests: Add basic test for r32, r64
This commit adds a basic test for reference types on 32/64bit systems.
 * Storing a ref type in a table
 * Loading a ref type from a table
 * Passing ref types to a function
 * Returning ref types from a function
 * `is_null` instruction
 * `is_invalid` instruction
2020-01-23 13:37:11 -06:00
Benjamin Bouvier
3125431ece Address nits from #1325 2020-01-23 09:39:49 +01:00
Sean Stangl
b4c6bfd371 When splitting a const, insert prior to the terminal branch group. (#1325)
* When splitting a const, insert prior to the terminal branch group. Closes #1159

Given code like the following, on x86_64, which does not have i128 registers:

    ebb0(v0: i64):
        v1 = iconst.i128 0
        v2 = icmp_imm eq v0, 1
        brnz v2, ebb1
        jump ebb2(v1)

It would be split to:

    ebb0(v0: i64):
        v1 = iconst.i128 0
        v2 = icmp_imm eq v0, 1
        brnz v2, ebb1
        v3, v4 = isplit.i128 v1
        jump ebb2(v3, v4)

But that fails basic-block invariants. This patch changes that to:

    ebb0(v0: i64):
        v1 = iconst.i128 0
        v2 = icmp_imm eq v0, 1
        v3, v4 = isplit.i128 v1
        brnz v2, ebb1
        jump ebb2(v3, v4)

* Add isplit-bb.clif testcase
2020-01-22 17:14:41 +01:00
jmkrauz
ae6ba1e58c Fix narrow_icmp_imm (#1343) 2020-01-21 15:20:44 +01:00
Benjamin Bouvier
dd497c19e1 Renames Settings ⚠️ (fixes #976) (#1321)
This is a breaking API change: the following settings have been renamed:

- jump_tables_enabled -> enable_jump_tables
- colocated_libcalls -> use_colocated_libcalls
- probestack_enabled -> enable_probestack
- allones_funcaddrs -> emit_all_ones_funcaddrs
2020-01-13 14:42:49 -07:00
Yury Delendik
bd88155483 Refactor unwind; add FDE support. (#1320)
* Refactor unwind

* add FDE support

* use sink directly in emit functions

* pref off all unwinding generation with feature
2020-01-13 10:32:55 -06:00
Andrew Brown
6fe86bcb61 Fix SIMD float comparison encoding (#1285)
The Intel manual uses `CMPNLT` and `CMPNLE` to denote not-less-than and not-less-than-or-equals. These were translated previously to `FloatCC::GreaterThan` and `FloatCC::GreaterThanOrEqual` but should be correctly translated to `FloatCC::UnorderedOrGreaterThanOrEqual` and `FloatCC::UnorderedOrGreaterThan`. This change adds the necessary legalizations to make use of these new encodings.
2020-01-08 09:28:05 -08:00
bjorn3
9bbe378d41 Fix master tests (#1316)
Co-authored-by: Benjamin Bouvier <public@benj.me>
2020-01-06 14:39:04 +01:00
Sean Stangl
cf9e762f16 Add a DynRex recipe type for x86, decreasing the number of recipes (#1298)
This patch adds a third mode for templates: REX inference is requestable
at template instantiation time. This reduces the number of recipes
by removing rex()/nonrex() redundancy for many instructions.
2019-12-19 15:49:34 -07:00
Andrew Brown
4433ad2858 Fix legalization of icmp ugt (#1278)
Previously, the same pattern (pmax + pcmpeq) as `uge` was used but this logic was incorrect for operands with equal values.
2019-12-16 14:14:51 -07:00
Andrew Brown
6181f20326 Fix legalization of SIMD fneg (#1286)
Previously `fsub` was used but this fails when negating -0.0 and +0.0 in the SIMD spec tests; using more instructions, this change uses shifts to create a constant for flipping the most significant bit of each lane with `bxor`.
2019-12-16 10:32:08 -08:00
Andrew Brown
0604ec480c Fix scalar_to_vector: move not wide enough for 64-bit values (#1287)
Previously, the use of `enc_x86_64` emitted two 64-bit mode encodings for `scalar_to_vector.i64`, neither of which contained the REX.W bit telling `MOVD/MOVQ` to move 64 bits of data instead of 32 bits. Now, `scalar_to_vector.i64` will always use a sole 64-bit mode REX.W encoding and `scalar_to_vector` with other widths will have three encodings: a 32-bit mode move, a 64-bit mode move with no REX, and a 64-bit mode move with REX (but not REX.W).
2019-12-16 10:17:08 -08:00
bjorn3
4cc0241f37 More i8 legalizations (#1253)
* Legalize stack_{load,store}.i8, fixes #433
* Legalize select.i8, cc: #466
* Legalize brz.i8 and brnz.i8, cc: #1117
2019-12-04 09:16:22 -08:00
krk
bc9f05e5e2 Add legalization for bitrev.i128 via narrowing, fixes #1116 (#1229) 2019-11-22 12:38:04 +01:00
bjorn3
c5e74986e1 Legalize uextend and sextend to 128bit ints 2019-11-19 14:42:21 -08:00
Andrew Brown
91d29c09d0 Add x86 SIMD floating-point absolute value 2019-11-15 13:45:25 -08:00
Andrew Brown
1f17e35e95 Add x86 SIMD immediate shifts 2019-11-15 13:45:25 -08:00
Andrew Brown
6519a43b08 Add x86 SIMD floating-point negation 2019-11-15 13:45:25 -08:00
krk
f7c7245b06 Add legalization for popcnt.i128 via narrowing, fixes #1116 2019-11-15 14:03:49 +01:00
Andrew Brown
c8eb4e9612 Add x86 SIMD floating-point arithmetic 2019-11-12 17:05:39 -08:00
Andrew Brown
04db2a9f39 Bind constant vectors to vconst; fixes #1052 (#1217) 2019-11-12 15:57:59 -08:00
Benjamin Bouvier
9080a02e10 Replace CraneStation by bytecodealliance everywhere; (#1221) 2019-11-12 10:09:31 -08:00
Andrew Brown
d32301854d Add x86 SIMD implementation of float comparison 2019-11-08 14:06:53 -08:00
Benjamin Bouvier
143cb01489 Do not align the stack frame for leaf functions not using the stack. 2019-11-08 17:20:20 +01:00
Andrew Brown
af4637aff6 Add x86 SIMD legalizations for icmp less-than 2019-11-05 16:42:34 -08:00
Andrew Brown
feffed85d2 Add x86 SIMD legalizations for integer greater-than
This includes `icmp ugt`, `icmp sge`, and `icmp uge` for vectors with lanes of I8, I16, and I32.
2019-11-05 16:42:34 -08:00
Andrew Brown
0ab5760fd7 Add x86 SIMD instructions for min and max
Only the I8, I16, and I32 versions are included since Cranelift lacks support for AVX.
2019-11-05 16:42:34 -08:00
Andrew Brown
c454c3c771 Add x86 SIMD encoding for icmp sgt 2019-11-05 16:42:34 -08:00
Andrew Brown
e3a20d67b2 Add x86 SIMD legalization of icmp ne 2019-11-05 16:42:34 -08:00
Nick Fitzgerald
a49483408c Many multi-value returns (#1147)
* Add x86 encodings for `bint` converting to `i8` and `i16`

* Introduce tests for many multi-value returns

* Support arbitrary numbers of return values

This commit implements support for returning an arbitrary number of return
values from a function. During legalization we transform multi-value signatures
to take a struct return ("sret") return pointer, instead of returning its values
in registers. Callers allocate the sret space in their stack frame and pass a
pointer to it into the caller, and once the caller returns to them, they load
the return values back out of the sret stack slot. The callee's return
operations are legalized to store the return values through the given sret
pointer.

* Keep track of old, pre-legalized signatures

When legalizing a call or return for its new legalized signature, we may need to
look at the old signature in order to figure out how to legalize the call or
return.

* Add test for multi-value returns and `call_indirect`

* Encode bool -> int x86 instructions in a loop

* Rename `Signature::uses_sret` to `Signature::uses_struct_return_param`

* Rename `p` to `param`

* Add a clarifiying comment in `num_registers_required`

* Rename `num_registers_required` to `num_return_registers_required`

* Re-add newline

* Handle already-assigned parameters in `num_return_registers_required`

* Document what some debug assertions are checking for

* Make "illegalizing" closure's control flow simpler

* Add unit tests and comments for our rounding-up-to-the-next-multiple-of-a-power-of-2 function

* Use `append_isnt_arg` instead of doing the same thing  manually

* Fix grammar in comment

* Add `Signature::uses_special_{param,return}` helper functions

* Inline the definition of `legalize_type_for_sret_load` for readability

* Move sret legalization debug assertions out into their own function

* Add `round_up_to_multiple_of_type_align` helper for readability

* Add a debug assertion that we aren't removing the wrong return value

* Rename `RetPtr` stack slots to `StructReturnSlot`

* Make `legalize_type_for_sret_store` more symmetrical to `legalized_type_for_sret`

* rustfmt

* Remove unnecessary loop labels

* Do not pre-assign offsets to struct return stack slots

Instead, let the existing frame layout algorithm decide where they should go.

* Expand "sret" into explicit "struct return" in doc comment

* typo: "than" -> "then" in comment

* Fold test's debug message into the assertion itself
2019-11-05 14:36:03 -08:00
Peter Huene
8923bac7e8 Implement emitting Windows unwind information for fastcall functions. (#1155)
* Implement emitting Windows unwind information for fastcall functions.

This commit implements emitting Windows unwind information for x64 fastcall
calling convention functions.

The unwind information can be used to construct a Windows function table at
runtime for JIT'd code, enabling stack walking and unwinding by the operating
system.

* Address code review feedback.

This commit addresses code review feedback:

* Remove unnecessary unsafe code.
* Emit the unwind information always as little endian.
* Fix comments.

A dependency from cranelift-codegen to the byteorder crate was added.
The byteorder crate is a no-dependencies crate with a reasonable
abstraction for writing binary data for a specific endianness.

* Address code review feedback.

* Disable default features for the `byteorder` crate.
* Add a comment regarding the Windows ABI unwind code numerical values.
* Panic if we encounter a Windows function with a prologue greater than 256
  bytes in size.
2019-11-05 13:14:30 -08:00
Andrew Brown
879ccf871a Add x86 SIMD vall_true
In order to implement SIMD's all_true (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#all-lanes-true), we must legalize some instruction (I chose `vall_true`) to a comparison against 0 and a similar reduction as vany_true using `PTEST` and `SETNZ`. Since `icmp` only allows integers but `vall_true` could allow more vector types, `raw_bitcast` is used to convert the lane types into integers, e.g. b32x4 to i32x4. To do so without runtime type-checking, the `raw_bitcast` instruction (which emits no instruction) can now bitcast from any vector type to the same type, e.g. i32x4 to i32x4.
2019-10-22 11:01:05 -07:00
Andrew Brown
186effc420 Add x86 SIMD vany_true and x86_ptest
In order to implement SIMD's any_true (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#any-lane-true), we must legalize some instruction (I chose `vany_true`) to a sequence of `PTEST` and `SETNZ`. To emit `PTEST` I added the new CLIF instruction `x86_ptest` and used CLIF's `trueif ne` for `SETNZ`.
2019-10-22 11:01:05 -07:00
Andrew Brown
b927c55511 Add SIMD bitselect instruction and x86 legalization
This new instructions matches the `bitselect` behavior described in the WASM SIMD spec (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#bitwise-select)
2019-10-17 15:49:29 -07:00
Andrew Brown
8f74333662 Add x86 SIMD band_not 2019-10-17 15:49:29 -07:00
Sean Stangl
fad6bb1a5c Fix build by marking tests as incompatible with basic-blocks. Closes #1152 2019-10-17 12:00:22 -07:00
Benjamin Bouvier
a3f55cdf1f Regalloc solver: check that a variable doesn't exist to test if it can be added (fixes #1123);
This situation could be triggered that can_add_var would return true
while a variable was already added for the given register.

For instance, when we have a reassignment (because of a fixed register
input requirement) and a fixed input conflict on the same fixed
register, this register will not be available in the regs_in set after
inputs_done (because of the fixed input conflict diversion) but will
have its own variable.
2019-10-17 08:42:08 -07:00
Nicolas B. Pierron
7c31ce40c4 i128-isplit-forward-jump.clif: BB conditional branches can only be followed by a jump statement. 2019-10-17 13:59:04 +02:00
Andrew Brown
f1904bffea Add x86 SIMD sshr and ushr
Only the shifts with applicable SSE2 instructions are implemented here: PSRL* (for ushr) only has 16-64 bit instructions and PSRA* (for sshr) only has 16-32 bit instructions.
2019-10-15 15:51:50 -07:00
Andrew Brown
6460fe705f Add x86 SIMD ishl
Only the shifts with applicable SSE2 instructions (i.e. 16-64 bit width) are implemented here.
2019-10-15 15:51:50 -07:00
Andrew Brown
1f728c1797 Add x86 legalization for SIMD bnot 2019-10-11 11:05:24 -07:00
Andrew Brown
dbe7dd59da Add x86 SIMD bxor 2019-10-11 11:05:24 -07:00