Commit Graph

111 Commits

Author SHA1 Message Date
Chris Fallin
e694fb1312 Spectre mitigation on heap access overflow checks.
This PR adds a conditional move following a heap bounds check through
which the address to be accessed flows. This conditional move ensures
that even if the branch is mispredicted (access is actually out of
bounds, but speculation goes down in-bounds path), the acually accessed
address is zero (a NULL pointer) rather than the out-of-bounds address.

The mitigation is controlled by a flag that is off by default, but can
be set by the embedding. Note that in order to turn it on by default,
we would need to add conditional-move support to the current x86
backend; this does not appear to be present. Once the deprecated
backend is removed in favor of the new backend, IMHO we should turn
this flag on by default.

Note that the mitigation is unneccessary when we use the "huge heap"
technique on 64-bit systems, in which we allocate a range of virtual
address space such that no 32-bit offset can reach other data. Hence,
this only affects small-heap configurations.
2020-07-01 08:36:09 -07:00
Andrew Brown
c9d573d841 Provide spec-compliant legalization for SIMD floating point min/max 2020-06-25 14:48:16 -07:00
Andrew Brown
3740772176 Add encoding for x86 CVTTPS2DQ
This reuses the `x86_cvtt2si` instruction since the packed and scalar versions seem to group together well.
2020-06-18 11:39:38 -07:00
Andrew Brown
772ce73f7f Add x86_pblendw instruction
This instruction is necessary for lowering `fcvt_from_uint`.
2020-06-12 15:06:22 -07:00
Andrew Brown
546fc9ddf1 Add x86_vcvtudq2ps instruction
This instruction converts i32x4 to f32x4 in several AVX512 feature sets.
2020-06-12 15:06:22 -07:00
Andrew Brown
5db384cd76 Rename opcode: PMULLQ to VPMULLQ 2020-06-03 16:27:57 -07:00
Andrew Brown
5a32500518 Remove non-existent x86 encoding for sshr_imm.i64x2
This instruction does not exist in the SSE2 feature set; it can be added later with an VEX/EVEX encoding.
2020-06-03 16:27:57 -07:00
Andrew Brown
df171f01b5 Add x86_pmuludq
This instruction multiplies the lower 32 bits of two 64x2 unsigned integers into an i64x2; this is necessary for lowering Wasm's i64x2.mul.
2020-06-03 16:27:57 -07:00
Andrew Brown
9ba9fd0f64 Add x86-specific instruction for i64x2 multiplication
Without this special instruction, legalizing to the AVX512 instruction AND the SSE instruction sequence is impossible. This extra instruction would be rendered unnecessary by the x64 backend.
2020-06-03 16:27:57 -07:00
teapotd
e430984ac4 Improve bitselect codegen with knowledge of operand origin (#1783)
* Encode vselect using BLEND instructions on x86

* Legalize vselect to bitselect

* Optimize bitselect to vselect for some operands

* Add run tests for bitselect-vselect optimization

* Address review feedback
2020-05-29 19:53:11 -07:00
whitequark
a180b5b393 x86_32: fix stack_addr encoding.
Consider this testcase:

    target i686
    function u0:0() -> i32 system_v {
        ss0 = explicit_slot 0
    block0:
        v2 = stack_addr.i32 ss0
        return v2
    }

Before this commit, in 32-bit mode the x86 backend would generate
incorrect code for stack addresses:

     0:   55                      push    ebp
     1:   89 e5                   mov     ebp, esp
     3:   83 ec 08                sub     esp, 8
     6:   8d 44 24 00             lea     eax, [esp]
     a:   00 00                   add     byte ptr [eax], al
     c:   00 83 c4 08 5d c3       add     byte ptr [ebx - 0x3ca2f73c], al

This happened because the ModRM byte indicated a disp8 encoding, but
the instruction actually used a disp32 encoding. After this commit,
correct code is generated:

     0:   55                      push    ebp
     1:   89 e5                   mov     ebp, esp
     3:   83 ec 08                sub     esp, 8
     6:   8d 84 24 00 00 00 00    lea     eax, [esp]
     d:   83 c4 08                add     esp, 8
    10:   5d                      pop     ebp
    11:   c3                      ret
2020-05-29 09:17:36 -07:00
whitequark
880e692fd4 x86: add encoding for bnot.b1.
Fixes #1743.

Co-authored-by: iximeow <git@iximeow.net>
2020-05-28 08:43:25 -07:00
whitequark
4ec16fa057 Legalize 64 bit shifts on x86_32 using PSLLQ/PSRLQ.
Co-authored-by: iximeow <git@iximeow.net>
2020-05-09 03:28:19 -07:00
Andrew Brown
a312506262 Add x86 complex encodings for SIMD load-extend instructions 2020-04-30 11:38:01 -07:00
Andrew Brown
2048d3d30c Add x86 encodings for same-size bint conversions up to 64 bits 2020-04-30 11:21:00 -07:00
Andrew Brown
fb6e8f784d Add x86 pack instructions 2020-04-23 10:55:54 -07:00
Andrew Brown
f5fc09f64a Add x86 unpack instructions 2020-04-23 10:55:54 -07:00
Andrew Brown
65856987cd Add const_addr instruction
This new instruction calculates the effective address of a constant in the constant pool using LEA (x86).
2020-04-17 11:59:47 -07:00
Andrew Brown
d0daef6f60 Avoid infer_rex() and w() on the same x86 encoding template, resolves #1342
In cranelift x86 encodings, it seemed unintuitive to specialize Templates with both `infer_rex()`` and `w()`: if `w()` is specified, the REX.W bit must be set so a REX prefix is alway required--no need to infer it. This change forces us to write `rex().w()``--it's more explicit and shows more clearly what cranelift will emit. This change also modifies the tests that expected DynRex recipes.
2020-04-02 16:50:07 -07:00
Andrew Brown
e425bfcebd Infer REX prefixes for SIMD load and store with displacement 2020-04-02 11:28:42 -07:00
Andrew Brown
dc874a5b3b Infer REX prefixes for SIMD load_extend 2020-04-02 11:28:42 -07:00
Andrew Brown
9336884db5 Avoid inferring REX prefixes in i64 mode; fixes #1421 2020-04-02 11:28:42 -07:00
Andrew Brown
54398156ea Add x86 implementation of SIMD load_extend instructions 2020-03-31 11:35:26 -07:00
Andrew Brown
0d63bd12d8 Infer REX prefix for SIMD operations; fixes #1127
- Convert recipes to have necessary size calculator
- Add a missing binemit function, `put_dynrexmp3`
- Modify the meta-encodings of x86 SIMD instructions to use `infer_rex()`, mostly through the `enc_both_inferred()` helper
- Fix up tests that previously always emitted a REX prefix
2020-03-18 10:12:50 -07:00
Andrew Brown
e1d3930ce4 Add SIMD store_complex 2020-03-17 19:37:55 -07:00
Andrew Brown
368094a95b Add SIMD load_complex 2020-03-17 19:37:55 -07:00
Andrew Brown
bda9d7cfa6 Add SIMD copy_to_ssa 2020-03-17 19:37:55 -07:00
Andrew Brown
444d021ede Add x86 implementation of fcvt_from_sint 2020-03-17 10:52:03 -07:00
Till Schneidereit
8f824a9fc1 Update outdated references to the Cranelift repository
This patch updates or removes all references to the Cranelift repository. It affects links in README documents, issues that were transferred to the Wasmtime repository, CI badges, and a small bunch of sundry items.
2020-03-09 14:06:24 +01:00
Andrew Brown
7f7196a655 Add i64x2 integer multiplication using AVX512DQ 2020-03-06 10:53:22 -08:00
bjorn3
0a1bb3ba6c Add TLS support for ELF and MachO (#1174)
* Add TLS support
* Add binemit and legalize tests
* Spill all caller-saved registers when necessary
2020-02-25 17:50:04 -08:00
Andrew Brown
032e81fd6f Add x86 SIMD average rounding 2020-02-24 09:48:38 -08:00
Andrew Brown
1a9dc743d1 Infer REX prefix for SIMD load instruction 2020-02-19 09:24:05 -08:00
Andrew Brown
936120dcf9 Infer REX prefix for SIMD store and vconst instructions 2020-02-19 09:24:05 -08:00
Ryan Hunt
bbc0a328c7 Codegen: Allow encoding of (r32|r64).(load|store)
Accessing Wasm reference globals that are reference types will
want to use the plain load/store instructions. This commit adds
encodings for these instructions to match loading a i32/i64.
Producers of IR are required to insert the appropriate barriers
around the loads/stores.
2020-01-23 13:37:11 -06:00
Ryan Hunt
848baa0aa7 Codegen: Add ref.is_invalid instruction
Spidermonkey returns a sentinel ref value of '-1' from some VM functions
to indicate failure. This commit adds an instruction analagous to ref.is_null
that checks for this value.
2020-01-23 13:37:11 -06:00
Andrew Brown
e1d513ab4b Fix remaining clippy warnings (#1340)
* clippy: allow complex encoding function

* clippy: remove unnecessary main() function in doctest

* clippy: remove redundant `Type` suffix on LaneType enum variants

* clippy: ignore incorrect debug_assert_with_mut_call warning

* clippy: fix FDE clippy warnings
2020-01-17 14:03:30 -06:00
Benjamin Bouvier
3a4b1cc989 Split define encodings + start splitting instruction definitions (#1322)
* [meta] Split the x86 encodings define function into smaller ones;
* [meta] Start splitting instruction definitions into smaller functions;
2020-01-08 09:38:40 -08:00
bjorn3
9fcd561220 Use explicit rex for brz and brnz encodings (#1308)
Fixes #1305. This papers over the problem to prevent crashes while we investigate the cause.
2019-12-21 23:10:36 -07:00
Sean Stangl
cf9e762f16 Add a DynRex recipe type for x86, decreasing the number of recipes (#1298)
This patch adds a third mode for templates: REX inference is requestable
at template instantiation time. This reduces the number of recipes
by removing rex()/nonrex() redundancy for many instructions.
2019-12-19 15:49:34 -07:00
Andrew Brown
0604ec480c Fix scalar_to_vector: move not wide enough for 64-bit values (#1287)
Previously, the use of `enc_x86_64` emitted two 64-bit mode encodings for `scalar_to_vector.i64`, neither of which contained the REX.W bit telling `MOVD/MOVQ` to move 64 bits of data instead of 32 bits. Now, `scalar_to_vector.i64` will always use a sole 64-bit mode REX.W encoding and `scalar_to_vector` with other widths will have three encodings: a 32-bit mode move, a 64-bit mode move with no REX, and a 64-bit mode move with REX (but not REX.W).
2019-12-16 10:17:08 -08:00
Andrew Brown
1f17e35e95 Add x86 SIMD immediate shifts 2019-11-15 13:45:25 -08:00
Andrew Brown
215884e907 Simplify variable name: change inst_ to inst 2019-11-12 17:05:39 -08:00
Andrew Brown
c8eb4e9612 Add x86 SIMD floating-point arithmetic 2019-11-12 17:05:39 -08:00
Benjamin Bouvier
9080a02e10 Replace CraneStation by bytecodealliance everywhere; (#1221) 2019-11-12 10:09:31 -08:00
Andrew Brown
d32301854d Add x86 SIMD implementation of float comparison 2019-11-08 14:06:53 -08:00
Andrew Brown
0ab5760fd7 Add x86 SIMD instructions for min and max
Only the I8, I16, and I32 versions are included since Cranelift lacks support for AVX.
2019-11-05 16:42:34 -08:00
Andrew Brown
c454c3c771 Add x86 SIMD encoding for icmp sgt 2019-11-05 16:42:34 -08:00
Nick Fitzgerald
a49483408c Many multi-value returns (#1147)
* Add x86 encodings for `bint` converting to `i8` and `i16`

* Introduce tests for many multi-value returns

* Support arbitrary numbers of return values

This commit implements support for returning an arbitrary number of return
values from a function. During legalization we transform multi-value signatures
to take a struct return ("sret") return pointer, instead of returning its values
in registers. Callers allocate the sret space in their stack frame and pass a
pointer to it into the caller, and once the caller returns to them, they load
the return values back out of the sret stack slot. The callee's return
operations are legalized to store the return values through the given sret
pointer.

* Keep track of old, pre-legalized signatures

When legalizing a call or return for its new legalized signature, we may need to
look at the old signature in order to figure out how to legalize the call or
return.

* Add test for multi-value returns and `call_indirect`

* Encode bool -> int x86 instructions in a loop

* Rename `Signature::uses_sret` to `Signature::uses_struct_return_param`

* Rename `p` to `param`

* Add a clarifiying comment in `num_registers_required`

* Rename `num_registers_required` to `num_return_registers_required`

* Re-add newline

* Handle already-assigned parameters in `num_return_registers_required`

* Document what some debug assertions are checking for

* Make "illegalizing" closure's control flow simpler

* Add unit tests and comments for our rounding-up-to-the-next-multiple-of-a-power-of-2 function

* Use `append_isnt_arg` instead of doing the same thing  manually

* Fix grammar in comment

* Add `Signature::uses_special_{param,return}` helper functions

* Inline the definition of `legalize_type_for_sret_load` for readability

* Move sret legalization debug assertions out into their own function

* Add `round_up_to_multiple_of_type_align` helper for readability

* Add a debug assertion that we aren't removing the wrong return value

* Rename `RetPtr` stack slots to `StructReturnSlot`

* Make `legalize_type_for_sret_store` more symmetrical to `legalized_type_for_sret`

* rustfmt

* Remove unnecessary loop labels

* Do not pre-assign offsets to struct return stack slots

Instead, let the existing frame layout algorithm decide where they should go.

* Expand "sret" into explicit "struct return" in doc comment

* typo: "than" -> "then" in comment

* Fold test's debug message into the assertion itself
2019-11-05 14:36:03 -08:00
Andrew Brown
f37d1c7ecc Simplify binding of IntCC::Equals to SIMD icmp; fixes #1150 2019-10-28 11:09:37 -07:00