Commit Graph

237 Commits

Author SHA1 Message Date
Ayomide Bamidele
e4dc9c7944 Update Intel x86 CPU presets to match LLVM (#5490)
* Update Intel x86 CPU presets

* Add LLVM reference

* Remove 32bit CPU architectures

* Rename silvermont to slm

* Fix haswell presets

* Add icelake alias

* Group streaming simd presets

* Add slm silvermont preset

* Remove duplicate alderlake def
2023-01-13 21:02:36 +00:00
Afonso Bordado
f845ebb450 cranelift: Remove is_pic predicate from x86 backend (#5548)
This is still present as shared flags and we don't use the predicate anywhere.
2023-01-09 10:04:45 -08:00
Afonso Bordado
52ba72f341 riscv64: Fix masking on iabs (#5505)
* cranelift: Add `iabs.i128` runtest

* riscv64: Fix incorrect extension in iabs

When lowering iabs, we were accidentally comparing the unextended value
this caused the instruction to misbehave with certain top bits.

This commit also adds a zbb lowering that does not use jumps.
2023-01-03 17:37:25 -08:00
yuyang-ok
cdecc858b4 add riscv64 backend for cranelift. (#4271)
Add a RISC-V 64 (`riscv64`, RV64GC) backend.

Co-authored-by: yuyang <756445638@qq.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
Co-authored-by: Afonso Bordado <afonsobordado@az8.co>
2022-09-27 17:30:31 -07:00
Anton Kirilov
d8b290898c Initial forward-edge CFI implementation (#3693)
* Initial forward-edge CFI implementation

Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.

Copyright (c) 2022, Arm Limited.

* Refactor `from_artifacts` to avoid second `make_executable` (#1)

This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.

* Address the code review feedback

Copyright (c) 2022, Arm Limited.

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-09-08 09:35:58 -05:00
Anton Kirilov
a897742593 Initial back-edge CFI implementation (#3606)
Give the user the option to sign and to authenticate function
return addresses with the operations introduced by the Pointer
Authentication extension to the Arm instruction set architecture.

Copyright (c) 2021, Arm Limited.
2022-08-03 11:08:29 -07:00
Chris Fallin
43f1765272 Cranellift: remove Baldrdash support and related features. (#4571)
* Cranellift: remove Baldrdash support and related features.

As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team
has recently determined that their current form of integration with
Cranelift is too hard to maintain, and they have chosen to remove it
from their codebase. If and when they decide to build updated support
for Cranelift, they will adopt different approaches to several details
of the integration.

In the meantime, after discussion with the SpiderMonkey folks, they
agree that it makes sense to remove the bits of Cranelift that exist
to support the integration ("Baldrdash"), as they will not need
them. Many of these bits are difficult-to-maintain special cases that
are not actually tested in Cranelift proper: for example, the
Baldrdash integration required Cranelift to emit function bodies
without prologues/epilogues, and instead communicate very precise
information about the expected frame size and layout, then stitched
together something post-facto. This was brittle and caused a lot of
incidental complexity ("fallthrough returns", the resulting special
logic in block-ordering); this is just one example. As another
example, one particular Baldrdash ABI variant processed stack args in
reverse order, so our ABI code had to support both traversal
orders. We had a number of other Baldrdash-specific settings as well
that did various special things.

This PR removes Baldrdash ABI support, the `fallthrough_return`
instruction, and pulls some threads to remove now-unused bits as a
result of those two, with the  understanding that the SpiderMonkey folks
will build new functionality as needed in the future and we can perhaps
find cleaner abstractions to make it all work.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425

* Review feedback.

* Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.

The debugger tests invoke `wasmtime` from within each test case under
the control of a debugger (gdb or lldb). Some of these tests started to
inexplicably fail in CI with unrelated changes, and the failures were
only inconsistently reproducible locally. It seems to be cache related:
if we disable cached compilation on the nested `wasmtime` invocations,
the tests consistently pass.

* Review feedback.
2022-08-02 19:37:56 +00:00
Afonso Bordado
02c3b47db2 x64: Implement SIMD fma (#4474)
* x64: Add VEX Instruction Encoder

This uses a similar builder pattern to the EVEX Encoder.
Does not yet support memory accesses.

* x64: Add FMA Flag

* x64: Implement SIMD `fma`

* x64: Use 4 register Vex Inst

* x64: Reorder VEX pretty print args
2022-07-25 22:01:02 +00:00
Alex Crichton
709f7e0c8a Enable SSE 4.2 unconditionally (#3833)
* Enable SSE 4.2 unconditionally

Fuzzing over the weekend found that `i64x2` comparison operators
require `pcmpgtq` which is an SSE 4.2 instruction. Along the lines of #3816
this commit unconditionally enables and requires SSE 4.2 for compilation
and fuzzing. It will no longer be possible to create a compiler for
x86_64 with simd enabled if SSE 4.2 is disabled.

* Update comment
2022-02-22 13:23:51 -06:00
Chris Fallin
1c014d129a Cranelift: ensure ISA level needed for SIMD is present when SIMD is enabled. (#3816)
Addresses #3809: when we are asked to create a Cranelift backend with
shared flags that indicate support for SIMD, we should check that the
ISA level needed for our SIMD lowerings is present.
2022-02-16 17:29:30 -08:00
Chris Fallin
ca0e8d0a1d Remove incomplete/unmaintained ARM32 backend (for now). (#3799)
In #3721, we have been discussing what to do about the ARM32 backend in
Cranelift. Currently, this backend supports only 32-bit types, which is
insufficient for full Wasm-MVP; it's missing other critical bits, like
floating-point support; and it has only ever been exercised, AFAIK, via
the filetests for the individual CLIF instructions that are implemented.

We were very very thankful for the original contribution of this
backend, even in its partial state, and we had hoped at the time that we
could eventually mature it in-tree until it supported e.g. Wasm and
other use-cases. But that hasn't yet happened -- to the blame of no-one,
to be clear, we just haven't had a contributor with sufficient time.

Unfortunately, the existence of the backend and lack of active
maintainer now potentially pose a bit of a burden as we hope to make
continuing changes to the backend framework. For example, the ISLE
migration, and the use of regalloc2 that it will allow, would need all
of the existing lowering patterns in the hand-written ARM32 backend to
be rewritten as ISLE rules.

Given that we don't currently have the resources to do this, we think
it's probably best if we, sadly, for now remove this partial backend.
This is not in any way a statement of what we might accept in the
future, though. If, in the future, an ARM32 backend updated to our
latest codebase with an active maintainer were to appear, we'd be happy
to merge it (and likewise for any other architecture!). But for now,
this is probably the best path. Thanks again to the original contributor
@jmkrauz and we hope that this work can eventually be brought back and
reused if someone has the time to do so!
2022-02-14 15:03:52 -08:00
bjorn3
d590e6bc1f Remove x86 old-backend special case from cranelift-codegen-meta 2021-09-30 18:29:49 +02:00
bjorn3
eb01ba1ed1 Flatten directory structure for cranelift_codegen_meta::isa 2021-09-30 18:29:49 +02:00
bjorn3
3e4167ba95 Remove registers from cranelift-codegen-meta 2021-09-29 16:27:47 +02:00
bjorn3
18bd27e90b Remove legalizer support from cranelift-codegen-meta 2021-09-29 16:27:45 +02:00
bjorn3
59e18b7d1b Remove the old riscv backend 2021-09-29 16:23:57 +02:00
bjorn3
9e34df33b9 Remove the old x86 backend 2021-09-29 16:13:46 +02:00
Andrew Brown
26c78c06ef refactor: remove unused field
PR #3131 fixed the failing builds by allowing this field to be dead.
After looking at it further the field is not being used and can be
removedi completely.
2021-07-30 10:58:37 -07:00
bjorn3
1415dd824a Remove all dsl legalizations for arm32, arm64 and s390x 2021-06-20 18:43:45 +02:00
Ulrich Weigand
def54fb1fa s390x: Add z14 support
* Add support for processor features (including auto-detection).

* Move base architecture set requirement back to z14.

* Add z15 feature sets and re-enable z15-specific code generation
  when required features are available.
2021-06-17 10:23:15 +02:00
Andrew Brown
2a9f458ea3 x64: lower i8x16.shuffle to VPERMI2B when possible
When shuffling values from two different registers, the x64 lowering for
`i8x16.shuffle` must first shuffle each register separately and then OR
the results with SSE instructions. With `VPERMI2B`, available in
AVX512VL + AVX512VBMI, this can be done in a single instruction after
the shuffle mask has been moved into the destination register. This
change uses `VPERMI2B` for that case when the CPU supports it.
2021-06-01 11:40:53 -07:00
Andrew Brown
459fce3467 x64: lower i8x16.popcnt to VPOPCNTB when possible
When AVX512VL or AVX512BITALG are available, Wasm SIMD's `popcnt`
instruction can be lowered to a single x64 instruction, `VPOPCNTB`,
instead of 8+ instructions.
2021-05-25 12:16:25 -07:00
Ulrich Weigand
89b5fc776d Support IBM z/Architecture
This adds support for the IBM z/Architecture (s390x-ibm-linux).

The status of the s390x backend in its current form is:
- Wasmtime is fully functional and passes all tests on s390x.
- All back-end features supported, with the exception of SIMD.
- There is still a lot of potential for performance improvements.
- Currently the only supported processor type is z15.
2021-05-10 16:01:16 +02:00
Peter Huene
abf3bf29f9 Add a wasmtime settings command to print Cranelift settings.
This commit adds the `wasmtime settings` command to print out available
Cranelift settings for a target (defaults to the host).

The compile command has been updated to remove the Cranelift ISA options in
favor of encouraging users to use `wasmtime settings` to discover what settings
are available.  This will reduce the maintenance cost for syncing the compile
command with Cranelift ISA flags.
2021-04-01 19:38:19 -07:00
Peter Huene
29d366db7b Add a compile command to Wasmtime.
This commit adds a `compile` command to the Wasmtime CLI.

The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules.

With the `all-arch` feature enabled, AOT compilation can be performed for
non-native architectures (i.e. cross-compilation).

The `Module::compile` method has been added to perform AOT compilation.

A few of the CLI flags relating to "on by default" Wasm features have been
changed to be "--disable-XYZ" flags.

A simple example of using the `wasmtime compile` command:

```text
$ wasmtime compile input.wasm
$ wasmtime input.cwasm
```
2021-04-01 19:38:18 -07:00
Anton Kirilov
07c27039b1 Cranelift AArch64: Add initial support for the Armv8.1 atomics
This commit enables Cranelift's AArch64 backend to generate code
for instruction set extensions (previously only the base Armv8-A
architecture was supported); also, it makes it possible to detect
the extensions supported by the host when JIT compiling. The new
functionality is applied to the IR instruction `AtomicCas`.

Copyright (c) 2021, Arm Limited.
2021-03-13 02:31:51 +00:00
Alex Crichton
09b976e1d5 Fix a number of warnings on nightly Rust (#2652)
This fixes some issues that are cropping up where some syntax will get
phased out in 2021
2021-02-11 12:42:45 -06:00
Yury Delendik
2964023a77 [SIMD][x86_64] Add encoding for PMADDWD (#2530)
* [SIMD][x86_64] Add encoding for PMADDWD

* also for "experimental_x64"
2020-12-24 07:52:50 -06:00
bjorn3
8f7f8ee0b4 Fix iconst.i8 0 miscompilation 2020-12-12 09:44:05 +01:00
Chris Fallin
39b5736727 Remove LoadSplat opcode, in preparation for pattern-matching Load+Splat.
This was added as an incremental step to improve AArch64 code quality in
PR #2278. At the time, we did not have a way to pattern-match the load +
splat opcode sequence that the relevant Wasm opcodes lowered to.
However, now with PR #2366, we can merge effectful instructions such as
loads into other ops, and so we can do this pattern matching directly.
The pattern-matching update will come in a subsequent commit.
2020-11-16 15:31:56 -08:00
Anton Kirilov
e0b911a4df Introduce the Cranelift IR instruction LoadSplat
It corresponds to WebAssembly's `load*_splat` operations, which
were previously represented as a combination of `Load` and `Splat`
instructions. However, there are architectures such as Armv8-A
that have a single machine instruction equivalent to the Wasm
operations. In order to generate it, it is necessary to merge the
`Load` and the `Splat` in the backend, which is not possible
because the load may have side effects. The new IR instruction
works around this limitation.

The AArch64 backend leverages the new instruction to improve code
generation.

Copyright (c) 2020, Arm Limited.
2020-10-14 13:07:13 +01:00
Nick Fitzgerald
05bf9ea3f3 Rename "Stackmap" to "StackMap"
And "stackmap" to "stack_map".

This commit is purely mechanical.
2020-08-07 10:08:44 -07:00
bjorn3
7b7b1f4997 Rename sarg__ to sarg_t 2020-07-17 12:03:17 +02:00
bjorn3
4431ac1108 Implement SystemV struct argument passing 2020-07-17 12:03:17 +02:00
Andrew Brown
f0b083c6ad Legalize [u|s]widen_high for x86
Use `x86_palignr` and `[u|s]widen_low` for legalizing this instruction.
2020-07-15 11:32:08 -07:00
Andrew Brown
c8ddf8a34c Encode [u|s]widen_low for x86 2020-07-15 11:32:08 -07:00
Andrew Brown
fafef7db77 Add x86_palignr instructions
This instruction is necessary for implementing `[s|u]widen_high`.
2020-07-15 11:32:08 -07:00
Benjamin Bouvier
abf157bd69 machinst x64: Only use the feature flag to enable the x64 new backend;
Before this patch, running the x64 new backend would require both
compiling with --features experimental_x64 and running with
`use_new_backend`.

This patches changes this behavior so that the runtime flag is not
needed anymore: using the feature flag will enforce usage of the new
backend everywhere, making using and testing it much simpler:

    cargo run --features experimental_x64 ;; other CLI options/flags

This also gives a hint at what the meta language generation would look
like after switching to the new backend.

Compiling only with the x64 codegen flag gives a nice compile time speedup.
2020-07-15 13:11:28 +02:00
Andrew Brown
c5a69cee9f Add x86 legalization for fcvt_to_uint_sat.i32x4
This converts an `f32x4` into an `i32x4` (unsigned) with rounding by using a long sequence of SSE4.1 compatible instructions.
2020-07-08 10:20:01 -07:00
Andrew Brown
057c93b64e Add unarrow instruction with x86 implementation
Adds a shared `unarrow` instruction in order to lower the Wasm SIMD specification's unsigned narrowing (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing). Additionally, this commit implements the instruction for x86 using PACKUSWB and PACKUSDW for the applicable encodings.
2020-07-02 09:35:45 -07:00
Andrew Brown
65e6de2344 Replace x86_packss with snarrow
Since the Wasm specification contains narrowing instructions (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing) that lower to PACKSS*, the x86-specific instruction is not necessary in the CLIF IR.
2020-07-02 09:35:45 -07:00
Chris Fallin
a351fa52b5 Merge pull request #1930 from cfallin/spectre-heap
Spectre mitigation on heap access overflow checks.
2020-07-01 09:23:04 -07:00
Chris Fallin
e694fb1312 Spectre mitigation on heap access overflow checks.
This PR adds a conditional move following a heap bounds check through
which the address to be accessed flows. This conditional move ensures
that even if the branch is mispredicted (access is actually out of
bounds, but speculation goes down in-bounds path), the acually accessed
address is zero (a NULL pointer) rather than the out-of-bounds address.

The mitigation is controlled by a flag that is off by default, but can
be set by the embedding. Note that in order to turn it on by default,
we would need to add conditional-move support to the current x86
backend; this does not appear to be present. Once the deprecated
backend is removed in favor of the new backend, IMHO we should turn
this flag on by default.

Note that the mitigation is unneccessary when we use the "huge heap"
technique on 64-bit systems, in which we allocate a range of virtual
address space such that no 32-bit offset can reach other data. Hence,
this only affects small-heap configurations.
2020-07-01 08:36:09 -07:00
Andrew Brown
737cf1d605 Implement iabs for x86 SIMD
This only covers the types necessary for implementing the Wasm SIMD spec--`i8x16`, `i16x8`, `i32x4`.
2020-06-30 14:00:17 -07:00
Nick Fitzgerald
8c5f59c0cf wasmtime: Implement table.get and table.set
These instructions have fast, inline JIT paths for the common cases, and only
call out to host VM functions for the slow paths. This required some changes to
`cranelift-wasm`'s `FuncEnvironment`: instead of taking a `FuncCursor` to insert
an instruction sequence within the current basic block,
`FuncEnvironment::translate_table_{get,set}` now take a `&mut FunctionBuilder`
so that they can create whole new basic blocks. This is necessary for
implementing GC read/write barriers that involve branching (e.g. checking for
null, or whether a store buffer is at capacity).

Furthermore, it required that the `load`, `load_complex`, and `store`
instructions handle loading and storing through an `r{32,64}` rather than just
`i{32,64}` addresses. This involved making `r{32,64}` types acceptable
instantiations of the `iAddr` type variable, plus a few new instruction
encodings.

Part of #929
2020-06-30 12:00:57 -07:00
Andrew Brown
c9d573d841 Provide spec-compliant legalization for SIMD floating point min/max 2020-06-25 14:48:16 -07:00
Andrew Brown
3675f95bb2 Legalize fcvt_to_sint_sat.i32x4 on x86
Use a lengthy sequence involving CVTTPS2DQ to quiet NaNs and saturate overflow.
2020-06-18 11:39:38 -07:00
Andrew Brown
3740772176 Add encoding for x86 CVTTPS2DQ
This reuses the `x86_cvtt2si` instruction since the packed and scalar versions seem to group together well.
2020-06-18 11:39:38 -07:00
Andrew Brown
01d34e71b9 Add x86 legalization for fcvt_from_uint.f32x4
This converts an `i32x4` into an `f32x4` with some rounding either by using an AVX512VL/F instruction--VCVTUDQ2PS--or a long sequence of SSE4.1 compatible instructions.
2020-06-12 15:06:22 -07:00
Andrew Brown
23ed48f269 Add AVX512F flag 2020-06-12 15:06:22 -07:00