1252 Commits

Author SHA1 Message Date
Andrew Brown
7ef3ae2903 x64: implement vselect with variable blend instructions
This change implements `vselect` using SSE4.1's `BLENDVPS`, `BLENDVPD`,
and `PBLENDVB`. `vselect` is a lane-selecting instruction that is used
by
[simple_preopt.rs](fa1faf5d22/cranelift/codegen/src/simple_preopt.rs (L947-L999))
to lower `bitselect` to a single x86 instruction when the condition mask
is known to be boolean (all 1s or 0s, e.g., from a conversion). This is
better than `bitselect` in general, which lowers to 4-5 instructions.
The old backend had the `vselect` lowering; this simply introduces it to
the new backend.
2021-05-17 11:23:33 -07:00
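Editorial sketch for the `vselect` commit above: a scalar Rust model of why a lane select can stand in for `bitselect` when every mask lane is all 1s or all 0s. The function names and the four-lane `u32` layout are illustrative assumptions, not Cranelift code. The only hardware detail relied on is that `BLENDVPS` picks each lane from the sign bit of the corresponding mask lane.

```rust
/// Bitwise select: each result bit comes from `if_true` where the mask bit
/// is set and from `if_false` where it is clear.
fn bitselect(mask: [u32; 4], if_true: [u32; 4], if_false: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = (mask[i] & if_true[i]) | (!mask[i] & if_false[i]);
    }
    out
}

/// Lane select: a whole lane is chosen by the top bit of the mask lane,
/// modelled on how BLENDVPS reads the sign bit of each 32-bit lane.
fn vselect(mask: [u32; 4], if_true: [u32; 4], if_false: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = if mask[i] >> 31 == 1 { if_true[i] } else { if_false[i] };
    }
    out
}

fn main() {
    // A boolean mask: every lane is either all 1s or all 0s.
    let mask = [u32::MAX, 0, u32::MAX, 0];
    let a = [1, 2, 3, 4];
    let b = [10, 20, 30, 40];
    // With a boolean mask the two operations agree lane for lane.
    assert_eq!(bitselect(mask, a, b), vselect(mask, a, b));
    println!("{:?}", vselect(mask, a, b));
}
```

With a mask that is not boolean (for example `0x0000_FFFF`), the two functions diverge, which is why the rewrite is only valid when the mask is known to be boolean.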
Andrew Brown
bc0df92137 peepmatic: rebuild peephole optimizers after cranelift/meta change 2021-05-17 06:54:45 -07:00
Andrew Brown
84b6f05971 cranelift: remove unreachable scalar lowerings of saturating arithmetic
Since `uadd_sat`, `sadd_sat`, `usub_sat`, and `ssub_sat` are now only
available to vector types, this removes the lowering code for the
scalar versions of these instructions in the arm32 and aarch64 backends.
2021-05-17 06:54:45 -07:00
Andrew Brown
1fe7676831 cranelift: only allow vector types with saturating arithmetic
This fixes #2883 by restricting which types are available to the `uadd_sat`, `sadd_sat`, `usub_sat`, and `ssub_sat` IR operations.
2021-05-17 06:54:45 -07:00
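Editorial sketch for the saturating-arithmetic commit above: what lane-wise saturating operations compute, modelled with Rust's standard `saturating_add` and `saturating_sub`. The function names and the 16-lane, 8-bit shape are assumptions chosen for illustration; this is not the Cranelift lowering.

```rust
/// Unsigned saturating add on sixteen 8-bit lanes: results clamp at 255.
fn uadd_sat_lanes(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = a[i].saturating_add(b[i]);
    }
    out
}

/// Signed saturating subtract on sixteen 8-bit lanes: results clamp to
/// the range -128..=127 instead of wrapping.
fn ssub_sat_lanes(a: [i8; 16], b: [i8; 16]) -> [i8; 16] {
    let mut out = [0i8; 16];
    for i in 0..16 {
        out[i] = a[i].saturating_sub(b[i]);
    }
    out
}

fn main() {
    // 250 + 10 would wrap to 4 with ordinary addition; saturation clamps it.
    assert_eq!(uadd_sat_lanes([250; 16], [10; 16]), [255; 16]);
    // -120 - 10 would wrap to 126; saturation clamps it to -128.
    assert_eq!(ssub_sat_lanes([-120; 16], [10; 16]), [-128; 16]);
}
```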
Andrew Brown
e676589b0c x64: lower i64x2.imul to VPMULLQ when possible
This adds the machinery to encode the VPMULLQ instruction which is
available in AVX512VL and AVX512DQ. When these feature sets are
available, we use this instruction instead of a lengthy 12-instruction
sequence.
2021-05-13 20:14:05 -07:00
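Editorial sketch for the `VPMULLQ` commit above: the commit does not spell out the 12-instruction fallback, so the following only illustrates the general idea of assembling a wrapping 64-bit lane multiply from 32-bit halves, which is assumed to be the kind of work a single `VPMULLQ` replaces. The function names are hypothetical.

```rust
/// Wrapping 64-bit multiply built from 32-bit halves:
/// a * b mod 2^64 = lo(a)*lo(b) + ((hi(a)*lo(b) + lo(a)*hi(b)) << 32).
fn mul64_from_halves(a: u64, b: u64) -> u64 {
    let (a_lo, a_hi) = (a & 0xFFFF_FFFF, a >> 32);
    let (b_lo, b_hi) = (b & 0xFFFF_FFFF, b >> 32);
    // Each product of two 32-bit values fits in 64 bits; only the sum can wrap.
    let cross = (a_hi * b_lo).wrapping_add(a_lo * b_hi);
    (a_lo * b_lo).wrapping_add(cross << 32)
}

/// Lane-wise multiply of two 64-bit lanes, as `imul` on `i64x2` computes.
fn imul_i64x2(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    [mul64_from_halves(a[0], b[0]), mul64_from_halves(a[1], b[1])]
}

fn main() {
    let a = [0x1234_5678_9ABC_DEF0, u64::MAX];
    let b = [0x0FED_CBA9_8765_4321, 3];
    let expected = [a[0].wrapping_mul(b[0]), a[1].wrapping_mul(b[1])];
    assert_eq!(imul_i64x2(a, b), expected);
}
```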
Andrew Brown
c982d2be65 x64: move multiplication lowering
Since the lowering of `imul` complicated the other ALU operations it was
matched with and since future commits will alter the multiplication
lowering further, this change moves the `imul` lowering to its own match
block.
2021-05-13 20:14:05 -07:00
Andrew Brown
011e94f3fa x64: add benchmarks for EVEX encoding
This change adds a criterion-enabled benchmark, x64-evex-encoding, to
compare the performance of the builder pattern used to encode EVEX
instructions in the new x64 backend against the function pattern
used to encode EVEX instructions in the legacy x86 backend. At face
value, the results imply that the builder pattern is faster, but no
efforts were made to analyze and optimize these approaches further.
2021-05-13 10:46:08 -07:00
Andrew Brown
c89e6b2353 x64: make the x64 module public
In order to benchmark portions of the x64 module, this change makes it a
public module of cranelift-codegen.
2021-05-13 10:46:08 -07:00
Andrew Brown
02796fc670 x64: move encodings to a separate module
In order to benchmark the encoding code with criterion, the functions
and structures must be public. Moving this code to its own module
(instead of keeping it as a submodule of `inst`) allows `inst` to remain
private. This avoids having to expose and document (or ignore
documenting) the numerous instruction variants in `inst` while allowing
access to the encoding code. This commit changes no functionality.
2021-05-13 10:46:08 -07:00
Benjamin Bouvier
d7053ea9c7 Upgrade to the latest versions of gimli, addr2line, object (#2901)
* Upgrade to the latest versions of gimli, addr2line, object

And adapt to API changes. New gimli supports wasm dwarf, resulting in
some simplifications in the debug crate.

* upgrade gimli usage in linux-specific profiling too

* Add "continue" statement after interpreting a wasm local dwarf opcode
2021-05-12 10:53:17 -05:00
Chris Fallin
5fb2c8c235 Merge pull request #2874 from uweigand/s390x-backend
Support IBM z/Architecture
2021-05-10 13:53:23 -07:00
Afonso Bordado
e021995323 Allow i128 amount operands on shift instructions in the x64 backend
Fixes #2727.
2021-05-10 18:32:20 +01:00
Ulrich Weigand
89b5fc776d Support IBM z/Architecture
This adds support for the IBM z/Architecture (s390x-ibm-linux).

The status of the s390x backend in its current form is:
- Wasmtime is fully functional and passes all tests on s390x.
- All back-end features supported, with the exception of SIMD.
- There is still a lot of potential for performance improvements.
- Currently the only supported processor type is z15.
2021-05-10 16:01:16 +02:00
bjorn3
82f3ad4f1a Add comment why thiserror is not used 2021-05-04 13:51:28 +02:00
bjorn3
03fdbadfb4 Remove thiserror dependency from cranelift_codegen 2021-05-04 13:45:20 +02:00
Ulrich Weigand
e1cc1a67d5 Object file support for s390x (#2872)
Add support for s390x binary format object files.  In particular,
add support for s390x ELF relocation types (currently only
S390xPCRel32Dbl).
2021-05-03 11:50:00 -05:00
Anton Kirilov
480670e17f Enable the simd_boolean test for AArch64
Also, enable the simd_i64x2_arith2 test because it doesn't need
any code changes.

Copyright (c) 2021, Arm Limited.
2021-04-27 20:19:51 +01:00
Jubilee Young
a8c956ede1 Factor out byteorder in cranelift
This removes an existing dependency on the byteorder crate in favor of
using std equivalents directly.

While not an issue for wasmtime per se, cranelift is now part of the
critical path of building and testing Rust, and minimizing dependencies,
even small ones, can help reduce the time and bandwidth required.
2021-04-23 12:05:18 -07:00
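Editorial sketch for the byteorder commit above: the kind of std equivalents the message refers to, using the integer methods `from_le_bytes` and `to_le_bytes`. The helper names are hypothetical and do not come from the Cranelift source.

```rust
/// Read a little-endian u32 from the first four bytes of a slice.
/// Panics if the slice is shorter than four bytes.
fn read_u32_le(bytes: &[u8]) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[..4]);
    u32::from_le_bytes(buf)
}

/// Append a u32 to a byte buffer in little-endian order.
fn write_u32_le(out: &mut Vec<u8>, value: u32) {
    out.extend_from_slice(&value.to_le_bytes());
}

fn main() {
    let mut buf = Vec::new();
    write_u32_le(&mut buf, 0x1234_5678);
    // Least significant byte comes first in little-endian order.
    assert_eq!(buf, [0x78, 0x56, 0x34, 0x12]);
    assert_eq!(read_u32_le(&buf), 0x1234_5678);
}
```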
StackDoubleFlow
9637bc5a09 Fix cranelift Module and ObjectModule docs links (#2852) 2021-04-21 06:29:02 -07:00
Benjamin Bouvier
8ab3511b3b Generate unwind information on Win64 with the old backend
Following the new ABI introduced for efficient support of multiple return values, the old-backend test for generating unwind information was incomplete, resulting in no unwind information being generated and traps not being correctly caught by the runtime.
2021-04-16 18:05:49 +02:00
Benjamin Bouvier
50aa645769 cranelift: use a deferred display wrapper for logging the vcode's IR 2021-04-16 10:27:19 +02:00
Chris Fallin
03077e0de9 Merge pull request #2843 from uweigand/spillslot-fix
cranelift: Fix spillslot regression on big-endian platforms
2021-04-15 13:28:33 -07:00
Ulrich Weigand
10efe8e780 cranelift: Fix spillslot regression on big-endian platforms
PR 2840 changed the store_spillslot routine to always store
integer registers in full word size to a spill slot.  However,
the load_spillslot routine was not updated, which may cause
the contents to be reloaded in a different type.  On big-endian
systems this will fetch wrong data.

Fixed by using the same type override in load_spillslot.
2021-04-15 21:39:14 +02:00
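Editorial sketch for the spillslot commit above: a byte-level model of why storing a 32-bit value as a full 64-bit word and reloading it as 32 bits happens to work on little-endian targets but fetches the wrong half on big-endian ones. The slot is modelled as an 8-byte array and all function names are hypothetical; this is not the `store_spillslot`/`load_spillslot` code itself.

```rust
/// Store a 32-bit value into an 8-byte slot as a full 64-bit big-endian word.
fn store_full_word_be(value: u32) -> [u8; 8] {
    (value as u64).to_be_bytes()
}

/// Mismatched reload: read only 32 bits from the start of the slot.
fn load_narrow_be(slot: &[u8; 8]) -> u32 {
    u32::from_be_bytes([slot[0], slot[1], slot[2], slot[3]])
}

/// Matching reload: read the full word, then truncate.
fn load_full_word_be(slot: &[u8; 8]) -> u32 {
    u64::from_be_bytes(*slot) as u32
}

/// Little-endian counterparts, where the narrow reload happens to work.
fn store_full_word_le(value: u32) -> [u8; 8] {
    (value as u64).to_le_bytes()
}

fn load_narrow_le(slot: &[u8; 8]) -> u32 {
    u32::from_le_bytes([slot[0], slot[1], slot[2], slot[3]])
}

fn main() {
    let value = 0xDEAD_BEEF;
    // Big-endian: the low 32 bits live in the last four bytes of the slot,
    // so a narrow load from the slot's start sees only the zero upper half.
    assert_eq!(load_narrow_be(&store_full_word_be(value)), 0);
    assert_eq!(load_full_word_be(&store_full_word_be(value)), value);
    // Little-endian: the low 32 bits sit at the start, hiding the mismatch.
    assert_eq!(load_narrow_le(&store_full_word_le(value)), value);
}
```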
Andrew Brown
0acc1451ea x64: lower iabs.i64x2 using a single AVX512 instruction when possible (#2819)
* x64: add EVEX encoding mechanism

Also, includes an empty stub module for the VEX encoding.

* x64: lower abs.i64x2 to VPABSQ when available

* x64: refactor EVEX encodings to use `EvexInstruction`

This change replaces the `encode_evex` function with a builder-style struct, `EvexInstruction`. This approach clarifies the code, adds documentation, and results in slight speedups when benchmarked.

* x64: rename encoding CodeSink to ByteSink
2021-04-15 11:53:58 -07:00
Chris Fallin
36c667d58d Merge pull request #2837 from uweigand/outgoing-args
Add back support for accumulating outgoing arguments
2021-04-14 12:54:06 -07:00
Chris Fallin
fd4bfbe5a7 Merge pull request #2836 from uweigand/framesizefix
Fix frame size after unwind rework
2021-04-14 12:19:38 -07:00
Chris Fallin
1f21b32e99 Merge pull request #2838 from uweigand/optionalfp
Allow unwind support to work without a frame pointer
2021-04-14 10:58:51 -07:00
Chris Fallin
337cc47d2f Merge pull request #2840 from bnjbvr/fix-2839
cranelift: always spill i32 with i64 stores
2021-04-14 10:11:47 -07:00
Benjamin Bouvier
e7bced9512 cranelift: always spill i32 with i64 stores;
Fixes #2839. See also the issue description and the comments in this commit for
details of what the fix is about.
2021-04-14 18:08:52 +02:00
Ulrich Weigand
5904c09682 Allow unwind support to work without a frame pointer
The patch extends the unwinder to support targets that do not need
to use a dedicated frame pointer register.  Specifically, the
changes include:

- Change the "fp" routine in the RegisterMapper to return an
  *optional* frame pointer register via Option<Register>.

- On targets that choose to not define a FP register via the above
  routine, the UnwindInst::DefineNewFrame operation no longer switches
  the CFA to be defined in terms of the FP.  (The operation still can
  be used to define the location of the clobber area.)

- In addition, on targets that choose not to define a FP register, the
  UnwindInst::PushFrameRegs operation is not supported.

- There is a new operation UnwindInst::StackAlloc that needs to be
  called on targets without FP whenever the stack pointer is updated.
  This causes the CFA offset to be adjusted accordingly.  (On
  targets with FP this operation is a no-op.)
2021-04-14 15:32:31 +02:00
Ulrich Weigand
336c6369b4 Add back support for accumulating outgoing arguments
The unwind rework (commit 2d5db92a) removed support for the
feature to allow a target to allocate the space for outgoing
function arguments right in the prologue (originally added
via commit 80c2d70d). This patch adds it back.
2021-04-14 13:51:16 +02:00
Ulrich Weigand
e3bb36ba77 Fix frame size after unwind rework
After the unwind rework (commit 2d5db92a) the space used to save
clobbered registers now lies between the nominal SP and the FP.
Therefore, the size of that space should now be included in the
frame size as reported by frame_size(), since this value is used
to compute the nominal_sp_to_fp offset.
2021-04-14 13:46:08 +02:00
Chris Fallin
27b3162f87 Merge pull request #2833 from abrown/2826
x64: fix Inst::store to understand all scalar types
2021-04-13 15:36:41 -07:00
Chris Fallin
8caac9ed79 Merge pull request #2823 from akirilov-arm/callee_saves
Cranelift AArch64: Improve the handling of callee-saved registers
2021-04-13 15:35:46 -07:00
Andrew Brown
6bdef48473 x64: refactor to use Inst::store during lowering
This re-factoring replaces uses of `Inst::mov_r_m` with `Inst::store` to ensure there is only one code location to troubleshoot when generating store instructions for a specific type.
2021-04-13 13:09:07 -07:00
Andrew Brown
9b25b06d86 x64: store to all scalar sizes
Previously, `Inst::store` only understood a subset of the scalar types, which resulted in failures seen in #2826. This change allows `Inst::store` to generate instructions for all scalar widths (`8 | 16 | 32 | 64`) since all of these are supported in the emission code of `Inst::MovRM`.
2021-04-13 12:38:35 -07:00
bjorn3
b272d4b7da Fix srem.{i8,i16} 2021-04-13 21:28:27 +02:00
Anton Kirilov
7248abd591 Cranelift AArch64: Improve the handling of callee-saved registers
SIMD & FP registers are now saved and restored in pairs, similarly
to general-purpose registers. Also, only the bottom 64 bits of the
registers are saved and restored (in case of non-Baldrdash ABIs),
which is the requirement from the Procedure Call Standard for the
Arm 64-bit Architecture.

As for the callee-saved general-purpose registers, if a procedure
needs to save and restore an odd number of them, it no longer uses
store and load pair instructions for the last register.

Copyright (c) 2021, Arm Limited.
2021-04-13 20:23:08 +01:00
Andrew Brown
8e495ac79d x64: match multiple ISA requirements before emitting
Because there are instructions that are present in more than one ISA feature set, we need to see if any of the ISA requirements match before emitting. This change includes the `VPABSQ` instruction as an example, which is present in both `AVX512F` and `AVX512VL`.
2021-04-08 10:30:39 -07:00
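Editorial sketch for the ISA-requirements commit above: one way to express "emit if any listed feature set is available". Every type and function name here (`IsaFeature`, `HostFlags`, `can_emit`) is hypothetical and chosen for illustration; the actual Cranelift types are not shown in the commit message.

```rust
/// Hypothetical feature sets an instruction may belong to.
#[derive(Clone, Copy, Debug, PartialEq)]
enum IsaFeature {
    Avx512F,
    Avx512Vl,
}

/// Hypothetical record of what the target supports.
struct HostFlags {
    has_avx512f: bool,
    has_avx512vl: bool,
}

impl HostFlags {
    fn supports(&self, feature: IsaFeature) -> bool {
        match feature {
            IsaFeature::Avx512F => self.has_avx512f,
            IsaFeature::Avx512Vl => self.has_avx512vl,
        }
    }
}

/// An instruction may be emitted if ANY of its listed feature sets is present.
fn can_emit(requirements: &[IsaFeature], host: &HostFlags) -> bool {
    requirements.iter().any(|&feature| host.supports(feature))
}

fn main() {
    // The commit lists VPABSQ as present in both feature sets.
    let vpabsq = [IsaFeature::Avx512F, IsaFeature::Avx512Vl];
    let only_f = HostFlags { has_avx512f: true, has_avx512vl: false };
    let neither = HostFlags { has_avx512f: false, has_avx512vl: false };
    assert!(can_emit(&vpabsq, &only_f));
    assert!(!can_emit(&vpabsq, &neither));
}
```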
Alex Crichton
195bf0e29a Fully support multiple returns in Wasmtime (#2806)
* Fully support multiple returns in Wasmtime

For quite some time now Wasmtime has "supported" multiple return values,
but only in the most bare-bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however,
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provide the ability for Wasmtime to bind these in
Rust, meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there are lots of
internal changes, there are no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits, which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes #1178

* Tweak some names and add docs

* "fix" lightbeam compile

* Fix TODO with dummy environ

* Unwind info is a property of the target, not the ABI

* Remove lightbeam unused imports

* Attempt to fix arm64

* Document new ABIs aren't stable

* Fix filetests to use the right target

* Don't always do 64-bit stores with cranelift

This was overwriting upper bits when 32-bit registers were being stored
into return values, so fix the code inline to do a sized store instead
of one-size-fits-all store.

* At least get tests passing on the old backend

* Fix a typo

* Add some filetests with mixed abi calls

* Get `multi` example working

* Fix doctests on old x86 backend

* Add a mixture of wasmtime/system_v tests
2021-04-07 12:34:26 -05:00
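Editorial sketch for the multiple-returns commit above: the return convention it describes (the first value returned directly, the remaining N-1 written through an out-pointer passed as the last parameter), modelled in plain Rust. `extern "C"` stands in for the Wasmtime-specific ABIs here, and `div_rem` and `call_div_rem` are hypothetical names; this is not Wasmtime's actual binding code.

```rust
/// A function that logically returns two `i32` values: the quotient comes
/// back as the ordinary return value, the remainder through `rest`, an
/// out-pointer that is the last parameter.
///
/// # Safety
/// `rest` must be valid for writing one `i32`, and `b` must be non-zero.
unsafe extern "C" fn div_rem(a: i32, b: i32, rest: *mut i32) -> i32 {
    unsafe { rest.write(a % b) };
    a / b
}

/// Caller side: reserve space for the trailing results, pass a pointer to it,
/// and reassemble the tuple afterwards.
fn call_div_rem(a: i32, b: i32) -> (i32, i32) {
    let mut rest = 0i32;
    // Safety: `rest` is a live local and `b` is checked by the caller below.
    let first = unsafe { div_rem(a, b, &mut rest) };
    (first, rest)
}

fn main() {
    assert_eq!(call_div_rem(17, 5), (3, 2));
}
```

Because both sides agree on where each value lives, a convention of this shape can be written down as an ordinary function signature, which is the property the commit message says the earlier multi-return ABI lacked.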
Chris Fallin
6bec13da04 Bump versions: Wasmtime to 0.26.0, Cranelift to 0.73.0. 2021-04-05 10:48:42 -07:00
Chris Fallin
cb48ea406e Switch default to new x86_64 backend.
This PR switches the default backend on x86, for both the
`cranelift-codegen` crate and for Wasmtime, to the new
(`MachInst`-style, `VCode`-based) backend that has been under
development and testing for some time now.

The old backend is still available by default in builds with the
`old-x86-backend` feature, or by requesting `BackendVariant::Legacy`
from the appropriate APIs.

As part of that switch, it adds some more runtime-configurable plumbing
to the testing infrastructure so that tests can be run using the
appropriate backend. `clif-util test` is now capable of parsing a
backend selector option from filetests and instantiating the correct
backend.

CI has been updated so that the old x86 backend continues to run its
tests, just as we used to run the new x64 backend separately.

At some point, we will remove the old x86 backend entirely, once we are
satisfied that the new backend has not caused any unforeseen issues and
we do not need to revert.
2021-04-02 11:35:53 -07:00
Peter Huene
b7b47e380d Merge pull request #2791 from peterhuene/compile-command
Add a compile command to Wasmtime.
2021-04-02 11:18:14 -07:00
Andrew Brown
d32501c554 x64: refactor REX-specific encoding machinery to its own module
In preparation for adding new encoding modes to the x64 backend (e.g. VEX,
EVEX), this change moves all of the current instruction encoding functions to
`encodings::rex`. This refactor does not change any logic.
2021-04-02 11:17:39 -07:00
Peter Huene
0ddfe97a09 Change how flags are stored in serialized modules.
This commit changes how both the shared flags and ISA flags are stored in the
serialized module to detect incompatibilities when a serialized module is
instantiated.

It improves the error reporting when a compiled module has mismatched shared
flags.
2021-04-01 21:39:57 -07:00
Peter Huene
abf3bf29f9 Add a wasmtime settings command to print Cranelift settings.
This commit adds the `wasmtime settings` command to print out available
Cranelift settings for a target (defaults to the host).

The compile command has been updated to remove the Cranelift ISA options in
favor of encouraging users to use `wasmtime settings` to discover what settings
are available.  This will reduce the maintenance cost for syncing the compile
command with Cranelift ISA flags.
2021-04-01 19:38:19 -07:00
Peter Huene
29d366db7b Add a compile command to Wasmtime.
This commit adds a `compile` command to the Wasmtime CLI.

The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules.

With the `all-arch` feature enabled, AOT compilation can be performed for
non-native architectures (i.e. cross-compilation).

The `Module::compile` method has been added to perform AOT compilation.

A few of the CLI flags relating to "on by default" Wasm features have been
changed to be "--disable-XYZ" flags.

A simple example of using the `wasmtime compile` command:

```text
$ wasmtime compile input.wasm
$ wasmtime input.cwasm
```
2021-04-01 19:38:18 -07:00
Johnnie Birch
31d3db1ec2 Implements convert low signed integer to float for x64 simd 2021-03-26 12:13:29 -07:00
Alex Crichton
30d9164b6e Fix a number of warnings cropping up on nightly Rust (#2767)
Various small issues here and there, nothing major
2021-03-25 13:19:37 -05:00
Alex Crichton
3f694ae319 Use stable Rust on CI to test the x64 backend (#2766)
* Use stable Rust on CI to test the x64 backend

This commit leverages the newly-released 1.51.0 compiler to test the
new backend on Windows and Linux with a stable compiler instead of a
nightly compiler. This isolates the nightly build to just the nightly
documentation generation and fuzzing, both of which rely on nightly for
the best results right now.

* Use updated stable in book build job

* Run rustfmt for new stable

* Silence new warnings for wasi-nn

* Allow some dead code in the x64 backend

Looks like new rustc is better about emitting some dead-code warnings

* Update rust in peepmatic job

* Fix a test in the pooling allocator

* Remove `package.metadata.docs.rs` temporarily

Needs resolution of https://github.com/rust-lang/cargo/pull/9300 first

* Fix a warning in a wasi-nn example
2021-03-25 13:18:59 -05:00