125 Commits

Author SHA1 Message Date
Afonso Bordado
3a4ebd7727 cranelift: Deduplicate match_imm functions
Transforming this into a generic function is proving to be a challenge
since most of the necessary methods are not in a trait. We also need to
cast between the signed and unsigned types, which is difficult to do
in a generic function.

This can be solved for example by adding the num crate as a dependency.
But adding a dependency just to solve this issue seems a bit much.
2021-09-19 15:03:46 +01:00
Afonso Bordado
8115e7252d cranelift: Add support for i128 immediates in parser 2021-09-19 15:02:04 +01:00
Nick Fitzgerald
a1f4b46f64 Bump Wasmtime to version 0.30.0; cranelift to 0.77.0 2021-09-17 10:33:50 -07:00
dheaton-arm
8f057e0482 Implement SaddSat and SsubSat for the interpreter
Implemented `SaddSat` and `SsubSat` to add and subtract signed vector
values, saturating at the type boundaries rather than overflowing.
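
As a rough illustration of the saturating behaviour described above, the sketch below applies lane-wise signed saturating add and subtract to `i8x16` values using Rust's standard `saturating_add` and `saturating_sub`. The function names and the fixed 16-lane `i8` shape are assumptions made for the example; this is not the interpreter's actual code.

```rust
/// Lane-wise signed saturating add over two 16-lane i8 vectors.
/// (Hypothetical helper; the lane type and count are chosen for illustration.)
fn sadd_sat_i8x16(a: [i8; 16], b: [i8; 16]) -> [i8; 16] {
    let mut out = [0i8; 16];
    for i in 0..16 {
        // Clamps at i8::MIN / i8::MAX instead of wrapping around.
        out[i] = a[i].saturating_add(b[i]);
    }
    out
}

/// Lane-wise signed saturating subtract over two 16-lane i8 vectors.
fn ssub_sat_i8x16(a: [i8; 16], b: [i8; 16]) -> [i8; 16] {
    let mut out = [0i8; 16];
    for i in 0..16 {
        out[i] = a[i].saturating_sub(b[i]);
    }
    out
}
```

With these helpers, adding 1 to a lane holding `i8::MAX` leaves it at `i8::MAX`, and subtracting 1 from `i8::MIN` leaves it at `i8::MIN`.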

Changed the parser to allow signed `i8` immediates in vectors as part of
this work; fixes #3276.

Copyright (c) 2021, Arm Limited.
2021-09-03 11:35:39 +01:00
Afonso Bordado
f4ff7c350a cranelift: Add heap support to filetest infrastructure (#3154)
* cranelift: Add heap support to filetest infrastructure

* cranelift: Explicit heap pointer placement in filetest annotations

* cranelift: Add documentation about the Heap directive

* cranelift: Clarify that heap filetests pointers must be laid out sequentially

* cranelift: Use wrapping add when computing bound pointer

* cranelift: Better error messages when invalid signatures are found for heap file tests.
2021-08-24 09:28:41 -07:00
Chris Fallin
a13a777230 Bump to Wasmtime v0.29.0 and Cranelift 0.76.0. 2021-08-02 11:24:09 -07:00
Alex Crichton
e8b8947956 Bump to 0.28.0 (#2972) 2021-06-09 14:00:13 -05:00
Chris Fallin
88455007b2 Bump Wasmtime to v0.27.0 and Cranelift to v0.74.0. 2021-05-20 14:06:41 -07:00
bjorn3
84c79982e7 Remove unnecessary dependencies
Found using cargo-udeps
2021-05-04 13:51:26 +02:00
Chris Fallin
6bec13da04 Bump versions: Wasmtime to 0.26.0, Cranelift to 0.73.0. 2021-04-05 10:48:42 -07:00
Chris Fallin
cb48ea406e Switch default to new x86_64 backend.
This PR switches the default backend on x86, for both the
`cranelift-codegen` crate and for Wasmtime, to the new
(`MachInst`-style, `VCode`-based) backend that has been under
development and testing for some time now.

The old backend is still available by default in builds with the
`old-x86-backend` feature, or by requesting `BackendVariant::Legacy`
from the appropriate APIs.

As part of that switch, it adds some more runtime-configurable plumbing
to the testing infrastructure so that tests can be run using the
appropriate backend. `clif-util test` is now capable of parsing a
backend selector option from filetests and instantiating the correct
backend.

CI has been updated so that the old x86 backend continues to run its
tests, just as we used to run the new x64 backend separately.

At some point, we will remove the old x86 backend entirely, once we are
satisfied that the new backend has not caused any unforeseen issues and
we do not need to revert.
2021-04-02 11:35:53 -07:00
Benjamin Bouvier
6e6713ae0b cranelift: add support for the Mac aarch64 calling convention
This bumps target-lexicon and adds support for the AppleAarch64 calling
convention. Specifically for WebAssembly support, we only have to worry
about the new stack slots convention. Stack slots don't need to be at
least 8 bytes; they can be as small as the data type's size. For
instance, if we need stack slots for (i32, i32), they can be located at
offsets (+0, +4). Note that they still need to be properly aligned on
the data type they're containing, though, so if we need stack slots for
(i32, i64), we can't start the i64 slot at the +4 offset (it must start
at the +8 offset).
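
The layout rule above can be sketched as follows. This assumes every argument's alignment equals its size and that sizes are powers of two; `packed_slot_offsets` and `align_up` are hypothetical names, not functions from the Cranelift ABI code.

```rust
/// Round `offset` up to the next multiple of `align` (`align` must be a power of two).
fn align_up(offset: u32, align: u32) -> u32 {
    (offset + align - 1) & !(align - 1)
}

/// Compute stack-slot offsets where each slot is only as large as its type
/// but still aligned to that type's size.
fn packed_slot_offsets(sizes: &[u32]) -> Vec<u32> {
    let mut next = 0;
    let mut offsets = Vec::with_capacity(sizes.len());
    for &size in sizes {
        let offset = align_up(next, size);
        offsets.push(offset);
        next = offset + size;
    }
    offsets
}
```

For sizes `[4, 4]` (two `i32`s) this yields offsets `[0, 4]`, and for `[4, 8]` (`i32`, `i64`) it yields `[0, 8]`, matching the two cases in the message.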

Added one test that was failing on the Mac M1, as well as other tests
stressing different yet similar situations.
2021-03-22 10:06:13 +01:00
Nick Fitzgerald
d081ef9c2e Bump Wasmtime to 0.25.0; Cranelift to 0.72.0 2021-03-16 11:02:56 -07:00
Chris Fallin
2d5db92a9e Rework/simplify unwind infrastructure and implement Windows unwind.
Our previous implementation of unwind infrastructure was somewhat
complex and brittle: it parsed generated instructions in order to
reverse-engineer unwind info from prologues. It also relied on some
fragile linkage to communicate instruction-layout information that VCode
was not designed to provide.

A much simpler, more reliable, and easier-to-reason-about approach is to
embed unwind directives as pseudo-instructions in the prologue as we
generate it. That way, we can say what we mean and just emit it
directly.

The usual reasoning that leads to the reverse-engineering approach is
that metadata is hard to keep in sync across optimization passes; but
here, (i) prologues are generated at the very end of the pipeline, and
(ii) if we ever do a post-prologue-gen optimization, we can treat unwind
directives as black boxes with unknown side-effects, just as we do for
some other pseudo-instructions today.

It turns out that it was easier to just build this for both x64 and
aarch64 (since they share a factored-out ABI implementation), and wire
up the platform-specific unwind-info generation for Windows and SystemV.
Now we have simpler unwind on all platforms and we can delete the old
unwind infra as soon as we remove the old backend.

There were a few consequences to supporting Fastcall unwind in
particular that led to a refactor of the common ABI. Windows only
supports naming clobbered-register save locations within 240 bytes of
the frame-pointer register, whatever one chooses that to be (RSP or
RBP). We had previously saved clobbers below the fixed frame (and below
nominal-SP). The 240-byte range has to include the old RBP too, so we're
forced to place clobbers at the top of the frame, just below saved
RBP/RIP. This is fine; we always keep a frame pointer anyway because we
use it to refer to stack args. It does mean that offsets of fixed-frame
slots (spillslots, stackslots) from RBP are no longer known before we do
regalloc, so if we ever want to index these off of RBP rather than
nominal-SP because we add support for `alloca` (dynamic frame growth),
then we'll need a "nominal-BP" mode that is resolved after regalloc and
clobber-save code is generated. I added a comment to this effect in
`abi_impl.rs`.

The above refactor touched both x64 and aarch64 because of shared code.
This had a further effect in that the old aarch64 prologue generation
subtracted from `sp` once to allocate space, then used stores to `[sp,
offset]` to save clobbers. Unfortunately the offset only has 7-bit
range, so if there are enough clobbered registers (and there can be --
aarch64 has 384 bytes of registers; at least one unit test hits this)
the stores/loads will be out-of-range. I really don't want to synthesize
large-offset sequences here; better to go back to the simpler
pre-index/post-index `stp r1, r2, [sp, #-16]` form that works just like
a "push". It's likely not much worse microarchitecturally (dependence
chain on SP, but oh well) and it actually saves an instruction if
there's no other frame to allocate. As a further advantage, it's much
simpler to understand; simpler is usually better.
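
To make the offset limitation concrete, here is a small sketch of a range check for the signed-offset form of `stp` with 64-bit registers, where the 7-bit immediate is signed and scaled by 8 (so byte offsets run from -512 to 504 in steps of 8). The function name is hypothetical and this is not the backend's emission code.

```rust
/// Returns true if `offset` (in bytes) can be encoded directly in a 64-bit
/// `stp`/`ldp` signed-offset form: a multiple of 8 whose scaled value fits
/// in a signed 7-bit field.
fn fits_stp_x_offset(offset: i64) -> bool {
    offset % 8 == 0 && (-64..=63).contains(&(offset / 8))
}
```

An offset of 504 is encodable while 512 is not, which is why a large enough clobber-save area cannot be addressed from a single `sp` adjustment.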

This PR adds the new backend on Windows to CI as well.
2021-03-11 20:03:52 -08:00
Dan Gohman
8854dec01d Bump version to 0.24.0
I used a specially modified version of the publish script to avoid
bumping the `witx` version.
2021-03-04 18:17:03 -08:00
Dan Gohman
8d90ea0390 Bump version to 0.23.0
I used a specially modified version of the publish script to avoid
bumping the `witx` version.
2021-02-17 15:35:43 -08:00
bjorn3
81b4e48f9f Remove some uses of riscv in tests (#2600)
* Remove some uses of riscv in tests

* Fix typo

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Benjamin Bouvier <public@benj.me>
2021-01-30 23:54:48 +01:00
Andrew Brown
2adb0e8964 security: upgrade smallvec to 1.6.1
Fixes advisory https://rustsec.org/advisories/RUSTSEC-2021-0003.
2021-01-08 16:54:54 -08:00
Nick Fitzgerald
5ad82de3c5 Bump Wasmtime to 0.22.0; Cranelift to 0.69.0 2021-01-07 14:51:12 -08:00
Andrew Brown
c9e8889d47 Update clippy annotation to use latest version (#2375) 2020-11-09 09:24:59 -06:00
Alex Crichton
ab1958434a Bump to 0.21.0 (#2359) 2020-11-05 09:39:53 -06:00
Andrew Brown
3778fa025c Switch DataValue to use Ieee32/Ieee64
As discussed in #2251, in order to be very confident that NaN signaling bits are correctly handled by the compiler, this switches `DataValue` to use Cranelift's `Ieee32` and `Ieee64` structures. This makes it a bit more inconvenient to interpret Cranelift FP operations, but this should change to something like `rustc_apfloat` in the future.
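
The motivation can be illustrated with a minimal bit-pattern wrapper. `Bits32` below is a hypothetical stand-in written for this example, not Cranelift's `Ieee32` API; it shows why storing the raw bits lets two different NaN encodings be told apart, which comparing native `f32` values cannot do.

```rust
/// Hypothetical wrapper that stores a 32-bit float as its raw bit pattern.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Bits32(u32);

impl Bits32 {
    /// Exponent all ones and a non-zero mantissa means NaN.
    fn is_nan(self) -> bool {
        (self.0 & 0x7f80_0000) == 0x7f80_0000 && (self.0 & 0x007f_ffff) != 0
    }

    /// A NaN is quiet when the top mantissa bit is set, signaling otherwise.
    fn is_signaling_nan(self) -> bool {
        self.is_nan() && (self.0 & 0x0040_0000) == 0
    }
}
```

Here `Bits32(0x7fc0_0000)` (a quiet NaN) and `Bits32(0x7fa0_0001)` (a signaling NaN) are both NaN yet compare unequal, so a test can assert the exact pattern it expects.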
2020-10-07 12:17:17 -07:00
Andrew Brown
6f6f79ef2b refactor: move DataValue from cranelift-reader to cranelift-codegen
This makes no change to functionality; the move is necessary in order to return InstructionData immediates in a structured way (see next commit).
2020-10-07 12:17:17 -07:00
Benjamin Bouvier
e2c286deeb machinst x64: enable clif testing
This adds a new feature experimental_x64 for CLIF tests.

A test is run in the new x64 backend iff:

- either the test doesn't have an x86_64 target requirement, signaling
it must be target agnostic or not run on this target.
- or the test does require the x86_64 target, and the test is marked
with the `experimental_x64` feature.

This required one workaround in the parser. The reason is that the
parser will try to use information not provided by the TargetIsa adapter
for the Mach backends, like register names. In particular, parsing a test
may fail before the test runner realizes that the test must not be run.
In this case, we early return an almost-empty TestFile from the parser,
under the same conditions as above, so that the caller may filter out
the test properly.

This also copies two tests from the test suite using the new backend,
for demonstration purposes.
2020-09-25 11:12:21 +02:00
Alex Crichton
5e08eb3b83 Bump wasmtime to 0.20.0 (#2222)
At the same time bump cranelift crates to 0.67.0
2020-09-23 13:54:02 -05:00
Joshua Nelson
d28abad441 Upgrade to target-lexicon 0.11
This allows downstream library users to use `CDataModel` without having
to install two different versions of target-lexicon.
2020-09-15 11:40:09 -07:00
Nick Fitzgerald
31cbbd1d20 clif-util: Use anyhow::Error for errors instead of String
Also does the same for `cranelift-filetests`.
2020-09-14 18:29:00 -07:00
bjorn3
0c4e15a52e [reader] Replace == None with .is_none() in Parser::token
This replaces a full function call with matching on both lhs and rhs
with a single cmpb instruction.
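
A minimal sketch of the two spellings, using a hypothetical `Token` type rather than the reader's real one. Both functions return the same answer; the difference is that `== None` goes through the `PartialEq` implementation for `Option<Token>`, while `is_none()` only inspects the discriminant.

```rust
// Hypothetical token type; the real lexer's token enum differs.
#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
}

/// Comparison form: requires `Token: PartialEq`.
fn lookahead_empty_eq(lookahead: &Option<Token>) -> bool {
    *lookahead == None
}

/// Preferred form: needs no trait bound on the payload.
fn lookahead_empty(lookahead: &Option<Token>) -> bool {
    lookahead.is_none()
}
```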
2020-08-26 13:01:16 -07:00
bjorn3
0d3f9ad8ef [reader] Avoid handling of unicode when not necessary
Clif files are not meant to be written by end-users anyway. The main
effects are that non-ascii identifiers fail to lex instead of parse and
whitespace must now be in the ascii range. Comments still have full
unicode support.
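
The whitespace change can be sketched with the standard `char::is_ascii_whitespace` method. The helper name is an assumption for this example, and the exact set of characters the lexer accepts may differ from what this method matches.

```rust
/// Skip leading ASCII whitespace only; non-ASCII whitespace is left in place
/// so a later lexing step can reject it.
fn skip_ascii_whitespace(input: &str) -> &str {
    input.trim_start_matches(|c: char| c.is_ascii_whitespace())
}
```

With this, `" \t\nv0"` is trimmed to `"v0"`, while input starting with U+2003 (EM SPACE) is returned unchanged even though `char::is_whitespace` would treat that character as whitespace.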

This also inlines all char::is_* methods to avoid nested matches.

Overall this results in a slight reduction of instruction count.
2020-08-26 13:01:16 -07:00
Julian Seward
25e31739a6 Implement Wasm Atomics for Cranelift/newBE/aarch64.
The implementation is pretty straightforward.  Wasm atomic instructions fall
into 5 groups

* atomic read-modify-write
* atomic compare-and-swap
* atomic loads
* atomic stores
* fences

and the implementation mirrors that structure, at both the CLIF and AArch64
levels.

At the CLIF level, there are five new instructions, one for each group.  Some
comments about these:

* for those that take addresses (all except fences), the address is contained
  entirely in a single `Value`; there is no offset field as there is with
  normal loads and stores.  Wasm atomics require alignment checks, and
  removing the offset makes implementation of those checks a bit simpler.

* atomic loads and stores get their own instructions, rather than reusing the
  existing load and store instructions, for two reasons:

  - per above comment, makes alignment checking simpler

  - reuse of existing loads and stores would require extension of `MemFlags`
    to indicate atomicity, which sounds semantically unclean.  For example,
    then *any* instruction carrying `MemFlags` could be marked as atomic, even
    in cases where it is meaningless or ambiguous.

* I tried to specify, in comments, the behaviour of these instructions as
  tightly as I could.  Unfortunately there is no way (per my limited CLIF
  knowledge) to enforce the constraint that they may only be used on I8, I16,
  I32 and I64 types, and in particular not on floating point or vector types.

The translation from Wasm to CLIF, in `code_translator.rs` is unremarkable.

At the AArch64 level, there are also five new instructions, one for each
group.  All of them except `::Fence` contain multiple real machine
instructions.  Atomic r-m-w and atomic c-a-s are emitted as the usual
load-linked store-conditional loops, guarded at both ends by memory fences.
Atomic loads and stores are emitted as a load preceded by a fence, and a store
followed by a fence, respectively.  The amount of fencing may be overkill, but
it reflects exactly what the SM Wasm baseline compiler for AArch64 does.

One reason to implement r-m-w and c-a-s as a single insn which is expanded
only at emission time is that we must be very careful what instructions we
allow in between the load-linked and store-conditional.  In particular, we
cannot allow *any* extra memory transactions in there, since -- particularly
on low-end hardware -- that might cause the transaction to fail, hence
deadlocking the generated code.  That implies that we can't present the LL/SC
loop to the register allocator as its constituent instructions, since it might
insert spills anywhere.  Hence we must present it as a single indivisible
unit, as we do here.  It also has the benefit of reducing the total amount of
work the RA has to do.

The only other notable feature of the r-m-w and c-a-s translations into
AArch64 code, is that they both need a scratch register internally.  Rather
than faking one up by claiming, in `get_regs` that it modifies an extra
scratch register, and having to have a dummy initialisation of it, these new
instructions (`::LLSC` and `::CAS`) simply use fixed registers in the range
x24-x28.  We rely on the RA's ability to coalesce V<-->R copies to make the
cost of the resulting extra copies zero or almost zero.  x24-x28 are chosen so
as to be call-clobbered, hence their use is less likely to interfere with long
live ranges that span calls.

One subtlety regarding the use of completely fixed input and output registers
is that we must be careful how the surrounding copy from/to of the arg/result
registers is done.  In particular, it is not safe to simply emit copies in
some arbitrary order if one of the arg registers is a real reg.  For that
reason, the arguments are first moved into virtual regs if they are not
already there, using a new method `<LowerCtx for Lower>::ensure_in_vreg`.
Again, we rely on coalescing to turn them into no-ops in the common case.

There is also a ridealong fix for the AArch64 lowering case for
`Opcode::Trapif | Opcode::Trapff`, which removes a bug in which two trap insns
in a row were generated.

In the patch as submitted there are 6 "FIXME JRS" comments, which mark things
which I believe to be correct, but for which I would appreciate a second
opinion.  Unless otherwise directed, I will remove them for the final commit
but leave the associated code/comments unchanged.
2020-08-04 09:35:50 +02:00
bjorn3
7b7b1f4997 Rename sarg__ to sarg_t 2020-07-17 12:03:17 +02:00
bjorn3
4431ac1108 Implement SystemV struct argument passing 2020-07-17 12:03:17 +02:00
Alex Crichton
63d5b91930 Wasmtime 0.19.0 and Cranelift 0.66.0 (#2027)
This commit updates Wasmtime's version to 0.19.0, Cranelift's version to
0.66.0, and updates the release notes as well.
2020-07-16 12:46:21 -05:00
Alex Crichton
85ffc8f595 Switch CI back to nightly channel (#2014)
* Switch CI back to nightly channel

I think all upstream issues are now fixed so we should be good to switch
back to nightly from our previously pinned version.

* Fix doc warnings
2020-07-13 18:40:47 -05:00
Dan Gohman
caa87048ab Wasmtime 0.18.0 and Cranelift 0.65.0. 2020-06-11 17:49:56 -07:00
Dan Gohman
a76639c6fb Wasmtime 0.17.0 and Cranelift 0.64.0. (#1805) 2020-06-02 18:51:59 -07:00
Andrew Brown
0dd77d36f8 Rename BinaryImm format to BinaryImm64 2020-05-29 19:56:27 -07:00
Andrew Brown
a27a079d65 Replace ExtractLane format with BinaryImm8
Like https://github.com/bytecodealliance/wasmtime/pull/1762, this changes the name of the `ExtractLane` format to the more-general `BinaryImm8` and renames its immediate argument from `lane` to `imm`.
2020-05-29 19:56:27 -07:00
Andrew Brown
7d6e94b952 Replace InsertLane format with TernaryImm8
The InsertLane format has an ordering (`value().imm().value()`) and immediate name (`"lane"`) that make it awkward to use for other instructions. This changes the ordering (`value().value().imm()`) and uses the default name (`"imm"`) throughout the codebase.
2020-05-29 19:56:27 -07:00
teapotd
fbac2e53f9 Make vconst BxN match specification 2020-05-27 09:37:13 -07:00
Julian Seward
94190d5724 cranelift/reader/src/parser.rs: fn parse_inst_results: produce the results as a
SmallVec<[Value; 1]>, not as a Vec<Value>.  This isn't a useful change for any
non-developer use of Cranelift, but it does significantly reduce the amount of
allocation "noise" seen when tuning the new backend pipeline as driven by
clif-util reading .clif files.  In one case the number of malloc calls
declined by about 20% with this change.
2020-05-11 12:27:15 +02:00
Andrew Brown
b4238229c2 Cast DataValues to and from native types
Also, returns a `Result` in the `RunCommand::run` helper.
2020-05-07 16:51:09 -07:00
Benjamin Bouvier
6bee767129 clif-util: try both global and target-dependent settings when parsing --set flags; 2020-05-05 16:35:41 +02:00
Andrew Brown
d6796d0d23 Improve documentation of the filetest run command (#1645)
* Improve output display of RunCommand

The previous use of Debug for displaying `print` and `run` results was less than clear.

* Avoid checking the types of vectors during trampoline construction

Because DataValue only understands `V128` vectors, we avoid type-checking vector values when constructing the trampoline arguments.

* Improve the documentation of the filetest `run` command

Adds an up-to-date example of how to use the `run` and `print` directives and includes an actual use of the new directives in a SIMD arithmetic filetest.
2020-05-04 14:08:27 -05:00
Andrew Brown
38dff29179 Add ability to call CLIF functions with arbitrary arguments in filetests
This resolves the work started in https://github.com/bytecodealliance/cranelift/pull/1231 and https://github.com/bytecodealliance/wasmtime/pull/1436. Cranelift filetests currently have the ability to run CLIF functions with a signature like `() -> b*` and check that the result is true under the `test run` directive. This PR adds the ability to call functions with arbitrary arguments and non-boolean returns and either print the result or check against a list of expected results:
 - `run` commands look like `; run: %add(2, 2) == 4` or `; run: %add(2, 2) != 5` and verify that the executed CLIF function returns the expected value
 - `print` commands look like `; print: %add(2, 2)` and print the result of the function to stdout

To make this work, this PR compiles a single Cranelift `Function` into a `CompiledFunction` using a `SingleFunctionCompiler`. Because we will not know the signature of the function until runtime, we use a `Trampoline` to place the values in the appropriate location for the calling convention; this should look a lot like what @alexcrichton is doing with `VMTrampoline` in wasmtime (see 3b7cb6ee64/crates/api/src/func.rs (L510-L526), 3b7cb6ee64/crates/jit/src/compiler.rs (L260)). To avoid re-compiling `Trampoline`s for the same function signatures, `Trampoline`s are cached in the `SingleFunctionCompiler`.
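
The caching idea in the last sentence can be sketched as a map keyed by signature. Everything here is a simplified assumption for illustration: `Sig`, `TrampolineCache`, and the integer handle standing in for a compiled trampoline are hypothetical, not the types used by `SingleFunctionCompiler`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified function signature used as the cache key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Sig {
    params: Vec<&'static str>,
    returns: Vec<&'static str>,
}

/// Caches one "trampoline" (here just a numeric handle) per signature.
struct TrampolineCache {
    compiled: HashMap<Sig, usize>,
    compilations: usize,
}

impl TrampolineCache {
    fn new() -> Self {
        Self { compiled: HashMap::new(), compilations: 0 }
    }

    /// Return the cached handle for `sig`, compiling only on first use.
    fn get_or_compile(&mut self, sig: &Sig) -> usize {
        if let Some(&handle) = self.compiled.get(sig) {
            return handle;
        }
        // Stand-in for the expensive compilation step.
        let handle = self.compilations;
        self.compilations += 1;
        self.compiled.insert(sig.clone(), handle);
        handle
    }
}
```

Two functions with the same `(i32, i32) -> i32` signature would share one entry, so running many `; run:` commands against similar functions compiles each trampoline shape once.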
2020-04-30 11:21:00 -07:00
Dan Gohman
864cf98c8d Update release notes, wasmtime 0.16, cranelift 0.63. 2020-04-29 17:30:25 -07:00
Alex Crichton
c9a0ba81a0 Implement interrupting wasm code, reimplement stack overflow (#1490)
* Implement interrupting wasm code, reimplement stack overflow

This commit is a relatively large change for wasmtime with two main
goals:

* Primarily this enables interrupting executing wasm code with a trap,
  preventing infinite loops in wasm code. Note that resumption of the
  wasm code is not a goal of this commit.

* Additionally this commit reimplements how we handle stack overflow to
  ensure that host functions always have a reasonable amount of stack to
  run on. This fixes an issue where we might longjmp out of a host
  function, skipping destructors.

Lots of various odds and ends end up falling out in this commit once the
two goals above were implemented. The strategy for implementing this was
also lifted from Spidermonkey and existing functionality inside of
Cranelift. I've tried to write up thorough documentation of how this all
works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.

A brief summary of how this works is that each function and each loop
header now checks to see if they're interrupted. Interrupts and the
stack overflow check are actually folded into one now, where function
headers check to see if they've run out of stack and the sentinel value
used to indicate an interrupt, checked in loop headers, tricks functions
into thinking they're out of stack. An interrupt is basically just
writing a value to a location which is read by JIT code.
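
The folding of the two checks can be sketched roughly as below. This is a model of the idea only: the type names, the sentinel value `usize::MAX`, and the check functions are assumptions for this example, and the real checks are emitted as JIT code rather than written as Rust methods.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical sentinel: a "stack limit" that no real stack pointer can satisfy.
const INTERRUPTED: usize = usize::MAX;

#[derive(Debug, PartialEq)]
enum Trap {
    StackOverflow,
    Interrupt,
}

struct VmState {
    /// Lowest address the stack pointer may reach (the stack grows downward).
    stack_limit: AtomicUsize,
}

impl VmState {
    fn new(stack_limit: usize) -> Self {
        Self { stack_limit: AtomicUsize::new(stack_limit) }
    }

    /// Requesting an interrupt is just a store that running code will observe.
    fn interrupt(&self) {
        self.stack_limit.store(INTERRUPTED, Ordering::SeqCst);
    }

    /// Function-header check: one comparison covers overflow and interrupts.
    fn check_function_entry(&self, sp: usize) -> Result<(), Trap> {
        let limit = self.stack_limit.load(Ordering::SeqCst);
        if sp >= limit {
            Ok(())
        } else if limit == INTERRUPTED {
            Err(Trap::Interrupt)
        } else {
            Err(Trap::StackOverflow)
        }
    }

    /// Loop-header check: only looks for the sentinel.
    fn check_loop_header(&self) -> Result<(), Trap> {
        if self.stack_limit.load(Ordering::SeqCst) == INTERRUPTED {
            Err(Trap::Interrupt)
        } else {
            Ok(())
        }
    }
}
```

Because the sentinel compares greater than any ordinary stack pointer, an interrupted store makes every subsequent function entry fail its stack check, and loop headers catch code that never calls another function.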

When interrupts are delivered and what triggers them has been left up to
embedders of the `wasmtime` crate. The `wasmtime::Store` type has a
method to acquire an `InterruptHandle`, where `InterruptHandle` is a
`Send` and `Sync` type which can travel to other threads (or perhaps
even a signal handler) to get notified from. It's intended that this
provides a good degree of flexibility when interrupting wasm code. Note
though that this does have a large caveat where interrupts don't work
when you're interrupting host code, so if you've got a host import
blocking for a long time an interrupt won't actually be received until
the wasm starts running again.

Some fallout included from this change is:

* Unix signal handlers are no longer registered with `SA_ONSTACK`.
  Instead they run on the native stack the thread was already using.
  This is possible since stack overflow isn't handled by hitting the
  guard page, but rather it's explicitly checked for in wasm now. Native
  stack overflow will continue to abort the process as usual.

* Unix sigaltstack management is now no longer necessary since we don't
  use it any more.

* Windows no longer has any need to reset guard pages since we no longer
  try to recover from faults on guard pages.

* On all targets probestack intrinsics are disabled since we use a
  different mechanism for catching stack overflow.

* The C API has been updated with interrupt handles. An example has
  also been added which shows off how to interrupt a module.

Closes #139
Closes #860
Closes #900

* Update comment about magical interrupt value

* Store stack limit as a global value, not a closure

* Run rustfmt

* Handle review comments

* Add a comment about SA_ONSTACK

* Use `usize` for type of `INTERRUPTED`

* Parse human-readable durations

* Bring back sigaltstack handling

Allows libstd to print out stack overflow on failure still.

* Add parsing and emission of stack limit-via-preamble

* Fix new example for new apis

* Fix host segfault test in release mode

* Fix new doc example
2020-04-21 11:03:28 -07:00
Andrew Brown
0672d1dc0f Declare constants in the function preamble
This allows us to give names to constants in the constant pool and then use these names in the function body. The original behavior, specifying the constant value as an instruction immediate, is still supported as a shortcut, but some filetests had to change since the canonical way of printing the CLIF constants is now in the preamble.
2020-04-17 11:59:47 -07:00
Andrew Brown
5297466add Expose parsing of run commands and trivially use in testing framework
This is necessary to avoid build errors from dead code (and I didn't want to litter all of the structs with `#[allow(dead_code)]` just to remove in a subsequent PR).
2020-04-03 13:25:10 -07:00
Andrew Brown
3d5ff8dc3d Add parsing functionality for run commands 2020-04-03 13:25:10 -07:00