Commit Graph

6471 Commits

Benjamin Bouvier
c2692ecb8a Wasmtime: allow using the experimental Cranelift x64 backend in the CLI;
This introduces two changes:

- first, a Cargo feature is added to make it possible to use the
Cranelift x64 backend directly from wasmtime's CLI.
- second, when a `cranelift-flags` parameter is passed and the given
flag name doesn't exist at the target-independent level, try to set it
as a target-dependent setting (see the sketch below).

These two changes make it possible to try out the new x64 backend with:

    cargo run --features experimental_x64 -- run --cranelift-flags use_new_backend=true -- /path/to/a.wasm

Right now, this will fail because most opcodes required by the
trampolines are actually not implemented yet.
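A minimal sketch of that flag fallback, assuming only that both the shared
flag builder and the ISA builder implement Cranelift's `Configurable` trait;
the actual CLI plumbing and error handling in wasmtime may differ:

    use cranelift_codegen::isa;
    use cranelift_codegen::settings::{self, Configurable};

    /// Try `name=value` as a target-independent flag first; if that fails,
    /// fall back to a target-dependent setting such as `use_new_backend`.
    fn set_cranelift_flag(
        shared: &mut settings::Builder,
        isa_builder: &mut isa::Builder,
        name: &str,
        value: &str,
    ) -> Result<(), String> {
        if shared.set(name, value).is_ok() {
            return Ok(());
        }
        isa_builder
            .set(name, value)
            .map_err(|_| format!("unknown Cranelift setting `{}`", name))
    }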
2020-06-17 17:18:46 +02:00
Benjamin Bouvier
eb548e263d machinst: label ISA-specific changes as such (#1879) 2020-06-17 15:15:14 +02:00
Nick Fitzgerald
56d93b5993 Merge pull request #1887 from fitzgen/todo-issue-for-aarch64-reference-types
Add `TODO` comments with link to issue for aarch64 reference types
2020-06-16 11:00:42 -07:00
Nick Fitzgerald
8f0e330467 Add TODO comments with link to issue for aarch64 reference types 2020-06-16 10:04:27 -07:00
Nick Fitzgerald
647d2b4231 Merge pull request #1832 from fitzgen/externref-stack-maps
externref: implement stack map-based garbage collection
2020-06-15 18:26:24 -07:00
Nick Fitzgerald
683dc15385 Only run reference types tests on x86_64
Cranelift does not support reference types on other targets.
2020-06-15 17:53:31 -07:00
Nick Fitzgerald
7e167cae10 externref: Address review feedback 2020-06-15 15:39:26 -07:00
Nick Fitzgerald
8d671c21e2 wasmtime-runtime: Allow tables to internally hold externrefs (#1882)
This commit enables `wasmtime_runtime::Table` to internally hold elements of
either `funcref` (all that is currently supported) or `externref` (newly
introduced in this commit).

This commit updates `Table`'s API, but does NOT generally propagate those
changes outwards all the way through the Wasmtime embedding API. It only does
enough to get everything compiling and the current test suite passing. It is
expected that as we implement more of the reference types spec, we will bubble
these changes out and expose them to the embedding API.
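For illustration only, a sketch of the general idea behind the new internal
representation; the names below are hypothetical stand-ins, not the actual
`wasmtime_runtime::Table` internals:

    /// Hypothetical element storage: a table holds either funcrefs or
    /// externrefs, never a mix of both.
    enum TableElements {
        FuncRefs(Vec<*mut u8>), // stand-in for the runtime's anyfunc pointers
        ExternRefs(Vec<Option<ExternRefHandle>>),
    }

    /// Placeholder for the runtime's reference-counted externref handle.
    #[derive(Clone)]
    struct ExternRefHandle;

    impl TableElements {
        fn len(&self) -> usize {
            match self {
                TableElements::FuncRefs(v) => v.len(),
                TableElements::ExternRefs(v) => v.len(),
            }
        }
    }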
2020-06-15 16:55:23 -05:00
Nick Fitzgerald
618c278e41 externref: implement a canary for GC stack walking
This allows us to detect when stack walking has failed to cover the whole
stack and we are potentially missing on-stack roots. In that case it would be
unsafe to do a GC, because we could free objects too early and cause
use-after-free bugs, so when we detect this scenario we skip the GC.
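A tiny, purely illustrative sketch of the canary check (the types are
hypothetical, not the actual runtime's):

    /// Result of walking the stack for GC roots.
    struct StackWalk {
        saw_canary: bool,
        // ... on-stack roots discovered during the walk ...
    }

    fn maybe_gc(walk: &StackWalk) {
        if !walk.saw_canary {
            // The walk never reached the canary the host set up before
            // calling into Wasm, so it may have missed on-stack roots;
            // collecting now could free live objects. Skip this GC.
            return;
        }
        // ... safe to collect ...
    }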
2020-06-15 09:39:37 -07:00
Nick Fitzgerald
f30ce1fe97 externref: implement stack map-based garbage collection
For host VM code, we use plain reference counting, where cloning increments
the reference count, and dropping decrements it. We can avoid many of the
on-stack increment/decrement operations that typically plague the
performance of reference counting via Rust's ownership and borrowing system.
Moving a `VMExternRef` avoids mutating its reference count, and borrowing it
either avoids the reference count increment or delays it until if/when the
`VMExternRef` is cloned.
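As a rough sketch of that ownership-driven counting (not the real
`VMExternRef` layout), only `Clone` and `Drop` touch the count; moves and
borrows are free:

    use std::ptr::NonNull;
    use std::sync::atomic::{AtomicUsize, Ordering};

    struct ExternData {
        ref_count: AtomicUsize,
        // ... the host value lives here ...
    }

    struct ExternRefSketch(NonNull<ExternData>);

    impl Clone for ExternRefSketch {
        fn clone(&self) -> Self {
            // Cloning is the only operation that increments the count.
            unsafe { self.0.as_ref() }.ref_count.fetch_add(1, Ordering::SeqCst);
            ExternRefSketch(self.0)
        }
    }

    impl Drop for ExternRefSketch {
        fn drop(&mut self) {
            // Dropping is the only operation that decrements it; moving or
            // borrowing touches no counts at all.
            let prev = unsafe { self.0.as_ref() }
                .ref_count
                .fetch_sub(1, Ordering::SeqCst);
            if prev == 1 {
                // Last reference: deallocate (elided in this sketch).
            }
        }
    }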

When passing a `VMExternRef` into compiled Wasm code, we don't want to do
reference count mutations for every compiled `local.{get,set}`, nor for
every function call. Therefore, we use a variation of **deferred reference
counting**, where we only mutate reference counts when storing
`VMExternRef`s somewhere that outlives the activation: into a global or
table. Simultaneously, we over-approximate the set of `VMExternRef`s that
are inside Wasm function activations. Periodically, we walk the stack at GC
safe points, and use stack map information to precisely identify the set of
`VMExternRef`s inside Wasm activations. Then we take the difference between
this precise set and our over-approximation, and decrement the reference
count for each of the `VMExternRef`s that are in our over-approximation but
not in the precise set. Finally, the over-approximation is replaced with the
precise set.
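The GC step described above, reduced to a sketch over plain hash sets
(reference identities stand in for `VMExternRef`s):

    use std::collections::HashSet;

    /// `over_approx`: refs conservatively assumed to be held by Wasm
    /// activations. `precise`: refs actually found via stack maps at the
    /// GC safepoint.
    fn gc_sweep(
        over_approx: &mut HashSet<usize>,
        precise: HashSet<usize>,
        dec_ref: impl Fn(usize),
    ) {
        for &r in over_approx.iter() {
            if !precise.contains(&r) {
                // In the over-approximation but no longer on the stack:
                // drop the count the table was holding for it.
                dec_ref(r);
            }
        }
        // Replace the over-approximation with the precise set.
        *over_approx = precise;
    }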

The `VMExternRefActivationsTable` implements the over-approximated set of
`VMExternRef`s referenced by Wasm activations. Calling a Wasm function and
passing it a `VMExternRef` moves the `VMExternRef` into the table, and the
compiled Wasm function logically "borrows" the `VMExternRef` from the
table. Similarly, `global.get` and `table.get` operations clone the retrieved
`VMExternRef` into the `VMExternRefActivationsTable` and then "borrow" the
reference out of the table.

When a `VMExternRef` is returned to host code from a Wasm function, the host
increments the reference count (because the reference is logically
"borrowed" from the `VMExternRefActivationsTable` and the reference count
from the table will be dropped at the next GC).

For more general information on deferred reference counting, see *An
Examination of Deferred Reference Counting and Cycle Detection* by Quinane:
https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf

cc #929

Fixes #1804
2020-06-15 09:39:37 -07:00
Benjamin Bouvier
357fb11f46 Review comments; 2020-06-15 16:39:08 +02:00
Benjamin Bouvier
28c40ba0f7 machinst x64: refactor lowering too; 2020-06-15 16:39:08 +02:00
Benjamin Bouvier
48fb9291bc machinst x64: refactor REX prefix emission; 2020-06-15 16:39:08 +02:00
Benjamin Bouvier
be4102b205 machinst x64: create a Rex wrapper to avoid flags for the REX prefix; 2020-06-15 16:39:08 +02:00
Benjamin Bouvier
d9ca974133 machinst x64: renamings in the emit functions;
This gets closer to Rust naming conventions and shortens a few names.
2020-06-15 16:39:08 +02:00
Benjamin Bouvier
b2a0718404 machinst x64: expand encoding names a bit;
This avoids one-, two-, and three-letter structure names, which makes the
code easier to read (while a bit more verbose).
2020-06-15 16:39:08 +02:00
Benjamin Bouvier
ef5de04d32 machinst/x64: teach regalloc what FP instructions are moves;
and cosmetic changes after #1665 landed.
2020-06-15 16:39:08 +02:00
SlightlyOutOfPhase
0303834082 Fix lightbeam compilation by updating staticvec dependency to version 0.10 (#1878)
* Update StaticVec dependency from 0.9 to 0.10

* Update lockfile also
2020-06-15 09:05:26 -05:00
Benjamin Bouvier
238ae3bf21 cranelift: tweak condition in safepoint detection to check for resumable traps; 2020-06-15 12:04:28 +02:00
Benjamin Bouvier
dad56a2488 cranelift: add a new resumable_trapnz instruction;
This is useful to allow a resumable trap to happen in loop headers, for
instance. It is the correct way to implement interrupt checks in SpiderMonkey,
which are effectively resumable traps. The previous implementation used plain
traps, which is wrong, since traps semantically cannot be resumed.
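For context, a sketch of how a frontend might place such an interrupt check in
a loop header, assuming the generated `InstBuilder` helper mirrors `trapnz`'s
signature:

    use cranelift_codegen::ir::{
        types, AbiParam, ExternalName, Function, InstBuilder, Signature, TrapCode,
    };
    use cranelift_codegen::isa::CallConv;
    use cranelift_frontend::{FunctionBuilder, FunctionBuilderContext};

    /// Builds `fn(flag: i32, n: i32)` that loops `n` times and performs a
    /// resumable interrupt check at the top of every iteration.
    fn build_loop_with_interrupt_check() -> Function {
        let mut sig = Signature::new(CallConv::SystemV);
        sig.params.push(AbiParam::new(types::I32)); // interrupt flag
        sig.params.push(AbiParam::new(types::I32)); // trip count
        let mut func = Function::with_name_signature(ExternalName::user(0, 0), sig);
        let mut fb_ctx = FunctionBuilderContext::new();
        let mut b = FunctionBuilder::new(&mut func, &mut fb_ctx);

        let entry = b.create_block();
        let header = b.create_block();
        let exit = b.create_block();
        b.append_block_params_for_function_params(entry);
        b.append_block_param(header, types::I32); // loop counter

        b.switch_to_block(entry);
        let flag = b.block_params(entry)[0];
        let n = b.block_params(entry)[1];
        b.ins().jump(header, &[n]);

        b.switch_to_block(header);
        let i = b.block_params(header)[0];
        // Interrupt check in the loop header: traps resumably if `flag` != 0.
        b.ins().resumable_trapnz(flag, TrapCode::Interrupt);
        let one = b.ins().iconst(types::I32, 1);
        let next = b.ins().isub(i, one);
        b.ins().brnz(next, header, &[next]);
        b.ins().jump(exit, &[]);

        b.switch_to_block(exit);
        b.ins().return_(&[]);

        b.seal_all_blocks();
        b.finalize();
        func
    }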
2020-06-15 12:04:28 +02:00
Jakub Konka
60d55a3483 Remove a runaway explicit drop 2020-06-13 15:55:01 +02:00
Andrew Brown
f1e773dc85 Translate Wasm's f32x4.convert_i32x4_u instruction to Cranelift's fcvt_from_uint 2020-06-12 15:06:22 -07:00
Andrew Brown
01d34e71b9 Add x86 legalization for fcvt_from_uint.f32x4
This converts an `i32x4` into an `f32x4` (with some rounding), either by using an AVX512VL/F instruction, VCVTUDQ2PS, or by a long sequence of SSE4.1-compatible instructions.
2020-06-12 15:06:22 -07:00
Andrew Brown
23ed48f269 Add AVX512F flag 2020-06-12 15:06:22 -07:00
Andrew Brown
772ce73f7f Add x86_pblendw instruction
This instruction is necessary for lowering `fcvt_from_uint`.
2020-06-12 15:06:22 -07:00
Andrew Brown
546fc9ddf1 Add x86_vcvtudq2ps instruction
This instruction converts i32x4 to f32x4 in several AVX512 feature sets.
2020-06-12 15:06:22 -07:00
bjorn3
9788b02dd5 Bump object to 0.19.0 (#1767)
* Bump object to 0.19.0
2020-06-12 15:37:04 -05:00
Chris Fallin
3db2e3fcc6 Merge pull request #1865 from cfallin/aarch64-amode-reg-reg-extend
AArch64: make use of reg-reg-extend amode.
2020-06-12 11:58:36 -07:00
Chris Fallin
6286ca7310 AArch64: make use of reg-reg-extend amode.
When a load/store instruction needs an address of the form `v0 +
uextend(v1)` or `v0 + sextend(v1)` (or the commuted forms thereof), we
currently generate a separate zero/sign-extend operation and then use a
plain `[rA, rB]` addressing mode. This patch extends `lower_address()`
to look at both addends of an address if it has two addends and a zero
offset, recognize extension operations, and incorporate them directly
into a `[rA, rB, UXTW]` or `[rA, rB, SXTW]` form. This should improve
our performance on WebAssembly workloads, at least, because we often see
a 64-bit linear memory base indexed by a 32-bit (Wasm) pointer value.
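A simplified illustration of the matching logic (toy types, not the actual
`lower_address()` implementation):

    /// Simplified stand-ins for the address addends seen during lowering.
    enum Addend {
        Reg(u8),        // value already in a 64-bit register
        UExtendReg(u8), // uextend of a 32-bit register
        SExtendReg(u8), // sextend of a 32-bit register
    }

    enum AMode {
        RegReg(u8, u8),     // [rA, rB]
        RegRegUXTW(u8, u8), // [rA, rB, UXTW]
        RegRegSXTW(u8, u8), // [rA, rB, SXTW]
    }

    fn pick_amode(a: Addend, b: Addend, offset: i64) -> Option<AMode> {
        if offset != 0 {
            return None; // other addressing modes handle a nonzero offset
        }
        // Recognize the extend on either addend (the commuted forms).
        match (a, b) {
            (Addend::Reg(ra), Addend::UExtendReg(rb))
            | (Addend::UExtendReg(rb), Addend::Reg(ra)) => Some(AMode::RegRegUXTW(ra, rb)),
            (Addend::Reg(ra), Addend::SExtendReg(rb))
            | (Addend::SExtendReg(rb), Addend::Reg(ra)) => Some(AMode::RegRegSXTW(ra, rb)),
            (Addend::Reg(ra), Addend::Reg(rb)) => Some(AMode::RegReg(ra, rb)),
            _ => None,
        }
    }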
2020-06-12 10:40:54 -07:00
Alex Crichton
9a1a0abc48 Pin nightlies to previous night (#1873)
* Pin nightlies to previous night

Fixes some upstream breakage in rust-lang/rust which should get fixed
tomorrow.

* fix-0.65

Co-authored-by: Yury Delendik <ydelendik@mozilla.com>
2020-06-12 12:35:08 -05:00
Thomas
2dbe98b823 📝 update crate requirement for the Rust example (#1870) 2020-06-12 10:21:26 -05:00
Dan Gohman
caa87048ab Wasmtime 0.18.0 and Cranelift 0.65.0. 2020-06-11 17:49:56 -07:00
Chris Fallin
4d5fdfcbba Merge pull request #1866 from cfallin/remove-matches
Remove uses of `matches!()` macro, incompatible with Firefox build.
2020-06-11 16:19:57 -07:00
Chris Fallin
cdbe76a1d4 Remove uses of matches!() macro, incompatible with Firefox build.
When we vendor Cranelift into Firefox, we need to be able to build with
the Firefox CI setup (unless we carry patches on top of upstream).
Unfortunately, the Firefox CI currently appears to build with a slightly
older version of Rust: I can't work out which version exactly, but one
without stable support for `matches!()`.

A recent attempt to version-bump Cranelift failed with build errors at
the two locations in this patch:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=305994046&repo=autoland&lineNumber=24829

I also see a bunch of uses of `matches!()` in Peepmatic, but those
crates are not built by Firefox, so we can leave them be for now, I
think.
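The pattern applied in this patch, shown on a toy enum: the `matches!`
invocation becomes an explicit `match` that older rustc versions accept:

    enum Inst {
        Nop,
        Mov { dst: u8, src: u8 },
    }

    fn is_move(inst: &Inst) -> bool {
        // Previously: matches!(inst, Inst::Mov { .. })
        match inst {
            Inst::Mov { .. } => true,
            _ => false,
        }
    }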
2020-06-11 15:11:10 -07:00
Yury Delendik
70424037c3 Refactor debug library to use object::elf::* (#1860)
* Add GDB test

* rm stray test resource

* use object::elf::* structures

* install gdb on CI
2020-06-11 13:53:38 -05:00
Chris Fallin
b0cccf1d87 Merge pull request #1864 from jgouly/bitwise
arm64: Implement SIMD bitwise operations
2020-06-11 11:38:39 -07:00
Chris Fallin
6ba165be01 Merge pull request #1858 from cfallin/fix-scale-b1
Bugfix: scaled addressing mode: round B1 up to one byte.
2020-06-11 11:16:07 -07:00
Joey Gouly
544c5dece5 arm64: Implement SIMD bitwise operations
Copyright (c) 2020, Arm Limited.
2020-06-11 10:58:23 -07:00
Chris Fallin
47402316e0 Add test case: b1-typed spillslot access using UImm12 addressing mode. 2020-06-11 10:27:39 -07:00
Chris Fallin
ed7e410111 Bugfix: scaled addressing mode: round B1 up to one byte.
Issue uncovered by Ben Bouvier during regalloc work.
2020-06-11 10:27:32 -07:00
Pat Hickey
9d47944f0d Merge pull request #1855 from ueno/wip/dueno/null
wasi-common: don't rely on platform dependent "NUL" device
2020-06-11 09:45:23 -07:00
Daiki Ueno
65ebfc3a03 wasi-common: don't rely on platform dependent "NUL" device
If stdio is neither inherited nor associated with a file, WasiCtxBuilder
tries to open "/dev/null" ("NUL" on Windows) and attach stdio to it.
While most platforms today support those device files, it would be
good to avoid unnecessary access to the host device if possible.  This
patch instead uses a virtual Handle that emulates the "NUL" device.
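Conceptually, the virtual device behaves like the sketch below (std::io traits
are used here for brevity; the real change implements wasi-common's Handle
trait rather than these):

    use std::io::{self, Read, Write};

    /// A virtual "null" handle: reads hit EOF immediately, writes are
    /// swallowed, and no host device file is ever opened.
    struct VirtualNull;

    impl Read for VirtualNull {
        fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
            Ok(0) // always EOF
        }
    }

    impl Write for VirtualNull {
        fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
            Ok(buf.len()) // pretend everything was written, discard the bytes
        }
        fn flush(&mut self) -> io::Result<()> {
            Ok(())
        }
    }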
2020-06-11 16:46:28 +02:00
Peter Huene
2cfaae85b0 Merge pull request #1861 from peterschwarz/fix-example-typo
Correct example module doc comment typo
2020-06-10 16:15:04 -07:00
Peter Schwarz
2926725d63 Correct example module doc comment typo
Correct the module doc comment typo of "mulit" to "multi".

Signed-off-by: Peter Schwarz <pschwarz@bitwise.io>
2020-06-10 17:20:05 -05:00
Chris Fallin
3570363c35 Merge pull request #1859 from akirilov-arm/simd_address
Enable the spec::simd::simd_address test for AArch64
2020-06-10 13:54:56 -07:00
Anton Kirilov
9d269b0123 Enable the spec::simd::simd_address test for AArch64
Copyright (c) 2020, Arm Limited.
2020-06-10 21:10:42 +01:00
Chris Fallin
a84c1931a0 Merge pull request #1854 from akirilov-arm/simd_load_splat
Enable the wast::Cranelift::spec::simd::simd_load_splat test for AArch64
2020-06-10 12:11:29 -07:00
Alex Crichton
5fa4d36b0d Disable Cranelift debug verifier when fuzzing (#1851)
* Add CLI flags for internal cranelift options

This commit adds two flags to the `wasmtime` CLI:

* `--enable-cranelift-debug-verifier`
* `--enable-cranelift-nan-canonicalization`

These previously weren't exposed from the command line but have been
useful to me at least for reproducing slowdowns found during fuzzing on
the CLI.

* Disable Cranelift debug verifier when fuzzing

This commit disables Cranelift's debug verifier for our fuzz targets.
We've gotten a good number of timeouts on OSS-Fuzz, and I've recently had
some discussion over at google/oss-fuzz#3944 about this
issue and what we can do. The result of that discussion was that there
are two primary ways we can speed up our fuzzers:

* One is independent of Wasmtime, which is to tweak the flags used to
  compile code. The conclusion was that one flag was passed to LLVM
  which significantly increased runtime for very little benefit. This
  has now been disabled in rust-fuzz/cargo-fuzz#229.

* The other way is to reduce the amount of debug checks we run while
  fuzzing wasmtime itself. To put this in perspective, a test case which
  took ~100ms to instantiate was taking 50 *seconds* to instantiate in
  the fuzz target. This 500x slowdown was caused by a ton of
  multiplicative factors, but two major contributors were NaN
  canonicalization and cranelift's debug verifier. I suspect the NaN
  canonicalization itself isn't too pricy but when paired with the debug
  verifier in float-heavy code it can create lots of IR to verify.

This commit is specifically tackling this second point in an attempt to
avoid slowing down our fuzzers too much. The intent here is that we'll
disable the cranelift debug verifier for now but leave all other checks
enabled. If the debug verifier gets a speed boost we can try re-enabling it,
but for now it seems like it's not catching any bugs, only creating lots of
noise about timeouts that aren't relevant.

It's not great that we have to turn off internal checks since that's
what fuzzing is supposed to trigger, but given the timeout on OSS-Fuzz
and the multiplicative effects of all the slowdowns we have when
fuzzing, I'm not sure we can afford the massive slowdown of the debug verifier.
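For embedders who want the same trade-off outside the fuzz targets, the knobs
are exposed on `wasmtime::Config`; a sketch against the 0.18-era API (method
and constructor shapes assumed from that release):

    use wasmtime::{Config, Engine};

    fn main() {
        let mut config = Config::new();
        // Keep other checks on, but skip Cranelift's expensive IR verifier,
        // mirroring what the fuzz targets now do.
        config.cranelift_debug_verifier(false);
        // NaN canonicalization is configurable too; spec-compliance testing
        // may still want it enabled.
        config.cranelift_nan_canonicalization(true);
        let _engine = Engine::new(&config);
    }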
2020-06-10 12:50:21 -05:00
Johnnie Birch
48f0b10c7a Add initial scalar FP operations (addss, subss, etc) to x64 backend.
Adds support for addss and subss. This is the first lowering of SSE
floating-point ALU operations and some move operations. The changes here do
some renaming of data structures and add a couple of new ones to support
SSE-specific operations. The work done here will likely evolve as needed to
support an efficient, intuitive, and consistent framework.
2020-06-10 18:36:57 +02:00
Yury Delendik
e5b81bbc28 Migrating code to object (from faerie) (#1848)
* Using the "object" library everywhere in wasmtime.
* scroll_derive
2020-06-10 11:27:00 -05:00