Commit Graph

6452 Commits

Author SHA1 Message Date
Benjamin Bouvier
dad56a2488 cranelift: add a new resumable_trapnz instruction;
This is useful to have to allow resumable_trap to happen in loop
headers, for instance. This is the correct way to implement interrupt
checks in Spidermonkey, which are effectively resumable traps. Previous
implementation was using traps, which is wrong, since traps semantically
can't be resumed after.
2020-06-15 12:04:28 +02:00
Jakub Konka
60d55a3483 Remove a runaway explicit drop 2020-06-13 15:55:01 +02:00
Andrew Brown
f1e773dc85 Translate Wasm's f32x4.convert_i32x4_u instruction to Cranelift's fcvt_from_uint 2020-06-12 15:06:22 -07:00
Andrew Brown
01d34e71b9 Add x86 legalization for fcvt_from_uint.f32x4
This converts an `i32x4` into an `f32x4` with some rounding either by using an AVX512VL/F instruction--VCVTUDQ2PS--or a long sequence of SSE4.1 compatible instructions.
2020-06-12 15:06:22 -07:00
Andrew Brown
23ed48f269 Add AVX512F flag 2020-06-12 15:06:22 -07:00
Andrew Brown
772ce73f7f Add x86_pblendw instruction
This instruction is necessary for lowering `fcvt_from_uint`.
2020-06-12 15:06:22 -07:00
Andrew Brown
546fc9ddf1 Add x86_vcvtudq2ps instruction
This instruction converts i32x4 to f32x4 in several AVX512 feature sets.
2020-06-12 15:06:22 -07:00
bjorn3
9788b02dd5 Bump object to 0.19.0 (#1767)
* Bump object to 0.19.0
2020-06-12 15:37:04 -05:00
Chris Fallin
3db2e3fcc6 Merge pull request #1865 from cfallin/aarch64-amode-reg-reg-extend
AArch64: make use of reg-reg-extend amode.
2020-06-12 11:58:36 -07:00
Chris Fallin
6286ca7310 AArch64: make use of reg-reg-extend amode.
When a load/store instruction needs an address of the form `v0 +
uextend(v1)` or `v0 + sextend(v1)` (or the commuted forms thereof), we
currently generate a separate zero/sign-extend operation and then use a
plain `[rA, rB]` addressing mode. This patch extends `lower_address()`
to look at both addends of an address if it has two addends and a zero
offset, recognize extension operations, and incorporate them directly
into a `[rA, rB, UXTW]` or `[rA, rB, SXTW]` form. This should improve
our performence on WebAssembly workloads, at least, because we often see
a 64-bit linear memory base indexed by a 32-bit (Wasm) pointer value.
2020-06-12 10:40:54 -07:00
Alex Crichton
9a1a0abc48 Pin nightlies to previous night (#1873)
* Pin nightlies to previous night

Fixes some upstream breakage in rust-lang/rust which should get fixed
tomorrow.

* fix-0.65

Co-authored-by: Yury Delendik <ydelendik@mozilla.com>
2020-06-12 12:35:08 -05:00
Thomas
2dbe98b823 📝 update crate requirement for the tust example (#1870) 2020-06-12 10:21:26 -05:00
Dan Gohman
caa87048ab Wasmtime 0.18.0 and Cranelift 0.65.0. 2020-06-11 17:49:56 -07:00
Chris Fallin
4d5fdfcbba Merge pull request #1866 from cfallin/remove-matches
Remove uses of `matches!()` macro, incompatible with Firefox build.
2020-06-11 16:19:57 -07:00
Chris Fallin
cdbe76a1d4 Remove uses of matches!() macro, incompatible with Firefox build.
When we vendor Cranelift into Firefox, we need to be able to build with
the Firefox CI setup (unless we carry patches on top of upstream).
Unfortunately, the Firefox CI currently appears to build with a slightly
older version of Rust: I can't work out which version exactly, but one
without stable support for `matches!()`.

A recent attempt to version-bump Cranelift failed with build errors at
the two locations in this patch:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=305994046&repo=autoland&lineNumber=24829

I also see a bunch of uses of `matches!()` in Peepmatic, but those
crates are not built by Firefox, so we can leave them be for now, I
think.
2020-06-11 15:11:10 -07:00
Yury Delendik
70424037c3 Refactor debug library to use object:🧝:* (#1860)
* Add GDB test

* rm stray test resource

* use object:🧝:* structures

* install gdb on CI
2020-06-11 13:53:38 -05:00
Chris Fallin
b0cccf1d87 Merge pull request #1864 from jgouly/bitwise
arm64: Implement SIMD bitwise operations
2020-06-11 11:38:39 -07:00
Chris Fallin
6ba165be01 Merge pull request #1858 from cfallin/fix-scale-b1
Bugfix: scaled addressing mode: round B1 up to one byte.
2020-06-11 11:16:07 -07:00
Joey Gouly
544c5dece5 arm64: Implement SIMD bitwise operations
Copyright (c) 2020, Arm Limited.
2020-06-11 10:58:23 -07:00
Chris Fallin
47402316e0 Add test case: b1-typed spillslot access using UImm12 addressing mode. 2020-06-11 10:27:39 -07:00
Chris Fallin
ed7e410111 Bugfix: scaled addressing mode: round B1 up to one byte.
Issue uncovered by Ben Bouvier during regalloc work.
2020-06-11 10:27:32 -07:00
Pat Hickey
9d47944f0d Merge pull request #1855 from ueno/wip/dueno/null
wasi-common: don't rely on platform dependent "NUL" device
2020-06-11 09:45:23 -07:00
Daiki Ueno
65ebfc3a03 wasi-common: don't rely on platform dependent "NUL" device
If stdio is not inherited nor associated with a file, WasiCtxBuilder
tries to open "/dev/null" ("NUL" on Windows) and attach stdio to it.
While most platforms today support those device files, it would be
good to avoid unnecessary access to the host device if possible.  This
patch instead uses a virtual Handle that emulates the "NUL" device.
2020-06-11 16:46:28 +02:00
Peter Huene
2cfaae85b0 Merge pull request #1861 from peterschwarz/fix-example-typo
Correct example module doc comment typo
2020-06-10 16:15:04 -07:00
Peter Schwarz
2926725d63 Correct example module doc comment typo
Correct the module doc comment typo of "mulit" to "multi".

Signed-off-by: Peter Schwarz <pschwarz@bitwise.io>
2020-06-10 17:20:05 -05:00
Chris Fallin
3570363c35 Merge pull request #1859 from akirilov-arm/simd_address
Enable the spec::simd::simd_address test for AArch64
2020-06-10 13:54:56 -07:00
Anton Kirilov
9d269b0123 Enable the spec::simd::simd_address test for AArch64
Copyright (c) 2020, Arm Limited.
2020-06-10 21:10:42 +01:00
Chris Fallin
a84c1931a0 Merge pull request #1854 from akirilov-arm/simd_load_splat
Enable the wast::Cranelift::spec::simd::simd_load_splat test for AArch64
2020-06-10 12:11:29 -07:00
Alex Crichton
5fa4d36b0d Disable Cranelift debug verifier when fuzzing (#1851)
* Add CLI flags for internal cranelift options

This commit adds two flags to the `wasmtime` CLI:

* `--enable-cranelift-debug-verifier`
* `--enable-cranelift-nan-canonicalization`

These previously weren't exposed from the command line but have been
useful to me at least for reproducing slowdowns found during fuzzing on
the CLI.

* Disable Cranelift debug verifier when fuzzing

This commit disables Cranelift's debug verifier for our fuzz targets.
We've gotten a good number of timeouts on OSS-Fuzz and some I've
recently had some discussion over at google/oss-fuzz#3944 about this
issue and what we can do. The result of that discussion was that there
are two primary ways we can speed up our fuzzers:

* One is independent of Wasmtime, which is to tweak the flags used to
  compile code. The conclusion was that one flag was passed to LLVM
  which significantly increased runtime for very little benefit. This
  has now been disabled in rust-fuzz/cargo-fuzz#229.

* The other way is to reduce the amount of debug checks we run while
  fuzzing wasmtime itself. To put this in perspective, a test case which
  took ~100ms to instantiate was taking 50 *seconds* to instantiate in
  the fuzz target. This 500x slowdown was caused by a ton of
  multiplicative factors, but two major contributors were NaN
  canonicalization and cranelift's debug verifier. I suspect the NaN
  canonicalization itself isn't too pricy but when paired with the debug
  verifier in float-heavy code it can create lots of IR to verify.

This commit is specifically tackling this second point in an attempt to
avoid slowing down our fuzzers too much. The intent here is that we'll
disable the cranelift debug verifier for now but leave all other checks
enabled. If the debug verifier gets a speed boost we can try re-enabling
it, but otherwise it seems like for now it's otherwise not catching any
bugs and creating lots of noise about timeouts that aren't relevant.

It's not great that we have to turn off internal checks since that's
what fuzzing is supposed to trigger, but given the timeout on OSS-Fuzz
and the multiplicative effects of all the slowdowns we have when
fuzzing, I'm not sure we can afford the massive slowdown of the debug verifier.
2020-06-10 12:50:21 -05:00
Johnnie Birch
48f0b10c7a Add initial scalar FP operations (addss, subss, etc) to x64 backend.
Adds support for addss and subss. This is the first lowering for
sse floating point alu and some move operations. The changes here do
some renaming of data structures and adds a couple of new ones
to support sse specific operations. The work done here will likely
evolve as needed to support an efficient, inituative, and consistent
framework.
2020-06-10 18:36:57 +02:00
Yury Delendik
e5b81bbc28 Migrating code to object (from faerie) (#1848)
* Using the "object" library everywhere in wasmtime.
* scroll_derive
2020-06-10 11:27:00 -05:00
Benjamin Bouvier
5d01603390 mach backend: allow snapshotting IR graphs with the SNAPSHOT_REGALLOC env variable;
This also requires the serde feature, which isn't enabled by default,
thus it must be passed as a command-line argument to cargo.
2020-06-10 18:23:04 +02:00
Benjamin Bouvier
46093f6119 Bump regalloc.rs to 0.0.26;
And adapt to regalloc.rs API change to provide the exact number of vregs.
2020-06-10 18:23:04 +02:00
Benjamin Bouvier
5d5b39d685 Allow setting other Cranelift flags when running in Wasmtime. 2020-06-10 17:50:25 +02:00
Anton Kirilov
d941034c2e Enable the wast::Cranelift::spec::simd::simd_load_splat test for AArch64
Copyright (c) 2020, Arm Limited.
2020-06-10 15:01:37 +01:00
Yury Delendik
4ebbcb82a9 Refactor CompiledModule to separate compile and linking stages (#1831)
* Refactor how relocs are stored and handled

* refactor CompiledModule::instantiate and link_module

* Refactor DWARF creation: split generation and serialization

* Separate DWARF data transform from instantiation

* rm LinkContext
2020-06-09 15:09:48 -05:00
Chris Fallin
ac87ed12bd Merge pull request #1847 from akirilov-arm/simd_load_extend
Enable the wast::Cranelift::spec::simd::simd_load_extend test for AArch64
2020-06-09 12:29:06 -07:00
Chris Fallin
c30f99d293 Merge pull request #1826 from jgouly/cmp-i16-i32
arm64: Implement Icmp for I16X8 and I32X4
2020-06-09 12:28:16 -07:00
Jakub Konka
f47133b229 Allow different Handles to act as stdio (#1600)
* Allow any type which implements Handle to act as stdio

There have been requests to allow more than just raw OS handles to
act as stdio in `wasi-common`. This commit makes this possible by
loosening the requirement of the `WasiCtxBuilder` to accept any
type `T: Handle + 'static` to act as any of the stdio handles.

A couple words about correctness of this approach. Currently, since
we only have a single `Handle` super-trait to represent all possible
WASI handle types (files, dirs, stdio, pipes, virtual, etc.), it
is possible to pass in any type to act as stdio which can be wrong.
However, I envision this being a problem only in the near(est) future
until we work out how to split `Handle` into several traits, each
representing a different type of WASI resource. In this particular
case, this would be a resource which would implement the interface
required for a handle to act as a stdio (with appropriate rights, etc.).

* Use OsFile in c-api

* Add some documention to the types exposed by this PR, and a few others

Signed-off-by: Jakub Konka <kubkon@jakubkonka.com>

* Add construction examples and missing docs for Handle trait

* Fix example on Windows

* Merge wasi_preview_builder into create_preview1_instance

Co-authored-by: Pat Hickey <pat@moreproductive.org>
2020-06-09 20:19:20 +02:00
Joey Gouly
df2b031b6a arm64: Implement Icmp for I16X8 and I32X4
Copyright (c) 2020, Arm Limited.
2020-06-09 11:07:43 -07:00
Anton Kirilov
7ac19af498 Enable the wast::Cranelift::spec::simd::simd_load_extend test for AArch64
Copyright (c) 2020, Arm Limited.
2020-06-09 18:05:38 +01:00
Chris Fallin
8da71a145c Merge pull request #1802 from akirilov-arm/simd_align
Enable the wast::Cranelift::spec::simd::simd_align test for AArch64
2020-06-09 09:58:26 -07:00
Chris Fallin
02ae1b4464 Merge pull request #1846 from julian-seward1/better-phis
Rewrite `lower_edge` to produce better phi-translations:
2020-06-09 09:56:52 -07:00
Maciej Woś
858e6f3805 fix memory memory creator reserved_size (#1844)
* fix memory memory creator reserved_size

* update doc comment
2020-06-09 08:56:00 -05:00
Anton Kirilov
51a551fb39 Implement vector element extensions for AArch64
This commit also includes load and extend operations. Both are
prerequisites for enabling further SIMD spec tests.

Copyright (c) 2020, Arm Limited.
2020-06-09 12:28:49 +01:00
Julian Seward
6d25759c8e Rewrite lower_edge to produce better phi-translations:
* ensure that all const assignments are placed at the end of the sequence.
  This minimises live ranges.

* for the non-const assignments, ignore self-assignments.  This can
  dramatically reduce the total number of moves generated, because any
  self-assignments trigger the overlap-case handling, hence invoking the
  double-copy behaviour in cases where it's not necessary.

It's worth pointing out that self-assignments are common, and are not due to
deficiencies in CLIR optimisation.  Rather, they occur whenever a loop back
edge doesn't modify *all* loop-carried values.  This can easily happen if
the loop has multiple "early" back-edges -- "continues" in C parlance.  Eg:

   loop_header(a, b, c, d, e, f):
      ...
      a_new = ...
      b_new = ...
      if (..) goto loop_header(a_new, b_new, c, d, e, f)

      ...
      c_new = ...
      d_new = ...
      if (..) goto loop_header(a_new, b_new, c_new, d_new, e, f)

      etc

For functions with many live values, this can dramatically reduce the number
of spill moves we throw into the register allocator.

In terms of compilation costs, this ranges from neutral for functions which
spill not at all, or minimally (joey_small, joey_med) to a 7.1% reduction in
insn count.

In terms of run costs, for one spill-heavy test (bz2 w/ custom timing harness),
instruction counts are reduced by 4.3%, data reads by 12.3% and data writes
by 18.5%.  Note those last two figures include all reads and writes made by the
generated code, not just spills/reloads, so the proportional reduction in
spill/reload traffic must be greater.
2020-06-09 10:36:32 +02:00
Nick Fitzgerald
fb9f39ce17 Merge pull request #1824 from fitzgen/test-stack-maps
cranelift: Better document and test stack maps
2020-06-08 15:58:20 -07:00
Nick Fitzgerald
6aac4c891e cranelift: Better document and test stack maps 2020-06-08 15:05:20 -07:00
Chris Fallin
e3d89c8a92 Merge pull request #1825 from cfallin/spidermonkey-fixes
Three fixes to various SpiderMonkey-related issues
2020-06-08 13:54:13 -07:00
Chris Fallin
fc2a6f273b Three fixes to various SpiderMonkey-related issues:
- Properly mask constant values down to appropriate width when
  generating a constant value directly in aarch64 backend. This was a
  miscompilation introduced in the new-isel refactor. In combination
  with failure to respect NarrowValueMode, this resulted in a very
  subtle bug when an `i32` constant was used in bit-twiddling logic.

- Add support for `iadd_ifcout` in aarch64 backend as used in explicit
  heap-check mode. With this change, we no longer fail heap-related
  tests with the huge-heap-region mode disabled.

- Remove a panic that was occurring in some tests that are currently
  ignored on aarch64, by simply returning empty/default information in
  `value_label` functionality rather than touching unimplemented APIs.
  This is not a bugfix per-se, but removes confusing panic messages from
  `cargo test` output that might otherwise mislead.
2020-06-08 13:02:00 -07:00