This commit goes through the dependencies that wasmtime has and updates
versions where possible. This notably brings in a wasmparser/wast update
which has some simd spec changes with new instructions. Otherwise most
of these are just routine updates.
Re-enables spec tests that were turned off for #2432 and #2470 while
also enabling tests that now work due to patch pushes in the interim.
Currently all SIMD spec tests past. Testing to assure this is ok to
enable hasn't been super intense so we should monitor but there
was an attempt of doing 1000 runs 3 different times to try and reproduce
the issue and it did not occur. In the past would have occurred several
times with that many runs.
We are seeing some repeated but nondeterministic failures of x64 SIMD
tests, as tracked in #2470. In order to err on the side of not
inadvertently breaking/blocking CI for others, we will disable all SIMD
tests for now while we investigate.
This commit implements the interpretation necessary of the instance
section of the module linking proposal. Instantiating a module which
itself has nested instantiated instances will now instantiate the nested
instances properly. This isn't all that useful without the ability to
alias exports off the result, but we can at least observe the side
effects of instantiation through the `start` function.
cc #2094
This patch implements, for aarch64, the following wasm SIMD extensions
Floating-point rounding instructions
https://github.com/WebAssembly/simd/pull/232
Pseudo-Minimum and Pseudo-Maximum instructions
https://github.com/WebAssembly/simd/pull/122
The changes are straightforward:
* `build.rs`: the relevant tests have been enabled
* `cranelift/codegen/meta/src/shared/instructions.rs`: new CLIF instructions
`fmin_pseudo` and `fmax_pseudo`. The wasm rounding instructions do not need
any new CLIF instructions.
* `cranelift/wasm/src/code_translator.rs`: translation into CLIF; this is
pretty much the same as any other unary or binary vector instruction (for
the rounding and the pmin/max respectively)
* `cranelift/codegen/src/isa/aarch64/lower_inst.rs`:
- `fmin_pseudo` and `fmax_pseudo` are converted into a two instruction
sequence, `fcmpgt` followed by `bsl`
- the CLIF rounding instructions are converted to a suitable vector
`frint{n,z,p,m}` instruction.
* `cranelift/codegen/src/isa/aarch64/inst/mod.rs`: minor extension of `pub
enum VecMisc2` to handle the rounding operations. And corresponding `emit`
cases.
The `bitmask.{8x16,16x8,32x4}` instructions do not map neatly to any single
AArch64 SIMD instruction, and instead need a sequence of around ten
instructions. Because of this, this patch is somewhat longer and more complex
than it would be for (eg) x64.
Main changes are:
* the relevant testsuite test (`simd_boolean.wast`) has been enabled on aarch64.
* at the CLIF level, add a new instruction `vhigh_bits`, into which these wasm
instructions are to be translated.
* in the wasm->CLIF translation (code_translator.rs), translate into
`vhigh_bits`. This is straightforward.
* in the CLIF->AArch64 translation (lower_inst.rs), translate `vhigh_bits`
into equivalent sequences of AArch64 instructions. There is a different
sequence for each of the `{8x16, 16x8, 32x4}` variants.
All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:
* Add two new families of AArch64 instructions: `VecShiftImm` (vector shift by
immediate) and `VecExtract` (effectively a double-length vector shift)
* To the existing AArch64 family `VecRRR`, add a `zip1` variant. To the
`VecLanesOp` family add an `addv` variant.
* Add supporting code for the above changes to AArch64 instructions:
- getting the register uses (`aarch64_get_regs`)
- mapping the registers (`aarch64_map_regs`)
- printing instructions
- emitting instructions (`impl MachInstEmit for Inst`). The handling of
`VecShiftImm` is a bit complex.
- emission tests for new instructions and variants.
This commit adds initial (gated) support for the multi-memory wasm
proposal. This was actually quite easy since almost all of wasmtime
already expected multi-memory to be implemented one day. The only real
substantive change is the `memory.copy` intrinsic changes, which now
accounts for the source/destination memories possibly being different.
* wasmtime: Implement `global.{get,set}` for externref globals
We use libcalls to implement these -- unlike `table.{get,set}`, for which we
create inline JIT fast paths -- because no known toolchain actually uses
externref globals.
Part of #929
* wasmtime: Enable `{extern,func}ref` globals in the API
These instructions have fast, inline JIT paths for the common cases, and only
call out to host VM functions for the slow paths. This required some changes to
`cranelift-wasm`'s `FuncEnvironment`: instead of taking a `FuncCursor` to insert
an instruction sequence within the current basic block,
`FuncEnvironment::translate_table_{get,set}` now take a `&mut FunctionBuilder`
so that they can create whole new basic blocks. This is necessary for
implementing GC read/write barriers that involve branching (e.g. checking for
null, or whether a store buffer is at capacity).
Furthermore, it required that the `load`, `load_complex`, and `store`
instructions handle loading and storing through an `r{32,64}` rather than just
`i{32,64}` addresses. This involved making `r{32,64}` types acceptable
instantiations of the `iAddr` type variable, plus a few new instruction
encodings.
Part of #929
`funcref`s are implemented as `NonNull<VMCallerCheckedAnyfunc>`.
This should be more efficient than using a `VMExternRef` that points at a
`VMCallerCheckedAnyfunc` because it gets rid of an indirection, dynamic
allocation, and some reference counting.
Note that the null function reference is *NOT* a null pointer; it is a
`VMCallerCheckedAnyfunc` that has a null `func_ptr` member.
Part of #929
* Enable the spec::simd::simd_align test for AArch64
Copyright (c) 2020, Arm Limited.
* Disable static memory under QEMU on CI
This commit disables the usage of "static" memory on CI and instead
forces all memories to be "dynamic" meaning that they reserve much
smaller chunks of memory. This causes the QEMU process's memory to
drastically drop (10GiB -> 600MiB) and should allow us to keep enabling
tests without hitting the OOM killer on CI.
Closes#1871 (includes that)
Closes#1893
* Fix typo
Co-authored-by: Anton Kirilov <anton.kirilov@arm.com>