Removes an unneeded data structure that was holding instructions for
xmm-based move instructions. These instructions should be categorized
as `rm`, not just `r`. This change is intended to simplify the
organization and reduce the number of cases to handle when lowering.
This patch implements the required but not already available
x64 instructions for copysign as well as the actual lowering sequence
and tests for the newly implemented x64 instructions.
Those instructions include andps, andnps, movaps, movd, and orps.
The lowering sequence is based on the lowering for f32.copysign
in the current Cranelift backend. movd does not have a test yet
due to some logic needed to express a 32-bit register as a source
for xmm_rm_r instructions. This code also begins some
rethinking/refactoring of how the SSE move instructions
are written, and so also includes new emit cases that will replace
the current ones, which match on a different enum used to describe SSE moves.
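For context, the sequence implements the standard IEEE-754 bit trick. A scalar Rust model of what the andps/andnps/orps combination computes (illustrative only, not the emitted code):

```rust
// Scalar model of the lowering: the xmm sequence computes the same
// masking with andps/andnps/orps against a 0x80000000 sign-mask constant.
fn copysign_f32(magnitude: f32, sign: f32) -> f32 {
    const SIGN_MASK: u32 = 0x8000_0000;
    let bits = (magnitude.to_bits() & !SIGN_MASK) | (sign.to_bits() & SIGN_MASK);
    f32::from_bits(bits)
}
```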
`funcref`s are implemented as `NonNull<VMCallerCheckedAnyfunc>`.
This should be more efficient than using a `VMExternRef` that points at a
`VMCallerCheckedAnyfunc` because it gets rid of an indirection, dynamic
allocation, and some reference counting.
Note that the null function reference is *NOT* a null pointer; it is a
`VMCallerCheckedAnyfunc` that has a null `func_ptr` member.
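A rough sketch of that representation (the field set here is abbreviated and illustrative; see the real definition for the full layout):

```rust
use std::ptr::NonNull;

#[repr(C)]
struct VMCallerCheckedAnyfunc {
    func_ptr: *const u8, // a null func_ptr encodes the null funcref
    // ... signature/type index, vmctx, etc. elided ...
}

// One pointer per funcref: no dynamic allocation, no reference counting.
type FuncRef = NonNull<VMCallerCheckedAnyfunc>;
```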
Part of #929
* Allow different Cranelift IR types to be used for different Wasm reference
types.
* Do not assume that all Wasm reference types are always a Cranelift IR
reference type. For example, `funcref`s might not need GC in some
implementations, and can therefore be represented with a pointer rather than a
reference type.
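As a sketch of what this flexibility enables, a hypothetical helper (not the actual code) might map `funcref` to the target's plain pointer type while `externref` maps to a Cranelift reference type:

```rust
use cranelift_codegen::ir;

// Hypothetical helper: funcref needs no GC, so it gets a plain pointer;
// externref is GC-managed, so it gets a Cranelift reference type.
fn ir_type_for_wasm_ref(is_funcref: bool, pointer_type: ir::Type) -> ir::Type {
    if is_funcref {
        pointer_type
    } else {
        match pointer_type {
            ir::types::I32 => ir::types::R32,
            ir::types::I64 => ir::types::R64,
            _ => unreachable!("unsupported pointer width"),
        }
    }
}
```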
This serves two purposes:
1. It makes sure that we call `get_or_create_table`, so that the embedder has
already had a chance to create the given table (although this is mostly
redundant due to validation).
2. It allows the embedder to easily get the `ir::TableData` associated with this
table, and more easily emit whatever inline JIT code is needed to translate the
table instruction (rather than falling back to VM calls).
* C API: expose wasmtime_linker_get_one_by_name()
* C API: remove unnecessary 'unsafe' qualifiers
* C API: avoid unnecessary mutable borrows of the Linker
This allows embedders to take a different action when a WASI program
exits, depending on whether it exited with an exit status or with a
trap.
cc bytecodealliance/wasmtime-py#30
From discussion with Julian and Ben, this PR makes a few documentation-
and naming-level changes (no functionality change):
- Document that the `LowerCtx`-provided output register can be used as a
scratch register during the lowered instruction sequence before
placing the final result in it.
- Rename `input_to_*` helpers in the AArch64 backend to
`put_input_in_*`, emphasizing that these are side-effecting helpers
that potentially generate code (e.g., sign/zero-extensions) to ensure
an input value is in a register.
* Enable the spec::simd::simd_align test for AArch64
Copyright (c) 2020, Arm Limited.
* Disable static memory under QEMU on CI
This commit disables the usage of "static" memory on CI and instead
forces all memories to be "dynamic" meaning that they reserve much
smaller chunks of memory. This causes the QEMU process's memory to
drastically drop (10GiB -> 600MiB) and should allow us to keep enabling
tests without hitting the OOM killer on CI.
Closes #1871 (this PR includes that change)
Closes #1893
* Fix typo
Co-authored-by: Anton Kirilov <anton.kirilov@arm.com>
Adds support for lowering the CLIF instructions Fdiv and Fmul
for the new vcode backend. It also adds a lowering and test for
sqrtss and removes a redundant `to_string()` function from the
`SseOpcode` struct.
This introduces two changes:
- first, a Cargo feature is added to make it possible to use the
Cranelift x64 backend directly from wasmtime's CLI.
- second, when a `cranelift-flags` parameter is passed and the given
parameter's name doesn't exist at the target-independent flag level, it
is now tried as a target-dependent setting (see the sketch below).
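A sketch of that fallback (names illustrative; both builders implement Cranelift's `Configurable` trait):

```rust
use cranelift_codegen::isa;
use cranelift_codegen::settings::{self, Configurable};

// Illustrative fallback: try the target-independent flags first, then
// the ISA-specific (target-dependent) settings.
fn set_cranelift_flag(
    shared: &mut settings::Builder,
    isa_builder: &mut isa::Builder,
    name: &str,
    value: &str,
) -> Result<(), String> {
    if shared.set(name, value).is_ok() {
        return Ok(());
    }
    isa_builder
        .set(name, value)
        .map_err(|e| format!("unknown or invalid flag `{}`: {:?}", name, e))
}
```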
These two changes make it possible to try out the new x64 backend with:
cargo run --features experimental_x64 -- run --cranelift-flags use_new_backend=true -- /path/to/a.wasm
Right now, this will fail because most opcodes required by the
trampolines are actually not implemented yet.
This commit enables `wasmtime_runtime::Table` to internally hold elements of
either `funcref` (all that is currently supported) or `externref` (newly
introduced in this commit).
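A minimal sketch of the element storage this implies (stub types stand in for the real `wasmtime_runtime` definitions, whose layout may differ):

```rust
use std::ptr::NonNull;

// Stubs for illustration; the real types live in wasmtime_runtime.
struct VMCallerCheckedAnyfunc { /* func_ptr, type index, vmctx, ... */ }
struct VMExternRef { /* reference-counted host object */ }

// A table holds one kind of element or the other, never a mix.
enum TableElements {
    FuncRefs(Vec<Option<NonNull<VMCallerCheckedAnyfunc>>>),
    ExternRefs(Vec<Option<VMExternRef>>),
}
```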
This commit updates `Table`'s API, but does NOT generally propagate those
changes outwards all the way through the Wasmtime embedding API. It only does
enough to get everything compiling and the current test suite passing. It is
expected that as we implement more of the reference types spec, we will bubble
these changes out and expose them to the embedding API.
This allows us to detect when stack walking has failed to walk the whole stack
and we are potentially missing on-stack roots. In that case it would be unsafe
to do a GC, because we could free objects too early, leading to use-after-free.
When we detect this scenario, we skip the GC (a sketch follows below).
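A hypothetical sketch of that guard (the flag name is illustrative):

```rust
// Illustrative only: bail out of GC when the stack walk was incomplete.
fn maybe_gc(stack_walk_was_complete: bool) {
    if !stack_walk_was_complete {
        // Some frames (and thus some on-stack roots) may be unaccounted
        // for; collecting now could free live objects. Skip this GC.
        return;
    }
    // ... proceed with the precise GC pass described below ...
}
```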
For host VM code, we use plain reference counting, where cloning increments
the reference count, and dropping decrements it. We can avoid many of the
on-stack increment/decrement operations that typically plague the
performance of reference counting via Rust's ownership and borrowing system.
Moving a `VMExternRef` avoids mutating its reference count, and borrowing it
either avoids the reference count increment or delays it until if/when the
`VMExternRef` is cloned.
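A toy model of that behavior (not wasmtime's actual `VMExternRef`):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct Inner {
    count: AtomicUsize,
    // ... the host object itself ...
}

struct ExternRefToy(*const Inner);

impl Clone for ExternRefToy {
    fn clone(&self) -> Self {
        // Cloning is the only operation that increments the count;
        // moves transfer ownership and borrows never touch it.
        unsafe { &(*self.0).count }.fetch_add(1, Ordering::SeqCst);
        ExternRefToy(self.0)
    }
}

impl Drop for ExternRefToy {
    fn drop(&mut self) {
        // Dropping decrements; freeing the allocation at zero is elided here.
        unsafe { &(*self.0).count }.fetch_sub(1, Ordering::SeqCst);
    }
}
```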
When passing a `VMExternRef` into compiled Wasm code, we don't want to do
reference count mutations for every compiled `local.{get,set}`, nor for
every function call. Therefore, we use a variation of **deferred reference
counting**, where we only mutate reference counts when storing
`VMExternRef`s somewhere that outlives the activation: into a global or
table. Simultaneously, we over-approximate the set of `VMExternRef`s that
are inside Wasm function activations. Periodically, we walk the stack at GC
safe points, and use stack map information to precisely identify the set of
`VMExternRef`s inside Wasm activations. Then we take the difference between
this precise set and our over-approximation, and decrement the reference
count for each of the `VMExternRef`s that are in our over-approximation but
not in the precise set. Finally, the over-approximation is replaced with the
precise set.
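A toy model of that GC step, with `usize` standing in for `VMExternRef` pointers (illustrative, not the actual implementation):

```rust
use std::collections::HashSet;

// over_approx: refs we conservatively assumed were held by Wasm activations.
// precise: refs actually found via stack maps at this GC safepoint.
fn gc_step(
    over_approx: &mut HashSet<usize>,
    precise: HashSet<usize>,
    mut decrement_ref_count: impl FnMut(usize),
) {
    // Anything we over-approximated that is provably no longer on the
    // Wasm stack gets its table-held reference count dropped.
    for &r in over_approx.difference(&precise) {
        decrement_ref_count(r);
    }
    // Replace the over-approximation with the precise set.
    *over_approx = precise;
}
```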
The `VMExternRefActivationsTable` implements the over-approximated set of
`VMExternRef`s referenced by Wasm activations. Calling a Wasm function and
passing it a `VMExternRef` moves the `VMExternRef` into the table, and the
compiled Wasm function logically "borrows" the `VMExternRef` from the
table. Similarly, `global.get` and `table.get` operations clone the retrieved
`VMExternRef` into the `VMExternRefActivationsTable` and then "borrow" the
reference out of the table.
When a `VMExternRef` is returned to host code from a Wasm function, the host
increments the reference count (because the reference is logically
"borrowed" from the `VMExternRefActivationsTable` and the reference count
from the table will be dropped at the next GC).
For more general information on deferred reference counting, see *An
Examination of Deferred Reference Counting and Cycle Detection* by Quinane:
https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf
cc #929
Fixes #1804