Lower nop in ISLE in the x64 backend, and remove the final Ok(()) from the lower function to assert that all cases that aren't handled in ISLE will panic.
Ported the existing implementation of `fcmp` for AArch64 to ISLE.
This also ports the `lower_vector_comparison` method to ISLE.
Copyright (c) 2022 Arm Limited
* Add android aarch64 support into c-api
* Remove target test and clean up CMake script c-api
* Deduplicate ExternalProject_Add in c-api Android support
The x64 lowring of `vany_true` both sinks mergeable loads and uses the
original register. This PR fixes the lowering to force the value into a
register first. Ideally we should solve the issue by catching this in
the ISLE type system, as described in #4745, but this resolves the issue
for now.
Fixes#4807.
This retains `lower_amode` in the handwritten code (@akirilov-arm
reports that there is an upcoming patch to port this), but tweaks it
slightly to take a `Value` rather than an `Inst`.
The version of the `arbitrary` crate used in fuzz targets needs to be
the same as the version used in `libfuzzer-sys`. That's why the latter
crate re-exports the former.
But we need to make sure to consistently use the re-exported version.
That's most easily done if that's the only version we have available.
However, `fuzz/Cargo.toml` declared a direct dependency on `arbitrary`,
making it available for import, and leading to that version being used
in a couple places.
There were two copies of `arbitrary` built before, even though they were
the same version: one with the `derive` feature turned on, through the
direct dependency, and one with it turned off when imported through
`libfuzzer-sys`. So I haven't specifically tested this but fuzzer builds
might be slightly faster now.
I have not removed the build-dep on `arbitrary`, because `build.rs` is
not invoked by libFuzzer and so it doesn't matter what version of
`arbitrary` it uses.
Our other crates, like `cranelift-fuzzgen` and `wasmtime-fuzzing`, can
still accidentally use a different version of `arbitrary` than the fuzz
targets which rely on them. This commit only fixes the direct cases
within `fuzz/**`.
Ensure that constants generated for the memory case of XmmMem values are always 16 bytes, ensuring that we don't accidantally perform an unaligned load.
Fixes#4761
* cranelift: Change test runner order
Changes the ordering of runtests to run per target and then per function.
This change doesn't do a lot by itself, but helps future refactorings of runtests.
* cranelift: Rename SingleFunctionCompiler to TestCaseCompiler
* cranelift: Skip runtests per target instead of per run
* cranelift: Deduplicate test names
With the upcoming changes to the runtest infrastructure we require unique ExtNames for all tests.
Note that for test names we have a 16 character limit on test names, and must be unique within those 16 characters.
* cranelift: Add TestFileCompiler to runtests
TestFileCompiler allows us to compile the entire file once, and then call the trampolines for each test.
The previous code was compiling the function for each invocation of a test.
* cranelift: Deduplicate ExtName for avg_round tests
* cranelift: Rename functions as they are defined.
The JIT internally only deals with User functions, and cannot link test name funcs.
This also caches trampolines by signature.
* cranelift: Preserve original name when reporting errors.
* cranelift: Rename aarch64 test functions
* cranelift: Add `call` and `call_indirect` tests!
* cranelift: Add pauth runtests for aarch64
* cranelift: Rename duplicate s390x tests
* cranelift: Delete `i128_bricmp_of` function from i128-bricmp
It looks like we forgot to delete it when it was moved to
`i128-bricmp-overflow`, and since it didn't have a run invocation
it was never compiled.
However, s390x does not support this, and panics when lowering.
* cranelift: Add `colocated` call tests
* cranelift: Rename *more* `s390x` tests
* cranelift: Add pauth + sign_return_address call tests
* cranelift: Undeduplicate test names
With the latest main changes we now support *unlimited* length test names.
This commit reverts:
52274676ff631c630f9879dd32e756566d3e700f
7989edc172493547cdf63e180bb58365e8a43a42
25c8a8395527d98976be6a34baa3b0b214776739
792e8cfa8f748077f9d80fe7ee5e958b7124e83b
* cranelift: Add LibCall tests
* cranelift: Revert more test names
These weren't auto reverted by the previous revert.
* cranelift: Disable libcall tests for aarch64
* cranelift: Runtest fibonacci tests
* cranelift: Misc cleanup
* Implement the remaining socket-related WASI functions.
The original WASI specification included `sock_read`, `sock_write`, and
`shutdown`. Now that we have some sockets support, implement these
additional functions, to make it easier for people porting existing code
to WASI.
It's expected that this will all be subsumed by the wasi-sockets
proposal, but for now, this is a relatively small change which should
hopefully unblock people trying to use the current `accept` support.
* Update to system-interface 0.22, which has fixes for Windows.
* Make wasi-common-std-sync's dependency on system-interface private.
Change some `pub` functions which exposed system-interface types to be
non-`pub`.
And, change `from_sysif_fdflags` functions to `get_fd_flags` functions
that take `impl AsFilelike` arguments instead of system-interface types.
With these changes, system-interface is no longer exposed in the
public API.
* Add a public API for `is_read_write` too.
Implementors using types implementing `AsFilelike` may want to use the
same `is_read_write` logic, without explicitly depending on
system-interface, so provide a function that provides that.
POSIX specifies that functions like `nanosleep` use the REALTIME clock,
so allow WASI `poll_oneoff` calls to use the REALTIME clock, at least
for non-absolute intervals. POSIX specifies that the timeouts should not
be affected by subsequent `clock_settime` calls, so they behave the same
way as MONOTONIC clock requests, so we can implement them as monotonic
requests.
Lower extractlane, scalar_to_vector and splat in ISLE.
This PR also makes some changes to the SinkableLoad api
* change the return type of sink_load to RegMem as there are more functions available for dealing with RegMem
* add reg_mem_to_reg_mem_imm and register it as an automatic conversion
This change is a follow-on from #4515 to add the ability to configure
the `differential` fuzz target by limiting which engines and modules are
used for fuzzing. This is incredibly useful when troubleshooting, e.g.,
when an engine is more prone to failure, we can target that engine
exclusively. The effect of this configuration is visible in the
statistics now printed out from #4739.
Engines are configured using the `ALLOWED_ENGINES` environment variable.
We can either subtract from the set of allowed engines (e.g.,
`ALLOWED_ENGINES=-v8`) or build up a set of allowed engines (e.g.,
`ALLOWED_ENGINES=wasmi,spec`), but not both at the same time.
`ALLOWED_ENGINES` only configures the left-hand side engine; the
right-hand side is always Wasmtime. When omitted, `ALLOWED_ENGINES`
defaults to [`wasmtime`, `wasmi`, `spec`, `v8`].
The generated WebAssembly modules are configured using
`ALLOWED_MODULES`. This environment variables works the same as above
but the available options are: [`wasm-smith`, `single-inst`].
This was likely a copy-paste from the `ast::Pattern` case, but here it
is checking a term name in `ast::Expr` and so should say "... in
expression", not "... in pattern".
Lower `shuffle` and `swizzle` in ISLE.
This PR surfaced a bug with the lowering of `shuffle` when avx512vl and avx512vbmi are enabled: we use `vpermi2b` as the implementation, but panic if the immediate shuffle mask contains any out-of-bounds values. The behavior when the avx512 extensions are not present is that out-of-bounds values are turned into `0` in the result.
I've resolved this by detecting when the shuffle immediate has out-of-bounds indices in the avx512-enabled lowering, and generating an additional mask to zero out the lanes where those indices occur. This brings the avx512 case into line with the semantics of the `shuffle` op: 94bcbe8446/cranelift/codegen/meta/src/shared/instructions.rs (L1495-L1498)
* Port `Fcopysign`..``FcvtToSintSat` to ISLE (AArch64)
Ported the existing implementations of the following opcodes to ISLE on
AArch64:
- `Fcopysign`
- Also introduced missing support for `fcopysign` on vector values, as
per the docs.
- This introduces the vector encoding for the `SLI` machine
instruction.
- `FcvtToUint`
- `FcvtToSint`
- `FcvtFromUint`
- `FcvtFromSint`
- `FcvtToUintSat`
- `FcvtToSintSat`
Copyright (c) 2022 Arm Limited
* Document helpers and abstract conversion checks
* x64: Mask shift amounts for small types
* cranelift: Disable i128 shifts in fuzzer again
They are fixed. But we had a bunch of fuzzgen issues come in, and we don't want to accidentaly mark them as fixed
* cranelift: Avoid masking shifts for 32 and 64 bit cases
* cranelift: Add const shift tests and fix them
* cranelift: Remove const `rotl` cases
Now that `put_masked_in_imm8_gpr` works properly we can simplify rotl/rotr
In order to keep the `ExternalName` enum small, the `TestcaseName`
struct was limited to 17 bytes: a 1 byte length and a 16 byte buffer.
Due to alignment, that made `ExternalName` 20 bytes.
That fixed-size buffer means that the names of functions in Cranelift
filetests are truncated to fit, which limits our ability to give tests
meaningful names. And I think meaningful names are important in tests.
This patch replaces the inline `TestcaseName` buffer with a
heap-allocated slice. We don't care about performance for test names, so
an indirection out to the heap is fine in that case. But we do care
somewhat about the size of `ExternalName` when it's used during
compiles.
On 64-bit systems, `Box<[u8]>` is 16 bytes, so `TestcaseName` gets one
byte smaller. Unfortunately, its alignment is 8 bytes, so `ExternalName`
grows from 20 to 24 bytes.
According to `valgrind --tool=dhat`, this change has very little effect
on compiler performance. Building wasmtime with `--no-default-features
--release`, and compiling the pulldown-cmark benchmark from Sightglass,
I measured these differences between `main` and this patch:
- total number of allocations didn't change (`ExternalName::TestCase` is
not used in normal compiles)
- 592 more bytes allocated over the process lifetime, out of 171.5MiB
- 320 more bytes allocated at peak heap size, out of 12MiB
- 0.24% more instructions executed
- 16,987 more bytes written
- 12,120 _fewer_ bytes read
Lower stack_addr, udiv, sdiv, urem, srem, umulhi, and smulhi in ISLE.
For udiv, sdiv, urem, and srem I opted to move the original lowering into an extern constructor, as the interactions with rax and rdx for the div instruction didn't seem meaningful to implement in ISLE. However, I'm happy to revisit this choice and move more of the embedding into ISLE.
Ported the existing implementations of the following opcodes for AArch64
to ISLE, and implemented support for 64-bit vectors (per the docs):
- `SwidenLow`
- `SwidenHigh`
- `UwidenLow`
- `UwidenHigh`
Also ported `WideningPairwiseDotProductS` as-is.
Copyright (c) 2022 Arm Limited
* Port `vconst` to ISLE (AArch64)
Ported the existing implementation of `vconst` to ISLE for AArch64, and
added support for 64-bit vector constants.
Also introduced 64-bit `vconst` support to the interpreter.
Copyright (c) 2022 Arm Limited
* Replace if-chains with match statements
Copyright (c) 2022 Arm Limited
* Cranelift: extend docs on Inst to discuss `call` instructions
the docs on `Inst` note that the type is returned by non-resultful
instructions built from `InstBuilder`, but did _not_ note that it is
also returned by `call` and `call_indirect`. if you're trying to learn
and use Cranelift by following the docs, this means you'd follow a doc
link to `Inst` that implies that `call` does not return a value - this
is actively misleading, since you'd want to use the returned `Inst` to
find exactly those returned values!
so, this adds a few sentences talking about the case of call `Inst`s.
* cranelift: Add assert to prevent wrong InstFormat being used for the wrong opcode
* cranelift: Use correct instruction format when inserting opcodes in fuzzgen (fixes#4733)
* cranelift: Use debug assert on InstFormat assert
Fixes#4736
Fix lowerings that were using values as both a Reg and a RegMem, making it look like a load could be sunk while its value in a register was still being used. Also add an assert that checks that loads that are sunk are never used.
* Port v8 fuzzer to the new framework
This commit aims to improve the support for the new "meta" differential
fuzzer added in #4515 by ensuring that all existing differential fuzzing
is migrated to this new fuzzer. This PR includes features such as:
* The V8 differential execution is migrated to the new framework.
* `Config::set_differential_config` no longer force-disables wasm
features, instead allowing them to be enabled as per the fuzz input.
* `DiffInstance::{hash, hash}` was replaced with
`DiffInstance::get_{memory,global}` to allow more fine-grained
assertions.
* Support for `FuncRef` and `ExternRef` have been added to `DiffValue`
and `DiffValueType`. For now though generating an arbitrary
`ExternRef` and `FuncRef` simply generates a null value.
* Arbitrary `DiffValue::{F32,F64}` values are guaranteed to use
canonical NaN representations to fix an issue with v8 where with the
v8 engine we can't communicate non-canonical NaN values through JS.
* `DiffEngine::evaluate` allows "successful failure" for cases where
engines can't support that particular invocation, for example v8 can't
support `v128` arguments or return values.
* Smoke tests were added for each engine to ensure that a simple wasm
module works at PR-time.
* Statistics printed from the main fuzzer now include percentage-rates
for chosen engines as well as percentage rates for styles-of-module.
There's also a few small refactorings here and there but mostly just
things I saw along the way.
* Update the fuzzing README
* [fuzz] Remove the `differential` fuzz target
This functionality is already covered by the `differential_meta` target.
* [fuzz] Rename `differential_meta` to `differential`
Now that the `differential_meta` fuzz target does everything that the
existing `differential` target did and more, it can take over the
original name.