Commit Graph

10334 Commits

Author SHA1 Message Date
Benjamin Bouvier
2ce03cce08 [cranelift] Re-add missing blanket implementations for &mut Module (#4719) 2022-08-16 08:42:40 -07:00
Alex Crichton
1e12645ab1 Fix a bad bounds check in component trampolines (#4716)
A `GtU` condition needed to actually be `GeU`, as the comment right
above it stated but apparently I forgot to translate the comment to
actual code. This fixes a fuzz bug that arose from oss-fuzz over the
weekend.
2022-08-16 09:20:45 -05:00
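A minimal sketch of the kind of off-by-one described in 1e12645ab1. The message only says that an unsigned `GtU` comparison should have been `GeU`; the function names and the index-versus-length framing below are assumptions for illustration, not the trampoline's actual code.

```rust
/// Buggy shape (`GtU`): reports out-of-bounds only when `index > len`,
/// so `index == len` is wrongly accepted.
fn out_of_bounds_gt(index: u32, len: u32) -> bool {
    index > len
}

/// Fixed shape (`GeU`): `index == len` is also rejected.
fn out_of_bounds_ge(index: u32, len: u32) -> bool {
    index >= len
}
```

The two predicates differ only at `index == len`, which is the sort of boundary input a fuzzer tends to find.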
Afonso Bordado
0f944937c0 cranelift: Fuzz icmp and fcmp (#4713)
* cranelift: Add `fcmp` to fuzzer

* cranelift: Add IntCC::all()

* cranelift: Add `icmp` to fuzzer
2022-08-15 23:16:50 +00:00
Trevor Elliott
498e7156b4 Remove the handling of cmpps in produces_const (#4714)
https://github.com/bytecodealliance/wasmtime/pull/4714
2022-08-15 15:48:01 -07:00
Nick Fitzgerald
ae7688059d Cranelift: Use bump allocation in remove_constant_phis pass (#4710)
* Cranelift: Use bump allocation in `remove_constant_phis` pass

This makes compilation 2-6% faster for Sightglass's bz2 benchmark:

```
compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 7290648.36 ± 4245152.07 (confidence = 99%)

  bump.so is 1.02x to 1.06x faster than main.so!

  [166388177 183238542.98 214732518] bump.so
  [172836648 190529191.34 217514271] main.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  No difference in performance.

  [182220055 225793551.12 277857575] bump.so
  [193212613 227784078.61 277175335] main.so

compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm

  No difference in performance.

  [3848442474 4295214144.37 4665127241] bump.so
  [3969505457 4262415290.10 4563869974] main.so
```

* Add audit for `bumpalo`

* Add an audit of `arrayvec` version 0.7.2

* Remove unnecessary `collect` into `Vec`

I wasn't able to measure any perf difference here, but it's nice to do anyways.

* Use a `SecondaryMap` for keeping track of summaries
2022-08-15 21:36:01 +00:00
Alex Crichton
cc955e4e7e Rename MmapVec::drain to split_off (#4673)
* Rename `MmapVec::drain` to `split_off`

As suggested on #4609

* Fix tests

* Make MmapVec::split_off work like Vec::split_off

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2022-08-15 21:00:12 +00:00
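For reference, the standard-library semantics that cc955e4e7e says `MmapVec::split_off` now mirrors can be sketched with a plain `Vec`. The helper name `split_tail` is hypothetical, and `MmapVec` itself is not shown.

```rust
/// Splits `bytes` at `at`: afterwards `bytes` holds `[0, at)` and the
/// returned vector holds `[at, len)`, as `Vec::split_off` specifies.
fn split_tail(bytes: &mut Vec<u8>, at: usize) -> Vec<u8> {
    bytes.split_off(at)
}
```

Note that `Vec::split_off` panics if `at` is greater than the vector's length.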
Nick Fitzgerald
e0d4934ef4 Cranelift: Remove the ABICaller trait (#4711)
* Cranelift: Remove the `ABICaller` trait

It has only one implementation: the `ABICallerImpl` struct. We can just use that
directly rather than having extra, unnecessary layers of generics and abstractions.

* Cranelift: Rename `ABICallerImpl` to `Caller`
2022-08-15 20:41:08 +00:00
Trevor Elliott
1d0f6fa4fb Fix a bug in produces_const (#4709)
https://github.com/bytecodealliance/wasmtime/pull/4709
2022-08-15 19:00:33 +00:00
Nick Fitzgerald
f0c60f46a8 Cranelift: Remove ABICallee trait (#4701)
* Cranelift: Remove `ABICallee` trait

It has only one implementation: the `ABICalleeImpl` struct. By using that
directly we can avoid unnecessary layers of generics and abstractions as well as
a couple `Box`es that were previously putting the single implementation into a
`Box<dyn>`.

* Cranelift: Rename `ABICalleeImpl` to `AbiCallee`

* Fix comments as per review

* Rename `AbiCallee` to `Callee`
2022-08-15 18:27:05 +00:00
Afonso Bordado
863cbc345c cranelift: Fix icmp.i128 eq for aarch64 (#4706)
* cranelift: Fix `icmp.i128 eq` for aarch64

* cranelift: Use ccmp in `icmp.i128 eq` for aarch64
2022-08-15 11:11:22 -07:00
Afonso Bordado
e577a76c0d cranelift: Sign extend immediates in instructions that embed them. (#4602)
* cranelift: Sign extend immediates in instructions that embed them.

* cranelift: Clarify imm instruction behaviour

* cranelift: Deduplicate imm_const

* cranelift: zero extend logical imm ops
2022-08-15 18:08:20 +00:00
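The difference between sign-extending and zero-extending an embedded immediate (e577a76c0d) can be sketched as follows. The helper names are hypothetical, and which instructions use which extension is decided by Cranelift, not by this sketch.

```rust
/// Sign-extends the low `bits` bits of `imm` to 64 bits.
/// `bits` must be in `1..=64`.
fn sign_extend(imm: i64, bits: u32) -> i64 {
    let shift = 64 - bits;
    // Arithmetic right shift on `i64` replicates the sign bit.
    (imm << shift) >> shift
}

/// Zero-extends the low `bits` bits of `imm` to 64 bits.
/// `bits` must be in `1..=64`.
fn zero_extend(imm: i64, bits: u32) -> i64 {
    let shift = 64 - bits;
    // Logical right shift on `u64` fills with zeros.
    (((imm as u64) << shift) >> shift) as i64
}
```

For an 8-bit operation, the immediate `0xFF` sign-extends to `-1` but zero-extends to `255`.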
Afonso Bordado
c6d2a3f94e cranelift: Add ireduce/iconcat/isplit to the clif fuzzer (#4703)
* cranelift: Add ireduce to fuzzer

* cranelift: Add iconcat/isplit to fuzzer
2022-08-15 09:18:08 -07:00
Alex Crichton
7ddb90d990 Remove recursion building types in component_api fuzzer (#4694)
* Remove recursion building types in `component_api` fuzzer

Sure enough the fuzzers found an input that blows the stack, so the
type-building here was rewritten to use a heap-based stack instead of a
stack-based-stack.

* Review comments
2022-08-15 14:55:34 +00:00
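The technique named in 7ddb90d990, replacing call-stack recursion with a heap-allocated work stack, can be sketched like this. The `Ty` enum and `count_nodes` are hypothetical stand-ins, not the fuzzer's actual type builder.

```rust
/// A hypothetical recursive type shape standing in for component types.
enum Ty {
    Bool,
    List(Box<Ty>),
    Tuple(Vec<Ty>),
}

/// Counts the nodes of `root` using an explicit `Vec` as the work stack,
/// so nesting depth consumes heap memory rather than call-stack frames.
fn count_nodes(root: &Ty) -> usize {
    let mut stack: Vec<&Ty> = vec![root];
    let mut count = 0;
    while let Some(ty) = stack.pop() {
        count += 1;
        match ty {
            Ty::Bool => {}
            Ty::List(elem) => stack.push(&**elem),
            Ty::Tuple(fields) => stack.extend(fields.iter()),
        }
    }
    count
}
```

The traversal order differs from the recursive version, which is fine when only an aggregate result is needed.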
Alex Crichton
69483a2575 Add source tarballs to our releases (#4294)
* Add source tarballs to our releases

This commit adds a small script to create a source tarball as part of
the release process. This goes further than requested by #3808 by
vendoring all Rust dependencies as well to be more in line with
"download the source once then build somewhere without a network".
Vendoring the Rust dependencies makes the tarball pretty beefy (67M
compressed, 500M uncompressed). Unfortunately most of this size comes
from vendored crates such as v8, pqcrypto-kyber, winapi, capstone-sys,
plotters, and web-sys. Only `winapi` in this list is actually needed for
`wasmtime`-the-binary and only on Windows as well but for now this is
the state of things related to `cargo vendor`. If this becomes an issue
we could specifically remove the bulky contents of crates in the
`vendor` directory such as `v8` since it's only used for fuzzing.

Closes #3808

* Review feedback

* Review comments
2022-08-15 09:27:09 -05:00
Benjamin Bouvier
8a9b1a9025 Implement an incremental compilation cache for Cranelift (#4551)
This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime. 

After the suggestion of Chris, `Function` has been split into mostly two parts:

- on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`.
- on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on.

Most changes are here to accommodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache:

- most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set.
- user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- some refactorings have been made for function names:
  - `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name.
  - The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with.

The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions.

A basic fuzz target has been introduced that tries to do the bare minimum:

- check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function.
- check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache.
  - This last check is less efficient and less likely to happen, so probably should be rethought a bit.

Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip.

Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement. 

Fixes #4155.
2022-08-12 16:47:43 +00:00
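A rough sketch of the stencil/parameters split described in 8a9b1a9025. `Stencil`, `Params`, and `Cache` are hypothetical stand-ins, the "compilation" is a placeholder, and the standard library's `DefaultHasher` is swapped in for the sha256 hash the commit uses, so this is not collision-resistant.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for `FunctionStencil`: everything that affects code generation.
#[derive(Hash)]
struct Stencil {
    body: Vec<u8>,
}

/// Stand-in for `FunctionParameters`: applied only when finalizing.
struct Params {
    base_offset: u32,
}

/// Cache keyed by a hash of the stencil alone.
#[derive(Default)]
struct Cache {
    entries: HashMap<u64, Vec<u8>>,
    hits: usize,
}

impl Cache {
    fn compile(&mut self, stencil: &Stencil, params: &Params) -> Vec<u8> {
        let mut hasher = DefaultHasher::new();
        stencil.hash(&mut hasher);
        let key = hasher.finish();

        let code = if let Some(cached) = self.entries.get(&key) {
            self.hits += 1;
            cached.clone()
        } else {
            // Placeholder for real compilation of the stencil.
            let compiled: Vec<u8> = stencil.body.iter().rev().copied().collect();
            self.entries.insert(key, compiled.clone());
            compiled
        };

        // Finalize: combine the parameter-independent code with the parameters.
        let mut out = params.base_offset.to_le_bytes().to_vec();
        out.extend(code);
        out
    }
}
```

Because the parameters are not part of the key, two functions that differ only in their parameters share one cache entry.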
Afonso Bordado
ac9725840d cranelift: Add shifts and extends to fuzzer (#4700)
* cranelift: Remove shifts-small-types runtests

These were moved to the main shifts file in #4519 but this file was accidentally left in tree.

It also fixes the missing sshr_i8_i8 testcase

* cranelift: Add shifts to fuzzer

* cranelift: Add extends to fuzzer
2022-08-11 17:57:00 -07:00
Nick Fitzgerald
532fb22af6 Cranelift: Remove the LowerCtx trait (#4697)
The trait had only one implementation: the `Lower` struct. It is easier to just
use that directly, and not introduce unnecessary layers of generics and
abstractions.

Once upon a time, there was hope that we would have other implementations of the
`LowerCtx` trait, that did things like lower CLIF to SMTLIB for
verification. However, this is not practical these days given the way that the
trait has evolved over time, and our verification efforts are focused on ISLE
now anyways, and we're actually making some progress on that front (much more
than anyone ever did on a second `LowerCtx` trait implementation!)
2022-08-11 16:54:17 -07:00
Andrew Brown
a83c50321f cranelift: fix build warning (#4698)
In #4375 we introduced a code pattern that appears as a warning when
building the `cranelift-interpreter` crate:

```
warning: cannot borrow `*state` as mutable because it is also borrowed as immutable
   --> cranelift/interpreter/src/step.rs:412:13
    |
47  |     let arg = |index: usize| -> Result<V, StepError> {
    |               -------------------------------------- immutable borrow occurs here
48  |         let value_ref = inst_context.args()[index];
49  |         state
    |         ----- first borrow occurs due to use of `*state` in closure
...
412 |             state.set_pinned_reg(arg(0)?);
    |             ^^^^^^^^^^^^^^^^^^^^^---^^^^^
    |             |                    |
    |             |                    immutable borrow later used here
    |             mutable borrow occurs here
    |
    = note: `#[warn(mutable_borrow_reservation_conflict)]` on by default
    = warning: this borrowing pattern was not meant to be accepted, and may become a hard error in the future
    = note: for more information, see issue #59159 <https://github.com/rust-lang/rust/issues/59159>
```

This change fixes the warning.
2022-08-11 23:52:00 +00:00
Afonso Bordado
2be15ab814 cranelift: Add Rotates to fuzzer (#4696) 2022-08-11 23:12:54 +00:00
Afonso Bordado
3ea1813173 x64: Add native lowering for scalar fma (#4539)
Use `vfmadd213{ss,sd}` for these lowerings.
2022-08-11 22:48:16 +00:00
Alex Crichton
755cd4311e Update max tuple size in component api fuzzing (#4675)
Fixes a build failure on #4673 where tuples of length 16 don't implement
`Debug` from the standard library.
2022-08-11 20:24:48 +00:00
Alex Crichton
380db48ce6 Enable the memory-init-cow feature building the C API (#4690)
This feature was accidentally disabled by default when building the C API.
2022-08-11 20:09:46 +00:00
Trevor Elliott
0c2e0494bd x64: Lower fcvt_from_uint in ISLE (#4684)
* Add a test for the existing behavior of fcvt_from_uint

* Migrate the I8, I16, I32 cases of fcvt_from_uint

* Implement the I64 case of fcvt_from_uint

* Add a test for the existing behavior of fcvt_from_uint.f64x2

* Migrate fcvt_from_uint.f64x2 to ISLE

* Lower the last case of `fcvt_from_uint`

* Add a test for `fcvt_from_uint`

* Finish lowering fcvt_from_uint

* Format
2022-08-11 12:28:41 -07:00
Andrew Brown
c4fd6a95da [fuzz] Remove unnecessary allocation (#4689)
This resolves a comment @jameysharp made in a previous PR.
2022-08-11 19:26:33 +00:00
Afonso Bordado
e4adc46e6d cranelift: Fix shifts and implement rotates in interpreter (#4519)
* cranelift: Fix shifts and implement rotates in interpreter

* x64: Implement `rotl`/`rotr` for some small type combinations
2022-08-11 12:15:52 -07:00
Ulrich Weigand
67870d1518 s390x: Support both big- and little-endian vector lane order (#4682)
This implements the s390x back-end portion of the solution for
https://github.com/bytecodealliance/wasmtime/issues/4566

We now support both big- and little-endian vector lane order
in code generation.  The order used for a function is determined
by the function's ABI: if it uses a Wasmtime ABI, it will use
little-endian lane order, and big-endian lane order otherwise.
(This ensures that all raw_bitcast instructions generated by
both wasmtime and other cranelift frontends can always be
implemented as a no-op.)

Lane order affects the implementation of a number of operations:
- Vector immediates
- Vector memory load / store (in big- and little-endian variants)
- Operations explicitly using lane numbers
  (insertlane, extractlane, shuffle, swizzle)
- Operations implicitly using lane numbers
  (iadd_pairwise, narrow/widen, promote/demote, fcvt_low, vhigh_bits)

In addition, when calling a function using a different lane order,
we need to lane-swap all vector values passed or returned in registers.

A small number of changes to common code were also needed:

- Ensure we always select a Wasmtime calling convention on s390x
  in crates/cranelift (func_signature).

- Fix vector immediates for filetests/runtests.  In PR #4427,
  I attempted to fix this by byte-swapping the V128 value, but
  with the new scheme, we'd instead need to perform a per-lane
  byte swap.  Since we do not know the actual type in write_to_slice
  and read_from_slice, this isn't easily possible.

  Revert this part of PR #4427 again, and instead just mark the
  memory buffer as little-endian when emitting the trampoline;
  the back-end will then emit correct code to load the constant.

- Change a runtest in simd-bitselect-to-vselect.clif to no longer
  make little-endian lane order assumptions.

- Remove runtests in simd-swizzle.clif that make little-endian
  lane order assumptions by relying on implicit type conversion
  when using a non-i16x8 swizzle result type (this feature should
  probably be removed anyway).

Tested with both wasmtime and cg_clif.
2022-08-11 12:10:46 -07:00
Alex Crichton
c1c48b4386 Don't be clever about representing non-CoW images (#4691)
This commit fixes a build warning on Rust 1.63 when the `memory-init-cow`
feature is disabled in the `wasmtime-runtime` crate. Some "tricks" were
used prior to have the `MemoryImage` type be an empty `enum {}` but that
wreaks havoc with warnings so this commit instead just makes it a unit
struct and makes all methods panic (as they shouldn't be hit anyway).
2022-08-11 18:16:28 +00:00
Afonso Bordado
c5bc368cfe cranelift: Add COFF TLS Support (#4546)
* cranelift: Implement COFF TLS Relocations

* cranelift: Emit SecRel relocations

* cranelift: Handle _tls_index symbol in backend
2022-08-11 09:33:40 -07:00
Benjamin Bouvier
a40b253792 Uncomment unwind stack frame tests that now pass on aarch64 (#4687)
Thanks to #4431 and @fitzgen who implemented it!
2022-08-11 15:09:04 +00:00
Andrew Brown
c3e31c9946 [fuzz] Document Wasm-JS conversions (#4683)
During differential execution against V8, Wasm values need to be
converted back and forth from JS values. This change documents the
location in the specification where this is defined.
2022-08-10 23:43:43 +00:00
Afonso Bordado
268ddf2f6c cranelift: Implement pinned reg in interpreter (#4375) 2022-08-10 21:33:45 +00:00
Afonso Bordado
11f0b003eb cranelift: Build a runtest case from fuzzer TestCase's (#4590)
* cranelift: Build a runtest case from fuzzer TestCase's

* cranelift: Add a default expected output for a fuzzgen case
2022-08-10 21:17:11 +00:00
Alex Crichton
597eb6f4ce Limit the type hierarchies in component fuzzing (#4668)
* Limit the type hierarchies in component fuzzing

For now `wasmparser` has a hard limit on the size of tuples and such at
1000 recursive types within the tuple itself. Respect this limit by
limiting the width of recursive types generated for the `component_api`
fuzzer. This commit unifies this new requirement with the preexisting
`TupleArray` and `NonEmptyArray` types into one `VecInRange<T, L, H>`
which allow expressing all of these various requirements in one type.

* Fix a compile error on `main`

* Review comments
2022-08-10 20:49:51 +00:00
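The idea of one length-bounded vector type, as named in 597eb6f4ce, could look roughly like this with const generics. Only the name `VecInRange<T, L, H>` comes from the commit message; the inclusive upper bound, the `usize` parameters, and the constructor are assumptions.

```rust
/// Hypothetical sketch: a vector whose length is checked to lie in `L..=H`.
#[derive(Debug)]
struct VecInRange<T, const L: usize, const H: usize>(Vec<T>);

impl<T, const L: usize, const H: usize> VecInRange<T, L, H> {
    /// Returns `None` when `items.len()` falls outside `L..=H`.
    fn new(items: Vec<T>) -> Option<Self> {
        if (L..=H).contains(&items.len()) {
            Some(VecInRange(items))
        } else {
            None
        }
    }

    /// Number of elements, always within `L..=H`.
    fn len(&self) -> usize {
        self.0.len()
    }
}
```

A non-empty array then becomes `VecInRange<T, 1, H>`, and a bounded tuple becomes another instantiation of the same type.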
bjorn3
54f9587569 Don't use libtest harness for filetests (#4655)
We are using our own test harness for filetests and embedding it in
libtest isn't useful. It only hides test output until the end and
results in unnecessary noise.
2022-08-10 13:48:15 -07:00
Dan Gohman
918debfe59 Stop returning NOTCAPABLE errors from WASI calls. (#4666)
* Stop returning `NOTCAPABLE` errors from WASI calls.

`ENOTCAPABLE` was an error code that is used as part of the rights
system, from CloudABI. There is a set of flags associated with each file
descriptor listing which operations can be performed with the file
descriptor, and if an attempt is made to perform an operation with a
file descriptor that isn't permitted by its rights flags, it fails with
`ENOTCAPABLE`.

WASI is removing the rights system. For example, WebAssembly/wasi-libc#294
removed support for translating `ENOTCAPABLE` into POSIX error codes, on
the assumption that engines should stop using it.

So as another step to migrating away from the rights system, remove uses
of the `ENOTCAPABLE` error.

* Update crates/wasi-common/src/file.rs

Co-authored-by: Jamey Sharp <jamey@minilop.net>

* Update crates/wasi-common/src/dir.rs

Co-authored-by: Jamey Sharp <jamey@minilop.net>

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2022-08-10 13:44:23 -07:00
Ulrich Weigand
be36dd6b1e s390x: Enable object backend (#4680)
This enables the object backend for s390x, in particular the
processing of all required relocations.

This uncovered a bug: we need to use PLT relocations for the
target of calls, which we currently do not.  Fixed by adding
a new S390xPLTRel32Dbl reloc type and using it where needed.
2022-08-10 20:07:54 +00:00
Jamey Sharp
ecb91c0b06 List preset's settings in generated comment (#4679)
Figuring out which boolean settings go into each preset is not easy by
inspecting the DSL source (e.g. meta/src/isa/x86.rs). This patch extends
the comments in the Rust that's generated by that DSL to list the names
of the settings together with the name of the preset.
2022-08-10 19:56:23 +00:00
Trevor Elliott
a25d52046b x64: Migrate fcvt_from_sint and fcvt_low_from_sint to ISLE (#4650)
https://github.com/bytecodealliance/wasmtime/pull/4650
2022-08-10 10:49:02 -07:00
bjorn3
f8c0a88299 Fix sret for AArch64 (#4634)
* Fix sret for AArch64

AArch64 requires the struct return address argument to be stored in the x8
register. This register is never used for regular arguments.

* Add extra sret tests for x86_64
2022-08-10 10:34:51 -07:00
Ulrich Weigand
50fcab2984 s390x: Implement tls_value (#4616)
Implement tls_value for s390x in the ELF general-dynamic mode.

Notable differences to the x86_64 implementation are:
- We use a __tls_get_offset libcall instead of __tls_get_addr.
- The current thread pointer (stored in a pair of access registers)
  needs to be added to the result of __tls_get_offset.
- __tls_get_offset has a variant ABI that requires the address of
  the GOT (global offset table) to be passed in %r12.

This means we need a new libcall entry for __tls_get_offset.
In addition, we also need a way to access _GLOBAL_OFFSET_TABLE_.
The latter is a "magic" symbol with a well-known name defined
by the ABI and recognized by the linker.  This patch introduces
a new ExternalName::KnownSymbol variant to support such names
(originally due to @afonso360).

We also need to emit a relocation on a symbol placed in a
constant pool, as well as an extra relocation on the call
to __tls_get_offset required for TLS linker optimization.

Needed by the cg_clif frontend.
2022-08-10 10:02:07 -07:00
Andrew Brown
354daf5b93 [fuzz] Fix issues with single-inst module generator (#4674)
* [fuzz] Fix signature of `i64.extend32_s` single-instruction test

This single-instruction test incorrectly attempted to convert an `i32`
to an `i64`; the correct signature is `i64 -> i64`. See the [WebAssembly
specification](https://webassembly.github.io/spec/core/bikeshed/#a7-index-of-instructions).

* [fuzz] Fix typo in single-instruction function generator

Previously, the `unary!` macro created functions that used two operands
instead of the expected one.
2022-08-10 16:47:02 +00:00
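The `i64 -> i64` signature of `i64.extend32_s` noted in 354daf5b93 can be modeled in Rust as below. The function name is chosen here for illustration.

```rust
/// Models `i64.extend32_s`: keeps the low 32 bits of an `i64` operand
/// and sign-extends them back to 64 bits.
fn i64_extend32_s(x: i64) -> i64 {
    x as i32 as i64
}
```

The operand is already 64 bits wide; only the interpretation of its low half changes.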
Alex Crichton
96a2ba70b4 Update 0.40.0 release notes (#4660)
* Update 0.40.0 release notes

Not a ton happened in terms of user-facing improvements here so I
outlined some internal changes as well. The cumulative effect of
improving compile times is Sightglass showing 30-40% improvements for
major benchmarks. Additionally I wrote down a note indicating that this
is likely the last `0.*` release and the next release of Wasmtime on
September 20 is planned to be 1.0.

* Remove perf-related relnotes

* Call out s390x simd at the top-level
2022-08-10 16:23:27 +00:00
Afonso Bordado
30e2a9bd29 cranelift: Upgrade libm to 0.2.4 (#4670)
* cranelift: Upgrade libm to 0.2.4

This resolves an issue with incorrect fmaf on the x86_64-pc-windows-gnu target under some inputs.

See: #4517

* supply-chain: Vet `libm` 0.2.4
2022-08-10 16:08:39 +00:00
Alex Crichton
fd28d94352 Shield compiled modules from their appended metadata (#4609)
This commit fixes #4600 in a somewhat roundabout fashion. Currently the
`main` branch of Wasmtime exhibits unusual behavior:

* If `./ci/run-tests.sh` is run then the `cache_accounts_for_opt_level`
  test does not fail.
* If `cargo test -p wasmtime --lib` is run, however, then the test
  fails.

This test is indeed being run as part of `./ci/run-tests.sh` and it's
also passing in CI. The exact failure is that part of the debuginfo
support we have takes an existing ELF image, copies it, and then appends
some information to inform profilers/gdb about the image. This code is
all quite old at this point and not 100% optimal, but that's at least
where we're at.

The problem is that the appended `ProgramHeader64` is not aligned
correctly during `cargo test -p wasmtime --lib`, which is the panic that
happens causing the test to fail. The reason, however, that this test
passes with `./ci/run-tests.sh` is that the alignment of
`ProgramHeader64` is 1 instead of 8. The reason for that is that the
`object` crate has an `unaligned` feature which forcibly unaligns all
primitives to 1 byte instead of their natural alignment. During `cargo
test -p wasmtime --lib` this feature is not enabled but during
`./ci/run-tests.sh` this feature is enabled. The feature is currently
enabled through inclusion of the `backtrace` crate which only happens
for some tests in some crates.

The alignment issue explains why the test fails on a single crate test
but passes on the whole workspace tests. The next issue I investigated
was if this test ever passed. It turns out that on v0.39.0 this test
passed, and the regression to main was introduced during #4571. That
PR, however, has nothing to do with any of this! The reason that this
showed up as causing a "regression" however is because it changed
cranelift settings which changed the size of serialized metadata at the
end of a Wasmtime cache object.

Wasmtime compiled artifacts are ELF images with Wasmtime-specific
metadata appended after them. This appended metadata was making its way
all the way through to the gdbjit image itself which meant that while the
end of the ELF file itself was properly aligned the space after the
Wasmtime metadata was not aligned. This metadata changes in size over
time as Cranelift settings change which explains why #4571 was the
"source" of the regression.

The fix in this commit is to discard the extra Wasmtime metadata when
creating an `MmapVec` representing the underlying ELF image. This is
already supported with `MmapVec::drain` so it was relatively easy to
insert that. This means that the gdbjit image starts with just the ELF
file itself which is always aligned at the end, which gets the test
passing with/without the `unaligned` feature in the `object` crate.
2022-08-10 09:58:34 -05:00
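The alignment difference at the heart of fd28d94352 can be reproduced with two stand-in structs. These are not the `object` crate's `ProgramHeader64` definitions, and the 8-byte figure assumes a typical 64-bit target.

```rust
use std::mem::align_of;

/// Naturally aligned header: the `u64` field forces 8-byte alignment.
#[repr(C)]
struct AlignedHeader {
    kind: u32,
    flags: u32,
    offset: u64,
}

/// Same fields with alignment forced down to 1, roughly the effect the
/// commit attributes to the `unaligned` feature.
#[repr(C, packed)]
struct UnalignedHeader {
    kind: u32,
    flags: u32,
    offset: u64,
}

/// Returns whether `addr` satisfies the alignment requirement of `T`.
fn is_aligned_for<T>(addr: usize) -> bool {
    addr % align_of::<T>() == 0
}
```

An address such as 12 is acceptable for `UnalignedHeader` but not for `AlignedHeader`, which matches how the same appended data could pass in one build configuration and panic in another.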
Andrew Brown
7fa89c4a4f [fuzz] Fix order of operands passed in to wasm-spec-interpreter (#4672)
In #4671, the meta-differential fuzz target was finding errors when
running certain Wasm modules (specifically `shr_s` in that case).
@conrad-watt diagnosed the issue as a missing reversal in the operands
passed to the spec interpreter. This change fixes #4671 and adds an
additional unit test to keep it fixed.
2022-08-10 09:55:33 -05:00
Trevor Elliott
63c2d1e0c3 x64: Remove unnecessary register use when comparing against constants (#4645)
https://github.com/bytecodealliance/wasmtime/pull/4645
2022-08-09 23:53:51 +00:00
Afonso Bordado
4d2a2cfae6 cranelift: Use cranelift-jit in runtests (#4453)
* cranelift: Use JIT in runtests

Using `cranelift-jit` in run tests allows us to perform relocations and
libcalls. This is important since some instruction lowerings fall back
to libcalls when an extension is missing, or when it's too complicated
to implement manually.

This is also a first step to being able to test `call`'s between functions
in the runtest suite. It should also make it easier to eventually test
TLS relocations, symbol resolution and ABI's.

Another benefit of this is that we also get to test the JIT more, since
it now runs the runtests, and gets some fuzzing via `fuzzgen` (which
uses the `SingleFunctionCompiler`).

This change causes regressions in terms of runtime for the filetests.
I haven't done any serious benchmarking but what I've been seeing is
that it now takes about 3 seconds to run the testsuite while it
previously took around 2 seconds.

* Add FMA tests for X86
2022-08-09 14:54:25 -07:00
Afonso Bordado
97b2680f20 cranelift: Remove legalized_to_pointer from function generator (#4665) 2022-08-09 21:47:26 +00:00
Nick Fitzgerald
b17a734a57 Fix unused result that is #[must_use] (#4663)
Fixes this compiler warning:

```
warning: unused return value of `Box::<T>::from_raw` that must be used
   --> crates/bench-api/src/lib.rs:351:9
    |
351 |         Box::from_raw(state as *mut BenchState);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
```
2022-08-09 13:17:43 -07:00
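One common way to silence this kind of warning is to pass the reconstructed box to `drop` explicitly. The sketch below shows that pattern with a hypothetical `State` type; the commit message does not say which form b17a734a57 actually used.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical state type that records when it is dropped.
struct State {
    dropped: Rc<Cell<bool>>,
}

impl Drop for State {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Frees a `State` previously leaked with `Box::into_raw`.
///
/// # Safety
/// `ptr` must come from `Box::into_raw` and must not be used afterwards.
unsafe fn free_state(ptr: *mut State) {
    // Handing the box to `drop` consumes the `#[must_use]` return value.
    unsafe { drop(Box::from_raw(ptr)) };
}
```

Writing `drop(...)` also makes the intent to free the allocation visible at the call site.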
Alex Crichton
bd70dbebbd Deduplicate some size/align calculations (#4658)
This commit is an effort to reduce the amount of complexity around
managing the size/alignment calculations of types in the canonical ABI.
Previously the logic for the size/alignment of a type was spread out
across a number of locations. While each individual calculation is not
really the most complicated thing in the world having the duplication in
so many places was constantly worrying me.

I've opted in this commit to centralize all of this within the runtime
at least, and now there's only one "duplicate" of this information in
the fuzzing infrastructure which is to some degree less important to
deduplicate. This commit introduces a new `CanonicalAbiInfo` type to
house all abi size/align information for both memory32 and memory64.
This new type is then used pervasively throughout fused adapter
compilation, dynamic `Val` management, and typed functions. This type
was also able to reduce the complexity of the macro-generated code
meaning that even `wasmtime-component-macro` is performing less math
than it was before.

One other major feature of this commit is that this ABI information is
now saved within a `ComponentTypes` structure. This avoids recursive
querying of size/align information frequently and instead effectively
caching it. This was a worry I had for the fused adapter compiler which
frequently sought out size/align information and would recursively
descend each type tree each time. The `fact-valid-module` fuzzer is now
nearly 10x faster in terms of iterations/s which I suspect is due to
this caching.
2022-08-09 14:52:20 -05:00
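The kind of size/alignment calculation that bd70dbebbd centralizes can be sketched for a record of fields. `AbiInfo`, `align_to`, and `record` are hypothetical names covering 32-bit memories only; the real `CanonicalAbiInfo` is not reproduced here.

```rust
/// Hypothetical, simplified per-type ABI info for 32-bit memories.
#[derive(Clone, Copy, Debug, PartialEq)]
struct AbiInfo {
    size32: u32,
    align32: u32,
}

/// Rounds `offset` up to a multiple of `align` (a power of two).
fn align_to(offset: u32, align: u32) -> u32 {
    (offset + align - 1) & !(align - 1)
}

/// Lays out a record: each field goes at the next offset aligned for it,
/// the record takes the largest field alignment, and the total size is
/// rounded up to that alignment.
fn record(fields: &[AbiInfo]) -> AbiInfo {
    let mut size: u32 = 0;
    let mut align: u32 = 1;
    for field in fields {
        size = align_to(size, field.align32) + field.size32;
        align = align.max(field.align32);
    }
    AbiInfo {
        size32: align_to(size, align),
        align32: align,
    }
}
```

Computing this once per type and storing the result next to the type avoids walking the type tree again on every query, which is the caching the commit describes.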