Commit Graph

186 Commits

Author SHA1 Message Date
Trevor Elliott
aeceea28e2 Remove trapif and trapff (#5162)
This branch removes the trapif and trapff instructions, in favor of using an explicit comparison and trapnz. This moves us closer to removing iflags and fflags, but introduces the need to implement instructions like iadd_cout in the x64 and aarch64 backends.
2022-11-03 09:25:11 -07:00
Ulrich Weigand
961107ec63 Merge raw_bitcast and bitcast (#5175)
- Allow bitcast for vectors with differing lane widths
- Remove raw_bitcast IR instruction
- Change all users of raw_bitcast to bitcast
- Implement support for no-op bitcast cases across backends

This implements the second step of the plan outlined here:
https://github.com/bytecodealliance/wasmtime/issues/4566#issuecomment-1234819394
2022-11-02 10:16:27 -07:00
Afonso Bordado
faeeed4fb9 cranelift: Correctly calculate heap addresses in interpreter (#5155)
We were accidentally including the size as part of the offset when
computing heap addresses.
2022-10-31 15:07:14 -07:00
11evan
4ca9e82bd1 cranelift: Add Bswap instruction (#1092) (#5147)
Adds Bswap to the Cranelift IR. Implements the Bswap instruction
in the x64 and aarch64 codegen backends. Cranelift users can now:
```
builder.ins().bswap(value)
```
to get a native byteswap instruction.

* x64: implements the 32- and 64-bit bswap instruction, following
the pattern set by similar unary instrutions (Neg and Not) - it
only operates on a dst register, but is parameterized with both
a src and dst which are expected to be the same register.

As x64 bswap instruction is only for 32- or 64-bit registers,
the 16-bit swap is implemented as a rotate left by 8.

Updated x64 RexFlags type to support emitting for single-operand
instructions like bswap

* aarch64: Bswap gets emitted as aarch64 rev16, rev32,
or rev64 instruction as appropriate.

* s390x: Bswap was already supported in backend, just had to add
a bit of plumbing

* For completeness, added bswap to the interpreter as well.

* added filetests and runtests for each ISA

* added bswap to fuzzgen, thanks to afonso360 for the code there

* 128-bit swaps are not yet implemented, that can be done later
2022-10-31 19:30:00 +00:00
Trevor Elliott
02620441c3 Add uadd_overflow_trap (#5123)
Add a new instruction uadd_overflow_trap, which is a fused version of iadd_ifcout and trapif. Adding this instruction removes a dependency on the iflags type, and would allow us to move closer to removing it entirely.

The instruction is defined for the i32 and i64 types only, and is currently only used in the legalization of heap_addr.
2022-10-27 09:43:15 -07:00
Afonso Bordado
4867813f77 cranelift: Remove copy instruction (#5125) 2022-10-25 17:27:33 -07:00
Trevor Elliott
ec12415b1f cranelift: Remove redundant branch and select instructions (#5097)
As discussed in the 2022/10/19 meeting, this PR removes many of the branch and select instructions that used iflags, in favor if using brz/brnz and select in their place. Additionally, it reworks selectif_spectre_guard to take an i8 input instead of an iflags input.

For reference, the removed instructions are: br_icmp, brif, brff, trueif, trueff, and selectif.
2022-10-24 16:14:35 -07:00
Trevor Elliott
32a7593c94 cranelift: Remove booleans (#5031)
Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements.

Fixes #3205

Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com>
Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
2022-10-17 16:00:27 -07:00
Afonso Bordado
1d8f982fe5 fuzzgen: Add bitops (#5040)
* cranelift: Implement some bitops for i128 values

* fuzzgen: Add bitops
2022-10-12 05:52:48 -07:00
Jun Ryung Ju
39fbff92c3 cranelift: Added fp and, or, xor, not ops to interpreter. (#4999)
* cranelift: Added fp and, or, xor, not ops to interpreter.

* Formatting.

* Removed archtecture dependent test on float-bitops.
2022-10-06 18:24:45 -07:00
wasmtime-publish
a9be4a9b56 Bump Wasmtime to 3.0.0 (#5016)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-10-05 09:30:55 -05:00
Afonso Bordado
65a3af72c7 fuzzgen: Statistics framework (#4868)
* cranelift: Add non user trap codes function

* cranelift: Add Fuzzgen stats

* cranelift: Use `once_cell` and cleanup some stuff

* fuzzgen: Remove total_inputs metric

* fuzzgen: Filter empty trap codes
2022-09-27 16:04:57 +00:00
Alex Crichton
7b311004b5 Leverage Cargo's workspace inheritance feature (#4905)
* Leverage Cargo's workspace inheritance feature

This commit is an attempt to reduce the complexity of the Cargo
manifests in this repository with Cargo's workspace-inheritance feature
becoming stable in Rust 1.64.0. This feature allows specifying fields in
the root workspace `Cargo.toml` which are then reused throughout the
workspace. For example this PR shares definitions such as:

* All of the Wasmtime-family of crates now use `version.workspace =
  true` to have a single location which defines the version number.
* All crates use `edition.workspace = true` to have one default edition
  for the entire workspace.
* Common dependencies are listed in `[workspace.dependencies]` to avoid
  typing the same version number in a lot of different places (e.g. the
  `wasmparser = "0.89.0"` is now in just one spot.

Currently the workspace-inheritance feature doesn't allow having two
different versions to inherit, so all of the Cranelift-family of crates
still manually specify their version. The inter-crate dependencies,
however, are shared amongst the root workspace.

This feature can be seen as a method of "preprocessing" of sorts for
Cargo manifests. This will help us develop Wasmtime but shouldn't have
any actual impact on the published artifacts -- everything's dependency
lists are still the same.

* Fix wasi-crypto tests
2022-09-26 11:30:01 -05:00
Jamey Sharp
bd870a9d6c Shrink all SmallVecs by 8 bytes (#4951)
We weren't using the "union" cargo feature for the smallvec crate, which
reduces the size of a SmallVec by one machine word. This feature
requires Rust 1.49 but we already require much newer versions.

When using Wasmtime to compile pulldown-cmark from Sightglass, this
saves a decent amount of memory allocations and writes. According to
`valgrind --tool=dhat`:

- 6.2MiB (3.69%) less memory allocated over the program's lifetime
- 0.5MiB (4.13%) less memory allocated at maximum heap size
- 5.5MiB (1.88%) fewer bytes written to
- 0.44% fewer instructions executed

Sightglass reports a statistically significant runtime improvement too:

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 24379323.60 ± 20051394.04 (confidence = 99%)

  shrink-abiarg-0406da67c.so is 1.01x to 1.13x faster than main-be690a468.so!

  [227506364 355007998.78 423280514] main-be690a468.so
  [227686018 330628675.18 406025344] shrink-abiarg-0406da67c.so

compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 360151622.56 ± 278294316.90 (confidence = 99%)

  shrink-abiarg-0406da67c.so is 1.01x to 1.07x faster than main-be690a468.so!

  [8709162212 8911001926.44 9535111576] main-be690a468.so
  [5058015392 8550850303.88 9282148438] shrink-abiarg-0406da67c.so

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 6936570.28 ± 6897696.38 (confidence = 99%)

  shrink-abiarg-0406da67c.so is 1.00x to 1.08x faster than main-be690a468.so!

  [155810934 175260571.20 234737344] main-be690a468.so
  [119128240 168324000.92 257451074] shrink-abiarg-0406da67c.so
2022-09-23 16:32:13 -07:00
Damian Heaton
3f8cccfb59 Port flag-based ops to ISLE (AArch64) (#4942)
Ported the existing implementations of the following opcodes for AArch64
to ISLE:
- `Trueif`
- `Trueff`
- `Trapif`
- `Trapff`
- `Select`
- `Selectif`
- `SelectifSpectreGuard`

Copyright (c) 2022 Arm Limited
2022-09-22 15:44:32 -07:00
Damian Heaton
e786bda002 Vector bitcast support (AArch64 & Interpreter) (#4820)
* Vector bitcast support (AArch64 & Interpreter)

Implemented support for `bitcast` on vector values for AArch64 and the
interpreter.

Also corrected the verifier to ensure that the size, in bits, of the input and
output types match for a `bitcast`, per the docs.

Copyright (c) 2022 Arm Limited

* `I128` same-type bitcast support

Copyright (c) 2022 Arm Limited

* Directly return input for 64-bit GPR<=>GPR bitcast

Copyright (c) 2022 Arm Limited
2022-09-21 09:20:28 -07:00
Damian Heaton
cae7c196bb Interpreter: Implement floating point conversions (#4884)
* Interpreter: Implement floating point conversions

Implemented the following opcodes for the interpreter:
- `FcvtToUint`
- `FcvtToSint`
- `FcvtToUintSat`
- `FcvtToSintSat`
- `FcvtFromUint`
- `FcvtFromSint`
- `FcvtLowFromSint`
- `FvpromoteLow`
- `Fvdemote`

Copyright (c) 2022 Arm Limited

* Fix `I128` bounds checks for `FcvtTo{U,S}int{_,Sat}`

Copyright (c) 2022 Arm Limited

* Fix broken test

Copyright (c) 2022 Arm Limited
2022-09-20 11:10:20 -07:00
Jamey Sharp
3d6d49daba cranelift: Remove of/nof overflow flags from icmp (#4879)
* cranelift: Remove of/nof overflow flags from icmp

Neither Wasmtime nor cg-clif use these flags under any circumstances.
From discussion on #3060 I see it's long been unclear what purpose these
flags served.

Fixes #3060, fixes #4406, and fixes #4875... by deleting all the code
that could have been buggy.

This changes the cranelift-fuzzgen input format by removing some IntCC
options, so I've gone ahead and enabled I128 icmp tests at the same
time. Since only the of/nof cases were failing before, I expect these to
work.

* Restore trapif tests

It's still useful to validate that iadd_ifcout's iflags result can be
forwarded correctly to trapif, and for that purpose it doesn't really
matter what condition code is checked.
2022-09-07 08:38:41 -07:00
Alex Crichton
65930640f8 Bump Wasmtime to 2.0.0 (#4874)
This commit replaces #4869 and represents the actual version bump that
should have happened had I remembered to bump the in-tree version of
Wasmtime to 1.0.0 prior to the branch-cut date. Alas!
2022-09-06 13:49:56 -05:00
Jamey Sharp
9856664f1f Make DataValue, not Ieee32/64, respect IEEE754 (#4860)
* cranelift-codegen: Remove all uses of DataValue

This type is only used by the interpreter, cranelift-fuzzgen, and
filetests. I haven't found another convenient crate for those to all
depend on where this type can live instead, but this small refactor at
least makes it obvious that code generation does not in any way depend
on the implementation of this type.

* Make DataValue, not Ieee32/64, respect IEEE754

This fixes #4857 by partially reverting #4849.

It turns out that Ieee32 and Ieee64 need bitwise equality semantics so
they can be used as hash-table keys.

Moving the IEEE754 semantics up a layer to DataValue makes sense in
conjunction with #4855, where we introduced a DataValue::bitwise_eq
alternative implementation of equality for those cases where users of
DataValue still want the bitwise equality semantics.

* cranelift-interpreter: Use eq/ord from DataValue

This fixes #4828, again, now that the comparison operators on DataValue
have the right IEEE754 semantics.

* Add regression test from issue #4857
2022-09-03 00:26:14 +00:00
Afonso Bordado
f30a7eb0c9 cranelift: Implement PartialEq on Ieee{32,64} (#4849)
* cranelift: Add `fcmp` tests

Some of these are disabled on aarch64 due to not being implemented yet.

* cranelift: Implement float PartialEq for Ieee{32,64} (fixes #4828)

Previously `PartialEq` was auto derived. This means that it was implemented in terms of PartialEq in a u32.

This is not correct for floats because `NaN != NaN`.

PartialOrd was manually implemented in 6d50099816, but it seems like it was an oversight to leave PartialEq out until now.

The test suite depends on the previous behaviour so we adjust it to keep comparing bits instead of floats.

* cranelift: Disable `fcmp ord` tests on aarch64

* cranelift: Disable `fcmp ueq` tests on aarch64
2022-09-02 10:42:42 -07:00
Jamey Sharp
84ac24c23d cranelift: Remove const_addr instruction (fixes #2398) (#4843) 2022-09-01 21:57:37 +00:00
Afonso Bordado
3ce3eeb668 cranelift: Register all functions in test file for interpreter (#4817)
* cranelift: Implement `bnot` in interpreter

* cranelift: Register all functions in test file for interpreter

* cranelift: Relax signature checking for bools and vectors
2022-08-30 15:45:21 -07:00
Chris Fallin
2b4b257834 Revert "cranelift: Register all functions in test file for interpreter (#4800)" (#4810)
This reverts commit 500a9f17be.
2022-08-30 01:15:11 +00:00
Afonso Bordado
500a9f17be cranelift: Register all functions in test file for interpreter (#4800)
* cranelift: Implement `bnot` in interpreter

* cranelift: Register all functions in test file for interpreter
2022-08-29 23:39:50 +00:00
Afonso Bordado
9a8bd5be02 cranelift: Add LibCalls to the interpreter (#4782)
* cranelift: Add libcall handlers to interpreter

* cranelift: Fuzz IshlI64 libcall

* cranelift: Revert back to fuzzing udivi64

* cranelift: Use sdiv as a fuzz libcall

* cranelift: Register Sdiv in fuzzgen

* cranelift: Add multiple libcalls to fuzzer

* cranelift: Register a single libcall handler

* cranelift: Simplify args checking in interpreter

* cranelift: Remove unused LibCalls

* cranelift: Cleanup interpreter libcall types

* cranelift: Fix Interpreter Docs
2022-08-29 13:36:33 -07:00
Damian Heaton
94bcbe8446 Port Fcopysign..FcvtToSintSat to ISLE (AArch64) (#4753)
* Port `Fcopysign`..``FcvtToSintSat` to ISLE (AArch64)

Ported the existing implementations of the following opcodes to ISLE on
AArch64:
- `Fcopysign`
  - Also introduced missing support for `fcopysign` on vector values, as
    per the docs.
  - This introduces the vector encoding for the `SLI` machine
    instruction.
- `FcvtToUint`
- `FcvtToSint`
- `FcvtFromUint`
- `FcvtFromSint`
- `FcvtToUintSat`
- `FcvtToSintSat`

Copyright (c) 2022 Arm Limited

* Document helpers and abstract conversion checks
2022-08-24 10:37:14 -07:00
Damian Heaton
da1fb305a3 Port vconst to ISLE (AArch64) (#4750)
* Port `vconst` to ISLE (AArch64)

Ported the existing implementation of `vconst` to ISLE for AArch64, and
added support for 64-bit vector constants.

Also introduced 64-bit `vconst` support to the interpreter.

Copyright (c) 2022 Arm Limited

* Replace if-chains with match statements

Copyright (c) 2022 Arm Limited
2022-08-23 09:40:11 -07:00
Benjamin Bouvier
8a9b1a9025 Implement an incremental compilation cache for Cranelift (#4551)
This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime. 

After the suggestion of Chris, `Function` has been split into mostly two parts:

- on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`.
- on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on.

Most changes are here to accomodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache:

- most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set.
- user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- some refactorings have been made for function names:
  - `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name.
  - The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with.

The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions.

A basic fuzz target has been introduced that tries to do the bare minimum:

- check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function.
- check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache.
  - This last check is less efficient and less likely to happen, so probably should be rethought a bit.

Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip.

Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement. 

Fixes #4155.
2022-08-12 16:47:43 +00:00
Andrew Brown
a83c50321f cranelift: fix build warning (#4698)
In #4375 we introduced a code pattern that appears as a warning when
building the `cranelift-interpreter` crate:

```
warning: cannot borrow `*state` as mutable because it is also borrowed as immutable
   --> cranelift/interpreter/src/step.rs:412:13
    |
47  |     let arg = |index: usize| -> Result<V, StepError> {
    |               -------------------------------------- immutable borrow occurs here
48  |         let value_ref = inst_context.args()[index];
49  |         state
    |         ----- first borrow occurs due to use of `*state` in closure
...
412 |             state.set_pinned_reg(arg(0)?);
    |             ^^^^^^^^^^^^^^^^^^^^^---^^^^^
    |             |                    |
    |             |                    immutable borrow later used here
    |             mutable borrow occurs here
    |
    = note: `#[warn(mutable_borrow_reservation_conflict)]` on by default
    = warning: this borrowing pattern was not meant to be accepted, and may become a hard error in the future
    = note: for more information, see issue #59159 <https://github.com/rust-lang/rust/issues/59159>
```

This change fixes the warning.
2022-08-11 23:52:00 +00:00
Afonso Bordado
e4adc46e6d cranelift: Fix shifts and implement rotates in interpreter (#4519)
* cranelift: Fix shifts and implement rotates in interpreter

* x64: Implement `rotl`/`rotr` for some small type combinations
2022-08-11 12:15:52 -07:00
Afonso Bordado
268ddf2f6c cranelift: Implement pinned reg in interpreter (#4375) 2022-08-10 21:33:45 +00:00
Afonso Bordado
30e2a9bd29 cranelift: Upgrade libm to 0.2.4 (#4670)
* cranelift: Upgrade libm to 0.2.4

This resolves an issue with incorrect fmaf on the x86_64-pc-windows-gnu target under some inputs.

See: #4517

* supply-chain: Vet `libm` 0.2.4
2022-08-10 16:08:39 +00:00
Damian Heaton
eb332b8369 Convert fma, valltrue & vanytrue to ISLE (AArch64) (#4608)
* Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64)

Ported the existing implementations of the following opcodes to ISLE on
AArch64:
- `fma`
  - Introduced missing support for `fma` on vector values, as per the
    docs.
- `valltrue`
- `vanytrue`

Also fixed `fcmp` on scalar values in the interpreter, and enabled
interpreter tests in `simd-fma.clif`.

This introduces the `FMLA` machine instruction.

Copyright (c) 2022 Arm Limited

* Add comments for `Fmla` and `Bsl`

Copyright (c) 2022 Arm Limited
2022-08-05 09:47:56 -07:00
wasmtime-publish
412fa04911 Bump Wasmtime to 0.41.0 (#4620)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-08-04 20:02:19 -05:00
Nick Fitzgerald
42bba452a6 Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573)
* Cranelift: Add instructions for getting the current stack/frame pointers and return address

This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535

* x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead

We just special case getting operands from `Amode`s now.

* Fix s390x `get_return_address`; require `preserve_frame_pointers=true`

* Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp

* Handle non-allocatable registers in Amode::with_allocs

* Use "stack" instead of "r15" on s390x

* r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`
2022-08-02 14:37:17 -07:00
Chris Fallin
8dddd6f1f7 Cranelift: Remove ifcmp_sp opcode. (#4578)
This was temporarily added back in #3502 due to a need from Lucet; now
that Lucet is EOL, the opcode is no longer needed and we can remove it.
2022-08-02 13:15:39 -07:00
Chris Fallin
43f1765272 Cranellift: remove Baldrdash support and related features. (#4571)
* Cranellift: remove Baldrdash support and related features.

As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team
has recently determined that their current form of integration with
Cranelift is too hard to maintain, and they have chosen to remove it
from their codebase. If and when they decide to build updated support
for Cranelift, they will adopt different approaches to several details
of the integration.

In the meantime, after discussion with the SpiderMonkey folks, they
agree that it makes sense to remove the bits of Cranelift that exist
to support the integration ("Baldrdash"), as they will not need
them. Many of these bits are difficult-to-maintain special cases that
are not actually tested in Cranelift proper: for example, the
Baldrdash integration required Cranelift to emit function bodies
without prologues/epilogues, and instead communicate very precise
information about the expected frame size and layout, then stitched
together something post-facto. This was brittle and caused a lot of
incidental complexity ("fallthrough returns", the resulting special
logic in block-ordering); this is just one example. As another
example, one particular Baldrdash ABI variant processed stack args in
reverse order, so our ABI code had to support both traversal
orders. We had a number of other Baldrdash-specific settings as well
that did various special things.

This PR removes Baldrdash ABI support, the `fallthrough_return`
instruction, and pulls some threads to remove now-unused bits as a
result of those two, with the  understanding that the SpiderMonkey folks
will build new functionality as needed in the future and we can perhaps
find cleaner abstractions to make it all work.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425

* Review feedback.

* Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.

The debugger tests invoke `wasmtime` from within each test case under
the control of a debugger (gdb or lldb). Some of these tests started to
inexplicably fail in CI with unrelated changes, and the failures were
only inconsistently reproducible locally. It seems to be cache related:
if we disable cached compilation on the nested `wasmtime` invocations,
the tests consistently pass.

* Review feedback.
2022-08-02 19:37:56 +00:00
Sam Parker
37cd96beff [AArch64] i64x2 support for min/max (#4575)
Also added interpreter support for vector min/max.

Copyright (c) 2022, Arm Limited.
2022-08-02 11:42:05 -07:00
Afonso Bordado
1f058a02c0 cranelift: Add MinGW fma regression tests (#4517)
* cranelift: Add MinGW `fma` regression tests

* cranelift: Fix FMA in interpreter

* cranelift: Add separate `fma` test suite for the interpreter

The interpreter can run `fma.clif` on most platforms, however on
`x86_64-pc-windows-gnu` we use libm which has issues with some inputs.
We should delete `fma-interpreter.clif` and enable the interpreter on
the main `fma.clif` file once those are fixed.
2022-07-29 09:09:37 -05:00
Afonso Bordado
e121c209fc cranelift: Fix urem/srem in interpreter (#4532) 2022-07-27 10:47:08 -07:00
Damian Heaton
3ef89b7787 Allow 64-bit vectors and implement for interpreter (#4509)
* Allow 64-bit vectors and implement for interpreter

The AArch64 backend already supports 64-bit vectors; this simply allows
instructions to make use of that.

Implemented support for 64-bit vectors within the interpreter to allow
interpret runtests to use them.

Copyright (c) 2022 Arm Limited

* Disable 64-bit SIMD `iaddpairwise` tests on s390x

Copyright (c) 2022 Arm Limited
2022-07-25 13:00:43 -07:00
Afonso Bordado
446efd3e11 cranelift: Fix icmp_imm for small types in interpreter (#4506) 2022-07-23 00:26:56 +00:00
Afonso Bordado
d89c262657 cranelift: Implement {u,s}extend.i128 in interpreter (#4505) 2022-07-22 10:47:10 -07:00
Afonso Bordado
80976b6fc7 cranelift: Add fadd/fsub/fmul/fdiv to interpreter (#4446)
Fuzzgen found these as soon as I added float support
2022-07-14 21:53:03 +00:00
Afonso Bordado
4ea46c3ca8 cranelift: Implement table_addr in interpreter (#4433) 2022-07-13 12:53:42 -07:00
Afonso Bordado
16cb287c53 cranelift: Use round_ties_even for nearest in interpreter (#4413)
As @MaxGraey pointed out (thanks!) in #4397, `round` has different
 behavior from `nearest`. And it looks like the native rust
 implementation is still pending stabilization.

 Right now we duplicate the wasmtime implementation, merged in #2171.

 However, we definitely should switch to the rust native version
 when it is available.
2022-07-07 16:36:43 -07:00
Sam Parker
9c43749dfe [RFC] Dynamic Vector Support (#4200)
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which have no set size like their
existing counterparts. New IR instructions are added to access these
new stack entities.

Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.

The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.

Copyright (c) 2022, Arm Limited.
2022-07-07 12:54:39 -07:00
Afonso Bordado
f98076ae88 cranelift: Implement float rounding operations (#4397)
Implements the following operations on the interpreter:
* `ceil`
* `floor`
* `nearest`
* `trunc`
2022-07-06 16:43:54 -07:00
Afonso Bordado
9575ed4eb7 cranelift: Implement global_value in interpreter (#4396) 2022-07-06 15:53:52 -07:00