Commit Graph

3740 Commits

Author SHA1 Message Date
Alex Crichton
41ba851a95 Bump versions of wasm-tools crates (#4380)
* Bump versions of wasm-tools crates

Note that this leaves new features in the component model, outer type
aliases for core wasm types, unimplemented for now.

* Move to crates.io-based versions of tools
2022-07-05 14:23:03 -05:00
Dan Gohman
371ae80ac3 Migrate most of wasmtime from lazy_static to once_cell (#4368)
* Update tracing-core to a version which doesn't depend on lazy-static.

* Update crossbeam-utils to a version that doesn't depend on lazy-static.

* Update crossbeam-epoch to a version that doesn't depend on lazy-static.

* Update clap to a version that doesn't depend on lazy-static.

* Convert Wasmtime's own use of lazy_static to once_cell.

* Make `GDB_REGISTRATION`'s comment a doc comment.

* Fix compilation on Windows.
2022-07-05 10:52:48 -07:00
Sam Parker
d9e0e6a6a9 [AArch64] Port min/max to ISLE (#4374)
Copyright (c) 2022, Arm Limited.
2022-07-05 09:16:45 -07:00
Afonso Bordado
e91f493ff5 cranelift: Add heap support to the interpreter (#3302)
* cranelift: Add heaps to interpreter

* cranelift: Add RunTest Environment mechanism to  test interpret

* cranelift: Remove unused `MemoryError`

* cranelift: Add docs for `State::resolve_global_value`

* cranelift: Rename heap tests

* cranelift: Refactor heap address resolution

* Fix typos and clarify docs (thanks @cfallin)
2022-07-05 09:05:26 -07:00
Afonso Bordado
2003ae99a0 Implement fma/fabs/fneg/fcopysign on the interpreter (#4367)
* cranelift: Implement `fma` on interpreter

* cranelift: Implement `fabs` on interpreter

* cranelift: Fix `fneg` implementation on interpreter

`fneg` was implemented as `0 - x` which is not correct according to the
standard since that operation makes no guarantees on what the output
is when the input is `NaN`. However for `fneg` the output for `NaN`
inputs is fully defined.

* cranelift: Implement `fcopysign` on interpreter
2022-07-05 09:03:04 -07:00
wasmtime-publish
7c428bbd62 Bump Wasmtime to 0.40.0 (#4378)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-07-05 09:10:52 -05:00
Afonso Bordado
f2e6ff5e70 cranelift: Implement sqrt in interpreter (#4362)
This ignores SIMD for now.
2022-07-01 09:39:11 -07:00
Afonso Bordado
38ecd3744f aarch64: Implement bmask/bextend in ISLE (#4358)
* aarch64: Implement `bmask`/`bextend` in ISLE

* cranelift: Remove vector versions of `bextend`

* aarch64: Cleanup `bmask`/`bextend` documentation
2022-07-01 09:37:18 -07:00
Dan Gohman
64759f04a4 Migrate cranelift-jit from winapi to windows-sys (#4363)
* Migrate cranelift-jit from `winapi` to `windows-sys`

Following up on #4346, this migrates one more place in the tree from
winapi to windows-sys.
2022-07-01 08:41:02 -07:00
Ulrich Weigand
ec83144c88 s390x: use full vector register file for FP operations (#4360)
This defines the full set of 32 128-bit vector registers on s390x.
(Note that the VRs overlap the existing FPRs.)  In addition, this
adds support to use all 32 vector registers to implement floating-
point operations, by using vector floating-point instructions with
the 'W' bit set to operate only on the first element.

This part of the vector instruction set mostly matches the old FP
instruction set, with two exceptions:

- There is no vector version of the COPY SIGN instruction.  Instead,
  now use a VECTOR SELECT with an appropriate bit mask to implement
  the fcopysign operation.

- There are no vector version of the float <-> int conversion
  instructions where source and target differ in bit size.  Use
  appropriate multiple conversion steps instead.  This also requires
  use of explicit checking to implement correct overflow handling.
  As a side effect, this version now also implements the i8 / i16
  variants of all conversions, which had been missing so far.

For all operations except those two above, we continue to use the
old FP instruction if applicable (i.e. if all operands happen to
have been allocated to the original FP register set), and use the
vector instruction otherwise.
2022-06-30 16:33:39 -07:00
Sam Parker
a2d49ebf27 Use u32 in Type API (#4280)
Move from passing and returning u8 and u16 values to u32 in many of
the functions. This removes a number of type conversions and gives
a small compilation time speedup, around ~0.7% on my aarch64 machine.

Copyright (c) 2022, Arm Limited.
2022-06-30 12:43:36 -07:00
Ulrich Weigand
95836ba114 s390x: clean up lower.rs (#4355)
Now that lowering is fully done in ISLE, clean up some code remnants
in lower.rs.  In particular, move code to lower/isle.rs where
possible, and inline lower_insn_to_regs into its caller and simplify.
2022-06-30 11:16:59 -07:00
Afonso Bordado
919604b8c5 aarch64: Implement ireduce/breduce in ISLE (#4331)
* aarch64: Implement `ireduce`/`breduce` in ISLE

* cranelift: Remove vector versions of `breduce`/`ireduce`
2022-06-30 11:15:47 -07:00
bjorn3
d1446f767d Mark return value as define instead of clobber for TLS pseudoinstructions (#4357) 2022-06-30 10:44:51 -07:00
Ulrich Weigand
7a9479f77c ISLE: Migrate call and return instructions (#3785)
This adds infrastructure to allow implementing call and return
instructions in ISLE, and migrates the s390x back-end.

To implement ABI details, this patch creates public accessors
for `ABISig` and makes them accessible in ISLE.  All actual
code generation is then done in ISLE rules, following the
information provided by that signature.

[ Note that the s390x back end never requires multiple slots for
a single argument - the infrastructure to handle this should
already be present, however. ]

To implement loops in ISLE rules, this patch uses regular tail
recursion, employing a `Range` data structure holding a range
of integers to be looped over.
2022-06-29 14:22:50 -07:00
Waleed Dahshan
688168b4d7 Fix a mistake in the language reference (#4352)
It is clear that the third rule does not contribute to the rewriting of the expression `(A (B (D 42)))` to `(C (D 42))` to `(E 42)`.
2022-06-29 12:17:41 -07:00
Sam Parker
fb61774df2 [AArch64] Port AtomicLoad and AtomicStore to ISLE (#4301)
Copyright (c) 2022, Arm Limited.
2022-06-29 12:12:48 -07:00
Chris Fallin
2034c8aa45 Cranelift: add a config option for alias analysis and redundant-load elimination. (#4349)
This allows for experiments as in here [1] and also generally gives an
option to anyone who is concerned that the extra optimization may be
counterproductive or take too much time. The optimization remains
enabled by default.

[1]
https://github.com/bytecodealliance/wasmtime/pull/4163#issuecomment-1169303683
2022-06-28 15:25:47 -07:00
Chris Fallin
b2e28b917a Cranelift: update to latest regalloc2: (#4324)
- Handle call instructions' clobbers with the clobbers API, using RA2's
  clobbers bitmask (bytecodealliance/regalloc2#58) rather than clobbers
  list;

- Pull in changes from bytecodealliance/regalloc2#59 for much more sane
  edge-case behavior w.r.t. liverange splitting.
2022-06-28 09:01:59 -07:00
Afonso Bordado
42d4f97b78 cranelift: Fix cls for small types on aarch64 (#4305)
The previous `cls` code was producing wrong results when fed with a -1 i8.

The fix here is to sign extend instead of zero extending since we want
to keep the sign bit as one in order for it to be counted correctly
in the cls instruction

This also merges the interpreter only tests now that aarch64
correctly supports this instruction
2022-06-27 15:55:02 -07:00
Afonso Bordado
aef53784ec aarch64: Implement bint in ISLE (#4319) 2022-06-27 15:50:46 -07:00
Chris Fallin
0d829a57ee Upgrade to regalloc2 v0.2.3 to get bugfix from bytecodealliance/regalloc2#60. (#4335)
* Upgrade to regalloc2 v0.2.3 to get bugfix from bytecodealliance/regalloc2#60.

* Update RELEASES.md.

* Update two compile tests based on slightly shifting regalloc output.
2022-06-27 15:58:54 -05:00
Chris Fallin
5c2c285dd7 Cranelift/x64: fix register allocator metadata for 8-bit divides. (#4332)
`idiv` on x86-64 only reads `rdx`/`edx`/`dx`/`dl` for divides with width
greater than 8 bits; for an 8-bit divide, it reads the whole 16-bit
divisor from `ax`, as our CISC ancestors intended. This PR fixes the
metadata to avoid a regalloc panic (due to undefined `rdx`) in this
case. Does not affect Wasmtime or other Wasm-frontend embedders.
2022-06-27 12:31:06 -07:00
Afonso Bordado
2327127b7d cranelift: Support boolean arguments larger than b1 in trampoline (#4323) 2022-06-27 11:51:55 -07:00
Alex Crichton
dc2fe0ac67 x64: Fix codegen for the i8x16.swizzle instruction (#4318)
This commit fixes a mistake in the `Swizzle` opcode implementation in
the x64 backend of Cranelift. Previously an input register was casted to
a writable register and then modified, which I believe instructions are
not supposed to do. This was discovered as part of my investigation
into #4315.
2022-06-27 13:20:31 -05:00
Alex Crichton
8bb07523e2 x64: Fix codegen for the select instruction with v128 (#4317)
This commit fixes a bug in the previous codegen for the `select`
instruction when the operations of the `select` were of the `v128` type.
Previously teh `XmmCmove` instruction only stored an `OperandSize` of 32
or 64 for a 64 or 32-bit move, but this was also used for these 128-bit
types which meant that when used the wrong move instruction was
generated. The fix applied here is to store the whole `Type` being moved
so the 128-bit variant can be selected as well.
2022-06-27 11:02:40 -07:00
Afonso Bordado
23ae9016af cranelift: Implement scalar ireduce on interpreter (#4320) 2022-06-27 11:00:37 -07:00
Afonso Bordado
87007c5839 cranelift: Fix bint implementation on interpreter (#4299)
* cranelift: Fix `bint` implementation on interpreter

The interpreter was returning -1 instead of 1 for positive values.
This also extends the bint test suite to cover all types.

* cranelift: Restrict `bint` to scalar values only
2022-06-23 13:43:35 -07:00
Afonso Bordado
51c1655b6e cranelift: Remove duplicated clz/ctz tests (#4304)
These were interpreter only since none of the architectures supported
them but we added support for these instructions when moving to ISLE (#72e2b7fe)
2022-06-23 13:37:16 -07:00
Anton Kirilov
25a588c35f Cranelift AArch64: Use an allocated encoding for Udf (#4281)
Preserve the current behaviour when code is generated for SpiderMonkey.

Copyright (c) 2022, Arm Limited.
2022-06-22 15:03:28 +01:00
Trevor Elliott
337c1ca832 Use similar to diff expected and actual output in filetests (#4282) 2022-06-16 14:27:49 -05:00
Trevor Elliott
7e0bb465d0 X64: port the rest of icmp to ISLE (#4254)
Finish migrating icmp to ISLE for x64
2022-06-13 16:34:11 -07:00
Benjamin Bouvier
43d4f0b93b Serialize BlockNode's cold field too when serializing a Layout (#4265)
This fixes a bug when the `cold` field would not be serialized, since
we're using a custom (de)serializer for `Layout`. This is now properly
handled by adding a boolean in the serialized stream.

This was caught during the work on #4155, as this would result in cache
mismatches between a function and itself.
2022-06-13 12:04:37 -07:00
Alex Crichton
7d7ddceb17 Update wasm-tools crates (#4246)
This commit updates the wasm-tools family of crates, notably pulling in
the refactorings and updates from bytecodealliance/wasm-tools#621 for
the latest iteration of the component model. This commit additionally
updates all support for the component model for these changes, notably:

* Many bits and pieces of type information was refactored. Many
  `FooTypeIndex` namings are now `TypeFooIndex`. Additionally there is
  now `TypeIndex` as well as `ComponentTypeIndex` for the two type index
  spaces in a component.

* A number of new sections are now processed to handle the core and
  component variants.

* Internal maps were split such as the `funcs` map into
  `component_funcs` and `funcs` (same for `instances`).

* Canonical options are now processed individually instead of one bulk
  `into` definition.

Overall this was not a major update to the internals of handling the
component model in Wasmtime. Instead this was mostly a surface-level
refactoring to make sure that everything lines up with the new binary
format for components.

* All text syntax used in tests was updated to the new syntax.
2022-06-09 11:16:07 -05:00
Anton Kirilov
c15c3061ca CFI improvements to the AArch64 fiber implementation (#4195)
Now the fiber implementation on AArch64 authenticates function
return addresses and includes the relevant BTI instructions, except
on macOS.

Also, change the locations of the saved FP and LR registers on the
fiber stack to make them compliant with the Procedure Call Standard
for the Arm 64-bit Architecture.

Copyright (c) 2022, Arm Limited.
2022-06-09 09:17:12 -05:00
Trevor Elliott
823817595a Fix some typos in the isle language reference (#4248) 2022-06-08 16:01:14 -07:00
Chris Fallin
5033f9994b cranelift-native flags detection: fix flags on SSE2-only systems. (#4231)
In #4224 we saw that an SSE2-only x86-64 system somehow was still
 detecting SSE3/SSSE3/SSE4.1/SSE4.2. It turns out that we enabled these
 in the baseline `Flags` in #3816, because without that, a ton of other
 things break: default flags no longer produce a compiler backend that
 works with default Wasmtime settings. However the logic to set them
 when detected (via `CPUID`-using feature-test macros) only does an "if
 detected then set bit" step per feature; the bits are never *cleared*.
 This PR fixes that.
2022-06-08 13:48:41 -07:00
Chris Fallin
ed9db962de x64 backend: fix cmpxchg (don't return RealReg as result). (#4243)
The current lowering helper for `cmpxchg` returns the literal RealReg
`rax` as its result. However, this breaks a number of invariants, and
eventually causes a regalloc panic if used as a blockparam arg (pinned
vregs cannot be used in this way).

In general we have to return regular vregs, not a RealReg, as results of
instructions during lowering. However #4223 added a helper for
`x64_cmpxchg` that returns a literal `rax`.

Fortunately we can do the right thing here by just giving a fresh vreg
to the instruction; the regalloc constraints mean that this vreg is
constrained to `rax` at the instruction (at its def/late point), so the
generator of the instruction need not worry about `rax` here.
2022-06-08 06:13:31 -07:00
Trevor Elliott
bc3c4fa206 X64: port fvpromote to ISLE (#4242) 2022-06-07 17:18:23 -07:00
Chris Fallin
54acd8b3e2 x64 backend: fix to_amode with constant address (no registers). (#4239)
If an address expression is given to `to_amode` that is completely
constant (no registers at all), then it will produce an `Amode` that has
the resulting constant as an offset, and `(invalid_reg)` as the base.
This is a side-effect of the way we build up the amode step-by-step --
we're waiting to see a register and plug it into the base field. If we
never get a reg though, we need to generate a constant zero into a
register and use that as the base. This PR adds a `finalize_amode`
helper to do just that.

Fixes #4234.
2022-06-07 11:40:10 -07:00
Johnnie Birch
3f152273d3 X64: Port fpromote to ISLE (#4230) 2022-06-06 14:47:44 -07:00
Andrew Brown
6df56e6aa6 x64: port atomic_cas to ISLE (#4223) 2022-06-06 13:20:33 -07:00
Chris Fallin
d8ba1ddc86 Update Cranelift-ISLE integration docs to reflect no more checked-in code. (#4229)
* Update Cranelift-ISLE integration docs to reflect no more checked-in code.

In #4143, we removed the checked-in-generated-code aspect of the ISLE
build process, in order to simplify the development cycle and reduce
errors. However, I failed to update the docs at the same time. This PR
fixes that. Supersedes #4228 (thanks @jlb6740 for noticing this issue!).

* fix typo
2022-06-06 12:54:23 -07:00
Sam Parker
acfeda4d80 [AArch64] Port IaddPairwise to ISLE (#4201)
Copyright (c) 2022, Arm Limited.
2022-06-06 15:37:13 +01:00
wasmtime-publish
55946704cb Bump Wasmtime to 0.39.0 (#4225)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-06-06 09:12:47 -05:00
Chris Fallin
ae2c84205f Upgrade to regalloc2 v0.2.2. (#4222)
Pulls in an improvement to spillslot allocation
(bytecodealliance/regalloc2#56).
2022-06-03 17:12:32 -07:00
Chris Fallin
d65d8b25a5 Update Cranelift README. (#4220)
Our README was starting to show its age; it did not reflect the current
status of Cranelift well with respect to production maturity, current
supported backends, or performance. This PR makes a pass over the
"Status" section to fix that. It also removes some old/out-of-date
details, like `no_std` support (which has bitrotted).
2022-06-03 16:02:57 -07:00
Andrew Brown
816aae6aca x64: port some atomics to ISLE (#4212)
* x64: port `fence` to ISLE
* x64: port `atomic_load` to ISLE
* x64: port `atomic_store` to ISLE
2022-06-02 14:13:10 -07:00
Chris Fallin
44d1dee76e Fix failing cranelift-object tests: make panic message match more generic. (#4211)
Rust 1.61 changed the way `Debug` output looks for strings with null
bytes in them, which broke some expected-panic error message matches.
This makes the expectations more generic while still capturing the
important part ("has a null byte").
2022-06-02 13:34:10 -07:00
Chris Fallin
8f61eb9341 Upgrade to regalloc2 version 0.2.1. (#4199)
This resolves an edge-case where mul.i128 with an input that continues
to be live after the instruction could cause an invalid regalloc
constraint (basically, the regalloc did not previously support an
instruction use and def both being constrained to the same physical reg;
and the "mul" variant used for mul.i128 on x64 was the only instance of
such operands in Cranelift).

Causes two extra move instructions in the mul.i128 filetest, but that's
the price to pay for the slightly more general (works in all cases)
handling of the constraints.
2022-06-01 13:26:20 -07:00