Commit Graph

555 Commits

Author SHA1 Message Date
yuyang-ok
cdecc858b4 add riscv64 backend for cranelift. (#4271)
Add a RISC-V 64 (`riscv64`, RV64GC) backend.

Co-authored-by: yuyang <756445638@qq.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
Co-authored-by: Afonso Bordado <afonsobordado@az8.co>
2022-09-27 17:30:31 -07:00
Alex Crichton
7b311004b5 Leverage Cargo's workspace inheritance feature (#4905)
* Leverage Cargo's workspace inheritance feature

This commit is an attempt to reduce the complexity of the Cargo
manifests in this repository with Cargo's workspace-inheritance feature
becoming stable in Rust 1.64.0. This feature allows specifying fields in
the root workspace `Cargo.toml` which are then reused throughout the
workspace. For example this PR shares definitions such as:

* All of the Wasmtime-family of crates now use `version.workspace =
  true` to have a single location which defines the version number.
* All crates use `edition.workspace = true` to have one default edition
  for the entire workspace.
* Common dependencies are listed in `[workspace.dependencies]` to avoid
  typing the same version number in a lot of different places (e.g. the
  `wasmparser = "0.89.0"` is now in just one spot.

Currently the workspace-inheritance feature doesn't allow having two
different versions to inherit, so all of the Cranelift-family of crates
still manually specify their version. The inter-crate dependencies,
however, are shared amongst the root workspace.

This feature can be seen as a method of "preprocessing" of sorts for
Cargo manifests. This will help us develop Wasmtime but shouldn't have
any actual impact on the published artifacts -- everything's dependency
lists are still the same.

* Fix wasi-crypto tests
2022-09-26 11:30:01 -05:00
Damian Heaton
e786bda002 Vector bitcast support (AArch64 & Interpreter) (#4820)
* Vector bitcast support (AArch64 & Interpreter)

Implemented support for `bitcast` on vector values for AArch64 and the
interpreter.

Also corrected the verifier to ensure that the size, in bits, of the input and
output types match for a `bitcast`, per the docs.

Copyright (c) 2022 Arm Limited

* `I128` same-type bitcast support

Copyright (c) 2022 Arm Limited

* Directly return input for 64-bit GPR<=>GPR bitcast

Copyright (c) 2022 Arm Limited
2022-09-21 09:20:28 -07:00
Anton Kirilov
d8b290898c Initial forward-edge CFI implementation (#3693)
* Initial forward-edge CFI implementation

Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.

Copyright (c) 2022, Arm Limited.

* Refactor `from_artifacts` to avoid second `make_executable` (#1)

This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.

* Address the code review feedback

Copyright (c) 2022, Arm Limited.

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-09-08 09:35:58 -05:00
Jamey Sharp
3d6d49daba cranelift: Remove of/nof overflow flags from icmp (#4879)
* cranelift: Remove of/nof overflow flags from icmp

Neither Wasmtime nor cg-clif use these flags under any circumstances.
From discussion on #3060 I see it's long been unclear what purpose these
flags served.

Fixes #3060, fixes #4406, and fixes #4875... by deleting all the code
that could have been buggy.

This changes the cranelift-fuzzgen input format by removing some IntCC
options, so I've gone ahead and enabled I128 icmp tests at the same
time. Since only the of/nof cases were failing before, I expect these to
work.

* Restore trapif tests

It's still useful to validate that iadd_ifcout's iflags result can be
forwarded correctly to trapif, and for that purpose it doesn't really
matter what condition code is checked.
2022-09-07 08:38:41 -07:00
Alex Crichton
65930640f8 Bump Wasmtime to 2.0.0 (#4874)
This commit replaces #4869 and represents the actual version bump that
should have happened had I remembered to bump the in-tree version of
Wasmtime to 1.0.0 prior to the branch-cut date. Alas!
2022-09-06 13:49:56 -05:00
Afonso Bordado
08e7a7f1a0 cranelift: Add inline stack probing for x64 (#4747)
* cranelift: Add inline stack probe for x64

* cranelift: Cleanups comments

Thanks @jameysharp!
2022-09-01 22:32:54 +00:00
Jamey Sharp
84ac24c23d cranelift: Remove const_addr instruction (fixes #2398) (#4843) 2022-09-01 21:57:37 +00:00
Johnnie Birch
6368c6b188 Modifies fcvt_to_sint and fcvt_to_unit clif to make scalar only (#4794)
Closes #4693.
2022-08-29 08:40:39 -07:00
Afonso Bordado
d620705a32 Fix Invalid Instruction format in fuzzgen (#4738)
* cranelift: Add assert to prevent wrong InstFormat being used for the wrong opcode

* cranelift: Use correct instruction format when inserting opcodes in fuzzgen (fixes #4733)

* cranelift: Use debug assert on InstFormat assert
2022-08-20 00:49:54 +00:00
Afonso Bordado
e577a76c0d cranelift: Sign extend immediates in instructions that embed them. (#4602)
* cranelift: Sign extend immediates in instructions that embed them.

* cranelift: Clarify imm instruction behaviour

* cranelift: Deduplicate imm_const

* cranelift: zero extend logical imm ops
2022-08-15 18:08:20 +00:00
Benjamin Bouvier
8a9b1a9025 Implement an incremental compilation cache for Cranelift (#4551)
This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime. 

After the suggestion of Chris, `Function` has been split into mostly two parts:

- on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`.
- on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on.

Most changes are here to accomodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache:

- most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set.
- user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- some refactorings have been made for function names:
  - `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name.
  - The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with.

The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions.

A basic fuzz target has been introduced that tries to do the bare minimum:

- check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function.
- check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache.
  - This last check is less efficient and less likely to happen, so probably should be rethought a bit.

Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip.

Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement. 

Fixes #4155.
2022-08-12 16:47:43 +00:00
Jamey Sharp
ecb91c0b06 List preset's settings in generated comment (#4679)
Figuring out which boolean settings go into each preset is not easy by
inspecting the DSL source (e.g. meta/src/isa/x86.rs). This patch extends
the comments in the Rust that's generated by that DSL to list the names
of the settings together with the name of the preset.
2022-08-10 19:56:23 +00:00
Damian Heaton
e463890f26 Port AvgRound & SqmulRoundSat to ISLE (AArch64) (#4639)
Ported the existing implementations of the following opcodes on AArch64
to ISLE:
- `AvgRound`
  - Also introduced support for `i64x2` vectors, as per the docs.
- `SqmulRoundSat`

Copyright (c) 2022 Arm Limited
2022-08-08 11:35:43 -07:00
wasmtime-publish
412fa04911 Bump Wasmtime to 0.41.0 (#4620)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-08-04 20:02:19 -05:00
Anton Kirilov
a897742593 Initial back-edge CFI implementation (#3606)
Give the user the option to sign and to authenticate function
return addresses with the operations introduced by the Pointer
Authentication extension to the Arm instruction set architecture.

Copyright (c) 2021, Arm Limited.
2022-08-03 11:08:29 -07:00
Nick Fitzgerald
42bba452a6 Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573)
* Cranelift: Add instructions for getting the current stack/frame pointers and return address

This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535

* x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead

We just special case getting operands from `Amode`s now.

* Fix s390x `get_return_address`; require `preserve_frame_pointers=true`

* Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp

* Handle non-allocatable registers in Amode::with_allocs

* Use "stack" instead of "r15" on s390x

* r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`
2022-08-02 14:37:17 -07:00
Chris Fallin
8dddd6f1f7 Cranelift: Remove ifcmp_sp opcode. (#4578)
This was temporarily added back in #3502 due to a need from Lucet; now
that Lucet is EOL, the opcode is no longer needed and we can remove it.
2022-08-02 13:15:39 -07:00
Chris Fallin
43f1765272 Cranellift: remove Baldrdash support and related features. (#4571)
* Cranellift: remove Baldrdash support and related features.

As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team
has recently determined that their current form of integration with
Cranelift is too hard to maintain, and they have chosen to remove it
from their codebase. If and when they decide to build updated support
for Cranelift, they will adopt different approaches to several details
of the integration.

In the meantime, after discussion with the SpiderMonkey folks, they
agree that it makes sense to remove the bits of Cranelift that exist
to support the integration ("Baldrdash"), as they will not need
them. Many of these bits are difficult-to-maintain special cases that
are not actually tested in Cranelift proper: for example, the
Baldrdash integration required Cranelift to emit function bodies
without prologues/epilogues, and instead communicate very precise
information about the expected frame size and layout, then stitched
together something post-facto. This was brittle and caused a lot of
incidental complexity ("fallthrough returns", the resulting special
logic in block-ordering); this is just one example. As another
example, one particular Baldrdash ABI variant processed stack args in
reverse order, so our ABI code had to support both traversal
orders. We had a number of other Baldrdash-specific settings as well
that did various special things.

This PR removes Baldrdash ABI support, the `fallthrough_return`
instruction, and pulls some threads to remove now-unused bits as a
result of those two, with the  understanding that the SpiderMonkey folks
will build new functionality as needed in the future and we can perhaps
find cleaner abstractions to make it all work.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425

* Review feedback.

* Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.

The debugger tests invoke `wasmtime` from within each test case under
the control of a debugger (gdb or lldb). Some of these tests started to
inexplicably fail in CI with unrelated changes, and the failures were
only inconsistently reproducible locally. It seems to be cache related:
if we disable cached compilation on the nested `wasmtime` invocations,
the tests consistently pass.

* Review feedback.
2022-08-02 19:37:56 +00:00
Afonso Bordado
02c3b47db2 x64: Implement SIMD fma (#4474)
* x64: Add VEX Instruction Encoder

This uses a similar builder pattern to the EVEX Encoder.
Does not yet support memory accesses.

* x64: Add FMA Flag

* x64: Implement SIMD `fma`

* x64: Use 4 register Vex Inst

* x64: Reorder VEX pretty print args
2022-07-25 22:01:02 +00:00
Damian Heaton
3ef89b7787 Allow 64-bit vectors and implement for interpreter (#4509)
* Allow 64-bit vectors and implement for interpreter

The AArch64 backend already supports 64-bit vectors; this simply allows
instructions to make use of that.

Implemented support for 64-bit vectors within the interpreter to allow
interpret runtests to use them.

Copyright (c) 2022 Arm Limited

* Disable 64-bit SIMD `iaddpairwise` tests on s390x

Copyright (c) 2022 Arm Limited
2022-07-25 13:00:43 -07:00
Afonso Bordado
af62037f62 cranelift: Restrict br_table to i32 indices (#4510)
* cranelift: Restrict `br_table` to `i32` indices

In #4498 it was proposed that we should only accept `i32` indices
to `br_table`. The rationale for this is that larger types lead the
users to a false sense of flexibility (since we don't support jump
tables larger than u32's), and narrower types are not well tested
paths that would be safer if we removed them.

* cranelift: Reduce directly from i128 to i32 in Switch
2022-07-22 23:32:40 +00:00
Benjamin Bouvier
4ce329d1eb Add a cranelift flag to enable/disable verbose logs for regalloc2 (#4481) 2022-07-21 09:12:13 +00:00
Nick Fitzgerald
22d91a7c84 cranelift: Add a flag for preserving frame pointers (#4469)
Preserving frame pointers -- even inside leaf functions -- makes it easy to
capture the stack of a running program, without requiring any side tables or
metadata (like `.eh_frame` sections). Many sampling profilers and similar tools
walk frame pointers to capture stacks. Enabling this option will play nice with
those tools.
2022-07-20 08:02:21 -07:00
Sam Parker
9c43749dfe [RFC] Dynamic Vector Support (#4200)
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which have no set size like their
existing counterparts. New IR instructions are added to access these
new stack entities.

Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.

The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.

Copyright (c) 2022, Arm Limited.
2022-07-07 12:54:39 -07:00
Chris Fallin
00f357c028 Cranelift: support 14-bit Type index with some bitpacking. (#4269)
* Cranelift: make `ir::Type` a `u16`.

* Cranelift: pack ValueData back into 64 bits.

After extending `Type` to a `u16`, `ValueData` became 12 bytes rather
than 8. This packs it back down to 8 bytes (64 bits) by stealing two
bits from the `Type` for the enum discriminant (leaving 14 bits for the
type itself).

Performance comparison (3-way between original (`ty-u8`), 16-bit `Type`
(`ty-u16`), and this PR (`ty-packed`)):

```
~/work/sightglass% target/release/sightglass-cli benchmark \
    -e ~/ty-u8.so -e ~/ty-u16.so -e ~/ty-packed.so \
    --iterations-per-process 10 --processes 2 \
    benchmarks-next/spidermonkey/benchmark.wasm

compilation
  benchmarks-next/spidermonkey/benchmark.wasm
    cycles
      [20654406874 21749213920.50 22958520306] /home/cfallin/ty-packed.so
      [22227738316 22584704883.90 22916433748] /home/cfallin/ty-u16.so
      [20659150490 21598675968.60 22588108428] /home/cfallin/ty-u8.so
    nanoseconds
      [5435333269 5723139427.25 6041072883] /home/cfallin/ty-packed.so
      [5848788229 5942729637.85 6030030341] /home/cfallin/ty-u16.so
      [5436002390 5683248226.10 5943626225] /home/cfallin/ty-u8.so
```

So, when compiling SpiderMonkey.wasm, making `Type` 16 bits regresses
performance by 4.5% (5.683s -> 5.723s), while this PR gets 14 bits for a 1.0%
cost (5.683s -> 5.723s). That's still not great, and we can likely do better,
but it's a start.

* Fix test failure: entities to/from u32 via `{from,to}_bits`, not `{from,to}_u32`.
2022-07-05 14:51:02 -07:00
wasmtime-publish
7c428bbd62 Bump Wasmtime to 0.40.0 (#4378)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-07-05 09:10:52 -05:00
Afonso Bordado
38ecd3744f aarch64: Implement bmask/bextend in ISLE (#4358)
* aarch64: Implement `bmask`/`bextend` in ISLE

* cranelift: Remove vector versions of `bextend`

* aarch64: Cleanup `bmask`/`bextend` documentation
2022-07-01 09:37:18 -07:00
Afonso Bordado
919604b8c5 aarch64: Implement ireduce/breduce in ISLE (#4331)
* aarch64: Implement `ireduce`/`breduce` in ISLE

* cranelift: Remove vector versions of `breduce`/`ireduce`
2022-06-30 11:15:47 -07:00
Chris Fallin
2034c8aa45 Cranelift: add a config option for alias analysis and redundant-load elimination. (#4349)
This allows for experiments as in here [1] and also generally gives an
option to anyone who is concerned that the extra optimization may be
counterproductive or take too much time. The optimization remains
enabled by default.

[1]
https://github.com/bytecodealliance/wasmtime/pull/4163#issuecomment-1169303683
2022-06-28 15:25:47 -07:00
Afonso Bordado
87007c5839 cranelift: Fix bint implementation on interpreter (#4299)
* cranelift: Fix `bint` implementation on interpreter

The interpreter was returning -1 instead of 1 for positive values.
This also extends the bint test suite to cover all types.

* cranelift: Restrict `bint` to scalar values only
2022-06-23 13:43:35 -07:00
wasmtime-publish
55946704cb Bump Wasmtime to 0.39.0 (#4225)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-06-06 09:12:47 -05:00
bjorn3
c4eab2beb6 Avoid spurious build script runs (#4150)
* Don't attempt to track the generated clif.isle in cargo

This causes the build script to rerun every time for me.

* Put build script debug messages on stderr instead of stdout

This keeps stdout reserved for cargo build script directives
2022-05-12 11:49:20 -07:00
Chris Fallin
5d671952ee Cranelift: do not check in generated ISLE code; regenerate on every compile. (#4143)
This PR fixes #4066: it modifies the Cranelift `build.rs` workflow to
invoke the ISLE DSL compiler on every compilation, rather than only
when the user specifies a special "rebuild ISLE" feature.

The main benefit of this change is that it vastly simplifies the mental
model required of developers, and removes a bunch of failure modes
we have tried to work around in other ways. There is now just one
"source of truth", the ISLE source itself, in the repository, and so there
is no need to understand a special "rebuild" step and how to handle
merge errors. There is no special process needed to develop the compiler
when modifying the DSL. And there is no "noise" in the git history produced
by constantly-regenerated files.

The two main downsides we discussed in #4066 are:
- Compile time could increase, by adding more to the "meta" step before the main build;
- It becomes less obvious where the source definitions are (everything becomes
  more "magic"), which makes exploration and debugging harder.

This PR addresses each of these concerns:

1. To maintain reasonable compile time, it includes work to cut down the
   dependencies of the `cranelift-isle` crate to *nothing* (only the Rust stdlib),
   in the default build. It does this by putting the error-reporting bits
   (`miette` crate) under an optional feature, and the logging (`log` crate) under
   a feature-controlled macro, and manually writing an `Error` impl rather than
   using `thiserror`. This completely avoids proc macros and the `syn` build slowness.

   The user can still get nice errors out of `miette`: this is enabled by specifying
   a Cargo feature `--features isle-errors`.

2. To allow the user to optionally inspect the generated source, which nominally
   lives in a hard-to-find path inside `target/` now, this PR adds a feature `isle-in-source-tree`
   that, as implied by the name, moves the target for ISLE generated source into
   the source tree, at `cranelift/codegen/isle_generated_source/`. It seems reasonable
   to do this when an explicit feature (opt-in) is specified because this is how ISLE regeneration
   currently works as well. To prevent surprises, if the feature is *not* specified, the
   build fails if this directory exists.
2022-05-11 22:25:24 -07:00
Chris Fallin
7c5a56b836 Cranelift: division/remainder CLIF ops are scalar-only. (#4141)
In #4104 we discussed whether it makes sense for the division and
remainder ops to support vector types. We concluded that because most
hardware doesn't support it directly, it probably is not ideal to force
all backends to polyfill it. In the future we can always reverse this
decision, perhaps with a platform-independent legalization.

This PR restricts the allowed types on the CLIF ops to integer types
only.
2022-05-11 11:10:02 -07:00
wasmtime-publish
9a6854456d Bump Wasmtime to 0.38.0 (#4103)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-05-05 13:43:02 -05:00
Chris Fallin
61dc38c065 Implement Spectre mitigations for table accesses and br_tables. (#4092)
Currently, we have partial Spectre mitigation: we protect heap accesses
with dynamic bounds checks. Specifically, we guard against errant
accesses on the misspeculated path beyond the bounds-check conditional
branch by adding a conditional move that is also dependent on the
bounds-check condition. This data dependency on the condition is not
speculated and thus will always pick the "safe" value (in the heap case,
a NULL address) on the misspeculated path, until the pipeline flushes
and recovers onto the correct path.

This PR uses the same technique both for table accesses -- used to
implement Wasm tables -- and for jumptables, used to implement Wasm
`br_table` instructions.

In the case of Wasm tables, the cmove picks the table base address on
the misspeculated path. This is equivalent to reading the first table
entry. This prevents loads of arbitrary data addresses on the
misspeculated path.

In the case of `br_table`, the cmove picks index 0 on the misspeculated
path. This is safer than allowing a branch to an address loaded from an
index under misspeculation (i.e., it preserves control-flow integrity
even under misspeculation).

The table mitigation is controlled by a Cranelift setting, on by
default. The br_table mitigation is always on, because it is part of the
single lowering pseudoinstruction. In both cases, the impact should be
minimal: a single extra cmove in a (relatively) rarely-used operation.

The table mitigation is architecture-independent (happens during
legalization); the br_table mitigation has been implemented for both x64
and aarch64. (I don't know enough about s390x to implement this
confidently there, but would happily review a PR to do the same on that
platform.)
2022-05-02 11:19:16 -07:00
Chris Fallin
0af8737ec3 Add support for running the regalloc2 checker. (#4043)
With these fixes, all this PR has to do is instantiate and run the
checker on the `regalloc2::Output`. This is off by default, and is
enabled by setting the `regalloc_checker` Cranelift option.

This restores the old functionality provided by e.g. the
`backtracking_checked` regalloc algorithm setting rather than
`backtracking` when we were still on regalloc.rs.
2022-04-18 14:06:07 -07:00
Chris Fallin
a0318f36f0 Switch Cranelift over to regalloc2. (#3989)
This PR switches Cranelift over to the new register allocator, regalloc2.

See [this document](https://gist.github.com/cfallin/08553421a91f150254fe878f67301801)
for a summary of the design changes. This switchover has implications for
core VCode/MachInst types and the lowering pass.

Overall, this change brings improvements to both compile time and speed of
generated code (runtime), as reported in #3942:

```
Benchmark       Compilation (wallclock)     Execution (wallclock)
blake3-scalar   25% faster                  28% faster
blake3-simd     no diff                     no diff
meshoptimizer   19% faster                  17% faster
pulldown-cmark  17% faster                  no diff
bz2             15% faster                  no diff
SpiderMonkey,   21% faster                  2% faster
  fib(30)
clang.wasm      42% faster                  N/A
```
2022-04-14 10:28:21 -07:00
wasmtime-publish
78a595ac88 Bump Wasmtime to 0.37.0 (#3994)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-04-05 09:24:28 -05:00
Alex Crichton
7b5176baea Upgrade all crates to the Rust 2021 edition (#3991)
* Upgrade all crates to the Rust 2021 edition

I've personally started using the new format strings for things like
`panic!("some message {foo}")` or similar and have been upgrading crates
on a case-by-case basis, but I think it probably makes more sense to go
ahead and blanket upgrade everything so 2021 features are always
available.

* Fix compile of the C API

* Fix a warning

* Fix another warning
2022-04-04 12:27:12 -05:00
Alex Crichton
c89dc55108 Add a two-week delay to Wasmtime's release process (#3955)
* Bump to 0.36.0

* Add a two-week delay to Wasmtime's release process

This commit is a proposal to update Wasmtime's release process with a
two-week delay from branching a release until it's actually officially
released. We've had two issues lately that came up which led to this proposal:

* In #3915 it was realized that changes just before the 0.35.0 release
  weren't enough for an embedding use case, but the PR didn't meet the
  expectations for a full patch release.

* At Fastly we were about to start rolling out a new version of Wasmtime
  when over the weekend the fuzz bug #3951 was found. This led to the
  desire internally to have a "must have been fuzzed for this long"
  period of time for Wasmtime changes which we felt were better
  reflected in the release process itself rather than something about
  Fastly's own integration with Wasmtime.

This commit updates the automation for releases to unconditionally
create a `release-X.Y.Z` branch on the 5th of every month. The actual
release from this branch is then performed on the 20th of every month,
roughly two weeks later. This should provide a period of time to ensure
that all changes in a release are fuzzed for at least two weeks and
avoid any further surprises. This should also help with any last-minute
changes made just before a release if they need tweaking since
backporting to a not-yet-released branch is much easier.

Overall there are some new properties about Wasmtime with this proposal
as well:

* The `main` branch will always have a section in `RELEASES.md` which is
  listed as "Unreleased" for us to fill out.
* The `main` branch will always be a version ahead of the latest
  release. For example it will be bump pre-emptively as part of the
  release process on the 5th where if `release-2.0.0` was created then
  the `main` branch will have 3.0.0 Wasmtime.
* Dates for major versions are automatically updated in the
  `RELEASES.md` notes.

The associated documentation for our release process is updated and the
various scripts should all be updated now as well with this commit.

* Add notes on a security patch

* Clarify security fixes shouldn't be previewed early on CI
2022-04-01 13:11:10 -05:00
Andrew Brown
bd6fe11ca9 cranelift: remove load_complex and store_complex (#3976)
This change removes all variants of `load*_complex` and `store*_complex`
from Cranelift; this is a breaking change to the instructions exposed by
CLIF. The complete list of instructions removed is: `load_complex`,
`store_complex`, `uload8_complex`, `sload8_complex`, `istore8_complex`,
`sload8_complex`, `uload16_complex`, `sload16_complex`,
`istore16_complex`, `uload32_complex`, `sload32_complex`,
`istore32_complex`, `uload8x8_complex`, `sload8x8_complex`,
`sload16x4_complex`, `uload16x4_complex`, `uload32x2_complex`,
`sload32x2_complex`.

The rationale for this removal is that the Cranelift backend now has the
ability to pattern-match multiple upstream additions in order to
calculate the address to access. Previously, this was not possible so
the `*_complex` instructions were needed. Over time, these instructions
have fallen out of use in this repository, making the additional
overhead of maintaining them a chore.
2022-03-31 10:05:10 -07:00
wasmtime-publish
9137b4a50e Bump Wasmtime to 0.35.0 (#3885)
[automatically-tag-and-release-this-commit]

Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-03-07 15:18:34 -06:00
Alex Crichton
709f7e0c8a Enable SSE 4.2 unconditionally (#3833)
* Enable SSE 4.2 unconditionally

Fuzzing over the weekend found that `i64x2` comparison operators
require `pcmpgtq` which is an SSE 4.2 instruction. Along the lines of #3816
this commit unconditionally enables and requires SSE 4.2 for compilation
and fuzzing. It will no longer be possible to create a compiler for
x86_64 with simd enabled if SSE 4.2 is disabled.

* Update comment
2022-02-22 13:23:51 -06:00
Chris Fallin
1c014d129a Cranelift: ensure ISA level needed for SIMD is present when SIMD is enabled. (#3816)
Addresses #3809: when we are asked to create a Cranelift backend with
shared flags that indicate support for SIMD, we should check that the
ISA level needed for our SIMD lowerings is present.
2022-02-16 17:29:30 -08:00
Chris Fallin
ca0e8d0a1d Remove incomplete/unmaintained ARM32 backend (for now). (#3799)
In #3721, we have been discussing what to do about the ARM32 backend in
Cranelift. Currently, this backend supports only 32-bit types, which is
insufficient for full Wasm-MVP; it's missing other critical bits, like
floating-point support; and it has only ever been exercised, AFAIK, via
the filetests for the individual CLIF instructions that are implemented.

We were very very thankful for the original contribution of this
backend, even in its partial state, and we had hoped at the time that we
could eventually mature it in-tree until it supported e.g. Wasm and
other use-cases. But that hasn't yet happened -- to the blame of no-one,
to be clear, we just haven't had a contributor with sufficient time.

Unfortunately, the existence of the backend and lack of active
maintainer now potentially pose a bit of a burden as we hope to make
continuing changes to the backend framework. For example, the ISLE
migration, and the use of regalloc2 that it will allow, would need all
of the existing lowering patterns in the hand-written ARM32 backend to
be rewritten as ISLE rules.

Given that we don't currently have the resources to do this, we think
it's probably best if we, sadly, for now remove this partial backend.
This is not in any way a statement of what we might accept in the
future, though. If, in the future, an ARM32 backend updated to our
latest codebase with an active maintainer were to appear, we'd be happy
to merge it (and likewise for any other architecture!). But for now,
this is probably the best path. Thanks again to the original contributor
@jmkrauz and we hope that this work can eventually be brought back and
reused if someone has the time to do so!
2022-02-14 15:03:52 -08:00
wasmtime-publish
39b88e4e9e Release Wasmtime 0.34.0 (#3768)
* Bump Wasmtime to 0.34.0

[automatically-tag-and-release-this-commit]

* Add release notes for 0.34.0

* Update release date to today

Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-02-07 19:16:26 -06:00
Andrew Brown
90bfa123e0 docs: render the fcmp tables as code (#3735)
Looking at [the `fcmp`
documentation](https://docs.rs/cranelift-codegen/0.80.0/cranelift_codegen/ir/trait.InstBuilder.html#method.fcmp)--generated
from Cranelift's instruction definitions, the charts explaining the
logic for the various conditions is unreadable. Since rendering those charts
as plain text is problematic, this change wraps them as code sections
for a consistent layout.
2022-01-27 16:36:33 -08:00
Ulrich Weigand
071d3a68d0 ISLE: Fix clif.isle InstructionData entries
Attempt to match a Jump instruction in ISLE will currently lead to the
generated files not compiling.  This is because the definition of the
InstructionData enum in clif.isle does not match the actual type used
in Rust code.

Specifically, clif.isle erroneously omits the ValueList variable-length
argument entry if the format does not use a typevar operand.  This is
the case for Jump and a few other formats.  The problem is caused by
a bug in the gen_isle routine in meta/src/gen_inst.rs.
2022-01-24 12:54:16 +01:00