Commit Graph

8379 Commits

Author SHA1 Message Date
Alex Crichton
a2e71dafac ci: Don't test release binaries, nightly, or beta (#2939)
This commit attempts to slim down our CI (more from #2933) by removing
testing both in debug and release mode. I can't actually recall a
concrete issue that this has turned up on CI itself, and otherwise we're
spending quite a lot of time building all of the dev-dependencies in
release mode when testing.

Additionally it removes testing for nightly/beta channels of Rust. One
of the main benefits of this, staying on top of breakage, is already
moot because we pin to a nightly anyway. We have a few nightly
references elsewhere in CI (fuzzing/docs) so we can largely rely on that
(and upstream testing with rust-lang/rust). We in general shouldn't need
to do nightly/beta testing on all builds.

The release builders were actually the only location that MinGW and
AArch64 was tested however. This means that the old nightly/beta
builders are now replaced with AArch64 and MinGW builders. Overall, the
changes made to CI here are:

* Upgrade to QEMU 6.0.0. I thought this would make aarch64 emulation
  faster, but it didn't. Seems good to stay up to date though.
* Replace nightly/beta testing in debug mode with MinGW and AArch64 testing.
* Use `-g0` for C compilation on MinGW because otherwise `gcc` as used
  on CI generates an ICE (!!)
* Exclude `wasi-crypto` from testing. We already exclude
  `wasmtime-wasi-crypto` and it was an accident we were testing the
  `wasi-crypto` crate (which isn't even part of this workspace).
* Remove testing DWARF on the old backend step, which nowadays didn't
  actually do that.
* Remove testing on release builders, making then purely tasked with
  release builds, nothing else.
* Rename `QEMU_VERSION` to `QEMU_BUILD_VERSION` so qemu doesn't just
  immediately exit after printing its version.

Timing wise the release builds are ~20-30 minutes faster, depending on
the platform. This is not really because of testing time but rather we
have a huge dependency tree when `dev-dependencies` are considered
(criterion, tokio, proptest, ...).

MinGW tests are pretty fast since we don't run examples (we're not too
interested in doing examples there, just windows/mac/linux coverage).
AArch64 tests are run with optimizations enabled because unoptimized
tests take ~45 minutes to finish while optimized tests take ~20 minutes.
The build is naturally much faster in debug mode but apparently under
QEMU emulation the debug mode binaries are *extremely* slow compared to
the release binaries, which means that extra time we spend compiling
release tests is more than made up by faster test emulation time.

Closes #2938
2021-05-26 10:12:29 -05:00
Nick Fitzgerald
137e6e8332 Merge pull request #2937 from fitzgen/bench-api-stdio-and-repeated-instantiations
bench-api: pass in explicit stdio files, allow repeated instantiations per compilation
2021-05-25 14:44:51 -07:00
Andrew Brown
459fce3467 x64: lower i8x16.popcnt to VPOPCNTB when possible
When AVX512VL or AVX512BITALG are available, Wasm SIMD's `popcnt`
instruction can be lowered to a single x64 instruction, `VPOPCNTB`,
instead of 8+ instructions.
2021-05-25 12:16:25 -07:00
Alex Crichton
2b0649c74c ci: Remove "publish" step (#2936)
This commit removes the publish step in GitHub actions, insteading
folding all functionality into the release build steps. This avoids
having a separately scheduled job after all the release build jobs which
ends up getting delayed for quite a long time given the current
scheduling algorithm.

This involves refactoring the tarball assembly scripts and refactoring the
github asset upload script too. Tarball assembly now manages everything
internally and does platform-specific bits where necessary. The upload
script is restructured to be run in parallel (in theory) and hopefully
catches various errors and tries to not stomp over everyone else's work.
The main trickiness here is handling `dev`, which is less critical for
correctness than than tags themselves.

As a small tweak build-wise the QEMU build for cross-compiled builders
is now cached unlike before where it was unconditionally built, shaving
a minute or two off build time.
2021-05-25 12:52:41 -05:00
Nick Fitzgerald
18fabd7700 bench-api: Allow multiple instantiations per compilation
We used to allow at most one instantiation per compilation, but there is no
fundamental reason why that should be the case. Allowing multiple instantiations
per compilation allows us to, for example, benchmark repeated instantiation
within Wasmtime's pooling allocator.

This additionally switches to using host functions for WASI and for
`bench_{start,end}` rather than defining them on the linker, this way we can use
a new store for every instantiation and don't need to keep other instances alive
when instantiating new instances.

Finally, we switch all timing to be done through callback functions, rather than
having the bench API caller implicitly start/end timers around bench API
calls. This allows us to more precisely measure phases and exclude things like
file I/O performed when creating a WASI context.
2021-05-24 16:53:22 -07:00
Alex Crichton
e5ac9350b1 ci: Try other syntax for concurrency key (#2935)
After #2932 that [immediately failed][build] on the main branch so this
tries a different key to see if it'll work...

[build]: https://github.com/bytecodealliance/wasmtime/actions/runs/872766013/workflow
2021-05-24 18:26:14 -05:00
Alex Crichton
beaa07eb96 ci: Merge all doc builders into one (#2934)
Also move the gh-pages pushing step from the `publish` phase to just
this singular doc builder.

The motivation for this is to eventually remove the `publish` step since
it interacts badly with GitHub's scheduling of actions. This is
hopefully the first step towards that by removing the doc publish part
of the phase.
2021-05-24 18:26:05 -05:00
Nick Fitzgerald
ba6635dba0 bench-api: Pass in explicit stdin/stdout/stderr
Instead of inheriting stdio, pass in explicit file paths that are opened for
reading (stdin) or writing (stderr/stdout). This will allow sightglass to assert
that benchmarks produce the expected output.
2021-05-24 15:20:10 -07:00
Nick Fitzgerald
13741284b3 bench-api: Add a feature for the old x86_64 backend
This makes it easier to benchmark old vs new backends.
2021-05-24 15:20:10 -07:00
Alex Crichton
8c2413e009 Try to ease up on CI usage slightly (#2932)
* First remove `fail-fast: false` annotations to fail faster. If desired
  this could always be added in a on-off fashion to PRs.
* Next use the new `concurrency` feature to try to cancel previous
  builds, ideally meaning that if a branch is pushed to multiple times
  it only runs CI once.
2021-05-24 16:31:48 -05:00
Chris Fallin
f2fe0c669e Merge pull request #2929 from cfallin/bb-offsets
Provide BB layout info externally in terms of code offsets.
2021-05-24 14:27:53 -07:00
Chris Fallin
37ca06ad3a Merge pull request #2928 from afonso360/aarch64-i128-ops
Implement iadd,isub,imul for i128 in AArch64
2021-05-24 13:27:36 -07:00
Chris Fallin
800cf25bb5 Make the CFG metadata computation conditional on a flag. 2021-05-24 13:01:15 -07:00
Afonso Bordado
4ddbfe50ba aarch64: Implement imul for i128 operands 2021-05-24 18:23:30 +01:00
Chris Fallin
11a2ef01e7 Provide BB layout info externally in terms of code offsets.
This is sometimes useful when performing analyses on the generated
machine code: for example, some kinds of code verifiers will want to do
a control-flow analysis, and it is much easier to do this if one does
not have to recover the CFG from the machine code (doing so requires
heavyweight analysis when indirect branches are involved). If one trusts
the control-flow lowering and only needs to verify other properties of
the code, this can be very useful.
2021-05-24 09:18:06 -07:00
Afonso Bordado
a2e74b2c45 aarch64: Implement isub for i128 operands 2021-05-22 21:51:41 +01:00
Afonso Bordado
d3b525fa29 aarch64: Implement iadd for i128 operands 2021-05-22 21:21:44 +01:00
Alex Crichton
76c6b83f6a Use tarballs for Rust API docs on CI (#2922)
Looks like GitHub Actions takes 10m+ to upload the documentation and
nearly 10 minutes to download it. I suspect this has to do with the
creation of thousands of files, and using `tar` here is likely much
faster. Let's test it out!
2021-05-22 11:08:45 -05:00
Dan Gohman
b8fd632fb5 Remove test-all.sh. (#2926)
test-all.sh isn't run in CI, and is out of date with respect to what we
do run in CI, so remove it so that we don't have to awkwardly maintain it.
2021-05-22 00:02:11 -05:00
Johnnie Birch
9a5c9607e1 Vpopcnt for x64 2021-05-21 19:23:26 -07:00
Chris Fallin
65e0e20210 Merge pull request #2892 from afonso360/aarch64-multireg-args
Handle i128 arguments in the aarch64 ABI
2021-05-21 16:57:42 -07:00
Alex Crichton
7db94f5869 Don't verify publishing peepmatic crates (#2923)
Using `--no-verify` avoids building z3 which should shave at least 10
minutes off CI where the `verify-publish` builder currently takes ~30
minutes.
2021-05-21 16:26:55 -05:00
Chris Fallin
824fa69756 Merge pull request #2924 from cfallin/remove-readme-wasi-tokio
Remove reference to non-existent README.md in wasi-tokio crate.
2021-05-21 14:12:46 -07:00
Chris Fallin
ca39f954da Remove reference to non-existent README.md in wasi-tokio crate 2021-05-21 14:08:28 -07:00
Chris Fallin
95559c01aa Merge pull request from GHSA-hpqh-2wqx-7qp5
Fix spillslot reload of narrow values: zero-extend, don't sign-extend. Release v0.74.0 as security-patch release.
2021-05-21 12:01:55 -07:00
Pat Hickey
0f5bdc6497 only wasi_cap_std_sync and wasi_tokio need to define WasiCtxBuilders (#2917)
* wasmtime-wasi: re-exporting this WasiCtxBuilder was shadowing the right one

wasi-common's WasiCtxBuilder is really only useful wasi_cap_std_sync and
wasi_tokio to implement their own Builder on top of.

This re-export of wasi-common's is 1. not useful and 2. shadow's the
re-export of the right one in sync::*.

* wasi-common: eliminate WasiCtxBuilder, make the builder methods on WasiCtx instead

* delete wasi-common::WasiCtxBuilder altogether

just put those methods directly on &mut WasiCtx.

As a bonus, the sync and tokio WasiCtxBuilder::build functions
are no longer fallible!

* bench fixes

* more test fixes
2021-05-21 12:59:39 -05:00
Afonso Bordado
fbcfffdeab Handle spilling i128 arguments into the stack in aarch64 2021-05-21 17:05:41 +01:00
theduke
817d72a7b7 Implement std::fmt::Debug for InterruptHandle (#2915) 2021-05-21 10:54:47 -05:00
Alex Crichton
7d20368756 Try to fix CI (#2918)
Fixes a few issues that have been cropping up:

* Update `rustup` on Windows to latest to skip over the 1.24.1 installed
  on GitHub Actions which can fail to install.
* Remove the no-longer-needed `define-llvm-env` action
* Install generic llvm/lldb packges instead of specific ones that may
  migrate in versions over time.
2021-05-21 10:54:37 -05:00
Chris Fallin
88455007b2 Bump Wasmtime to v0.27.0 and Cranelift to v0.74.0. 2021-05-20 14:06:41 -07:00
Chris Fallin
8b9057a18f Merge pull request #2914 from abrown/fcvt_from_uint
x64: lower fcvt_from_uint to VCVTUDQ2PS when possible
2021-05-19 15:59:41 -07:00
Andrew Brown
54b45d28a3 x64: lower fcvt_from_uint to VCVTUDQ2PS when possible
When AVX512VL and AVX512F are available, use a single instruction
(`VCVTUDQ2PS`) instead of a length 9-instruction sequence. This
optimization is a port from the legacy x86 backend.
2021-05-19 12:20:11 -07:00
Chris Fallin
a1c9b06cea Fix spillslot reload of narrow values: zero-extend, don't sign-extend.
Previously, the x64 backend's ABI code would generate a sign-extending
load when loading a less-than-64-bit integer from a spillslot. This is
incorrect: e.g., for i32s > 0x80000000, this would result in all high
bits set.

This interacts poorly with another optimization. Normally, the invariant
is that the high bits of a register holding a value of a certain type,
beyond that type's bits, are undefined. However, as an optimization, we
recognize and use the fact that on x86-64, 32-bit instructions zero the
upper 32 bits. This allows us to elide a 32-to-64-bit zero-extend op
(turning it into just a move, which can then sometimes disappear
entirely due to register coalescing).

If a spill and reload happen between the production of a 32-bit value
from an instruction known to zero the upper bits and its use, then we
will rely on zero upper bits that might actually be set by a
sign-extend. This will result in incorrect execution.

As a fix, we stick to a simple invariant: we always spill and reload a
full 64 bits when handling integer registers on x64. This ensures that
no bits are mangled.
2021-05-19 12:19:19 -07:00
Till Schneidereit
3b3b126fe2 Refer to BA security policy (#2912) 2021-05-19 18:24:42 +02:00
Chris Fallin
33086493dc Merge pull request #2911 from olivierlemasle/tests
cranelift: move wasmtests in cranelift-wasm
2021-05-18 15:09:29 -07:00
Olivier Lemasle
954f7d3876 cranelift: move wasmtests in cranelift-wasm
Move test data used by cranelift-wasm's tests in
the crate directory, to make the tests autonomous.

Fixes #2910
2021-05-18 22:48:52 +02:00
Peter Huene
18c61cdfa4 Merge pull request #2900 from peterhuene/benchmark-instantiation
Implement simple benchmarks for instantiation.
2021-05-17 16:52:13 -07:00
Andrew Brown
7ef3ae2903 x64: implement vselect with variable blend instructions
This change implements `vselect` using SSE4.1's `BLENDVPS`, `BLENDVPD`,
and `PBLENDVB`. `vselect` is a lane-selecting instruction that is used
by
[simple_preopt.rs](fa1faf5d22/cranelift/codegen/src/simple_preopt.rs (L947-L999))
to lower `bitselect` to a single x86 instruction when the condition mask
is known to be boolean (all 1s or 0s, e.g., from a conversion). This is
better than `bitselect` in general, which lowers to 4-5 instructions.
The old backend had the `vselect` lowering; this simply introduces it to
the new backend.
2021-05-17 11:23:33 -07:00
Andrew Brown
0742bb4699 Update cast crate, remove cargo-deny rules (#2909)
Previously the inclusion of the `criterion` crate had brought in a
transitive dependency to `cast`, which used old versions of several
libraries. Now that https://github.com/japaric/cast.rs/pull/26 is merged
and a new version published, we can update `cast` and remove the
cargo-deny rules for the duplicated, older versions.
2021-05-17 11:40:10 -05:00
Olivier Lemasle
b5f29bd3b2 Update wasm-tools crates (#2908)
wasmparser 0.78 adds the Unknown name subsection type.
2021-05-17 10:08:17 -05:00
Andrew Brown
bc0df92137 peepmatic: rebuild peephole optimizers after cranelift/meta change 2021-05-17 06:54:45 -07:00
Andrew Brown
84b6f05971 cranelift: remove unreachable scalar lowerings of saturating arithmetic
Since `uadd_sat`, `sadd_sat`, `usub_sat`, and `ssub_sat` are now only
available to vector types, this removes the lowering code for the
scalar versions of these instructions in the arm32 and aarch64 backends.
2021-05-17 06:54:45 -07:00
Andrew Brown
1fe7676831 cranelift: only allow vector types with saturating arithmetic
This fixes #2883 by restricting which types are available to the `uadd_sat`, `sadd_sat`, `usub_sat`, and `ssub_sat` IR operations.
2021-05-17 06:54:45 -07:00
Andrew Brown
e676589b0c x64: lower i64x2.imul to VPMULLQ when possible
This adds the machinery to encode the VPMULLQ instruction which is
available in AVX512VL and AVX512DQ. When these feature sets are
available, we use this instruction instead of a lengthy 12-instruction
sequence.
2021-05-13 20:14:05 -07:00
Andrew Brown
5929a5e6ee x64: improve arithmetic filetests 2021-05-13 20:14:05 -07:00
Andrew Brown
c982d2be65 x64: move multiplication lowering
Since the lowering of `imul` complicated the other ALU operations it was
matched with and since future commits will alter the multiplication
lowering further, this change moves the `imul` lowering to its own match
block.
2021-05-13 20:14:05 -07:00
Peter Huene
1b8efa7bbd Implement simple benchmarks for instantiation.
This adds benchmarks around module instantiation using criterion.

Both the default (i.e. on-demand) and pooling allocators are tested
sequentially and in parallel using a thread pool.

Instantiation is tested with an empty module, a module with a single page
linear memory, a larger linear memory with a data initializer, and a "hello
world" Rust WASI program.
2021-05-13 19:27:39 -07:00
Chris Fallin
fa1faf5d22 Merge pull request #2749 from MaxGraey/fix-small-memset
cranelift: properly splatting bytes in emit_small_memset
2021-05-13 13:39:28 -07:00
MaxGraey
38140900f1 properly splatting bytes in emit_small_memset 2021-05-13 22:05:30 +03:00
Andrew Brown
6fb2a24c6b Temporarily ignore multiple versions of criterion's build dependencies
Until https://github.com/japaric/cast.rs/pull/26 is resolved, the `cast`
crate will pull in older versions of the `rustc_version`, `semver`, and
`semver-parser` crates. `cast` is a build dependency of `criterion`
which is used for benchmarking and is itself a dev dependency, not a
normal dependency.
2021-05-13 10:46:08 -07:00