Commit Graph

6778 Commits

Author SHA1 Message Date
Benjamin Bouvier
aa103698d4 machinst x64: extend Copysign to work for f64 inputs too; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
694af3aec2 machinst x64: implement float Floor/Ceil/Trunc/Nearest as VM calls; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
48ec806a9d machinst x64: implement Fabs/Fneg in terms of other instructions; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
03b9e1e86a machinst x64: implement float min/max with the right semantics; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
e43310a088 machinst x64: adapt conversions for saturation behaviors; 2020-07-24 19:29:12 +02:00
Benjamin Bouvier
cd54f05efd machinst x64: implement float-to-int and int-to-float conversions; 2020-07-24 19:29:12 +02:00
Chris Fallin
37a09c4ef6 Merge pull request #2066 from cfallin/machinst-timing
Add timing for several new-backend stages.
2020-07-23 10:54:17 -07:00
Yury Delendik
42127aac4e Refactor Cache logic to include debug information (#2065)
* move caching to the CompilationArtifacts

* mv cache_config from Compiler to CompiledModule

* hash isa flags

* no cache for wasm2obj

* mv caching to wasmtime crate

* account each Compiler field when hash
2020-07-23 12:10:13 -05:00
Chris Fallin
2b9fefe89a Add timing for several new-backend stages.
This PR adds a bit more granularity to the output of e.g. `clif-util
compile -T`, indicating how much time is spent in VCode lowering and
various other new-backend-specific tasks.
2020-07-23 09:54:39 -07:00
Chris Fallin
87eb4392c4 Merge pull request #2063 from jgouly/vselect
arm64: Implement Vselect opcode
2020-07-22 13:35:46 -07:00
Chris Fallin
44ef8247a9 Merge pull request #2062 from akirilov-arm/extract_lane
AArch64: Improve code generation for Extractlane + Sextend / Uextend
2020-07-22 13:35:00 -07:00
Chris Fallin
d22cefd220 Merge pull request #2058 from cfallin/aarch64-fix-bool
Aarch64 codegen: represent bool `true` as -1, not 1.
2020-07-22 13:16:12 -07:00
Chris Fallin
b8f6d53a6b Aarch64 codegen: represent bool true as -1, not 1.
It seems that this is actually the correct behavior for bool types wider
than `b1`; some of the vector instruction optimizations depend on bool
lanes representing false and true as all-zeroes and all-ones
respectively. For `b8`..`b64`, this results in an extra negation after a
`cset` when a bool is produced by an `icmp`/`fcmp`, but the most common
case (`b1`) is unaffected, because an all-ones one-bit value is just
`1`.

An example of this assumption can be seen here:

399ee0a54c/cranelift/codegen/src/simple_preopt.rs (L956)

Thanks to Joey Gouly of ARM for noting this issue while implementing
SIMD support, and digging into the source (finding the above example) to
determine the correct behavior.
2020-07-22 12:30:55 -07:00
Alex Crichton
8a04fc3cdc Refactor wasmtime's internal cache slightly (#2057)
Be more generic over the type being serialized, and remove an
intermediate struct which was only used for serialization but isn't
necessary.
2020-07-22 10:32:53 -05:00
Joey Gouly
5355c3e3d5 arm64: Implement Vselect opcode
This is implemented the same as Bitselect, as the controlling vector
is a boolean vector. A boolean vector in cranelift has elements
that are either 0 or all 1s, so it can be used to select elements
lane wise.

Copyright (c) 2020, Arm Limited.
2020-07-22 12:50:29 +01:00
Anton Kirilov
420c4f06b8 AArch64: Improve code generation for Extractlane + Sextend / Uextend
Copyright (c) 2020, Arm Limited.
2020-07-22 11:47:51 +01:00
Yury Delendik
399ee0a54c Serialize and deserialize compilation artifacts. (#2020)
* Serialize and deserialize Module
* Use bincode to serialize
* Add wasm_module_serialize; docs
* Simple tests
2020-07-21 15:05:50 -05:00
Nick Fitzgerald
c420f65214 Merge pull request #2052 from fitzgen/gc-function-in-c-api
Add a GC function to the C API
2020-07-21 10:37:20 -07:00
Nick Fitzgerald
17b99cc9c8 examples: Add a GC call to the externref Rust example 2020-07-21 09:43:01 -07:00
Nick Fitzgerald
56c517d265 examples: Add a GC call to the externref C example 2020-07-21 09:42:54 -07:00
Nick Fitzgerald
2efb46afd5 wasmtime-c-api: Add wasmtime_store_gc for GCing externrefs 2020-07-21 09:33:34 -07:00
Chris Fallin
96ef2f1a1b Fix u8::MAX -> std::u8::MAX. (#2047)
As per Carlo Kok on Zulip #cranelift, this breaks builds with stable
Rust pre-1.43, as `core::u8::MAX` was only stabilized then. We'd like to
support older versions if we can easily do so.

This PR also adds `cranelift-tools` to the crates checked on CI with
Rust 1.41.0, which pulls in all backends (including `aarch64`).
2020-07-20 14:59:15 -05:00
Dan Gohman
4c15a4daf2 Use AsRef<Path> instead of AsRef<OsStr> in yanix functions (#1950)
* Use AsRef<Path> instead of AsRef<OsStr> in yanix functions.

`AsRef<Path>` makes these more consistent with `std` interfaces, making
them easier to use outside of wasi-common.

Also, refactor the conversion to `CString` into a helper function.

* Reduce clutter from fully-qualifying names.

* rustfmt
2020-07-20 10:02:45 -07:00
Chris Fallin
784e2f1480 Merge pull request #2038 from jgouly/arith2
arm64: Enable arith2 tests
2020-07-20 09:00:10 -07:00
Ömer Sinan Ağacan
3fcf9fcf3e Cranelift README: fix markdown link syntax (#2044) 2020-07-19 19:41:53 -05:00
Chris Fallin
1b3b2dbfd0 Merge pull request #2043 from cfallin/csel-opt
Aarch64: handle csel with icmp/fcmp source without materializing the bool.
2020-07-18 19:33:47 -07:00
Chris Fallin
ea894c0eeb Merge pull request #2042 from cfallin/aarch64-fix-regshift-mask
Aarch64: mask shift-amounts incorporated into reg-reg-shift ALU insts.
2020-07-18 19:33:35 -07:00
Chris Fallin
21dac670f0 Aarch64: handle csel with icmp/fcmp source without materializing the bool.
Previously, we simply compared the input bool to 0, which forced the
value into a register (usually via a cmp and cset), zero-extended it,
etc. This patch performs the same pattern-matching that branches do to
directly perform the cmp and use its flag results with the csel.

On the `bz2` benchmark, the runtime is affected as follows (measuring
with `perf stat`, using wasmtime with its cache enabled, and taking the
second run after the first compiles and populates the cache):

pre:

       1117.232000      task-clock (msec)         #    1.000 CPUs utilized
               133      context-switches          #    0.119 K/sec
                 1      cpu-migrations            #    0.001 K/sec
             5,041      page-faults               #    0.005 M/sec
     3,511,615,100      cycles                    #    3.143 GHz
     4,272,427,772      instructions              #    1.22  insn per cycle
   <not supported>      branches
        27,980,906      branch-misses

       1.117299838 seconds time elapsed

post:

       1003.738075      task-clock (msec)         #    1.000 CPUs utilized
               121      context-switches          #    0.121 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             5,052      page-faults               #    0.005 M/sec
     3,224,875,393      cycles                    #    3.213 GHz
     4,000,838,686      instructions              #    1.24  insn per cycle
   <not supported>      branches
        27,928,232      branch-misses

       1.003440004 seconds time elapsed

In other words, with this change, on `bz2`, we see a 6.3% reduction in
executed instructions.
2020-07-17 21:10:21 -07:00
Nick Fitzgerald
b35cf7cf8e Merge pull request #1960 from fitzgen/peepmatic-generic-over-ir
peepmatic: Be generic over the IR we are optimizing
2020-07-17 17:05:55 -07:00
Nick Fitzgerald
ee5982fd16 peepmatic: Be generic over the operator type
This lets us avoid the cost of `cranelift_codegen::ir::Opcode` to
`peepmatic_runtime::Operator` conversion overhead, and paves the way for
allowing Peepmatic to support non-clif optimizations (e.g. vcode optimizations).

Rather than defining our own `peepmatic::Operator` type like we used to, now the
whole `peepmatic` crate is effectively generic over a `TOperator` type
parameter. For the Cranelift integration, we use `cranelift_codegen::ir::Opcode`
as the concrete type for our `TOperator` type parameter. For testing, we also
define a `TestOperator` type, so that we can test Peepmatic code without
building all of Cranelift, and we can keep them somewhat isolated from each
other.

The methods that `peepmatic::Operator` had are now translated into trait bounds
on the `TOperator` type. These traits need to be shared between all of
`peepmatic`, `peepmatic-runtime`, and `cranelift-codegen`'s Peepmatic
integration. Therefore, these new traits live in a new crate:
`peepmatic-traits`. This crate acts as a header file of sorts for shared
trait/type/macro definitions.

Additionally, the `peepmatic-runtime` crate no longer depends on the
`peepmatic-macro` procedural macro crate, which should lead to faster build
times for Cranelift when it is using pre-built peephole optimizers.
2020-07-17 16:16:49 -07:00
Chris Fallin
9bd9c628aa Aarch64: mask shift-amounts incorporated into reg-reg-shift ALU insts.
We had previously fixed a bug in which constant shift amounts should be
masked to modulo the number of bits in the operand; however, we did not
fix the analogous case for shifts incorporated into the second register
argument of ALU instructions that support integrated shifts.  This
failure to mask resulted in illegal instructions being generated, e.g.
in https://bugzilla.mozilla.org/show_bug.cgi?id=1653502. This PR fixes
the issue by masking the amount, as the shift semantics require.
2020-07-17 14:55:23 -07:00
Nick Fitzgerald
ae95ad8733 cranelift: Don't build peepmatic-based optimizations in build.rs
Instead, when the `rebuild-peephole-optimizers` feature is enabled, rebuild them
the first time they are used. This allows peepmatic to run when Cranelift's
`Opcode` is defined and available, which paves the way forward for:

* merging `peepmatic_runtime::operator::Operator` and Cranelift's `Opcode` (we
  are wasting a bunch of cycles converting between the two of them), and

* supporting vcode optimizations in `peepmatic`.
2020-07-17 14:35:16 -07:00
Alex Crichton
978070c020 Verify crates are publish-able on CI (#2036)
This commit updates our CI to verify that all crates are publish-able at
all times on every commit. During the 0.19.0 release we found another
case where the crates as they live in this repository weren't
publish-able, so the hope is that this no longer comes up again!

The script added in this commit also takes the time/liberty to remove
the existing bump/publish scripts and instead replace them with one Rust
script originally sourced from wasm-bindgen. The intention of this
script is that it has three modes:

* `./publish bump` - bumps version numbers which are sent as a PR to get
  reviewed (probably with a changelog as well)

* `./publish verify` - run on CI on every commit, builds every crate we
  publish as if it's being published to crates.io, notably without raw
  access to other crates in the repository.

* `./publish publish` - publishes all crates to crates.io, passing the
  `--no-verify` flag to make this a much speedier process than it is
  today.
2020-07-17 16:19:35 -05:00
Johnnie Birch
a7cedf3100 Add support for 32 bit and 64 bit fcmp for the new backend
Implements commiss and commisd.
2020-07-17 13:46:54 -07:00
Alex Crichton
fbc05faa49 Fix wasm_val_copy for null funcref/externref (#2041)
This commit fixes `Clone for wasm_val_t` to avoid attempting to chase a
null pointer. It also fixes the implementation for `FuncRef` values by
cloning their internal `wasm_ref_t` as well.
2020-07-17 14:46:02 -05:00
Alex Crichton
3aeab23bf1 Fix leaking funcrefs in the C API (#2040)
This commit adds a case to the destructor of `wasm_val_t` to be sure to
deallocate the `Box<wasm_ref_t>`.
2020-07-17 14:45:55 -05:00
Alex Crichton
c3ff0754d4 Fix a panic with Func::new and reference types (#2039)
Currently `Func::new` will panic if one of the arguments of the function
is a reference type and the `Store` doesn't have reference types
enabled. This happens because cranelift isn't configure to enable stack
maps but the register allocators expects them to exist when reference
types are seen.

The fix here is to always enable reference types in cranelift for our
trampoline generation and `Func::new`. This should hopefully ensure that
trampolines are generated correctly and they'll just not be able to get
hooked up to an `Instance` because validation will prevent reference
types from being used elsewhere.
2020-07-17 12:05:42 -05:00
Nick Fitzgerald
8dd4ab2f1e Merge pull request #2022 from MaxGraey/peepmatic-bnot
peepmatic: Add bnot operation
2020-07-17 09:39:38 -07:00
Nikolay Volf
4f4edc7aef Remove spam from "do_remove_constant_phis" 2020-07-17 18:14:16 +02:00
Joey Gouly
40473dffed arm64: Enable arith2 tests
Copyright (c) 2020, Arm Limited.
2020-07-17 15:58:16 +01:00
Benjamin Bouvier
ead8a835c4 machinst x64: add more FP support 2020-07-17 15:56:44 +02:00
bjorn3
5c5a30f76c Fix review comments 2020-07-17 12:03:17 +02:00
bjorn3
7b7b1f4997 Rename sarg__ to sarg_t 2020-07-17 12:03:17 +02:00
bjorn3
4971d9ee80 Merge {make_incoming,get_outgoing}_{,struct_}arg 2020-07-17 12:03:17 +02:00
bjorn3
0d4fa6d32a Fix review comments 2020-07-17 12:03:17 +02:00
bjorn3
4431ac1108 Implement SystemV struct argument passing 2020-07-17 12:03:17 +02:00
Alex Crichton
2f368ed5d6 Fixes needed for 0.19.0 (#2035)
* Add some more wiggle crates to publish

* Fix build of wasi-common on crates.io

* Bump crates to 0.19.1 to fix crates.io build
2020-07-16 17:27:21 -05:00
MaxGraey
c653c563dd Merge branch 'main' into peepmatic-bnot 2020-07-16 22:01:18 +03:00
Chris Fallin
5e0268a542 Merge pull request #2034 from cfallin/update-regalloc
Update to regalloc.rs 0.0.28.
2020-07-16 11:36:11 -07:00
Alex Crichton
63d5b91930 Wasmtime 0.19.0 and Cranelift 0.66.0 (#2027)
This commit updates Wasmtime's version to 0.19.0, Cranelift's version to
0.66.0, and updates the release notes as well.
2020-07-16 12:46:21 -05:00