* Implement `iabs` in ISLE (AArch64)
Converts the existing implementation of `iabs` for AArch64 into ISLE,
and fixes support for `iabs` on scalar values.
Copyright (c) 2022 Arm Limited.
* Improve scalar `iabs` implementation.
Also introduces `CSNeg` instruction.
Copyright (c) 2022 Arm Limited
* Convert `scalar_to_vector` to ISLE (AArch64)
Converted the exisiting implementation of `scalar_to_vector` for AArch64 to
ISLE.
Copyright (c) 2022 Arm Limited
* Add support for floats and fix FpuExtend
- Added rules to cover `f32 -> f32x4` and `f64 -> f64x2` for
`scalar_to_vector`
- Added tests for `scalar_to_vector` on floats.
- Corrected an invalid instruction emitted by `FpuExtend` on 64-bit
values.
Copyright (c) 2022 Arm Limited
This commit augments the current translation phase of components with
extra machinery to track the type information of component items such as
instances, components, and functions. The end goal of this commit is to
enable the `Lower` instruction to know the type of the component
function being lowered. Currently during the inlining pass where
component fusion is detected the type of the lifted function is known,
but to implement fusion entirely the type of the lowered function must
be known. Note that these two types are expected to be different to
allow for the subtyping rules specified by the component model.
For now nothing is actually done with this information other than noting
its presence in the face of a lifted-then-lowered function. My hope
though was to split this out for a separate review to avoid making a
future component-adapter-compiler-containing-PR too large.
This moves them into a new `wasmtime-asm-macros` crate that can be used not just
from the `wasmtime-fibers` crate but also from other crates (e.g. we will need
them in https://github.com/bytecodealliance/wasmtime/pull/4431).
This commit fixes an issue with the initialization of element segments
when one of the elements in the element segment is `ref.func null`.
Previously the contents of a table were accidentally initialized with
the raw value of the `*mut VMCallerCheckedAnyfunc` which bypassed the
"this is initialized" encoding of function table entries that Wasmtime
uses for lazy table initialization. The fix here was to ensure that the
encoded form is used.
The impact of this issue is that a module could panic at runtime when
accessing a table element that was initialized with an element segment
containing a `ref.null func` entry. This only happens with imported
tables in a WebAssembly module where the table itself was defined on the
host. If the table was defined in another wasm module or in the local
wasm module this bug would not occur. Additionally this bug requires
enabling the reference types proposal for WebAssembly (which is enabled
by default) due to the usage of encodings for null funcrefs in element
segments.
When parsing isa specific values we were accidentally discarding the
value of the flag, and treating it always as a boolean flag.
This would cause a `clif-util` invocation such as
`cargo run -- compile -D --set has_sse41=false --target x86_64 test.clif`
to be interpreted as `--set has_sse41` and enable that feature instead
of disabling it.
Rather than sometimes using `file-per-thread-logger`.
Also remove the debug CLI flags, so that we can always just define
`RUST_LOG=...` to get logging and don't need to also do other things.
* implement wasmtime::component::flags! per #4308
This is the last macro needed to complete #4308. It supports generating a Rust
type that represents a `flags` component type, analogous to how the [bitflags
crate](https://crates.io/crates/bitflags) operates.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* wrap `format_flags` output in parens
This ensures we generate non-empty output even when no flags are set. Empty
output for a `Debug` implementation would be confusing.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* unconditionally derive `Lift` and `Lower` in wasmtime::component::flags!
Per feedback on #4414, we now derive impls for those traits unconditionally,
which simplifies the syntax of the macro.
Also, I happened to notice an alignment bug in `LowerExpander::expand_variant`,
so I fixed that and cleaned up some related code.
Finally, I used @jameysharp's trick to calculate bit masks without looping.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* fix shift overflow regression in previous commit
Jamey pointed out my mistake: I didn't consider the case when the flag count was
evenly divisible by the representation size. This fixes the problem and adds
test cases to cover it.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* Allow using jump-tables multiple times (fixes#3347)
If there are multiple `br_table` instructions using the same jump table,
then `append_jump_argument` must not modify the jump table in-place.
When this function is called, we don't know if more `br_table`
instructions might be added later. So this patch conservatively assumes
that all jump tables might be reused. If Cranelift needs to add a block
argument to a block that's the target of some jump table, then the jump
table will be unconditionally cloned.
I'm not sure if having duplicated and unused jump tables will turn out
to be a compile-time performance issue. If it is, there's discussion in
issue #3347 about ways to determine that there can't be any more uses of
a jump table, so that it's safe to modify in-place.
* Re-enable cranelift-fuzzgen fuzz target
I've been running this fuzz target for an hour without finding new bugs.
Let's see if oss-fuzz finds anything now.
As @MaxGraey pointed out (thanks!) in #4397, `round` has different
behavior from `nearest`. And it looks like the native rust
implementation is still pending stabilization.
Right now we duplicate the wasmtime implementation, merged in #2171.
However, we definitely should switch to the rust native version
when it is available.
* fuzz: add a single instruction module generator
As proposed by @cfallin in #3251, this change adds a way to generate a
Wasm module for a single instruction. It captures the necessary
parameter and result types so that fuzzing can not only choose which
instruction to check but also generate values to pass to the
instruction. Not all instructions are available yet, but a significant
portion of scalar instructions are implemented in this change.
This does not wire the generator up to any fuzz targets.
* review: use raw string in test
* review: remove once_cell, use slices
* review: refactor macros to use valtype!
* review: avoid cloning when choosing a SingleInstModule
The compile step that cranelift-fuzzgen does also triggers IR
verification. So all bugs that cranelift-fuzzgen-verify could catch are
also caught by cranelift-fuzzgen. Removing redundant fuzzers lets us
spend limited fuzz-testing CPU time budgets better.
This commit removes Wasmtime's dependency on the `region` crate. The
motivation for this came about when I was updating dependencies and saw
that `region` had a new major version at 3.0.0 as opposed to our
currently used 2.3 track. In reviewing the use cases of `region` within
Wasmtime I found two trends in particular which motivated this commit:
* Some unix-specific areas of `wasmtime_runtime` use
`rustix::mm::mprotect` instead of `region::protect` already. This
means that the usage of `region::protect` for changing virtual memory
protections was already inconsistent.
* Many uses of `region::protect` were already in unix-specific regions
which could make use of `rustix`.
Overall I opted to remove the dependency on the `region` crate to avoid
chasing its versions over time. Unix-specific changes of protections
were easily changed to `rustix::mm::mprotect`. There were two locations
where a windows/unix split is now required and I subjectively ruled
"that seems ok". Finally removing `region` also meant that the "what is
the current page size" query needed to be inlined into
`wasmtime_runtime`, which I have also subjectively ruled "that seems
fine".
Finally one final refactoring here was that the `unix.rs` and `linux.rs`
split for the pooling allocator was merged. These two files already only
differed in one function so I slapped a `cfg_if!` in there to help
reduce the duplication.
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which have no set size like their
existing counterparts. New IR instructions are added to access these
new stack entities.
Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.
The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.
Copyright (c) 2022, Arm Limited.
Previously, much of the logic for generating the various objects needed
for fuzzing was concentrated primarily in `generators.rs`. In trying to
piece together what code does what, the size of the file and the light
documentation make it hard to discern what each part does. Since several
generator structures had been split out as separate modules in the
`generators/` directory, this change takes that refactoring further by
moving the structures in `generators.rs` to their own modules. No logic
changes were made, only the addition of documentation in a few places.
@yuyang-ok reported via zulip that i128 overflow tests were:
1. different from the interpreter implementation
2. wrong on some of the test cases
This fixes both the tests and the aarch64 implementation and adds the
interpreter to the testsuite.
* wasmtime: Rename host->wasm trampolines
As we introduce new types of trampolines, having clear names for our existing
trampolines will be helpful.
* Fix typo in docs for `VMCOMPONENT_MAGIC`
* wasmtime: Add criterion micro benchmarks for traps
* x64: port `atomic_rmw` to ISLE
This change ports `atomic_rmw` to ISLE for the x64 backend. It does not
change the lowering in any way, though it seems possible that the fixed
regs need not be as fixed and that there are opportunities for single
instruction lowerings. It does rename `inst_common::AtomicRmwOp` to
`MachAtomicRmwOp` to disambiguate with the IR enum with the same name.
* x64: remove remaining hardcoded register constraints for `atomic_rmw`
* x64: use `SyntheticAmode` in `AtomicRmwSeq`
* review: add missing reg collector for amode
* review: collect memory registers in the 'late' phase
Additionally remove the `rlib` crate type so it's possible to build the
API with LTO options if configured (otherwise Cargo ignores LTO
configuration with an `rlib` output since it would hit an error in
rustc)
Consulting oss-fuzz it looks like these fuzz targets are crashing 100%
of the time partly due to #3347 I believe. Until that's fixed this hopes
to reclaim the time used on oss-fuzz for other fuzzers to make progress.
This PR removes all minutes and agendas in `meetings/`. These were
previously hosted in this repository, but we found that it makes things
somewhat more complex with respect to CI configuration and merge
permissions to have both small, CI-less changes to the text in
`meetings/` as well as changes to everything else in one repository.
The minutes and agendas have been split out into the repository at
https://github.com/bytecodealliance/meetings/, with all history
preserved. Future agenda additions and minutes contributions should go
there as PRs.
Finally, this PR adds a small note to our "Contributing" doc to note the
existence of the meetings and invite folks to ask to join if interested.
* Cranelift: make `ir::Type` a `u16`.
* Cranelift: pack ValueData back into 64 bits.
After extending `Type` to a `u16`, `ValueData` became 12 bytes rather
than 8. This packs it back down to 8 bytes (64 bits) by stealing two
bits from the `Type` for the enum discriminant (leaving 14 bits for the
type itself).
Performance comparison (3-way between original (`ty-u8`), 16-bit `Type`
(`ty-u16`), and this PR (`ty-packed`)):
```
~/work/sightglass% target/release/sightglass-cli benchmark \
-e ~/ty-u8.so -e ~/ty-u16.so -e ~/ty-packed.so \
--iterations-per-process 10 --processes 2 \
benchmarks-next/spidermonkey/benchmark.wasm
compilation
benchmarks-next/spidermonkey/benchmark.wasm
cycles
[20654406874 21749213920.50 22958520306] /home/cfallin/ty-packed.so
[22227738316 22584704883.90 22916433748] /home/cfallin/ty-u16.so
[20659150490 21598675968.60 22588108428] /home/cfallin/ty-u8.so
nanoseconds
[5435333269 5723139427.25 6041072883] /home/cfallin/ty-packed.so
[5848788229 5942729637.85 6030030341] /home/cfallin/ty-u16.so
[5436002390 5683248226.10 5943626225] /home/cfallin/ty-u8.so
```
So, when compiling SpiderMonkey.wasm, making `Type` 16 bits regresses
performance by 4.5% (5.683s -> 5.723s), while this PR gets 14 bits for a 1.0%
cost (5.683s -> 5.723s). That's still not great, and we can likely do better,
but it's a start.
* Fix test failure: entities to/from u32 via `{from,to}_bits`, not `{from,to}_u32`.
* Update differential fuzzing configuration
This uses some new features of `wasm-smith` and additionally tweaks the
existing fuzz configuration:
* More than one function is now allowed to be generated. There's no
particular reason to limit differential execution to just one and we
may want to explore other interesting module shapes.
* More than one function type is now allowed to possibly allow more
interesting `block` types.
* Memories are now allowed to grow beyond one page, but still say small
by staying underneath 10 pages.
* Tables are now always limited in their growth to ensure consistent
behavior across engines (e.g. with the pooling allocator vs v8).
* The `export_everything` feature is used instead of specifying a
min/max number of exports.
The `wasmi` differential fuzzer was updated to still work if memory is
exported, but otherwise the v8 differential fuzzer already worked if a
function was exported but a memory wasn't. Both fuzzers continue to
execute only the first exported function.
Also notable from this update is that the `SwarmConfig` from
`wasm-smith` will now include an arbitrary `allowed_instructions`
configuration which may help explore the space of interesting modules
more effectively.
* Fix typos
OSS-fuzz long-ago discovered https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=45662
which we currently believe to be a bug in v8. I originally thought it
was going to be fixed with https://bugs.chromium.org/p/v8/issues/detail?id=12722
but that no longer appears to be the case now that the `v8` crate has
caught up and it still isn't fixed. Personally I've sort of lost an
appetite for continuing to debug these issues so I figure it's best to
just disable reference types with v8 for now and exercise the rest of
the engine, e.g. simd.
`fmin`/`fmax` are defined as returning -0.0 as smaller than 0.0.
This is not how the IEEE754 views these values and the interpreter was
returning the wrong value in these operations since it was just using the
standard IEEE754 comparisons.
This also tries to preserve NaN information by avoiding passing NaN's
through any operation that could canonicalize it.
* Bump versions of wasm-tools crates
Note that this leaves new features in the component model, outer type
aliases for core wasm types, unimplemented for now.
* Move to crates.io-based versions of tools
* ci: replace OpenVINO installer action
To test wasi-nn, we currently use an OpenVINO backend. The Wasmtime CI
must install OpenVINO using a custom GitHub action. This CI action has
not been updated in some time and in the meantime OpenVINO (and the
OpenVINO crates) have released several new versions.
https://github.com/abrown/install-openvino-action is an external action
that we plan to keep up to date with the latest releases. This change
replaces the current CI action with that one.
* wasi-nn: upgrade openvino dependency to v0.4.1
This eliminates a `lazy_static` dependency and changes a few parameters
to pass by reference. Importantly, it enables support for the latest
versions of OpenVINO (v2022.*) in wasi-nn.
* ci: update wasi-nn script to source correct env script
* ci: really use the correct path for the env script
Also, clarify which directory OpenVINO is installed in (the symlink may
not be present).
* Update tracing-core to a version which doesn't depend on lazy-static.
* Update crossbeam-utils to a version that doesn't depend on lazy-static.
* Update crossbeam-epoch to a version that doesn't depend on lazy-static.
* Update clap to a version that doesn't depend on lazy-static.
* Convert Wasmtime's own use of lazy_static to once_cell.
* Make `GDB_REGISTRATION`'s comment a doc comment.
* Fix compilation on Windows.