* Make send and remove wrapper around WasiNnCtx·
This removes the wrapper around WasiNnCtx and no longer requires borrow_mut(). Once send/sync
changes in OpenVINO crate are merged in it will allow·use by frameworks that requires this trait.
* Bump openvino to compatible version.
* BackendExecutionContext should be Send and Sync
* Fix rust format issues.
* Update Cargo.lock for openvino
* Audit changes to openvino crates.
Wiggle generates code that instruments APIs with tracing code. This is
handy for diagnosing issues at runtime, but when inspecting the output
of Wiggle, it can make the generated code difficult for a human to
decipher. This change makes tracing a default but optional feature,
allowing users to avoid tracing code with commands like `cargo expand
--no-default-features`. This should be no change for current crates
depending on `wiggle`, `wiggle-macro`, and `wiggle-generate`.
review: add 'tracing' feature to wasi-common
review: switch to using macro configuration parsing
Co-authored-by: Andrew Brown <andrew.brown@intel.com>
* aarch64: Fix incorrect masking for small types on bmask
`bmask` was accidentally relying on the uppermost bits of the register
for small types.
This was found by fuzzgen, when it generated a shift left followed by
a bmask, the shift left shifted the bits out of the range of the input
type (i8), however these are not automatically cleared since they
remained inside the 32 bits of the register.
That caused issues when the bmask tried to compare the whole register
instead of just the bottom bits. The solution here is to mask the upper
bits for small types.
* aarch64: Emit 32bit cmp on bmask
This fixes an issue where bmask was accidentally comparing the
upper bits of the register by always using a 64bit cmp.
* riscv: Mask high bits in bmask
* riscv: Add compile tests for br{z,nz}
* riscv: Use shifts to mask 32bit values
This produces less code than the AND since that version needs to
load an immediate constant from memory.
* cranelift: Update test input to hexadecimal values
This makes it a bit more clear what is being tested.
* riscv: Use addiw for masking 32 bit values
Co-authored-by: Trevor Elliott <telliott@fastly.com>
* aarch64: Update bmask rule priority
Co-authored-by: Trevor Elliott <telliott@fastly.com>
Add a new instruction uadd_overflow_trap, which is a fused version of iadd_ifcout and trapif. Adding this instruction removes a dependency on the iflags type, and would allow us to move closer to removing it entirely.
The instruction is defined for the i32 and i64 types only, and is currently only used in the legalization of heap_addr.
* wiggle: no longer need to guard wasmtime integration behind a feature
this existed so we could use wiggle in lucet, but lucet is long EOL
* replace wiggle::Trap with wiggle::wasmtime_crate::Trap
* wiggle tests: unwrap traps because we cant assert_eq on them
* wasi-common: emit a wasmtime::Trap instead of a wiggle::Trap
formally add a dependency on wasmtime here to make it obvious, though
we do now have a transitive one via wiggle no matter what (and therefore
can get rid of the default-features=false on the wiggle dep)
* wasi-nn: use wasmtime::Trap instead of wiggle::Trap
there's no way the implementation of this func is actually
a good idea, it will panic the host process on any error,
but I'll ask @mtr to fix that
* wiggle test-helpers examples: fixes
* wasi-common cant cross compile to wasm32-unknown-emscripten anymore
this was originally for the WASI polyfill for web targets. Those days
are way behind us now.
* wasmtime wont compile for armv7-unknown-linux-gnueabihf either
This adds a new field `types` to `ModuleTranslation`, so that
consumers can have access to the module type information known after
validation has finished. This change is useful when consumers want to
have access to the type information in wasmparser's terms rather than
in wasmtime_environ's equivalent types (e.g. `WasmFuncType`).
Now that we aren't trying to do overlap checking in parallel, we can
fuse the loop that generates a list of rule pairs with the loop that
checks those pairs.
Removing the intermediate vector of pairs should save a little time and
memory. But it also means we're no longer borrowing from the `by_term`
HashMap, so we can use `into_iter` instead of `values` to move ownership
out of the map. That in turn means that we can use `into_iter` on each
vector of rules as well, which turns out to offer a slightly nicer idiom
for looping over all pairs, and also means we drop allocations as soon
as possible.
I also pushed grouping by priority earlier, so the O(n^2) all-pairs loop
runs over smaller lists. If we later find we want to know about overlaps
across different priorities, the definition of the map key is an easy
place to make that change.
* Cranelift: disable egraphs in fuzzing for now.
As per [this comment], with a few recent discussions it's become clear
that we want to refactor egraphs in a way that will subsume, or make
irrelevant, some of the recent fuzzbugs that have arisen (and likely
lead to others, which we'll want to fix!). Rather than chase these down
then refactor later, it probably makes sense not to spend the human time
or fuzzing time doing so. This PR turns off egraphs support in fuzzing
configurations for now, to be re-enabled later.
[this comment]: https://github.com/bytecodealliance/wasmtime/issues/5126#issuecomment-1291222515
* Disable in cranelift-fuzzgen as well.
In particular, this was found to happen in #5099 because a `Result`
projection node was not deduplicating across two separate `isplit`s that
created it. (This is a separate issue we should also fix; `needs_dedup`
is I think overly conservative because `Result` can project out a single
value from a pure or impure node, but the projection itself should be
treated like any other pure operator.)
In any case, if we have a value `v0` and two separate `Result { value:
v0, result: N, ty }` nodes, each of these will fill in the type `ty` for
the `N`th output of `v0`, and the second will idempotently overwrite the
first; we should loosen the assert so that it allows this case.
Fixes#5099. Fixes#5100.
* Cranelift: pass iterators to `ABIMachineSpec::compute_arg_locs`
Instead of slices. This gives us more flexibility to pass custom sequences
without needing to allocate a `Vec` to hold them and pass in as a slice.
* Cranelift: Avoid cloning `ir::Signature`s in `SigData::from_func_sig`
This avoids two heap allocations per signature that are unnecessary 99% of the
time.
* fix typo
* Simplify condition in `missing_struct_return`
As discussed in the 2022/10/19 meeting, this PR removes many of the branch and select instructions that used iflags, in favor if using brz/brnz and select in their place. Additionally, it reworks selectif_spectre_guard to take an i8 input instead of an iflags input.
For reference, the removed instructions are: br_icmp, brif, brff, trueif, trueff, and selectif.
Looks like #5091 wasn't enough and some of the APIs needed updating with
changes made in the meantime. I've updated the action here and
additionally made a separate change where the release isn't continually
created and deleted but instead left alone and only the tag is updated.
This should work for the `dev` release and avoids deleting/recreating on
each PR, sending out notifications for new releases.
* cranelift: Remove iconst.i128
* bugpoint: Report Changed when only one instruction is mutated
* cranelift: Fix egraph bxor rule
* cranelift: Remove some simple_preopt opts for i128
Eliminate a few remaining instances of non-SSA code.
Remove infrastructure previously used for non-SSA code emission.
Related cleanup around flags handling.
A local github action we have has been broken for about a month now
meaning that the `dev` tag isn't getting updated or getting new
releases. This appears to be due to the publication of new versions of
these dependencies which are running into issues using one another. I
think I've figured out versions that work and have added a
`package-lock.json` to ensure we keep using the same versions.
The `libfuzzer-sys` update in #5068 included some changes to the
`fuzz_target!` macro which caused a bare `run` function to be shadowed
by the macro-defined `run` function (changed in
rust-fuzz/libfuzzer#95) which meant that some of our fuzz targets were
infinite looping or stack overflowing as the same function was called
indefinitely. This renames the top-level `run` function to something
else in the meantime.
Using rayon adds a lot of dependencies to Cranelift. The total
unparallelized time the code that uses rayon takes is less than half a
second and it runs at compile time, so there is pretty much no benefit
to parallelizing it.
This can help rustc/llvm avoid bounds checks, but more importantly I will have
future changes here that remove indexing of params, and instead hand them out as
an iterator.
The `SigData::from_func_sig` constructor will already ensure that the struct
return pointer is returned, so this is a purely unnecessary call.
Note that this is not a performance speed up, since
`ensure_struct_return_ptr_is_returned` doesn't do any significant work if the
signature is already legalized.
This fixes#5086 by addressing two separate issues:
- The `ValueDataPacked::set_type()` helper had an embarrassing bitfield-manipulation bug that would mangle the rest of a `ValueDef` when setting its type. This is not normally used, only when the egraph elaboration fills in types after-the-fact on a multi-value node.
- The lowering rules for `isplit` on aarch64 and s390x were dispatching on the first output type, rather than the input type. When only the second output is used (as in the example in #5086), the first output type actually remains `INVALID` (and this is fine because it's never used).
Remove uses of reg_mod from the s390x backend. This required moving away from using r0/r1 as the result registers from a few different pseudo instructions, standardizing instead on r2/r3. That change was necessary as regalloc2 will not correctly allocate registers that aren't listed in the allocatable set, which r0/r1 are not.
Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
* Upgrade our github actions to "node16"
Each github actions run has a lot of warnings about using node12 so this
upgrades our repository to using node16. I'm hoping no other changes are
needed and I suspect other actions we're using are on node12 and will
need further updates, but this should help pin down what's remaining.
* Update `actions/checkout` workflow to `v3`
* Update to `actions/cache@v3`
* Update to `actions/upload-artifact@v3`
* Drop usage of `actions-rs/toolchain`
* Update to `actions/setup-python@v4`
* Update mdbook version
* fuzzgen: Test compiler flags
* cranelift: Generate `all()` function for all enum flags
This allows a user to iterate all flags that exist.
* fuzzgen: Minimize regalloc_checker compiles
* fuzzgen: Limit the amount of test case inputs
* fuzzgen: Add egraphs flag
It's finally here! 🥳
* cranelift: Add fuzzing comment to settings
* fuzzgen: Add riscv64
* fuzzgen: Unconditionally enable some flags
This commit fixes the `push-tag.yml` workflow to work with the new
`Cargo.toml` manifest since workspace inheritance was added. This
additionally fixes some warnings coming up on CI about our usage of
deprecated features on github actions.
* egraphs: a few miscellaneous compile-time optimizations.
These optimizations together are worth about a 2% compile-time
reduction, as measured on one core with spidermonkey.wasm as an input,
using `hyperfine` on `wasmtime compile`.
The changes included are:
- Some better pre-allocation (blockparams and side-effects concatenated
list vecs);
- Avoiding the indirection of storing list-of-types for every Pure and
Inst node, when almost all nodes produce only a single result;
instead, store arity and single type if it exists, and allow result
projection nodes to fill in types otherwise;
- Pack the `MemoryState` enum into one `u32` (this together with the
above removal of the type slice allows `Node` to
shrink from 48 bytes to 32 bytes);
- always-inline an accessor (`entry` on `CtxHash`) that wasn't
(`always(inline)` appears to be load-bearing, rather than just
`inline`);
- Split the update-analysis path into two hotpaths, one for the union
case and one for the new-node case (and the former can avoid
recomputing for the contained node when replacing a node with
node-and-child eclass entry).
* Review feedback.
* Fix test build.
* Fix to lowering when unused output with invalid type is present.
* func_wrap_async typechecks
* func call async
* instantiate_async
* fixes
* async engine creation for tests
* start adding a component model test for async
* fix wrong check for async support, factor out Instance::new_started to an unchecked impl
* tests: wibbles
* component::Linker::func_wrap: replace IntoComponentFunc with directly accepting a closure
We find that this makes the Linker::func_wrap type signature much easier
to read. The IntoComponentFunc abstraction was adding a lot of weight to
"splat" a set of arguments from a tuple of types into individual
arguments to the closure. Additionally, making the StoreContextMut
argument optional, or the Result<return> optional, wasn't very
worthwhile.
* Fixes for the new style of closure required by component::Linker::func_wrap
* future of result of return
* add Linker::instantiate_async and {Typed}Func::post_return_async
* fix fuzzing generator
* note optimisation opportunity
* simplify test
* Add egraphs option to Wasmtime config, and add it to fuzzing config generation.
This PR adds a wrapper method for Cranelift's `use_egraphs` setting to
Wasmtime's `Config`, named `cranelift_use_egraphs` analogously to its
existing `cranelift_opt_level`.
Eventually this should become a no-op as egraph-based optimization
becomes the default, but until then it makes sense to expose this as
another kind of optimization option.
This PR then adds the option to the `Arbitrary`-based config generation
for fuzzing, so compilation with egraphs will be fuzzed (on its own and
against other configurations and oracles).
* Don't use `NamedTempFile` on Windows
It looks like this prevents mmap-ing since the named temporary file
holds a `File` open which conflicts with the rights we're trying to open
the file for mmap-ing. Instead use a temporary directory to try to fix
this issue.
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
* component::Linker::func_wrap: replace IntoComponentFunc with directly accepting a closure
We find that this makes the Linker::func_wrap type signature much easier
to read. The IntoComponentFunc abstraction was adding a lot of weight to
"splat" a set of arguments from a tuple of types into individual
arguments to the closure. Additionally, making the StoreContextMut
argument optional, or the Result<return> optional, wasn't very
worthwhile.
* Fixes for the new style of closure required by component::Linker::func_wrap
* fix fuzzing generator
Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements.
Fixes#3205
Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com>
Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
This is a simple error in the const-prop rules: uextend was not
masking iconst's u64 immediate when extending from i32 to
i64. Arguably an iconst.i32 should not have nonzero bits in the upper
32 of its immediate, but that's a separate design question. For now,
if our invariant is that the upper bits are ignored, then it is
required to mask the bits when const-evaling a `uextend`.
Fixes#5047.
* Plumb type exports in components around more
This commit adds some more plumbing for type exports to ensure that they
show up in the final compiled representation of a component. For now
they continued to be ignored for all purposes in the embedding API
itself but I found this useful to explore in `wit-bindgen` based tooling
which is leveraging the component parsing in Wasmtime.
* Add a field to `ModuleTranslation` to store the original wasm
This commit adds a field to be able to refer back to the original wasm
binary for a `ModuleTranslation`. This field is used in the upcoming
support for host generation in `wit-component` to "decompile" a
component into core wasm modules to get instantiated. This is used to
extract a core wasm module from the original component.
* FIx a build warning