wasmtime

Author	SHA1	Message	Date
Karl Meakin	208d09e9f0	cranelift: rewrite `x-1` to `ineg(x)` (#6052 ) cranelift: rewrite `x-1` to `ineg(x)` Add commuted test	2023-03-17 19:52:13 +00:00
Karl Meakin	c3f5b71b6a	craneleft: cancel `ineg` when args to `imul` (#6053 ) * craneleft: cancel `ineg`/`iabs` when args to `imul` * Remove unsound `iabs(x) * iabs(y) == x*y` rewrite	2023-03-17 19:41:20 +00:00
Nick Fitzgerald	2e48babf23	cranelift-wasm: Add a bounds-checking optimization for dynamic memories and guard pages (#6031 ) * cranelift-wasm: Add a bounds-checking optimization for dynamic memories and guard pages This is a new special case for when we know that there are enough guard pages to cover the memory access's offset and access size. The precise should-we-trap condition is index + offset + access_size > bound However, if we instead check only the partial condition index > bound then the most out of bounds that the access can be, while that partial check still succeeds, is `offset + access_size`. However, when we have a guard region that is at least as large as `offset + access_size`, we can rely on the virtual memory subsystem handling these out-of-bounds errors at runtime. Therefore, the partial `index > bound` check is sufficient for this heap configuration. Additionally, this has the advantage that a series of Wasm loads that use the same dynamic index operand but different static offset immediates -- which is a common code pattern when accessing multiple fields in the same struct that is in linear memory -- will all emit the same `index > bound` check, which we can GVN. * cranelift: Add WAT tests for accessing dynamic memories with the same index but different offsets The bounds check comparison is GVN'd but we still branch on values we should know will always be true if we get this far in the code. This is actual `br_if`s in the non-Spectre code and `select_spectre_guard`s that we should know will always go a certain way if we have Spectre mitigations enabled. Improving the non-Spectre case is pretty straightforward: walk the dominator tree and remember which values we've already branched on at this point, and therefore we can simplify any further conditional branches on those same values into direct jumps. Improving the Spectre case requires something that is morally the same, but has a few snags: * We don't have actual `br_if`s to determine whether the bounds checking condition succeeded or not. We need to instead reason about dominating `select_spectre_guard; {load, store}` instruction pairs. * We have to be SUPER careful about reasoning "through" `select_spectre_guard`s. Our general rule is never to do that, since it could break the speculative execution sandboxing that the instruction is designed for.	2023-03-17 19:06:19 +00:00
Karl Meakin	73cc433bdd	cranelift: simplify `icmp` against UMAX/SMIN/SMAX (#6037 ) * cranelift: simplify `icmp` against UMAX/SMIN/SMAX * Add tests for icmp against numeric limits	2023-03-17 18:54:29 +00:00
bjorn3	a81c206870	Various cleanups to Layout (#6042 ) * Use inst_block instead of pp_block where possible * Remove unused is_block_gap method * Remove ProgramOrder trait It only has a single implementation * Rename Layout::cmp to pp_cmp to distinguish it from Ord::cmp * Make pp_block non-generic * Use rpo_cmp_block instead of rpo_cmp in the verifier * Remove ProgramPoint * Rename ExpandedProgramPoint to ProgramPoint * Remove From<ValueDef> for ProgramPoint impl	2023-03-17 18:46:34 +00:00
Trevor Elliott	411a3eff3e	cranelift: Emit a table of opcodes in gen_inst (#6046 ) * Emit a table of opcodes in gen_inst * Remove accidental export of OPCODE_SIGNATURES * Generate `Opcode::all` instead of a table	2023-03-17 17:38:28 +00:00
bjorn3	76c6ee7363	Remove split_blocks_created field (#6044 ) This has been unused since #5731	2023-03-17 16:29:39 +00:00
Alex Crichton	5ebe53a351	x64: Elide more uextend with extractlane (#6045 ) * x64: Elide more uextend with extractlane I've confirmed locally now that `pextr{b,w,d}` all zero the upper bits of the full 64-bit register size which means that the `extractlane` operation with a zero-extend can be elided for more cases, including 8-to-64-bit casts as well as 32-to-64. This helps elide a few extra `mov`s in a loop I was looking at and had a modest corresponding increase in performance (my guess was due to the slightly decreased code size mostly as opposed to the removed `mov`s). * Remove stray file	2023-03-17 16:18:41 +00:00
Afonso Bordado	d939bdbd07	fuzzgen: Add a few SIMD arithmetic ops (#5994 )	2023-03-17 15:24:23 +00:00
Karl Meakin	b53d66e634	cranelift: simplify `x-x` to `0` (#6032 ) * cranelift: simplify `x-x` to `0` * Guard `x-x => 0` rewrite with `fits_in_64`	2023-03-17 15:14:28 +00:00
Alex Crichton	28371bfd40	Validate faulting addresses are valid to fault on (#6028 ) * Validate faulting addresses are valid to fault on This commit adds a defense-in-depth measure to Wasmtime which is intended to mitigate the impact of CVEs such as GHSA-ff4p-7xrq-q5r8. Currently Wasmtime will catch `SIGSEGV` signals for WebAssembly code so long as the instruction which faulted is an allow-listed instruction (aka has a trap code listed for it). With the recent security issue, however, the problem was that a wasm guest could exploit a compiler bug to access memory outside of its sandbox. If the access was successful there's no real way to detect that, but if the access was unsuccessful then Wasmtime would happily swallow the `SIGSEGV` and report a nominal trap. To embedders, this might look like nothing is going awry. The new strategy implemented here in this commit is to attempt to be more robust towards these sorts of failures. When a `SIGSEGV` is raised the faulting pc is recorded but additionally the address of the inaccessible location is also record. After the WebAssembly stack is unwound and control returns to Wasmtime which has access to a `Store` Wasmtime will now use this inaccessible faulting address to translate it to a wasm address. This process should be guaranteed to succeed as WebAssembly should only be able to access a well-defined region of memory for all linear memories in a `Store`. If no linear memory in a `Store` could contain the faulting address, then Wasmtime now prints a scary message and aborts the process. The purpose of this is to catch these sorts of bugs, make them very loud errors, and hopefully mitigate impact. This would continue to not mitigate the impact of a guest successfully loading data outside of its sandbox, but if a guest was doing a sort of probing strategy trying to find valid addresses then any invalid access would turn into a process crash which would immediately be noticed by embedders. While I was here I went ahead and additionally took a stab at #3120. Traps due to `SIGSEGV` will now report the size of linear memory and the address that was being accessed in addition to the bland "access out of bounds" error. While this is still somewhat bland in the context of a high level source language it's hopefully at least a little bit more actionable for some. I'll note though that this isn't a guaranteed contextual message since only the default configuration for Wasmtime generates `SIGSEGV` on out-of-bounds memory accesses. Dynamically bounds-checked configurations, for example, don't do this. Testing-wise I unfortunately am not aware of a great way to test this. The closet equivalent would be something like an `unsafe` method `Config::allow_wasm_sandbox_escape`. In lieu of adding tests, though, I can confirm that during development the crashing messages works just fine as it took awhile on macOS to figure out where the faulting address was recorded in the exception information which meant I had lots of instances of recording an address of a trap not accessible from wasm. * Fix tests * Review comments * Fix compile after refactor * Fix compile on macOS * Fix trap test for s390x s390x rounds faulting addresses to 4k boundaries.	2023-03-17 14:52:54 +00:00
ghostway0	a66b3e1ab6	cranelift: Fuzz mid-end optimizations (#5998 )	2023-03-17 00:08:00 +00:00
Chris Fallin	bf212b767b	Add note to README to encourage using the rustup method to install Rust. (#6036 ) * Add note to README to encourage using the rustup method to install Rust. This addresses the root confusion in #6035. * Update README.md Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> --------- Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	2023-03-17 00:05:28 +00:00
Alex Crichton	8e500099b3	x64: Refactor and add extractlane special case for uextend/sextend (#6022 ) * x64: Refactor sextend/uextend rules Move much of the meaty logic from these lowering rules into the `extend_to_gpr` helper to benefit other callers of `extend_to_gpr` to elide instructions. This additionally simplifies `sextend` and `uextend` lowerings to rely on optimizations happening within the `extend_to_gpr` helper. * x64: Skip `uextend` for `pextr{b,w}` instructions These instructions are documented as automatically zeroing the upper bits so `uextend` operations can be skipped. This slightly improves codegen for the wasm `i{8x16,16x8}.extract_lane_u` instructions, for example. * Modernize an extractor pattern * Trim some superfluous match clauses Additionally rejigger priorities to be "mostly default" now. * Refactor 32-to-64 predicate to a helper Also adjust the pattern matched in the `extend_to_gpr` helper. * Slightly refactor pextr{b,w} case * Review comments	2023-03-16 22:14:59 +00:00
Karl Meakin	d479951469	cranelift: simplify `fneg(fneg(x))` to `x` (#6034 )	2023-03-16 22:14:12 +00:00
Karl Meakin	dccc2d6269	cranelift: simplify `ineg(ineg(x))` to `x` (#6033 )	2023-03-16 22:14:05 +00:00
Afonso Bordado	07136ae96d	cranelift-interpreter: Implement a bunch of SIMD arithmetic ops (#5991 ) * cranelift: Add function name to tests * cranelift: Move simd-ineg tests to separate file * cranelift: Move `avg_round` tests to separate file * cranelift: Move SIMD `fmin`/`fmax` tests to separate files * cranelift-interpreter: Implement a bunch of SIMD arithmetic ops Most of these are quite easy to adapt to be polymorphic * cranelift: Move shift tests from `simd-arithmetic.clif` into shift files	2023-03-16 18:44:16 +00:00
Alex Crichton	5ae8575296	x64: Take SIGFPE signals for divide traps (#6026 ) * x64: Take SIGFPE signals for divide traps Prior to this commit Wasmtime would configure `avoid_div_traps=true` unconditionally for Cranelift. This, for the division-based instructions, would change emitted code to explicitly trap on trap conditions instead of letting the `div` x86 instruction trap. There's no specific reason for Wasmtime, however, to specifically avoid traps in the `div` instruction. This means that the extra generated branches on x86 aren't necessary since the `div` and `idiv` instructions already trap for similar conditions as wasm requires. This commit instead disables the `avoid_div_traps` setting for Wasmtime's usage of Cranelift. Subsequently the codegen rules were updated slightly: * When `avoid_div_traps=true`, traps are no longer emitted for `div` instructions. * The `udiv`/`urem` instructions now list their trap as divide-by-zero instead of integer overflow. * The lowering for `sdiv` was updated to still explicitly check for zero but the integer overflow case is deferred to the instruction itself. * The lowering of `srem` no longer checks for zero and the listed trap for the `div` instruction is a divide-by-zero. This means that the codegen for `udiv` and `urem` no longer have any branches. The codegen for `sdiv` removes one branch but keeps the zero-check to differentiate the two kinds of traps. The codegen for `srem` removes one branch but keeps the -1 check since the semantics of `srem` mismatch with the semantics of `idiv` with a -1 divisor (specifically for INT_MIN). This is unlikely to have really all that much of a speedup but was something I noticed during #6008 which seemed like it'd be good to clean up. Plus Wasmtime's signal handling was already set up to catch `SIGFPE`, it was just never firing. * Remove the `avoid_div_traps` cranelift setting With no known users currently removing this should be possible and helps simplify the x64 backend. * x64: GC more support for avoid_div_traps Remove the `validate_sdiv_divisor` pseudo-instructions and clean up some of the ISLE rules now that `div` is allowed to itself trap unconditionally. x64: Store div trap code in instruction itself * Keep divisors in registers, not in memory Don't accidentally fold multiple traps together * Handle EXC_ARITHMETIC on macos * Update emit tests * Update winch and tests	2023-03-16 00:18:45 +00:00
Bobby Holley	5ff2824ebb	Bump cargo-vet to 0.5. (#6029 ) Aside from a few new features (notably automatic registry suggestions), this release removes the need to import description for criteria that are not directly used, and adds an explicit version to the cargo-vet instance.	2023-03-15 22:14:38 +00:00
Alex Crichton	d76f7ee52e	x64: Improve codegen for splats (#6025 ) This commit goes through the lowerings for the CLIF `splat` instruction and improves the support for each operator. Many of these lowerings are mirrored from v8/SpiderMonkey and there are a number of improvements: * AVX2 `v{p,}broadcast` instructions are added and used when available. Float-based splats are much simpler and always a single-instruction * Integer-based splats don't insert into an uninit xmm value and instead start out with a `movd` to move into an `xmm` register. This thoeretically breaks dependencies with prior instructions since `movd` creates a fresh new value in the destination register. * Loads are now sunk into all of the instructions. A new extractor, `sinkable_load_exact`, was added to sink the i8/i16 loads.	2023-03-15 21:33:56 +00:00
Afonso Bordado	a10c50afe9	cranelift: Translate `stack_*` accesses as unaligned (#6016 ) We can't currently ensure that these will be aligned, so we shouldn't mark them as such.	2023-03-15 18:05:55 +00:00
Alex Crichton	6ed90f86c8	x64: Add support for the `pblendw` instruction (#6023 ) This commit adds another case for `shuffle` lowering to the x64 backend for the `{,v}pblendw` instruction. This instruction selects 16-bit values from either of the inputs corresponding to an immediate 8-bit-mask where each bit selects the corresponding lane from the inputs.	2023-03-15 17:20:43 +00:00
Alex Crichton	fcddb9ca81	x64: Add lea-based lowering for iadd (#5986 ) * x64: Refactor `Amode` computation in ISLE This commit replaces the previous computation of `Amode` with a different set of rules that are intended to achieve the same purpose but are structured differently. The motivation for this commit is going to become more relevant in the next commit where `lea` will be used for the `iadd` instruction, possibly, on x64. When doing so it caused a stack overflow in the test suite during the compilation phase of a wasm module, namely as part of the `amode_add` function. This function is recursively defined in terms of itself and recurses as deep as the deepest `iadd`-chain in a program. A particular test in our test suite has a 10k-long chain of `iadd` which ended up causing a stack overflow in debug mode. This stack overflow is caused because the `amode_add` helper in ISLE unconditionally peels all the `iadd` nodes away and looks at all of them, even if most end up in intermediate registers along the way. Given that structure I couldn't find a way to easily abort the recursion. The new `to_amode` helper is structured in a similar fashion but attempts to instead only recurse far enough to fold items into the final `Amode` instead of recursing through items which themselves don't end up in the `Amode`. Put another way previously the `amode_add` helper might emit `x64_add` instructions, but it no longer does that. This goal of this commit is to preserve all the original `Amode` optimizations, however. For some parts, though, it relies more on egraph optimizations to run since if an `iadd` is 10k deep it doesn't try to find a constant buried 9k levels inside there to fold into the `Amode`. The hope, though, is that with egraphs having run already it's shuffled constants to the right most of the time and already folded any possible together. * x64: Add `lea`-based lowering for `iadd` This commit adds a rule for the lowering of `iadd` to use `lea` for 32 and 64-bit addition. The theoretical benefit of `lea` over the `add` instruction is that the `lea` variant can emulate a 3-operand instruction which doesn't destructively modify on of its operands. Additionally the `lea` operation can fold in other components such as constant additions and shifts. In practice, however, if `lea` is unconditionally used instead of `iadd` it ends up losing 10% performance on a local `meshoptimizer` benchmark. My best guess as to what's going on here is that my CPU's dedicated units for address computation are all overloaded while the ALUs are basically idle in a memory-intensive loop. Previously when the ALU was used for `add` and the address units for stores/loads it in theory pipelined things better (most of this is me shooting in the dark). To prevent the performance loss here I've updated the lowering of `iadd` to conditionally sometimes use `lea` and sometimes use `add` depending on how "complicated" the `Amode` is. Simple ones like `a + b` or `a + $imm` continue to use `add` (and its subsequent hypothetical extra `mov` necessary into the result). More complicated ones like `a + b + $imm` or `a + b << c + $imm` use `lea` as it can remove the need for extra instructions. Locally at least this fixes the performance loss relative to unconditionally using `lea`. One note is that this adds an `OperandSize` argument to the `MInst::LoadEffectiveAddress` variant to add an encoding for 32-bit `lea` in addition to the preexisting 64-bit encoding. * Conditionally use `lea` based on regalloc	2023-03-15 17:14:25 +00:00
Benjamin Bouvier	2e6c7bf994	perf: Create a per-process JIT dump file (#6024 )	2023-03-15 14:04:15 +00:00
Trevor Elliott	68b937d965	cranelift: Fix shift overflow when constructing BitSet (#6020 ) * Fix shift overflow when constructing the Wider constraint for integers * Clarify comment	2023-03-14 22:25:51 +00:00
Trevor Elliott	48ecb6f119	Compact `valid_for_target` using or patterns (#6019 )	2023-03-14 20:45:36 +00:00
Saúl Cabrera	80bfb35072	winch: Introduce `winch-environ` (#6017 ) This commit introduces the `winch-environ` crate. This crate's responsibility is to provide a shared implementatation of the `winch_codegen::FuncEnv` trait, which is Winch's function compilation environment, used to resolve module and runtime specific information needed by the code generation, such as resolving all the details about a callee in a WebAssembly module, or resolving specific information from the `VMContext`. As of this change, the implementation only includes the necessary pieces to resolve a function callee in a WebAssembly module. The idea is to evolve the `winch_codegen::FuncEnv` trait as we evolve Winch's code generation.	2023-03-14 19:59:15 +00:00
Trevor Elliott	e4d9bb7c5a	cranelift: Exclude the control type in narrower and wider (#6018 ) * Don't include the control type in `narrower` or `wider` constraints * Add verifier tests for instructions that use narrower and wider	2023-03-14 18:09:15 +00:00
Trevor Elliott	f5ad74e546	cranelift: Add narrower and wider constraints to the instruction DSL (#6013 ) * Add narrower and wider constraints to the instruction DSL * Add docs to narrower/wider operands * Update cranelift/codegen/meta/src/cdsl/instructions.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix assertion message * Simplify upper bounds for the wider constraint * Remove additional unnecessary cases in the verifier * Remove unused variables * Remove changes to is_ctrl_typevar_candidate These changes were only necessary when the type returned by an instruction was a variable constrained by narrow or widen. As we have switched to requiring that constraints must appear on argument types and not return types, these changes were not longer necessary. --------- Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-03-14 16:34:17 +00:00
Alex Crichton	5c1b468648	x64: Migrate {s,u}{div,rem} to ISLE (#6008 ) * x64: Add precise-output tests for div traps This adds a suite of `.clif` files which are intended to test the `avoid_div_traps=true` compilation of the `{s,u}{div,rem}` instructions. x64: Remove conditional regalloc in `Div` instruction Move the 8-bit `Div` logic into a dedicated `Div8` instruction to avoid having conditionally-used registers with respect to regalloc. * x64: Migrate non-trapping, `udiv`/`urem` to ISLE * x64: Port checked `udiv` to ISLE * x64: Migrate urem entirely to ISLE * x64: Use `test` instead of `cmp` to compare-to-zero * x64: Port `sdiv` lowering to ISLE * x64: Port `srem` lowering to ISLE * Tidy up regalloc behavior and fix tests * Update docs and winch * Review comments * Reword again * More refactoring test fixes * More test fixes	2023-03-14 01:44:06 +00:00
Trevor Elliott	188f712025	Mark fpromote and fdemote as operating on float scalars (#6014 )	2023-03-14 00:47:40 +00:00
Ingvar Stepanyan	873d3b50a0	Allow to disable clocks in WasiCtx (#6007 ) Takes the approach described in #6004, but also creates a wrapper for the monotonic time that encapsulates the `creation_time` field as well, since they logically belong and are always used together. This makes it easier to configure `WasiCtx` with custom clocks as well as disable them for security or determinism reasons. Closes #6004.	2023-03-13 23:47:04 +00:00
Alex Crichton	d6ce632b5b	aarch64: Specialize constant vector shifts (#5976 ) * aarch64: Specialize constant vector shifts This commit adds special lowering rules for vector-shifts-by-constant-amounts to use dedicated instructions which cuts down on the codegen here quite a bit for constant values. * Fix codegen for 0-shift-rights * Special-case zero left-shifts as well * Remove left-shift special case	2023-03-13 22:37:59 +00:00
Nick Fitzgerald	90c9bec225	wasmtime: Option to return default values for unknown imports (#6010 ) Similar to the `--trap-unknown-imports` option, which defines unknown function imports with functions that trap when called, this new `--default-values-unknown-imports` option defines unknown function imports with a function that returns the default values for the result types (either zero or null depending on the value type).	2023-03-13 21:39:30 +00:00
Alex Crichton	e2a6fe99c2	x64: Add `shuffle` specialization for `palignr` (#5999 ) * x64: Add `shuffle` specialization for `palignr` This commit adds specializations for the `palignr` instruction to the x64 backend to specialize some more patterns of byte shuffles. * Fix tests	2023-03-13 21:01:24 +00:00
Alex Crichton	bba49646c3	Reduce VM overhead of pooling spec tests (#6006 ) This commit forces bounds checks to be used when pooling and running the spec tests to ensure that they can be run at a reasonable degree of parallelism. Otherwise currently the VM reservation required for the multi-memory tests is so large that it fails to get reserved at runtime, failing the test. Closes #6003	2023-03-13 19:56:47 +00:00
Alex Crichton	03b5dbb3e0	aarch64: Use `VCodeConstant` for f64/v128 constants (#5997 ) * aarch64: Translate float and splat lowering to ISLE I was looking into `constant_f128` and its fallback lowering into memory and to get familiar with the code I figured it'd be good to port some Rust logic to ISLE. This commit ports the `constant_{f128,f64,f32}` helpers into ISLE from Rust as well as the `splat_const` helper which ended up being closely related. Tests reflect a number of regalloc changes that happened but also namely one major difference is that in the lowering of `f32` a 32-bit immediate is created now instead of a 64-bit immediate (in a GP register before it's moved into a FP register). This semantically has no change but the generated code is slightly different in a few minor cases. * aarch64: Load f64/v128 constants from a pool This commit removes the `LoadFpuConst64` and `LoadFpuConst128` pseudo-instructions from the AArch64 backend which internally loaded a nearby constant and then jumped over it. Constants now go through the `VCodeConstant` infrastructure which gets placed at the end of the function similar to how x64 works. Some minor support was added in as well to add a new addressing mode for a `MachLabel`-relative load.	2023-03-13 19:33:52 +00:00
Alex Crichton	6ecdc2482e	x64: Improve memory support in `{insert,extract}lane` (#5982 ) * x64: Improve memory support in `{insert,extract}lane` This commit improves adds support to Cranelift to emit `pextr{b,w,d,q}` with a memory destination, merging a store-of-extract operation into one instruction. Additionally AVX support is added for the `pextr` instructions. I've additionally tried to ensure that codegen tests and runtests exist for all forms of these instructions too. Add missing commas * Fix tests	2023-03-13 19:30:44 +00:00
Afonso Bordado	5c95e6fbaf	riscv64: Codemotion cleanups to ISLE files (#5984 ) * riscv64: Fix typo in extensions * riscv64: Move converters to top of file * riscv64: Group up all imm12 rules * riscv64: Move zero_reg helpers to Physical Regs section * riscv64: Move helpers away from `clz` lowerings These were in the middle of the `clz` rules and are kind of distracting * riscv64: Move `cls` rules next to `ctz`/`clz` * cranelift: Move `u8_and` / `u32_add` to Primitive Arithmetic section * riscv64: Mark some imm12 constructors as pure * cranelift: Move `s32_add_fallible` next to `u32_add` * riscv64: Fix Typo	2023-03-13 19:20:15 +00:00
uint256_t	b50cf9bb57	cranelift-entity: more efficient `EntitySet` implementation (#5978 ) * Use usize intead of u8 * Rename 'byte's to appropriate words	2023-03-13 18:43:34 +00:00
Afonso Bordado	ad0bce3a36	riscv64: Fix regaloc panic with `bor`+`bnot` on floats (#5857 )	2023-03-13 18:29:36 +00:00
Saúl Cabrera	d03612c2d9	cranelift-codegen(x64): Expose `CallInfo` (#6005 ) This commit exposes the `CallInfo` struct, needed by Winch to emit function calls.	2023-03-13 17:50:53 +00:00
Alex Crichton	7956dc6ba2	Change CLIF `shuffle` to validate lane indices (#5995 ) * Change CLIF `shuffle` to validate lane indices Previously the CLIF `shuffle` instruction did not perform any validation on the lane shuffle mask and specified that out-of-bounds lanes always returned 0 as the value. This behavior though is not required by WebAssembly which validates that lane indices are always in-bounds. Additionally since these are static immediates even other code generators should be able to verify that the immediates are in-bounds. As a result this commit updates the definition of the `shuffle` instruction to specify that all byte immediates must be in-bounds in the range of [0, 32). The verifier has been updated and some test cases have been removed that were testing this functionality. Closes #5989 * Only generate valid shuffle immediates in fuzzer	2023-03-13 14:24:11 +00:00
Afonso Bordado	2386eee56b	fuzzgen: Add SIMD instructions supported by the interpreter (#5971 ) * fuzzgen: Add some SIMD instructions * fuzzgen: Remove `scalar_to_vector` Broken in the interpreter #5911 * fuzzgen: Remove SIMD bitcasts Broken in the interpreter #5915 * fuzzgen: Fix insert lane * fuzzgen: Remove debug code * fuzzgen: Remove vall_true This is broken in the interpreter #5916 * fuzzgen: Disable a few more ops * fuzzgen: Remove `iadd_pairwise.i64x2` Turns out it doesen't exist * fuzzgen: Remove scalar `sqmul_round_sat` #5923 * fuzzgen: Disable aligned loads to SIMD values * fuzzgen: Address Review Feedback Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * fuzzgen: Rework `cmp` exclusion rules Co-Authored-By: Jamey Sharp <jsharp@fastly.com> --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2023-03-11 12:20:21 +00:00
Alex Crichton	af7ef8df9a	Fix some minor issues with the `explorer` command (#5988 ) This commit fixes a few minor issues that Nick and I ran into walking through some code with the `wasmtime explore` command: * When a new function is reached the address map iterator is advanced past the prior function to avoid accidentally attributing instructions across functions. * A `<` comparison was changed to `<=` to fix some off-by-one attributions from instructions to wasm instructions. * The `skipdata` option is enabled in Capstone to avoid truncating AArch64 disassemblies too early.	2023-03-11 02:31:31 +00:00
Nick Fitzgerald	9ed441e657	Introduce the `wasmtime-explorer` crate (#5975 ) This implements Godbolt Compiler Explorer-like functionality for Wasmtime and Cranelift. Given a Wasm module, it compiles the module to native code and then writes a standalone HTML file that gives a split pane view between the WAT and ASM disassemblies.	2023-03-11 00:33:06 +00:00
Chris Fallin	264089e29d	Cranelift: aarch64: fix undefined dest reg in f32x4.splat case. (#5987 ) One of the cases for a splat operation, as updated in #5370, wrote to a temp reg but then only conditionally transformed the temp into the final destination register. In another codepath, `rd` was left undefined. This causes a panic later when regalloc2 verifies SSA properties of its input (here, value not def'd before use). Fixes #5985.	2023-03-11 00:22:29 +00:00
Alex Crichton	52896e020d	aarch64: Add specialized `shuffle` lowerings (#5977 ) * aarch64: Add `shuffle` lowerings for the `uzp{1,2}` instructions This commit uses the same style of patterns in the x64 backend to start adding specific lowerings of the Cranelift `shuffle` instruction to particular AArch64 instructions. * aarch64: Add `shuffle` lowerings to the `zip{1,2}` instructions These instructions match the `punpck` family of instructions on x64 and should help provide more efficient lowerings than the current `shuffle` fallback. aarch64: Add `shuffle` lowerings for `trn{1,2}` Along the lines of prior commits adds specific patterns to lowering for individual AArch64 instructions available. * aarch64: Add a `shuffle` lowering for the `ext` instruction This instruction will more-or-less concatenate two 128-bit vector registers to create a 256-bit value, shift it right, and then take the lower 128-bits into the destination. This can be modeled with a `shuffle` of consecutive bytes so this adds a lowering rule to generate this instruction. * aarch64: Add `shuffle` special case for `dup` This commit adds special cases for Cranelift's `shuffle` on AArch64 when the lowering can be represented with a `dup` instruction which broadcasts one vector's lane into all lanes of the destination. * aarch64: Add `shuffle` specializations for `rev` instructions This commit adds shuffle mask specializations for the `rev{16,32,64}` family of instructions on AArch64 which can be used to reverse bytes, 16-bit values, or 32-bit values within larger values. * Fix tests * Add doc-comments in ISLE	2023-03-10 21:37:13 +00:00
Nick Fitzgerald	5623f7280c	Update `wasmprinter` and `wasm-mutate` deps (#5983 ) * Bump wasm-mutate and wasmprinter deps * Add wildcard audits for wasmprinter and wasm-mutate * Add wildcard audit for bumpalo	2023-03-10 20:20:57 +00:00
Ulrich Weigand	411781d2fe	s390x: Fix mistake in available_in_isa (#5981 ) The 32-bit float<->int conversion instructions are part of the VXRS_EXT2 facility, not MIE2. Fixes https://github.com/bytecodealliance/wasmtime/issues/5979.	2023-03-10 19:41:41 +00:00

... 2 3 4 5 6 ...

11191 Commits