Commit Graph

860 Commits

Chris Fallin
3e516e784b Fix lowering instruction-sinking (load-merging) bug.
This fixes a subtle corner case exposed during fuzzing. If we have a bit
of CLIF like:

```
    v0 = load.i64 ...
    v1 = iadd.i64 v0, ...
    v2 = do_other_thing v1
    v3 = load.i64 v1
```

and if this is lowered using a machine backend that can merge loads into
ALU ops, *and* that has an addressing mode that can look through add
ops, then the following can happen:

1. We lower the load at `v3`. This looks backward at the address
   operand tree and finds that `v1` is `v0` plus other things; it has an
   addressing mode that can add `v0`'s register and the other things
   directly; so it calls `put_value_in_reg(v0)` and uses its register in
   the amode. At this point, the add producing `v1` has no references,
   so it will not (yet) be codegen'd.
2. We lower `do_other_thing`, which puts `v1` in a register and uses it.
   The `iadd` now has a reference.
3. We reach the `iadd` and, because it has a reference, lower it. Our
   machine has the ability to merge a load into an ALU operation.
   Crucially, *we think the load at `v0` is mergeable* because it has
   only one user, the add at `v1` (!). So we merge it.
4. We reach the `load` at `v0` and because it has been merged into the
   `iadd`, we do not separately codegen it. The register that holds `v0`
   is thus never written, and the use of this register by the final load
   (Step 1) will see an undefined value.

The logic error here is that in the presence of pattern matching that
looks through pure ops, we can end up with multiple uses of a value that
originally had a single use (because we allow lookthrough of pure ops in
all cases). In other words, the multiple-use-ness of `v1` "passes
through" in some sense to `v0`. However, the load sinking logic is not
aware of this.

The fix, I think, is pretty simple: we disallow an effectful instruction
from sinking/merging if it already has some other use when we look back
at it.

If we disallowed lookthrough of *any* op that had multiple uses, even
pure ones, then we would avoid this scenario; but earlier experiments
showed that to have a non-negligible performance impact, so (given that
we've worked out the logic above) I think this complexity is worth it.
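
A minimal sketch of the rule, with hypothetical names (the real
lowering-context plumbing differs):

```
/// Hypothetical helper: may an effectful instruction (e.g. a load) be
/// sunk into the consumer currently being lowered?
///
/// `uses_so_far` counts the references already recorded against the
/// instruction's result. Any use beyond the one being lowered right now
/// means another instruction expects the result in a register, so the
/// load must be codegen'd separately rather than merged.
fn may_sink_effectful(uses_so_far: u32) -> bool {
    uses_so_far <= 1
}
```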
2020-12-03 14:59:12 -08:00
Anton Kirilov
f59b274d22 Cranelift AArch64: Further vector constant improvements
Introduce support for MOVI/MVNI with 16-, 32-, and 64-bit elements,
and the vector variant of FMOV.
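
For instance (illustrative encodings, not taken from the patch):

```
    movi v0.4s, #0x7f, lsl #24   // 32-bit elements, shifted 8-bit imm
    mvni v1.8h, #0x3c            // 16-bit elements, inverted imm
    fmov v2.4s, #1.0             // vector variant of FMOV
```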

Copyright (c) 2020, Arm Limited.
2020-12-03 15:30:24 +00:00
Chris Fallin
c9a81f008d x64 backend: fix condition-code used for part of explicit heap check.
A dynamic heap address computation may create up to two conditional
branches: the usual bounds-check, but also (in some cases) an
offset-addition overflow check.

The x64 backend had reversed the condition code for this check,
resulting in an always-trapping execution for a valid offset. I'm
somewhat surprised this has existed so long, but I suppose the
particular conditions (large offset, small offset guard, dynamic heap)
have been somewhat rare in our testing so far.

Found via fuzzing in #2453.
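
For illustration (a hand-written sketch, not the backend's literal
output), the overflow check should trap when the addition carries, not
on the inverse condition:

```
    addq %rdi, %rax    # addr = base + offset; the add may wrap
    jc   trap          # correct: trap only when the addition carried
    ...                # otherwise continue to the bounds-checked access
trap:
    ud2
```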
2020-12-02 10:40:53 -08:00
Chris Fallin
d413b907b4 Merge pull request #2414 from jgouly/extend-refactor
arm64: Refactor Inst::Extend handling
2020-11-25 17:22:07 -08:00
Chris Fallin
b97f07b405 x64 backend: merge loads into ALU ops when appropriate.
This PR makes use of the support in #2366 for sinking effectful
instructions and merging them with consumers. In particular, on x86, we
want to make use of the ability of many instructions to load one operand
directly from memory. That is, instead of this:

```
    movq 0(%rdi), %rax
    addq %rax, %rbx
```

we want to generate this:

```
    addq 0(%rdi), %rax
```

As described in more detail in #2366, sinking and merging the load is
only possible under certain conditions. In particular, we need to ensure
that the use is the *only* use (otherwise the load happens more than
once), and we need to ensure that it does not move across other
effectful ops (see #2366 for how we ensure this).

This change is actually fairly simple, given that all the framework is
in place: we simply pattern-match a load on one operand of an ALU
instruction that takes an RMI (reg, mem, or immediate) operand, and
generate the mem form when we match.

Also makes a drive-by improvement in the x64 backend to use
statically-monomorphized `LowerCtx` types rather than a `&mut dyn
LowerCtx`.

On `bz2.wasm`, this results in ~1% instruction-count reduction. More is
likely possible by following up with other instructions that can merge
memory loads as well.
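
A rough sketch of the operand-selection idea, using hypothetical types
(the backend's real `RegMemImm` machinery is richer):

```
/// Hypothetical mirror of the x64 "RMI" operand: an ALU op's right-hand
/// side may be a register, a memory address, or an immediate.
enum Rmi {
    Reg(u8),
    Mem { base: u8, offset: i32 },
    Imm(u32),
}

/// If the operand is produced by a load that is safe to merge (single
/// use, no effectful op in between), fold it into the ALU instruction's
/// memory operand; otherwise put the value in a register as usual.
fn choose_rhs(mergeable_load: Option<(u8, i32)>, in_reg: u8) -> Rmi {
    match mergeable_load {
        Some((base, offset)) => Rmi::Mem { base, offset },
        None => Rmi::Reg(in_reg),
    }
}
```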
2020-11-17 11:06:46 -08:00
Chris Fallin
712ff22492 AArch64 SIMD: pattern-match load+splat into LD1R instruction. 2020-11-16 15:59:28 -08:00
Joey Gouly
70cbc4ca7c arm64: Refactor Inst::Extend handling
This refactors the handling of Inst::Extend and simplifies the lowering
of Bextend and Bmask, which allows the use of SBFX instructions for
extensions from 1-bit booleans. Other extensions use aliases of BFM,
and the code was changed to reflect that rather than hard-coding bit
patterns. Also, ImmLogic is now implemented, so another hard-coded
instruction can be removed.

As part of looking at boolean handling, `normalize_boolean_result` was
changed to `materialize_boolean_result`, such that it can use either
CSET or CSETM. Using CSETM saves an instruction (previously CSET + SUB)
for booleans bigger than 1-bit.
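
For example (illustrative condition), materializing a wider all-ones
`true` drops from two instructions to one:

```
    // before: set 0/1, then negate to get 0/-1
    cset  x0, eq
    sub   x0, xzr, x0
    // after: set 0/all-ones directly
    csetm x0, eq
```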

Copyright (c) 2020, Arm Limited.
2020-11-13 16:17:25 +00:00
Chris Fallin
113d061129 Merge pull request #2369 from akirilov-arm/move_fix
Cranelift AArch64: Various small fixes
2020-11-12 14:59:10 -08:00
Andrew Brown
bd93e69eb4 [machinst x64]: implement packed shifts 2020-11-12 14:21:45 -08:00
Chris Fallin
89dbc4590d Merge pull request #2363 from cfallin/extend-only-if-abi
Do value-extensions at ABI boundaries only when ABI requires it.
2020-11-12 12:26:20 -08:00
Chris Fallin
fd6433aaf5 Merge pull request #2395 from cfallin/lucet-x64-support
Add support for brff/brif and ifcmp_sp to new x64 backend to support Lucet.
2020-11-12 12:10:52 -08:00
Anton Kirilov
edaada3f57 Cranelift AArch64: Various small fixes
* Use FMOV to move 64-bit FP registers and SIMD vectors.
* Add support for additional vector load types.
* Fix the printing of Inst::LoadAddr.

Copyright (c) 2020, Arm Limited.
2020-11-12 13:54:05 +00:00
Chris Fallin
5df8840483 Add support for brff/brif and ifcmp_sp to new x64 backend to support Lucet.
`lucetc` currently *almost*, but not quite, works with the new x64
backend; the only missing piece is support for the particular
instructions emitted as part of its prologue stack-check.

We do not normally see `brff`, `brif`, or `ifcmp_sp` in CLIF generated by
`cranelift-wasm` without the old-backend legalization rules, so these
were not supported in the new x64 backend as they were not necessary for
Wasm MVP support. Using them resulted in an `unimplemented!()` panic.

This PR adds support for `brff` and `brif` analogously to how AArch64
implements them, by pattern-matching the `ifcmp` / `ffcmp` directly.
Then `ifcmp_sp` is a straightforward variant of `ifcmp`.
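
The CLIF shape in question looks roughly like this (illustrative
operands, condition, and block names):

```
    v1 = ifcmp_sp v0        ; compare v0 against the stack pointer
    brif uge v1, block1     ; branch directly on the integer flags
```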

Along the way, this also removes the notion of "fallthrough block" from
the branch-group lowering method; instead, `fallthrough` instructions
are handled as normal branches to their explicitly-provided targets,
which (in the original CLIF) match the fallthrough block. The reason for
this is that the block reordering done as part of lowering can change
the fallthrough block. We were not using `fallthrough` instructions in
the output produced by `cranelift-wasm`, so this, too, was not
previously caught.

With these changes, the `lucetc` crate in Lucet passes all tests with
the `x64` feature-flag added to its `cranelift-codegen` dependency.
2020-11-11 13:43:39 -08:00
Chris Fallin
997b654235 Merge pull request #2393 from jgouly/constant-addend
arm64: Fold some constants into load instructions
2020-11-11 11:23:21 -08:00
Pat Hickey
aa259ff92a Merge pull request #2390 from bjorn3/more_simplejit_refactors
More SimpleJIT refactorings
2020-11-11 11:16:04 -08:00
Joey Gouly
a5011e8212 arm64: Fold some constants into load instructions
This changes the following:
  mov x0, #4
  ldr x0, [x1, x0]

Into:
  ldr x0, [x1, #4]

I noticed this pattern (but with #0) in a benchmark.

Copyright (c) 2020, Arm Limited.
2020-11-11 18:47:43 +00:00
Julian Seward
41e87a2f99 Support wasm select instruction with V128-typed operands on AArch64.
* this requires upgrading to wasmparser 0.67.0.

* There are no CLIF side changes because the CLIF `select` instruction is
  polymorphic enough.

* on aarch64, there is unfortunately no conditional-move (csel) instruction on
  vectors.  This patch adds a synthetic instruction `VecCSel` which *does*
  behave like that.  At emit time, this is emitted as an if-then-else diamond
  (4 insns; see the sketch after this list).

* aarch64 implementation is otherwise straightforward.
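
A hand-written sketch of the diamond (not the emitter's literal output),
assuming the tested condition is "equal":

```
    b.ne  1f                // condition fails: go to the else-arm
    mov   v0.16b, v1.16b    // then-arm: take the first operand
    b     2f
1:  mov   v0.16b, v2.16b    // else-arm: take the second operand
2:
```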
2020-11-11 18:45:24 +01:00
bjorn3
b7a93c2321 Remove reloc_block
It is never called, and all reloc sinks either ignore it or panic when
it is called.
2020-11-11 12:36:17 +01:00
Andrew Brown
c9e8889d47 Update clippy annotation to use latest version (#2375) 2020-11-09 09:24:59 -06:00
Yury Delendik
f60c0f3ec3 cranelift: refactor unwind logic to accommodate multiple backends (#2357)
*    Make cranelift_codegen::isa::unwind::input public
*    Move UnwindCode's common offset field out of the structure
*    Make MachCompileResult::unwind_info more generic
*    Record initial stack pointer offset
2020-11-05 16:57:40 -06:00
Andrew Brown
83f182b390 Implement initial emission of constants
This approach suffers from memory bloat at compile time due to the desire to de-duplicate the emitted constants and reduce runtime memory size. As a first step, though, this provides an end-to-end mechanism for constants to be emitted in the MachBuffer islands.
2020-11-05 14:25:02 -08:00
Chris Fallin
a2bbb198de Do value-extensions at ABI boundaries only when ABI requires it.
There has been some confusion over the meaning of the "sign-extend"
(`sext`) and "zero-extend" (`uext`) attributes on parameters and return
values in signatures. According to the three implemented backends, these
attributes indicate that a value narrower than a full register should
always be extended in the way specified. However, they are much more
useful if they mean "extend in this way if the ABI requires extending":
only the ABI backend knows whether or not a particular ABI (e.g., x64
SysV vs. x64 Baldrdash) requires extensions, while only the frontend
(CLIF generator) knows whether or not a value is signed, so the two have
to work in concert.

This is the result of some very helpful discussion in #2354 (thanks to
@uweigand for raising the issue and @bjorn3 for helping to reason about
it).

This change respects the extension attributes in the above way, rather
than unconditionally extending, to avoid potential performance
degradation as we introduce more extension attributes on signatures.
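
For example, given a signature with extension attributes (illustrative
CLIF):

```
function %callee(i8 sext, i16 uext) -> i8 sext {
block0(v0: i8, v1: i16):
    return v0
}
```

With this change, the narrow arguments and return value are widened to
full registers only on ABIs that actually require it (e.g. x64
Baldrdash), rather than unconditionally.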
2020-11-05 11:54:35 -08:00
Alex Crichton
e4c3fc5cf2 Update immediate and transitive dependencies
I don't think this has happened in awhile but I've run a `cargo update`
as well as trimming some of the duplicate/older dependencies in
`Cargo.lock` by updating some of our immediate dependencies as well.
2020-11-05 08:34:09 -08:00
Alex Crichton
ab1958434a Bump to 0.21.0 (#2359) 2020-11-05 09:39:53 -06:00
Julian Seward
dd9bfcefaa CL/aarch64: implement the wasm SIMD v128.load{32,64}_zero instructions.
This patch implements, for aarch64, the following wasm SIMD extensions.

  v128.load32_zero and v128.load64_zero instructions
  https://github.com/WebAssembly/simd/pull/237

The changes are straightforward:

* no new CLIF instructions.  They are translated into an existing CLIF scalar
  load followed by a CLIF `scalar_to_vector` (see the sketch after this list).

* the comment/specification for CLIF `scalar_to_vector` has been changed to
  match the actual intended semantics, per consultation with Andrew Brown.

* translation from `scalar_to_vector` to aarch64 `fmov` instruction.  This
  has been generalised slightly so as to allow both 32- and 64-bit transfers.

* special-case zero in `lower_constant_f128` in order to avoid a
  potentially slow call to `Inst::load_fp_constant128`.

* Once "Allow loads to merge into other operations during instruction
  selection in MachInst backends"
  (https://github.com/bytecodealliance/wasmtime/issues/2340) lands,
  we can use that functionality to pattern match the two-CLIF pair and
  emit a single AArch64 instruction.

* A simple filetest has been added.
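
Concretely, the translation produces a pair like this (illustrative
values):

```
    v1 = load.i32 v0                  ; scalar load of the low lane
    v2 = scalar_to_vector.i32x4 v1    ; the remaining lanes are zeroed
```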

There is no comprehensive testcase in this commit, because those live in a
separate repo.  The implementation has been tested, nevertheless.
2020-11-04 20:00:04 +01:00
Andrew Brown
6d50099816 Rewrite interpreter generically (#2323)
* Rewrite interpreter generically

This change re-implements the Cranelift interpreter to use generic values, which makes abstract interpretation of Cranelift instructions possible. In doing so, the interpretation state is extracted from the `Interpreter` structure and is accessed via a `State` trait; this allows not only clearer observation of the interpreter's state but also interpretation using a dummy state (e.g. `ImmutableRegisterState`; see the sketch after this list). This made it possible to implement more of the Cranelift instructions (~70%, ignoring the x86-specific ones).

* Replace macros with closures
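
A minimal sketch of the `State` idea, with a hypothetical trait shape
(the real interface differs; the commit's `ImmutableRegisterState` plays
the dummy-state role):

```
/// Hypothetical: interpretation state behind a trait, so the same
/// instruction-interpretation code can run against a real machine state
/// or a dummy/abstract one.
trait State<V> {
    fn get(&self, slot: usize) -> Option<&V>;
    fn set(&mut self, slot: usize, value: V);
}

/// A dummy state that records nothing: enough to abstractly interpret a
/// single instruction in isolation.
struct NullState;

impl<V> State<V> for NullState {
    fn get(&self, _slot: usize) -> Option<&V> { None }
    fn set(&mut self, _slot: usize, _value: V) {}
}
```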
2020-11-02 12:28:07 -08:00
Chris Fallin
d1be8dcfc0 Merge pull request #2310 from akirilov-arm/vector_constants
Cranelift AArch64: Improve code generation for vector constants
2020-11-01 21:56:40 -08:00
bjorn3
23aafa1054 Fix icmp_imm.i128
The immediate splitting code contained a bug causing both low and high
to be equal for i128. This is the root cause for
bjorn3/rustc_codegen_cranelift#1097 and likely the only bug preventing
cg_clif from bootstrapping rustc.
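
The intended split, as a sketch (the in-tree code operates on Cranelift
immediates rather than a raw `i128`):

```
fn split_i128(imm: i128) -> (u64, u64) {
    let lo = imm as u64;         // low 64 bits
    let hi = (imm >> 64) as u64; // high 64 bits; the buggy path
                                 // effectively returned `lo` here too
    (lo, hi)
}
```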
2020-10-31 21:11:50 +01:00
Anton Kirilov
207779fe1d Cranelift AArch64: Improve code generation for vector constants
In particular, introduce initial support for the MOVI and MVNI
instructions, with 8-bit elements. Also, treat vector constants
as 32- or 64-bit floating-point numbers, if their value allows
it, by relying on the architectural zero extension. Finally,
stop generating literal loads for 32-bit constants.
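
For instance (illustrative value), an 8-bit-element constant now becomes
a single instruction instead of a literal load:

```
    movi v0.16b, #0x55    // replicate 0x55 into every byte lane
```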

Copyright (c) 2020, Arm Limited.
2020-10-30 13:16:12 +00:00
Johnnie Birch
fa66daea25 Add filetests for fcvt_from_sint.f32x4
Add portions of filetests simd-conversion-legalize.clif and simd-conversion-run.clif
that test fcvt_from_sint.f32x4
2020-10-28 13:02:50 -07:00
Yury Delendik
de4af90af6 machinst x64: New backend unwind (#2266)
Addresses unwind support for the experimental x64 backend. The preliminary code enables backtraces with the System V calling convention.
2020-10-23 15:19:41 -05:00
Yury Delendik
b10e027fef Refactor UnwindInfo codes and frame_register (#2307)
* Refactor UnwindInfo codes and frame_register

* use isa word_size

* fix filetests

* Add comment about UnwindCode::PushRegister
2020-10-22 14:52:42 -05:00
Andrew Brown
0ba35171fb [machinst x64]: port more CLIF filetests 2020-10-09 10:04:50 -07:00
Benjamin Bouvier
e8c2a1763a machinst x64: avoid emitting movzx when the input is an ALU 32-bits operation; 2020-10-09 18:49:27 +02:00
Benjamin Bouvier
3980a43cda machinst x64: use the (base,offset) addressing mode even in the presence of a uextend; 2020-10-09 18:49:27 +02:00
Andrew Brown
c8cce5d2d7 [machinst x64]: enable packed saturated arithmetic 2020-10-08 08:46:20 -07:00
Benjamin Bouvier
e32e6fb612 machinst x64: check SSE requirements for instructions against enabled features; 2020-10-08 09:21:51 +02:00
Andrew Brown
3778fa025c Switch DataValue to use Ieee32/Ieee64
As discussed in #2251, in order to be very confident that NaN signaling bits are correctly handled by the compiler, this switches `DataValue` to use Cranelift's `Ieee32` and `Ieee64` structures. This makes it a bit more inconvenient to interpret Cranelift FP operations, but this should change to something like `rustc_apfloat` in the future.
2020-10-07 12:17:17 -07:00
Andrew Brown
6f6f79ef2b refactor: move DataValue from cranelift-reader to cranelift-codegen
There is no change to functionality; the move is necessary in order to return InstructionData immediates in a structured way (see next commit).
2020-10-07 12:17:17 -07:00
Chris Fallin
71768bb6cf Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs.
This PR updates the AArch64 ABI implementation so that it properly
respects that v8-v15 inclusive have callee-save lower halves and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
clobber-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Fixes #2254.
2020-10-06 14:44:02 -07:00
Johnnie Birch
5799fd3cc0 Add file test simd-arithmetic-run to x64 backend
Copies over simd-arithmetic-run from the old backend, adding
several run tests including for min/max. Tests not supported
are commented out.
2020-10-02 16:20:10 -07:00
Chris Fallin
b2f52910fb Merge pull request #2224 from jgouly/sp_adjust
arm64: Use SignedOffset rather than PreIndexed addressing mode for ca…
2020-10-02 09:18:00 -07:00
Andrew Brown
16a2538ecd [machinst x64]: rename Inst::XmmUninitializedValue and document
This approach is not the best but avoids an extra instruction; perhaps at some point, as mentioned in https://github.com/bytecodealliance/wasmtime/pull/2248, we will add the extra instruction or refactor things in such a way that this `Inst` variant is unnecessary.
2020-10-02 08:29:31 -07:00
Andrew Brown
3d9f3bf728 [machinst x64]: port CLIF tests related to comparison and lane operations 2020-10-02 08:29:31 -07:00
Joey Gouly
eec60c9b06 arm64: Use SignedOffset rather than PreIndexed addressing mode for callee-saved registers
This also passes `fixed_frame_storage_size` (previously `total_sp_adjust`)
into `gen_clobber_save` so that it can be combined with other stack
adjustments.
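
Illustrative difference (register choices and offsets invented for the
example):

```
    // before: each clobber save pre-indexes sp
    str  x19, [sp, #-16]!
    str  x20, [sp, #-16]!
    // after: one up-front adjustment, then plain signed offsets
    sub  sp, sp, #32
    str  x19, [sp, #16]
    str  x20, [sp]
```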

Copyright (c) 2020, Arm Limited.
2020-10-02 16:22:55 +01:00
Anton Kirilov
d18de69e5a AArch64: Add test cases for callee-saved SIMD & FP registers
Copyright (c) 2020, Arm Limited.
2020-09-30 14:19:02 +01:00
Andrew Brown
b43f4a464a refactor: move all 'filetests/vcode' tests to 'filetests/isa' 2020-09-29 09:27:39 -07:00
Andrew Brown
452d854855 [machinst x64]: demonstrate that packed register moves are elided 2020-09-29 08:48:37 -07:00
Andrew Brown
b7217d454f [machinst x64]: add lane-related CLIF filetests 2020-09-29 08:45:12 -07:00
Pat Hickey
b10beeee01 dep gardening (#2233)
* wasmtime-profiling: latest object dep is 0.21.1

* latest gimli is 0.22

* bump cargo.lock
2020-09-26 00:49:28 -05:00