wasmtime

Author	SHA1	Message	Date
Chris Fallin	3ee898cb2c	x64: support PC-rel symbol references using the GOT when in PIC mode.	2021-01-07 22:46:56 -08:00
Johnnie Birch	f705a72aeb	Refactor packed moves to use xmm_mov instead of xmm_rm_r Refactors previous packed move implementation to use xmm_mov instead of xmm_rm_r which looks to simplify register accounting during lowering.	2020-12-16 17:13:27 -08:00
Johnnie Birch	51973aefbb	Implements x64 SIMD loads for the new backend.	2020-12-16 17:13:27 -08:00
Chris Fallin	2cec20aa57	Merge pull request #2486 from cfallin/fix-probestack Two Lucet-related fixes to stack overflow handling.	2020-12-07 16:47:37 -08:00
Chris Fallin	3a01d14712	Two Lucet-related fixes to stack overflow handling. Lucet uses stack probes rather than explicit stack limit checks as Wasmtime does. In bytecodealliance/lucet#616, I have discovered that I previously was not running some Lucet runtime tests with the new backend, so was missing some test failures due to missing pieces in the new backend. This PR adds (i) calls to probestack, when enabled, in the prologue of every function with a stack frame larger than one page (configurable via flags); and (ii) trap metadata for every instruction on x86-64 that can access the stack, hence be the first point at which a stack overflow is detected when the stack pointer is decremented.	2020-12-07 16:08:53 -08:00
Chris Fallin	1dddba649a	x64 regalloc register order: put caller-saves (volatiles) first. The x64 backend currently builds the `RealRegUniverse` in a way that is generating somewhat suboptimal code. In many blocks, we see uses of callee-save (non-volatile) registers (r12, r13, r14, rbx) first, even in very short leaf functions where there are plenty of volatiles to use. This is leading to unnecessary spills/reloads. On one (local) test program, a medium-sized C benchmark compiled to Wasm and run on Wasmtime, I am seeing a ~10% performance improvement with this change; it will be less pronounced in programs with high register pressure (there we are likely to use all registers regardless, so the prologue/epilogue will save/restore all callee-saves), or in programs with fewer calls, but this is a clear win for small functions and in many cases removes prologue/epilogue clobber-saves altogether. Separately, I think the RA's coalescing is tripping up a bit in some cases; see e.g. the filetest touched by this commit that loads a value into %rsi then moves to %rax and returns immediately. This is an orthogonal issue, though, and should be addressed (if worthwhile) in regalloc.rs.	2020-12-06 22:37:43 -08:00
Johnnie Birch	a548516f97	Enable SIMD spec tests for f32x4_rounding and f64x4_rounding. Also address some review comments pointing out minor issues.	2020-12-02 13:44:51 -08:00
Johnnie Birch	a33e755cb2	Adds x86 SIMD support for Ceil, Floor, Trunc, and Nearest	2020-12-02 13:44:51 -08:00
Johnnie Birch	2cc501427e	Add remaining X86_64 support for pack w/ signed/unsigned saturation Adds lowering for packssdw, packusdw, packuswb	2020-11-22 23:14:29 -08:00
Johnnie Birch	124096735b	Add support for palignr for X86_64 vcode backend	2020-11-22 22:14:02 -08:00
Johnnie Birch	615a575da1	Add support for x86_64 packed move lowering for the vcode backend	2020-11-22 20:23:00 -08:00
Alex Crichton	4d64c68b05	Run rustfmt 1.48 Run rustfmt over wasmtime with the new stable release which looks like it wants to reformat a few lines.	2020-11-19 11:12:30 -08:00
Chris Fallin	073c727a74	x64 and aarch64: carry MemFlags on loads/stores; don't emit trap info unless an op can trap. This end result was previously enacted by carrying a `SourceLoc` on every load/store, which was somewhat cumbersome, and only indirectly encoded metadata about a memory reference (can it trap) by its presence or absence. We have a type for this -- `MemFlags` -- that tells us everything we might want to know about a load or store, and we should plumb it through to code emission instead. This PR attaches a `MemFlags` to an `Amode` on x64, and puts it on load and store `Inst` variants on aarch64. These two choices seem to factor things out in the nicest way: there are relatively few load/store insts on aarch64 but many addressing modes, while the opposite is true on x64.	2020-11-17 11:43:06 -08:00
Andrew Brown	8ba92853be	[machinst x64]: add punpack[hl]bw instructions	2020-11-12 14:21:45 -08:00
Andrew Brown	8131b15921	[machinst x64]: allow addressing of constants	2020-11-12 14:21:45 -08:00
Chris Fallin	4dce51096d	MachInst backends: handle SourceLocs out-of-band, not in Insts. In existing MachInst backends, many instructions -- any that can trap or result in a relocation -- carry `SourceLoc` values in order to propagate the location-in-original-source to use to describe resulting traps or relocation errors. This is quite tedious, and also error-prone: it is likely that the necessary plumbing will be missed in some cases, and in any case, it's unnecessarily verbose. This PR factors out the `SourceLoc` handling so that it is tracked during emission as part of the `EmitState`, and plumbed through automatically by the machine-independent framework. Instruction emission code that directly emits trap or relocation records can query the current location as necessary. Then we only need to ensure that memory references and trap instructions, at their (one) emission point rather than their (many) lowering/generation points, are wired up correctly. This does have the side-effect that some loads and stores that do not correspond directly to user code's heap accesses will have unnecessary but harmless trap metadata. For example, the load that fetches a code offset from a jump table will have a 'heap out of bounds' trap record attached to it; but because it is bounds-checked, and will never actually trap if the lowering is correct, this should be harmless. The simplicity improvement here seemed more worthwhile to me than plumbing through a "corresponds to user-level load/store" bit, because the latter is a bit complex when we allow for op merging. Closes #2290: though it does not implement a full "metadata" scheme as described in that issue, this seems simpler overall.	2020-11-10 15:46:53 -08:00
Yury Delendik	f60c0f3ec3	cranelift: refactor unwind logic to accommodate multiple backends (#2357) * Make cranelift_codegen::isa::unwind::input public * Move UnwindCode's common offset field out of the structure * Make MachCompileResult::unwind_info more generic * Record initial stack pointer offset	2020-11-05 16:57:40 -06:00
Andrew Brown	83f182b390	Implement initial emission of constants This approach suffers from memory-size bloat during compile time due to the desire to de-duplicate the constants emitted and reduce runtime memory-size. As a first step, though, this provides an end-to-end mechanism for constants to be emitted in the MachBuffer islands.	2020-11-05 14:25:02 -08:00
Andrew Brown	6725b6b129	[machinst x64]: implement bitmask	2020-10-28 15:16:36 -07:00
Andrew Brown	5b9a21e099	Add missing `SourceLoc` to newly-emitted instructions The changes in https://github.com/bytecodealliance/wasmtime/pull/2278 added `SourceLoc`s to several x64 `Inst` variants; between when that PR was last run in CI and when it was merged, new instructions were added that require this new parameter. This change adds the parameter in order to fix CI.	2020-10-28 14:33:09 -07:00
Johnnie Birch	8bbe6a25a9	Add support for packed float to signed int conversion Implements i32x4.trunc_sat_f32x4_s	2020-10-28 13:02:50 -07:00
Chris Fallin	c35904a8bf	Merge pull request #2278 from akirilov-arm/load_splat Introduce the Cranelift IR instruction `LoadSplat`	2020-10-28 12:54:03 -07:00
Yury Delendik	de4af90af6	machinst x64: New backend unwind (#2266) Addresses unwind for experimental x64 backend. The preliminary code enables backtrace on SystemV call convention.	2020-10-23 15:19:41 -05:00
Johnnie Birch	f27c0f3434	Adds support for signed packed integer conversion to float f32x4.convert_i32x4_s	2020-10-16 14:16:53 -07:00
Andrew Brown	d990dd4c9a	[machinst x64]: add source locations to more instruction formats In order to register traps for `load_splat`, several instruction formats need knowledge of `SourceLoc`s; however, since the x64 backend does not correctly and completely register traps for `RegMem::Mem` variants I opened https://github.com/bytecodealliance/wasmtime/issues/2290 to discuss and resolve this issue. In the meantime, the current behavior (i.e. remaining largely unaware of `SourceLoc`s) is retained.	2020-10-14 09:43:33 -07:00
Andrew Brown	1799b0947f	[machinst x64]: implement packed bitselect	2020-10-09 10:04:50 -07:00
Andrew Brown	95f0e96e62	[machinst x64]: implement packed not This begins to use `Inst` helper functions as discussed in #2252.	2020-10-09 10:04:50 -07:00
Andrew Brown	3c55523d40	[machinst x64]: implement packed and, and_not, xor, or	2020-10-09 10:04:50 -07:00
Benjamin Bouvier	e8c2a1763a	machinst x64: avoid emitting movzx when the input is an ALU 32-bits operation;	2020-10-09 18:49:27 +02:00
Andrew Brown	c8cce5d2d7	[machinst x64]: enable packed saturated arithmetic	2020-10-08 08:46:20 -07:00
Benjamin Bouvier	116acb8dcd	machinst x64: emit nop of variable sizes;	2020-10-08 10:05:57 +02:00
Benjamin Bouvier	a470f1e0cd	machinst x64: remove dead code and allow(dead_code) annotation; The BranchTarget is always used as a label, so just use a plain MachLabel in this case.	2020-10-08 10:05:57 +02:00
Benjamin Bouvier	e32e6fb612	machinst x64: check SSE requirements for instructions against enabled features;	2020-10-08 09:21:51 +02:00
Benjamin Bouvier	c5bbc87498	machinst: allow passing constant information to the instruction emitter; A new associated type Info is added to MachInstEmit, which is the immutable counterpart to State. It can't easily be constructed from an ABICallee, since it would require adding an associated type to the latter, and doing so leaks the associated type in a lot of places in the code base and makes the code harder to read. Instead, the EmitInfo state can simply be passed to the `Vcode::emit` function directly.	2020-10-08 09:21:51 +02:00
Benjamin Bouvier	84ac3feef8	machinst x64: use zero-latency move instructions for f32/f64; As found by @julian-seward1, movss/movsd aren't included in the zero-latency move instructions section of the Intel optimization manual. Use MOVAPS instead for those moves.	2020-10-07 10:55:44 +02:00
Chris Fallin	71768bb6cf	Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs. This PR updates the AArch64 ABI implementation so that it (i) properly respects that v8-v15 inclusive have callee-save lower halves, and caller-save upper halves, by conservatively approximating (to full registers) in the appropriate directions when generating prologue caller-saves and when informing the regalloc of clobbered regs across callsites. In order to prevent saving all of these vector registers in the prologue of every non-leaf function due to the above approximation, this also makes use of a new regalloc.rs feature to exclude call instructions' writes from the clobber set returned by register allocation. This is safe whenever the caller and callee have the same ABI (because anything the callee could clobber, the caller is allowed to clobber as well without saving it in the prologue). Fixes #2254.	2020-10-06 14:44:02 -07:00
Benjamin Bouvier	df8f85f4bc	machinst x64: remove non_camel_case_types;	2020-10-05 17:44:31 +02:00
Benjamin Bouvier	4a10a78e33	machinst x64: remove non_snake_case;	2020-10-05 17:44:31 +02:00
Andrew Brown	16a2538ecd	[machinst x64]: rename Inst::XmmUninitializedValue and document This approach is not the best but avoids an extra instruction; perhaps at some point, as mentioned in https://github.com/bytecodealliance/wasmtime/pull/2248, we will add the extra instruction or refactor things in such a way that this `Inst` variant is unnecessary.	2020-10-02 08:29:31 -07:00
Andrew Brown	50b9399006	[machinst x64]: lower remaining lane operations--any_true, all_true, splat	2020-10-02 08:29:31 -07:00
Andrew Brown	4565582f02	[machinst x64]: clarify parameter name of Inst::xmm_rm_r_imm	2020-10-02 08:29:31 -07:00
Andrew Brown	0579e9f9de	[machinst x64]: add packed OR	2020-10-02 08:29:31 -07:00
Andrew Brown	74226d6781	[machinst x64]: add integer comparisons	2020-10-02 08:29:31 -07:00
Andrew Brown	4484a00ea5	[machinst x64]: calculate extension modes in one place	2020-09-29 14:48:59 -07:00
Andrew Brown	f50d905152	[machinst x64]: refactor using added RegMem::from(Writable<Reg>)	2020-09-29 08:45:12 -07:00
Andrew Brown	050f078f86	[machinst x64]: add saturating addition implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	a64abf9b76	[machinst x64]: add shuffle implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	f4836f9ca9	[machinst x64]: add extractlane implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	29fa894790	[machinst x64]: add insertlane implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	48cf45491d	[machinst x64]: inform the register allocator of more types of packed moves	2020-09-25 18:59:01 -07:00

139 Commits