wasmtime

Author	SHA1	Message	Date
Chris Fallin	c84d6be6f4	Detailed debug-info (DWARF) support in new backends (initially x64). This PR propagates "value labels" all the way from CLIF to DWARF metadata on the emitted machine code. The key idea is as follows: - Translate value-label metadata on the input into "value_label" pseudo-instructions when lowering into VCode. These pseudo-instructions take a register as input, denote a value label, and semantically are like a "move into value label" -- i.e., they update the current value (as seen by debugging tools) of the given local. These pseudo-instructions emit no machine code. - Perform a dataflow analysis at the machine-code level, tracking value-labels that propagate into registers and into [SP+constant] stack storage. This is a forward dataflow fixpoint analysis where each storage location can contain a set of value labels, and each value label can reside in a set of storage locations. (Meet function is pairwise intersection by storage location.) This analysis traces value labels symbolically through loads and stores and reg-to-reg moves, so it will naturally handle spills and reloads without knowing anything special about them. - When this analysis converges, we have, at each machine-code offset, a mapping from value labels to some number of storage locations; for each offset for each label, we choose the best location (prefer registers). Note that we can choose any location, as the symbolic dataflow analysis is sound and guarantees that the value at the value_label instruction propagates to all of the named locations. - Then we can convert this mapping into a format that the DWARF generation code (wasmtime's debug crate) can use. This PR also adds the new-backend variant to the gdb tests on CI.	2021-01-21 15:59:49 -08:00
bjorn3	81d248c057	Implement Mach-O TLS access for x64 newBE	2021-01-21 18:25:56 +01:00
Chris Fallin	0f563f786a	Add ELF TLS support in new x64 backend. This follows the implementation in the legacy x86 backend, including hardcoded sequence that is compatible with what the linker expects. We could potentially do better here, but it is likely not necessary. Thanks to @bjorn3 for a bugfix to an earlier version of this.	2021-01-17 22:48:51 -08:00
Chris Fallin	71ead6e31d	x64 backend: implement 128-bit ops and misc fixes. This implements all of the ops on I128 that are implemented by the legacy x86 backend, and includes all that are required by at least one major use-case (cg_clif rustc backend). The sequences are open-coded where necessary; for e.g. the bit operations, this can be somewhat complex, but these sequences have been tested carefully. This PR also includes a drive-by fix of clz/ctz for 8- and 16-bit cases where they were incorrect previously. Also includes ridealong fixes developed while bringing up cg_clif support, because they are difficult to completely separate due to other refactors that occurred in this PR: - fix REX prefix logic for some 8-bit instructions. When using an 8-bit register in 64-bit mode on x86-64, the REX prefix semantics are somewhat subtle: without the REX prefix, register numbers 4--7 correspond to the second-to-lowest byte of the first four registers (AH, CH, BH, DH), whereas with the REX prefix, these register numbers correspond to the usual encoding (SPL, BPL, SIL, DIL). We could always emit a REX byte for instructions with 8-bit cases (this is harmless even if unneeded), but this would unnecessarily inflate code size; instead, the usual approach is to emit it only for these registers. This logic was present in some cases but missing for some other instructions: divide, not, negate, shifts. Fixes #2508. - avoid unaligned SSE loads on some f64 ops. The implementations of several FP ops, such as fabs/fneg, used SSE instructions. This is not a problem per-se, except that load-op merging did not take alignment into account. Specifically, if an op on an f64 loaded from memory happened to merge that load, and the instruction into which it was merged was an SSE instruction, then the SSE instruction imposes stricter (128-bit) alignment requirements than the load.f64 did. This PR simply forces any instruction lowerings that could use SSE instructions to implement non-SIMD operations to take inputs in registers only, and avoid load-op merging. Fixes #2507. - two bugfixes exposed by cg_clif: urem/srem.i8, select.b1. - urem/srem.i8: the 8-bit form of the DIV instruction on x86-64 places the remainder in AH, not RDX, different from all the other width-forms of this instruction. - select.b1: we were not recognizing selects of boolean values as integer-typed operations, so we were generating XMM moves instead (!).	2021-01-14 13:45:50 -08:00
Chris Fallin	81bc811236	Merge pull request #2558 from cfallin/pic-symbol-refs x64: support PC-rel symbol references using the GOT when in PIC mode.	2021-01-08 10:03:10 -08:00
Chris Fallin	3ee898cb2c	x64: support PC-rel symbol references using the GOT when in PIC mode.	2021-01-07 22:46:56 -08:00
Chris Fallin	dbd2241b60	x64: handle tests of b1 values correctly (only LSB is defined). Previously, `select` and `brz`/`brnz` instructions, when given a `b1` boolean argument, would test whether that boolean argument was nonzero, rather than whether its LSB was nonzero. Since our invariant for mapping CLIF state to machine state is that bits beyond the width of a value are undefined, the proper lowering is to test only the LSB. (aarch64 does not have the same issue because its `Extend` pseudoinst already properly handles masking of b1 values when a zero-extend is requested, as it is for select/brz/brnz.) Found by Nathan Ringo on Zulip [1] (thanks!). [1] https://bytecodealliance.zulipchat.com/#narrow/stream/217117-cranelift/topic/bnot.20on.20b1s	2021-01-05 14:45:46 -08:00
Yury Delendik	2964023a77	[SIMD][x86_64] Add encoding for PMADDWD (#2530 ) * [SIMD][x86_64] Add encoding for PMADDWD * also for "experimental_x64"	2020-12-24 07:52:50 -06:00
Johnnie Birch	f705a72aeb	Refactor packed moves to use xmm_mov instead of xmm_rm_r Refactors previous packed move implementation to use xmm_mov instead of xmm_rm_r which looks to simplify register accounting during lowering.	2020-12-16 17:13:27 -08:00
Johnnie Birch	51973aefbb	Implements x64 SIMD loads for the new backend.	2020-12-16 17:13:27 -08:00
Chris Fallin	2cec20aa57	Merge pull request #2486 from cfallin/fix-probestack Two Lucet-related fixes to stack overflow handling.	2020-12-07 16:47:37 -08:00
Chris Fallin	3a01d14712	Two Lucet-related fixes to stack overflow handling. Lucet uses stack probes rather than explicit stack limit checks as Wasmtime does. In bytecodealliance/lucet#616, I have discovered that I previously was not running some Lucet runtime tests with the new backend, so was missing some test failures due to missing pieces in the new backend. This PR adds (i) calls to probestack, when enabled, in the prologue of every function with a stack frame larger than one page (configurable via flags); and (ii) trap metadata for every instruction on x86-64 that can access the stack, hence be the first point at which a stack overflow is detected when the stack pointer is decremented.	2020-12-07 16:08:53 -08:00
Johnnie Birch	a33e755cb2	Adds x86 SIMD support for Ceil, Floor, Trunc, and Nearest	2020-12-02 13:44:51 -08:00
Johnnie Birch	2cc501427e	Add remaining X86_64 support for pack w/ signed/unsigned saturation Adds lowering for packssdw, packusdw, packuswb	2020-11-22 23:14:29 -08:00
Johnnie Birch	124096735b	Add support for palignr for X86_64 vcode backend	2020-11-22 22:14:02 -08:00
Johnnie Birch	615a575da1	Add support for x86_64 packed move lowering for the vcode backend	2020-11-22 20:23:00 -08:00
Chris Fallin	073c727a74	x64 and aarch64: carry MemFlags on loads/stores; don't emit trap info unless an op can trap. This end result was previously enacted by carrying a `SourceLoc` on every load/store, which was somewhat cumbersome, and only indirectly encoded metadata about a memory reference (can it trap) by its presence or absence. We have a type for this -- `MemFlags` -- that tells us everything we might want to know about a load or store, and we should plumb it through to code emission instead. This PR attaches a `MemFlags` to an `Amode` on x64, and puts it on load and store `Inst` variants on aarch64. These two choices seem to factor things out in the nicest way: there are relatively few load/store insts on aarch64 but many addressing modes, while the opposite is true on x64.	2020-11-17 11:43:06 -08:00
Andrew Brown	8ba92853be	[machinst x64]: add punpack[hl]bw instructions	2020-11-12 14:21:45 -08:00
Andrew Brown	8131b15921	[machinst x64]: allow addressing of constants	2020-11-12 14:21:45 -08:00
Chris Fallin	4dce51096d	MachInst backends: handle SourceLocs out-of-band, not in Insts. In existing MachInst backends, many instructions -- any that can trap or result in a relocation -- carry `SourceLoc` values in order to propagate the location-in-original-source to use to describe resulting traps or relocation errors. This is quite tedious, and also error-prone: it is likely that the necessary plumbing will be missed in some cases, and in any case, it's unnecessarily verbose. This PR factors out the `SourceLoc` handling so that it is tracked during emission as part of the `EmitState`, and plumbed through automatically by the machine-independent framework. Instruction emission code that directly emits trap or relocation records can query the current location as necessary. Then we only need to ensure that memory references and trap instructions, at their (one) emission point rather than their (many) lowering/generation points, are wired up correctly. This does have the side-effect that some loads and stores that do not correspond directly to user code's heap accesses will have unnecessary but harmless trap metadata. For example, the load that fetches a code offset from a jump table will have a 'heap out of bounds' trap record attached to it; but because it is bounds-checked, and will never actually trap if the lowering is correct, this should be harmless. The simplicity improvement here seemed more worthwhile to me than plumbing through a "corresponds to user-level load/store" bit, because the latter is a bit complex when we allow for op merging. Closes #2290: though it does not implement a full "metadata" scheme as described in that issue, this seems simpler overall.	2020-11-10 15:46:53 -08:00
Andrew Brown	83f182b390	Implement initial emission of constants This approach suffers from memory-size bloat during compile time due to the desire to de-duplicate the constants emitted and reduce runtime memory-size. As a first step, though, this provides an end-to-end mechanism for constants to be emitted in the MachBuffer islands.	2020-11-05 14:25:02 -08:00
Andrew Brown	6725b6b129	[machinst x64]: implement bitmask	2020-10-28 15:16:36 -07:00
Johnnie Birch	8bbe6a25a9	Add support for packed float to signed int conversion Implements i32x4.trunc_sat_f32x4_s	2020-10-28 13:02:50 -07:00
Chris Fallin	c35904a8bf	Merge pull request #2278 from akirilov-arm/load_splat Introduce the Cranelift IR instruction `LoadSplat`	2020-10-28 12:54:03 -07:00
Johnnie Birch	f27c0f3434	Adds support for signed packed integer conversion to float f32x4.convert_i32x4_s	2020-10-16 14:16:53 -07:00
Andrew Brown	d990dd4c9a	[machinst x64]: add source locations to more instruction formats In order to register traps for `load_splat`, several instruction formats need knowledge of `SourceLoc`s; however, since the x64 backend does not correctly and completely register traps for `RegMem::Mem` variants I opened https://github.com/bytecodealliance/wasmtime/issues/2290 to discuss and resolve this issue. In the meantime, the current behavior (i.e. remaining largely unaware of `SourceLoc`s) is retained.	2020-10-14 09:43:33 -07:00
Andrew Brown	3c55523d40	[machinst x64]: implement packed and, and_not, xor, or	2020-10-09 10:04:50 -07:00
Andrew Brown	c8cce5d2d7	[machinst x64]: enable packed saturated arithmetic	2020-10-08 08:46:20 -07:00
Benjamin Bouvier	116acb8dcd	machinst x64: emit nop of variable sizes;	2020-10-08 10:05:57 +02:00
Benjamin Bouvier	a470f1e0cd	machinst x64: remove dead code and allow(dead_code) annotation; The BranchTarget is always used as a label, so just use a plain MachLabel in this case.	2020-10-08 10:05:57 +02:00
Benjamin Bouvier	e32e6fb612	machinst x64: check SSE requirements for instructions against enabled features;	2020-10-08 09:21:51 +02:00
Benjamin Bouvier	c5bbc87498	machinst: allow passing constant information to the instruction emitter; A new associated type Info is added to MachInstEmit, which is the immutable counterpart to State. It can't easily be constructed from an ABICallee, since it would require adding an associated type to the latter, and making so leaks the associated type in a lot of places in the code base and makes the code harder to read. Instead, the EmitInfo state can simply be passed to the `Vcode::emit` function directly.	2020-10-08 09:21:51 +02:00
Benjamin Bouvier	df8f85f4bc	machinst x64: remove non_camel_case_types;	2020-10-05 17:44:31 +02:00
Benjamin Bouvier	4a10a78e33	machinst x64: remove non_snake_case;	2020-10-05 17:44:31 +02:00
Andrew Brown	16a2538ecd	[machinst x64]: rename Inst::XmmUninitializedValue and document This approach is not the best but avoids an extra instruction; perhaps at some point, as mentioned in https://github.com/bytecodealliance/wasmtime/pull/2248, we will add the extra instruction or refactor things in such a way that this `Inst` variant is unnecessary.	2020-10-02 08:29:31 -07:00
Andrew Brown	50b9399006	[machinst x64]: lower remaining lane operations--any_true, all_true, splat	2020-10-02 08:29:31 -07:00
Andrew Brown	0579e9f9de	[machinst x64]: add packed OR	2020-10-02 08:29:31 -07:00
Andrew Brown	74226d6781	[machinst x64]: add integer comparisons	2020-10-02 08:29:31 -07:00
Andrew Brown	050f078f86	[machinst x64]: add saturating addition implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	a64abf9b76	[machinst x64]: add shuffle implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	f4836f9ca9	[machinst x64]: add extractlane implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	29fa894790	[machinst x64]: add insertlane implementation	2020-09-29 08:45:12 -07:00
Andrew Brown	ac2bf9d246	[machinst x64]: add packed min/max implementations	2020-09-23 15:40:46 -07:00
Andrew Brown	7546d98844	[machinst x64]: add avg_round implementation	2020-09-23 15:40:46 -07:00
Andrew Brown	b202464fa0	[machinst x64]: add iabs implementation	2020-09-23 15:40:46 -07:00
Benjamin Bouvier	3849dc18b1	machinst x64: revamp integer immediate emission; In particular: - try to optimize the integer emission into a 32-bit emission, when the high bits are all zero, and stop relying on the caller of `imm_r` to ensure this. - rename `Inst::imm_r`/`Inst::Imm_R` to `Inst::imm`/`Inst::Imm`. - generate a sign-extending mov 32-bit immediate to 64-bits, whenever possible. - fix a few places where the previous commit did introduce the generation of zero-constants with xor, when calling `put_input_to_reg`, thus clobbering the flags before they were read.	2020-09-11 18:13:30 +02:00
bjorn3	9428480230	Merge SignExtendAlAh and SignExtendRaxRdx	2020-09-08 15:00:24 +02:00
bjorn3	067255ef45	x64: Implement rotl and rotr for small integers	2020-09-08 15:00:24 +02:00
bjorn3	ce033f2a0c	x64: Fix udiv and sdiv for 8bit integers	2020-09-08 15:00:24 +02:00
bjorn3	74642b166f	x64: Implement ineg and bnot	2020-09-08 15:00:24 +02:00

1 2 3 4 5

210 Commits