wasmtime

Author	SHA1	Message	Date
Chris Fallin	71768bb6cf	Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs. This PR updates the AArch64 ABI implementation so that it (i) properly respects that v8-v15 inclusive have callee-save lower halves, and caller-save upper halves, by conservatively approximating (to full registers) in the appropriate directions when generating prologue caller-saves and when informing the regalloc of clobbered regs across callsites. In order to prevent saving all of these vector registers in the prologue of every non-leaf function due to the above approximation, this also makes use of a new regalloc.rs feature to exclude call instructions' writes from the clobber set returned by register allocation. This is safe whenever the caller and callee have the same ABI (because anything the callee could clobber, the caller is allowed to clobber as well without saving it in the prologue). Fixes #2254.	2020-10-06 14:44:02 -07:00
Johnnie Birch	5799fd3cc0	Add file test simd-arithmetic-run to x64 backend Copies over simd-arithmetic-run from the old backend, adding several run tests including for min/max. Tests not supported are commented out.	2020-10-02 16:20:10 -07:00
Chris Fallin	b2f52910fb	Merge pull request #2224 from jgouly/sp_adjust arm64: Use SignedOffset rather than PreIndexed addressing mode for ca…	2020-10-02 09:18:00 -07:00
Andrew Brown	16a2538ecd	[machinst x64]: rename Inst::XmmUninitializedValue and document This approach is not the best but avoids an extra instruction; perhaps at some point, as mentioned in https://github.com/bytecodealliance/wasmtime/pull/2248, we will add the extra instruction or refactor things in such a way that this `Inst` variant is unnecessary.	2020-10-02 08:29:31 -07:00
Andrew Brown	3d9f3bf728	[machinst x64]: port CLIF tests related to comparison and lane operations	2020-10-02 08:29:31 -07:00
Joey Gouly	eec60c9b06	arm64: Use SignedOffset rather than PreIndexed addressing mode for callee-saved registers This also passes `fixed_frame_storage_size` (previously `total_sp_adjust`) into `gen_clobber_save` so that it can be combined with other stack adjustments. Copyright (c) 2020, Arm Limited.	2020-10-02 16:22:55 +01:00
Anton Kirilov	d18de69e5a	AArch64: Add test cases for callee-saved SIMD & FP registers Copyright (c) 2020, Arm Limited.	2020-09-30 14:19:02 +01:00
Andrew Brown	b43f4a464a	refactor: move all 'filetests/vcode' tests to 'filetests/isa'	2020-09-29 09:27:39 -07:00
Andrew Brown	452d854855	[machinst x64]: demonstrate that packed register moves are elided	2020-09-29 08:48:37 -07:00
Andrew Brown	b7217d454f	[machinst x64]: add lane-related CLIF filetests	2020-09-29 08:45:12 -07:00
Benjamin Bouvier	e2c286deeb	machinst x64: enable clif testing This adds a new feature experimental_x64 for CLIF tests. A test is run in the new x64 backend iff: - either the test doesn't have an x86_64 target requirement, signaling it must be target agnostic or not run on this target. - or the test does require the x86_64 target, and the test is marked with the `experimental_x64` feature. This required one workaround in the parser. The reason is that the parser will try to use information not provided by the TargetIsa adapter for the Mach backends, like register names. In particular, parsing test may fail before the test runner realizes that the test must not be run. In this case, we early return an almost-empty TestFile from the parser, under the same conditions as above, so that the caller may filter out the test properly. This also copies two tests from the test suite using the new backend, for demonstration purposes.	2020-09-25 11:12:21 +02:00
bjorn3	5c5a30f76c	Fix review comments	2020-07-17 12:03:17 +02:00
bjorn3	7b7b1f4997	Rename sarg__ to sarg_t	2020-07-17 12:03:17 +02:00
bjorn3	4431ac1108	Implement SystemV struct argument passing	2020-07-17 12:03:17 +02:00
Andrew Brown	f0b083c6ad	Legalize `[u|s]widen_high` for x86 Use `x86_palignr` and `[u|s]widen_low` for legalizing this instruction.	2020-07-15 11:32:08 -07:00
Andrew Brown	c8ddf8a34c	Encode `[u|s]widen_low` for x86	2020-07-15 11:32:08 -07:00
Andrew Brown	fafef7db77	Add `x86_palignr` instructions This instruction is necessary for implementing `[s|u]widen_high`.	2020-07-15 11:32:08 -07:00
Andrew Brown	c5a69cee9f	Add x86 legalization for fcvt_to_uint_sat.i32x4 This converts an `f32x4` into an `i32x4` (unsigned) with rounding by using a long sequence of SSE4.1 compatible instructions.	2020-07-08 10:20:01 -07:00
Peter Huene	3a33749404	Remove 'set frame pointer' unwind code from Windows x64 unwind. This commit removes the "set frame pointer" unwind code and frame pointer information from Windows x64 unwind information. In Windows x64 unwind information, a "frame pointer" is actually the base address of the static part of the local frame and would be at some negative offset to RSP upon establishing the frame pointer. Currently Cranelift uses a "traditional" notion of a frame pointer, one that is the highest address in the local frame (i.e. pointing at the previous frame pointer on the stack). Windows x64 unwind doesn't describe such frame pointers and only needs one described if the frame contains a dynamic stack allocation. Fixes #1967.	2020-07-06 14:22:57 -07:00
Andrew Brown	057c93b64e	Add `unarrow` instruction with x86 implementation Adds a shared `unarrow` instruction in order to lower the Wasm SIMD specification's unsigned narrowing (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing). Additionally, this commit implements the instruction for x86 using PACKUSWB and PACKUSDW for the applicable encodings.	2020-07-02 09:35:45 -07:00
Andrew Brown	65e6de2344	Replace `x86_packss` with `snarrow` Since the Wasm specification contains narrowing instructions (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#integer-to-integer-narrowing) that lower to PACKSS*, the x86-specific instruction is not necessary in the CLIF IR.	2020-07-02 09:35:45 -07:00
Chris Fallin	a351fa52b5	Merge pull request #1930 from cfallin/spectre-heap Spectre mitigation on heap access overflow checks.	2020-07-01 09:23:04 -07:00
Chris Fallin	e694fb1312	Spectre mitigation on heap access overflow checks. This PR adds a conditional move following a heap bounds check through which the address to be accessed flows. This conditional move ensures that even if the branch is mispredicted (access is actually out of bounds, but speculation goes down in-bounds path), the actually accessed address is zero (a NULL pointer) rather than the out-of-bounds address. The mitigation is controlled by a flag that is off by default, but can be set by the embedding. Note that in order to turn it on by default, we would need to add conditional-move support to the current x86 backend; this does not appear to be present. Once the deprecated backend is removed in favor of the new backend, IMHO we should turn this flag on by default. Note that the mitigation is unnecessary when we use the "huge heap" technique on 64-bit systems, in which we allocate a range of virtual address space such that no 32-bit offset can reach other data. Hence, this only affects small-heap configurations.	2020-07-01 08:36:09 -07:00
Andrew Brown	737cf1d605	Implement `iabs` for x86 SIMD This only covers the types necessary for implementing the Wasm SIMD spec--`i8x16`, `i16x8`, `i32x4`.	2020-06-30 14:00:17 -07:00
Andrew Brown	c9d573d841	Provide spec-compliant legalization for SIMD floating point min/max	2020-06-25 14:48:16 -07:00
Andrew Brown	3675f95bb2	Legalize fcvt_to_sint_sat.i32x4 on x86 Use a lengthy sequence involving CVTTPS2DQ to quiet NaNs and saturate overflow.	2020-06-18 11:39:38 -07:00
Andrew Brown	01d34e71b9	Add x86 legalization for fcvt_from_uint.f32x4 This converts an `i32x4` into an `f32x4` with some rounding either by using an AVX512VL/F instruction--VCVTUDQ2PS--or a long sequence of SSE4.1 compatible instructions.	2020-06-12 15:06:22 -07:00
Andrew Brown	772ce73f7f	Add x86_pblendw instruction This instruction is necessary for lowering `fcvt_from_uint`.	2020-06-12 15:06:22 -07:00
Andrew Brown	546fc9ddf1	Add x86_vcvtudq2ps instruction This instruction converts i32x4 to f32x4 in several AVX512 feature sets.	2020-06-12 15:06:22 -07:00
whitequark	3796164642	x86_32: legalize br{z,nz}.i64.	2020-06-08 12:52:13 -07:00
Andrew Brown	1ea09088be	Add x86 legalization for imul.i64x2 for non-AVX CPUs The `convert_i64x2_imul` custom legalization checks the ISA flags for AVX512DQ or AVX512VL support and legalizes `imul.i64x2` to an `x86_pmullq` in this case; if not, it uses a lengthy SSE2-compatible instruction sequence.	2020-06-03 16:27:57 -07:00
Andrew Brown	5a32500518	Remove non-existent x86 encoding for sshr_imm.i64x2 This instruction does not exist in the SSE2 feature set; it can be added later with a VEX/EVEX encoding.	2020-06-03 16:27:57 -07:00
Andrew Brown	df171f01b5	Add x86_pmuludq This instruction multiplies the lower 32 bits of two 64x2 unsigned integers into an i64x2; this is necessary for lowering Wasm's i64x2.mul.	2020-06-03 16:27:57 -07:00
Andrew Brown	9ba9fd0f64	Add x86-specific instruction for i64x2 multiplication Without this special instruction, legalizing to the AVX512 instruction AND the SSE instruction sequence is impossible. This extra instruction would be rendered unnecessary by the x64 backend.	2020-06-03 16:27:57 -07:00
Nick Fitzgerald	7c68a10ed6	Merge pull request #1670 from teapotd/win64-pass-by-ref Implement passing arguments by ref for win64 ABI	2020-06-01 11:13:30 -07:00
Andrew Brown	7d6e94b952	Replace InsertLane format with TernaryImm8 The InsertLane format has an ordering (`value().imm().value()`) and immediate name (`"lane"`) that make it awkward to use for other instructions. This changes the ordering (`value().value().imm()`) and uses the default name (`"imm"`) throughout the codebase.	2020-05-29 19:56:27 -07:00
teapotd	e430984ac4	Improve bitselect codegen with knowledge of operand origin (#1783) * Encode vselect using BLEND instructions on x86 * Legalize vselect to bitselect * Optimize bitselect to vselect for some operands * Add run tests for bitselect-vselect optimization * Address review feedback	2020-05-29 19:53:11 -07:00
teapotd	759cc3e751	Implement passing arguments by ref for win64 ABI	2020-05-29 20:12:41 +02:00
Nick Fitzgerald	94380bf2b7	Merge pull request #1510 from teapotd/abi-i128-fix Always check if struct-return parameter is needed	2020-05-29 10:02:16 -07:00
whitequark	a180b5b393	x86_32: fix stack_addr encoding. Consider this testcase: target i686 function u0:0() -> i32 system_v { ss0 = explicit_slot 0 block0: v2 = stack_addr.i32 ss0 return v2 } Before this commit, in 32-bit mode the x86 backend would generate incorrect code for stack addresses: 0: 55 push ebp 1: 89 e5 mov ebp, esp 3: 83 ec 08 sub esp, 8 6: 8d 44 24 00 lea eax, [esp] a: 00 00 add byte ptr [eax], al c: 00 83 c4 08 5d c3 add byte ptr [ebx - 0x3ca2f73c], al This happened because the ModRM byte indicated a disp8 encoding, but the instruction actually used a disp32 encoding. After this commit, correct code is generated: 0: 55 push ebp 1: 89 e5 mov ebp, esp 3: 83 ec 08 sub esp, 8 6: 8d 84 24 00 00 00 00 lea eax, [esp] d: 83 c4 08 add esp, 8 10: 5d pop ebp 11: c3 ret	2020-05-29 09:17:36 -07:00
whitequark	880e692fd4	x86: add encoding for bnot.b1. Fixes #1743. Co-authored-by: iximeow <git@iximeow.net>	2020-05-28 08:43:25 -07:00
teapotd	fbac2e53f9	Make vconst BxN match specification	2020-05-27 09:37:13 -07:00
teapotd	b18846057f	Add system_v legalizer tests for i128 args	2020-05-25 20:03:24 +02:00
teapotd	0f55bb4b8d	Always check if struct-return parameter is needed	2020-05-25 20:03:24 +02:00
Peter Huene	78c3091e84	Fix FPR saving and shadow space allocation for Windows x64. This commit fixes both how FPR callee-saved registers are saved and how the shadow space allocation occurs when laying out the stack for Windows x64 calling convention. Importantly, this commit removes the compiler limitation of stack size for Windows x64 that was imposed because FPR saves previously couldn't always be represented in the unwind information. The FPR saves are now performed without using stack slots, much like how the callee-saved GPRs are saved. The total CSR space is given to `layout_stack` so that it is included in the frame size and to offset the layout of spills and explicit slots. The FPR saves are now done via an RSP offset (post adjustment) and they always follow the GPR saves on the stack. A simpler calculation can now be made to determine the proper offsets of the FPR saves for representing the unwind information. Additionally, the shadow space is no longer treated as an incoming argument, but an explicit stack slot that gets laid out at the lowest address possible in the local frame. This prevents `layout_stack` from putting a spill or explicit slot in this reserved space. In the future, `layout_stack` should take advantage of the caller-provided shadow space for spills, but this commit does not attempt to address that. The shadow space is now omitted from the local frame for leaf functions. Fixes #1728. Fixes #1587. Fixes #1475.	2020-05-20 15:37:30 -07:00
Nick Fitzgerald	52c6ece5f3	peepmatic: Make peepmatic optional to enable Rather than outright replacing parts of our existing peephole optimizations passes, this makes peepmatic an optional cargo feature that can be enabled. This allows us to take a conservative approach with enabling peepmatic everywhere, while also allowing us to get it in-tree and make it easier to collaborate on improving it quickly.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	090d1c2d32	cranelift: Port most of `simple_preopt.rs` over to the `peepmatic` DSL This ports all of the identity, no-op, simplification, and canonicalization related optimizations over from being hand-coded to the `peepmatic` DSL. This does not handle the branch-to-branch optimizations or most of the divide-by-constant optimizations.	2020-05-14 07:52:23 -07:00
whitequark	4ec16fa057	Legalize 64 bit shifts on x86_32 using PSLLQ/PSRLQ. Co-authored-by: iximeow <git@iximeow.net>	2020-05-09 03:28:19 -07:00
whitequark	162fcd3d75	Legalize [su]extend.i64 to iconst/sshr_imm + iconcat. This was already done for [su]extend.i128, and is necessary for codegen for 32-bit x86.	2020-05-05 16:08:58 -07:00
whitequark	14bdaf3ce3	Legalize ireduce.iN.i2N to isplit.	2020-05-05 14:13:30 -07:00
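
The x86_32 `stack_addr` fix in a180b5b393 above comes down to the top two bits of the ModRM byte, which tell the decoder how many displacement bytes follow. The sketch below is illustrative only and is not code from Cranelift; the names `Disp`, `modrm_disp` and `encoded_len` are hypothetical. It covers only the one-byte-opcode form used by `lea` and ignores the special cases where `mod = 00` still carries a 32-bit displacement.

```rust
/// Displacement size implied by the `mod` field (top two bits) of an
/// x86 ModRM byte in 32-bit addressing mode.
/// Hypothetical helper type for illustration, not part of Cranelift.
#[derive(Debug, PartialEq, Eq)]
enum Disp {
    None,     // mod = 00 (special cases with a disp32 are ignored here)
    Disp8,    // mod = 01
    Disp32,   // mod = 10
    Register, // mod = 11, register-direct operand, no memory access
}

fn modrm_disp(modrm: u8) -> Disp {
    match modrm >> 6 {
        0b00 => Disp::None,
        0b01 => Disp::Disp8,
        0b10 => Disp::Disp32,
        _ => Disp::Register,
    }
}

/// Bytes a decoder consumes for a one-byte-opcode instruction with this
/// ModRM byte: opcode + ModRM + optional SIB + displacement.
fn encoded_len(modrm: u8) -> usize {
    // A SIB byte follows when rm = 100 and the operand is a memory operand.
    let has_sib = (modrm >> 6) != 0b11 && (modrm & 0b111) == 0b100;
    let disp_len = match modrm_disp(modrm) {
        Disp::Disp8 => 1,
        Disp::Disp32 => 4,
        Disp::None | Disp::Register => 0,
    };
    1 + 1 + has_sib as usize + disp_len
}
```

With the buggy ModRM byte `0x44`, `encoded_len` gives 4, so a decoder stops after `8d 44 24 00` and reads the remaining zero bytes of the 32-bit displacement as new instructions, which is the stray `add byte ptr [eax], al` in the disassembly. With the corrected byte `0x84` it gives 7, matching `8d 84 24 00 00 00 00`.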

210 Commits