wasmtime

Author	SHA1	Message	Date
Benjamin Bouvier	46093f6119	Bump regalloc.rs to 0.0.26; And adapt to regalloc.rs API change to provide the exact number of vregs.	2020-06-10 18:23:04 +02:00
Anton Kirilov	d941034c2e	Enable the wast::Cranelift::spec::simd::simd_load_splat test for AArch64 Copyright (c) 2020, Arm Limited.	2020-06-10 15:01:37 +01:00
Chris Fallin	ac87ed12bd	Merge pull request #1847 from akirilov-arm/simd_load_extend Enable the wast::Cranelift::spec::simd::simd_load_extend test for AArch64	2020-06-09 12:29:06 -07:00
Joey Gouly	df2b031b6a	arm64: Implement Icmp for I16X8 and I32X4 Copyright (c) 2020, Arm Limited.	2020-06-09 11:07:43 -07:00
Anton Kirilov	7ac19af498	Enable the wast::Cranelift::spec::simd::simd_load_extend test for AArch64 Copyright (c) 2020, Arm Limited.	2020-06-09 18:05:38 +01:00
Chris Fallin	8da71a145c	Merge pull request #1802 from akirilov-arm/simd_align Enable the wast::Cranelift::spec::simd::simd_align test for AArch64	2020-06-09 09:58:26 -07:00
Chris Fallin	02ae1b4464	Merge pull request #1846 from julian-seward1/better-phis Rewrite `lower_edge` to produce better phi-translations:	2020-06-09 09:56:52 -07:00
Anton Kirilov	51a551fb39	Implement vector element extensions for AArch64 This commit also includes load and extend operations. Both are prerequisites for enabling further SIMD spec tests. Copyright (c) 2020, Arm Limited.	2020-06-09 12:28:49 +01:00
Julian Seward	6d25759c8e	Rewrite `lower_edge` to produce better phi-translations: * ensure that all const assignments are placed at the end of the sequence. This minimises live ranges. * for the non-const assignments, ignore self-assignments. This can dramatically reduce the total number of moves generated, because any self-assignments trigger the overlap-case handling, hence invoking the double-copy behaviour in cases where it's not necessary. It's worth pointing out that self-assignments are common, and are not due to deficiencies in CLIR optimisation. Rather, they occur whenever a loop back edge doesn't modify all loop-carried values. This can easily happen if the loop has multiple "early" back-edges -- "continues" in C parlance. Eg: loop_header(a, b, c, d, e, f): ... a_new = ... b_new = ... if (..) goto loop_header(a_new, b_new, c, d, e, f) ... c_new = ... d_new = ... if (..) goto loop_header(a_new, b_new, c_new, d_new, e, f) etc For functions with many live values, this can dramatically reduce the number of spill moves we throw into the register allocator. In terms of compilation costs, this ranges from neutral for functions which spill not at all, or minimally (joey_small, joey_med) to a 7.1% reduction in insn count. In terms of run costs, for one spill-heavy test (bz2 w/ custom timing harness), instruction counts are reduced by 4.3%, data reads by 12.3% and data writes by 18.5%. Note those last two figures include all reads and writes made by the generated code, not just spills/reloads, so the proportional reduction in spill/reload traffic must be greater.	2020-06-09 10:36:32 +02:00
Nick Fitzgerald	fb9f39ce17	Merge pull request #1824 from fitzgen/test-stack-maps cranelift: Better document and test stack maps	2020-06-08 15:58:20 -07:00
Nick Fitzgerald	6aac4c891e	cranelift: Better document and test stack maps	2020-06-08 15:05:20 -07:00
Chris Fallin	e3d89c8a92	Merge pull request #1825 from cfallin/spidermonkey-fixes Three fixes to various SpiderMonkey-related issues	2020-06-08 13:54:13 -07:00
Chris Fallin	fc2a6f273b	Three fixes to various SpiderMonkey-related issues: - Properly mask constant values down to appropriate width when generating a constant value directly in aarch64 backend. This was a miscompilation introduced in the new-isel refactor. In combination with failure to respect NarrowValueMode, this resulted in a very subtle bug when an `i32` constant was used in bit-twiddling logic. - Add support for `iadd_ifcout` in aarch64 backend as used in explicit heap-check mode. With this change, we no longer fail heap-related tests with the huge-heap-region mode disabled. - Remove a panic that was occurring in some tests that are currently ignored on aarch64, by simply returning empty/default information in `value_label` functionality rather than touching unimplemented APIs. This is not a bugfix per-se, but removes confusing panic messages from `cargo test` output that might otherwise mislead.	2020-06-08 13:02:00 -07:00
whitequark	bc555468a7	cranelift: add i64.{ishl,ushr,ashr} libcalls. These libcalls are useful for 32-bit platforms. On x86_32 in particular, commit `4ec16fa0` added support for legalizing 64-bit shifts through SIMD operations. However, that legalization requires SIMD to be enabled and SSE 4.1 to be supported, which is not acceptable as a hard requirement.	2020-06-05 12:13:49 -07:00
Chris Fallin	00abfcd943	Merge pull request #1817 from cfallin/issue-1809 Avoid touching encodings in `EncCursor` if using a MachInst backend.	2020-06-04 12:50:39 -07:00
Yury Delendik	6f37204f82	Upgrade gimli to 0.21 (#1819 ) * Use gimli 0.21 * rm CFI w Expression * Don't write .debug_frame twice	2020-06-04 14:34:05 -05:00
Chris Fallin	63a335b7d4	Avoid touching encodings in `EncCursor` if using a MachInst backend. `EncCursor` is a variant of `Cursor` that allows updating CLIF while keeping its encodings up to date, given a particular ISA. However, new (MachInst) backends don't use the encodings, and the `TargetIsaAdapter` shim will panic if any encoding-related method is called. This PR avoids those panics. Fixes #1809.	2020-06-04 10:53:45 -07:00
Andrew Brown	1ea09088be	Add x86 legalization for imul.i64x2 for non-AVX CPUs The `convert_i64x2_imul` custom legalization checks the ISA flags for AVX512DQ or AVX512VL support and legalizes `imul.i64x2` to an `x86_pmullq` in this case; if not, it uses a lengthy SSE2-compatible instruction sequence.	2020-06-03 16:27:57 -07:00
Andrew Brown	df171f01b5	Add x86_pmuludq This instruction multiplies the lower 32 bits of two 64x2 unsigned integers into an i64x2; this is necessary for lowering Wasm's i64x2.mul.	2020-06-03 16:27:57 -07:00
Andrew Brown	40f31375a5	Add TargetIsa::as_any for downcasting to specific ISA implementations This is necessary when we would like to check specific ISA flags, e.g.	2020-06-03 16:27:57 -07:00
Andrew Brown	9ba9fd0f64	Add x86-specific instruction for i64x2 multiplication Without this special instruction, legalizing to the AVX512 instruction AND the SSE instruction sequence is impossible. This extra instruction would be rendered unnecessary by the x64 backend.	2020-06-03 16:27:57 -07:00
Chris Fallin	fe97659813	Address review comments.	2020-06-03 13:31:34 -07:00
Chris Fallin	615362068f	Multi-value return support.	2020-06-03 13:31:34 -07:00
Chris Fallin	9fec933056	Merge pull request #1801 from jgouly/cmp-rebase arm64: add support for I8X16 ICmp	2020-06-02 09:35:41 -07:00
Joey Gouly	90a421193f	arm64: add support for I8X16 ICmp Copyright (c) 2020, Arm Limited.	2020-06-02 16:58:09 +01:00
Benjamin Bouvier	67c7a3ed19	mach backend: reduce the size of the Inst enum down to 32 bytes;	2020-06-02 16:29:05 +02:00
Benjamin Bouvier	e227608510	mach backend: use vectors instead of sets to remember set of uses/defs for calls; This avoids the set uniqueness (hashing) test, reduces memory churn when re-mapping virtual register onto real registers, and is generally more memory-efficient.	2020-06-02 16:29:05 +02:00
Benjamin Bouvier	cfa0527794	mach backend: have mem_finalize return a SmallVec; This avoids a spurious reallocation of the SmallVec containing the load_constants result to a Vec, which appeared in dhat profiles.	2020-06-02 16:29:05 +02:00
Nick Fitzgerald	7c68a10ed6	Merge pull request #1670 from teapotd/win64-pass-by-ref Implement passing arguments by ref for win64 ABI	2020-06-01 11:13:30 -07:00
Andrew Brown	0dd77d36f8	Rename BinaryImm format to BinaryImm64	2020-05-29 19:56:27 -07:00
Andrew Brown	a27a079d65	Replace ExtractLane format with BinaryImm8 Like https://github.com/bytecodealliance/wasmtime/pull/1762, this change the name of the `ExtractLane` format to the more-general `BinaryImm8` and renames its immediate argument from `lane` to `imm`.	2020-05-29 19:56:27 -07:00
Andrew Brown	7d6e94b952	Replace InsertLane format with TernaryImm8 The InsertLane format has an ordering (`value().imm().value()`) and immediate name (`"lane"`) that make it awkward to use for other instructions. This changes the ordering (`value().value().imm()`) and uses the default name (`"imm"`) throughout the codebase.	2020-05-29 19:56:27 -07:00
teapotd	e430984ac4	Improve bitselect codegen with knowledge of operand origin (#1783 ) * Encode vselect using BLEND instructions on x86 * Legalize vselect to bitselect * Optimize bitselect to vselect for some operands * Add run tests for bitselect-vselect optimization * Address review feedback	2020-05-29 19:53:11 -07:00
teapotd	759cc3e751	Implement passing arguments by ref for win64 ABI	2020-05-29 20:12:41 +02:00
Nick Fitzgerald	94380bf2b7	Merge pull request #1510 from teapotd/abi-i128-fix Always check if struct-return parameter is needed	2020-05-29 10:02:16 -07:00
Nick Fitzgerald	2e7b3ba8de	cranelift: Implement serialize/deserialize for stack maps When the `enable-serde` feature is set.	2020-05-28 11:34:58 -07:00
teapotd	9e70a64728	Legalize sret call arguments	2020-05-25 20:03:24 +02:00
teapotd	0f55bb4b8d	Always check if struct-return parameter is needed	2020-05-25 20:03:24 +02:00
Anton Kirilov	8a928830ac	Enable the wast::Cranelift::spec::simd::simd_store test for AArch64 Copyright (c) 2020, Arm Limited.	2020-05-24 22:53:07 +01:00
Chris Fallin	73537e72c0	Merge pull request #1732 from jgouly/copysign-fpu arm64: Use FPU instrctions for Fcopysign	2020-05-22 17:25:33 -07:00
Peter Huene	f36539130b	Merge pull request #1734 from peterhuene/fix-saved-fprs Cranelift: Fix FPR saving and shadow space allocation for Windows x64.	2020-05-22 12:06:37 -07:00
whitequark	b2e8ed4dc9	cranelift: add i64.[us]{div,rem} libcalls. These libcalls are useful for 32-bit platforms.	2020-05-22 11:41:56 +00:00
Peter Huene	ce5f3e153b	Only update XMM save unwind operation offsets when using a FP. This commit prevents updating the XMM save unwind operation offsets when a frame pointer is not used, even though currently Cranelift always uses a frame pointer. This will prevent incorrect unwind information in the future when we start omitting frame pointers.	2020-05-21 16:46:30 -07:00
Peter Huene	2cd5ed1880	Address code review feedback.	2020-05-21 15:57:11 -07:00
Joey Gouly	02c3f238f8	arm64: Use FPU instrctions for Fcopysign Copyright (c) 2020, Arm Limited.	2020-05-21 18:14:12 +01:00
Peter Huene	78c3091e84	Fix FPR saving and shadow space allocation for Windows x64. This commit fixes both how FPR callee-saved registers are saved and how the shadow space allocation occurs when laying out the stack for Windows x64 calling convention. Importantly, this commit removes the compiler limitation of stack size for Windows x64 that was imposed because FPR saves previously couldn't always be represented in the unwind information. The FPR saves are now performed without using stack slots, much like how the callee-saved GPRs are saved. The total CSR space is given to `layout_stack` so that it is included in the frame size and to offset the layout of spills and explicit slots. The FPR saves are now done via an RSP offset (post adjustment) and they always follow the GPR saves on the stack. A simpler calculation can now be made to determine the proper offsets of the FPR saves for representing the unwind information. Additionally, the shadow space is no longer treated as an incoming argument, but an explicit stack slot that gets laid out at the lowest address possible in the local frame. This prevents `layout_stack` from putting a spill or explicit slot in this reserved space. In the future, `layout_stack` should take advantage of the caller-provided shadow space for spills, but this commit does not attempt to address that. The shadow space is now omitted from the local frame for leaf functions. Fixes #1728. Fixes #1587. Fixes #1475.	2020-05-20 15:37:30 -07:00
Chris Fallin	c9e3b71c39	Merge pull request #1729 from cfallin/machinst-branch-opt Fix MachBuffer branch optimization.	2020-05-20 14:43:57 -07:00
Chris Fallin	13e12908a6	MachBuffer branch opts: comments approximating a semi-formal correctness proof.	2020-05-20 14:12:19 -07:00
Chris Fallin	80ab154d04	Update from review comments.	2020-05-20 12:35:36 -07:00
Benjamin Bouvier	1f620e1b46	cranelift: bump regalloc.rs to 0.0.24 and adapt to latest API changes;	2020-05-20 15:37:15 +02:00

... 27 28 29 30 31 ...

1806 Commits