wasmtime

Author	SHA1	Message	Date
whitequark	880e692fd4	x86: add encoding for bnot.b1. Fixes #1743. Co-authored-by: iximeow <git@iximeow.net>	2020-05-28 08:43:25 -07:00
Andrew Brown	b017844bef	Fix interpreter semantics of 'irsub_imm' Previously it used `arg - imm` but the functionality should be a wrapping `imm - arg` (see `cranelift/codegen/meta/src/shared/instructions.rs`).	2020-05-28 16:28:27 +02:00
teapotd	fbac2e53f9	Make vconst BxN match specification	2020-05-27 09:37:13 -07:00
Andrew Brown	628a9f0eaa	Print more detailed `test run` failures (#1764)	2020-05-27 09:04:46 -05:00
Andrew Brown	4e016afca3	Add trace-level logging to interpreter	2020-05-26 18:45:25 +02:00
Andrew Brown	ca0c24e346	Avoid recursion in `Interpreter::block`	2020-05-26 18:45:25 +02:00
Chris Fallin	6ead7527af	Merge pull request #1748 from akirilov-arm/simd_store Enable the wast::Cranelift::spec::simd::simd_store test for AArch64	2020-05-26 09:21:36 -07:00
Ömer Sinan Ağacan	c619136752	Remove Eq bound of ReservedValue trait A full Eq implementation is not needed for ReservedValue, as we only need to check whether a value is the reserved one. For entities (defined with `entity_impl!`) this doesn't make much difference, but for more complicated types this avoids generating redundant `Eq`s.	2020-05-26 10:27:55 +02:00
bjorn3	eeb1e141ba	Add some assertions to cranelift_frontend	2020-05-26 10:17:08 +02:00
Andrew Brown	6e7276e48d	Replace single use of `Frame::with_parameters` with `Frame::set_all`	2020-05-26 09:56:58 +02:00
Andrew Brown	d73cb48c29	Add logging to frame operations	2020-05-26 09:56:58 +02:00
Andrew Brown	c92917de15	Fix typo in sadd_sat instruction definition	2020-05-26 09:55:26 +02:00
Anton Kirilov	8a928830ac	Enable the wast::Cranelift::spec::simd::simd_store test for AArch64 Copyright (c) 2020, Arm Limited.	2020-05-24 22:53:07 +01:00
Chris Fallin	51f9ac2150	Merge pull request #1741 from cfallin/filetest-vcode-compile Merge `vcode` filetest mode into `compile`.	2020-05-22 18:57:21 -07:00
Chris Fallin	48573b52b2	Merge `vcode` filetest mode into `compile`. I hadn't realized before that the filetest backend for `test vcode` is doing essentially what `compile` is doing, but for new (`MachInst`) backends: it is just getting a disassembly and running it through filecheck. There's no reason not to reuse `test compile` for the AArch64 tests as well. This was motivated by the desire to have "this IR compiles successfully" tests work on both x86 and AArch64. It seems this should work fine by adding multiple `target` directives when a test case should be compile-tested on multiple architectures.	2020-05-22 17:28:48 -07:00
Chris Fallin	73537e72c0	Merge pull request #1732 from jgouly/copysign-fpu arm64: Use FPU instructions for Fcopysign	2020-05-22 17:25:33 -07:00
Peter Huene	f36539130b	Merge pull request #1734 from peterhuene/fix-saved-fprs Cranelift: Fix FPR saving and shadow space allocation for Windows x64.	2020-05-22 12:06:37 -07:00
whitequark	b2e8ed4dc9	cranelift: add i64.[us]{div,rem} libcalls. These libcalls are useful for 32-bit platforms.	2020-05-22 11:41:56 +00:00
Peter Huene	ce5f3e153b	Only update XMM save unwind operation offsets when using a FP. This commit prevents updating the XMM save unwind operation offsets when a frame pointer is not used, even though currently Cranelift always uses a frame pointer. This will prevent incorrect unwind information in the future when we start omitting frame pointers.	2020-05-21 16:46:30 -07:00
Peter Huene	2cd5ed1880	Address code review feedback.	2020-05-21 15:57:11 -07:00
Joey Gouly	02c3f238f8	arm64: Use FPU instructions for Fcopysign Copyright (c) 2020, Arm Limited.	2020-05-21 18:14:12 +01:00
Peter Huene	78c3091e84	Fix FPR saving and shadow space allocation for Windows x64. This commit fixes both how FPR callee-saved registers are saved and how the shadow space allocation occurs when laying out the stack for Windows x64 calling convention. Importantly, this commit removes the compiler limitation of stack size for Windows x64 that was imposed because FPR saves previously couldn't always be represented in the unwind information. The FPR saves are now performed without using stack slots, much like how the callee-saved GPRs are saved. The total CSR space is given to `layout_stack` so that it is included in the frame size and to offset the layout of spills and explicit slots. The FPR saves are now done via an RSP offset (post adjustment) and they always follow the GPR saves on the stack. A simpler calculation can now be made to determine the proper offsets of the FPR saves for representing the unwind information. Additionally, the shadow space is no longer treated as an incoming argument, but an explicit stack slot that gets laid out at the lowest address possible in the local frame. This prevents `layout_stack` from putting a spill or explicit slot in this reserved space. In the future, `layout_stack` should take advantage of the caller-provided shadow space for spills, but this commit does not attempt to address that. The shadow space is now omitted from the local frame for leaf functions. Fixes #1728. Fixes #1587. Fixes #1475.	2020-05-20 15:37:30 -07:00
Chris Fallin	c9e3b71c39	Merge pull request #1729 from cfallin/machinst-branch-opt Fix MachBuffer branch optimization.	2020-05-20 14:43:57 -07:00
Chris Fallin	13e12908a6	MachBuffer branch opts: comments approximating a semi-formal correctness proof.	2020-05-20 14:12:19 -07:00
Chris Fallin	80ab154d04	Update from review comments.	2020-05-20 12:35:36 -07:00
Benjamin Bouvier	1f620e1b46	cranelift: bump regalloc.rs to 0.0.24 and adapt to latest API changes;	2020-05-20 15:37:15 +02:00
Chris Fallin	e11094b28b	Fix MachBuffer branch optimization. This patch fixes a subtle bug that occurred in the MachBuffer branch optimization: in tracking labels at the current buffer tail using a sorted-by-offset array, the code did not update this array properly when redirecting labels. As a result, the dead-branch removal was unsafe, because not every label pointing to a branch is guaranteed to be redirected properly first. Discovered while doing performance testing: bz2 silently took a wrong branch and exited compression early. (Eek!) To address this problem, this patch adopts a slightly simpler data structure: we only track the labels at the current buffer tail, and at the start of each branch, and we're careful to update these appropriately to maintain the invariants. I'm pretty confident that this is correct now, but we should (still) fuzz it a bunch, because wrong control flow scares me a nonzero amount. I should probably also actually write out a formal proof that these data-structure updates are correct. The optimizations are important for performance (removing useless empty blocks, and taking advantage of any fallthrough opportunities at all), so I don't think we would want to drop them entirely.	2020-05-19 18:09:18 -07:00
Nick Fitzgerald	9d2100e54a	Limit the size of automaton keys in the `peepmatic_simple_automata` fuzz target Fixes https://oss-fuzz.com/testcase-detail/5742905129172992	2020-05-19 09:12:50 -07:00
Chris Fallin	d8d6fbe58c	Merge pull request #1718 from cfallin/machinst-codebuffer Rework of MachInst isel, branch fixups and lowering, and block ordering.	2020-05-19 07:17:22 -07:00
Nick Fitzgerald	28d6df0db6	Limit the size of automaton keys in the `peepmatic_fst_diff` fuzz target (#1724) This should avoid timeouts caused by large keys. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=22251	2020-05-18 21:27:00 -05:00
Chris Fallin	bdd2873c8c	Address review comments.	2020-05-18 16:25:26 -07:00
Chris Fallin	687aca00fe	Update x64 backend to use new lowering APIs.	2020-05-18 16:25:15 -07:00
Chris Fallin	72e6be9342	Rework of MachInst isel, branch fixups and lowering, and block ordering. This patch includes: - A complete rework of the way that CLIF blocks and edge blocks are lowered into VCode blocks. The new mechanism in `BlockLoweringOrder` computes RPO over the CFG, but with a twist: it merges edge blocks into heads or tails of original CLIF blocks wherever possible, and it does this without ever actually materializing the full nodes-plus-edges graph first. The backend driver lowers blocks in final order so there's no need to reshuffle later. - A new `MachBuffer` that replaces the `MachSection`. This is a special version of a code-sink that is far more than a humble `Vec<u8>`. In particular, it keeps a record of label definitions and label uses, with a machine-pluggable `LabelUse` trait that defines various types of fixups (basically internal relocations). Importantly, it implements some simple peephole-style branch rewrites inline in the emission pass, without any separate traversals over the code to use fallthroughs, swap taken/not-taken arms, etc. It tracks branches at the tail of the buffer and can (i) remove blocks that are just unconditional branches (by redirecting the label), (ii) understand a conditional/unconditional pair and swap the conditional polarity when it's helpful; and (iii) remove branches that branch to the fallthrough PC. The `MachBuffer` also implements branch-island support. On architectures like AArch64, this is needed to allow conditional branches within plausibly-attainable ranges (+/- 1MB on AArch64 specifically). It also does this inline while streaming through the emission, without any sort of fixpoint algorithm or later moving of code, by simply tracking outstanding references and "deadlines" and emitting an island just-in-time when we're in danger of going out of range. - A rework of the instruction selector driver. This is largely following the same algorithm as before, but is cleaned up significantly, in particular in the API: the machine backend can ask for an input arg and get any of three forms (constant, register, producing instruction), indicating it needs the register or can merge the constant or producing instruction as appropriate. This new driver takes special care to emit constants right at use-sites (and at phi inputs), minimizing their live-ranges, and also special-cases the "pinned register" to avoid superfluous moves. Overall, on `bz2.wasm`, the results are: wasmtime full run (compile + runtime) of bz2: baseline: 9774M insns, 9742M cycles, 3.918s; w/ changes: 7012M insns, 6888M cycles, 2.958s (24.5% faster, 28.3% fewer insns). clif-util wasm compile bz2: baseline: 2633M insns, 3278M cycles, 1.034s; w/ changes: 2366M insns, 2920M cycles, 0.923s (10.7% faster, 10.1% fewer insns). All numbers are averages of two runs on an Ampere eMAG.	2020-05-16 23:08:22 -07:00
Y-Nak	0393d101b1	Fix typo in peepmatic (#1712)	2020-05-15 09:47:16 -05:00
Nick Fitzgerald	01f46d0238	Merge pull request #1692 from fitzgen/update-to-wasmparser-0.55.0 Update to using `wasmparser` 0.55.0	2020-05-14 14:00:24 -07:00
Chris Fallin	df4028749e	Merge pull request #1699 from jgouly/inst-size Reduce arm64 Inst enum size	2020-05-14 13:44:46 -07:00
Nick Fitzgerald	1a4f3fb2df	Update deps and tests for `anyref` --> `externref` * Update to using `wasmparser` 0.55.0 * Update wasmprinter to 0.2.5 * Update `wat` to 1.0.18, and `wast` to 17.0.0	2020-05-14 12:47:37 -07:00
Nick Fitzgerald	3c0b64fef7	Merge pull request #1710 from fitzgen/remove-unused-lhs-member peepmatic: remove unused member from `PeepholeOptimizer`	2020-05-14 12:03:31 -07:00
Nick Fitzgerald	e9ef8ea3d5	peepmatic: remove unused member from `PeepholeOptimizer` This is dead code, left over from an earlier design.	2020-05-14 11:08:59 -07:00
Nick Fitzgerald	fb7a690efc	Merge pull request #1687 from fitzgen/sign-extend-immediates cranelift: Sign extend `Imm64` immediates	2020-05-14 10:09:53 -07:00
Nick Fitzgerald	c093dee79e	cranelift: Let lifetime elision elide lifetimes	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	923a73be7b	deps: Bump `z3` to 0.5.1 This fixes Windows builds.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	8d7ed0fd13	deps: Update `wast` to 15.0.0 This also updates `wat` in the lockfile so that the SIMD spec tests are passing again.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	22a070ed4f	peepmatic: Apply some review suggestions from @bjorn3	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	fd4f08e75f	peepmatic: rustfmt	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	52c6ece5f3	peepmatic: Make peepmatic optional to enable Rather than outright replacing parts of our existing peephole optimizations passes, this makes peepmatic an optional cargo feature that can be enabled. This allows us to take a conservative approach with enabling peepmatic everywhere, while also allowing us to get it in-tree and make it easier to collaborate on improving it quickly.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	6e135b3aea	peepmatic: Fix a failed assertion due to extra iterations after fixed point After replacing an instruction with an alias to an earlier value, trying to further optimize that value is unnecessary, since we've already processed it, and also was triggering an assertion.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	eb2dab0aa4	peepmatic: Save RHS actions as a boxed slice, not vec A boxed slice is only two words, while a vec is three words. This should cut down on the memory size of our automata and improve cache usage.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	210b036320	peepmatic: Represent various id types with `u16` These ids end up in the automaton, so making them smaller should give us better data cache locality and also smaller serialized sizes.	2020-05-14 07:52:23 -07:00
Nick Fitzgerald	469104c4d3	peepmatic: Make the results of match operations smaller and more cache friendly	2020-05-14 07:52:23 -07:00

2116 Commits