wasmtime

Author	SHA1	Message	Date
Chris Fallin	21dac670f0	Aarch64: handle csel with icmp/fcmp source without materializing the bool. Previously, we simply compared the input bool to 0, which forced the value into a register (usually via a cmp and cset), zero-extended it, etc. This patch performs the same pattern-matching that branches do to directly perform the cmp and use its flag results with the csel. On the `bz2` benchmark, the runtime is affected as follows (measuring with `perf stat`, using wasmtime with its cache enabled, and taking the second run after the first compiles and populates the cache): pre: 1117.232000 task-clock (msec) # 1.000 CPUs utilized 133 context-switches # 0.119 K/sec 1 cpu-migrations # 0.001 K/sec 5,041 page-faults # 0.005 M/sec 3,511,615,100 cycles # 3.143 GHz 4,272,427,772 instructions # 1.22 insn per cycle <not supported> branches 27,980,906 branch-misses 1.117299838 seconds time elapsed post: 1003.738075 task-clock (msec) # 1.000 CPUs utilized 121 context-switches # 0.121 K/sec 0 cpu-migrations # 0.000 K/sec 5,052 page-faults # 0.005 M/sec 3,224,875,393 cycles # 3.213 GHz 4,000,838,686 instructions # 1.24 insn per cycle <not supported> branches 27,928,232 branch-misses 1.003440004 seconds time elapsed In other words, with this change, on `bz2`, we see a 6.3% reduction in executed instructions.	2020-07-17 21:10:21 -07:00
Nick Fitzgerald	8dd4ab2f1e	Merge pull request #2022 from MaxGraey/peepmatic-bnot peepmatic: Add bnot operation	2020-07-17 09:39:38 -07:00
Nikolay Volf	4f4edc7aef	Remove spam from "do_remove_constant_phis"	2020-07-17 18:14:16 +02:00
Benjamin Bouvier	ead8a835c4	machinst x64: add more FP support	2020-07-17 15:56:44 +02:00
bjorn3	5c5a30f76c	Fix review comments	2020-07-17 12:03:17 +02:00
bjorn3	7b7b1f4997	Rename sarg__ to sarg_t	2020-07-17 12:03:17 +02:00
bjorn3	4971d9ee80	Merge {make_incoming,get_outgoing}_{,struct_}arg	2020-07-17 12:03:17 +02:00
bjorn3	0d4fa6d32a	Fix review comments	2020-07-17 12:03:17 +02:00
bjorn3	4431ac1108	Implement SystemV struct argument passing	2020-07-17 12:03:17 +02:00
MaxGraey	c653c563dd	Merge branch 'main' into peepmatic-bnot	2020-07-16 22:01:18 +03:00
Chris Fallin	5e0268a542	Merge pull request #2034 from cfallin/update-regalloc Update to regalloc.rs 0.0.28.	2020-07-16 11:36:11 -07:00
Alex Crichton	63d5b91930	Wasmtime 0.19.0 and Cranelift 0.66.0 (#2027) This commit updates Wasmtime's version to 0.19.0, Cranelift's version to 0.66.0, and updates the release notes as well.	2020-07-16 12:46:21 -05:00
Chris Fallin	756e8b8ea2	Update to regalloc.rs 0.0.28. This version of regalloc.rs includes several bugfixes for reference-types support used by the new backend framework and the aarch64 backend (bytecodealliance/regalloc.rs#85 and bytecodealliance/regalloc.rs#86).	2020-07-16 09:42:09 -07:00
Benjamin Bouvier	bab337fc32	Address review comments;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	5a55646fc3	machinst x64: support out-of-bounds memory accesses;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	ea33ce9116	machinst x64: basic support for baldrdash + fix multi-value support	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	00b38c91f6	machinst x64: fix generation of RegMemImm immediate operands;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	1430c5e436	machinst x64: fix index handling of jump table; The index should be truncated to 32 bits before being used for the jump table entry computation.	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	55b9059954	machinst x64: remove spurious assertion about FP offset requiring to be 16-bytes aligned	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	3905a1b17b	machinst x64: implement SymbolValue and FuncAddr with a movabsq+reloc;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	cfa0a0c4e8	machinst x64: lower resumable_trap as trap;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	311027869b	machinst x64: implement popcnt.i64	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	d9310e8d90	machinst x64: fix checked div sequence - it should mark as clobbering (def) rdx, not modifying it - the signed-div check requires a temporary to compare against int64_min	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	f932bccaf8	machinst x64: fix sign-extension at boundary	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	6f5403a94b	machinst x64: lower Ctz using the Bsf x86 instruction	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	33e0d05645	machinst x64: have cmov modify its destination operand; This is tricky: the control flow implicitly implied by the operand makes it so that the output register may be undefined, if we mark it only as a "def". Make it a "mod" instead, which matches our usage in the codebase, and will make it crash if the output operand isn't unconditionally defined before the instruction.	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	aa7db7fd7b	machinst x64: fix JmpUnknown register mapping;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	fe7dd41435	machinst x64: fix iconst emission	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	ec2209665a	machinst x64: implement bsr and lower Clz;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	eda2d143ed	machinst x64: add support for umulhi/smulhi;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	571061fe4c	machinst x64: add support for rotations;	2020-07-16 18:21:06 +02:00
Benjamin Bouvier	22892466e7	machinst x64: fix implementation of *reduce; They should just generate a plain move, since the high bits are then ignored, and not an extended move.	2020-07-16 18:21:06 +02:00
MaxGraey	4564c396d2	Merge branch 'main' into peepmatic-bnot	2020-07-16 16:13:28 +03:00
MaxGraey	657aea5286	remove rule and tests	2020-07-16 14:56:11 +03:00
Andrew Brown	f0b083c6ad	Legalize `[u|s]widen_high` for x86 Use `x86_palignr` and `[u|s]widen_low` for legalizing this instruction.	2020-07-15 11:32:08 -07:00
Andrew Brown	c8ddf8a34c	Encode `[u|s]widen_low` for x86	2020-07-15 11:32:08 -07:00
Andrew Brown	fafef7db77	Add `x86_palignr` instructions This instruction is necessary for implementing `[s|u]widen_high`.	2020-07-15 11:32:08 -07:00
Andrew Brown	0e5e8a62c8	Add `DerivedFunction` for doubling lane widths and halving the number of lanes (i.e. merging) Certain operations (e.g. widening) will have operands with types like `NxM` but will return results with types like `(N*2)x(M/2)` (double the lane width, halve the number of lanes; maintain the same number of vector bits). This is equivalent to applying two `DerivedFunction`s to the type: `DerivedFunction::DoubleWidth` then `DerivedFunction::HalfVector`. Since there is no easy way to apply multiple `DerivedFunction`s (e.g. most of the logic is one-level deep, `1d5a678124/cranelift/codegen/meta/src/gen_inst.rs (L618-L621)`), I added `DerivedFunction::MergeLanes` to do the necessary type conversion.	2020-07-15 11:32:08 -07:00
Chris Fallin	12a31c88d7	Merge pull request #2021 from akirilov-arm/VectorSize AArch64: Introduce an enum to specify vector instruction operand sizes	2020-07-15 09:43:18 -07:00
MaxGraey	67b785d241	refactor: use different sections for this rule	2020-07-15 17:11:27 +03:00
Benjamin Bouvier	abf157bd69	machinst x64: Only use the feature flag to enable the x64 new backend; Before this patch, running the x64 new backend would require both compiling with --features experimental_x64 and running with `use_new_backend`. This patch changes this behavior so that the runtime flag is not needed anymore: using the feature flag will enforce usage of the new backend everywhere, making using and testing it much simpler: cargo run --features experimental_x64 ;; other CLI options/flags This also gives a hint at what the meta language generation would look like after switching to the new backend. Compiling only with the x64 codegen flag gives a nice compile time speedup.	2020-07-15 13:11:28 +02:00
MaxGraey	5b38857e7f	add bnot to peepmatic + transform rule	2020-07-15 13:46:25 +03:00
Anton Kirilov	95b0b05af2	AArch64: Introduce an enum to specify vector instruction operand sizes Copyright (c) 2020, Arm Limited.	2020-07-14 21:37:44 +01:00
Anton Kirilov	400639245c	AArch64: Remove show_freg_sized() It provides the same functionality as show_vreg_scalar(). Copyright (c) 2020, Arm Limited.	2020-07-14 11:27:46 -07:00
Chris Fallin	4ba3ee3368	Merge pull request #2016 from jgouly/saturating-math arm64: Implement saturating SIMD arithmetic	2020-07-14 11:24:10 -07:00
Joey Gouly	aa84a4173c	arm64: Implement saturating SIMD arithmetic Copyright (c) 2020, Arm Limited.	2020-07-14 18:19:11 +01:00
Chris Fallin	26529006e0	Address review comments.	2020-07-14 10:17:29 -07:00
Chris Fallin	08353fcc14	Reftypes part two: add support for stackmaps. This commit adds support for generating stackmaps at safepoints to the new backend framework and to the AArch64 backend in particular. It has been tested to work with SpiderMonkey.	2020-07-14 10:17:27 -07:00
Chris Fallin	b93e8c296d	Initial reftype support in aarch64, modulo safepoints. This commit adds the initial support to allow reftypes to flow through the program when targeting aarch64. It also adds a fix to the `ModuleTranslationState` needed to send R32/R64 types over from the SpiderMonkey embedding. This commit does not include any support for safepoints in aarch64 or the `MachInst` infrastructure; that is in the next commit. This commit also makes a drive-by improvement to `Bint`, avoiding an unneeded zero-extension op when the extended value comes directly from a conditional-set (which produces a full-width 0 or 1).	2020-07-14 10:14:18 -07:00
Anton Kirilov	79dfac5514	Refactor the InstSize enum in the AArch64 backend The main issue with the InstSize enum was that it was used both for GPR and SIMD & FP operands, even though machine instructions do not mix them in general (as in a destination register is either a GPR or not). As a result it had methods such as sf_bit() that made sense only for one type of operand. Another issue was that the enum name was not reflecting its purpose accurately - it was meant to represent an instruction operand size, not an instruction size, which is fixed in A64 (always 4 bytes). Now the enum is split into one for GPR operands and another for scalar SIMD & FP operands. Copyright (c) 2020, Arm Limited.	2020-07-14 15:04:35 +01:00

842 Commits