This was added as an incremental step to improve AArch64 code quality in
PR #2278. At the time, we did not have a way to pattern-match the load +
splat opcode sequence that the relevant Wasm opcodes lowered to.
However, now with PR #2366, we can merge effectful instructions such as
loads into other ops, and so we can do this pattern matching directly.
The pattern-matching update will come in a subsequent commit.
The new IR instruction corresponds to WebAssembly's `load*_splat` operations, which
were previously represented as a combination of `Load` and `Splat`
instructions. However, some architectures, such as Armv8-A,
have a single machine instruction equivalent to the Wasm
operations. Generating it requires merging the `Load` and the
`Splat` in the backend, which is not possible because the load may
have side effects. The new instruction works around this limitation.
The AArch64 backend leverages the new instruction to improve code
generation.
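As a rough CLIF sketch (function name and lane type are illustrative; the message does not name the new instruction), this is the load-plus-splat pattern that the single instruction replaces:

```
function %load_splat32(i64) -> i32x4 {
block0(v0: i64):
    ;; the load has a side effect (it may trap), so the backend cannot
    ;; simply merge it with the splat that follows
    v1 = load.i32 v0
    v2 = splat.i32x4 v1
    return v2
}
```

On Armv8-A the merged form can be emitted as a single LD1R (load one element and replicate to all lanes).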
Copyright (c) 2020, Arm Limited.
This converts an `i32x4` of unsigned integers into an `f32x4`, rounding when a lane value is not exactly representable, either by using a single AVX512VL/F instruction (VCVTUDQ2PS) or a long sequence of SSE4.1-compatible instructions.
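A minimal CLIF sketch, assuming the instruction being legalized here is `fcvt_from_uint` (the message does not name it):

```
function %u32x4_to_f32x4(i32x4) -> f32x4 {
block0(v0: i32x4):
    ;; treat each 32-bit lane as unsigned and convert it to f32; with
    ;; AVX512VL/F this can be a single VCVTUDQ2PS, otherwise a longer
    ;; SSE4.1-compatible sequence is emitted
    v1 = fcvt_from_uint.f32x4 v0
    return v1
}
```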
The `convert_i64x2_imul` custom legalization checks the ISA flags for AVX512DQ or AVX512VL support and, if present, legalizes `imul.i64x2` to an `x86_pmullq`; otherwise it falls back to a lengthy SSE2-compatible instruction sequence. Without this special instruction, it would not be possible to legalize to both the AVX512 instruction and the SSE instruction sequence; the extra instruction would be rendered unnecessary by the x64 backend.
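A minimal CLIF sketch of the operation involved (function shape illustrative):

```
function %mul_i64x2(i64x2, i64x2) -> i64x2 {
block0(v0: i64x2, v1: i64x2):
    ;; with AVX512DQ or AVX512VL this becomes x86_pmullq (VPMULLQ);
    ;; otherwise the lengthy SSE2-compatible sequence is used
    v2 = imul v0, v1
    return v2
}
```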
The InsertLane format has an ordering (`value().imm().value()`) and immediate name (`"lane"`) that make it awkward to use for other instructions. This changes the ordering (`value().value().imm()`) and uses the default name (`"imm"`) throughout the codebase.
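A small CLIF sketch of the new operand order, assuming the textual form mirrors the format ordering:

```
function %set_lane1(f32x4, f32) -> f32x4 {
block0(v0: f32x4, v1: f32):
    ;; old order: vector value, lane immediate, scalar value
    ;; new order: vector value, scalar value, lane immediate
    v2 = insertlane v0, v1, 1
    return v2
}
```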
* Encode vselect using BLEND instructions on x86
* Legalize vselect to bitselect
* Optimize bitselect to vselect for some operands
* Add run tests for bitselect-vselect optimization
* Address review feedback
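As a hedged sketch of why the optimization is valid (lane types illustrative): `bitselect c, x, y` computes `(x & c) | (y & ~c)`, so when each mask lane is known to be all ones or all zeros it behaves like a lane-wise select, which is what `vselect` expresses and what the x86 BLEND encodings implement.

```
function %bitselect_expansion(i32x4, i32x4, i32x4) -> i32x4 {
block0(v0: i32x4, v1: i32x4, v2: i32x4):
    ;; the generic expansion of bitselect v0, v1, v2
    v3 = band v1, v0        ; bits of v1 selected by the mask
    v4 = band_not v2, v0    ; bits of v2 where the mask is clear
    v5 = bor v3, v4
    return v5
}
```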
This involves some large mask tables that may hurt code size but reduce the number of instructions. See https://github.com/WebAssembly/simd/issues/117 for a more in-depth discussion on this.
The Intel manual uses `CMPNLT` and `CMPNLE` to denote not-less-than and not-less-than-or-equal. These were previously translated to `FloatCC::GreaterThan` and `FloatCC::GreaterThanOrEqual` but should be correctly translated to `FloatCC::UnorderedOrGreaterThanOrEqual` and `FloatCC::UnorderedOrGreaterThan`, since the "not" forms also return true when an operand is NaN. This change adds the necessary legalizations to make use of these new encodings.
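A hedged CLIF illustration of one of the unordered conditions involved (operand order and lane type chosen for the example):

```
function %cmp_uge(f32x4, f32x4) -> b32x4 {
block0(v0: f32x4, v1: f32x4):
    ;; uge (unordered-or-greater-than-or-equal) is !(v0 < v1), which is
    ;; what CMPNLTPS computes: it is also true when a lane is NaN
    v2 = fcmp uge v0, v1
    return v2
}
```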
```
error[E0425]: cannot find value `ones` in this scope
   --> cranelift-codegen/meta/src/isa/x86/legalize.rs:564:33
    |
564 |                 def!(c = vconst(ones)),
    |                                 ^^^^ not found in this scope
```
Previously `fsub` was used to negate, but this fails for -0.0 and +0.0 in the SIMD spec tests; at the cost of more instructions, this change uses shifts to build a constant for flipping the most significant bit of each lane with `bxor`.
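A CLIF-level sketch of the same idea (the actual legalization may differ in detail); the sign-bit mask is built without loading a floating-point constant:

```
function %flip_sign_bits(f32x4) -> f32x4 {
block0(v0: f32x4):
    ;; build 0x8000_0000 in every lane by shifting 1 left by 31
    v1 = vconst.i32x4 [1 1 1 1]
    v2 = iconst.i32 31
    v3 = ishl v1, v2
    ;; flip the most significant bit of each lane with bxor
    v4 = raw_bitcast.i32x4 v0
    v5 = bxor v4, v3
    v6 = raw_bitcast.f32x4 v5
    return v6
}
```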
In order to implement SIMD's all_true (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#all-lanes-true), we must legalize some instruction (I chose `vall_true`) to a comparison against 0 and a reduction similar to vany_true using `PTEST` and `SETNZ`. Since `icmp` only allows integers but `vall_true` can accept more vector types, `raw_bitcast` is used to convert the lane types into integers, e.g. b32x4 to i32x4. To do so without runtime type-checking, the `raw_bitcast` instruction (which emits no machine instruction) can now bitcast from any vector type to the same type, e.g. i32x4 to i32x4.
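A minimal CLIF sketch of the `raw_bitcast` step described above (lane type illustrative):

```
function %all_true_b32x4(b32x4) -> b1 {
block0(v0: b32x4):
    ;; bitcast the boolean lanes to integer lanes so the icmp-based
    ;; legalization of vall_true applies
    v1 = raw_bitcast.i32x4 v0
    v2 = vall_true v1
    return v2
}
```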
Only the shifts with applicable SSE2 instructions are implemented here: PSRL* (for ushr) only has 16-64 bit instructions and PSRA* (for sshr) only has 16-32 bit instructions.
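A small CLIF example of one of the supported widths (32-bit lanes shown; shift amounts are scalars):

```
function %shifts_i32x4(i32x4, i32) -> i32x4 {
block0(v0: i32x4, v1: i32):
    ;; ushr on 32-bit lanes maps to PSRLD, sshr on 32-bit lanes to PSRAD
    v2 = ushr v0, v1
    v3 = sshr v2, v1
    return v3
}
```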
This avoids unpacking the InstructionData multiple times for a single
legalization, improving readability and reducing the size of the generated
code. For instance, icmp previously had to unpack the format once per IntCC
condition code.
This change should make the code clearer (and shorter) when adding encodings for instructions with specific immediates; e.g., a constant with a 0 immediate could be encoded as an XOR with something like `const.bind(...)`, without explicitly creating the necessary predicates. It has several parts:
* Introduce Bindable trait to instructions
* Convert all instruction bindings to use Bindable::bind()
* Add ability to bind immediates to BoundInstruction
This is an attempt to reduce some of the issues in #955.
This commit is based on the assumption that floats are already stored in XMM registers on x86. When extracting a lane, Cranelift was moving the float to a general-purpose register and back to an XMM register; this change avoids that by shuffling the float value into the lowest bits of the XMM register. It also assumes that the upper bits can be left as-is (instead of zeroing them out).
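A minimal CLIF example of the operation in question (lane index illustrative):

```
function %extract_lane3(f32x4) -> f32 {
block0(v0: f32x4):
    ;; the f32 result can stay in an XMM register: a shuffle moves lane 3
    ;; into the low bits instead of bouncing through a general-purpose register
    v1 = extractlane v0, 3
    return v1
}
```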
`raw_bitcast` matches the intent of this legalization more clearly (simply changing the CLIF type without changing any bits), and the additional null encodings are necessary for later instructions.