Commit Graph

30 Commits

Author SHA1 Message Date
Andrew Brown
737cf1d605 Implement iabs for x86 SIMD
This only covers the types necessary for implementing the Wasm SIMD spec--`i8x16`, `i16x8`, `i32x4`.
2020-06-30 14:00:17 -07:00
Andrew Brown
3740772176 Add encoding for x86 CVTTPS2DQ
This reuses the `x86_cvtt2si` instruction since the packed and scalar versions seem to group together well.
2020-06-18 11:39:38 -07:00
Andrew Brown
772ce73f7f Add x86_pblendw instruction
This instruction is necessary for lowering `fcvt_from_uint`.
2020-06-12 15:06:22 -07:00
Andrew Brown
546fc9ddf1 Add x86_vcvtudq2ps instruction
This instruction converts i32x4 to f32x4 in several AVX512 feature sets.
2020-06-12 15:06:22 -07:00
Andrew Brown
5db384cd76 Rename opcode: PMULLQ to VPMULLQ 2020-06-03 16:27:57 -07:00
Andrew Brown
df171f01b5 Add x86_pmuludq
This instruction multiplies the lower 32 bits of two 64x2 unsigned integers into an i64x2; this is necessary for lowering Wasm's i64x2.mul.
2020-06-03 16:27:57 -07:00
teapotd
e430984ac4 Improve bitselect codegen with knowledge of operand origin (#1783)
* Encode vselect using BLEND instructions on x86

* Legalize vselect to bitselect

* Optimize bitselect to vselect for some operands

* Add run tests for bitselect-vselect optimization

* Address review feedback
2020-05-29 19:53:11 -07:00
Andrew Brown
fb6e8f784d Add x86 pack instructions 2020-04-23 10:55:54 -07:00
Andrew Brown
f5fc09f64a Add x86 unpack instructions 2020-04-23 10:55:54 -07:00
Andrew Brown
54398156ea Add x86 implementation of SIMD load_extend instructions 2020-03-31 11:35:26 -07:00
Andrew Brown
444d021ede Add x86 implementation of fcvt_from_sint 2020-03-17 10:52:03 -07:00
Andrew Brown
7f7196a655 Add i64x2 integer multiplication using AVX512DQ 2020-03-06 10:53:22 -08:00
Andrew Brown
032e81fd6f Add x86 SIMD average rounding 2020-02-24 09:48:38 -08:00
Andrew Brown
1f17e35e95 Add x86 SIMD immediate shifts 2019-11-15 13:45:25 -08:00
Andrew Brown
c8eb4e9612 Add x86 SIMD floating-point arithmetic 2019-11-12 17:05:39 -08:00
Andrew Brown
d32301854d Add x86 SIMD implementation of float comparison 2019-11-08 14:06:53 -08:00
Andrew Brown
0ab5760fd7 Add x86 SIMD instructions for min and max
Only the I8, I16, and I32 versions are included since Cranelift lacks support for AVX.
2019-11-05 16:42:34 -08:00
Andrew Brown
c454c3c771 Add x86 SIMD encoding for icmp sgt 2019-11-05 16:42:34 -08:00
Andrew Brown
186effc420 Add x86 SIMD vany_true and x86_ptest
In order to implement SIMD's any_true (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#any-lane-true), we must legalize some instruction (I chose `vany_true`) to a sequence of `PTEST` and `SETNZ`. To emit `PTEST` I added the new CLIF instruction `x86_ptest` and used CLIF's `trueif ne` for `SETNZ`.
2019-10-22 11:01:05 -07:00
Andrew Brown
8f74333662 Add x86 SIMD band_not 2019-10-17 15:49:29 -07:00
Andrew Brown
f1904bffea Add x86 SIMD sshr and ushr
Only the shifts with applicable SSE2 instructions are implemented here: PSRL* (for ushr) only has 16-64 bit instructions and PSRA* (for sshr) only has 16-32 bit instructions.
2019-10-15 15:51:50 -07:00
Andrew Brown
6460fe705f Add x86 SIMD ishl
Only the shifts with applicable SSE2 instructions (i.e. 16-64 bit width) are implemented here.
2019-10-15 15:51:50 -07:00
Andrew Brown
4cdc1e76a4 Add x86 SIMD band 2019-10-11 11:05:24 -07:00
Andrew Brown
96d51cb1e8 Switch x86 SIMD bor from ORPS to POR encoding
There are two reasons for this change:
 1. it reduces confusion; using the `POR` encoding will match the future encodings of `band` and `bxor` and the `ORPS` encoding may be confusing as it is intended for floating-point operations
 2. `POR` has slightly more throughput: it only has to wait 0.33 cycles to execute again on all Intel architectures above Core whereas `ORPS` must wait 1 cycle on architectures older than Skylake (Intel Optimization Reference Manual, C.3)

`POR` does add one additional byte to the encoding and requires SSE2 so the `ORPS` opcode is left in for future use.
2019-10-11 11:05:24 -07:00
Andrew Brown
90c49a2f7c Add saturating subtraction with a SIMD encoding
This includes the new instructions `ssub_sat` and `usub_sat` and only encodes the i8x16 and i16x8 types; these are what is needed for implementing the SIMD spec (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#saturating-integer-subtraction).
2019-09-30 13:54:30 -07:00
Andrew Brown
21144068d4 Add saturating addition with a SIMD encoding
This includes the new instructions `sadd_sat` and `uadd_sat` and only encodes the i8x16 and i16x8 types; these are what is needed for implementing the SIMD spec (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#saturating-integer-addition).
2019-09-30 13:54:30 -07:00
Andrew Brown
630cb3ee62 Add x86 encoding for SIMD imul
Only i16x8 and i32x4 are encoded in this commit mainly because i8x16 and i64x2 do not have simple encodings in x86. i64x2 is not required by the SIMD spec and there is discussion (https://github.com/WebAssembly/simd/pull/98#issuecomment-530092217) about removing i8x16.
2019-09-30 13:54:30 -07:00
Andrew Brown
168ad7fda3 Fix 16-bit x86_pextr encoding
The x86 ISA has (at least) two encodings for PEXTRW:
 1. in the SSE2 opcode (66 0f c5) the XMM operand uses r/m and the GPR operand uses reg
 2. in the SSE4.1 opcode (66 0f 3a 15) the XMM operand uses reg and the GPR operand uses r/m

This changes the 16-bit x86_pextr encoding from 1 to 2 to match the other PEXTR* implementations (all #2 style).
2019-09-30 13:54:30 -07:00
Andrew Brown
ca1df499a0 Add x86 encoding for isub 2019-09-30 13:54:30 -07:00
Sean Stangl
3d5346a90b Name opcodes statically in isa/x86. Closes #1051 (#1079) 2019-09-25 19:59:49 -06:00