x64: Add most remaining AVX lowerings (#5819)

* x64: Add most remaining AVX lowerings

This commit goes through `inst.isle` and adds a corresponding AVX
lowering for most SSE lowerings. I opted to skip instructions where the
SSE lowering didn't read/modify a register, such as `roundps`. I think
that AVX will benefit these instructions when there's load-merging since
AVX doesn't require alignment, but I've deferred that work to a future
PR.

Otherwise, I think this PR pairs all (or almost all) of the 3-operand
AVX instruction forms with their SSE counterparts. This should ideally
improve codegen slightly by reducing register pressure and removing the
need for `movdqa` between registers. I've attempted to ensure that
there's at least one codegen test for each new instruction.
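
To illustrate why the 3-operand forms can drop a `movdqa`, here is a
minimal toy sketch. It is not Cranelift's lowering code: the function
name `lower_iadd32x4`, the register names, and the string "assembly" are
all made up for illustration.

```rust
/// Toy model: lower `dst = a + b` (packed 32-bit add) where `a` must stay
/// live afterwards. Hypothetical helper, not part of Cranelift.
fn lower_iadd32x4(use_avx: bool, dst: &str, a: &str, b: &str) -> Vec<String> {
    if use_avx {
        // VEX-encoded form: separate destination, both sources preserved.
        vec![format!("vpaddd {dst}, {a}, {b}")]
    } else {
        // SSE form is destructive (`dst += b`), so `a` is copied first.
        vec![
            format!("movdqa {dst}, {a}"),
            format!("paddd {dst}, {b}"),
        ]
    }
}
```

With `use_avx` set, the sketch emits one instruction instead of two,
which is the saving described above.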

As a side note, the recent capstone integration into `precise-output`
tests helped me catch a number of encoding bugs much earlier than
otherwise, so I've found that incredibly useful in tests!

* Move `vpinsr*` instructions to their own variant

Use true `XmmMem` and `GprMem` types in the instruction as well to get
more type-level safety for what goes where.
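
As a rough sketch of what "type-level safety" means here, the following
uses simplified stand-ins for `XmmMem` and `GprMem`. The definitions
below (`RegClass`, `Reg`, `RegMem`, `checked`, `ToyInsert`) are
assumptions written for this example and do not mirror the real
Cranelift types.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum RegClass { Int, Float }

#[derive(Clone, Copy, Debug, PartialEq)]
struct Reg { class: RegClass, index: u8 }

/// Untyped operand: any register, or a memory address.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RegMem {
    Reg(Reg),
    Mem { base: Reg, offset: i32 },
}

/// Simplified stand-ins: wrappers that can only be built from an operand
/// whose register (if any) has the expected class.
#[derive(Clone, Copy, Debug, PartialEq)]
struct XmmMem(RegMem);
#[derive(Clone, Copy, Debug, PartialEq)]
struct GprMem(RegMem);

/// Reject register operands of the wrong class; memory operands pass.
fn checked(rm: RegMem, want: RegClass) -> Option<RegMem> {
    match rm {
        RegMem::Reg(r) if r.class != want => None,
        _ => Some(rm),
    }
}

impl XmmMem {
    fn new(rm: RegMem) -> Option<Self> { checked(rm, RegClass::Float).map(XmmMem) }
}

impl GprMem {
    fn new(rm: RegMem) -> Option<Self> { checked(rm, RegClass::Int).map(GprMem) }
}

/// Hypothetical instruction payload: the field types alone rule out
/// swapping the vector and scalar operands.
struct ToyInsert { vector: XmmMem, scalar: GprMem }
```

The check happens once at construction, so code that later reads
`ToyInsert` fields does not need to re-validate register classes.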

* Remove `Inst::produces_const` accessor

Instead of conditionally defining regalloc and various other operations,
add dedicated `MInst` variants for operations which are intended to
produce a constant, so that their interactions with regalloc, printing,
and such are clearer.
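
The idea can be sketched with a toy instruction enum. The names below
(`ToyInst`, `Operand`, the `Zero` variant) are hypothetical and are not
the variants this commit adds; the point is only that a dedicated
variant carries no source operands, so nothing has to branch on a
"produces a constant" flag.

```rust
/// How an instruction touches a virtual register (toy regalloc view).
#[derive(Debug, PartialEq)]
enum Operand {
    Use(u8),
    Def(u8),
}

/// Hypothetical instruction set; names are illustrative only.
enum ToyInst {
    /// dst = src1 ^ src2
    Xor { dst: u8, src1: u8, src2: u8 },
    /// dst = 0, with no sources at all.
    Zero { dst: u8 },
}

impl ToyInst {
    /// Operands reported to the register allocator.
    fn operands(&self) -> Vec<Operand> {
        match self {
            ToyInst::Xor { dst, src1, src2 } => {
                vec![Operand::Use(*src1), Operand::Use(*src2), Operand::Def(*dst)]
            }
            // Only a definition: no phantom uses of uninitialized sources.
            ToyInst::Zero { dst } => vec![Operand::Def(*dst)],
        }
    }

    /// Pretty-printing needs no special case either.
    fn print(&self) -> String {
        match self {
            ToyInst::Xor { dst, src1, src2 } => format!("xor v{dst}, v{src1}, v{src2}"),
            ToyInst::Zero { dst } => format!("zero v{dst}"),
        }
    }
}
```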

* Fix tests

* Register traps in `MachBuffer` for load-folding ops

This adds a missing `add_trap` to the encoding of VEX instructions with
memory operands to ensure that, if they cause a segfault, there's
appropriate metadata for Wasmtime to understand that the instruction
could in fact trap. This fixes a fuzz test case found locally where v8
trapped and Wasmtime didn't catch the signal and crashed the fuzzer.
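
A minimal sketch of the shape of this fix, using a toy buffer rather
than the real `MachBuffer`: `ToyBuffer`, `Src`, `emit_vex`, and the
simplified `TrapCode` here are assumptions made for illustration, and
the encoding bytes passed in are treated as opaque placeholders.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum TrapCode { HeapOutOfBounds }

/// Toy code buffer: emitted bytes plus (offset, trap code) records.
#[derive(Default)]
struct ToyBuffer {
    bytes: Vec<u8>,
    traps: Vec<(usize, TrapCode)>,
}

impl ToyBuffer {
    /// Record that the instruction starting at the current offset may trap.
    fn add_trap(&mut self, code: TrapCode) {
        self.traps.push((self.bytes.len(), code));
    }

    fn put(&mut self, encoding: &[u8]) {
        self.bytes.extend_from_slice(encoding);
    }
}

/// Hypothetical source operand: a register, or a memory access that may fault.
enum Src {
    Reg,
    Mem { can_trap: bool },
}

/// Emit one instruction, registering trap metadata for faulting loads.
fn emit_vex(buf: &mut ToyBuffer, encoding: &[u8], src: &Src) {
    if let Src::Mem { can_trap: true } = src {
        // Without this record, a fault at this offset has no metadata.
        buf.add_trap(TrapCode::HeapOutOfBounds);
    }
    buf.put(encoding);
}
```

In this sketch, a runtime could look up a faulting offset in `traps`
to decide whether the signal corresponds to an expected trap.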
Alex Crichton
2023-02-20 09:11:52 -06:00
committed by GitHub
parent ad128b6811
commit c26a65a854
16 changed files with 4145 additions and 466 deletions


@@ -617,13 +617,6 @@ impl RegMemImm {
}
}
pub(crate) fn to_reg(&self) -> Option<Reg> {
match self {
Self::Reg { reg } => Some(*reg),
_ => None,
}
}
pub(crate) fn with_allocs(&self, allocs: &mut AllocationConsumer<'_>) -> Self {
match self {
Self::Reg { reg } => Self::Reg {
@@ -726,12 +719,6 @@ impl RegMem {
RegMem::Mem { addr, .. } => addr.get_operands(collector),
}
}
pub(crate) fn to_reg(&self) -> Option<Reg> {
match self {
RegMem::Reg { reg } => Some(*reg),
_ => None,
}
}
pub(crate) fn with_allocs(&self, allocs: &mut AllocationConsumer<'_>) -> Self {
match self {
@@ -1510,10 +1497,108 @@ impl AvxOpcode {
| AvxOpcode::Vfmadd213ps
| AvxOpcode::Vfmadd213pd => smallvec![InstructionSet::FMA],
AvxOpcode::Vminps
| AvxOpcode::Vorps
| AvxOpcode::Vminpd
| AvxOpcode::Vmaxps
| AvxOpcode::Vmaxpd
| AvxOpcode::Vandnps
| AvxOpcode::Vandnpd
| AvxOpcode::Vpandn
| AvxOpcode::Vcmpps
| AvxOpcode::Vpsrld => {
| AvxOpcode::Vcmppd
| AvxOpcode::Vpsrlw
| AvxOpcode::Vpsrld
| AvxOpcode::Vpsrlq
| AvxOpcode::Vpaddb
| AvxOpcode::Vpaddw
| AvxOpcode::Vpaddd
| AvxOpcode::Vpaddq
| AvxOpcode::Vpaddsb
| AvxOpcode::Vpaddsw
| AvxOpcode::Vpaddusb
| AvxOpcode::Vpaddusw
| AvxOpcode::Vpsubb
| AvxOpcode::Vpsubw
| AvxOpcode::Vpsubd
| AvxOpcode::Vpsubq
| AvxOpcode::Vpsubsb
| AvxOpcode::Vpsubsw
| AvxOpcode::Vpsubusb
| AvxOpcode::Vpsubusw
| AvxOpcode::Vpavgb
| AvxOpcode::Vpavgw
| AvxOpcode::Vpand
| AvxOpcode::Vandps
| AvxOpcode::Vandpd
| AvxOpcode::Vpor
| AvxOpcode::Vorps
| AvxOpcode::Vorpd
| AvxOpcode::Vpxor
| AvxOpcode::Vxorps
| AvxOpcode::Vxorpd
| AvxOpcode::Vpmullw
| AvxOpcode::Vpmulld
| AvxOpcode::Vpmulhw
| AvxOpcode::Vpmulhd
| AvxOpcode::Vpmulhrsw
| AvxOpcode::Vpmulhuw
| AvxOpcode::Vpmuldq
| AvxOpcode::Vpmuludq
| AvxOpcode::Vpunpckhwd
| AvxOpcode::Vpunpcklwd
| AvxOpcode::Vunpcklps
| AvxOpcode::Vaddps
| AvxOpcode::Vaddpd
| AvxOpcode::Vsubps
| AvxOpcode::Vsubpd
| AvxOpcode::Vmulps
| AvxOpcode::Vmulpd
| AvxOpcode::Vdivps
| AvxOpcode::Vdivpd
| AvxOpcode::Vpcmpeqb
| AvxOpcode::Vpcmpeqw
| AvxOpcode::Vpcmpeqd
| AvxOpcode::Vpcmpeqq
| AvxOpcode::Vpcmpgtb
| AvxOpcode::Vpcmpgtw
| AvxOpcode::Vpcmpgtd
| AvxOpcode::Vpcmpgtq
| AvxOpcode::Vblendvps
| AvxOpcode::Vblendvpd
| AvxOpcode::Vpblendvb
| AvxOpcode::Vmovlhps
| AvxOpcode::Vpminsb
| AvxOpcode::Vpminsw
| AvxOpcode::Vpminsd
| AvxOpcode::Vpminub
| AvxOpcode::Vpminuw
| AvxOpcode::Vpminud
| AvxOpcode::Vpmaxsb
| AvxOpcode::Vpmaxsw
| AvxOpcode::Vpmaxsd
| AvxOpcode::Vpmaxub
| AvxOpcode::Vpmaxuw
| AvxOpcode::Vpmaxud
| AvxOpcode::Vpunpcklbw
| AvxOpcode::Vpunpckhbw
| AvxOpcode::Vpacksswb
| AvxOpcode::Vpackssdw
| AvxOpcode::Vpackuswb
| AvxOpcode::Vpackusdw
| AvxOpcode::Vpalignr
| AvxOpcode::Vpinsrb
| AvxOpcode::Vpinsrw
| AvxOpcode::Vpinsrd
| AvxOpcode::Vpinsrq
| AvxOpcode::Vpmaddwd
| AvxOpcode::Vpmaddubsw
| AvxOpcode::Vinsertps
| AvxOpcode::Vpshufb
| AvxOpcode::Vshufps
| AvxOpcode::Vpsllw
| AvxOpcode::Vpslld
| AvxOpcode::Vpsllq
| AvxOpcode::Vpsraw
| AvxOpcode::Vpsrad => {
smallvec![InstructionSet::AVX]
}
}