x64: Add most remaining AVX lowerings (#5819)

* x64: Add most remaining AVX lowerings

This commit goes through `inst.isle` and adds a corresponding AVX
lowering for most SSE lowerings. I opted to skip instructions where the
SSE lowering didn't read/modify a register, such as `roundps`. I think
AVX will still benefit these instructions once load-merging is involved,
since AVX memory operands don't require alignment, but I've deferred
that work to a future PR.

Otherwise, I think this PR adds the 3-operand AVX form for all (or
almost all) of the instructions that have an SSE lowering. This should
ideally improve codegen slightly by reducing register pressure and
removing the need for `movdqa` between registers. I've attempted to
ensure that there's at least one codegen test for each new instruction.
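
To illustrate the `movdqa` point, here is a minimal sketch with a toy instruction set. It is not Cranelift's `MInst`; the `Inst` enum, `lower_add`, and `run` are hypothetical names made up for this example. It assumes the first input must stay live, which is the case where the destructive 2-operand SSE form needs a copy first, and shows the 3-operand AVX form doing the same work in one instruction.

```rust
/// Toy 128-bit vector register: four 32-bit lanes.
type V = [i32; 4];

/// Hypothetical, simplified instructions (not Cranelift's real `MInst`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Inst {
    /// SSE `movdqa dst, src`: plain register copy.
    Movdqa { dst: usize, src: usize },
    /// SSE `paddd dst, src`: dst = dst + src, so dst is read and overwritten.
    Paddd { dst: usize, src: usize },
    /// AVX `vpaddd dst, src1, src2`: dst = src1 + src2, inputs untouched.
    Vpaddd { dst: usize, src1: usize, src2: usize },
}

fn add_lanes(a: V, b: V) -> V {
    let mut out = [0i32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

/// Lower `dst = a + b` while keeping `a` and `b` intact.
fn lower_add(dst: usize, a: usize, b: usize, has_avx: bool) -> Vec<Inst> {
    if has_avx {
        vec![Inst::Vpaddd { dst, src1: a, src2: b }]
    } else {
        // The SSE form clobbers its first operand, so copy `a` first.
        vec![Inst::Movdqa { dst, src: a }, Inst::Paddd { dst, src: b }]
    }
}

/// Interpret a sequence over a toy file of eight vector registers.
fn run(insts: &[Inst], regs: &mut [V; 8]) {
    for inst in insts {
        match *inst {
            Inst::Movdqa { dst, src } => regs[dst] = regs[src],
            Inst::Paddd { dst, src } => regs[dst] = add_lanes(regs[dst], regs[src]),
            Inst::Vpaddd { dst, src1, src2 } => {
                regs[dst] = add_lanes(regs[src1], regs[src2])
            }
        }
    }
}
```

Both sequences compute the same result and leave the inputs alone. The AVX one is just shorter and doesn't tie the destination to an input.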

As a side note, the recent capstone integration into `precise-output`
tests helped me catch a number of encoding bugs much earlier than
otherwise, so I've found that incredibly useful in tests!

* Move `vpinsr*` instructions to their own variant

Use true `XmmMem` and `GprMem` types in the instruction as well to get
more type-level safety for what goes where.
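
As a rough sketch of what the type-level safety buys, the newtype pattern looks something like the following. The types here are simplified stand-ins written for this example and only share names with the real `XmmMem` and `GprMem`; the real definitions carry more detail.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegClass {
    Int,
    Float,
}

/// Simplified stand-in for a register: a class plus an index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Reg {
    class: RegClass,
    index: u8,
}

/// Untyped "register or memory" operand.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegMem {
    Reg(Reg),
    Mem { base: Reg, offset: i32 },
}

/// Operand that is either an XMM register or memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct XmmMem(RegMem);

/// Operand that is either a general-purpose register or memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct GprMem(RegMem);

/// Accept memory operands as-is; accept registers only of the wanted class.
fn checked(rm: RegMem, want: RegClass) -> Option<RegMem> {
    match rm {
        RegMem::Reg(r) if r.class != want => None,
        _ => Some(rm),
    }
}

impl XmmMem {
    fn new(rm: RegMem) -> Option<Self> {
        checked(rm, RegClass::Float).map(XmmMem)
    }
}

impl GprMem {
    fn new(rm: RegMem) -> Option<Self> {
        checked(rm, RegClass::Int).map(GprMem)
    }
}
```

With fields typed this way, an instruction variant that takes a `GprMem` source can't be constructed with an XMM register in that slot; the mistake surfaces at construction time rather than at encoding time.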

* Remove `Inst::produces_const` accessor

Instead of conditionally defining regalloc and various other operations,
add dedicated `MInst` variants for operations which are intended to
produce a constant. This gives them clearer interactions with regalloc,
printing, and such.
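
A minimal sketch of the idea, using a toy instruction enum rather than the real `MInst`: the `VReg`, `Operand`, `operands`, and `print` names below are hypothetical. The point is that a dedicated constant-producing variant reports only a def to the register allocator, with no special-casing inside the general variant.

```rust
/// Hypothetical virtual register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VReg(u32);

/// How an instruction touches a register, as regalloc would see it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operand {
    Use(VReg),
    Def(VReg),
    /// Read and then overwritten.
    Mod(VReg),
}

/// Toy instruction set with a dedicated constant-producing variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Inst {
    /// `xor dst, src`: reads both operands and writes `dst`.
    Xor { src: VReg, dst: VReg },
    /// `xor dst, dst`: always produces zero, so `dst` is never read.
    XorConst { dst: VReg },
}

/// Operand collection needs no "does this produce a constant?" check.
fn operands(inst: &Inst) -> Vec<Operand> {
    match *inst {
        Inst::Xor { src, dst } => vec![Operand::Use(src), Operand::Mod(dst)],
        Inst::XorConst { dst } => vec![Operand::Def(dst)],
    }
}

/// Printing is equally direct for each variant.
fn print(inst: &Inst) -> String {
    match *inst {
        Inst::Xor { src, dst } => format!("xor v{}, v{}", dst.0, src.0),
        Inst::XorConst { dst } => format!("xor v{}, v{}", dst.0, dst.0),
    }
}
```

The trade-off is a few more enum variants in exchange for each `match` arm saying exactly what the instruction does.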

* Fix tests

* Register traps in `MachBuffer` for load-folding ops

This adds a missing `add_trap` to the encoding of VEX instructions with
memory operands. If such an instruction segfaults, Wasmtime now has the
metadata it needs to understand that the instruction could in fact
trap. This fixes a fuzz test case found locally where v8 trapped but
Wasmtime didn't catch the signal and crashed the fuzzer.
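
For illustration, the bookkeeping can be sketched with a toy buffer. This is not the real `MachBuffer` API: `ToyBuffer`, `emit_with_mem_operand`, and `trap_at` are hypothetical names, the `TrapCode` enum is a stand-in, and the bytes in any usage are placeholders rather than real VEX encodings. The key step is recording the trap at the instruction's start offset before its bytes are written.

```rust
/// Stand-in trap code for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrapCode {
    HeapOutOfBounds,
}

/// Toy code buffer: emitted bytes plus (offset, trap) side metadata.
#[derive(Default)]
struct ToyBuffer {
    bytes: Vec<u8>,
    traps: Vec<(usize, TrapCode)>,
}

impl ToyBuffer {
    /// Record that the instruction starting at the current offset may trap.
    fn add_trap(&mut self, code: TrapCode) {
        self.traps.push((self.bytes.len(), code));
    }

    fn put(&mut self, encoding: &[u8]) {
        self.bytes.extend_from_slice(encoding);
    }

    /// What a signal handler would ask: is a fault at `offset` an expected trap?
    fn trap_at(&self, offset: usize) -> Option<TrapCode> {
        self.traps
            .iter()
            .find(|(off, _)| *off == offset)
            .map(|(_, code)| *code)
    }
}

/// Emit an instruction; if its memory operand can fault, register the trap
/// first so the recorded offset is the start of the instruction.
fn emit_with_mem_operand(buf: &mut ToyBuffer, encoding: &[u8], can_trap: bool) -> usize {
    let start = buf.bytes.len();
    if can_trap {
        buf.add_trap(TrapCode::HeapOutOfBounds);
    }
    buf.put(encoding);
    start
}
```

Without the `add_trap` call, a fault at that offset has no matching entry, so the runtime treats it as an unexpected signal instead of a wasm trap.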
Alex Crichton
2023-02-20 09:11:52 -06:00
committed by GitHub
parent ad128b6811
commit c26a65a854
16 changed files with 4145 additions and 466 deletions

@@ -891,12 +891,11 @@ impl Context for IsleContext<'_, '_, MInst, X64Backend> {
         None
     };
     let dividend_hi = self.lower_ctx.alloc_tmp(types::I64).only_reg().unwrap();
-    self.lower_ctx.emit(MInst::alu_rmi_r(
-        OperandSize::Size32,
-        AluRmiROpcode::Xor,
-        RegMemImm::reg(dividend_hi.to_reg()),
-        dividend_hi,
-    ));
+    self.lower_ctx.emit(MInst::AluConstOp {
+        op: AluRmiROpcode::Xor,
+        size: OperandSize::Size32,
+        dst: WritableGpr::from_reg(Gpr::new(dividend_hi.to_reg()).unwrap()),
+    });
     self.lower_ctx.emit(MInst::checked_div_or_rem_seq(
         kind.clone(),
         size,