x64: implement vselect with variable blend instructions
This change implements `vselect` using SSE4.1's `BLENDVPS`, `BLENDVPD`,
and `PBLENDVB`. `vselect` is a lane-selecting instruction that is used
by
[simple_preopt.rs](fa1faf5d22/cranelift/codegen/src/simple_preopt.rs (L947-L999))
to lower `bitselect` to a single x86 instruction when the condition mask
is known to be boolean (each lane all 1s or all 0s, e.g., from a conversion). This is
cheaper than the general `bitselect` lowering, which takes 4-5 instructions.
The old backend had the `vselect` lowering; this simply introduces it to
the new backend.
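
To illustrate why the mask must be boolean, here is a minimal scalar sketch of the two selection semantics over four 32-bit lanes. The function names and the fixed lane layout are assumptions for illustration only; this is not Cranelift code. The lane select is modeled on `BLENDVPS`, which examines only the most significant bit of each mask lane.

```rust
/// Bitwise select: every result bit comes from `x` where the corresponding
/// mask bit is 1, and from `y` where it is 0.
fn bitselect(mask: [u32; 4], x: [u32; 4], y: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = (mask[i] & x[i]) | (!mask[i] & y[i]);
    }
    out
}

/// Lane select modeled on BLENDVPS: only the top bit of each mask lane is
/// examined, and the whole lane is taken from `x` (bit set) or `y` (bit clear).
fn vselect(mask: [u32; 4], x: [u32; 4], y: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = if mask[i] >> 31 == 1 { x[i] } else { y[i] };
    }
    out
}
```

With a boolean mask such as `[u32::MAX, 0, u32::MAX, 0]` the two functions agree on every input, which is what makes the replacement safe. With a partial mask such as `0x0000_FFFF` in a lane they can differ, so the general `bitselect` lowering is still needed in that case.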
@@ -3432,6 +3432,18 @@ fn test_x64_emit() {
        "blendvpd %xmm15, %xmm4",
    ));

    insns.push((
        Inst::xmm_rm_r(SseOpcode::Blendvps, RegMem::reg(xmm2), w_xmm3),
        "660F3814DA",
        "blendvps %xmm2, %xmm3",
    ));

    insns.push((
        Inst::xmm_rm_r(SseOpcode::Pblendvb, RegMem::reg(xmm12), w_xmm13),
        "66450F3810EC",
        "pblendvb %xmm12, %xmm13",
    ));

    // ========================================================
    // XMM_RM_R: Integer Packed

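
The expected byte strings in the two new test cases can be checked by hand. The sketch below assembles the register-to-register form `66 [REX] 0F 38 <opcode> /r`, with the destination in ModRM.reg and the source in ModRM.rm. The helper names `encode_66_0f38_rr` and `to_hex` are hypothetical and are not part of Cranelift's emitter. The opcode bytes (`0x14` for `BLENDVPS`, `0x10` for `PBLENDVB`) are taken from the Intel instruction reference.

```rust
/// Encode `66 [REX] 0F 38 <opcode> /r` for two XMM registers (numbers 0..=15).
/// `dst` is placed in ModRM.reg and `src` in ModRM.rm.
fn encode_66_0f38_rr(opcode: u8, src: u8, dst: u8) -> Vec<u8> {
    assert!(src < 16 && dst < 16, "XMM register numbers must be 0..=15");
    // The mandatory 0x66 prefix comes before any REX byte.
    let mut bytes = vec![0x66];
    // REX.R extends ModRM.reg (dst) and REX.B extends ModRM.rm (src).
    let rex = 0x40 | ((dst >> 3) << 2) | (src >> 3);
    if rex != 0x40 {
        // Emit REX only when one of the registers is xmm8..xmm15.
        bytes.push(rex);
    }
    bytes.extend_from_slice(&[0x0F, 0x38, opcode]);
    // mod = 0b11 selects the register-direct form.
    bytes.push(0xC0 | ((dst & 7) << 3) | (src & 7));
    bytes
}

/// Render bytes as an uppercase hex string, matching the test table's format.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02X}", b)).collect()
}
```

Under these assumptions, `to_hex(&encode_66_0f38_rr(0x14, 2, 3))` gives `660F3814DA`, and `to_hex(&encode_66_0f38_rr(0x10, 12, 13))` gives `66450F3810EC`. The second case needs the REX byte `0x45` because both `xmm12` and `xmm13` are extended registers.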