x64: implement vselect with variable blend instructions
This change implements `vselect` using SSE4.1's `BLENDVPS`, `BLENDVPD`,
and `PBLENDVB`. `vselect` is a lane-selecting instruction that is used
by
[simple_preopt.rs](fa1faf5d22/cranelift/codegen/src/simple_preopt.rs (L947-L999))
to lower `bitselect` to a single x86 instruction when the condition mask
is known to be boolean (all 1s or 0s, e.g., from a conversion). This is
better than `bitselect` in general, which lowers to 4-5 instructions.
The old backend had the `vselect` lowering; this simply introduces it to
the new backend.
This commit is contained in:
@@ -1441,6 +1441,7 @@ pub(crate) fn emit(
|
||||
SseOpcode::Andpd => (LegacyPrefixes::_66, 0x0F54, 2),
|
||||
SseOpcode::Andnps => (LegacyPrefixes::None, 0x0F55, 2),
|
||||
SseOpcode::Andnpd => (LegacyPrefixes::_66, 0x0F55, 2),
|
||||
SseOpcode::Blendvps => (LegacyPrefixes::_66, 0x0F3814, 3),
|
||||
SseOpcode::Blendvpd => (LegacyPrefixes::_66, 0x0F3815, 3),
|
||||
SseOpcode::Cvttps2dq => (LegacyPrefixes::_F3, 0x0F5B, 2),
|
||||
SseOpcode::Cvtdq2ps => (LegacyPrefixes::None, 0x0F5B, 2),
|
||||
@@ -1480,6 +1481,7 @@ pub(crate) fn emit(
|
||||
SseOpcode::Pandn => (LegacyPrefixes::_66, 0x0FDF, 2),
|
||||
SseOpcode::Pavgb => (LegacyPrefixes::_66, 0x0FE0, 2),
|
||||
SseOpcode::Pavgw => (LegacyPrefixes::_66, 0x0FE3, 2),
|
||||
SseOpcode::Pblendvb => (LegacyPrefixes::_66, 0x0F3810, 3),
|
||||
SseOpcode::Pcmpeqb => (LegacyPrefixes::_66, 0x0F74, 2),
|
||||
SseOpcode::Pcmpeqw => (LegacyPrefixes::_66, 0x0F75, 2),
|
||||
SseOpcode::Pcmpeqd => (LegacyPrefixes::_66, 0x0F76, 2),
|
||||
|
||||
Reference in New Issue
Block a user