x64: implement vselect with variable blend instructions
This change implements `vselect` using SSE4.1's `BLENDVPS`, `BLENDVPD`,
and `PBLENDVB`. `vselect` is a lane-selecting instruction that is used
by
[simple_preopt.rs](fa1faf5d22/cranelift/codegen/src/simple_preopt.rs (L947-L999))
to lower `bitselect` to a single x86 instruction when the condition mask
is known to be boolean (all 1s or 0s, e.g., from a conversion). This is
better than `bitselect` in general, which lowers to 4-5 instructions.
The old backend had the `vselect` lowering; this simply introduces it to
the new backend.
This commit is contained in:
@@ -15,6 +15,16 @@ block0:
|
||||
; nextln: por %xmm1, %xmm0
|
||||
; not: movdqa
|
||||
|
||||
function %vselect_i16x8() -> i16x8 {
|
||||
block0:
|
||||
v0 = vconst.b16x8 [false true false true false true false true]
|
||||
v1 = vconst.i16x8 [0 0 0 0 0 0 0 0]
|
||||
v2 = vconst.i16x8 [0 0 0 0 0 0 0 0]
|
||||
v3 = vselect v0, v1, v2
|
||||
return v3
|
||||
}
|
||||
; check: pblendvb %xmm1, %xmm2
|
||||
|
||||
|
||||
|
||||
; 8x16 shifts: these lower to complex sequences of instructions
|
||||
|
||||
@@ -10,6 +10,17 @@ block0(v0: i8x16, v1: i8x16, v2: i8x16):
|
||||
; Remember that bitselect accepts: 1) the selector vector, 2) the "if true" vector, and 3) the "if false" vector.
|
||||
; run: %bitselect_i8x16([0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 255], [127 0 0 0 0 0 0 0 0 0 0 0 0 0 0 42], [42 0 0 0 0 0 0 0 0 0 0 0 0 0 0 127]) == [42 0 0 0 0 0 0 0 0 0 0 0 0 0 0 42]
|
||||
|
||||
function %vselect_i32x4(i32x4, i32x4) -> i32x4 {
|
||||
block0(v1: i32x4, v2: i32x4):
|
||||
; `make_trampoline` still does not know how to convert boolean vector types
|
||||
; so we load the value directly here.
|
||||
v0 = vconst.b32x4 [true true false false]
|
||||
v3 = vselect v0, v1, v2
|
||||
return v3
|
||||
}
|
||||
; Remember that vselect accepts: 1) the selector vector, 2) the "if true" vector, and 3) the "if false" vector.
|
||||
; run: %vselect_i8x16([1 2 -1 -1], [-1 -1 3 4]) == [1 2 3 4]
|
||||
|
||||
|
||||
|
||||
; shift left
|
||||
|
||||
Reference in New Issue
Block a user