Implement Unarrow, Uunarrow, and Snarrow for the interpreter

Implemented the following Opcodes for the Cranelift interpreter:
- `Unarrow` to combine two SIMD vectors into a new vector with twice
the lanes but half the width, with signed inputs which are clamped to
`0x00`.
- `Uunarrow` to perform the same operation as `Unarrow` but treating
inputs as unsigned.
- `Snarrow` to perform the same operation as `Unarrow` but treating
both inputs and outputs as signed, and saturating accordingly.

Note that all 3 instructions saturate at the type boundaries.

Copyright (c) 2021, Arm Limited
This commit is contained in:
dheaton-arm
2021-09-08 17:04:05 +01:00
parent 2412e8d784
commit 83c3bc5b9d
9 changed files with 161 additions and 10 deletions

View File

@@ -0,0 +1,11 @@
test interpret
test run
target aarch64
; x86_64 considers the case `i64x2` -> `i32x4` to be 'unreachable'
function %snarrow_i64x2(i64x2, i64x2) -> i32x4 {
block0(v0: i64x2, v1: i64x2):
v2 = snarrow v0, v1
return v2
}
; run: %snarrow_i64x2([65535 -100000], [5000000000 73]) == [65535 -100000 2147483647 73]

View File

@@ -0,0 +1,19 @@
test interpret
test run
target aarch64
set enable_simd
target x86_64
function %snarrow_i16x8(i16x8, i16x8) -> i8x16 {
block0(v0: i16x8, v1: i16x8):
v2 = snarrow v0, v1
return v2
}
; run: %snarrow_i16x8([1 127 128 15 32767 -32 48 0], [8 255 -100 100 -32768 73 80 42]) == [1 127 127 15 127 -32 48 0 8 127 -100 100 -128 73 80 42]
function %snarrow_i32x4(i32x4, i32x4) -> i16x8 {
block0(v0: i32x4, v1: i32x4):
v2 = snarrow v0, v1
return v2
}
; run: %snarrow_i32x4([32767 1048575 -70000 -5], [268435455 73 268435455 42]) == [32767 32767 -32768 -5 32767 73 32767 42]

View File

@@ -0,0 +1,11 @@
test interpret
test run
target aarch64
; x86_64 considers the case `i64x2 -> i32x4` to be 'unreachable'
function %unarrow_i64x2(i64x2, i64x2) -> i32x4 {
block0(v0: i64x2, v1: i64x2):
v2 = unarrow v0, v1
return v2
}
; run: %unarrow_i64x2([65535 -100000], [5000000000 73]) == [65535 0 4294967295 73]

View File

@@ -0,0 +1,19 @@
test interpret
test run
target aarch64
set enable_simd
target x86_64
function %unarrow_i16x8(i16x8, i16x8) -> i8x16 {
block0(v0: i16x8, v1: i16x8):
v2 = unarrow v0, v1
return v2
}
; run: %unarrow_i16x8([1 127 128 15 65535 -32 48 0], [8 255 -100 100 65534 73 80 42]) == [1 127 128 15 0 0 48 0 8 255 0 100 0 73 80 42]
function %unarrow_i32x4(i32x4, i32x4) -> i16x8 {
block0(v0: i32x4, v1: i32x4):
v2 = unarrow v0, v1
return v2
}
; run: %unarrow_i32x4([65535 1048575 -70000 -5], [268435455 73 268435455 42]) == [65535 65535 0 0 65535 73 65535 42]

View File

@@ -0,0 +1,26 @@
test interpret
test run
target aarch64
; x86_64 panics: `Did not match fcvt input!
; thread 'worker #0' panicked at 'register allocation: Analysis(EntryLiveinValues([v2V]))', cranelift/codegen/src/machinst/compile.rs:96:10`
function %uunarrow_i16x8(i16x8, i16x8) -> i8x16 {
block0(v0: i16x8, v1: i16x8):
v2 = uunarrow v0, v1
return v2
}
; run: %uunarrow_i16x8([1 127 128 15 65535 -32 48 0], [8 255 -100 100 65534 73 80 42]) == [1 127 128 15 255 255 48 0 8 255 255 100 255 73 80 42]
function %uunarrow_i32x4(i32x4, i32x4) -> i16x8 {
block0(v0: i32x4, v1: i32x4):
v2 = uunarrow v0, v1
return v2
}
; run: %uunarrow_i32x4([65535 1048575 -70000 -5], [268435455 73 268435455 42]) == [65535 65535 65535 65535 65535 73 65535 42]
function %uunarrow_i64x2(i64x2, i64x2) -> i32x4 {
block0(v0: i64x2, v1: i64x2):
v2 = uunarrow v0, v1
return v2
}
; run: %uunarrow_i64x2([65535 -100000], [5000000000 73]) == [65535 4294967295 4294967295 73]