aarch64: Migrate uextend/sextend to ISLE

This commit migrates the sign/zero extension instructions from
`lower_inst.rs` to ISLE. There's actually a fair amount going on in this
migration since a few other pieces needed touching up along the way as
well:

* First is the actual migration of `uextend` and `sextend`. These
  instructions are relatively simple but end up having a number of
  special cases. I've attempted to replicate all the cases here, but
  double-checking would be appreciated.
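As a scalar reminder of the difference the two lowerings must preserve, here is a sketch in plain Rust (illustrative only, independent of the backend's actual code):

```rust
// Scalar semantics of uextend/sextend, illustrated with Rust casts.

/// Zero-extend: reinterpret as unsigned first, so the upper bits become zero.
fn uextend_i8_to_i32(x: i8) -> i32 {
    (x as u8) as i32
}

/// Sign-extend: the upper bits replicate the sign bit.
fn sextend_i8_to_i32(x: i8) -> i32 {
    x as i32
}

fn main() {
    assert_eq!(uextend_i8_to_i32(-1), 0xff); // 0xff widens to 0x000000ff
    assert_eq!(sextend_i8_to_i32(-1), -1);   // 0xff widens to 0xffffffff
}
```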

* This commit fixes a few issues where sign/zero-extending the result
  of a vector extraction to i128 panics in the current backend.

* This commit adds exhaustive tests confirming that extending the
  result of a vector extraction is a noop with respect to the
  extraction itself.
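The property those tests exercise can be sketched with a simplified scalar model of the AArch64 lane moves (not the backend's code): `umov`/`smov` already zero/sign-extend the extracted lane into the general-purpose register, so a following `uextend`/`sextend` has nothing left to do.

```rust
// Simplified model: umov zero-extends a byte lane, smov sign-extends it.
fn umov_b(lane: u8) -> u64 {
    lane as u64
}
fn smov_b(lane: u8) -> i64 {
    lane as i8 as i64
}

fn main() {
    for b in 0..=255u8 {
        // An explicit uextend after umov is a noop...
        assert_eq!(umov_b(b) & 0xff, umov_b(b));
        // ...and an explicit sextend after smov is likewise a noop.
        let s = smov_b(b);
        assert_eq!((s as i8) as i64, s);
    }
}
```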

* A bugfix around the ISLE glue was required to get this commit
  working: the case where the `RegMapper` implementation maps an input
  to an output (meaning ISLE passed an input through to the output
  unmodified) wasn't working. This case requires a `mov` instruction
  to be generated, and this commit updates the glue to do so. At the
  same time this commit updates the ISLE glue to share more
  infrastructure between x64 and aarch64, so both backends get this
  fix instead of just aarch64.
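A highly simplified sketch of the pass-through fix (hypothetical types and function names, not the real `RegMapper` API): when a rule forwards an input register unchanged as its result, the glue emits a copy into the expected output register rather than leaving the mapping unresolved.

```rust
// Hypothetical stand-in for the register-mapping glue described above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Reg(u32);

#[derive(PartialEq, Eq, Debug)]
enum Inst {
    Mov { dst: Reg, src: Reg },
}

/// If the value ISLE produced is still the input register, generate a
/// `mov` to place it in the output register the caller expects.
fn map_result(input: Reg, output: Reg, insts: &mut Vec<Inst>) {
    if input != output {
        insts.push(Inst::Mov { dst: output, src: input });
    }
}

fn main() {
    let mut insts = Vec::new();
    map_result(Reg(0), Reg(1), &mut insts);
    assert_eq!(insts, vec![Inst::Mov { dst: Reg(1), src: Reg(0) }]);
}
```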

Overall I think the translation to ISLE was a net benefit for these
instructions. It's now relatively obvious what all the cases are,
unlike before, where it took a few reads of the code and some boolean
switches to figure out which path was taken for each flavor of input.
There are still possible improvements here: for example, the
`put_in_reg_{s,z}ext64` helpers don't use this logic, so they could
also pattern-match the "atomic loads and vector extractions already do
this for us" cases. That's a possible future improvement (and
shouldn't be too hard with some ISLE refactoring).
Author: Alex Crichton
Date: 2021-11-30 09:40:58 -08:00
Parent: 20e090b114
Commit: d89410ec4e
11 changed files with 937 additions and 391 deletions


@@ -106,3 +106,211 @@ block0(v0: i8):
; check: sxtb x0, w0
; nextln: asr x1, x0, #63
; nextln: ret
function %i8x16_uextend_i16(i8x16) -> i16 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = uextend.i16 v1
return v2
}
; check: umov w0, v0.b[1]
; nextln: ret
function %i8x16_uextend_i32(i8x16) -> i32 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = uextend.i32 v1
return v2
}
; check: umov w0, v0.b[1]
; nextln: ret
function %i8x16_uextend_i64(i8x16) -> i64 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = uextend.i64 v1
return v2
}
; check: umov w0, v0.b[1]
; nextln: ret
function %i8x16_uextend_i128(i8x16) -> i128 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = uextend.i128 v1
return v2
}
; check: umov w0, v0.b[1]
; nextln: movz x1, #0
; nextln: ret
function %i8x16_sextend_i16(i8x16) -> i16 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = sextend.i16 v1
return v2
}
; check: smov w0, v0.b[1]
; nextln: ret
function %i8x16_sextend_i32(i8x16) -> i32 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = sextend.i32 v1
return v2
}
; check: smov w0, v0.b[1]
; nextln: ret
function %i8x16_sextend_i64(i8x16) -> i64 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = sextend.i64 v1
return v2
}
; check: smov x0, v0.b[1]
; nextln: ret
function %i8x16_sextend_i128(i8x16) -> i128 {
block0(v0: i8x16):
v1 = extractlane v0, 1
v2 = sextend.i128 v1
return v2
}
; check: smov x0, v0.b[1]
; nextln: asr x1, x0, #63
; nextln: ret
function %i16x8_uextend_i32(i16x8) -> i32 {
block0(v0: i16x8):
v1 = extractlane v0, 1
v2 = uextend.i32 v1
return v2
}
; check: umov w0, v0.h[1]
; nextln: ret
function %i16x8_uextend_i64(i16x8) -> i64 {
block0(v0: i16x8):
v1 = extractlane v0, 1
v2 = uextend.i64 v1
return v2
}
; check: umov w0, v0.h[1]
; nextln: ret
function %i16x8_uextend_i128(i16x8) -> i128 {
block0(v0: i16x8):
v1 = extractlane v0, 1
v2 = uextend.i128 v1
return v2
}
; check: umov w0, v0.h[1]
; nextln: movz x1, #0
; nextln: ret
function %i16x8_sextend_i32(i16x8) -> i32 {
block0(v0: i16x8):
v1 = extractlane v0, 1
v2 = sextend.i32 v1
return v2
}
; check: smov w0, v0.h[1]
; nextln: ret
function %i16x8_sextend_i64(i16x8) -> i64 {
block0(v0: i16x8):
v1 = extractlane v0, 1
v2 = sextend.i64 v1
return v2
}
; check: smov x0, v0.h[1]
; nextln: ret
function %i16x8_sextend_i128(i16x8) -> i128 {
block0(v0: i16x8):
v1 = extractlane v0, 1
v2 = sextend.i128 v1
return v2
}
; check: smov x0, v0.h[1]
; nextln: asr x1, x0, #63
; nextln: ret
function %i32x4_uextend_i64(i32x4) -> i64 {
block0(v0: i32x4):
v1 = extractlane v0, 1
v2 = uextend.i64 v1
return v2
}
; check: mov w0, v0.s[1]
; nextln: ret
function %i32x4_uextend_i128(i32x4) -> i128 {
block0(v0: i32x4):
v1 = extractlane v0, 1
v2 = uextend.i128 v1
return v2
}
; check: mov w0, v0.s[1]
; nextln: movz x1, #0
; nextln: ret
function %i32x4_sextend_i64(i32x4) -> i64 {
block0(v0: i32x4):
v1 = extractlane v0, 1
v2 = sextend.i64 v1
return v2
}
; check: smov x0, v0.s[1]
; nextln: ret
function %i32x4_sextend_i128(i32x4) -> i128 {
block0(v0: i32x4):
v1 = extractlane v0, 1
v2 = sextend.i128 v1
return v2
}
; check: smov x0, v0.s[1]
; nextln: asr x1, x0, #63
; nextln: ret
function %i64x2_uextend_i128(i64x2) -> i128 {
block0(v0: i64x2):
v1 = extractlane v0, 1
v2 = uextend.i128 v1
return v2
}
; check: mov x0, v0.d[1]
; nextln: movz x1, #0
; nextln: ret
function %i64x2_sextend_i128(i64x2) -> i128 {
block0(v0: i64x2):
v1 = extractlane v0, 1
v2 = sextend.i128 v1
return v2
}
; check: mov x0, v0.d[1]
; nextln: asr x1, x0, #63
; nextln: ret