ABI: implement register arguments with constraints. (#4858)

* ABI: implement register arguments with constraints.

Currently, Cranelift's ABI code emits a sequence of moves from physical
registers into vregs at the top of the function body, one for every
register-carried argument.
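
For illustration, here is a minimal sketch of that old shape. `PReg`,
`VReg`, `Inst`, and `lower_args_with_moves` are made-up stand-ins, not
Cranelift's actual types or API; the point is only the one-move-per-argument
structure.

```rust
// Illustrative only: stand-in types, not Cranelift's real definitions.

#[derive(Clone, Copy, Debug)]
struct PReg(u8); // pinned physical register (e.g. x0, x1, ...)

#[derive(Clone, Copy, Debug)]
struct VReg(u32); // virtual register used by the function body

#[derive(Debug)]
enum Inst {
    // `mov dst, src`: copy an argument out of its pinned physical register.
    MovFromPReg { dst: VReg, src: PReg },
}

// Old-style prologue lowering: one explicit move per register-carried argument.
fn lower_args_with_moves(arg_pregs: &[PReg]) -> Vec<Inst> {
    arg_pregs
        .iter()
        .enumerate()
        .map(|(i, &src)| Inst::MovFromPReg { dst: VReg(i as u32), src })
        .collect()
}

fn main() {
    // Two register arguments arriving in (say) x0 and x1.
    for inst in lower_args_with_moves(&[PReg(0), PReg(1)]) {
        println!("{inst:?}");
    }
}
```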

For a number of reasons, we want to move to operand constraints instead,
and remove the use of explicitly-named "pinned vregs"; this allows for
better regalloc in theory, as it removes the need to "reverse-engineer"
the sequence of moves.

This PR alters the ABI code so that it generates a single "args"
pseudo-instruction as the first instruction in the function body. This
pseudo-inst defs all register arguments and constrains each one to the
appropriate physical register at its def-point; the regalloc can then
move the values wherever it needs to.
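
For contrast, a minimal sketch of the new shape. Again, `PReg`, `VReg`,
`ArgPair`, `Inst::Args`, and `gen_args` are illustrative stand-ins rather
than the real Cranelift definitions; the key point is one def per argument,
each carrying a fixed-register constraint instead of an explicit move.

```rust
// Illustrative only: stand-in types, not Cranelift's real definitions.

#[derive(Clone, Copy, Debug)]
struct PReg(u8); // physical register the ABI delivers the argument in

#[derive(Clone, Copy, Debug)]
struct VReg(u32); // virtual register the function body will use

// One register-carried argument: a def of `vreg`, constrained to `preg`.
#[derive(Debug)]
struct ArgPair {
    vreg: VReg,
    preg: PReg,
}

#[derive(Debug)]
enum Inst {
    // Single pseudo-inst at the top of the body. Each `ArgPair` is reported
    // to the register allocator as a def with a fixed-register constraint,
    // so no explicit moves are emitted; the allocator places (or moves) the
    // values as it sees fit.
    Args { args: Vec<ArgPair> },
}

// New-style prologue lowering: one pseudo-inst defining every register argument.
fn gen_args(arg_pregs: &[PReg]) -> Inst {
    let args = arg_pregs
        .iter()
        .enumerate()
        .map(|(i, &preg)| ArgPair { vreg: VReg(i as u32), preg })
        .collect();
    Inst::Args { args }
}

fn main() {
    // Same two arguments as before, now carried by a single constrained def.
    println!("{:?}", gen_args(&[PReg(0), PReg(1)]));
}
```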

Some care was taken not to have this pseudo-inst show up in
post-regalloc disassemblies, but the change did cause a general regalloc
"shift" in many tests, so the precise-output updates are a bit noisy.
Sorry about that!

A subsequent PR will handle the other half of the ABI code, namely, the
callsite case, with a similar preg-to-constraint conversion.

* Update based on review feedback.

* Review feedback.
Author: Chris Fallin
Date: 2022-09-08 20:03:14 -05:00
Committed by: GitHub
Parent: 13c7846815
Commit: 2986f6b0ff
101 changed files with 2688 additions and 2441 deletions

@@ -309,8 +309,8 @@ block0(v0: f32, v1: f32):
 }
 ; block0:
-; ushr v6.2s, v1.2s, #31
-; sli v0.2s, v0.2s, v6.2s, #31
+; ushr v5.2s, v1.2s, #31
+; sli v0.2s, v0.2s, v5.2s, #31
 ; ret
 function %f32(f64, f64) -> f64 {
@@ -320,8 +320,8 @@ block0(v0: f64, v1: f64):
 }
 ; block0:
-; ushr d6, d1, #63
-; sli d0, d0, d6, #63
+; ushr d5, d1, #63
+; sli d0, d0, d5, #63
 ; ret
 function %f33(f32) -> i32 {
@@ -951,8 +951,8 @@ block0(v0: f32x2, v1: f32x2):
 }
 ; block0:
-; ushr v6.2s, v1.2s, #31
-; sli v0.2s, v0.2s, v6.2s, #31
+; ushr v5.2s, v1.2s, #31
+; sli v0.2s, v0.2s, v5.2s, #31
 ; ret
 function %f82(f32x4, f32x4) -> f32x4 {
@@ -962,8 +962,8 @@ block0(v0: f32x4, v1: f32x4):
 }
 ; block0:
-; ushr v6.4s, v1.4s, #31
-; sli v0.4s, v0.4s, v6.4s, #31
+; ushr v5.4s, v1.4s, #31
+; sli v0.4s, v0.4s, v5.4s, #31
 ; ret
 function %f83(f64x2, f64x2) -> f64x2 {
@@ -973,7 +973,7 @@ block0(v0: f64x2, v1: f64x2):
 }
 ; block0:
-; ushr v6.2d, v1.2d, #63
-; sli v0.2d, v0.2d, v6.2d, #63
+; ushr v5.2d, v1.2d, #63
+; sli v0.2d, v0.2d, v5.2d, #63
 ; ret