aarch64: Migrate uextend/sextend to ISLE

This commit migrates the sign/zero extension instructions from
`lower_inst.rs` to ISLE. There's a fair amount going on in this
migration since a few other pieces needed touching up along the way:

* First is the actual migration of `uextend` and `sextend`. These
  instructions are relatively simple but end up having a number of special
  cases. I've attempted to replicate all the cases here but
  double-checks would be good.

* This commit fixes a few panics in the current backend that happened
  when the result of a vector extraction was sign/zero-extended into
  i128.

* This commit adds exhaustive tests checking that extending the result
  of a vector extraction is a noop with respect to the extraction.

* A bugfix in the ISLE glue was required to get this commit working.
  The case where the `RegMapper` implementation was trying to map an
  input to an output (meaning ISLE was passing an input through
  unmodified to the output) wasn't working. This case requires a `mov`
  instruction to be generated, and this commit updates the glue to do
  so. At the same time this commit updates the ISLE glue to share more
  infrastructure between x64 and aarch64, so both backends get this fix
  instead of just aarch64.
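
To make the `RegMapper` point concrete, here is a minimal sketch of the pass-through case. Everything in it (`Reg`, `Inst`, `finish`) is a hypothetical, heavily simplified model made up for illustration and is not the actual glue code. The idea is only that a result which is a temporary written by the rule can be renamed to the output, while a result which is an untouched input has nothing to rename and needs an explicit move:

```rust
/// Hypothetical stand-in for a virtual register (not Cranelift's real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Reg(u32);

/// Hypothetical, simplified instruction set for this sketch.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Inst {
    /// `dst = op(src)`, standing in for any instruction a rule emits.
    Op { dst: Reg, src: Reg },
    /// `dst = src`.
    Mov { dst: Reg, src: Reg },
}

/// Connects a rule's `result` register to the instruction's real `output`.
///
/// If `result` is written by one of `insts` it is a temporary, so renaming
/// it to `output` everywhere is enough. Otherwise `result` is an input that
/// was passed through unmodified, and a `Mov` has to be appended instead.
fn finish(mut insts: Vec<Inst>, result: Reg, output: Reg) -> Vec<Inst> {
    let defined_by_rule = insts.iter().any(|inst| {
        matches!(inst, Inst::Op { dst, .. } | Inst::Mov { dst, .. } if *dst == result)
    });
    if defined_by_rule {
        for inst in &mut insts {
            match inst {
                Inst::Op { dst, src } | Inst::Mov { dst, src } => {
                    if *dst == result {
                        *dst = output;
                    }
                    if *src == result {
                        *src = output;
                    }
                }
            }
        }
    } else {
        // Pass-through: no emitted instruction defines `result`, so copy it.
        insts.push(Inst::Mov { dst: output, src: result });
    }
    insts
}
```

With an empty instruction list and an input register as the result, `finish` returns a single `Mov` into the output, which is the situation described above.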

Overall I think that the translation to ISLE was a net benefit for these
instructions. It's relatively obvious what all the cases are now, unlike
before, where it took a few reads of the code and some boolean switches
to figure out which path was taken for each flavor of input. I think
there are still possible improvements here. For example, the
`put_in_reg_{s,z}ext64` helpers don't use this logic, so technically
those helpers could also pattern match the "well atomic loads and vector
extractions automatically do this for us" cases. That's a possible
future improvement, though, and shouldn't be too hard with some ISLE
refactoring.
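
As an illustration of why an extension on top of a vector extraction can be skipped, here is a small model in plain Rust. The function names are made up for this sketch. It assumes, as the comments on the ISLE rules say, that the lane move already zero-extends (or sign-extends, for the signed variant) into the 64-bit register:

```rust
/// Models a zero-extending lane move: the 8-bit lane lands in a 64-bit
/// register with all upper bits cleared.
fn extract_lane_u8_zext(vec: [u8; 16], lane: usize) -> u64 {
    vec[lane] as u64
}

/// Models a sign-extending lane move: the lane's sign bit is replicated
/// into the upper bits of the 64-bit register.
fn extract_lane_u8_sext(vec: [u8; 16], lane: usize) -> u64 {
    vec[lane] as i8 as i64 as u64
}

/// Models a separate zero-extension of the low `from_bits` bits of `reg`.
/// `from_bits` must be in `1..=64`.
fn uextend(reg: u64, from_bits: u32) -> u64 {
    if from_bits >= 64 {
        reg
    } else {
        reg & ((1u64 << from_bits) - 1)
    }
}

/// Models a separate sign-extension of the low `from_bits` bits of `reg`.
/// `from_bits` must be in `1..=64`.
fn sextend(reg: u64, from_bits: u32) -> u64 {
    let shift = 64 - from_bits;
    (((reg << shift) as i64) >> shift) as u64
}
```

In this model `uextend(extract_lane_u8_zext(v, lane), 8)` equals `extract_lane_u8_zext(v, lane)` for every lane value, and likewise for the signed pair, so the extra extend changes nothing.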
Alex Crichton
2021-11-30 09:40:58 -08:00
parent 20e090b114
commit d89410ec4e
11 changed files with 937 additions and 391 deletions

@@ -502,3 +502,89 @@
(result Reg (alu_rrrr (ALUOp3.MSub64) div y64 x64))
)
(value_reg result)))

;;;; Rules for `uextend` ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; General rule for extending input to an output which fits in a single
;; register.
(rule (lower (has_type (fits_in_64 out) (uextend x @ (value_type in))))
      (value_reg (extend (put_in_reg x) $false (ty_bits in) (ty_bits out))))

;; Extraction of a vector lane automatically extends as necessary, so we can
;; skip an explicit extending instruction.
(rule (lower (has_type (fits_in_64 out)
                       (uextend (def_inst (extractlane vec @ (value_type in)
                                                       (u8_from_uimm8 lane))))))
      (value_reg (mov_from_vec (put_in_reg vec) lane (vector_size in))))

;; Atomic loads will also automatically zero their upper bits so the `uextend`
;; instruction can effectively get skipped here.
(rule (lower (has_type (fits_in_64 out)
                       (uextend (and (value_type in) (sinkable_atomic_load addr)))))
      (value_reg (load_acquire in (sink_atomic_load addr))))

;; Conversion to 128-bit needs a zero-extension of the lower bits and the upper
;; bits are all zero.
(rule (lower (has_type $I128 (uextend x)))
      (value_regs (put_in_reg_zext64 x) (imm $I64 0)))

;; Like above, vector extraction automatically zero-extends, so extending to
;; i128 only requires generating a 0 constant for the upper bits.
(rule (lower (has_type $I128
                       (uextend (def_inst (extractlane vec @ (value_type in)
                                                       (u8_from_uimm8 lane))))))
      (value_regs (mov_from_vec (put_in_reg vec) lane (vector_size in)) (imm $I64 0)))

;;;; Rules for `sextend` ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; General rule for extending input to an output which fits in a single
;; register.
(rule (lower (has_type (fits_in_64 out) (sextend x @ (value_type in))))
      (value_reg (extend (put_in_reg x) $true (ty_bits in) (ty_bits out))))

;; Extraction of a vector lane automatically extends as necessary, so we can
;; skip an explicit extending instruction.
(rule (lower (has_type (fits_in_64 out)
                       (sextend (def_inst (extractlane vec @ (value_type in)
                                                       (u8_from_uimm8 lane))))))
      (value_reg (mov_from_vec_signed (put_in_reg vec)
                                      lane
                                      (vector_size in)
                                      (size_from_ty out))))

;; 64-bit to 128-bit only needs to sign-extend the input to the upper bits.
(rule (lower (has_type $I128 (sextend x)))
      (let (
            (lo Reg (put_in_reg_sext64 x))
            (hi Reg (alu_rr_imm_shift (ALUOp.Asr64) lo (imm_shift_from_u8 63)))
           )
        (value_regs lo hi)))

;; Like above, vector extraction can sign-extend as part of the move, so
;; extending to i128 only requires replicating the sign bit of the low half
;; into the upper bits.
;;
;; Note that `mov_from_vec_signed` doesn't exist for i64x2, so that's
;; specifically excluded here.
(rule (lower (has_type $I128
                       (sextend (def_inst (extractlane vec @ (value_type in @ (not_i64x2))
                                                       (u8_from_uimm8 lane))))))
      (let (
            (lo Reg (mov_from_vec_signed (put_in_reg vec)
                                         lane
                                         (vector_size in)
                                         (size_from_ty $I64)))
            (hi Reg (alu_rr_imm_shift (ALUOp.Asr64) lo (imm_shift_from_u8 63)))
           )
        (value_regs lo hi)))

;; Extension from an extraction of i64x2 into i128.
(rule (lower (has_type $I128
                       (sextend (def_inst (extractlane vec @ (value_type $I64X2)
                                                       (u8_from_uimm8 lane))))))
      (let (
            (lo Reg (mov_from_vec (put_in_reg vec)
                                  lane
                                  (VectorSize.Size64x2)))
            (hi Reg (alu_rr_imm_shift (ALUOp.Asr64) lo (imm_shift_from_u8 63)))
           )
        (value_regs lo hi)))
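
For readers checking the i128 rules above, the lo/hi register pairs they build can be modelled in plain Rust. This is only a sketch with made-up function names, not part of the backend. It assumes the low half has already been extended to 64 bits, which is the job `put_in_reg_zext64`, `put_in_reg_sext64`, or the lane moves do in the rules:

```rust
/// Shape of the `uextend` to i128 rules: the low half is the zero-extended
/// input and the high half is the constant 0.
fn uextend_to_i128(lo: u64) -> (u64, u64) {
    (lo, 0)
}

/// Shape of the `sextend` to i128 rules: the low half is the sign-extended
/// input and the high half is an arithmetic shift right of it by 63, which
/// fills all 64 bits with the sign bit.
fn sextend_to_i128(lo: i64) -> (u64, u64) {
    (lo as u64, (lo >> 63) as u64)
}

/// Joins a (lo, hi) register pair back into one 128-bit value for checking.
fn join(lo: u64, hi: u64) -> u128 {
    ((hi as u128) << 64) | (lo as u128)
}
```

Joining the pair from `sextend_to_i128(v)` gives the same bits as Rust's own `v as i128` cast, and the pair from `uextend_to_i128(v)` matches `v as u128`.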