riscv64: Improve ctz/clz/cls codegen (#5854)

* cranelift: Add extra runtests for `clz`/`ctz`

* riscv64: Restrict lowering rules for `ctz`/`clz`

* cranelift: Add `u64` isle helpers

* riscv64: Improve `ctz` codegen

* riscv64: Improve `clz` codegen

* riscv64: Improve `cls` codegen

* riscv64: Improve `clz.i128` codegen

Instead of checking if we have 64 zeros in the top half, check
if it *is* 0. That way we avoid loading the `64` constant.
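
As a rough model of the idea in plain Rust (not the actual riscv64 lowering, which emits machine instructions), the two formulations can be compared side by side. The function names `clz128_old` and `clz128_new` are hypothetical and exist only for this sketch:

```rust
/// Sketch of the earlier approach: count the leading zeros of the top half
/// and compare the count against 64, which needs the constant `64`.
fn clz128_old(hi: u64, lo: u64) -> u32 {
    let hi_count = hi.leading_zeros();
    if hi_count == 64 {
        hi_count + lo.leading_zeros()
    } else {
        hi_count
    }
}

/// Sketch of the approach described above: test the top half against zero
/// directly, so no `64` has to be loaded for the comparison itself.
fn clz128_new(hi: u64, lo: u64) -> u32 {
    if hi == 0 {
        // The top half contributes 64 zeros; add the low half's count.
        64 + lo.leading_zeros()
    } else {
        hi.leading_zeros()
    }
}
```

Both functions agree because a 64-bit value has 64 leading zeros exactly when it is 0.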

* riscv64: Improve `ctz.i128` codegen

Instead of checking if we have 64 zeros in the bottom half, check
if it *is* 0. That way we avoid loading the `64` constant.
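
The same reasoning for trailing zeros, again as a plain Rust model rather than the real lowering. The name `ctz128` is hypothetical:

```rust
/// Count trailing zeros of a 128-bit value given as two 64-bit halves.
/// The bottom half is compared against zero instead of comparing its
/// trailing-zero count against 64.
fn ctz128(hi: u64, lo: u64) -> u32 {
    if lo == 0 {
        // The bottom half is all zeros, so it contributes 64 and the
        // top half decides the rest.
        64 + hi.trailing_zeros()
    } else {
        lo.trailing_zeros()
    }
}
```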

* riscv64: Use extended value in `lower_cls`

* riscv64: Use pattern matches on `bseti`
Afonso Bordado
2023-03-21 23:15:14 +00:00
committed by GitHub
parent ff6f17ca52
commit 7a3df7dcc0
14 changed files with 617 additions and 167 deletions

@@ -327,14 +327,14 @@
 ;;;; Rules for `ctz` ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-(rule (lower (has_type ty (ctz x)))
+(rule (lower (has_type (fits_in_64 ty) (ctz x)))
       (lower_ctz ty x))
 (rule 1 (lower (has_type $I128 (ctz x)))
       (lower_ctz_128 x))
 ;;;; Rules for `clz` ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-(rule (lower (has_type ty (clz x)))
+(rule (lower (has_type (fits_in_64 ty) (clz x)))
       (lower_clz ty x))
 (rule 1 (lower (has_type $I128 (clz x)))
@@ -342,7 +342,7 @@
 ;;;; Rules for `cls` ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 (rule (lower (has_type (fits_in_64 ty) (cls x)))
-      (lower_cls x ty))
+      (lower_cls ty x))
 (rule 1 (lower (has_type $I128 (cls x)))
       (lower_cls_i128 x))