This enables more runtests to be executed on s390x. Doing so
uncovered a two back-end bugs, which are fixed as well:
- The result of cls was always off by one.
- The result of popcnt.i16 has uninitialized high bits.
In addition, I found a bug in the load-op-store.clif test case:
v3 = heap_addr.i64 heap0, v1, 4
v4 = iconst.i64 42
store.i32 v4, v3
This was clearly intended to perform a 32-bit store, but
actually performs a 64-bit store (it seems the type annotation
of the store opcode is ignored, and the type of the operand
is used instead). That bug did not show any noticable symptoms
on little-endian architectures, but broke on big-endian.
This commit migrates these existing instructions to ISLE from the manual
lowerings implemented today. This was mostly straightforward but while I
was at it I fixed what appeared to be broken translations for I{8,16}
for `clz`, `cls`, and `ctz`. Previously the lowerings would produce
results as-if the input was 32-bits, but now I believe they all
correctly account for the bit-width.
Implemented for the Cranelift interpreter:
- `Bitrev` to reverse the order of the bits in an integer.
- `Cls` to count the leading bits which are the same as the sign bit in
an integer, yielding one less than the size of the integer for 0 and -1.
- `Clz` to count the number of leading zeros in the bitwise representation of the
integer.
- `Ctz` to count the number of trailing zeros in the bitwise representation of the
integer.
- `Popcnt` to count the number of ones in the bitwise representation of the
integer.
Copyright (c) 2021, Arm Limited