Aarch64 codegen: represent bool true as -1, not 1.

It seems that this is actually the correct behavior for bool types wider
than `b1`; some of the vector instruction optimizations depend on bool
lanes representing false and true as all-zeroes and all-ones
respectively. For `b8`..`b64`, this results in an extra negation after a
`cset` when a bool is produced by an `icmp`/`fcmp`, but the most common
case (`b1`) is unaffected, because an all-ones one-bit value is just
`1`.

An example of this assumption can be seen here:

399ee0a54c/cranelift/codegen/src/simple_preopt.rs (L956)

Thanks to Joey Gouly of ARM for noting this issue while implementing
SIMD support, and digging into the source (finding the above example) to
determine the correct behavior.
This commit is contained in:
Chris Fallin
2020-07-21 14:04:23 -07:00
parent c420f65214
commit b8f6d53a6b
5 changed files with 108 additions and 70 deletions

View File

@@ -1014,6 +1014,24 @@ pub(crate) fn lower_fcmp_or_ffcmp_to_flags<C: LowerCtx<I = Inst>>(ctx: &mut C, i
}
}
/// Convert a 0 / 1 result, such as from a conditional-set instruction, into a 0
/// / -1 (all-ones) result as expected for bool operations.
pub(crate) fn normalize_bool_result<C: LowerCtx<I = Inst>>(
ctx: &mut C,
insn: IRInst,
rd: Writable<Reg>,
) {
// A boolean is 0 / -1; if output width is > 1, negate.
if ty_bits(ctx.output_ty(insn, 0)) > 1 {
ctx.emit(Inst::AluRRR {
alu_op: ALUOp::Sub64,
rd,
rn: zero_reg(),
rm: rd.to_reg(),
});
}
}
//=============================================================================
// Lowering-backend trait implementation.