Previously `fsub` was used but this fails when negating -0.0 and +0.0 in the SIMD spec tests; using more instructions, this change uses shifts to create a constant for flipping the most significant bit of each lane with `bxor`.
This crate contains the metaprogram used by cranelift-codegen. It's not useful on its own.