Switch x86 SIMD bor from ORPS to POR encoding

There are two reasons for this change:
 1. it reduces confusion; using the `POR` encoding will match the future encodings of `band` and `bxor`, and the `ORPS` encoding may be confusing as it is intended for floating-point operations
 2. `POR` has slightly more throughput: it only has to wait 0.33 cycles to execute again on all Intel architectures above Core, whereas `ORPS` must wait 1 cycle on architectures older than Skylake (Intel Optimization Reference Manual, C.3)

`POR` does add one additional byte to the encoding and requires SSE2, so the `ORPS` opcode is left in for future use.
@@ -307,6 +307,9 @@ pub static POP_REG: [u8; 1] = [0x58];
/// Returns the count of number of bits set to 1.
pub static POPCNT: [u8; 3] = [0xf3, 0x0f, 0xb8];

/// Bitwise OR of xmm2/m128 and xmm1 (SSE2).
pub static POR: [u8; 3] = [0x66, 0x0f, 0xeb];

/// Shuffle bytes in xmm1 according to contents of xmm2/m128 (SSE3).
pub static PSHUFB: [u8; 4] = [0x66, 0x0f, 0x38, 0x00];
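
To make the "one additional byte" trade-off from the commit message concrete, here is a minimal sketch of how the two opcodes might be turned into a register-to-register instruction. It is not the encoder used in this repository: `modrm_rr` and `encode_rr` are hypothetical helpers invented for illustration, the `ORPS` bytes (`0f 56`) are taken from the Intel instruction reference rather than from this diff, and only XMM0 through XMM7 are handled (no REX prefix).

```rust
/// Opcode bytes for `ORPS` (assumed from the Intel instruction reference;
/// not shown in this diff) and `POR` (as added by this commit).
pub static ORPS: [u8; 2] = [0x0f, 0x56];
pub static POR: [u8; 3] = [0x66, 0x0f, 0xeb];

/// Hypothetical helper: ModR/M byte for a register-to-register operand pair.
/// mod = 0b11 selects register-direct addressing, `reg` holds the destination
/// and `rm` holds the source. XMM8 and above would need a REX prefix.
fn modrm_rr(dst: u8, src: u8) -> u8 {
    assert!(dst < 8 && src < 8, "XMM8+ would need a REX prefix");
    0b1100_0000 | (dst << 3) | src
}

/// Hypothetical helper: emit `<op> xmm<dst>, xmm<src>` as raw bytes by
/// appending the ModR/M byte to the opcode (including any mandatory prefix).
fn encode_rr(opcode: &[u8], dst: u8, src: u8) -> Vec<u8> {
    let mut bytes = opcode.to_vec();
    bytes.push(modrm_rr(dst, src));
    bytes
}
```

Under these assumptions, `encode_rr(&POR, 0, 1)` yields `66 0f eb c1` (`por xmm0, xmm1`), while `encode_rr(&ORPS, 0, 1)` yields `0f 56 c1` (`orps xmm0, xmm1`). The extra byte is the mandatory `0x66` prefix that selects the SSE2 integer form.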