This change has no noticeable effect on performance; it is preparatory
work for AVX-512, where the memory operand size is required to compute
the compressed displacement.
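
To illustrate why the memory operand size matters, here is a minimal sketch of EVEX displacement compression (disp8*N). The function name and signature are hypothetical, and it assumes the scaling factor N equals the memory operand size given as a log2 value; cases such as broadcast are not covered.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: try to express `disp` as a compressed 8-bit
// displacement. Assumption: N == 1 << mem_size_log2 (the memory operand
// size in bytes), with mem_size_log2 <= 6.
static bool
compress_disp8(int32_t disp, unsigned mem_size_log2, int8_t* out) {
    int32_t n = (int32_t) 1 << mem_size_log2;
    if (disp & (n - 1))
        return false; // not a multiple of N, needs a full disp32
    int32_t scaled = disp / n; // exact, since disp is a multiple of N
    if (scaled < INT8_MIN || scaled > INT8_MAX)
        return false; // scaled value does not fit into 8 bits
    *out = (int8_t) scaled;
    return true;
}
```

With a 64-byte operand, a displacement of 128 can be stored as disp8 = 2, while 100 cannot be compressed.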
The change consists of two parts. First, GP and vector operand sizes are
now completely separated using one extra bit. If the operand size of an
instruction is derived from VEX.L (or EVEX.L'L), then the "opsize" bits
indicate how to derive a smaller vector size (half/quarter/eighth).
As a consequence, an instruction can no longer refer to the GP operand
size and the vector operand size at the same time. This is not needed
anyway; any required distinction can be made manually using W0/W1/66
selectors.
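
The size derivation can be sketched as follows. The enum and function are hypothetical and do not reflect the actual bit layout of the descriptor; the sketch only assumes that the vector length selects 128, 256 or 512 bits and that the "opsize" bits select a right shift of 0 to 3.

```c
// Hypothetical shrink selector; the real "opsize" encoding may differ.
enum vec_shrink { VEC_FULL = 0, VEC_HALF = 1, VEC_QUARTER = 2, VEC_EIGHTH = 3 };

// vl: 0 = 128-bit (L=0), 1 = 256-bit (L=1), 2 = 512-bit (EVEX.L'L=2).
// Returns the operand size in bytes.
static unsigned
vec_operand_size(unsigned vl, enum vec_shrink shrink) {
    unsigned full = 16u << vl; // full vector size in bytes
    return full >> shrink;     // half/quarter/eighth of the vector
}
```

For example, a quarter of a 512-bit vector yields a 16-byte operand.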
Second, there is no instruction that uses an implicit register and a
VEX-encoded register at the same time. Thus, we can merge vexreg and
zeroreg in the instruction descriptor; the zeroreg value will be added
to the vex-operand (which is zero unless set by a VEX prefix).
This also frees 4 descriptor bits for use with AVX-512 (which will
probably need 1-2 further bits, most likely taken from the type).
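
A minimal sketch of the merged operand computation, under the assumption that the descriptor stores the implicit register number in a small field. The struct, field width and function name are hypothetical and only illustrate the addition described above.

```c
// Hypothetical descriptor excerpt; not the actual descriptor layout.
struct op_desc {
    unsigned zeroreg_val : 3; // implicit register number, 0 if unused
};

// `vex_operand` is the register from VEX.vvvv, or 0 without a VEX prefix.
// Since no instruction uses both, at most one of the summands is non-zero.
static unsigned
merged_operand_reg(unsigned vex_operand, struct op_desc desc) {
    return vex_operand + desc.zeroreg_val;
}
```

For a legacy shift by CL (register number 1), the VEX operand is zero and the result is 1; for a VEX instruction, the descriptor value is zero and the result is the vvvv register.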
This is an *experimental* (read: unstable) API which exposes encoding
functionality as one function per instruction. This makes the encoding
process itself significantly faster, at the cost of a much larger binary
size (~1 MiB of code, no data) and much higher compilation time.
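
To give an idea of what "one function per instruction" can look like, here is a sketch for `add r64, r64`. The function name and signature are hypothetical and not the names exposed by this API; only the byte encoding (REX.W 01 /r) is standard x86-64.

```c
#include <stdint.h>

// Hypothetical per-instruction encoder for `add dst, src` with 64-bit
// registers (numbers 0-15). Returns the number of bytes written.
static unsigned
enc_add64rr(uint8_t* buf, unsigned dst, unsigned src) {
    // REX prefix: W=1; R extends the reg field (src), B extends r/m (dst).
    buf[0] = (uint8_t) (0x48 | ((src & 8) >> 1) | ((dst & 8) >> 3));
    buf[1] = 0x01; // opcode: ADD r/m64, r64
    buf[2] = (uint8_t) (0xC0 | ((src & 7) << 3) | (dst & 7)); // ModRM, mod=11
    return 3;
}
```

Because the opcode and operand kinds are fixed at compile time, no table lookup or operand dispatch is needed at run time, which is where the speedup comes from; the cost is one such function for every instruction form.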
Some instructions honor an address-size override or a segment override,
even in the absence of a directly encoded memory operand.
These annotations are not yet used, but may be used in future to
optimize the size of encoded instructions.
To avoid GCC warnings when building with `-Os`:

    warning: inlining failed in call to 'table_walk': call is
    unlikely and code size would grow [-Winline]

I don't know if this causes a performance regression when optimizing for
speed instead of size, but perhaps there's a different way we can help
the compiler make this decision in such cases.
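
One possible direction, as an untested sketch: force inlining only when not optimizing for size. GCC defines `__OPTIMIZE_SIZE__` under `-Os`. The macro name and the function below are hypothetical stand-ins, not the actual `table_walk`.

```c
// Force inlining unless the build optimizes for size.
#if defined(__GNUC__) && !defined(__OPTIMIZE_SIZE__)
#define HOT_INLINE static inline __attribute__((always_inline))
#else
#define HOT_INLINE static
#endif

// Hypothetical stand-in for a table lookup helper.
HOT_INLINE unsigned
table_walk_sketch(const unsigned short* table, unsigned entry, unsigned index) {
    return table[entry + index];
}
```

With `-Os` the function is a plain static function and the compiler decides on its own, so `-Winline` has nothing to report; whether this actually helps at `-O2` or `-O3` would need to be measured.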
On targets where 'long int' is only 32 bits wide, we'd end up losing the
upper bits.
GCC catches this and emits a warning such as:

    warning: result of '7 << 31' requires 35 bits to represent, but
    'long int' only has 32 bits [-Wshift-overflow=]

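
A sketch of the usual remedy, assuming the value is meant to be 64 bits wide: give the constant an explicit 64-bit unsigned type before shifting. The function name is hypothetical.

```c
#include <stdint.h>

// UINT64_C makes the constant 64 bits wide on every target, so the shift
// result fits regardless of the width of 'long int'.
static uint64_t
field_at_bit31(void) {
    return UINT64_C(7) << 31;
}
```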
On targets where size_t is only 32 bits wide, we end up losing the upper
bits.
GCC catches this and emits a warning such as:

    warning: left shift count >= width of type [-Wshift-count-overflow]
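
For a value held in a size_t, a comparable sketch is to widen before shifting, so that the shift is performed in a 64-bit type on every target. The helper name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

// Widen first, then shift; `count` must be less than 64.
static uint64_t
shift_wide(size_t value, unsigned count) {
    return (uint64_t) value << count;
}
```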