An instruction can no longer refer to the GP operand size and the vector
operand size at the same time. Doing so is never necessary: any required
distinction can be made manually using the W0/W1/66 selectors.
There is no instruction that uses an implicit register and a
VEX-encoded register at the same time. Thus, we can merge vexreg and
zeroreg in the instruction descriptor; the zeroreg value will be added
to the VEX operand (which is zero unless set by a VEX prefix).
This also frees 4 descriptor bits for use with AVX-512 (which will
probably need 1-2 further bits, likely taken from the type).
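
A minimal sketch of the merge described above. The struct, field, and
function names are hypothetical and do not reflect the actual descriptor
layout; the sketch also assumes that descriptors of VEX-encoded
instructions store zero in the shared field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor fragment: one field replaces the separate
 * vexreg/zeroreg fields. Illustrative only. */
struct desc_sketch {
    uint8_t zeroreg_val; /* index of the implicit register, 0 if none */
};

/* vex_reg is the register index taken from the VEX prefix, or 0 when
 * the instruction has no VEX prefix. Since no instruction uses both an
 * implicit and a VEX-encoded register, at most one summand is non-zero
 * (assumption stated in the lead-in). */
static unsigned merged_reg(const struct desc_sketch *d, unsigned vex_reg)
{
    assert(d->zeroreg_val == 0 || vex_reg == 0);
    return vex_reg + d->zeroreg_val;
}
```

With this shape the decoder needs a single addition and no branch to
pick between the two register sources.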
This is an *experimental* (read: unstable) API which exposes encoding
functionality as one function per instruction. This makes the encoding
process itself significantly faster, at the cost of a much larger binary
size (~1 MiB of code, no data) and much higher compilation time.
Some instructions honor an address-size override or a segment override,
even in the absence of a directly encoded memory operand.
These annotations are not yet used, but may be used in the future to
reduce the size of encoded instructions.
To avoid GCC warnings when building with `-Os`:
    warning: inlining failed in call to 'table_walk': call is
    unlikely and code size would grow [-Winline]
I don't know if this causes a performance regression when optimizing for
speed instead of size, but perhaps there's a different way we can help
the compiler make this decision in such cases.
Where we'd end up losing the upper bits.
GCC catches this and emits a warning such as:
    warning: result of '7 << 31' requires 35 bits to represent, but
    'long int' only has 32 bits [-Wshift-overflow=]
Where size_t is only 32 bits wide, and we end up losing the upper bits.
GCC catches this and emits a warning such as:
    warning: left shift count >= width of type [-Wshift-count-overflow]
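
Both shift warnings above come from shifting in a type that is only
32 bits wide on some targets. A common remedy, sketched here with
hypothetical helper names (the actual change may differ), is to widen
the operand to a fixed 64-bit type before shifting:

```c
#include <assert.h>
#include <stdint.h>

/* Place a 3-bit field at bit 31 of a 64-bit word. Converting to
 * uint64_t before the shift keeps bits 32 and 33 even where long and
 * size_t are only 32 bits wide. */
static uint64_t field_at_bit31(unsigned value)
{
    assert(value <= 7);
    return (uint64_t)value << 31;
}

/* Build a single-bit mask for any bit position 0..63. Shifting a
 * 32-bit type by 32 or more would be undefined behavior in C. */
static uint64_t bit_mask64(unsigned bit)
{
    assert(bit < 64);
    return (uint64_t)1 << bit;
}
```

Using uint64_t (or the UINT64_C macro for constants) makes the width
independent of the platform's long and size_t.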
The AMD64 instructions VPERMIL2PS and VPERMIL2PD (currently not
supported) encode a fifth immediate operand in the lower bits of the
re-purposed immediate. Expose this value in any case so that no
information gets lost during decoding.
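
A sketch of how such a re-purposed immediate byte can be split,
assuming the usual VEX layout in which the upper four bits of the byte
select a register and the lower four bits are otherwise unused. The
struct and function names are hypothetical and not part of the API:

```c
#include <stdint.h>

/* Hypothetical result type for the two parts of the immediate byte. */
struct imm_split {
    unsigned reg; /* register index from bits 7:4 */
    unsigned low; /* remaining immediate from bits 3:0 */
};

/* Split the trailing immediate byte so that neither part is dropped. */
static struct imm_split split_reg_imm(uint8_t imm8)
{
    struct imm_split s;
    s.reg = imm8 >> 4;
    s.low = imm8 & 0xf;
    return s;
}
```

For the byte 0xA2 this gives register index 10 and a low value of 2.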
This changes the instruction description format:
- Use Intel/AMD terminology for describing operands (where applicable)
- Group instructions by ISA extension
- Indicate read/written status flags