This is an *experimental* (read: unstable) API which exposes encoding
functionality as one function per instruction. This makes the encoding
process itself significantly faster, at the cost of a much larger binary
size (~1 MiB of code, no data) and a much longer compilation time.
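
To illustrate what "one function per instruction" can look like, here is a
minimal sketch of a single dedicated encoder. The name `enc_add32rr`, its
signature, and the restriction to registers 0-7 are assumptions made for
this example and are not taken from the actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-instruction encoder for ADD r/m32, r32 (opcode 01 /r),
 * register-direct form only. Writes the encoding to buf and returns the
 * number of bytes written. Only registers 0-7 are handled, so no REX
 * prefix is ever needed. */
static size_t enc_add32rr(uint8_t *buf, unsigned dst, unsigned src) {
    assert(dst < 8 && src < 8);
    buf[0] = 0x01;                               /* opcode */
    buf[1] = (uint8_t)(0xC0 | (src << 3) | dst); /* ModRM: mod=11, reg=src, rm=dst */
    return 2;
}
```

Because the opcode and operand layout are fixed at compile time, such a
function needs no table lookup at run time; the price is one function body
per instruction form, which is where the code size comes from.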
Some instructions honor an address-size override or a segment override,
even in the absence of a directly encoded memory operand.
These annotations are not yet used, but may be used in the future to
reduce the size of encoded instructions.
To avoid GCC warnings when building with `-Os`:
warning: inlining failed in call to 'table_walk': call is
unlikely and code size would grow [-Winline]
I don't know if this causes a performance regression when optimizing for
speed instead of size, but perhaps there's a different way we can help
the compiler make this decision in such cases.
Where long is only 32 bits wide, we'd end up losing the upper bits.
GCC catches this and emits a warning such as:
warning: result of '7 << 31' requires 35 bits to represent, but
'long int' only has 32 bits [-Wshift-overflow=]
Where size_t is only 32 bits wide, we end up losing the upper bits.
GCC catches this and emits a warning such as:
warning: left shift count >= width of type [-Wshift-count-overflow]
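
A common way to avoid both warnings is to widen the operand to a fixed
64-bit type before shifting. The sketch below shows the idea; the helper
name `field_at` and the 3-bit field are illustrative assumptions, not code
from this project.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place a 3-bit field at bit position pos of a
 * 64-bit word. The cast happens before the shift, so the shift is done
 * in 64 bits regardless of how wide long or size_t are on the target. */
static uint64_t field_at(unsigned value, unsigned pos) {
    assert(pos <= 61); /* the 3-bit field must fit below bit 64 */
    return (uint64_t)(value & 7u) << pos;
}
```

With this, `field_at(7, 31)` yields 0x380000000 on every target, whereas a
plain `7L << 31` overflows wherever long has only 32 bits.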
The AMD64 instructions VPERMIL2PS and VPERMIL2PD (currently not
supported) encode a fifth immediate operand in the lower bits of the
re-purposed immediate. Expose this value in any case so that no
information gets lost during decoding.
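
As a rough sketch of the idea: when the immediate byte is re-purposed to
carry a register number in its upper four bits, the lower four bits can be
kept and reported as-is. The struct and function names below are
hypothetical and only illustrate the split; they are not the decoder's
actual interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: both halves of a re-purposed imm8. */
struct imm_split {
    uint8_t reg; /* bits 7:4, the register number */
    uint8_t low; /* bits 3:0, reported unchanged */
};

/* Split the immediate byte without discarding the lower bits. */
static struct imm_split split_reg_imm(uint8_t imm8) {
    struct imm_split s;
    s.reg = (uint8_t)(imm8 >> 4);
    s.low = (uint8_t)(imm8 & 0x0f);
    return s;
}
```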
This changes the instruction description format:
- Use Intel/AMD terminology for describing operands (where applicable)
- Group instructions by ISA extension
- Indicate read/written status flags
As the formatter no longer requires null-terminated strings, mnemonics
can overlap arbitrarily, which saves space.
Finding the optimal layout is the shortest superstring problem, which is
NP-hard. It is currently approximated with a greedy heuristic.
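
For illustration, one well-known greedy heuristic for the shortest
superstring problem drops every string contained in another one and then
repeatedly merges the pair with the largest suffix/prefix overlap. The
sketch below shows that textbook variant; it is not necessarily the
heuristic used here, and all function names are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of the longest suffix of a that is also a prefix of b. */
static size_t overlap(const char *a, const char *b) {
    size_t la = strlen(a), lb = strlen(b);
    for (size_t k = la < lb ? la : lb; k > 0; k--)
        if (memcmp(a + la - k, b, k) == 0)
            return k;
    return 0;
}

static char *dup_str(const char *s) {
    size_t n = strlen(s) + 1;
    char *r = malloc(n);
    assert(r != NULL);
    memcpy(r, s, n);
    return r;
}

/* Returns a malloc'ed string that contains every input word as a
 * substring. The result is an approximation, not the optimum. */
static char *greedy_superstring(const char *const *words, size_t n) {
    assert(n > 0);
    char **s = malloc(n * sizeof *s);
    assert(s != NULL);
    size_t count = 0;
    /* Step 1: drop words that are contained in another word. */
    for (size_t i = 0; i < n; i++) {
        int covered = 0;
        for (size_t j = 0; j < n && !covered; j++) {
            if (j == i || strstr(words[j], words[i]) == NULL)
                continue;
            /* Equal strings contain each other: keep only the first copy. */
            covered = strcmp(words[i], words[j]) != 0 || j < i;
        }
        if (!covered)
            s[count++] = dup_str(words[i]);
    }
    /* Step 2: merge the pair with the largest overlap until one is left. */
    while (count > 1) {
        size_t bi = 0, bj = 1, best = 0;
        for (size_t i = 0; i < count; i++)
            for (size_t j = 0; j < count; j++) {
                size_t k = i == j ? 0 : overlap(s[i], s[j]);
                if (k > best) { best = k; bi = i; bj = j; }
            }
        size_t la = strlen(s[bi]), lb = strlen(s[bj]);
        char *m = malloc(la + lb - best + 1);
        assert(m != NULL);
        memcpy(m, s[bi], la);
        memcpy(m + la, s[bj] + best, lb - best + 1); /* tail plus NUL */
        free(s[bi]);
        free(s[bj]);
        s[bi] = m;
        s[bj] = s[count - 1]; /* fill the hole with the last entry */
        count--;
    }
    char *result = s[0];
    free(s);
    return result;
}
```

For the inputs "add", "addps", "psubb" and "sub", this produces
"addpsubb": 8 bytes instead of 16, with each mnemonic then addressable by
an offset and a length into the combined string.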
The MPX instructions have plenty of corner cases, and some of them use
the memory operand differently. Given that Intel has already deprecated
MPX, the better option seems to be to decode these (rarely occurring)
instructions as NOPs.