The Intel documentation is, well, inconsistent about this: at one point,
they say that the VEX.W prefix is ignored entirely in 32-bit mode, but
the instruction description indicates that a VEX.W can be required in
32-bit/compatibility mode as well.
Some instructions, e.g. VZEROUPPER, have a prefix table but no
associated byte for that. Fix this by removing the prefix handling from
the table walking loop.
It is possible to configure the build process such that decoding of 32
bit and 64 bit instructions can be chosen at runtime using an additional
parameter of the decode function. The header file is now entirely
architecture-independent and no longer required any previous defines.
Decoding x86-64 still requires a 64-bit pointer size.
As there is not much difference between the two mnemonic tables, it is
possible to unify them. As a consequence, the instruction types no
longer differ between 32 and 64 bit decodings.
This is mainly needed to handle the new control flow enforcement
extensions, making 3E a "notrack" prefix for indirect calls and jumps.
This is not (yet) modeled, and requires additional information on the
order of the prefixes, as 3E_66 (16-bit in ds segment) has a different
meaning than 66_3E (16-bit notrack). Before implementing this, an
analysis of the performance impact when decoding more prefix information
is probably required to avoid degrading overall performance for very few
and (as of now) seldomly used corner cases.
This allows to decode x86-32 machine code on a 64-bit platform (but
not vice versa). As a side-effect, we also get rid of pointer-size
detection for architecture selection.