This is an *experimental* (read: unstable) API which exposes encoding
functionality as one function per instruction. This makes the encoding
process itself significantly faster, at the cost of a much larger binary
size (~1 MiB of code, no data) and much higher compilation time.
Thanks to William Woodruff for pointing out that -Wcast-align=strict is
a GCC-only option, which causes build errors (instead of just
complaining about an unsupported warning option).
It is possible to configure the build process such that decoding of 32
bit and 64 bit instructions can be chosen at runtime using an additional
parameter of the decode function. The header file is now entirely
architecture-independent and no longer required any previous defines.
Decoding x86-64 still requires a 64-bit pointer size.
This allows to decode x86-32 machine code on a 64-bit platform (but
not vice versa). As a side-effect, we also get rid of pointer-size
detection for architecture selection.