125 Commits

Author SHA1 Message Date
Alexis Engelke
09d3886577 parseinstrs: Move regtype encoding to InstrDesc 2021-01-23 13:25:23 +01:00
Alexis Engelke
d6278de812 parseinstrs: Use tuples/ints for indexing in trie
This avoids useless internal string formatting.
2021-01-23 13:25:23 +01:00
Alexis Engelke
1390bae341 parseinstr: Create optype string in descriptor
The raw encoding representation is now only used in InstrDesc.
2021-01-23 13:25:23 +01:00
Alexis Engelke
801fe4bc43 parseinstrs: Generalize immediate size computation 2021-01-23 13:25:23 +01:00
Alexis Engelke
62018556a1 parseinstrs: Simplify operand kind parsing 2021-01-23 13:25:23 +01:00
Alexis Engelke
bd611902b0 parseinstrs: Add separate ModRM indicator to desc
Some instructions have no ModRM operand and no extended opcode but still
consume a ModRM byte.
2021-01-23 13:25:23 +01:00
Alexis Engelke
80df5ff47c instrs: Add reserved NOP/PREFETCH as weak opcodes 2021-01-10 16:53:27 +01:00
Alexis Engelke
1458bf9673 encode: Support VEX-encoded instructions 2021-01-10 16:03:40 +01:00
Alexis Engelke
c050b34ff9 instrs: Add support for undocumented instructions
Undocumented instructions are not decoded by default.

- SALC: undocumented in any recent manual and unsupported by newer
  Intel CPUs. Included as listed by [1,2].
- Undocumented FPU instructions: see [2].

[1]: http://www.rcollins.org/secrets/opcodes/SALC.html
[2]: https://github.com/xoreaxeaxeax/sandsifter/issues/33
2021-01-10 15:04:37 +01:00
Alexis Engelke
862b6d285c instrs: Minor operand size fixes 2021-01-10 14:13:44 +01:00
Alexis Engelke
af9188e267 parseinstrs: Respect mem-only/reg-only encodings 2021-01-10 12:02:58 +01:00
Alexis Engelke
111769832f format: Properly output VSIB encodings 2021-01-08 10:37:13 +01:00
Alexis Engelke
3fdbd70153 encode: Fix erroneous encoding of high registers 2021-01-07 10:03:17 +01:00
Alexis Engelke
db183ee6f9 meson: Check compiler options and Python version
Thanks to William Woodruff for pointing out that -Wcast-align=strict is
a GCC-only option, which causes build errors (instead of just
complaining about an unsupported warning option).
2021-01-05 20:21:44 +01:00
Alexis Engelke
44808e7b1a format: Format instructions with Intel syntax 2021-01-03 21:18:57 +01:00
Alexis Engelke
d2bf961b77 instrs: Properly handle PUSH/POP of SEG registers 2021-01-03 20:08:34 +01:00
Alexis Engelke
3a3a284f6f parseinstrs: Improve performance 2021-01-03 20:08:34 +01:00
Alexis Engelke
5a77c0e6eb parseinstrs: Use suffix tree to reduce mnem size
This brings slight size improvements, although due to SSE/MMX
instruction name prefixes, benefits are rather small (~50 bytes).
2021-01-03 20:08:30 +01:00
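The suffix-sharing idea behind the commit above can be illustrated with a minimal sketch. This is not the code from `parseinstrs`; the function names and the sample mnemonics are hypothetical, and a plain longest-first scan stands in for a real suffix tree. It shows why a name that is a full suffix of another costs no extra bytes, while names that differ only in a leading prefix (such as `MMX_`/`SSE_`) cannot share storage.

```python
def build_suffix_shared_table(names):
    """Pack names into one NUL-terminated string table, reusing suffixes.

    Hypothetical helper: returns (table, offsets), where offsets maps each
    name to the index in table at which that name starts.
    """
    table = ""
    offsets = {}
    stored = []  # (name, offset) for names written out in full
    # Process longest names first so shorter ones can point into them.
    for name in sorted(set(names), key=lambda n: (-len(n), n)):
        for full, off in stored:
            if full.endswith(name):
                # Reuse the tail of an already stored name.
                offsets[name] = off + len(full) - len(name)
                break
        else:
            offsets[name] = len(table)
            stored.append((name, len(table)))
            table += name + "\0"
    return table, offsets


def lookup(table, offset):
    """Read the NUL-terminated name starting at offset."""
    return table[offset:table.index("\0", offset)]


# Illustrative names only, not taken from the real instruction list.
names = ["ADD", "XADD", "MMX_PADDB", "SSE_PADDB"]
table, offsets = build_suffix_shared_table(names)
naive_size = sum(len(n) + 1 for n in names)
```

Here `ADD` is stored as the tail of `XADD`, whereas `MMX_PADDB` and `SSE_PADDB` are both stored in full because neither is a suffix of the other.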
Alexis Engelke
fd80706f54 decode: Store instruction descriptors separately 2020-11-22 22:27:43 +01:00
Alexis Engelke
6fe5500444 instrs: Force RIP access to 64-bit and fix XBEGIN 2020-11-22 15:13:52 +01:00
Alexis Engelke
f9bba6289e instrs: Annotate only-mem and only-reg in opcode 2020-11-22 11:34:55 +01:00
Alexis Engelke
62b0420147 parseinstr: Simplify opcode naming scheme 2020-11-09 09:47:36 +01:00
Alexis Engelke
9df6ac1788 decode: Replace T8+T72 with T16+T8E for R/M value 2020-11-09 09:47:36 +01:00
Alexis Engelke
7d7e72746e parseinstr: Split escape and opcode 2020-11-09 09:47:36 +01:00
Alexis Engelke
01e1587c5c decode: Move prefix before other opcode extensions 2020-11-09 09:47:36 +01:00
Alexis Engelke
2e7e396325 decode: Remove TABLE_PREFIX_REP and use NFx prefix 2020-11-09 09:47:36 +01:00
Alexis Engelke
69ce124354 encode: Add library for x86-64 encoding 2020-11-09 09:46:38 +01:00
Alexis Engelke
468eeaa249 parseinstrs: Create a separate class for parsed opcode 2020-07-05 14:57:22 +02:00
Alexis Engelke
9b6caeb2ae parseinstrs: Write mnemonics to separate file 2020-07-04 14:35:51 +02:00
Alexis Engelke
dc668691d8 instrs: Specify segment register size 2020-07-04 14:25:22 +02:00
Alexis Engelke
141680e77c instrs: Remove MUSTMEM, encode in operands 2020-07-04 14:24:56 +02:00
Alexis Engelke
da4ad137d8 instrs: Remove redundant IMM_8 2020-07-04 08:55:51 +02:00
Alexis Engelke
7ee9320840 decode: Add second fixed operand size 2020-06-30 22:07:18 +02:00
Alexis Engelke
08490d4503 parseinstrs: Simplify opkind lookup 2020-06-30 21:02:31 +02:00
Alexis Engelke
3221a319d3 instrs: Don't use O-encoding hack for FSTSW 2020-06-27 17:33:58 +02:00
Alexis Engelke
1b5461036e decode: Don't walk escape opcodes in tables 2020-06-27 17:33:58 +02:00
Alexis Engelke
3ad518e22e decode: Store op types early and compact encoding
* The encoding of operand types in the decode table now only requires 9
  bits instead of the previous 16 bits.
* Operand types are decoded before the operands themselves are stored.
  This allows ignoring REX.RB prefixes for specific register types.
2020-06-27 17:33:58 +02:00
Alexis Engelke
618d90ed42 instrs: Encode memory size for FPU instructions 2020-06-27 17:33:58 +02:00
Alexis Engelke
807d8a817b decode: Change imm_control to get rid of imm_byte 2020-06-19 14:04:17 +02:00
Alexis Engelke
f978785df3 parseinstrs: Make TrieEntry always hashable 2020-06-17 18:36:18 +02:00
Alexis Engelke
93a61a0ff1 parseinstrs: Remove mnemonic from instr bitstruct 2020-06-17 17:16:53 +02:00
Alexis Engelke
38f52c98b5 parseinstrs: Store mnemonic enum entry in trie 2020-06-17 17:08:23 +02:00
Alexis Engelke
af5b36a58e parseinstrs: Don't needlessly convert to bytes 2020-06-17 16:49:27 +02:00
Alexis Engelke
f4b41a7e80 decode: Use uint16_t for trie 2020-06-17 16:44:22 +02:00
Alexis Engelke
1fedc069b6 parseinstrs: Propagate unpacked data for trie 2020-06-17 16:34:27 +02:00
Alexis Engelke
da4cbc237f parseinstr: Use typing.NamedTuple 2020-05-10 14:20:34 +02:00
Alexis Engelke
513a913feb decode: Store CL as register operand for shifts 2020-02-19 16:53:59 +01:00
Alexis Engelke
e65086c76c parseinstr: Separate fields for operand properties 2020-02-16 18:12:07 +01:00
Alexis Engelke
e59117538f parseinstr: Include mnemonic in flag bitstruct 2020-02-16 18:05:32 +01:00
Alexis Engelke
f6a66ea4fb Use special root table for VEX
Some instruction opcodes have an entirely different encoding when a VEX
prefix is present. For example, 0f41 is CMOVNO without mandatory
prefixes while VEX.NP.W0.L1.0f41 is KANDW with a mandatory prefix. To
avoid collisions, the VEX prefix is better handled as a completely
separate decode tree, at the cost of a slight increase in table size.
2020-02-10 20:34:37 +01:00
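A minimal sketch of the separate-root approach described in the commit above, assuming hypothetical table names and a dictionary in place of the real generated decode tables. Only the two opcodes mentioned in the commit message are included, and the VEX fields (NP, W0, L1) are deliberately left out of the key to keep the example short.

```python
# Hypothetical miniature decode tables, keyed by the opcode bytes.
LEGACY_ROOT = {
    (0x0F, 0x41): "CMOVNO",
}
VEX_ROOT = {
    (0x0F, 0x41): "KANDW",
}


def lookup_mnemonic(opcode, has_vex):
    """Pick the root table first, then look up the opcode in it.

    Because the VEX decision is made before any opcode lookup, identical
    opcode bytes can map to unrelated instructions without colliding.
    """
    root = VEX_ROOT if has_vex else LEGACY_ROOT
    return root.get(tuple(opcode))


legacy = lookup_mnemonic(b"\x0f\x41", has_vex=False)
vex = lookup_mnemonic(b"\x0f\x41", has_vex=True)
```

The cost noted in the commit message shows up here as duplicated table structure: each root needs its own entries even for opcode paths that look the same.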