Commit Graph

90 Commits

Author SHA1 Message Date
Alexis Engelke
effc0c7e49 parseinstrs: Fold trie layers with only one child 2021-09-13 17:27:47 +02:00
Alexis Engelke
71c0daf581 instrs: Change operand format
This changes the instruction description format:

- Use Intel/AMD terminology for describing operands (where applicable)
- Group instructions by ISA extension
- Indicate read/written status flags
2021-09-13 17:26:43 +02:00
Alexis Engelke
e41d6c26f8 parseinstrs: Make superstring function generic 2021-09-11 13:19:17 +02:00
Alexis Engelke
1fcacdeda7 parseinstrs: Optimize mnemonic compression
As the formatter no longer demands a null-terminated string, mnemonics
can arbitarily overlap and therefore save space.

This is the shortest superstring problem, which is NP-hard. This is
currently approximated with a greedy heuristic.
2021-09-11 13:05:34 +02:00
Alexis Engelke
99a1fbeee1 format: Major refactoring for performance 2021-05-30 14:25:38 +02:00
Alexis Engelke
50f052488d decode: More precise register types 2021-04-02 17:20:23 +02:00
Alexis Engelke
4185d7b2d6 encode: Support FD/TD encodings 2021-04-02 11:26:12 +02:00
Alexis Engelke
2d9587bc16 encode: Fix encoding of memory operand
When a modrm_idx is used without a ModRM being present, the encoder
attempted to encode memory operands using O/OA/AO encodings.
2021-04-02 10:54:04 +02:00
Alexis Engelke
5faa90a292 encode: Support RVMR encoding 2021-03-23 12:55:43 +01:00
Alexis Engelke
4f2366afd1 instrs: Add VIA PadLock and AMD RDPRU 2021-01-23 16:47:30 +01:00
Alexis Engelke
363698db3b parseinstrs: Move decode table gen to new function 2021-01-23 13:59:59 +01:00
Alexis Engelke
85fdaa3a9b instrs: Remove incorrect NFx specifiers
The new trie implementation is more flexible and allows omitting
prefixes even with a ModRM specifier in the opcode. Use this flexibility
to simplify instruction descriptions.
2021-01-23 13:25:23 +01:00
Alexis Engelke
dc399390a4 parseinstrs: Refactor mapping of opcode to Trie 2021-01-23 13:25:23 +01:00
Alexis Engelke
13a2456458 parseinstrs: Simplify trie implementation 2021-01-23 13:25:23 +01:00
Alexis Engelke
43910a6227 parseinstrs: Avoid redundant encoding of InstrDesc 2021-01-23 13:25:23 +01:00
Alexis Engelke
09d3886577 parseinstrs: Move regtype encoding to InstrDesc 2021-01-23 13:25:23 +01:00
Alexis Engelke
d6278de812 parseinstrs: Use tuples/ints for indexing in trie
This avoids useless internal string formatting.
2021-01-23 13:25:23 +01:00
Alexis Engelke
1390bae341 parseinstr: Create optype string in descriptor
The raw encoding representation is now only used in InstrDesc.
2021-01-23 13:25:23 +01:00
Alexis Engelke
801fe4bc43 parseinstrs: Generalize immediate size computation 2021-01-23 13:25:23 +01:00
Alexis Engelke
62018556a1 parseinstrs: Simplify operand kind parsing 2021-01-23 13:25:23 +01:00
Alexis Engelke
bd611902b0 parseinstrs: Add separate ModRM indicator to desc
Some instructions have no ModRM operand and no extended opcode but still
consume a ModRM byte.
2021-01-23 13:25:23 +01:00
Alexis Engelke
80df5ff47c instrs: Add reserved NOP/PREFETCH as weak opcodes 2021-01-10 16:53:27 +01:00
Alexis Engelke
1458bf9673 encode: Support VEX-encoded instructions 2021-01-10 16:03:40 +01:00
Alexis Engelke
c050b34ff9 instrs: Add support for undocumented instructions
Undocumented instruction are not decoded by default.

- SALC: undocumented in any recent manual and unsupported by newer
  Intel CPUs. Including as listed by [1,2].
- Undocumented FPU instructions: see [2].

[1]: http://www.rcollins.org/secrets/opcodes/SALC.html
[2]: https://github.com/xoreaxeaxeax/sandsifter/issues/33
2021-01-10 15:04:37 +01:00
Alexis Engelke
862b6d285c instrs: Minor operand size fixes 2021-01-10 14:13:44 +01:00
Alexis Engelke
af9188e267 parseinstrs: Respect mem-only/reg-only encodings 2021-01-10 12:02:58 +01:00
Alexis Engelke
111769832f format: Properly output VSIB encodings 2021-01-08 10:37:13 +01:00
Alexis Engelke
3fdbd70153 encode: Fix erroneous encoding of high registers 2021-01-07 10:03:17 +01:00
Alexis Engelke
db183ee6f9 meson: Check compiler options and Python version
Thanks to William Woodruff for pointing out that -Wcast-align=strict is
a GCC-only option, which causes build errors (instead of just
complaining about an unsupported warning option).
2021-01-05 20:21:44 +01:00
Alexis Engelke
44808e7b1a format: Format instructions with Intel syntax 2021-01-03 21:18:57 +01:00
Alexis Engelke
d2bf961b77 instrs: Properly handle PUSH/POP of SEG registers 2021-01-03 20:08:34 +01:00
Alexis Engelke
3a3a284f6f parseinstrs: Improve performance 2021-01-03 20:08:34 +01:00
Alexis Engelke
5a77c0e6eb parseinstrs: Use suffix tree to reduce mnem size
This brings slight size improvements, although due to SSE/MMX
instruction name prefixes, benefits are rather small (~50 bytes).
2021-01-03 20:08:30 +01:00
Alexis Engelke
fd80706f54 decode: Store instruction descriptors separately 2020-11-22 22:27:43 +01:00
Alexis Engelke
6fe5500444 instrs: Force RIP access to 64-bit and fix XBEGIN 2020-11-22 15:13:52 +01:00
Alexis Engelke
f9bba6289e instrs: Annotate only-mem and only-reg in opcode 2020-11-22 11:34:55 +01:00
Alexis Engelke
62b0420147 parseinstr: Simplify opcode naming scheme 2020-11-09 09:47:36 +01:00
Alexis Engelke
9df6ac1788 decode: Replace T8+T72 with T16+T8E for R/M value 2020-11-09 09:47:36 +01:00
Alexis Engelke
7d7e72746e parseinstr: Split escape and opcode 2020-11-09 09:47:36 +01:00
Alexis Engelke
01e1587c5c decode: Move prefix before other opcode extensions 2020-11-09 09:47:36 +01:00
Alexis Engelke
2e7e396325 decode: Remove TABLE_PREFIX_REP and use NFx prefix 2020-11-09 09:47:36 +01:00
Alexis Engelke
69ce124354 encode: Add library for x86-64 encoding 2020-11-09 09:46:38 +01:00
Alexis Engelke
468eeaa249 parseinstrs: Create a separate class for parsed opcode 2020-07-05 14:57:22 +02:00
Alexis Engelke
9b6caeb2ae parseinstrs: Write mnemonics to separate file 2020-07-04 14:35:51 +02:00
Alexis Engelke
dc668691d8 instrs: Specify segment register size 2020-07-04 14:25:22 +02:00
Alexis Engelke
141680e77c instrs: Remove MUSTMEM, encode in operands 2020-07-04 14:24:56 +02:00
Alexis Engelke
da4ad137d8 instrs: Remove redundant IMM_8 2020-07-04 08:55:51 +02:00
Alexis Engelke
7ee9320840 decode: Add second fixed operand size 2020-06-30 22:07:18 +02:00
Alexis Engelke
08490d4503 parseinstrs: Simplify opkind lookup 2020-06-30 21:02:31 +02:00
Alexis Engelke
3221a319d3 instrs: Don't use O-encoding hack for FSTSW 2020-06-27 17:33:58 +02:00