Commit Graph

80 Commits

Author SHA1 Message Date
Julian Seward
6f8a54b6a5 Adds support for legalizing CLZ, CTZ and POPCOUNT on baseline x86_64 targets.
Changes:

* Adds a new generic instruction, SELECTIF, that does value selection (a la
  conditional move) similarly to existing SELECT, except that it is
  controlled by condition code input and flags-register inputs.

* Adds a new Intel x86_64 variant, 'baseline', that supports SSE2 and
  nothing else.

* Adds new Intel x86_64 instructions BSR and BSF.

* Implements generic CLZ, CTZ and POPCOUNT on x86_64 'baseline' targets
  using the new BSR, BSF and SELECTIF instructions.

* Implements SELECTIF on x86_64 targets using conditional-moves.

* new test filetests/isa/intel/baseline_clz_ctz_popcount.cton
  (for legalization)

* new test filetests/isa/intel/baseline_clz_ctz_popcount_encoding.cton
  (for encoding)

* Allow lib/cretonne/meta/gen_legalizer.py to generate non-snake-caseified
  Rust without rustc complaining.

Fixes #238.
2018-02-06 09:43:00 -08:00
Tyler McMullen
14e39db428 Add filetest for statically out-of-bound heap addresses. 2018-01-18 15:49:10 -08:00
Tyler McMullen
df210bfdea Fix the Intel x64 PIC 'call' test, adding correct addend. 2018-01-18 14:23:00 -08:00
Dan Gohman
4f53cc1dad Align IntelGOTPCRel4 with R_X86_64_GOTPCREL.
Add an addend field to reloc_external, and use it to move the
responsibility for accounting for the difference between the end of an
instruction (where the PC is considered to be in PC-relative on intel)
and the beginning of the immediate field into the encoding code.

Specifically, this makes IntelGOTPCRel4 directly correspond to
R_X86_64_GOTPCREL, instead of also carrying an implicit `- 4`.
2017-12-15 16:17:32 -06:00
Dan Gohman
76e31cc1ad Rename GotPCRel4 to GOTPCRel4.
This emphasizes that GOT is being used as an abbreviation rather than
the word "got".
2017-12-15 16:17:32 -06:00
Pat Hickey
ed81bc21be filetests: add filetests for intel PIC encodings 2017-12-12 19:29:52 -08:00
Jakob Stoklund Olesen
a7eb13a151 Expand unknown instructions to runtime library calls. 2017-12-08 10:37:50 -08:00
Jakob Stoklund Olesen
f03729d742 Fix generated code for ISA predicates on encoding recipes.
The generated code had syntax errors and inverted logic.

Add an SSE 4.1 requirement to the floating point rounding instructions.
2017-12-08 10:37:50 -08:00
Tyler McMullen
7988d0c54c Add 8-bit variation of adjust_sp_imm for 32-bit and 64-bit Intel. 2017-12-05 11:49:12 -08:00
Tyler McMullen
5783ea2c9a Account for return address when reserving stack space for CSRs. 2017-12-05 11:49:12 -08:00
Tyler McMullen
a75248d2cf Move the initial stack pointer adjustment to after the CSR pushes. 2017-12-05 11:49:12 -08:00
Tyler McMullen
ebcbd54f61 Add 'compile' test and confirm the pro/epilogue is added. Fix regression this revealed. 2017-12-05 11:49:12 -08:00
Tyler McMullen
ced39f5186 Fix up adjust_sp_imm instruction.
* Use imm64 rather than offset32
* Add predicate to enforce signed 32-bit limit to imm
* Remove AdjustSpImm format
* Add encoding tests for adjust_sp_imm
* Adjust use of adjust_sp_imm in Intel prologue_epilogue to match
2017-12-05 11:49:12 -08:00
Tyler McMullen
1a11c351b5 Add tests and documentation for x86_(push|pop). Fix up encoding issues revealed by tests. 2017-12-05 11:49:12 -08:00
Tyler McMullen
3b1b33e0ac Add docs and tests for copy_special instruction. Fixes encoding issue that tests revealed. 2017-12-05 11:49:12 -08:00
Tyler McMullen
6ec4bfc4ca Fix up the encodings for new instructions, both expected and actual. Make the test more accurate. 2017-12-05 11:49:12 -08:00
Tyler McMullen
fdfe24760a Add missing newline to prologue epilogue test 2017-12-05 11:49:12 -08:00
Tyler McMullen
d4311d2b1d Add prologue-epilogue test that exercises new instructions and binary emission. 2017-12-05 11:49:12 -08:00
Pat Hickey
b5601d57c8 filetests: change hex function names to user function numbers 2017-11-23 14:08:47 -08:00
Dan Gohman
5d063eb8bc Merge reloc_func and reloc_globalsym into reloc_external. 2017-10-31 12:26:33 -07:00
Dan Gohman
9c54c3fff0 Introduce globalsym_addr.
This is an instruction used in legalization of GlobalVarData::Sym global
variables.
2017-10-30 13:26:56 -07:00
Dan Gohman
cb805f704d Put BaldrMonkey-specific behavior under a setting.
BaldrMonkey will need to enable allones_funcaddrs.
2017-10-30 13:26:56 -07:00
Jakob Stoklund Olesen
1b71285b34 Return bools in GPR registers.
Boolean types are returned in %rax, so regclass_for_abi_type() should
return GPR.

Fixes #179.
2017-10-25 13:34:55 -07:00
Jakob Stoklund Olesen
5d065c4d8f Add encodings for CPU flags instructions.
Branch on flags: brif, brff,
Compare integers to flags: ifcmp
Compare floats to flags: ffcmp
Convert flags to b1: trueif, trueff
2017-10-16 13:07:23 -07:00
Jakob Stoklund Olesen
ba52a38597 Add a t8jccd_long encoding recipe for brz.b1 and brnz.b1 in 32-bit mode.
The register allocator can't handle branches with constrained register
operands, and the brz.b1/brnz.b1 instructions only have the t8jccd_abcd
in 32-bit mode where no REX prefixes are possible.

This adds a worst case encoding for those cases where a b1 value lives
in a non-ABCD register.
2017-10-11 14:20:43 -07:00
Jakob Stoklund Olesen
73d4bb47c0 Intel encodings for regspill and regfill.
These are always SP-based.
2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen
ef048b8899 Allow for call args in incoming stack slots.
A value passed as an argument to a function call may live in an incoming
stack slot initially. Fix the call legalizer so it copies such an
argument into the expected outgoing stack slot for the call.
2017-10-03 11:13:46 -07:00
Jakob Stoklund Olesen
a274cdf275 Fix the Intel encoding of band_not.
The andnps instruction inverts its first argument while band_not inverts
is second argument.

Use a swapped-operands "fax" encoding recipe.
2017-09-27 18:14:13 -07:00
Jakob Stoklund Olesen
84471a8431 Add some very basic support for the Intel32 ABI.
In 32-bit mode, all function arguments are passed on the stack, not in
registers.

This ABI support is not complete or properly tested, but at least it
doesn't try to pass arguments in r8.
2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen
b6b474a8c9 Add Intel legalization for fmin and fmax.
The native x86_fmin and x86_fmax instructions don't behave correctly for
NaN inputs and when comparing +0.0 to -0.0, so we need separate branches
for those cases.
2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen
44eab3e158 Add Intel regmove encodings for floating point types. 2017-09-27 12:49:54 -07:00
Jakob Stoklund Olesen
1fe7890700 Add x86_fmin and x86_fmax instructions.
These Intel-specific instructions represent the semantics of the minss /
maxss Intel instructions which behave more like a C ternary operator
than the WebAssembly fmin and fmax instructions.

They will be used as building blocks for implementing the WebAssembly
semantics.
2017-09-27 09:17:09 -07:00
Jakob Stoklund Olesen
ac69f3bfdf Add an Intel-specific x86_cvtt2si instruction.
This is used to represent the non-trapping semantics of the cvttss2si and
cvttsd2si instructions (and their vectorized counterparts).

The overflow behavior of this instruction is specific to the Intel ISAs.

There is no float-to-i64 instruction on the 32-bit Intel ISA.
2017-09-26 15:44:41 -07:00
Jakob Stoklund Olesen
6ff681a90d Add general legalization for the select instruction. 2017-09-26 14:16:35 -07:00
Jakob Stoklund Olesen
ce767be703 Intel encodings for floating point copies. 2017-09-26 13:54:38 -07:00
Jakob Stoklund Olesen
7fb6159a85 Add Intel encodings for the fcmp instruction.
Not all floating point condition codes are directly supported by the
ucimiss/ucomisd instructions. Some inequalities need to be reversed and
eq+ne require two separate tests.
2017-09-26 11:17:32 -07:00
Jakob Stoklund Olesen
6bec5f8507 Intel encodings for nearest/floor/ceil/trunc.
These floating point rounding operations all use the roundss/roundsd
instructions that are available in SSE 4.1.
2017-09-25 15:08:04 -07:00
Jakob Stoklund Olesen
ac343ba92a Add encodings for square root instructions. 2017-09-25 13:15:09 -07:00
Jakob Stoklund Olesen
29dfcf5dfb Add spill/fill encodings for Intel ISAs.
To begin with, these are catch-all encodings with a SIB byte and a
32-bit displacement, so they can access any stack slot via both the
stack pointer and the frame pointer.

In the future, we will add encodings for 8-bit displacements as well as
EBP-relative references without a SIB byte.
2017-09-22 16:05:26 -07:00
Angus Holder
b003605132 Adapt intel to be able to correctly choose compressed instruction encodings: create a register class to identify the lower 8 registers, omit unnecessary REX prefixes, and fix the tests 2017-09-22 07:54:26 -07:00
Angus Holder
3b66c0be40 Emit compressed instruction encodings for instructions where constraints allow 2017-09-22 07:54:26 -07:00
Jakob Stoklund Olesen
e8723be33f Add trap codes to the Cretonne IL.
The trap and trapz/trapnz instructions now take a trap code immediate
operand which indicates the reason for trapping.
2017-09-20 15:50:02 -07:00
Jakob Stoklund Olesen
fb827a2d4b Add func_addr encodings for Intel. 2017-09-19 16:33:38 -07:00
Jakob Stoklund Olesen
1fdeddd0d3 Add Intel encodings for floating point load/store instructions.
Include wasm/*-memory64.cton tests too.
2017-09-19 09:32:54 -07:00
Jakob Stoklund Olesen
88348368a8 Add custom legalization for floating point constants.
Use the simplest expansion which materializes the bits of the floating
point constant as an integer and then bit-casts to the floating point
type. In the future, we may want to use constant pools instead. Either
way, we need custom legalization.

Also add a legalize_monomorphic() function to the Python targetISA class
which permits the configuration of a default legalization action for
monomorphic instructions, just like legalize_type() does for polymorphic
instructions.
2017-09-18 13:33:34 -07:00
Jakob Stoklund Olesen
446fcdd7c5 Fix the REX bits for load/store instruction encodings.
The two registers were swapped in the REX encoding, and the tests didn't
have any high bit set registers.
2017-09-15 13:02:36 -07:00
Jakob Stoklund Olesen
5845f56cda Add x86-64 encodings for call instructions. 2017-09-13 09:34:48 -07:00
Jakob Stoklund Olesen
e8276ed965 Add more heap expansion tests. 2017-08-28 15:47:21 -07:00
Jakob Stoklund Olesen
2201e6249e Add Intel encodings for brz.b1 and brnz.b1.
Use these encodings to test trapz.b1 and trapnz.b1.

When a b1 value is stored in a register, only the low 8 bits are valid.
This is so we can use the various setCC instructions to generate the b1
registers.
2017-08-28 14:56:11 -07:00
Jakob Stoklund Olesen
217434b474 Add custom legalization for conditional traps.
The expansion of these instructions requires the CFG to be modified,
something the Python XForms can't yet do.
2017-08-28 11:19:42 -07:00