The register allocator can't handle branches with constrained register
operands, and the brz.b1/brnz.b1 instructions only have the t8jccd_abcd
in 32-bit mode where no REX prefixes are possible.
This adds a worst case encoding for those cases where a b1 value lives
in a non-ABCD register.
A value passed as an argument to a function call may live in an incoming
stack slot initially. Fix the call legalizer so it copies such an
argument into the expected outgoing stack slot for the call.
In 32-bit mode, all function arguments are passed on the stack, not in
registers.
This ABI support is not complete or properly tested, but at least it
doesn't try to pass arguments in r8.
The native x86_fmin and x86_fmax instructions don't behave correctly for
NaN inputs and when comparing +0.0 to -0.0, so we need separate branches
for those cases.
These Intel-specific instructions represent the semantics of the minss /
maxss Intel instructions which behave more like a C ternary operator
than the WebAssembly fmin and fmax instructions.
They will be used as building blocks for implementing the WebAssembly
semantics.
This is used to represent the non-trapping semantics of the cvttss2si and
cvttsd2si instructions (and their vectorized counterparts).
The overflow behavior of this instruction is specific to the Intel ISAs.
There is no float-to-i64 instruction on the 32-bit Intel ISA.
Not all floating point condition codes are directly supported by the
ucimiss/ucomisd instructions. Some inequalities need to be reversed and
eq+ne require two separate tests.
To begin with, these are catch-all encodings with a SIB byte and a
32-bit displacement, so they can access any stack slot via both the
stack pointer and the frame pointer.
In the future, we will add encodings for 8-bit displacements as well as
EBP-relative references without a SIB byte.
Use the simplest expansion which materializes the bits of the floating
point constant as an integer and then bit-casts to the floating point
type. In the future, we may want to use constant pools instead. Either
way, we need custom legalization.
Also add a legalize_monomorphic() function to the Python targetISA class
which permits the configuration of a default legalization action for
monomorphic instructions, just like legalize_type() does for polymorphic
instructions.
Use these encodings to test trapz.b1 and trapnz.b1.
When a b1 value is stored in a register, only the low 8 bits are valid.
This is so we can use the various setCC instructions to generate the b1
registers.
The expansion of a heap_addr instruction depends on the type of heap and
its configuration, so this is handled by custom code.
Add a couple examples of heap access code to the language reference
manual.
The code to compute the address of a global variable depends on the kind
of variable, so custom legalization is required.
- Add a legalizer::globalvar module which exposes an
expand_global_addr() function. This module is likely to grow as we add
more types of global variables.
- Add a ArgumentPurpose::VMContext enumerator. This is used to represent
special 'vmctx' arguments that are used as base pointers for vmctx
globals.
A CallConv enum on every function signature makes it possible to
generate calls to functions with different calling conventions within
the same ISA / within a single function.
The calling conventions also serve as a way of customizing Cretonne's
behavior when embedded inside a VM. As an example, the SpiderWASM
calling convention is used to compile WebAssembly functions that run
inside the SpiderMonkey virtual machine.
All function signatures must have a calling convention at the end, so
this changes the textual IL syntax.
Before:
sig1 = signature(i32, f64) -> f64
After
sig1 = (i32, f64) -> f64 native
sig2 = (i32) spiderwasm
When printing functions, the signature goes after the return types:
function %r1() -> i32, f32 spiderwasm {
ebb1:
...
}
In the parser, this calling convention is optional and defaults to
"native". This is mostly to avoid updating all the existing test cases
under filetests/. When printing a function, the calling convention is
always included, including for "native" functions.
* Added Intel x86-64 encodings for 64bit loads and store instructions
* Using GPR registers instead of ABCD for istore8 with REX prefix
Fixed testing of 64bit intel encoding
* Emit REX and REX-less encodings for optional REX prefix
Value renumbering in binary64.cton
The following instructions have simple encodings:
- bitcast.f32.i32
- bitcast.i32.f32
- bitcast.f64.i64
- bitcast.i64.f64
- fpromote.f64.f32
- fdemote.f32.f64
Also add helper functions enc_flt() and enc_i32_i64 to
intel.encodings.py for generating the common set of encodings for an
instruction: I32, I64 w/REX, I64 w/o REX.
The encoding tables are keyed by the controlling type variable only. We
need to distinguish different encodings for instructions with multiple
type variables.
Add a TypePredicate instruction predicate which can check the type of an
instruction value operand. Combine type checks into the instruction
predicate for instructions with more than one type variable.
Add Intel encodings for fcvt_from_sint.f32.i64 which can now be
distinguished from fcvt_from_sint.f32.i32.
These map to single Intel instructions.
The i64 to float conversions are not tested yet. The encoding tables
can't yet differentiate instructions on a secondary type variable alone.
This instruction returns a `b1` value which is represented as the output
of a setCC instruction which is the low 8 bits of a GPR register. Use a
cmp+setCC macro recipe to encode this. That is not ideal, but we can't
represent CPU flags yet.
Add instructions representing Intel's division instructions which use a
numerator that is twice as wide as the denominator and produce both the
quotient and remainder.
Add encodings for the x86_[su]divmodx instructions.
Change the result type for the bit-counting instructions from a fixed i8
to the iB type variable which is the type of the input. This matches the
convention in WebAssembly, and at least Intel's instructions will set a
full register's worth of count result, even if it is always < 64.
Duplicate the Intel 'ur' encoding recipe into 'umr' and 'urm' variants
corresponding to the RM and MR encoding variants. The difference is
which register is encoded as 'reg' and which is 'r/m' in the ModR/M
byte. A 'mov' register copy uses the MR variant, a unary popcnt uses the
RM variant.