On some ISAs like Intel's, all arithmetic instructions set all or some
of the CPU flags, so flag values can't be live across these
instructions. On ISAs like ARM's Aarch32, flags are clobbered by compact
16-bit encodings but not necessarily by 32-bit encodings of the same
instruction.
The "clobbers_flags" bit on the encoding recipe is used to indicate if
CPU flag values can be live across an instruction, or conversely whether
the encoding can be used where flag values are live.
The register allocator can't handle branches with constrained register
operands, and the brz.b1/brnz.b1 instructions only have the t8jccd_abcd
in 32-bit mode where no REX prefixes are possible.
This adds a worst case encoding for those cases where a b1 value lives
in a non-ABCD register.
Most recipes with an ABCD constraint can handle the full GPR register
class when a REX prefix is applied, but not all. The "icscc" macro
recipe always generates a setCC instruction with no REX prefix, so it
can only write the ABCD registers, even in its REX form.
Don't automatically rewrite ABCD constraints to GPR constraints when
applying a REX prefix to a tail recipe. Instead, allow individual ABCD
recipes to specify a "when_prefixed" alternative recipe to use. This
also eliminates the spurious Rex*abcd recipe names which didn't have an
ABCD constraint.
Also allow recipes to specify that a REX prefix is required by setting
the prefix_required flag. This is used by recipes like t8jccb which
explicitly accesses an 8-bit register with a GPR constraint which is
only valid with a prefix.
Some REX-less encodings require an ABCD input because they are looking
at 8-bit registers. This constraint doesn't apply with a REX prefix
where the low 8 bits of all registers are addressable.
The brz and brnz instructions get support for 32-bit jump displacements
for long range branches.
Also change the way branch ranges are specified on tail recipes for the
Intel instructions. All branch displacements are relative to the end of
the instruction, so just compute the branch range origin as the
instruction size instead of trying to specify it in the tail recipe
definitions.
This is used to represent the non-trapping semantics of the cvttss2si and
cvttsd2si instructions (and their vectorized counterparts).
The overflow behavior of this instruction is specific to the Intel ISAs.
There is no float-to-i64 instruction on the 32-bit Intel ISA.
Not all floating point condition codes are directly supported by the
ucimiss/ucomisd instructions. Some inequalities need to be reversed and
eq+ne require two separate tests.
To begin with, these are catch-all encodings with a SIB byte and a
32-bit displacement, so they can access any stack slot via both the
stack pointer and the frame pointer.
In the future, we will add encodings for 8-bit displacements as well as
EBP-relative references without a SIB byte.
Use these encodings to test trapz.b1 and trapnz.b1.
When a b1 value is stored in a register, only the low 8 bits are valid.
This is so we can use the various setCC instructions to generate the b1
registers.
The following instructions have simple encodings:
- bitcast.f32.i32
- bitcast.i32.f32
- bitcast.f64.i64
- bitcast.i64.f64
- fpromote.f64.f32
- fdemote.f32.f64
Also add helper functions enc_flt() and enc_i32_i64 to
intel.encodings.py for generating the common set of encodings for an
instruction: I32, I64 w/REX, I64 w/o REX.
These map to single Intel instructions.
The i64 to float conversions are not tested yet. The encoding tables
can't yet differentiate instructions on a secondary type variable alone.
This instruction returns a `b1` value which is represented as the output
of a setCC instruction which is the low 8 bits of a GPR register. Use a
cmp+setCC macro recipe to encode this. That is not ideal, but we can't
represent CPU flags yet.
Add instructions representing Intel's division instructions which use a
numerator that is twice as wide as the denominator and produce both the
quotient and remainder.
Add encodings for the x86_[su]divmodx instructions.
Change the result type for the bit-counting instructions from a fixed i8
to the iB type variable which is the type of the input. This matches the
convention in WebAssembly, and at least Intel's instructions will set a
full register's worth of count result, even if it is always < 64.
Duplicate the Intel 'ur' encoding recipe into 'umr' and 'urm' variants
corresponding to the RM and MR encoding variants. The difference is
which register is encoded as 'reg' and which is 'r/m' in the ModR/M
byte. A 'mov' register copy uses the MR variant, a unary popcnt uses the
RM variant.
Add a TailRecipe.rex() method which creates an encoding recipe with a
REX prefix.
Define I64 encodings with REX.W for i64 operations and with/without REX
for i32 ops. Only test the with-REX encodings for now. We don't yet have
an instruction shrinking pass that can select the non-REX encodings.
Use a PUT_OP macro in the TailRecipe Python class to replace the code
snippet that emits the prefixes + opcode part of the instruction encoding.
Prepare for the addition of REX prefixes by giving the PUT_OP functions
a third argument representing the REX prefix. For the non-REX encodings,
verify that no REX bits wold be needed.
Cretonne's encoding recipes need to have a fixed size so we can compute
accurate branch destination addresses. Intel's instruction encoding has
a lot of variance in the number of bytes needed to encode the opcode
which leads to a number of duplicated encoding recipes that only differ
in the opcode size.
Add an Intel-specific TailEnc Python class which represents an
abstraction over a set of recipes that are identical except for the
opcode encoding. The TailEnc can then generate specific encoding recipes
for each opcode format.
The opcode format is a prefix of the recipe name, so for example, the
'rr' TailEnc will generate the 'Op1rr', 'Op2rr', 'Mp2rr' etc recipes.
The TailEnc class provides a __call__ implementation that simply takes
the sequence of opcode bytes as arguments. It then looks up the right
prefix for the opcode bytes.
We don't support the full set of Intel addressing modes yet. So far we
have:
- Register indirect, no displacement.
- Register indirect, 8-bit signed displacement.
- Register indirect, 32-bit signed displacement.
The SIB addressing modes will need new Cretonne instruction formats to
represent.
These instructions have a fixed register constraint; the shift amount is
passed in CL.
Add meta language syntax so a fixed register can be specified as
"GPR.rcx".
Tabulate the Intel opcode representations and implement an OP() function
which computes the encoding bits.
Implement the single-byte opcode with a reg-reg ModR/M byte.