349 Commits

Author SHA1 Message Date
Dan Gohman
5f8b1b9f04 Fix a flake8 lint. 2017-10-31 13:05:26 -07:00
Dan Gohman
5d063eb8bc Merge reloc_func and reloc_globalsym into reloc_external. 2017-10-31 12:26:33 -07:00
Dan Gohman
b60b2ce135 Change parse_multiline to follow PEP 257.
The main change is that it avoids creating blank lines when processing
docstrings.

This also adds blank lines in various places to make the generated code
prettier.
2017-10-31 12:21:23 -07:00
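
The docstring handling described in b60b2ce135 can be illustrated with a minimal sketch of the trimming rules from PEP 257. The function name `trim_docstring` is hypothetical; this is not the project's actual `parse_multiline`, only an approximation of the same convention.

```python
def trim_docstring(docstring):
    """Return the lines of a docstring, trimmed roughly as PEP 257 describes."""
    if not docstring:
        return []
    lines = docstring.expandtabs().splitlines()
    # Smallest indentation among the non-blank lines after the first one.
    indents = [len(l) - len(l.lstrip()) for l in lines[1:] if l.strip()]
    indent = min(indents) if indents else 0
    # The first line has no meaningful indentation; strip it completely.
    trimmed = [lines[0].strip()]
    trimmed += [l[indent:].rstrip() for l in lines[1:]]
    # Drop blank lines at the end and at the start.
    while trimmed and not trimmed[-1]:
        trimmed.pop()
    while trimmed and not trimmed[0]:
        trimmed.pop(0)
    return trimmed
```

Blank lines inside the docstring are kept, but the leading and trailing ones that a triple-quoted string usually carries are removed.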
Dan Gohman
9c54c3fff0 Introduce globalsym_addr.
This is an instruction used in legalization of GlobalVarData::Sym global
variables.
2017-10-30 13:26:56 -07:00
Dan Gohman
cb805f704d Put BaldrMonkey-specific behavior under a setting.
BaldrMonkey will need to enable allones_funcaddrs.
2017-10-30 13:26:56 -07:00
Dan Gohman
fae5ffb556 Make generated code more consistent with current rustfmt. 2017-10-30 10:06:23 -07:00
Jakob Stoklund Olesen
02e81dd1d7 Fix build after flake8 update.
There's a new version of flake8 out which doesn't like variable names
i, l, I.

No functional change intended.
2017-10-25 11:40:37 -07:00
Jakob Stoklund Olesen
e8ecf1f809 Add a FixedTied constraint kind for operand constraints.
Fixes #175.

The Intel division instructions have fixed input operands that are
clobbered by fixed output operands, so the value passed as an input will
be clobbered just like a tied operand.

The FixedTied operand constraint is used to indicate a fixed input
operand that has a corresponding output operand with the same fixed
register.

Teach the spiller to treat a FixedTied operand the same as a Tied
operand constraint and make sure that the input value is killed by the
instruction.
2017-10-25 11:22:20 -07:00
Dan Gohman
fc0671a0cf Avoid dangling references to block params when sealing an unreachable block. 2017-10-25 10:04:18 -07:00
Jakob Stoklund Olesen
ea68a69f8b Fix a flake8 lint.
Also don't infer writes_cpu_flags if it is specified explicitly.
2017-10-19 16:17:09 -07:00
Dan Gohman
7c9b9e3d27 Mark spill and fill as can_store and can_load.
This allows GVN to avoid hoisting them. These will be too coarse for
things that want more precise dependence information, however we can
work that out when we build such things.
2017-10-19 13:11:33 -07:00
Dan Gohman
cc0bb70c5d Make GVN aware of instructions that write to CPU flags. 2017-10-19 12:59:10 -07:00
Dan Gohman
3ccee371a7 Remove the todo for smod.
It's not present in either WebAssembly or Rust, for example. We can
still add smod in the future if future use cases need it.
2017-10-19 12:59:10 -07:00
Dan Gohman
55bc368bf8 Remove minnum/maxnum. 2017-10-18 15:44:17 -07:00
Jakob Stoklund Olesen
c3446ee472 Add CPU flags value types to the language reference manual.
Clean up a few other things in the value types section too.
2017-10-18 15:07:19 -07:00
Dan Gohman
35989f4069 Tidy up unneeded references. 2017-10-17 11:48:57 -07:00
Dan Gohman
e6c6f09e41 Tidy some formatting in the generated legalizer.rs. 2017-10-17 11:48:57 -07:00
Jakob Stoklund Olesen
620eb7effe Add a "clobbers_flags" flag to encoding recipes.
On some ISAs like Intel's, all arithmetic instructions set all or some
of the CPU flags, so flag values can't be live across these
instructions. On ISAs like ARM's AArch32, flags are clobbered by compact
16-bit encodings but not necessarily by 32-bit encodings of the same
instruction.

The "clobbers_flags" bit on the encoding recipe is used to indicate if
CPU flag values can be live across an instruction, or conversely whether
the encoding can be used where flag values are live.
2017-10-16 14:40:28 -07:00
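
The rule in 620eb7effe amounts to a simple legality check when choosing an encoding. A minimal sketch, assuming a hypothetical `Recipe` record and made-up recipe names (the real recipe definitions are not shown in this log):

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    name: str             # hypothetical recipe name
    clobbers_flags: bool  # True if the encoded instruction overwrites CPU flags


def usable(recipe, flags_live):
    """A flag-clobbering recipe cannot be used while a flag value is live."""
    return not (recipe.clobbers_flags and flags_live)


def pick_recipe(recipes, flags_live):
    """Return the first usable recipe in preference order, or None."""
    for recipe in recipes:
        if usable(recipe, flags_live):
            return recipe
    return None
```

With a compact flag-clobbering recipe listed before a wider one that preserves flags, the compact form is chosen unless a flag value is live across the instruction.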
Jakob Stoklund Olesen
5d065c4d8f Add encodings for CPU flags instructions.
Branch on flags: brif, brff,
Compare integers to flags: ifcmp
Compare floats to flags: ffcmp
Convert flags to b1: trueif, trueff
2017-10-16 13:07:23 -07:00
Jakob Stoklund Olesen
0f4f663584 Add register banks for CPU flags to Intel and ARM ISAs.
The arm32 ISA technically has separate floating point and integer flags,
but the only useful thing you can do with the floating point flags is to
copy them to the integer flags, so there is no need to model them.

The arm64 ISA fixes this and the fcmp instruction writes the integer
nzcv flags directly.

RISC-V does not have CPU flags.
2017-10-13 14:02:09 -07:00
Jakob Stoklund Olesen
1dbc55dadf Add a pressure_tracking flag to register banks.
This makes it possible to define register banks that opt out of register
pressure tracking. This will be used to define banks for special-purpose
registers like the CPU flags.

The pressure tracker does not need to use resources for a top-level
register class in a non-tracked bank. The constant MAX_TOPRCS is renamed
to MAX_TRACKED_TOPRCS to indicate that there may be top-level register
classes with higher numbers, but they won't require pressure tracking.

We won't be tracking register pressure for CPU flags since only one
value is allowed to be live at a time.
2017-10-13 13:46:16 -07:00
Jakob Stoklund Olesen
1f98fc491c Add instructions using CPU flags.
Add integer and floating comparison instructions that return CPU flags:
ifcmp, ifcmp_imm, and ffcmp.

Add conditional branch instructions that check CPU flags: brif, brff

Add instructions that check a condition in the CPU flags and return a
b1: trueif, trueff.
2017-10-12 19:12:28 -07:00
Jakob Stoklund Olesen
15461c1e4b Add two new value types: iflags and fflags.
These two value types represent the state of CPU flags after an integer
comparison and a floating point comparison respectively.

Instructions using these types TBD.
2017-10-12 19:05:24 -07:00
Jakob Stoklund Olesen
dbaa919ca9 Make room for SpecialType in the value type numbering.
The value types are now classified into three groups:

1. Lane types are scalar types that can also be used to form vectors.
2. Vector types are 2-256 copies of a lane type.
3. Special types. This is where the CPU flag types will go.

The special types can't be used to form vectors.

Change the numbering scheme for value types to make room for the special
types and add `is_lane()` and `is_special()` classification methods.

The VOID type still has number 0, but it can no longer appear as a
vector lane. It classifies as special now.
2017-10-12 12:48:55 -07:00
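
The three-way classification in dbaa919ca9 can be sketched with a small numbering scheme. The concrete constants below (`LANE_BASE`, `VECTOR_BASE`, and the 4-bit lane field) are assumptions chosen for illustration; the commit message only states that VOID is 0 and classifies as special.

```python
VOID = 0            # stated in the commit: VOID keeps number 0 and is special
LANE_BASE = 0x70    # assumed start of the lane type range
VECTOR_BASE = 0x80  # assumed start of the vector type range


def is_special(t):
    return t < LANE_BASE


def is_lane(t):
    return LANE_BASE <= t < VECTOR_BASE


def is_vector(t):
    return t >= VECTOR_BASE


def by(lane, lanes):
    """Form a vector type with `lanes` copies (2..256, power of two) of `lane`."""
    if not is_lane(lane):
        raise ValueError("only lane types can be used to form vectors")
    if lanes < 2 or lanes > 256 or lanes & (lanes - 1):
        raise ValueError("lane count must be a power of two in 2..256")
    log2_lanes = lanes.bit_length() - 1
    return lane + (log2_lanes << 4)


def lane_type(t):
    """Return the lane type of a vector type, or the type itself otherwise."""
    return LANE_BASE + (t & 0x0F) if is_vector(t) else t
```

Under these assumptions, special types such as VOID are rejected by `by()`, which matches the rule that special types can't be used to form vectors.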
Jakob Stoklund Olesen
89a24b2f13 Rename ScalarType to LaneType.
The word "scalar" is a bit vague and tends to mean "non-vector". Since
we are about to add new CPU flag value types that can't appear as vector
lanes, make the distinction clear: LaneType represents value types that
can appear as a vector lane.

Also replace the Type::is_scalar() method with an is_vector() method.
2017-10-12 10:39:12 -07:00
Jakob Stoklund Olesen
ba52a38597 Add a t8jccd_long encoding recipe for brz.b1 and brnz.b1 in 32-bit mode.
The register allocator can't handle branches with constrained register
operands, and the brz.b1/brnz.b1 instructions only have the t8jccd_abcd
in 32-bit mode where no REX prefixes are possible.

This adds a worst case encoding for those cases where a b1 value lives
in a non-ABCD register.
2017-10-11 14:20:43 -07:00
Jakob Stoklund Olesen
ece09f2df2 Add encodings for spill.b1, fill.b1 etc.
These spills and fills use 32-bit writes, knowing that the spill slot is
minimum 4 bytes which makes it safe.

Also simplify the definition of load/store encodings a bit by
introducing loops.
2017-10-11 14:20:43 -07:00
Jakob Stoklund Olesen
ecd537ecd6 Avoid widening TailRecipe register constraints automatically.
Most recipes with an ABCD constraint can handle the full GPR register
class when a REX prefix is applied, but not all. The "icscc" macro
recipe always generates a setCC instruction with no REX prefix, so it
can only write the ABCD registers, even in its REX form.

Don't automatically rewrite ABCD constraints to GPR constraints when
applying a REX prefix to a tail recipe. Instead, allow individual ABCD
recipes to specify a "when_prefixed" alternative recipe to use. This
also eliminates the spurious Rex*abcd recipe names which didn't have an
ABCD constraint.

Also allow recipes to specify that a REX prefix is required by setting
the prefix_required flag. This is used by recipes like t8jccb which
explicitly accesses an 8-bit register with a GPR constraint which is
only valid with a prefix.
2017-10-09 14:08:37 -07:00
Jakob Stoklund Olesen
73d4bb47c0 Intel encodings for regspill and regfill.
These are always SP-based.
2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen
dda3efcbdd Add regspill and regfill instructions.
These are parallels to the existing regmove instruction, but they divert
the value to and from a stack slot.

Like regmove diversions, this is a temporary diversion that must be
local to the EBB.
2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen
ce4d723a73 Give RegClassData a reference to its parent RegInfo.
This makes it possible to materialize new RegClass references without
requiring a RegInfo reference to be passed around.

- Move the RegInfo::toprc() method to RegClassData.
- Rename RegClassData::intersect() to intersect_index() and provide a
  new intersect() which returns a register class.
- Remove some &RegInfo parameters that are no longer needed.
2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen
7c023b2430 Don't omit the controlling typevar for instructions without results.
The controlling type variable passed to the format constructor in the
InstBuilder trait is not just used to generate the result values. In an
EncCursor, it is also used to encode the instruction, so VOID doesn't
work.
2017-10-03 13:39:43 -07:00
Jakob Stoklund Olesen
e10b3117cb Rename enc_flt() to enc_both().
This encoding method is not only used for floating point instructions.
2017-10-03 13:27:00 -07:00
Jakob Stoklund Olesen
c82e68efea Eliminate the ABCD register class constraint in REX encodings.
Some REX-less encodings require an ABCD input because they are looking
at 8-bit registers. This constraint doesn't apply with a REX prefix
where the low 8 bits of all registers are addressable.
2017-09-29 15:29:25 -07:00
Jakob Stoklund Olesen
51a6901a7f Implement coloring::iterate_solution().
It can happen that the currently live registers are blocking a smaller
register class completely, so the only way of solving the allocation
problem is to turn some of the live-through registers into solver
variables.

When the quick_solve attempt fails, try to free up registers in the
critical register class by turning live-through values into solver
variables.
2017-09-29 14:55:35 -07:00
Jakob Stoklund Olesen
86e22e7de5 Add long-range encodings for conditional branches.
The brz and brnz instructions get support for 32-bit jump displacements
for long range branches.

Also change the way branch ranges are specified on tail recipes for the
Intel instructions. All branch displacements are relative to the end of
the instruction, so just compute the branch range origin as the
instruction size instead of trying to specify it in the tail recipe
definitions.
2017-09-29 13:18:29 -07:00
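
The displacement rule from 86e22e7de5 can be written out as a short sketch. The function names and the byte offsets used with them are hypothetical; only the "relative to the end of the instruction" rule comes from the commit message.

```python
def branch_disp(inst_offset, inst_size, target_offset):
    """Displacement measured from the end of the branch instruction."""
    return target_offset - (inst_offset + inst_size)


def fits_signed(value, bits):
    """True if `value` is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def needs_long_encoding(inst_offset, short_size, target_offset):
    """True if the 8-bit displacement form cannot reach the target."""
    return not fits_signed(branch_disp(inst_offset, short_size, target_offset), 8)
```

Because the origin is always the instruction's end, the range check only needs the instruction size, which is why the tail recipes no longer have to specify it separately.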
Jakob Stoklund Olesen
711e5cd644 Handle srem INT_MIN, -1 correctly.
The x86_divmodx traps on integer overflow, but the srem instruction is
not supposed to trap with a -1 divisor.

Generate a legalization expansion for srem that special-cases the -1
divisor to simply return 0.
2017-09-29 08:53:49 -07:00
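
The special case in 711e5cd644 can be modeled in a few lines. This is a sketch of the semantics only, for 32-bit values; `trapping_divmod` is a hypothetical stand-in for the trapping x86 division, and the generated legalization code is not shown in this log.

```python
INT_MIN = -(1 << 31)


def trapping_divmod(x, y):
    """Model of a signed divide that traps on zero divisor and on overflow."""
    if y == 0:
        raise ZeroDivisionError("division by zero")
    if x == INT_MIN and y == -1:
        raise OverflowError("integer overflow")
    # Truncating division, as done by hardware (Python's // floors instead).
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return q, x - q * y


def srem(x, y):
    """Signed remainder that must not trap for a -1 divisor."""
    if y == -1:
        return 0  # x rem -1 is always 0, including for INT_MIN
    return trapping_divmod(x, y)[1]
```

The early return avoids ever handing `INT_MIN, -1` to the trapping operation.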
Jakob Stoklund Olesen
8abcdac5a1 Legalize fcvt_to_sint and fcvt_to_uint for Intel64.
We need to generate traps on NaN and overflow.
2017-09-28 12:00:38 -07:00
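
One way to picture the legalization in 8abcdac5a1 is as a wrapper around a non-trapping conversion. The sketch below models the 32-bit signed case; `Trap` and the exact checks are assumptions for illustration, not the generated code. The model of `x86_cvtt2si` returns INT_MIN for NaN and out-of-range inputs, as the cvttss2si/cvttsd2si instructions do.

```python
import math

INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


class Trap(Exception):
    """Hypothetical stand-in for a machine trap."""


def x86_cvtt2si(x):
    """Non-trapping truncating conversion: INT_MIN signals NaN or overflow."""
    if math.isnan(x) or math.isinf(x):
        return INT_MIN
    t = math.trunc(x)
    return t if INT_MIN <= t <= INT_MAX else INT_MIN


def fcvt_to_sint(x):
    """Trapping conversion built on top of the non-trapping one."""
    r = x86_cvtt2si(x)
    if r == INT_MIN:
        # INT_MIN is ambiguous: a genuine result or a failure indicator.
        if math.isnan(x):
            raise Trap("bad conversion to integer")
        if not (float(INT_MIN) - 1.0 < x <= float(INT_MIN)):
            raise Trap("integer overflow")
    return r
```

Only the INT_MIN result needs the extra checks, so the common path stays a single conversion.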
Jakob Stoklund Olesen
34146435e5 Legalize unsigned-to-float conversions for Intel 64.
Also make sure we generate type checks for the controlling type variable
in legalization patterns. This is not needed for encodings since the
encoding tables are already keyed on the controlling type variable.
2017-09-28 11:39:19 -07:00
Jakob Stoklund Olesen
a274cdf275 Fix the Intel encoding of band_not.
The andnps instruction inverts its first argument while band_not inverts
its second argument.

Use a swapped-operands "fax" encoding recipe.
2017-09-27 18:14:13 -07:00
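
The operand mismatch fixed in a274cdf275 is easy to see on raw bit patterns. A minimal sketch on 32-bit integers; the helper names are hypothetical and the functions model only the bitwise semantics, not the encoding recipe.

```python
MASK = 0xFFFFFFFF  # model 32-bit lanes as unsigned integers


def band_not(x, y):
    """IR semantics: invert the second argument."""
    return x & ~y & MASK


def andnps(a, b):
    """Intel semantics: invert the first argument."""
    return ~a & b & MASK


def band_not_via_andnps(x, y):
    """Swap the operands so the inverted one lands in the first position."""
    return andnps(y, x)
```

Passing the operands unswapped computes `~x & y` instead, which is the bug the commit describes.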
Jakob Stoklund Olesen
b6b474a8c9 Add Intel legalization for fmin and fmax.
The native x86_fmin and x86_fmax instructions don't behave correctly for
NaN inputs and when comparing +0.0 to -0.0, so we need separate branches
for those cases.
2017-09-27 12:55:34 -07:00
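
The difference that b6b474a8c9 works around can be sketched as follows. `x86_fmin` models the minss/minsd behavior (a plain "less than" select that returns the second operand otherwise); `fmin` shows the NaN-propagating, signed-zero-aware result that the legalization has to produce. The control flow here is an illustration, not the actual expansion.

```python
import math


def x86_fmin(a, b):
    """Like `a < b ? a : b`: returns b for NaN inputs and for equal zeros."""
    return a if a < b else b


def fmin(a, b):
    """Minimum that propagates NaN and treats -0.0 as less than +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:
        # Covers +0.0 vs -0.0: prefer the negative zero if either is one.
        return a if math.copysign(1.0, a) < 0 else b
    return x86_fmin(a, b)
```

The two special cases correspond to the separate branches mentioned in the commit message; every other input goes straight to the native operation.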
Jakob Stoklund Olesen
384b04b411 Fix some misnamed TailRecipes and add a consistency check. 2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen
44eab3e158 Add Intel regmove encodings for floating point types. 2017-09-27 12:49:54 -07:00
Jakob Stoklund Olesen
1fe7890700 Add x86_fmin and x86_fmax instructions.
These Intel-specific instructions represent the semantics of the minss /
maxss Intel instructions which behave more like a C ternary operator
than the WebAssembly fmin and fmax instructions.

They will be used as building blocks for implementing the WebAssembly
semantics.
2017-09-27 09:17:09 -07:00
Jakob Stoklund Olesen
ac69f3bfdf Add an Intel-specific x86_cvtt2si instruction.
This is used to represent the non-trapping semantics of the cvttss2si and
cvttsd2si instructions (and their vectorized counterparts).

The overflow behavior of this instruction is specific to the Intel ISAs.

There is no float-to-i64 instruction on the 32-bit Intel ISA.
2017-09-26 15:44:41 -07:00
Jakob Stoklund Olesen
6ff681a90d Add general legalization for the select instruction. 2017-09-26 14:16:35 -07:00
Jakob Stoklund Olesen
ce767be703 Intel encodings for floating point copies. 2017-09-26 13:54:38 -07:00
Jakob Stoklund Olesen
7fb6159a85 Add Intel encodings for the fcmp instruction.
Not all floating point condition codes are directly supported by the
ucomiss/ucomisd instructions. Some inequalities need to be reversed and
eq+ne require two separate tests.
2017-09-26 11:17:32 -07:00
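
The restrictions noted in 7fb6159a85 follow from how ucomiss/ucomisd set the flags. The sketch below models the (ZF, PF, CF) result and shows why "less than" is computed with reversed operands and why eq/ne need two flag tests. The Python function names are hypothetical.

```python
import math


def ucomis(a, b):
    """Return (ZF, PF, CF) as set by ucomiss/ucomisd for `a` compared to `b`."""
    if math.isnan(a) or math.isnan(b):
        return (1, 1, 1)  # unordered
    if a > b:
        return (0, 0, 0)
    if a < b:
        return (0, 0, 1)
    return (1, 0, 0)      # equal


def fcmp_gt(a, b):
    zf, pf, cf = ucomis(a, b)
    return zf == 0 and cf == 0  # "above" is false for unordered inputs


def fcmp_lt(a, b):
    # CF=1 alone would also be true for unordered, so reverse the operands.
    return fcmp_gt(b, a)


def fcmp_eq(a, b):
    zf, pf, cf = ucomis(a, b)
    return zf == 1 and pf == 0  # two tests: equal and not unordered


def fcmp_ne(a, b):
    zf, pf, cf = ucomis(a, b)
    return zf == 0 or pf == 1   # two tests: not equal or unordered
```

Since an unordered comparison sets all three flags, any condition that tests for a set flag has to rule out the NaN case separately.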
Jakob Stoklund Olesen
79968a2325 Add standard expansions for fcopysign.
This is also just a sign bit manipulation.
2017-09-25 15:17:32 -07:00
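
The sign bit manipulation behind 79968a2325 can be shown on 64-bit floats. A minimal sketch using bit casts; the helper names are hypothetical, and the actual expansion works on IR instructions rather than Python values.

```python
import struct

SIGN = 1 << 63  # sign bit of an IEEE 754 binary64 value


def f64_bits(x):
    """Reinterpret a float as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_f64(b):
    """Reinterpret a 64-bit pattern as a float."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def fcopysign(x, y):
    """Magnitude bits of x combined with the sign bit of y."""
    return bits_f64((f64_bits(x) & (SIGN - 1)) | (f64_bits(y) & SIGN))
```

Only bitwise AND and OR with constant masks are involved, so no floating point arithmetic is needed.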
Jakob Stoklund Olesen
6bec5f8507 Intel encodings for nearest/floor/ceil/trunc.
These floating point rounding operations all use the roundss/roundsd
instructions that are available in SSE 4.1.
2017-09-25 15:08:04 -07:00
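
The four operations in 6bec5f8507 differ only in the rounding mode selected by the low two bits of the roundss/roundsd immediate. A sketch of that mapping, with the Python names and the value-level model being illustrative assumptions:

```python
import math

# Low two bits of the SSE 4.1 rounding immediate.
ROUND_IMM = {"nearest": 0b00, "floor": 0b01, "ceil": 0b10, "trunc": 0b11}


def roundsd(x, imm):
    """Model of rounding to an integral value under the selected mode."""
    if math.isnan(x) or math.isinf(x):
        return x
    mode = imm & 0b11
    if mode == 0b00:
        r = round(x)       # Python rounds halves to even, like "nearest"
    elif mode == 0b01:
        r = math.floor(x)
    elif mode == 0b10:
        r = math.ceil(x)
    else:
        r = math.trunc(x)
    # Keep the sign of the input so that e.g. ceil(-0.3) yields -0.0.
    return math.copysign(float(r), x)
```

For example, `roundsd(2.5, ROUND_IMM["nearest"])` rounds to the even neighbor rather than away from zero.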