Commit Graph

1411 Commits

Author SHA1 Message Date
Alex Crichton
92394566fc x64: Migrate fabs and bnot vector operations to ISLE
This was my first attempt at transitioning code to ISLE to originally
fix #3327 but that fix has since landed on `main`, so this is instead
now just porting a few operations to ISLE.

Closes #3336
2021-11-16 07:36:49 -08:00
Johnnie Birch
5d5629de60 Fix for issue 3327. Updates Bnot to handle case for NaN float 2021-11-15 18:47:23 -08:00
Nick Fitzgerald
b38a96955c Merge pull request #3506 from fitzgen/isle
Initial ISLE integration for x64
2021-11-15 15:38:09 -08:00
Nick Fitzgerald
4a34d2c55b Remove old ISLE-generated file 2021-11-15 11:14:06 -08:00
Alex Crichton
1548ca3c47 Disable check_label_branch_invariants in fuzzing
This commit disables the `MachBuffer::check_label_branch_invariants`
debug check on the fuzzers due to it causing timeouts with the test case
from #3441. Fuzzing leads to a 20-30x slowdown of executed code and
locally the fuzz time it takes to instantiate #3441 drops from 3 minutes
to 6 seconds disabling this function. Note that this should still be
executed during our testing on CI since it's still enabled for debug
assertions.
2021-11-15 07:34:09 -08:00
Nick Fitzgerald
bfbf2f2f49 ISLE: implement x64 lowering for band_not in ISLE 2021-11-11 15:19:28 -08:00
Nick Fitzgerald
33fcd6b4a5 x64: special case 0 to use xor in Inst::gen_constant for i128s 2021-11-10 15:57:58 -08:00
Nick Fitzgerald
b5105c025c MachInst: always rematerialize constants, rather than assign them registers
There were a few previous code paths that attempted to handle this, but this new
check handles it for all callers.

Rematerializing constants, rather than assigning and reusing a register, allows
for lower register pressure.
2021-11-10 15:45:43 -08:00
Nick Fitzgerald
bbb4949128 ISLE: use lower_to_amode inside sink_load implementation 2021-11-10 13:21:48 -08:00
Nick Fitzgerald
b8494822dc ISLE: finish porting imul lowering to ISLE 2021-11-05 15:41:24 -07:00
Nick Fitzgerald
30d206779e x64: Remove some unreachable code that's been ported to ISLE 2021-11-05 14:28:03 -07:00
Chris Fallin
5e96a447f0 Add back the ifcmp_sp CLIF opcode.
This opcode was removed as part of the old-backend cleanup in #3446.
While this opcode will definitely go away eventually, it is
unfortunately still used today in Lucet (as we just discovered while
working to upgrade Lucet's pinned Cranelift version). Lucet is
deprecated and slated to eventually be completely sunset in favor of
Wasmtime; but until that happens, we need to keep this opcode.
2021-11-01 13:34:31 -07:00
Chris Fallin
9e7760bd83 Merge pull request #3497 from bjorn3/misc_meta_simplifications
Misc meta crate cleanups
2021-11-01 11:24:44 -07:00
bjorn3
2fbd57e9e2 Remove imm_with_name
It is only used once to rename an imm field to mask
2021-10-31 19:57:04 +01:00
bjorn3
e4d42a1be4 Move arg unpacking for all remaining expand functions to simple_legalize 2021-10-31 18:26:25 +01:00
bjorn3
ce3175993a Inline expand_br_icmp, expand_stack_load and expand_stack_store 2021-10-31 18:00:39 +01:00
bjorn3
96b14ed8eb Match on InstructionData instead of Opcode
This should be faster by avoiding matching on InstructionData to get the
Opcode, then on Opcode and then finally on the InstructionData again to
get the arguments
2021-10-31 17:52:43 +01:00
Nick Fitzgerald
e3a423c3e8 Remove some uses of foo.expect(&format!(...)) pattern
This eagerly evaluates the `format!` and produces a `String` with a heap
allocation, regardless whether `foo` is `Some`/`Ok` or `None`/`Err`. Using
`foo.unwrap_or_else(|| panic!(...))` makes it so that the error message
formatting is only evaluated if `foo` is `None`/`Err`.
2021-10-25 17:35:23 -07:00
Chris Fallin
472b1b2e8a Avoid quadratic behavior in pathological label-alias case in MachBuffer.
Fixes #3468.

If a program has many instances of the pattern "goto next; next:" in a
row (i.e., no-op branches to the fallthrough address), the branch
simplification in `MachBuffer` would remove them all, as expected.
However, in order to work correctly, the algorithm needs to track all
labels that alias the current buffer tail, so that they can be adjusted
later if another branch chomp occurs.

When many thousands of this branch-to-next pattern occur, many thousands
of labels will reference the current buffer tail, and this list of
thousands of labels will be shuffled between the branch metadata struct
and the "labels at tail" struct as branches are appended and then
chomped immediately.

It's possible that with smarter data structure design, we could somehow
share the list of labels -- e.g., a single array of all labels, in order
they are bound, with ranges of indices in this array used to represent
lists of labels (actually, that seems like a better design in general);
but let's leave that to future optimization work.

For now, we can avoid the quadratic behavior by just "giving up" if the
list is too long; it's always valid to not optimize a branch. It is very
unlikely that the "normal" case will have more than 100 "goto next"
branches in a row, so this should not have any perf impact; if it does,
we will leave 1 out of every 100 such branches un-optimized in a long
sequence of thousands.

This takes total compilation time down on my machine from ~300ms to
~72ms for the `foo.wasm` case in #3441. For reference, the old backend
(now removed), built from arbitrarily-chosen-1-year-old commit
`c7fcc344`, takes 158ms, so we're ~twice as fast, which is what I would
expect.
2021-10-21 12:07:39 -07:00
Alex Crichton
e8d3b8e3ea Fix an off-by-two condition in heap legalization (#3462)
This commit fixes an issue in Cranelift where legalization of
`heap_addr` instructions (used by wasm to represent heap accesses) could
be off-by-two where loads that should be valid were actually treated as
invalid. The bug here happened in an optimization where tests against
odd constants were being altered to tests against even constants by
subtracting one from the limit instead of adding one to the limit. The
comment around this area has been updated in accordance with a little
more math-stuff as well to help future readers.
2021-10-19 13:19:20 -05:00
Nick Fitzgerald
d377b665c6 Initial ISLE integration with the x64 backend
On the build side, this commit introduces two things:

1. The automatic generation of various ISLE definitions for working with
CLIF. Specifically, it generates extern type definitions for clif opcodes and
the clif instruction data `enum`, as well as extractors for matching each clif
instructions. This happens inside the `cranelift-codegen-meta` crate.

2. The compilation of ISLE DSL sources to Rust code, that can be included in the
main `cranelift-codegen` compilation.

Next, this commit introduces the integration glue code required to get
ISLE-generated Rust code hooked up in clif-to-x64 lowering. When lowering a clif
instruction, we first try to use the ISLE code path. If it succeeds, then we are
done lowering this instruction. If it fails, then we proceed along the existing
hand-written code path for lowering.

Finally, this commit ports many lowering rules over from hand-written,
open-coded Rust to ISLE.

In the process of supporting ISLE, this commit also makes the x64 `Inst` capable
of expressing SSA by supporting 3-operand forms for all of the existing
instructions that only have a 2-operand form encoding:

    dst = src1 op src2

Rather than only the typical x86-64 2-operand form:

    dst = dst op src

This allows `MachInst` to be in SSA form, since `dst` and `src1` are
disentangled.

("3-operand" and "2-operand" are a little bit of a misnomer since not all
operations are binary operations, but we do the same thing for, e.g., unary
operations by disentangling the sole operand from the result.)

There are two motivations for this change:

1. To allow ISLE lowering code to have value-equivalence semantics. We want ISLE
   lowering to translate a CLIF expression that evaluates to some value into a
   `MachInst` expression that evaluates to the same value. We want both the
   lowering itself and the resulting `MachInst` to be pure and referentially
   transparent. This is both a nice paradigm for compiler writers that are
   authoring and maintaining lowering rules and is a prerequisite to any sort of
   formal verification of our lowering rules in the future.

2. Better align `MachInst` with `regalloc2`'s API, which requires that the input
   be in SSA form.
2021-10-12 17:11:58 -07:00
bjorn3
a05bf2bf42 Remove instructions necessary for the old regalloc 2021-10-12 14:37:36 +02:00
bjorn3
1fd491dadd Remove fallthrough instruction 2021-10-12 14:22:07 +02:00
bjorn3
5b24e117ee Remove instructions used by old br_table legalization 2021-10-12 14:18:52 +02:00
Chris Fallin
5c2a629871 Merge pull request #2455 from Hywan/feat-cranelift-codegen-re-export-gimli
feat(cranelift-codegen) Re-export `gimli` when `unwind` feature is enabled
2021-10-11 13:09:16 -07:00
Alex Crichton
713ce07d35 Add some debug logging for timing in module compiles (#3417)
* Add some debug logging for timing in module compiles

This is sometimes helpful when debugging slow compiles from fuzz bugs or
similar.

* Fix total duration calculation to not double-count
2021-10-11 12:50:15 -05:00
bjorn3
20463d60f3 Replace StackSlots struct with a type alias 2021-10-11 16:41:45 +02:00
bjorn3
fd59a3e045 Remove all unused stackslot handling code 2021-10-11 16:41:45 +02:00
Pat Hickey
d3f81a3cb9 Merge pull request #3435 from bjorn3/remove_various_dead_code
Remove various dead code
2021-10-10 10:00:42 -07:00
Pat Hickey
ccab8c5357 Merge pull request #3434 from bjorn3/remove_cssa_verifier
Remove the CSSA verifier
2021-10-10 10:00:19 -07:00
Pat Hickey
bca6946a9d Merge pull request #3432 from bjorn3/remove_reloc_constant
ConstantData related cleanups for the removal of the old backend
2021-10-10 09:59:13 -07:00
Pat Hickey
b7375817b1 Merge pull request #3431 from bjorn3/remove_sarg_t
Remove the sarg_t type and dummy_sarg_t instruction
2021-10-10 09:58:14 -07:00
bjorn3
80709ab624 Rustfmt 2021-10-10 15:26:43 +02:00
bjorn3
54293a5929 Remove predicates module
It is dead code now
2021-10-10 15:25:29 +02:00
bjorn3
fad3868c1d Remove no longer existing passes from timing.rs 2021-10-10 15:25:29 +02:00
bjorn3
f7ce91e174 Remove the CSSA verifier
The old register allocator required CSSA as intermediate step. The new
register allocator doesn't use SSA at all.
2021-10-10 15:17:19 +02:00
bjorn3
355dd996a2 Fix tests 2021-10-10 15:00:25 +02:00
bjorn3
aa0486eb15 Remove offset fields from ConstantPool 2021-10-10 14:47:53 +02:00
bjorn3
d78f436daf Remove reloc_constant
It is no longer used by the new backends
2021-10-10 14:43:55 +02:00
bjorn3
8a8797b911 Remove the sarg_t type and dummy_sarg_t instruction
They are no longer necessary with the new style backends
2021-10-10 14:38:35 +02:00
bjorn3
2b89b13c57 Move condcodes from cranelift-codegen-shared to cranelift-codegen 2021-10-10 14:23:35 +02:00
Nick Fitzgerald
dbe01ff51e Remove references to the old postopt pass 2021-10-07 14:44:07 -07:00
Anton Kirilov
a986cf2438 Increase the default code section alignment to 64 KB for AArch64 targets (#3424)
Some platforms such as AArch64 Linux support different memory page
sizes, so we need to be conservative when choosing the code section
alignment (which is equal to the page size) by using the maximum.

Copyright (c) 2021, Arm Limited.
2021-10-07 12:49:40 -05:00
bjorn3
2db3b5b9df Remove code offsets from Function (#3412)
* Remove code offsets from Function

* Remove reloc_jt and fix wasmtime-cranelift
2021-10-07 15:54:00 +02:00
Chris Fallin
df5fa773ef Merge pull request #3413 from bjorn3/no_stack_layout
Remove StackLayoutInfo
2021-10-04 11:17:13 -07:00
bjorn3
a0f777b677 Remove BranchRange 2021-10-04 19:40:58 +02:00
bjorn3
c5c7508289 Remove StackLayoutInfo 2021-10-04 19:39:33 +02:00
bjorn3
b3702f5821 Remove old_signature 2021-10-04 19:39:33 +02:00
Benjamin Bouvier
772176dbfb Cranelift: remove unused EncCursor 2021-10-04 19:11:52 +02:00
Benjamin Bouvier
43a86f14d5 Remove more old backend ISA concepts (#3402)
This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate.

Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't needed an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!
2021-10-04 10:36:12 +02:00