Commit Graph

437 Commits

Author SHA1 Message Date
Anton Kirilov
a897742593 Initial back-edge CFI implementation (#3606)
Give the user the option to sign and to authenticate function
return addresses with the operations introduced by the Pointer
Authentication extension to the Arm instruction set architecture.

Copyright (c) 2021, Arm Limited.
2022-08-03 11:08:29 -07:00
Afonso Bordado
709716bb8e cranelift: Implement scalar FMA on x86 (#4460)
x86 does not have dedicated instructions for scalar FMA, lower
to a libcall which seems to be what llvm does.
2022-08-03 10:29:10 -07:00
Nick Fitzgerald
55215bbd1e Use a SmallVec for ABIArgSlots (#4586)
These are always length 1 for Wasm benchmarks.

<h3>Sightglass Benchmark Results</h3>

```
compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 328624015.86 ± 40274677.93 (confidence = 99%)

  main.so is 0.88x to 0.91x faster than slots-smallvec.so!
  slots-smallvec.so is 1.10x to 1.13x faster than main.so!

  [3070752447 3203778792.55 3446269274] main.so
  [2503544039 2875154776.69 3197966713] slots-smallvec.so

compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 9685705.06 ± 3221286.87 (confidence = 99%)

  main.so is 0.91x to 0.96x faster than slots-smallvec.so!
  slots-smallvec.so is 1.05x to 1.09x faster than main.so!

  [129356493 145594942.79 165038803] main.so
  [118555011 135909237.73 188780619] slots-smallvec.so

compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm

  No difference in performance.

  [79080493 86757564.46 112649639] main.so
  [78083384 85934125.69 94992743] slots-smallvec.so
```
2022-08-02 17:40:36 -07:00
Nick Fitzgerald
ab1cf3df2d Use a SmallVec for ABIArgs (#4584)
Instead of a regular `Vec`.

These vectors are usually very small, for example here is the histogram of sizes
when running Sightglass's `pulldown-cmark` benchmark:

```
;; Number of samples = 10332
;; Min = 0
;; Max = 11
;;
;; Mean = 2.496128532713901
;; Standard deviation = 2.2859559855427243
;; Variance = 5.225594767838607
;;
;; Each ∎ is a count of 62
;;
 0 ..  1 [ 3134 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
 1 ..  2 [ 2032 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
 2 ..  3 [  159 ]: ∎∎
 3 ..  4 [  838 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎
 4 ..  5 [  970 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
 5 ..  6 [ 2566 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
 6 ..  7 [  303 ]: ∎∎∎∎
 7 ..  8 [  272 ]: ∎∎∎∎
 8 ..  9 [   40 ]:
 9 .. 10 [   18 ]:
```

By using a `SmallVec` with capacity of 6 we avoid the vast majority of heap
allocations and get some nice benchmark wins of up to ~1.11x faster compilation.

<h3>Sightglass Benchmark Results</h3>

```
compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 340361395.90 ± 63384608.15 (confidence = 99%)

  main.so is 0.88x to 0.92x faster than smallvec.so!
  smallvec.so is 1.09x to 1.13x faster than main.so!

  [3101467423 3425524333.41 4060621653] main.so
  [2820915877 3085162937.51 3375167352] smallvec.so

compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 988446098.59 ± 184075718.89 (confidence = 99%)

  main.so is 0.88x to 0.92x faster than smallvec.so!
  smallvec.so is 1.09x to 1.13x faster than main.so!

  [9006994951 9948091070.66 11792481990] main.so
  [8192243090 8959644972.07 9801848982] smallvec.so

compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm

  Δ = 7854567.87 ± 2215491.16 (confidence = 99%)

  main.so is 0.89x to 0.94x faster than smallvec.so!
  smallvec.so is 1.07x to 1.12x faster than main.so!

  [80354527 93864666.76 119789198] main.so
  [77554917 86010098.89 94726994] smallvec.so

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 22810509.85 ± 6434024.63 (confidence = 99%)

  main.so is 0.89x to 0.94x faster than smallvec.so!
  smallvec.so is 1.07x to 1.12x faster than main.so!

  [233358190 272593088.57 347880715] main.so
  [225227821 249782578.72 275097380] smallvec.so

compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 10849521.41 ± 4324757.85 (confidence = 99%)

  main.so is 0.90x to 0.96x faster than smallvec.so!
  smallvec.so is 1.04x to 1.10x faster than main.so!

  [133875427 156859544.47 222455440] main.so
  [126073854 146010023.06 181611647] smallvec.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 31508176.97 ± 12559561.91 (confidence = 99%)

  main.so is 0.90x to 0.96x faster than smallvec.so!
  smallvec.so is 1.04x to 1.10x faster than main.so!

  [388788638 455536988.31 646034523] main.so
  [366132033 424028811.34 527419755] smallvec.so
```
2022-08-02 15:53:44 -07:00
Nick Fitzgerald
42bba452a6 Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573)
* Cranelift: Add instructions for getting the current stack/frame pointers and return address

This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535

* x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead

We just special case getting operands from `Amode`s now.

* Fix s390x `get_return_address`; require `preserve_frame_pointers=true`

* Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp

* Handle non-allocatable registers in Amode::with_allocs

* Use "stack" instead of "r15" on s390x

* r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`
2022-08-02 14:37:17 -07:00
Chris Fallin
8dddd6f1f7 Cranelift: Remove ifcmp_sp opcode. (#4578)
This was temporarily added back in #3502 due to a need from Lucet; now
that Lucet is EOL, the opcode is no longer needed and we can remove it.
2022-08-02 13:15:39 -07:00
Chris Fallin
43f1765272 Cranellift: remove Baldrdash support and related features. (#4571)
* Cranellift: remove Baldrdash support and related features.

As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team
has recently determined that their current form of integration with
Cranelift is too hard to maintain, and they have chosen to remove it
from their codebase. If and when they decide to build updated support
for Cranelift, they will adopt different approaches to several details
of the integration.

In the meantime, after discussion with the SpiderMonkey folks, they
agree that it makes sense to remove the bits of Cranelift that exist
to support the integration ("Baldrdash"), as they will not need
them. Many of these bits are difficult-to-maintain special cases that
are not actually tested in Cranelift proper: for example, the
Baldrdash integration required Cranelift to emit function bodies
without prologues/epilogues, and instead communicate very precise
information about the expected frame size and layout, then stitched
together something post-facto. This was brittle and caused a lot of
incidental complexity ("fallthrough returns", the resulting special
logic in block-ordering); this is just one example. As another
example, one particular Baldrdash ABI variant processed stack args in
reverse order, so our ABI code had to support both traversal
orders. We had a number of other Baldrdash-specific settings as well
that did various special things.

This PR removes Baldrdash ABI support, the `fallthrough_return`
instruction, and pulls some threads to remove now-unused bits as a
result of those two, with the  understanding that the SpiderMonkey folks
will build new functionality as needed in the future and we can perhaps
find cleaner abstractions to make it all work.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425

* Review feedback.

* Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.

The debugger tests invoke `wasmtime` from within each test case under
the control of a debugger (gdb or lldb). Some of these tests started to
inexplicably fail in CI with unrelated changes, and the failures were
only inconsistently reproducible locally. It seems to be cache related:
if we disable cached compilation on the nested `wasmtime` invocations,
the tests consistently pass.

* Review feedback.
2022-08-02 19:37:56 +00:00
Benjamin Bouvier
ff37c9d8a4 [cranelift] Rejigger the compile API (#4540)
* Move `emit_to_memory` to `MachCompileResult`

This small refactoring makes it clearer to me that emitting to memory
doesn't require anything else from the compilation `Context`. While it's
a trivial change, it's a small public API change that shouldn't cause
too much trouble, and doesn't seem RFC-worthy. Happy to hear different
opinions about this, though!

* hide the MachCompileResult behind a method

* Add a `CompileError` wrapper type that references a `Function`

* Rename MachCompileResult to CompiledCode

* Additionally remove the last unsafe API in cranelift-codegen
2022-08-02 12:05:40 -07:00
Sam Parker
37cd96beff [AArch64] i64x2 support for min/max (#4575)
Also added interpreter support for vector min/max.

Copyright (c) 2022, Arm Limited.
2022-08-02 11:42:05 -07:00
Nick Fitzgerald
c77bec4dcb Cranelift: don't emit inside lowering rules for aarch64 (#4572)
* Cranelift: Don't `emit` inside lowering rules in aarch64

The lowering rules should be "pure" and side-effect free, using helpers defined
in `inst.isle` to perform actual side effects like emitting instructions.

* Cranelift: use 80 width for section separators in aarch64 lowering rules
2022-08-01 16:43:42 -07:00
Trevor Elliott
586ec95c11 ISLE: Allow shadowing in let expressions (#4562)
* Support shadowing in isle

* Re-run the isle build.rs if the examples change

* Print error messages when isle tests fail

* Move run tests

* Refactor `let` uses that don't need to introduce unique names
2022-08-01 21:10:28 +00:00
Anton Kirilov
a47a82d2e5 Cranelift AArch64: Harden the Spectre mitigations (#4555)
Use the `CSDB` instruction following Arm's recommendation.

Copyright (c) 2022, Arm Limited.
2022-08-01 10:20:48 -07:00
Benjamin Bouvier
8d0224341c cranelift: Introduce a feature to enable trace logs (#4484)
* Don't use `log::trace` directly but a feature-enabled `trace` macro
* Don't emit disassembly based on the log level
2022-08-01 11:19:15 +02:00
Damian Heaton
5e3bb588a8 Port Fence, IsNull/IsInvalid & Debugtrap to ISLE (AArch64) (#4548)
Ported the existing implementation of the following Opcodes for AArch64
to ISLE:
- `Fence`
- `IsNull`
- `IsInvalid`
- `Debugtrap`

Copyright (c) 2022 Arm Limited
2022-07-28 15:36:13 -07:00
Afonso Bordado
0508932174 cranelift: Align Scalar and SIMD shift semantics (#4520)
* cranelift: Reorganize test suite

Group some SIMD operations by instruction.

* cranelift: Deduplicate some shift tests

Also, new tests with the mod behaviour

* aarch64: Lower shifts with mod behaviour

* x64: Lower shifts with mod behaviour

* wasmtime: Don't mask SIMD shifts
2022-07-27 17:54:00 +00:00
Trevor Elliott
7ac6134894 x64: Shrink Inst from 72 to 48 bytes (#4514)
https://github.com/bytecodealliance/wasmtime/pull/4514
2022-07-27 10:39:22 -07:00
Anton Kirilov
ead6edb0c5 Cranelift AArch64: Migrate Splat to ISLE (#4521)
Copyright (c) 2022, Arm Limited.
2022-07-26 17:57:15 +00:00
Anton Kirilov
d041c4b376 Cranelift AArch64: Further integral constant fixes (#4530)
Copyright (c) 2022, Arm Limited.
2022-07-26 09:35:06 -07:00
Damian Heaton
3ef89b7787 Allow 64-bit vectors and implement for interpreter (#4509)
* Allow 64-bit vectors and implement for interpreter

The AArch64 backend already supports 64-bit vectors; this simply allows
instructions to make use of that.

Implemented support for 64-bit vectors within the interpreter to allow
interpret runtests to use them.

Copyright (c) 2022 Arm Limited

* Disable 64-bit SIMD `iaddpairwise` tests on s390x

Copyright (c) 2022 Arm Limited
2022-07-25 13:00:43 -07:00
Sam Parker
c5ddb4b803 [AArch64] Port SIMD narrowing to ISLE (#4478)
* [AArch64] Port SIMD narrowing to ISLE

Fvdemote, snarrow, unarrow and uunarrow.

Also refactor the aarch64 instructions descriptions to parameterize
on ScalarSize instead of using different opcodes.

The zero_value pure constructor has been introduced and used by the
integer narrow operations and it replaces, and extends, the compare
zero patterns.

Copright (c) 2022, Arm Limited.

* use short 'if' patterns
2022-07-25 12:40:36 -07:00
Damian Heaton
f1a0c40a53 Convert sqrt..nearest to ISLE (AArch64) (#4508)
Converted the existing implementations for the following opcodes to ISLE
on AArch64:
- `sqrt`
- `fneg`
- `fabs`
- `fpromote`
- `fdemote`
- `ceil`
- `floor`
- `trunc`
- `nearest`

Copyright (c) 2022 Arm Limited
2022-07-22 14:48:07 -07:00
Anton Kirilov
2ba4bce5cc Merge pull request from GHSA-7f6x-jwh5-m9r4
Copyright (c) 2022, Arm Limited.
2022-07-20 11:53:56 -05:00
Jeffrey Charles
d55eb64b9e Enable generating debug symbols on AArch64 (#4468) 2022-07-19 19:12:07 +00:00
Damian Heaton
00ac18c866 Convert fadd..fmax_pseudo to ISLE (AArch64) (#4452)
Converted the existing implementations for the following Opcodes to ISLE on AArch64:
- `fadd`
- `fsub`
- `fmul`
- `fdiv`
- `fmin`
- `fmax`
- `fmin_pseudo`
- `fmax_pseudo`

Copyright (c) 2022 Arm Limited
2022-07-19 12:03:05 -07:00
Sam Parker
e5678e8f8d [AArch64] Cleanup dynamic lowering (#4432)
Copyright (c) 2022, Arm Limited.
2022-07-18 11:13:16 -07:00
Damian Heaton
d792646677 Implement iabs in ISLE (AArch64) (#4399)
* Implement `iabs` in ISLE (AArch64)

Converts the existing implementation of `iabs` for AArch64 into ISLE,
and fixes support for `iabs` on scalar values.

Copyright (c) 2022 Arm Limited.

* Improve scalar `iabs` implementation.

Also introduces `CSNeg` instruction.

Copyright (c) 2022 Arm Limited
2022-07-18 11:12:34 -07:00
Damian Heaton
db7f9ccd2b Convert scalar_to_vector to ISLE (AArch64) (#4401)
* Convert `scalar_to_vector` to ISLE (AArch64)

Converted the exisiting implementation of `scalar_to_vector` for AArch64 to
ISLE.

Copyright (c) 2022 Arm Limited

* Add support for floats and fix FpuExtend

- Added rules to cover `f32 -> f32x4` and `f64 -> f64x2` for
`scalar_to_vector`
- Added tests for `scalar_to_vector` on floats.
- Corrected an invalid instruction emitted by `FpuExtend` on 64-bit
values.

Copyright (c) 2022 Arm Limited
2022-07-18 11:11:54 -07:00
Damian Heaton
6c70428735 Convert isplit / iconcat to ISLE (AArch64) (#4402)
Converted the existing implementations for `isplit` and `iconcat` for
AArch64 to ISLE.

Copyright (c) 2022 Arm Limited
2022-07-08 17:12:42 -07:00
Sam Parker
9c43749dfe [RFC] Dynamic Vector Support (#4200)
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which have no set size like their
existing counterparts. New IR instructions are added to access these
new stack entities.

Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.

The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.

Copyright (c) 2022, Arm Limited.
2022-07-07 12:54:39 -07:00
Afonso Bordado
e9727b9d4b aarch64: Fix i128 of/nof implementations (#4403)
@yuyang-ok reported via zulip that i128 overflow tests were:
1. different from the interpreter implementation
2. wrong on some of the test cases

This fixes both the tests and the aarch64 implementation and adds the
interpreter to the testsuite.
2022-07-07 11:00:58 -07:00
Damian Heaton
6a5fe20956 Convert swizzle to ISLE (AArch64) (#4400)
Converted the implementation of `swizzle` for AArch64 to ISLE.

Copyright (c) 2022 Arm Limited
2022-07-07 10:29:33 -07:00
Sam Parker
d9e0e6a6a9 [AArch64] Port min/max to ISLE (#4374)
Copyright (c) 2022, Arm Limited.
2022-07-05 09:16:45 -07:00
Afonso Bordado
38ecd3744f aarch64: Implement bmask/bextend in ISLE (#4358)
* aarch64: Implement `bmask`/`bextend` in ISLE

* cranelift: Remove vector versions of `bextend`

* aarch64: Cleanup `bmask`/`bextend` documentation
2022-07-01 09:37:18 -07:00
Afonso Bordado
919604b8c5 aarch64: Implement ireduce/breduce in ISLE (#4331)
* aarch64: Implement `ireduce`/`breduce` in ISLE

* cranelift: Remove vector versions of `breduce`/`ireduce`
2022-06-30 11:15:47 -07:00
bjorn3
d1446f767d Mark return value as define instead of clobber for TLS pseudoinstructions (#4357) 2022-06-30 10:44:51 -07:00
Sam Parker
fb61774df2 [AArch64] Port AtomicLoad and AtomicStore to ISLE (#4301)
Copyright (c) 2022, Arm Limited.
2022-06-29 12:12:48 -07:00
Chris Fallin
b2e28b917a Cranelift: update to latest regalloc2: (#4324)
- Handle call instructions' clobbers with the clobbers API, using RA2's
  clobbers bitmask (bytecodealliance/regalloc2#58) rather than clobbers
  list;

- Pull in changes from bytecodealliance/regalloc2#59 for much more sane
  edge-case behavior w.r.t. liverange splitting.
2022-06-28 09:01:59 -07:00
Afonso Bordado
42d4f97b78 cranelift: Fix cls for small types on aarch64 (#4305)
The previous `cls` code was producing wrong results when fed with a -1 i8.

The fix here is to sign extend instead of zero extending since we want
to keep the sign bit as one in order for it to be counted correctly
in the cls instruction

This also merges the interpreter only tests now that aarch64
correctly supports this instruction
2022-06-27 15:55:02 -07:00
Afonso Bordado
aef53784ec aarch64: Implement bint in ISLE (#4319) 2022-06-27 15:50:46 -07:00
Anton Kirilov
25a588c35f Cranelift AArch64: Use an allocated encoding for Udf (#4281)
Preserve the current behaviour when code is generated for SpiderMonkey.

Copyright (c) 2022, Arm Limited.
2022-06-22 15:03:28 +01:00
Anton Kirilov
c15c3061ca CFI improvements to the AArch64 fiber implementation (#4195)
Now the fiber implementation on AArch64 authenticates function
return addresses and includes the relevant BTI instructions, except
on macOS.

Also, change the locations of the saved FP and LR registers on the
fiber stack to make them compliant with the Procedure Call Standard
for the Arm 64-bit Architecture.

Copyright (c) 2022, Arm Limited.
2022-06-09 09:17:12 -05:00
Sam Parker
acfeda4d80 [AArch64] Port IaddPairwise to ISLE (#4201)
Copyright (c) 2022, Arm Limited.
2022-06-06 15:37:13 +01:00
Sam Parker
010e028d67 [AArch64] Port AtomicCAS to isle (#4140)
Copyright (c) 2022, Arm Limited.
2022-05-25 09:19:24 +01:00
Chris Fallin
b830c3cf93 Pull in regalloc2 v0.2.0, with no more separate scratch registers. (#4182)
RA2 recently removed the need for a dedicated scratch register for
cyclic moves (bytecodealliance/regalloc2#51). This has moderate positive
performance impact on function bodies that were register-constrained, as
it means that one more register is available. In Sightglass, I measured
+5-8% on `blake3-scalar`, at least among current benchmarks.
2022-05-23 12:51:04 -07:00
Benjamin Bouvier
6e828df632 Remove unused SourceLoc in many Mach data structures (#4180)
* Remove unused srcloc in MachReloc

* Remove unused srcloc in MachTrap

* Use `into_iter` on array in bench code to suppress a warning

* Remove unused srcloc in MachCallSite
2022-05-23 09:27:28 -07:00
Chris Fallin
32622b3e6f Cranelift: fix use of pinned reg with SysV calling convention. (#4176)
Previously, the pinned register (enabled by the `enable_pinned_reg`
Cranelift setting and used via the `get_pinned_reg` and `set_pinned_reg`
CLIF ops) was only used when Cranelift was embedded in SpiderMonkey, in
order to support a pinned heap register. SpiderMonkey has its own
calling convention in Cranelift (named after the integration layer,
"Baldrdash").

However, the feature is more general, and should be usable with the
default system calling convention too, e.g. SysV or Windows Fastcall.

This PR fixes the ABI code to properly treat the pinned register as a
globally allocated register -- and hence an implicit input and output to
every function, not saved/restored in the prologue/epilogue -- for SysV
on x86-64 and aarch64, and Fastcall on x86-64.

Fixes #4170.
2022-05-23 09:18:51 -07:00
Anton Kirilov
edf07a8da6 Cranelift AArch64: Migrate Bitselect and Vselect to ISLE (#4139)
Copyright (c) 2022, Arm Limited.
2022-05-16 09:39:28 -07:00
Chris Fallin
5d671952ee Cranelift: do not check in generated ISLE code; regenerate on every compile. (#4143)
This PR fixes #4066: it modifies the Cranelift `build.rs` workflow to
invoke the ISLE DSL compiler on every compilation, rather than only
when the user specifies a special "rebuild ISLE" feature.

The main benefit of this change is that it vastly simplifies the mental
model required of developers, and removes a bunch of failure modes
we have tried to work around in other ways. There is now just one
"source of truth", the ISLE source itself, in the repository, and so there
is no need to understand a special "rebuild" step and how to handle
merge errors. There is no special process needed to develop the compiler
when modifying the DSL. And there is no "noise" in the git history produced
by constantly-regenerated files.

The two main downsides we discussed in #4066 are:
- Compile time could increase, by adding more to the "meta" step before the main build;
- It becomes less obvious where the source definitions are (everything becomes
  more "magic"), which makes exploration and debugging harder.

This PR addresses each of these concerns:

1. To maintain reasonable compile time, it includes work to cut down the
   dependencies of the `cranelift-isle` crate to *nothing* (only the Rust stdlib),
   in the default build. It does this by putting the error-reporting bits
   (`miette` crate) under an optional feature, and the logging (`log` crate) under
   a feature-controlled macro, and manually writing an `Error` impl rather than
   using `thiserror`. This completely avoids proc macros and the `syn` build slowness.

   The user can still get nice errors out of `miette`: this is enabled by specifying
   a Cargo feature `--features isle-errors`.

2. To allow the user to optionally inspect the generated source, which nominally
   lives in a hard-to-find path inside `target/` now, this PR adds a feature `isle-in-source-tree`
   that, as implied by the name, moves the target for ISLE generated source into
   the source tree, at `cranelift/codegen/isle_generated_source/`. It seems reasonable
   to do this when an explicit feature (opt-in) is specified because this is how ISLE regeneration
   currently works as well. To prevent surprises, if the feature is *not* specified, the
   build fails if this directory exists.
2022-05-11 22:25:24 -07:00
Chris Fallin
eb435f3057 x64: use constant pool for u64 constants rather than movabs. (#4088)
* Allow emitting u64 constants into constant pool.

* Use constant pool for constants on x64 that do not fit in a simm32 and are needed as a RegMem or RegMemImm.

* Fix rip-relative addressing bug in pinsrd emission.
2022-05-10 09:21:05 -07:00
Benjamin Bouvier
71fc16bbeb Narrow allow(dead_code) declarations (#4116)
* Narrow `allow(dead_code)` declarations

Having module wide `allow(dead_code)` may hide some code that's really
dead. In this commit I just narrowed the declarations to the specific
enum variants that were not used (as it seems reasonable to keep them
and their handling in all the matches, for future use). And the compiler
found more dead code that I think we can remove safely in the short
term.

With this, the only files annotated with a module-wide
`allow(dead_code)` are isle-generated files.

* resurrect some functions as test helpers
2022-05-10 12:02:52 +02:00