Commit Graph

171 Commits

Author SHA1 Message Date
Teymour Aldridge
40072f844e Clarify some documentation. (#3641) 2022-01-04 11:15:19 -08:00
Teymour Aldridge
28ede8356a Add a doclink. 2022-01-03 19:22:21 +00:00
Scott McMurray
ca7c54b5f8 Add Type::int_with_byte_size constructor 2021-11-29 16:53:54 -08:00
Scott McMurray
c266f7f4c3 Cranelift: Add LibCall::Memcmp
The comment says the enum is "likely to grow" and the function's been in libc since C89, so hopefully this is ok.

I'd like to use it for emitting things like array equality.
2021-11-29 01:42:59 -08:00
bjorn3
2fbd57e9e2 Remove imm_with_name
It is only used once to rename an imm field to mask
2021-10-31 19:57:04 +01:00
bjorn3
1fd491dadd Remove fallthrough instruction 2021-10-12 14:22:07 +02:00
bjorn3
5b24e117ee Remove instructions used by old br_table legalization 2021-10-12 14:18:52 +02:00
bjorn3
20463d60f3 Replace StackSlots struct with a type alias 2021-10-11 16:41:45 +02:00
bjorn3
fd59a3e045 Remove all unused stackslot handling code 2021-10-11 16:41:45 +02:00
Pat Hickey
bca6946a9d Merge pull request #3432 from bjorn3/remove_reloc_constant
ConstantData related cleanups for the removal of the old backend
2021-10-10 09:59:13 -07:00
Pat Hickey
b7375817b1 Merge pull request #3431 from bjorn3/remove_sarg_t
Remove the sarg_t type and dummy_sarg_t instruction
2021-10-10 09:58:14 -07:00
bjorn3
80709ab624 Rustfmt 2021-10-10 15:26:43 +02:00
bjorn3
355dd996a2 Fix tests 2021-10-10 15:00:25 +02:00
bjorn3
aa0486eb15 Remove offset fields from ConstantPool 2021-10-10 14:47:53 +02:00
bjorn3
8a8797b911 Remove the sarg_t type and dummy_sarg_t instruction
They are no longer necessary with the new style backends
2021-10-10 14:38:35 +02:00
bjorn3
2b89b13c57 Move condcodes from cranelift-codegen-shared to cranelift-codegen 2021-10-10 14:23:35 +02:00
bjorn3
2db3b5b9df Remove code offsets from Function (#3412)
* Remove code offsets from Function

* Remove reloc_jt and fix wasmtime-cranelift
2021-10-07 15:54:00 +02:00
bjorn3
c5c7508289 Remove StackLayoutInfo 2021-10-04 19:39:33 +02:00
bjorn3
b3702f5821 Remove old_signature 2021-10-04 19:39:33 +02:00
Benjamin Bouvier
43a86f14d5 Remove more old backend ISA concepts (#3402)
This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate.

Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't needed an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!
2021-10-04 10:36:12 +02:00
Benjamin Bouvier
bae4ec6427 Remove ancient register allocation (#3401) 2021-09-30 21:27:23 +02:00
Chris Fallin
38728c5746 Merge pull request #3362 from dheaton-arm/implement-unarrow
Implement `Unarrow`, `Uunarrow`, and `Snarrow` for the interpreter
2021-09-21 10:06:46 -07:00
dheaton-arm
3fc29f5f6c Return u128 from bounds; form new_vec from iter chain
Copyright (c) 2021, Arm Limited
2021-09-20 09:57:19 +01:00
dheaton-arm
83c3bc5b9d Implement Unarrow, Uunarrow, and Snarrow for the interpreter
Implemented the following Opcodes for the Cranelift interpreter:
- `Unarrow` to combine two SIMD vectors into a new vector with twice
the lanes but half the width, with signed inputs which are clamped to
`0x00`.
- `Uunarrow` to perform the same operation as `Unarrow` but treating
inputs as unsigned.
- `Snarrow` to perform the same operation as `Unarrow` but treating
both inputs and outputs as signed, and saturating accordingly.

Note that all 3 instructions saturate at the type boundaries.

Copyright (c) 2021, Arm Limited
2021-09-17 13:26:10 +01:00
Afonso Bordado
92690b84a0 cranelift: Add SIMD icmp comparisons to interpreter 2021-09-11 17:15:44 +01:00
Afonso Bordado
3c1133379c cranelift: Add is_bool_vector helper 2021-09-10 15:46:14 +01:00
Afonso Bordado
85d468dc5a cranelift: Add coerce_bools_to_ints helper 2021-09-10 15:38:30 +01:00
Afonso Bordado
9460a4fb16 cranelift: Support bool vectors in trampoline 2021-09-10 15:10:51 +01:00
dheaton-arm
8f057e0482 Implement SaddSat and SsubSat for the interpreter
Implemented `SaddSat` and `SsubSat` to add and subtract signed vector
values, saturating at the type boundaries rather than overflowing.

Changed the parser to allow signed `i8` immediates in vectors as part of
this work; fixes #3276.

Copyright (c) 2021, Arm Limited.
2021-09-03 11:35:39 +01:00
dheaton-arm
d956d349d8 Implement Insertlane for the Cranelift interpreter
Implemented `Insertlane` to insert a value in the lane specified by the
immediate value, overwriting the existing value in that lane.

Added `TernaryImm8` support for the `imm_value` function.

Copyright (c) 2021, Arm Limited.
2021-09-01 16:21:27 +01:00
Afonso Bordado
2776074dfc cranelift: Add stack support to the interpreter with virtual addresses (#3187)
* cranelift: Add stack support to the interpreter

We also change the approach for heap loads and stores.

Previously we would use the offset as the address to the heap. However,
this approach does not allow using the load/store instructions to
read/write from both the heap and the stack.

This commit changes the addressing mechanism of the interpreter. We now
return the real addresses from the addressing instructions
(stack_addr/heap_addr), and instead check if the address passed into
the load/store instructions points to an area in the heap or the stack.

* cranelift: Add virtual addresses to cranelift interpreter

Adds a  Virtual Addressing scheme that was discussed as a better
alternative to returning the real addresses.

The virtual addresses are split into 4 regions (stack, heap, tables and
global values), and the address itself is composed of an `entry` field
and an `offset` field. In general the `entry` field corresponds to the
instance of the resource (e.g. table5 is entry 5) and the `offset` field
is a byte offset inside that entry.

There is one exception to this which is the stack, where due to only
having one stack, the whole address is an offset field.

The number of bits in entry vs offset fields is variable with respect to
the `region` and the address size (32bits vs 64bits). This is done
because with 32 bit addresses we would have to compromise on heap size,
or have a small number of global values / tables. With 64 bit addresses
we do not have to compromise on this, but we need to support 32 bit
addresses.

* cranelift: Remove interpreter trap codes

* cranelift: Calculate frame_offset when entering or exiting a frame

* cranelift: Add safe read/write interface to DataValue

* cranelift: DataValue write full 128bit slot for booleans

* cranelift: Use DataValue accessors for trampoline.
2021-08-24 09:29:11 -07:00
Alex Crichton
e68aa99588 Implement the memory64 proposal in Wasmtime (#3153)
* Implement the memory64 proposal in Wasmtime

This commit implements the WebAssembly [memory64 proposal][proposal] in
both Wasmtime and Cranelift. In terms of work done Cranelift ended up
needing very little work here since most of it was already prepared for
64-bit memories at one point or another. Most of the work in Wasmtime is
largely refactoring, changing a bunch of `u32` values to something else.

A number of internal and public interfaces are changing as a result of
this commit, for example:

* Acessors on `wasmtime::Memory` that work with pages now all return
  `u64` unconditionally rather than `u32`. This makes it possible to
  accommodate 64-bit memories with this API, but we may also want to
  consider `usize` here at some point since the host can't grow past
  `usize`-limited pages anyway.

* The `wasmtime::Limits` structure is removed in favor of
  minimum/maximum methods on table/memory types.

* Many libcall intrinsics called by jit code now unconditionally take
  `u64` arguments instead of `u32`. Return values are `usize`, however,
  since the return value, if successful, is always bounded by host
  memory while arguments can come from any guest.

* The `heap_addr` clif instruction now takes a 64-bit offset argument
  instead of a 32-bit one. It turns out that the legalization of
  `heap_addr` already worked with 64-bit offsets, so this change was
  fairly trivial to make.

* The runtime implementation of mmap-based linear memories has changed
  to largely work in `usize` quantities in its API and in bytes instead
  of pages. This simplifies various aspects and reflects that
  mmap-memories are always bound by `usize` since that's what the host
  is using to address things, and additionally most calculations care
  about bytes rather than pages except for the very edge where we're
  going to/from wasm.

Overall I've tried to minimize the amount of `as` casts as possible,
using checked `try_from` and checked arithemtic with either error
handling or explicit `unwrap()` calls to tell us about bugs in the
future. Most locations have relatively obvious things to do with various
implications on various hosts, and I think they should all be roughly of
the right shape but time will tell. I mostly relied on the compiler
complaining that various types weren't aligned to figure out
type-casting, and I manually audited some of the more obvious locations.
I suspect we have a number of hidden locations that will panic on 32-bit
hosts if 64-bit modules try to run there, but otherwise I think we
should be generally ok (famous last words). In any case I wouldn't want
to enable this by default naturally until we've fuzzed it for some time.

In terms of the actual underlying implementation, no one should expect
memory64 to be all that fast. Right now it's implemented with
"dynamic" heaps which have a few consequences:

* All memory accesses are bounds-checked. I'm not sure how aggressively
  Cranelift tries to optimize out bounds checks, but I suspect not a ton
  since we haven't stressed this much historically.

* Heaps are always precisely sized. This means that every call to
  `memory.grow` will incur a `memcpy` of memory from the old heap to the
  new. We probably want to at least look into `mremap` on Linux and
  otherwise try to implement schemes where dynamic heaps have some
  reserved pages to grow into to help amortize the cost of
  `memory.grow`.

The memory64 spec test suite is scheduled to now run on CI, but as with
all the other spec test suites it's really not all that comprehensive.
I've tried adding more tests for basic things as I've had to implement
guards for them, but I wouldn't really consider the testing adequate
from just this PR itself. I did try to take care in one test to actually
allocate a 4gb+ heap and then avoid running that in the pooling
allocator or in emulation because otherwise that may fail or take
excessively long.

[proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

* Fix some tests

* More test fixes

* Fix wasmtime tests

* Fix doctests

* Revert to 32-bit immediate offsets in `heap_addr`

This commit updates the generation of addresses in wasm code to always
use 32-bit offsets for `heap_addr`, and if the calculated offset is
bigger than 32-bits we emit a manual add with an overflow check.

* Disable memory64 for spectest fuzzing

* Fix wrong offset being added to heap addr

* More comments!

* Clarify bytes/pages
2021-08-12 09:40:20 -05:00
Alex Crichton
ee3ff52661 Refactor cranelift immediates slightly
I've run up against the `Into`-vs-`From` impls a few times and figured
I'd go ahead and put up a refactoring. This switches `Into` impls into
`From` impls which allows using both traits instead of just the `Into`
version. Additionally this removes a few small `as` casts in favor of
infallible `from`/`into` or `try_from` with error handling.
2021-08-06 09:14:25 -07:00
Alex Crichton
4cfa031c5f Implement API support for v128-globals (#3147)
Found via fuzzing, and looks like these were accidentally left out along
the way SIMD was taking shape.
2021-08-05 13:02:34 -05:00
Alex Crichton
63a3bbbf5a Change VMMemoryDefinition::current_length to usize (#3134)
* Change VMMemoryDefinition::current_length to `usize`

This commit changes the definition of
`VMMemoryDefinition::current_length` to `usize` from its previous
definition of `u32`. This is a pretty impactful change because it also
changes the cranelift semantics of "dynamic" heaps where the bound
global value specifier must now match the pointer type for the platform
rather than the index type for the heap.

The motivation for this change is that the `current_length` field (or
bound for the heap) is intended to reflect the current size of the heap.
This is bound by `usize` on the host platform rather than `u32` or`
u64`. The previous choice of `u32` couldn't represent a 4GB memory
because we couldn't put a number representing 4GB into the
`current_length` field. By using `usize`, which reflects the host's
memory allocation, this should better reflect the size of the heap and
allows Wasmtime to support a full 4GB heap for a wasm program (instead
of 4GB minus one page).

This commit also updates the legalization of the `heap_addr` clif
instruction to appropriately cast the address to the platform's pointer
type, handling bounds checks along the way. The practical impact for
today's targets is that a `uextend` is happening sooner than it happened
before, but otherwise there is no intended impact of this change. In the
future when 64-bit memories are supported there will likely need to be
fancier logic which handles offsets a bit differently (especially in the
case of a 64-bit memory on a 32-bit host).

The clif `filetest` changes should show the differences in codegen, and
the Wasmtime changes are largely removing casts here and there.

Closes #3022

* Add tests for memory.size at maximum memory size

* Add a dfg helper method
2021-08-02 13:09:40 -05:00
Andrew Brown
766774e1f5 refactor: reorganize crate imports 2021-07-26 13:39:16 -07:00
Andrew Brown
6b86984c41 x64: avoid load-coalescing SIMD operations with non-aligned loads
Fixes #2943, though not as optimally as may be desired. With x64 SIMD
instructions, the memory operand must be aligned--this change adds that
check. There are cases, however, where we can do better--see #3106.
2021-07-26 13:39:16 -07:00
Nick Fitzgerald
4283d2116d cranelift: Move most debug-level logs to the trace level
Cranelift crates have historically been much more verbose with debug-level
logging than most other crates in the Rust ecosystem. We log things like how
many parameters a basic block has, the color of virtual registers during
regalloc, etc. Even for Cranelift hackers, these things are largely only useful
when hacking specifically on Cranelift and looking at a particular test case,
not even when using some Cranelift embedding (such as Wasmtime).

Most of the time, when people want logging for their Rust programs, they do
something like:

    RUST_LOG=debug cargo run

This means that they get all that mostly not useful debug logging out of
Cranelift. So they might want to disable logging for Cranelift, or change it to
a higher log level:

    RUST_LOG=debug,cranelift=info cargo run

The problem is that this is already more annoying to type that `RUST_LOG=debug`,
and that Cranelift isn't one single crate, so you actually have to play
whack-a-mole with naming all the Cranelift crates off the top of your head,
something more like this:

    RUST_LOG=debug,cranelift=info,cranelift_codegen=info,cranelift_wasm=info,...

Therefore, we're changing most of the `debug!` logs into `trace!` logs: anything
that is very Cranelift-internal, unlikely to be useful/meaningful to the
"average" Cranelift embedder, or prints a message for each instruction visited
during a pass. On the other hand, things that just report a one line statistic
for a whole pass, for example, are left as `debug!`. The more verbose the log
messages are, the higher the bar they must clear to be `debug!` rather than
`trace!`.
2021-07-26 11:50:16 -07:00
StackDoubleFlow
9637bc5a09 Fix cranelift Module and ObjectModule docs links (#2852) 2021-04-21 06:29:02 -07:00
Chris Fallin
48d542d67c Fix bad jumptable block ref when DCE removes a block.
When a block is unreachable, the `unreachable_code` pass will remove it,
which is perfectly sensible. Jump tables factor into unreachability in
an expected way: even if a block is listed in a jump table, the block
might be unreachable if the jump table itself is unused (or used in an
unreachable block). Unfortunately, the verifier still expects all
block refs in all jump tables to be valid, even after DCE, which will
not always be the case.

This makes a simple change to the pass: after removing blocks, it scans
jump tables. Any jump table that refers to an unreachable block must
itself be unused, and so we just clear its entries. We do not bother
removing it (and renumbering all later jumptables), and we do not bother
computing full unused-ness of all jumptables, as that would be more
expensive; it's sufficient to clear out the ones that refer to
unreachable blocks, which are a subset of all unused jumptables.

Fixes #2670.
2021-02-23 15:01:01 -08:00
Chris Fallin
c07ec4c525 Merge pull request #2653 from bjorn3/more_atomic_ops
More atomic ops
2021-02-18 08:34:58 -08:00
bjorn3
ff22842da5 More atomic ops 2021-02-18 14:16:15 +01:00
bjorn3
720da20588 Describe serialization format 2021-02-18 11:27:51 +01:00
bjorn3
a0c2276ee7 Add a version marker
This prevents deserializing a function with a different Cranelift version
2021-02-18 11:27:51 +01:00
bjorn3
2fc964ea35 Add serde serialization support for the full clif ir 2021-02-18 11:27:02 +01:00
Chris Fallin
c84d6be6f4 Detailed debug-info (DWARF) support in new backends (initially x64).
This PR propagates "value labels" all the way from CLIF to DWARF
metadata on the emitted machine code. The key idea is as follows:

- Translate value-label metadata on the input into "value_label"
  pseudo-instructions when lowering into VCode. These
  pseudo-instructions take a register as input, denote a value label,
  and semantically are like a "move into value label" -- i.e., they
  update the current value (as seen by debugging tools) of the given
  local. These pseudo-instructions emit no machine code.

- Perform a dataflow analysis *at the machine-code level*, tracking
  value-labels that propagate into registers and into [SP+constant]
  stack storage. This is a forward dataflow fixpoint analysis where each
  storage location can contain a *set* of value labels, and each value
  label can reside in a *set* of storage locations. (Meet function is
  pairwise intersection by storage location.)

  This analysis traces value labels symbolically through loads and
  stores and reg-to-reg moves, so it will naturally handle spills and
  reloads without knowing anything special about them.

- When this analysis converges, we have, at each machine-code offset, a
  mapping from value labels to some number of storage locations; for
  each offset for each label, we choose the best location (prefer
  registers). Note that we can choose any location, as the symbolic
  dataflow analysis is sound and guarantees that the value at the
  value_label instruction propagates to all of the named locations.

- Then we can convert this mapping into a format that the DWARF
  generation code (wasmtime's debug crate) can use.

This PR also adds the new-backend variant to the gdb tests on CI.
2021-01-21 15:59:49 -08:00
Chris Fallin
743529b4eb Merge pull request #2492 from uweigand/endian-memory-v5
Support explicit endianness in Cranelift IR MemFlags
2020-12-14 13:59:08 -08:00
Ulrich Weigand
467a1af83a Support explicit endianness in Cranelift IR MemFlags
WebAssembly memory operations are by definition little-endian even on
big-endian target platforms.  However, other memory accesses will require
native target endianness (e.g. to access parts of the VMContext that is
also accessed by VM native code).  This means on big-endian targets,
the code generator will have to handle both little- and big-endian
memory accesses.  However, there is currently no way to encode that
distinction into the Cranelift IR that describes memory accesses.

This patch provides such a way by adding an (optional) explicit
endianness marker to an instance of MemFlags.  Since each Cranelift IR
instruction that describes memory accesses already has an instance of
MemFlags attached, this can now be used to provide endianness
information.

Note that by default, memory accesses will continue to use the native
target ISA endianness.  To override this to specify an explicit
endianness, a MemFlags value that was built using the set_endianness
routine must be used.  This patch does so for accesses that implement
WebAssembly memory operations.

This patch addresses issue #2124.
2020-12-14 20:15:37 +01:00
Y-Nak
855a6374dd Fix missing modification of jump table in licm 2020-12-09 11:13:33 +09:00
Pat Hickey
0f1dc9a735 Merge pull request #2403 from bjorn3/simplejit_hot_swapping
SimpleJIT hot code swapping
2020-12-03 13:36:32 -08:00