Remove heaps from core Cranelift, push them into cranelift-wasm (#5386)

* cranelift-wasm: translate Wasm loads into lower-level CLIF operations

Rather than using `heap_{load,store,addr}`.

* cranelift: Remove the `heap_{addr,load,store}` instructions

These are now legalized in the `cranelift-wasm` frontend.

* cranelift: Remove the `ir::Heap` entity from CLIF

* Port basic memory operation tests to .wat filetests

* Remove test for verifying CLIF heaps

* Remove `heap_addr` from replace_branching_instructions_and_cfg_predecessors.clif test

* Remove `heap_addr` from readonly.clif test

* Remove `heap_addr` from `table_addr.clif` test

* Remove `heap_addr` from the simd-fvpromote_low.clif test

* Remove `heap_addr` from simd-fvdemote.clif test

* Remove `heap_addr` from the load-op-store.clif test

* Remove the CLIF heap runtest

* Remove `heap_addr` from the global_value.clif test

* Remove `heap_addr` from fpromote.clif runtests

* Remove `heap_addr` from fdemote.clif runtests

* Remove `heap_addr` from memory.clif parser test

* Remove `heap_addr` from reject_load_readonly.clif test

* Remove `heap_addr` from reject_load_notrap.clif test

* Remove `heap_addr` from load_readonly_notrap.clif test

* Remove `static-heap-without-guard-pages.clif` test

Will be subsumed when we port `make-heap-load-store-tests.sh` to generating
`.wat` tests.

* Remove `static-heap-with-guard-pages.clif` test

Will be subsumed when we port `make-heap-load-store-tests.sh` over to `.wat`
tests.

* Remove more heap tests

These will be subsumed by porting `make-heap-load-store-tests.sh` over to `.wat`
tests.

* Remove `heap_addr` from `simple-alias.clif` test

* Remove `heap_addr` from partial-redundancy.clif test

* Remove `heap_addr` from multiple-blocks.clif test

* Remove `heap_addr` from fence.clif test

* Remove `heap_addr` from extends.clif test

* Remove runtests that rely on heaps

Heaps are not a thing in CLIF or the interpreter anymore

* Add generated load/store `.wat` tests

* Enable memory-related wasm features in `.wat` tests

* Remove CLIF heap from fcmp-mem-bug.clif test

* Add a mode for compiling `.wat` all the way to assembly in filetests

* Also generate WAT to assembly tests in `make-load-store-tests.sh`

* cargo fmt

* Reinstate `f{de,pro}mote.clif` tests without the heap bits

* Remove undefined doc link

* Remove outdated SVG and dot file from docs

* Add docs about `None` returns for base address computation helpers

* Factor out `env.heap_access_spectre_mitigation()` to a local

* Expand docs for `FuncEnvironment::heaps` trait method

* Restore f{de,pro}mote+load clif runtests with stack memory
This commit is contained in:
Nick Fitzgerald
2022-12-14 16:26:45 -08:00
committed by GitHub
parent e03d65cca7
commit c0b587ac5f
198 changed files with 2494 additions and 4232 deletions

View File

@@ -559,148 +559,6 @@ GV = [colocated] symbol Name
:arg Name: External name.
:result GV: Global value.
### Heaps
Code compiled from WebAssembly or asm.js runs in a sandbox where it can't access
all process memory. Instead, it is given a small set of memory areas to work
in, and all accesses are bounds checked. Cranelift models this through the
concept of *heaps*.
A heap is declared in the function preamble and can be accessed with the
`heap_addr` instruction that [traps] on out-of-bounds accesses or
returns a pointer that is guaranteed to trap. Heap addresses can be smaller than
the native pointer size, for example unsigned `i32` offsets on a 64-bit
architecture.
![Heap address space layout](./heap.svg)
A heap appears as three consecutive ranges of address space:
1. The *mapped pages* are the [accessible] memory range in the heap. A
heap may have a minimum guaranteed size which means that some mapped pages
are always present.
2. The *unmapped pages* is a possibly empty range of address space that may be
mapped in the future when the heap is grown. They are [addressable] but
not [accessible].
3. The *offset-guard pages* is a range of address space that is guaranteed to
always cause a trap when accessed. It is used to optimize bounds checking for
heap accesses with a shared base pointer. They are [addressable] but
not [accessible].
The *heap bound* is the total size of the mapped and unmapped pages. This is
the bound that `heap_addr` checks against. Memory accesses inside the
heap bounds can trap if they hit an unmapped page (which is not
[accessible]).
Two styles of heaps are supported, *static* and *dynamic*. They behave
differently when resized.
#### Static heaps
A *static heap* starts out with all the address space it will ever need, so it
never moves to a different address. At the base address is a number of mapped
pages corresponding to the heap's current size. Then follows a number of
unmapped pages where the heap can grow up to its maximum size. After the
unmapped pages follow the offset-guard pages which are also guaranteed to
generate a trap when accessed.
H = static Base, min MinBytes, bound BoundBytes, offset_guard OffsetGuardBytes
Declare a static heap in the preamble.
:arg Base: Global value holding the heap's base address.
:arg MinBytes: Guaranteed minimum heap size in bytes. Accesses below this
size will never trap.
:arg BoundBytes: Fixed heap bound in bytes. This defines the amount of
address space reserved for the heap, not including the offset-guard
pages.
:arg OffsetGuardBytes: Size of the offset-guard pages in bytes.
#### Dynamic heaps
A *dynamic heap* can be relocated to a different base address when it is
resized, and its bound can move dynamically. The offset-guard pages move when
the heap is resized. The bound of a dynamic heap is stored in a global value.
H = dynamic Base, min MinBytes, bound BoundGV, offset_guard OffsetGuardBytes
Declare a dynamic heap in the preamble.
:arg Base: Global value holding the heap's base address.
:arg MinBytes: Guaranteed minimum heap size in bytes. Accesses below this
size will never trap.
:arg BoundGV: Global value containing the current heap bound in bytes.
:arg OffsetGuardBytes: Size of the offset-guard pages in bytes.
#### Heap examples
Some Wasm VMs prefer to use fixed heaps with a 4 GB bound and 2 GB of
offset-guard pages when running WebAssembly code on 64-bit CPUs. The combination
of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be
generated for bounds checks at all:
```
test verifier
function %add_members(i32, i64 vmctx) -> f32 {
gv0 = vmctx
gv1 = load.i64 notrap aligned gv0+64
heap0 = static gv1, min 0x1000, bound 0x1_0000_0000, offset_guard 0x8000_0000
block0(v0: i32, v5: i64):
v1 = heap_addr.i64 heap0, v0, 1
v2 = load.f32 v1+16
v3 = load.f32 v1+20
v4 = fadd v2, v3
return v4
}
```
A static heap can also be used for 32-bit code when the WebAssembly module
declares a small upper bound on its memory. A 1 MB static bound with a single 4
KB offset-guard page still has opportunities for sharing bounds checking code:
```
test verifier
function %add_members(i32, i32 vmctx) -> f32 {
gv0 = vmctx
gv1 = load.i32 notrap aligned gv0+64
heap0 = static gv1, min 0x1000, bound 0x10_0000, offset_guard 0x1000
block0(v0: i32, v5: i32):
v1 = heap_addr.i32 heap0, v0, 1
v2 = load.f32 v1+16
v3 = load.f32 v1+20
v4 = fadd v2, v3
return v4
}
```
If the upper bound on the heap size is too large, a dynamic heap is required
instead.
Finally, a runtime environment that simply allocates a heap with
`malloc()` may not have any offset-guard pages at all. In that case,
full bounds checking is required for each access:
```
test verifier
function %add_members(i32, i64 vmctx) -> f32 {
gv0 = vmctx
gv1 = load.i64 notrap aligned gv0+64
gv2 = load.i32 notrap aligned gv0+72
heap0 = dynamic gv1, min 0x1000, bound gv2, offset_guard 0
block0(v0: i32, v6: i64):
v1 = heap_addr.i64 heap0, v0, 20
v2 = load.f32 v1+16
v3 = heap_addr.i64 heap0, v0, 24
v4 = load.f32 v3+20
v5 = fadd v2, v4
return v5
}
```
### Tables
Code compiled from WebAssembly often needs access to objects outside of its
@@ -728,7 +586,7 @@ T = dynamic Base, min MinElements, bound BoundGV, element_size ElementSize
:arg Base: Global value holding the table's base address.
:arg MinElements: Guaranteed minimum table size in elements.
:arg BoundGV: Global value containing the current heap bound in elements.
:arg BoundGV: Global value containing the current table bound in elements.
:arg ElementSize: Size of each element.
### Constant materialization