Commit Graph

7343 Commits

Author SHA1 Message Date
Pat Hickey
f7a0d86c64 install-openvino: typo 2020-11-20 11:22:52 -08:00
Pat Hickey
6681e6786e debug assert could catch double-free 2020-11-20 11:20:40 -08:00
Pat Hickey
f5f180a8fe refactor is_borrowed/unborrow into shared/mut variants 2020-11-19 15:29:12 -08:00
Pat Hickey
224e8b0e88 wasi-nn: fix mutable guestslice borrow 2020-11-19 15:28:53 -08:00
Pat Hickey
fb68c80420 install-openvino: make it easier to invoke on your local machine
put the default version in the shell script, not the yml.
write any files to the directory where the action lives,
and .gitignore them.
2020-11-19 15:23:07 -08:00
Pat Hickey
f9de1d3e5c rename immutable borrows to shared borrows 2020-11-19 14:42:31 -08:00
Pat Hickey
3509883f2d wiggle: add test of overlapping immutable borrows 2020-11-18 15:02:02 -08:00
Pat Hickey
26192d6760 wasi-common: opt in to mutable borrowing 2020-11-18 14:43:47 -08:00
Pat Hickey
fc608e392b wiggle: make Mut variants of GuestStr, GuestPtr 2020-11-18 12:32:21 -08:00
Pat Hickey
78db3ff13b wiggle: borrow checker lives in own crate, and supports both mut/immut 2020-11-18 12:19:47 -08:00
Chris Fallin
bf971efa42 Merge pull request #2426 from cfallin/machinst-trap-info
Carry `MemFlag`s on loads/stores in MachInst backends, and emit trap info only where needed.
2020-11-18 08:23:42 -08:00
Andrew Brown
3d606a01e5 wasi-nn: remove unused functions (#2427) 2020-11-18 09:21:51 -06:00
Chris Fallin
073c727a74 x64 and aarch64: carry MemFlags on loads/stores; don't emit trap info unless an op can trap.
This end result was previously enacted by carrying a `SourceLoc` on
every load/store, which was somewhat cumbersome, and only indirectly
encoded metadata about a memory reference (can it trap) by its presence
or absence. We have a type for this -- `MemFlags` -- that tells us
everything we might want to know about a load or store, and we should
plumb it through to code emission instead.

This PR attaches a `MemFlags` to an `Amode` on x64, and puts it on load
and store `Inst` variants on aarch64. These two choices seem to factor
things out in the nicest way: there are relatively few load/store insts
on aarch64 but many addressing modes, while the opposite is true on x64.
2020-11-17 11:43:06 -08:00
Chris Fallin
e7df081696 Merge pull request #2389 from cfallin/x64-load-op
x64 backend: merge loads into ALU ops when appropriate.
2020-11-17 11:42:19 -08:00
Chris Fallin
b97f07b405 x64 backend: merge loads into ALU ops when appropriate.
This PR makes use of the support in #2366 for sinking effectful
instructions and merging them with consumers. In particular, on x86, we
want to make use of the ability of many instructions to load one operand
directly from memory. That is, instead of this:

```
    movq 0(%rdi), %rax
    addq %rax, %rbx
```

we want to generate this:

```
    addq 0(%rdi), %rax
```

As described in more detail in #2366, sinking and merging the load is
only possible under certain conditions. In particular, we need to ensure
that the use is the *only* use (otherwise the load happens more than
once), and we need to ensure that it does not move across other
effectful ops (see #2366 for how we ensure this).

This change is actually fairly simple, given that all the framework is
in place: we simply pattern-match a load on one operand of an ALU
instruction that takes an RMI (reg, mem, or immediate) operand, and
generate the mem form when we match.

Also makes a drive-by improvement in the x64 backend to use
statically-monomorphized `LowerCtx` types rather than a `&mut dyn
LowerCtx`.

On `bz2.wasm`, this results in ~1% instruction-count reduction. More is
likely possible by following up with other instructions that can merge
memory loads as well.
2020-11-17 11:06:46 -08:00
Nick Fitzgerald
281a41c08b Merge pull request #2406 from fitzgen/remove-typo
wasmtime: Remove typo in doc comment
2020-11-17 10:39:12 -08:00
Nick Fitzgerald
02156eaef3 wasmtime: Remove typo in doc comment 2020-11-17 09:39:38 -08:00
Chris Fallin
9e511ec0c0 Merge pull request #2376 from cfallin/loadsplat
AArch64 SIMD: replace `LoadSplat` with pattern-matching on load+splat
2020-11-17 08:03:21 -08:00
Nick Fitzgerald
d7e4f92030 Merge pull request #2425 from alexcrichton/fix-wrong-store-2
Fix assertion with cross-store values in `Func::new`
2020-11-16 16:36:05 -08:00
Nick Fitzgerald
3dde6559c0 Merge pull request #2408 from alexcrichton/fix-use-after-free-trampoline
Fix a use-after-free of trampoline code
2020-11-16 16:35:02 -08:00
Chris Fallin
712ff22492 AArch64 SIMD: pattern-match load+splat into LD1R instruction. 2020-11-16 15:59:28 -08:00
Chris Fallin
39b5736727 Remove LoadSplat opcode, in preparation for pattern-matching Load+Splat.
This was added as an incremental step to improve AArch64 code quality in
PR #2278. At the time, we did not have a way to pattern-match the load +
splat opcode sequence that the relevant Wasm opcodes lowered to.
However, now with PR #2366, we can merge effectful instructions such as
loads into other ops, and so we can do this pattern matching directly.
The pattern-matching update will come in a subsequent commit.
2020-11-16 15:31:56 -08:00
Chris Fallin
2150a533b6 Merge pull request #2366 from cfallin/load-isel
MachInst lowering logic: allow effectful instructions to merge.
2020-11-16 15:31:38 -08:00
Chris Fallin
3c8cb7b908 MachInst lowering logic: allow effectful instructions to merge.
This PR updates the "coloring" scheme that accounts for side-effects in
the MachInst lowering logic. As a result, the new backends will now be
able to merge effectful operations (such as memory loads) *into* other
operations; previously, only the other way (pure ops merged into
effectful ops) was possible. This will allow, for example, a load+ALU-op
combination, as is common on x86. It should even allow a load + ALU-op +
store sequence to merge into one lowered instruction.

The scheme arose from many fruitful discussions with @julian-seward1
(thanks!); significant credit is due to him for the insights here.

The first insight is that given the right basic conditions, i.e.  that
the root instruction is the only use of an effectful instruction's
result, all we need is that the "color" of the effectful instruction is
*one less* than the color of the current instruction. It's easier to
think about colors on the program points between instructions: if the
color coming *out* of the first (effectful def) instruction and *in* to
the second (effectful or effect-free use) instruction are the same, then
they can merge. Basically the color denotes a version of global state;
if the same, then no other effectful ops happened in the meantime.

The second insight is that we can keep state as we scan, tracking the
"current color", and *update* this when we sink (merge) an op. Hence
when we sink a load into another op, we effectively *re-color* every
instruction it moved over; this may allow further sinks.

Consider the example (and assume that we consider loads effectful in
order to conservatively ensure a strong memory model; otherwise, replace
with other effectful value-producing insts):

```
  v0 = load x
  v1 = load y
  v2 = add v0, 1
  v3 = add v1, 1
```

Scanning from bottom to top, we first see the add producing `v3` and we
can sink the load producing `v1` into it, producing a load + ALU-op
machine instruction. This is legal because `v1` moves over only `v2`,
which is a pure instruction. Consider, though, `v2`: under a simple
scheme that has no other context, `v0` could not sink to `v2` because it
would move over `v1`, another load. But because we already sunk `v1`
down to `v3`, we are free to sink `v0` to `v2`; the update of the
"current color" during the scan allows this.

This PR also cleans up the `LowerCtx` interface a bit at the same time:
whereas previously it always gave some subset of (constant, mergeable
inst, register) directly from `LowerCtx::get_input()`, it now returns
zero or more of (constant, mergable inst) from
`LowerCtx::maybe_get_input_as_source_or_const()`, and returns the
register only from `LowerCtx::put_input_in_reg()`. This removes the need
to explicitly denote uses of the register, so it's a little safer.

Note that this PR does not actually make use of the new ability to merge
loads into other ops; that will come in future PRs, especially to
optimize the `x64` backend by using direct-memory operands.
2020-11-16 14:53:45 -08:00
Alex Crichton
ffca0fc908 Fix assertion with cross-store values in Func::new
If a host-defined `Func::new` closure returns values from the wrong
store, this currently trips a debug assertion and causes other issues
elsewhere in release mode. This commit adds the same dynamic checks
found in `Func::wrap` in the `Func::new` case today.
2020-11-16 12:34:02 -08:00
Alex Crichton
8675fa5aa7 Fix a memory leak on returning incompatible values (#2424)
This fixes an issue where if a store-incompatible value is returned from
a host-defined function then that value is leaked. Practically this
means that it's possible to accidentally leak `Func` values, but a
simple insertion of a `drop` does the trick!
2020-11-16 14:26:48 -06:00
Andrew Brown
a61f068c64 Add an initial wasi-nn implementation for Wasmtime (#2208)
* Add an initial wasi-nn implementation for Wasmtime

This change adds a crate, `wasmtime-wasi-nn`, that uses `wiggle` to expose the current state of the wasi-nn API and `openvino` to implement the exposed functions. It includes an end-to-end test demonstrating how to do classification using wasi-nn:
 - `crates/wasi-nn/tests/classification-example` contains Rust code that is compiled to the `wasm32-wasi` target and run with a Wasmtime embedding that exposes the wasi-nn calls
 - the example uses Rust bindings for wasi-nn contained in `crates/wasi-nn/tests/wasi-nn-rust-bindings`; this crate contains code generated by `witx-bindgen` and eventually should be its own standalone crate

* Test wasi-nn as a CI step

This change adds:
 - a GitHub action for installing OpenVINO
 - a script, `ci/run-wasi-nn-example.sh`, to run the classification example
2020-11-16 12:54:00 -06:00
Ivan Zvonimir Horvat
61a0bcbdc6 examples: threads.rs; fixed eun typo -> run (#2422) 2020-11-16 11:48:49 -06:00
Chris Fallin
fd36be3682 Merge pull request #2420 from bytecodealliance/fitzgen-patch-1
Update docs to reflect that reference types work on aarch64 now
2020-11-16 09:37:31 -08:00
Nick Fitzgerald
5256cd2e87 Update docs to reflect that reference types work on aarch64 now 2020-11-16 08:23:03 -08:00
Chris Fallin
7b9d870030 Merge pull request #2410 from cfallin/x64-gc
Fix and enable GC on new x64 backend.
2020-11-13 09:40:05 -08:00
Chris Fallin
88fce766b0 Merge pull request #2411 from cfallin/x86-backend-cfg
Don't run old x86 backend-specific tests with new x64 backend.
2020-11-13 09:29:16 -08:00
Ivan Zvonimir Horvat
5995c3774f Command: config; fix message typo (#2412) 2020-11-13 14:28:27 +01:00
Chris Fallin
0d703c12ed Don't run old x86 backend-specific tests with new x64 backend.
Some of the test failures tracked by #2079 are in unwind tests that are
specific to the old x86 backend: namely, these tests invoke the unwind
implementation that is paired with the old backend, rather than generic
over all backends. It thus doesn't make sense to try to run these tests
with the new backend. (The new backend's unwind code should have
analogous tests written/ported over eventually.)

It seems that we were actually building *both* x86 backends when the
`x64` feature was enabled, except that the old x86 backend would never
be instantiated by the usual ISA-lookup logic because a `x86-64` target
triple unconditionally resolves to the new one.

This PR resolves both of the issues by tweaking the feature-config
directives to exclude the `x86` backend when `x64` is enabled.
2020-11-12 20:44:53 -08:00
Chris Fallin
01b60e81b0 Fix and enable GC on new x64 backend.
One critical bit of plumbing was missing: the `StackMapSink` passed to
`compile_and_emit` was not actually receiving stackmaps. This seemingly
very basic issue was not caught because the other major user of reftype
support, SpiderMonkey, extracts stackmaps with a lower-level API. The
SM integration was built this way to avoid an awkward API quirk when
passing stackmaps through a `CodeSink` that proxies them to a
`StackMapSink`: the `CodeSink` wants `Value`s for each reference slot,
while the actual `StackMapSink` does not require these. This PR tweaks
the plumbing in a slightly different way to make `wasmtime` GC tests,
and presumably other consumers of stack-map info from the top-level
Cranelift interface, happy.
2020-11-12 16:55:18 -08:00
Chris Fallin
113d061129 Merge pull request #2369 from akirilov-arm/move_fix
Cranelift AArch64: Various small fixes
2020-11-12 14:59:10 -08:00
Alex Crichton
f4c3622dab Fix a use-after-free of trampoline code
This commit fixes an issue with wasmtime where it was possible for a
trampoline from one module to get used for another module after it was
freed. This issue arises because we register a module's native
trampolines *before* it's fully instantiated, which is a fallible
process. Some fallibility is predictable, such as import type
mismatches, but other fallibility is less predictable, such as failure
to allocate a linear memory.

The problem happened when a module was registered with a `Store`,
retaining information about its trampolines, but then instantiation
failed and the module's code was never persisted within the `Store`.
Unlike as documented in #2374 the `Module` inside an `Instance` is not
the primary way to hold on to a module's code, but rather the
`Arc<ModuleCode>` is persisted within the global frame information off
on the side. This persistence only made its way into the store through
the `Box<Any>` field of `InstanceHandle`, but that's never made if
instantiation fails during import matching.

The fix here is to build on the refactoring of #2407 to not store module
code in frame information but rather explicitly in the `Store`.
Registration is now deferred until just-before an instance handle is
created, and during module registration we insert the `Arc<ModuleCode>`
into a set stored within the `Store`.
2020-11-12 14:33:15 -08:00
Alex Crichton
243ab3b542 Remove the global variable associated with traps
This commit removes the global variable associated with wasm traps which
stores frame information. The only purpose of this global is to help
symbolicate `Trap`s created since we support creating a `Trap` without a
`Store`. The global, however, is only used for wasm frames on the stack,
and when wasm frames are on the stack we know that our thread local for
"what was the last context" is set and configured.

The change here is to hijack this thread-local some more to effectively
store the `Store` inside of it. All frame information is then moved
directly into `Store` and no longer lives off on the side in a global.
Additionally support for registering/unregistering modules is now
simplified because once a module is registered with a store it can never
be unregistered.

This has one slight functional change where if there are two instances
of `Store` interleaving calls to wasm code on the stack we'll only be
able to symbolicate one of them instead of both. That's arguably also a
feature however because this is sort of a way to leak information across
stores right now.

Otherwise, though, this isn't intended to change any existing logic, but
instead keep everything working as-is.
2020-11-12 14:33:02 -08:00
Andrew Brown
ad61eb4eb9 [machinst x64]: enable more SIMD spec tests 2020-11-12 14:21:45 -08:00
Andrew Brown
bd93e69eb4 [machinst x64]: implement packed shifts 2020-11-12 14:21:45 -08:00
Andrew Brown
8ba92853be [machinst x64]: add punpack[hl]bw instructions 2020-11-12 14:21:45 -08:00
Andrew Brown
8131b15921 [machinst x64]: allow addressing of constants 2020-11-12 14:21:45 -08:00
Alex Crichton
01b7d88641 Split up src/runtime.rs in wasmtime (#2404)
This file has grown quite a lot with `Store` over time so this splits it
up into three separate files, one for each of the main types defined in
it: `Config`, `Engine`, and `Store`.
2020-11-12 15:50:56 -06:00
Chris Fallin
c19762d5c2 Merge pull request #2354 from uweigand/fix-builtinuext
Add extension marker to i32 arguments of builtin functions
2020-11-12 12:27:44 -08:00
Chris Fallin
89dbc4590d Merge pull request #2363 from cfallin/extend-only-if-abi
Do value-extensions at ABI boundaries only when ABI requires it.
2020-11-12 12:26:20 -08:00
Chris Fallin
fd6433aaf5 Merge pull request #2395 from cfallin/lucet-x64-support
Add support for brff/brif and icmp_sp to new x64 backend to support Lucet.
2020-11-12 12:10:52 -08:00
Alex Crichton
068340d30f Fix a case of using the wrong stack map during gcs (#2396)
This commit fixes an issue where when looking up the stack map for a pc
within a function we might end up reading the *previous* function's
stack maps. This then later caused asserts to trip because we started
interpreting random data as a `VMExternRef` when it wasn't. The fix was
to add `None` markers for "this range has no stack map" in the function
ranges map.

Closes #2386
2020-11-12 13:24:00 -06:00
Julian Seward
cbce34af0a aarch64/inst/unwind.rs: handle zero-length prologues correctly. 2020-11-12 17:41:21 +01:00
Anton Kirilov
edaada3f57 Cranelift AArch64: Various small fixes
* Use FMOV to move 64-bit FP registers and SIMD vectors.
* Add support for additional vector load types.
* Fix the printing of Inst::LoadAddr.

Copyright (c) 2020, Arm Limited.
2020-11-12 13:54:05 +00:00
Chris Fallin
19640367db Merge pull request #2394 from cfallin/no-size-asserts
Remove size-of-struct asserts that break with some Rust versions.
2020-11-11 18:04:34 -08:00