Commit Graph

8178 Commits

Author SHA1 Message Date
Chris Fallin
1f21b32e99 Merge pull request #2838 from uweigand/optionalfp
Allow unwind support to work without a frame pointer
2021-04-14 10:58:51 -07:00
Chris Fallin
337cc47d2f Merge pull request #2840 from bnjbvr/fix-2839
cranelift: always spill i32 with i64 stores
2021-04-14 10:11:47 -07:00
Benjamin Bouvier
e7bced9512 cranelift: always spill i32 with i64 stores;
Fixes #2839. See also the issue description and comments in this commits for
details of what the fix is about here.
2021-04-14 18:08:52 +02:00
Ulrich Weigand
5904c09682 Allow unwind support to work without a frame pointer
The patch extends the unwinder to support targets that do not need
to use a dedicated frame pointer register.  Specifically, the
changes include:

- Change the "fp" routine in the RegisterMapper to return an
  *optional* frame pointer regsiter via Option<Register>.

- On targets that choose to not define a FP register via the above
  routine, the UnwindInst::DefineNewFrame operation no longer switches
  the CFA to be defined in terms of the FP.  (The operation still can
  be used to define the location of the clobber area.)

- In addition, on targets that choose not to define a FP register, the
  UnwindInst::PushFrameRegs operation is not supported.

- There is a new operation UnwindInst::StackAlloc that needs to be
  called on targets without FP whenever the stack pointer is updated.
  This caused the CFA offset to be adjusted accordingly.  (On
  targets with FP this operation is a no-op.)
2021-04-14 15:32:31 +02:00
Andrew Brown
45bee40f33 wasi-nn: use the newly-published Rust bindings for wasi-nn
The bindings are now published on [crates.io](https://crates.io/crates/wasi-nn) and have been moved to their [own repository](https://github.com/bytecodealliance/wasi-nn).
2021-04-13 16:00:06 -07:00
Andrew Brown
e9e4afe2c7 wasi-nn: use the MobileNet model instead of AlexNet
The MobileNet model is significantly smaller in size (14MB) than the AlexNet model (233MB); this change should reduce bandwidth used during CI.
2021-04-13 16:00:06 -07:00
Chris Fallin
27b3162f87 Merge pull request #2833 from abrown/2826
x64: fix Inst::store to understand all scalar types
2021-04-13 15:36:41 -07:00
Chris Fallin
8caac9ed79 Merge pull request #2823 from akirilov-arm/callee_saves
Cranelift AArch64: Improve the handling of callee-saved registers
2021-04-13 15:35:46 -07:00
Chris Fallin
f222802b7a Merge pull request #2828 from bjorn3/fix_srem_i8
Fix srem.{i8,i16}
2021-04-13 15:35:40 -07:00
Andrew Brown
6bdef48473 x64: refactor to use Inst::store during lowering
This re-factoring replaces uses of `Inst::mov_r_m` with `Inst::store` to ensure there is only one code location to troubleshoot when generating store instructions for a specific type.
2021-04-13 13:09:07 -07:00
Andrew Brown
9b25b06d86 x64: store to all scalar sizes
Previously, `Inst::store` only understood a subset of the scalar types, which resulted in failures seen in #2826. This change allows `Inst::store` to generate instructions for all scalar widths (`8 | 16 | 32 | 64`) since all of these are supported in the emission code of `Inst::MovRM`.
2021-04-13 12:38:35 -07:00
bjorn3
b272d4b7da Fix srem.{i8,i16} 2021-04-13 21:28:27 +02:00
Anton Kirilov
7248abd591 Cranelift AArch64: Improve the handling of callee-saved registers
SIMD & FP registers are now saved and restored in pairs, similarly
to general-purpose registers. Also, only the bottom 64 bits of the
registers are saved and restored (in case of non-Baldrdash ABIs),
which is the requirement from the Procedure Call Standard for the
Arm 64-bit Architecture.

As for the callee-saved general-purpose registers, if a procedure
needs to save and restore an odd number of them, it no longer uses
store and load pair instructions for the last register.

Copyright (c) 2021, Arm Limited.
2021-04-13 20:23:08 +01:00
Chris Fallin
8387bc0d76 Merge pull request #2830 from cfallin/pin-nightly
CI: pin nightly Rust version to limit breakages to explicit pinning updates.
2021-04-13 12:06:44 -07:00
Chris Fallin
908d47011f CI: pin nightly Rust version to limit breakages to explicit pinning updates. 2021-04-13 11:12:07 -07:00
Chris Fallin
67cc42d4c3 Merge pull request #2750 from bjorn3/anon_allocs
Support declaring anonymous functions and data objects
2021-04-12 12:11:11 -07:00
Nick Fitzgerald
2a32567871 Merge pull request #2821 from alexcrichton/faster-vmoffsets
Precompute fields in `VMOffsets`
2021-04-08 14:17:11 -07:00
Alex Crichton
18dd82ba7d Improve signature lookup happening during instantiation (#2818)
This commit is intended to be a perf improvement for instantiation of
modules with lots of functions. Previously the `lookup_shared_signature`
callback was showing up quite high in profiles as part of instantiation.

As some background, this callback is used to translate from a module's
`SignatureIndex` to a `VMSharedSignatureIndex` which the instance
stores. This callback is called for two reasons, one is to translate all
of the module's own types into `VMSharedSignatureIndex` for the purposes
of `call_indirect` (the translation of that loads from this table to
compare indices). The second reason is that a `VMCallerCheckedAnyfunc`
is prepared for all functions and this embeds a `VMSharedSignatureIndex`
inside of it.

The slow part today is that the lookup callback was called
once-per-function and each lookup involved hashing a full
`WasmFuncType`. Albeit our hash algorithm is still Rust's default
SipHash algorithm which is quite slow, but we also shouldn't need to
re-hash each signature if we see it multiple times anyway.

The fix applied in this commit is to change this lookup callback to an
`enum` where one variant is that there's a table to lookup from. This
table is a `PrimaryMap` which means that lookup is quite fast. The only
thing we need to do is to prepare the table ahead of time. Currently
this happens on the instantiation path because in my measurments the
creation of the table is quite fast compared to the rest of
instantiation. If this becomes an issue, though, we can look into
creating the table as part of `SigRegistry::register_module` and caching
it somewhere (I'm not entirely sure where but I'm sure we can figure it
out).

There's in generally not a ton of efficiency around the `SigRegistry`
type. I'm hoping though that this fixes the next-lowest-hanging-fruit in
terms of performance without complicating the implementation too much. I
tried a few variants and this change seemed like the best balance
between simplicity and still a nice performance gain.

Locally I measured an improvement in instantiation time for a large-ish
module by reducing the time from ~3ms to ~2.6ms per instance.
2021-04-08 15:04:18 -05:00
Alex Crichton
c91e14d83f Precompute fields in VMOffsets
This commit updates the implementation of `VMOffsets` to frontload all
checked arithmetic on construction of the `VMOffsets` which allows
eliding all checked arithmetic when accessing the fields of `VMOffsets`.
For testing and such this adds a new constructor as well from a new
`VMOffsetsFields` structure which is a clone of the old definition.

This should help speed up some profile hot spots I've been seeing where
with all the checked arithmetic on field sizes this was slowing down the
various accessors during instantiation (which uses `VMOffsets` to
initialize various fields of the `VMContext`).
2021-04-08 12:46:17 -07:00
Andrew Brown
8e495ac79d x64: match multiple ISA requirements before emitting
Because there are instructions that are present in more than one ISA feature set, we need to see if any of the ISA requirements match before emitting. This change includes the `VPABSQ` instruction as an example, which is present in both `AVX512F` and `AVX512VL`.
2021-04-08 10:30:39 -07:00
Alex Crichton
c77ea0c5c7 Add some more #[inline] annotations for trivial functions (#2817)
Looking at some profiles these or their related functions were all
showing up, so this commit adds `#[inline]` to allow cross-crate
inlining by default.
2021-04-08 12:23:54 -05:00
Alex Crichton
5c4c03d278 Don't hash for the cache if it's disabled (#2816)
This fixes an issue where even if the wasmtime cache was disabled we're
still calculating the sha256 of modules for the hash key. This hash
was then simply discarded if the cache was disabled!
2021-04-08 11:48:24 -05:00
Alex Crichton
812c8ef368 Update wast to 35.0.2 (#2815)
Fixes a panic or two in translation found by fuzzing.
2021-04-08 11:45:10 -05:00
Peter Huene
2ca97ed7e4 Merge pull request #2814 from fitzgen/inline-vm-offsets
wasmtime-environ: Mark all VM offset functions as `#[inline]`
2021-04-07 18:16:29 -07:00
Peter Huene
45a500701f Merge pull request #2811 from peterhuene/improve-store-registration
Refactor store frame information.
2021-04-07 18:16:10 -07:00
Peter Huene
ad9fa11d48 Code review feedback.
* Remove `once-cell` dependency.
* Remove function address `BTreeMap` from `CompiledModule` in favor of binary
  searching finished functions directly.
* Use `with_capacity` when populating `CompiledModule` finished functions and
  trampolines.
2021-04-07 16:37:04 -07:00
Nick Fitzgerald
ed31f28158 wasmtime-environ: Mark all VM offset functions as #[inline]
Otherwise they won't get inlined across crates unless we enable LTO, and much of
the usage of these function is across crates (eg from the `wasmtime-runtime`
crate).
2021-04-07 16:17:26 -07:00
Alex Crichton
e43d94033f Document guidance around multithreading and Wasmtime (#2812)
* Document guidance around multithreading and Wasmtime

This commit writes a page of documentation for the Wasmtime book to
serve as guidance for embedders looking to add multithreading with
Wasmtime support. As always with any safe Rust API this reading is
optional because you can't mis-use Wasmtime without `unsafe`, but I'm
hoping that this documentation can serve as a point of reference for
folks who want to add multithreading but are confused/annoyed that
Wasmtime's types do not implement the `Send` and `Sync` traits.

Closes #793

* I can type
2021-04-07 16:34:07 -05:00
Peter Huene
875cb92cf0 Refactor store frame information.
This commit refactors the store frame information to eliminate the copying of
data out from `CompiledModule`.

It also moves the population of a `BTreeMap` out of the frame information and
into `CompiledModule` where it is only ever calculated once rather than at
every new module instantiation into a `Store`. The map is also lazy-initialized
so the cost of populating the map is incurred only when a trap occurs.

This should help improve instantiation time of modules with a large number of
functions and functions with lots of instructions.
2021-04-07 12:47:04 -07:00
Alex Crichton
195bf0e29a Fully support multiple returns in Wasmtime (#2806)
* Fully support multiple returns in Wasmtime

For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes #1178

* Tweak some names and add docs

* "fix" lightbeam compile

* Fix TODO with dummy environ

* Unwind info is a property of the target, not the ABI

* Remove lightbeam unused imports

* Attempt to fix arm64

* Document new ABIs aren't stable

* Fix filetests to use the right target

* Don't always do 64-bit stores with cranelift

This was overwriting upper bits when 32-bit registers were being stored
into return values, so fix the code inline to do a sized store instead
of one-size-fits-all store.

* At least get tests passing on the old backend

* Fix a typo

* Add some filetests with mixed abi calls

* Get `multi` example working

* Fix doctests on old x86 backend

* Add a mixture of wasmtime/system_v tests
2021-04-07 12:34:26 -05:00
Benjamin Bouvier
7588565078 Tweaks some tests for Mac aarch64
- some tests don't pass because of bad interactions with the system's
libunwind; ignore them for now.
- the page size on mac aarch64 is 16K, not 4K; tweak some tests which
were expecting 4K or multiples of 4K pages to use a multiple of host page size
instead.
- a cranelift-native test needed an update for the new calling convention.
2021-04-07 14:54:50 +02:00
Chris Fallin
6b77786a6e Merge pull request #2805 from cfallin/release-0.26.0
Release 0.26.0.
2021-04-05 12:01:09 -07:00
Chris Fallin
6bec13da04 Bump versions: Wasmtime to 0.26.0, Cranelift to 0.73.0. 2021-04-05 10:48:42 -07:00
Chris Fallin
ae19bd5daf Update RELEASES.md. 2021-04-05 10:17:33 -07:00
Chris Fallin
8d78212a15 Merge pull request #2718 from cfallin/new-backend
Switch default to new x86_64 backend.
2021-04-05 09:38:08 -07:00
Alex Crichton
04bf6e5bbb Move some scopes around to fix a leak on raising a trap (#2803)
Some recent refactorings accidentally had a local `Store` on the stack
when a longjmp was initiated, bypassing its destructor and causing
`Store` to leak.

Closes #2802
2021-04-05 10:29:18 -05:00
Peter Huene
4d036a4e34 Merge pull request #2801 from peterhuene/fix-initialize-instance-range
Fix incorrect range in `ininitialize_instance`.
2021-04-02 17:30:35 -07:00
Peter Huene
37bb7af454 Fix incorrect range in ininitialize_instance.
This commit fixes a bug where the wrong destination range was used when copying
data from the module's memory initialization upon instance initialization.

This affects the on-demand allocator only when using the `uffd` feature on
Linux and when the Wasm page being initialized is not the last in the module's
initial pages.

Fixes #2784.
2021-04-02 16:27:22 -07:00
Chris Fallin
cb48ea406e Switch default to new x86_64 backend.
This PR switches the default backend on x86, for both the
`cranelift-codegen` crate and for Wasmtime, to the new
(`MachInst`-style, `VCode`-based) backend that has been under
development and testing for some time now.

The old backend is still available by default in builds with the
`old-x86-backend` feature, or by requesting `BackendVariant::Legacy`
from the appropriate APIs.

As part of that switch, it adds some more runtime-configurable plumbing
to the testing infrastructure so that tests can be run using the
appropriate backend. `clif-util test` is now capable of parsing a
backend selector option from filetests and instantiating the correct
backend.

CI has been updated so that the old x86 backend continues to run its
tests, just as we used to run the new x64 backend separately.

At some point, we will remove the old x86 backend entirely, once we are
satisfied that the new backend has not caused any unforeseen issues and
we do not need to revert.
2021-04-02 11:35:53 -07:00
Peter Huene
b7b47e380d Merge pull request #2791 from peterhuene/compile-command
Add a compile command to Wasmtime.
2021-04-02 11:18:14 -07:00
Andrew Brown
d32501c554 x64: refactor REX-specific encoding machinery to its own module
In preparation for adding new encoding modes to the x64 backend (e.g. VEX,
EVEX), this change moves all of the current instruction encoding functions to
`encodings::rex`. This refactor does not change any logic.
2021-04-02 11:17:39 -07:00
Chris Fallin
074b83eee4 Merge pull request #2800 from wrbs/switch-bug
cranelift-frontend: Seal unsealed block in switch generation
2021-04-02 10:32:49 -07:00
Alex Crichton
29949505d6 Fix printing float results from the CLI (#2797)
Previously their bit patterns were printed interpreted as decimals, now
they're printed as floats.
2021-04-02 10:07:59 -05:00
Will Robson
da33496063 Seal the block that should have been sealed 2021-04-02 15:35:12 +01:00
Will Robson
f5f3c2fb25 Add test cases to show the unsealed block in switch generation 2021-04-02 15:34:49 +01:00
Peter Huene
0ddfe97a09 Change how flags are stored in serialized modules.
This commit changes how both the shared flags and ISA flags are stored in the
serialized module to detect incompatibilities when a serialized module is
instantiated.

It improves the error reporting when a compiled module has mismatched shared
flags.
2021-04-01 21:39:57 -07:00
Peter Huene
4ad0099da4 Update wat crate.
Update the `wat` crate to latest version and use `Error::set_path` in
`Module::from_file` to properly record the path associated with errors.
2021-04-01 20:11:26 -07:00
Peter Huene
3da03bcfcf Code review feedback.
* Expand doc comment on `Engine::precompile_module`.
* Add FIXME comment regarding a future ISA flag compatibility check before
  doing a JIT from `Module::from_binary`.
* Remove no-longer-needed CLI groups from the `compile` command.
2021-04-01 19:38:20 -07:00
Peter Huene
9e7d2fed98 Sort output in wasmtime settings.
This commit sorts the settings output by the `wasmtime settings` command.
2021-04-01 19:38:19 -07:00
Peter Huene
d1313b1291 Code review feedback.
* Move `Module::compile` to `Engine::precompile_module`.
* Remove `Module::deserialize` method.
* Make `Module::serialize` the same format as `Engine::precompile_module`.
* Make `Engine::precompile_module` return a `Vec<u8>`.
* Move the remaining serialization-related code to `serialization.rs`.
2021-04-01 19:38:19 -07:00