Commit Graph

220 Commits

Author SHA1 Message Date
Nick Fitzgerald
a1f4b46f64 Bump Wasmtime to version 0.30.0; cranelift to 0.77.0 2021-09-17 10:33:50 -07:00
Nick Fitzgerald
d2ce1ac753 Fix a use-after-free bug when passing ExternRefs to Wasm
We _must not_ trigger a GC when moving refs from host code into
Wasm (e.g. returned from a host function or passed as arguments to a Wasm
function). After insertion into the table, this reference is no longer
rooted. If multiple references are being sent from the host into Wasm and we
allowed GCs during insertion, then the following events could happen:

* Reference A is inserted into the activations table. This does not trigger a
  GC, but does fill the table to capacity.

* The caller's reference to A is removed. Now the only reference to A is from
  the activations table.

* Reference B is inserted into the activations table. Because the table is at
  capacity, a GC is triggered.

* A is reclaimed because the only reference keeping it alive was the activation
  table's reference (it isn't inside any Wasm frames on the stack yet, so stack
  scanning and stack maps don't increment its reference count).

* We transfer control to Wasm, giving it A and B. Wasm uses A. That's a use
  after free.

To prevent uses after free, we cannot GC when moving refs into the
`VMExternRefActivationsTable` because we are passing them from the host to Wasm.

On the other hand, when we are *cloning* -- as opposed to moving -- refs from
the host to Wasm, then it is fine to GC while inserting into the activations
table, because the original referent that we are cloning from is still alive and
rooting the ref.
2021-09-14 14:23:42 -07:00
Alex Crichton
a237e73b5a Remove some allocations in CodeMemory (#3253)
* Remove some allocations in `CodeMemory`

This commit removes the `FinishedFunctions` type as well as allocations
associated with trampolines when allocating inside of a `CodeMemory`.
The main goal of this commit is to improve the time spent in
`CodeMemory` where currently today a good portion of time is spent
simply parsing symbol names and trying to extract function indices from
them. Instead this commit implements a new strategy (different from #3236)
where compilation records offset/length information for all
functions/trampolines so this doesn't need to be re-learned from the
object file later.

A consequence of this commit is that this offset information will be
decoded/encoded through `bincode` unconditionally, but we can also
optimize that later if necessary as well.

Internally this involved quite a bit of refactoring since the previous
map for `FinishedFunctions` was relatively heavily relied upon.

* comments
2021-08-30 10:35:17 -05:00
Alex Crichton
12515e6646 Move trap information to a section of the compiled image (#3241)
This commit moves the `traps` field of `FunctionInfo` into a section of
the compiled artifact produced by Cranelift. This section is quite large
and when previously encoded/decoded with `bincode` this can take quite
some time to process. Traps are expected to be relatively rare and it's
not necessarily the right tradeoff to spend so much time
serializing/deserializing this data, so this commit offloads the section
into a custom-encoded binary format located elsewhere in the compiled image.

This is similar to #3240 in its goal which is to move very large pieces
of metadata to their own sections to avoid decoding anything when we
load a precompiled modules. This also has a small benefit that it's
slightly more efficient storage for the trap information too, but that's
a negligible benefit.

This is part of #3230 to make loading modules fast.
2021-08-27 01:09:55 -05:00
Alex Crichton
fc91176685 Move address maps to a section of the compiled image (#3240)
This commit moves the `address_map` field of `FunctionInfo` into a
custom-encoded section of the executable. The goal of this commit is, as
previous commits, to push less data through `bincode`. The `address_map`
field is actually extremely large and has huge benefits of not being
decoded when we load a module. This data is only used for traps and such
as well, so it's not overly important that it's massaged in to precise
data the runtime can extremely speedily use.

The `FunctionInfo` type does retain a tiny bit of information about the
function itself (it's start source location), but other than that the
`FunctionAddressMap` structure is moved from `wasmtime-environ` to
`wasmtime-cranelift` since it's now no longer needed outside of that
context.
2021-08-26 23:06:41 -05:00
Alex Crichton
7d05ebe7ff Move wasm data/debuginfo into the ELF compilation image (#3235)
* Move wasm data/debuginfo into the ELF compilation image

This commit moves existing allocations of `Box<[u8]>` stored separately
from compilation's final ELF image into the ELF image itself. The goal
of this commit is to reduce the amount of data which `bincode` will need
to process in the future. DWARF debugging information and wasm data
segments can be quite large, and they're relatively rarely read, so
there's typically no need to copy them around. Instead by moving them
into the ELF image this opens up the opportunity in the future to
eliminate copies and use data directly as-found in the image itself.

For information accessed possibly-multiple times, such as the wasm data
ranges, the indexes of the data within the ELF image are computed when
a `CompiledModule` is created. These indexes are then used to directly
index into the image without having to root around in the ELF file each
time they're accessed.

One other change located here is that the symbolication context
previously cloned the debug information into it to adhere to the
`'static` lifetime safely, but this isn't actually ever used in
`wasmtime` right now so the unsafety around this has been removed and
instead borrowed data is returned (no more clones, yay!).

* Fix lightbeam
2021-08-25 09:03:07 -05:00
Alex Crichton
a662f5361d Move wasm data sections out of wasmtime_environ::Module (#3231)
* Reduce indentation in `to_paged`

Use a few early-returns from `match` to avoid lots of extra indentation.

* Move wasm data sections out of `wasmtime_environ::Module`

This is the first step down the road of #3230. The long-term goal is
that `Module` is always `bincode`-decoded, but wasm data segments are a
possibly very-large portion of this residing in modules which we don't
want to shove through bincode. This refactors the internals of wasmtime
to be ok with this data living separately from the `Module` itself,
providing access at necessary locations.

Wasm data segments are now extracted from a wasm module and
concatenated directly. Data sections then describe ranges within this
concatenated list of data, and passive data works the same way. This
implementation does not lend itself to eventually optimizing the case
where passive data is dropped and no longer needed. That's left for a
future PR.
2021-08-24 14:04:03 -05:00
Alex Crichton
f3977f1d97 Fix determinism of compiled modules (#3229)
* Fix determinism of compiled modules

Currently wasmtime's compilation artifacts are not deterministic due to
the usage of `HashMap` during serialization which has randomized order
of its elements. This commit fixes that by switching to a sorted
`BTreeMap` for various maps. A test is also added to ensure determinism.

If in the future the performance of `BTreeMap` is not as good as
`HashMap` for some of these cases we can implement a fancier
`serialize_with`-style solution where we sort keys during serialization,
but only during serialization and otherwise use a `HashMap`.

* fix lightbeam
2021-08-23 17:08:19 -05:00
Alex Crichton
eb21ae149a Move definition of ModuleMemoryOffset (#3228)
This was historically defined in `wasmtime-environ` but it's only used
in `wasmtime-cranelift`, so this commit moves the definition to the
`debug` module where it's primarily used.
2021-08-23 14:42:21 -05:00
Alex Crichton
f5041dd362 Implement a setting for reserved dynamic memory growth (#3215)
* Implement a setting for reserved dynamic memory growth

Dynamic memories aren't really that heavily used in Wasmtime right now
because for most 32-bit memories they're classified as "static" which
means they reserve 4gb of address space and never move. Growth of a
static memory is simply making pages accessible, so it's quite fast.

With the memory64 feature, however, this is no longer true since all
memory64 memories are classified as "dynamic" at this time. Previous to
this commit growth of a dynamic memory unconditionally moved the entire
linear memory in the host's address space, always resulting in a new
`Mmap` allocation. This behavior is causing fuzzers to time out when
working with 64-bit memories because incrementally growing a memory by 1
page at a time can incur a quadratic time complexity as bytes are
constantly moved.

This commit implements a scheme where there is now a tunable setting for
memory to be reserved at the end of a dynamic memory to grow into. This
means that dynamic memory growth is ideally amortized as most calls to
`memory.grow` will be able to grow into the pre-reserved space. Some
calls, though, will still need to copy the memory around.

This helps enable a commented out test for 64-bit memories now that it's
fast enough to run in debug mode. This is because the growth of memory
in the test no longer needs to copy 4gb of zeros.

* Test fixes & review comments

* More comments
2021-08-20 10:54:23 -05:00
Alex Crichton
f1793934d6 Disable default features of gimli (#3208)
* Disable default features of `gimli`

For cranelift-less builds this avoids pulling in extra dependencies into
`gimli` that we don't need, improving build times slightly.

* Enable read features where necessary
2021-08-19 10:30:18 -05:00
Alex Crichton
87c33c2969 Remove wasmtime-environ's dependency on cranelift-codegen (#3199)
* Move `CompiledFunction` into wasmtime-cranelift

This commit moves the `wasmtime_environ::CompiledFunction` type into the
`wasmtime-cranelift` crate. This type has lots of Cranelift-specific
pieces of compilation and doesn't need to be generated by all Wasmtime
compilers. This replaces the usage in the `Compiler` trait with a
`Box<Any>` type that each compiler can select. Each compiler must still
produce a `FunctionInfo`, however, which is shared information we'll
deserialize for each module.

The `wasmtime-debug` crate is also folded into the `wasmtime-cranelift`
crate as a result of this commit. One possibility was to move the
`CompiledFunction` commit into its own crate and have `wasmtime-debug`
depend on that, but since `wasmtime-debug` is Cranelift-specific at this
time it didn't seem like it was too too necessary to keep it separate.
If `wasmtime-debug` supports other backends in the future we can
recreate a new crate, perhaps with it refactored to not depend on
Cranelift.

* Move wasmtime_environ::reference_type

This now belongs in wasmtime-cranelift and nowhere else

* Remove `Type` reexport in wasmtime-environ

One less dependency on `cranelift-codegen`!

* Remove `types` reexport from `wasmtime-environ`

Less cranelift!

* Remove `SourceLoc` from wasmtime-environ

Change the `srcloc`, `start_srcloc`, and `end_srcloc` fields to a custom
`FilePos` type instead of `ir::SourceLoc`. These are only used in a few
places so there's not much to lose from an extra abstraction for these
leaf use cases outside of cranelift.

* Remove wasmtime-environ's dep on cranelift's `StackMap`

This commit "clones" the `StackMap` data structure in to
`wasmtime-environ` to have an independent representation that that
chosen by Cranelift. This allows Wasmtime to decouple this runtime
dependency of stack map information and let the two evolve
independently, if necessary.

An alternative would be to refactor cranelift's implementation into a
separate crate and have wasmtime depend on that but it seemed a bit like
overkill to do so and easier to clone just a few lines for this.

* Define code offsets in wasmtime-environ with `u32`

Don't use Cranelift's `binemit::CodeOffset` alias to define this field
type since the `wasmtime-environ` crate will be losing the
`cranelift-codegen` dependency soon.

* Commit to using `cranelift-entity` in Wasmtime

This commit removes the reexport of `cranelift-entity` from the
`wasmtime-environ` crate and instead directly depends on the
`cranelift-entity` crate in all referencing crates. The original reason
for the reexport was to make cranelift version bumps easier since it's
less versions to change, but nowadays we have a script to do that.
Otherwise this encourages crates to use whatever they want from
`cranelift-entity` since  we'll always depend on the whole crate.

It's expected that the `cranelift-entity` crate will continue to be a
lean crate in dependencies and suitable for use at both runtime and
compile time. Consequently there's no need to avoid its usage in
Wasmtime at runtime, since "remove Cranelift at compile time" is
primarily about the `cranelift-codegen` crate.

* Remove most uses of `cranelift-codegen` in `wasmtime-environ`

There's only one final use remaining, which is the reexport of
`TrapCode`, which will get handled later.

* Limit the glob-reexport of `cranelift_wasm`

This commit removes the glob reexport of `cranelift-wasm` from the
`wasmtime-environ` crate. This is intended to explicitly define what
we're reexporting and is a transitionary step to curtail the amount of
dependencies taken on `cranelift-wasm` throughout the codebase. For
example some functions used by debuginfo mapping are better imported
directly from the crate since they're Cranelift-specific. Note that
this is intended to be a temporary state affairs, soon this reexport
will be gone entirely.

Additionally this commit reduces imports from `cranelift_wasm` and also
primarily imports from `crate::wasm` within `wasmtime-environ` to get a
better sense of what's imported from where and what will need to be
shared.

* Extract types from cranelift-wasm to cranelift-wasm-types

This commit creates a new crate called `cranelift-wasm-types` and
extracts type definitions from the `cranelift-wasm` crate into this new
crate. The purpose of this crate is to be a shared definition of wasm
types that can be shared both by compilers (like Cranelift) as well as
wasm runtimes (e.g. Wasmtime). This new `cranelift-wasm-types` crate
doesn't depend on `cranelift-codegen` and is the final step in severing
the unconditional dependency from Wasmtime to `cranelift-codegen`.

The final refactoring in this commit is to then reexport this crate from
`wasmtime-environ`, delete the `cranelift-codegen` dependency, and then
update all `use` paths to point to these new types.

The main change of substance here is that the `TrapCode` enum is
mirrored from Cranelift into this `cranelift-wasm-types` crate. While
this unfortunately results in three definitions (one more which is
non-exhaustive in Wasmtime itself) it's hopefully not too onerous and
ideally something we can patch up in the future.

* Get lightbeam compiling

* Remove unnecessary dependency

* Fix compile with uffd

* Update publish script

* Fix more uffd tests

* Rename cranelift-wasm-types to wasmtime-types

This reflects the purpose a bit more where it's types specifically
intended for Wasmtime and its support.

* Fix publish script
2021-08-18 13:14:52 -05:00
Alex Crichton
03a3a5939a Move module translation from cranelift to wasmtime (#3196)
The main purpose for doing this is that this is a large piece of
functionality used by Wasmtime which is entirely independent of
Cranelift. Eventually Wasmtime wants to be able to compile without
Cranelift, but it can't also depend on `cranelift-wasm` in that
situation for module translation which means that something needs to
happen. One option is to refactor what's in `cranelift-wasm` into a
separate crate (since all these pieces don't actually depend on
`cranelift-codegen`), but I personally chose to not do this because:

* The `ModuleEnvironment` trait, AFAIK, only has a primary user of
  Wasmtime. The Spidermonkey integration, for example, does not use this.

* This is an extra layer of abstraction between Wasmtime and the
  compilation phase which was a bit of a pain to maintain. It couldn't
  be Wasmtime-specific as it was part of Cranelift but at the same time
  it had lots of Wasmtime-centric functionality (such as module
  linking).

* Updating the "dummy" implementation has become pretty onerous over
  time as frequent additions are made and the "dummy" implementation was
  never actually used anywhere. This ended up feeling like effectively
  busy-work to update this.

For these reasons I've opted to to move the meat of `cranelift-wasm`
used by `wasmtime-environ` directly into `wasmtime-environ`. This means
that the only real meat that Wasmtime uses from `cranelift-wasm` is the
function-translation bits in the `wasmtime-cranelift` crate.

The changes in `wasmtime-environ` are largely to inline module parsing
together so it's a bit easier to follow instead of trying to connect
the dots between lots of various function calls.
2021-08-18 12:15:02 -05:00
Alex Crichton
e8aa7bb53b Reimplement how unwind information is stored (#3180)
* Reimplement how unwind information is stored

This commit is a major refactoring of how unwind information is stored
after compilation of a function has finished. Previously we would store
the raw `UnwindInfo` as a result of compilation and this would get
serialized/deserialized alongside the rest of the ELF object that
compilation creates. Whenever functions were registered with
`CodeMemory` this would also result in registering unwinding information
dynamically at runtime, which in the case of Unix, for example, would
dynamically created FDE/CIE entries on-the-fly.

Eventually I'd like to support compiling Wasmtime without Cranelift, but
this means that `UnwindInfo` wouldn't be easily available to decode into
and create unwinding information from. To solve this I've changed the
ELF object created to have the unwinding information encoded into it
ahead-of-time so loading code into memory no longer needs to create
unwinding tables. This change has two different implementations for
Windows/Unix:

* On Windows the implementation was much easier. The unwinding
  information on Windows is already stored after the function itself in
  the text section. This was actually slightly duplicated in object
  building and in code memory allocation. Now the object building
  continues to do the same, recording unwinding information after
  functions, and code memory no longer manually tracks this.
  Additionally Wasmtime will emit a special custom section in the object
  file with unwinding information which is the list of
  `RUNTIME_FUNCTION` structures that `RtlAddFunctionTable` expects. This
  means that the object file has all the information precompiled into it
  and registration at runtime is simply passing a few pointers around to
  the runtime.

* Unix was a little bit more difficult than Windows. Today a `.eh_frame`
  section is created on-the-fly with offsets in FDEs specified as the
  absolute address that functions are loaded at. This absolute
  address hindered the ability to precompile the FDE into the object
  file itself. I've switched how addresses are encoded, though, to using
  `DW_EH_PE_pcrel` which means that FDE addresses are now specified
  relative to the FDE itself. This means that we can maintain a fixed
  offset between the `.eh_frame` loaded in memory and the beginning of
  code memory. When doing so this enables precompiling the `.eh_frame`
  section into the object file and at runtime when loading an object no
  further construction of unwinding information is needed.

The overall result of this commit is that unwinding information is no
longer stored in its cranelift-data-structure form on disk. This means
that this unwinding information format is only present during
compilation, which will make it that much easier to compile out
cranelift in the future.

This commit also significantly refactors `CodeMemory` since the way
unwinding information is handled is not much different from before.
Previously `CodeMemory` was suitable for incrementally adding more and
more functions to it, but nowadays a `CodeMemory` either lives per
module (in which case all functions are known up front) or it's created
once-per-`Func::new` with two trampolines. In both cases we know all
functions up front so the functionality of incrementally adding more and
more segments is no longer needed. This commit removes the ability to
add a function-at-a-time in `CodeMemory` and instead it can now only
load objects in their entirety. A small helper function is added to
build a small object file for trampolines in `Func::new` to handle
allocation there.

Finally, this commit also folds the `wasmtime-obj` crate directly into
the `wasmtime-cranelift` crate and its builder structure to be more
amenable to this strategy of managing unwinding tables.

It is not intentional to have any real functional change as a result of
this commit. This might accelerate loading a module from cache slightly
since less work is needed to manage the unwinding information, but
that's just a side benefit from the main goal of this commit which is to
remove the dependence on cranelift unwinding information being available
at runtime.

* Remove isa reexport from wasmtime-environ

* Trim down reexports of `cranelift-codegen`

Remove everything non-essential so that only the bits which will need to
be refactored out of cranelift remain.

* Fix debug tests

* Review comments
2021-08-17 17:14:18 -05:00
Alex Crichton
0313e30d76 Remove dependency on TargetIsa from Wasmtime crates (#3178)
This commit started off by deleting the `cranelift_codegen::settings`
reexport in the `wasmtime-environ` crate and then basically played
whack-a-mole until everything compiled again. The main result of this is
that the `wasmtime-*` family of crates have generally less of a
dependency on the `TargetIsa` trait and type from Cranelift. While the
dependency isn't entirely severed yet this is at least a significant
start.

This commit is intended to be largely refactorings, no functional
changes are intended here. The refactorings are:

* A `CompilerBuilder` trait has been added to `wasmtime_environ` which
  server as an abstraction used to create compilers and configure them
  in a uniform fashion. The `wasmtime::Config` type now uses this
  instead of cranelift-specific settings. The `wasmtime-jit` crate
  exports the ability to create a compiler builder from a
  `CompilationStrategy`, which only works for Cranelift right now. In a
  cranelift-less build of Wasmtime this is expected to return a trait
  object that fails all requests to compile.

* The `Compiler` trait in the `wasmtime_environ` crate has been souped
  up with a number of methods that Wasmtime and other crates needed.

* The `wasmtime-debug` crate is now moved entirely behind the
  `wasmtime-cranelift` crate.

* The `wasmtime-cranelift` crate is now only depended on by the
  `wasmtime-jit` crate.

* Wasm types in `cranelift-wasm` no longer contain their IR type,
  instead they only contain the `WasmType`. This is required to get
  everything to align correctly but will also be required in a future
  refactoring where the types used by `cranelift-wasm` will be extracted
  to a separate crate.

* I moved around a fair bit of code in `wasmtime-cranelift`.

* Some gdb-specific jit-specific code has moved from `wasmtime-debug` to
  `wasmtime-jit`.
2021-08-16 09:55:39 -05:00
Alex Crichton
e9f33fc618 Move all trampoline compilation to wasmtime-cranelift (#3176)
* Move all trampoline compilation to `wasmtime-cranelift`

This commit moves compilation of all the trampolines used in wasmtime
behind the `Compiler` trait object to live in `wasmtime-cranelift`. The
long-term goal of this is to enable depending on cranelift *only* from
the `wasmtime-cranelift` crate, so by moving these dependencies we
should make that a little more flexible.

* Fix windows build
2021-08-12 16:58:21 -05:00
Alex Crichton
e68aa99588 Implement the memory64 proposal in Wasmtime (#3153)
* Implement the memory64 proposal in Wasmtime

This commit implements the WebAssembly [memory64 proposal][proposal] in
both Wasmtime and Cranelift. In terms of work done Cranelift ended up
needing very little work here since most of it was already prepared for
64-bit memories at one point or another. Most of the work in Wasmtime is
largely refactoring, changing a bunch of `u32` values to something else.

A number of internal and public interfaces are changing as a result of
this commit, for example:

* Acessors on `wasmtime::Memory` that work with pages now all return
  `u64` unconditionally rather than `u32`. This makes it possible to
  accommodate 64-bit memories with this API, but we may also want to
  consider `usize` here at some point since the host can't grow past
  `usize`-limited pages anyway.

* The `wasmtime::Limits` structure is removed in favor of
  minimum/maximum methods on table/memory types.

* Many libcall intrinsics called by jit code now unconditionally take
  `u64` arguments instead of `u32`. Return values are `usize`, however,
  since the return value, if successful, is always bounded by host
  memory while arguments can come from any guest.

* The `heap_addr` clif instruction now takes a 64-bit offset argument
  instead of a 32-bit one. It turns out that the legalization of
  `heap_addr` already worked with 64-bit offsets, so this change was
  fairly trivial to make.

* The runtime implementation of mmap-based linear memories has changed
  to largely work in `usize` quantities in its API and in bytes instead
  of pages. This simplifies various aspects and reflects that
  mmap-memories are always bound by `usize` since that's what the host
  is using to address things, and additionally most calculations care
  about bytes rather than pages except for the very edge where we're
  going to/from wasm.

Overall I've tried to minimize the amount of `as` casts as possible,
using checked `try_from` and checked arithemtic with either error
handling or explicit `unwrap()` calls to tell us about bugs in the
future. Most locations have relatively obvious things to do with various
implications on various hosts, and I think they should all be roughly of
the right shape but time will tell. I mostly relied on the compiler
complaining that various types weren't aligned to figure out
type-casting, and I manually audited some of the more obvious locations.
I suspect we have a number of hidden locations that will panic on 32-bit
hosts if 64-bit modules try to run there, but otherwise I think we
should be generally ok (famous last words). In any case I wouldn't want
to enable this by default naturally until we've fuzzed it for some time.

In terms of the actual underlying implementation, no one should expect
memory64 to be all that fast. Right now it's implemented with
"dynamic" heaps which have a few consequences:

* All memory accesses are bounds-checked. I'm not sure how aggressively
  Cranelift tries to optimize out bounds checks, but I suspect not a ton
  since we haven't stressed this much historically.

* Heaps are always precisely sized. This means that every call to
  `memory.grow` will incur a `memcpy` of memory from the old heap to the
  new. We probably want to at least look into `mremap` on Linux and
  otherwise try to implement schemes where dynamic heaps have some
  reserved pages to grow into to help amortize the cost of
  `memory.grow`.

The memory64 spec test suite is scheduled to now run on CI, but as with
all the other spec test suites it's really not all that comprehensive.
I've tried adding more tests for basic things as I've had to implement
guards for them, but I wouldn't really consider the testing adequate
from just this PR itself. I did try to take care in one test to actually
allocate a 4gb+ heap and then avoid running that in the pooling
allocator or in emulation because otherwise that may fail or take
excessively long.

[proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

* Fix some tests

* More test fixes

* Fix wasmtime tests

* Fix doctests

* Revert to 32-bit immediate offsets in `heap_addr`

This commit updates the generation of addresses in wasm code to always
use 32-bit offsets for `heap_addr`, and if the calculated offset is
bigger than 32-bits we emit a manual add with an overflow check.

* Disable memory64 for spectest fuzzing

* Fix wrong offset being added to heap addr

* More comments!

* Clarify bytes/pages
2021-08-12 09:40:20 -05:00
Alex Crichton
85f16f488d Consolidate address calculations for atomics (#3143)
* Consolidate address calculations for atomics

This commit consolidates all calcuations of guest addresses into one
`prepare_addr` function. This notably remove the atomics-specifics paths
as well as the `prepare_load` function (now renamed to `prepare_addr`
and folded into `get_heap_addr`).

The goal of this commit is to simplify how addresses are managed in the
code generator for atomics to use all the shared infrastrucutre of other
loads/stores as well. This additionally fixes #3132 via the use of
`heap_addr` in clif for all operations.

I also added a number of tests for loads/stores with varying alignments.
Originally I was going to allow loads/stores to not be aligned since
that's what the current formal specification says, but the overview of
the threads proposal disagrees with the formal specification, so I
figured I'd leave it as-is but adding tests probably doesn't hurt.

Closes #3132

* Fix old backend

* Guarantee misalignment checks happen before out-of-bounds
2021-08-04 15:57:56 -05:00
Alex Crichton
a33caec9be Bump the wasm-tools crates (#3139)
* Bump the wasm-tools crates

Pulls in some updates here and there, mostly for updating crates to the
latest version to prepare for later memory64 work.

* Update lightbeam
2021-08-04 09:53:47 -05:00
Chris Fallin
a13a777230 Bump to Wasmtime v0.29.0 and Cranelift 0.76.0. 2021-08-02 11:24:09 -07:00
Alex Crichton
63a3bbbf5a Change VMMemoryDefinition::current_length to usize (#3134)
* Change VMMemoryDefinition::current_length to `usize`

This commit changes the definition of
`VMMemoryDefinition::current_length` to `usize` from its previous
definition of `u32`. This is a pretty impactful change because it also
changes the cranelift semantics of "dynamic" heaps where the bound
global value specifier must now match the pointer type for the platform
rather than the index type for the heap.

The motivation for this change is that the `current_length` field (or
bound for the heap) is intended to reflect the current size of the heap.
This is bound by `usize` on the host platform rather than `u32` or`
u64`. The previous choice of `u32` couldn't represent a 4GB memory
because we couldn't put a number representing 4GB into the
`current_length` field. By using `usize`, which reflects the host's
memory allocation, this should better reflect the size of the heap and
allows Wasmtime to support a full 4GB heap for a wasm program (instead
of 4GB minus one page).

This commit also updates the legalization of the `heap_addr` clif
instruction to appropriately cast the address to the platform's pointer
type, handling bounds checks along the way. The practical impact for
today's targets is that a `uextend` is happening sooner than it happened
before, but otherwise there is no intended impact of this change. In the
future when 64-bit memories are supported there will likely need to be
fancier logic which handles offsets a bit differently (especially in the
case of a 64-bit memory on a 32-bit host).

The clif `filetest` changes should show the differences in codegen, and
the Wasmtime changes are largely removing casts here and there.

Closes #3022

* Add tests for memory.size at maximum memory size

* Add a dfg helper method
2021-08-02 13:09:40 -05:00
Nick Fitzgerald
3d76cbdf34 Update gimli to 0.25; addr2line to 0.16 2021-07-26 11:04:53 -07:00
Alex Crichton
73fd702bb7 Don't assume all custom sections are dwarf info (#3083)
This incorrectly assumed that we had unparsed dwarf information,
regardless of custom section name. This commit updates the logic to
calculate that by first checking the section name before we set the flag
indicating that there's unparsed debuginfo.
2021-07-13 15:53:17 -05:00
Alex Crichton
992d85ae8b Add a type parameter to VMOffsets for pointer size (#3020)
* Add a type parameter to `VMOffsets` for pointer size

This commit adds a type parameter to `VMOffsets` representing the
pointer size to improve computations in `wasmtime-runtime` which always
use a constant value of the host's pointer size. The type parameter is
`u8` for `wasmtime-cranelift`'s use case where cross-compilation may be
involved.

* fix lightbeam
2021-07-13 09:52:27 -05:00
Alex Crichton
aa5d837428 Start a high-level architecture document for Wasmtime (#3019)
* Start a high-level architecture document for Wasmtime

This commit cleands up some existing documentation by removing a number
of "noop README files" and starting a high-level overview of the
architecture of Wasmtime. I've placed this documentation under the
contributing section of the book since it seems most useful for possible
contributors.

I've surely left some things out in this pass, and am happy to add more!

* Review comments

* More rewording

* typos
2021-07-02 09:02:26 -05:00
Alex Crichton
a273add815 Simplify the list of builtin intrinsics Wasmtime needs
This commit slims down the list of builtin intrinsics. It removes the
duplicated intrinsics for imported and locally defined items, instead
always using one intrinsic for both. This was previously inconsistently
applied where some intrinsics got two copies (one for imported one for
local) and other intrinsics got only one copy. This does add an extra
branch in intrinsics since they need to determine whether something is
local or not, but that's generally much lower cost than the intrinsics
themselves.

This also removes the `memory32_size` intrinsic, instead inlining the
codegen directly into the clif IR. This matches what the `table.size`
instruction does and removes the need for a few functions on a
`wasmtime_runtime::Instance`.
2021-06-23 10:30:31 -07:00
Alex Crichton
7ce46043dc Add guard pages to the front of linear memories (#2977)
* Add guard pages to the front of linear memories

This commit implements a safety feature for Wasmtime to place guard
pages before the allocation of all linear memories. Guard pages placed
after linear memories are typically present for performance (at least)
because it can help elide bounds checks. Guard pages before a linear
memory, however, are never strictly needed for performance or features.
The intention of a preceding guard page is to help insulate against bugs
in Cranelift or other code generators, such as CVE-2021-32629.

This commit adds a `Config::guard_before_linear_memory` configuration
option, defaulting to `true`, which indicates whether guard pages should
be present both before linear memories as well as afterwards. Guard
regions continue to be controlled by
`{static,dynamic}_memory_guard_size` methods.

The implementation here affects both on-demand allocated memories as
well as the pooling allocator for memories. For on-demand memories this
adjusts the size of the allocation as well as adjusts the calculations
for the base pointer of the wasm memory. For the pooling allocator this
will place a singular extra guard region at the very start of the
allocation for memories. Since linear memories in the pooling allocator
are contiguous every memory already had a preceding guard region in
memory, it was just the previous memory's guard region afterwards. Only
the first memory needed this extra guard.

I've attempted to write some tests to help test all this, but this is
all somewhat tricky to test because the settings are pretty far away
from the actual behavior. I think, though, that the tests added here
should help cover various use cases and help us have confidence in
tweaking the various `Config` settings beyond their defaults.

Note that this also contains a semantic change where
`InstanceLimits::memory_reservation_size` has been removed. Instead this
field is now inferred from the `static_memory_maximum_size` and guard
size settings. This should hopefully remove some duplication in these
settings, canonicalizing on the guard-size/static-size settings as the
way to control memory sizes and virtual reservations.

* Update config docs

* Fix a typo

* Fix benchmark

* Fix wasmtime-runtime tests

* Fix some more tests

* Try to fix uffd failing test

* Review items

* Tweak 32-bit defaults

Makes the pooling allocator a bit more reasonable by default on 32-bit
with these settings.
2021-06-18 09:57:08 -05:00
Alex Crichton
5140fd251a Update wasm-tools crates (#2989)
* Update wasm-tools crates

This brings in recent updates, notably including more improvements to
wasm-smith which will hopefully help exercise non-trapping wasm more.

* Fix some wat
2021-06-15 22:56:10 -05:00
Alex Crichton
e8b8947956 Bump to 0.28.0 (#2972) 2021-06-09 14:00:13 -05:00
Alex Crichton
7a1b7cdf92 Implement RFC 11: Redesigning Wasmtime's APIs (#2897)
Implement Wasmtime's new API as designed by RFC 11. This is quite a large commit which has had lots of discussion externally, so for more information it's best to read the RFC thread and the PR thread.
2021-06-03 09:10:53 -05:00
Chris Fallin
88455007b2 Bump Wasmtime to v0.27.0 and Cranelift to v0.74.0. 2021-05-20 14:06:41 -07:00
Olivier Lemasle
b5f29bd3b2 Update wasm-tools crates (#2908)
wasmparser 0.78 adds the Unknown name subsection type.
2021-05-17 10:08:17 -05:00
Benjamin Bouvier
d7053ea9c7 Upgrade to the latest versions of gimli, addr2line, object (#2901)
* Upgrade to the latest versions of gimli, addr2line, object

And adapt to API changes. New gimli supports wasm dwarf, resulting in
some simplifications in the debug crate.

* upgrade gimli usage in linux-specific profiling too

* Add "continue" statement after interpreting a wasm local dwarf opcode
2021-05-12 10:53:17 -05:00
bjorn3
84c79982e7 Remove unnecessary dependencies
Found using cargo-udeps
2021-05-04 13:51:26 +02:00
Chris Fallin
b89c959e4a Merge pull request #2854 from uweigand/debug-endian
debug: Support big-endian architectures
2021-04-27 10:27:22 -07:00
Ulrich Weigand
801358333d debug: Support big-endian architectures
This fixes some hard-coded assumptions in the debug crate that
the native ELF files being accessed are little-endian; specifically
in create_gdbjit_image as well as in emit_dwarf.

In addition, data in WebAssembly memory always uses little-endian
byte order.  Therefore, if the native architecture is big-endian,
all references to base types need to be marked as little-endian
using the DW_AT_endianity attribute, so that the debugger will
be able to correctly access them.
2021-04-21 14:14:59 +02:00
Alex Crichton
196bcec6cf Process declared element segments for "possibly exported funcs" (#2851)
Now that we're using "possibly exported" as an impactful decision for
codegen (which trampolines to generate and which ABI a function has)
it's important that we calculate this property of a wasm function
correctly! Previously Wasmtime forgot to processed "declared" elements
in apart from active/passive element segments, but this updates Wasmtime
to ensure that these entries are processed and all the functions
contained within are flagged as "possibly exported".

Closes #2850
2021-04-20 16:52:51 -05:00
Alex Crichton
193551a8d6 Optimize table.init instruction and instantiation (#2847)
* Optimize `table.init` instruction and instantiation

This commit optimizes table initialization as part of instance
instantiation and also applies the same optimization to the `table.init`
instruction. One part of this commit is to remove some preexisting
duplication between instance instantiation and the `table.init`
instruction itself, after this the actual implementation of `table.init`
is optimized to effectively have fewer bounds checks in fewer places and
have a much tighter loop for instantiation.

A big fallout from this change is that memory/table initializer offsets
are now stored as `u32` instead of `usize` to remove a few casts in a
few places. This ended up requiring moving some overflow checks that
happened in parsing to later in code itself because otherwise the wrong
spec test errors are emitted during testing. I've tried to trace where
these can possibly overflow but I think that I managed to get
everything.

In a local synthetic test where an empty module with a single 80,000
element initializer this improves total instantiation time by 4x (562us
=> 141us)

* Review comments
2021-04-19 18:44:48 -05:00
Peter Huene
b775b68cfb Make module information lookup from runtime safe.
This commit uses a two-phase lookup of stack map information from modules
rather than giving back raw pointers to stack maps.

First the runtime looks up information about a module from a pc value, which
returns an `Arc` it keeps a reference on while completing the stack map lookup.

Second it then queries the module information for the stack map from a pc
value, getting a reference to the stack map (which is now safe because of the
`Arc` held by the runtime).
2021-04-16 12:30:10 -07:00
Peter Huene
ea72c621f0 Remove the stack map registry.
This commit removes the stack map registry and instead uses the existing
information from the store's module registry to lookup stack maps.

A trait is now used to pass the lookup context to the runtime, implemented by
`Store` to do the lookup.

With this change, module registration in `Store` is now entirely limited to
inserting the module into the module registry.
2021-04-16 11:08:21 -07:00
Alex Crichton
c91e14d83f Precompute fields in VMOffsets
This commit updates the implementation of `VMOffsets` to frontload all
checked arithmetic on construction of the `VMOffsets` which allows
eliding all checked arithmetic when accessing the fields of `VMOffsets`.
For testing and such this adds a new constructor as well from a new
`VMOffsetsFields` structure which is a clone of the old definition.

This should help speed up some profile hot spots I've been seeing where
with all the checked arithmetic on field sizes this was slowing down the
various accessors during instantiation (which uses `VMOffsets` to
initialize various fields of the `VMContext`).
2021-04-08 12:46:17 -07:00
Alex Crichton
c77ea0c5c7 Add some more #[inline] annotations for trivial functions (#2817)
Looking at some profiles these or their related functions were all
showing up, so this commit adds `#[inline]` to allow cross-crate
inlining by default.
2021-04-08 12:23:54 -05:00
Nick Fitzgerald
ed31f28158 wasmtime-environ: Mark all VM offset functions as #[inline]
Otherwise they won't get inlined across crates unless we enable LTO, and much of
the usage of these function is across crates (eg from the `wasmtime-runtime`
crate).
2021-04-07 16:17:26 -07:00
Alex Crichton
195bf0e29a Fully support multiple returns in Wasmtime (#2806)
* Fully support multiple returns in Wasmtime

For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes #1178

* Tweak some names and add docs

* "fix" lightbeam compile

* Fix TODO with dummy environ

* Unwind info is a property of the target, not the ABI

* Remove lightbeam unused imports

* Attempt to fix arm64

* Document new ABIs aren't stable

* Fix filetests to use the right target

* Don't always do 64-bit stores with cranelift

This was overwriting upper bits when 32-bit registers were being stored
into return values, so fix the code inline to do a sized store instead
of one-size-fits-all store.

* At least get tests passing on the old backend

* Fix a typo

* Add some filetests with mixed abi calls

* Get `multi` example working

* Fix doctests on old x86 backend

* Add a mixture of wasmtime/system_v tests
2021-04-07 12:34:26 -05:00
Chris Fallin
6bec13da04 Bump versions: Wasmtime to 0.26.0, Cranelift to 0.73.0. 2021-04-05 10:48:42 -07:00
Peter Huene
0ddfe97a09 Change how flags are stored in serialized modules.
This commit changes how both the shared flags and ISA flags are stored in the
serialized module to detect incompatibilities when a serialized module is
instantiated.

It improves the error reporting when a compiled module has mismatched shared
flags.
2021-04-01 21:39:57 -07:00
Peter Huene
abf3bf29f9 Add a wasmtime settings command to print Cranelift settings.
This commit adds the `wasmtime settings` command to print out available
Cranelift settings for a target (defaults to the host).

The compile command has been updated to remove the Cranelift ISA options in
favor of encouraging users to use `wasmtime settings` to discover what settings
are available.  This will reduce the maintenance cost for syncing the compile
command with Cranelift ISA flags.
2021-04-01 19:38:19 -07:00
Peter Huene
29d366db7b Add a compile command to Wasmtime.
This commit adds a `compile` command to the Wasmtime CLI.

The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules.

With the `all-arch` feature enabled, AOT compilation can be performed for
non-native architectures (i.e. cross-compilation).

The `Module::compile` method has been added to perform AOT compilation.

A few of the CLI flags relating to "on by default" Wasm features have been
changed to be "--disable-XYZ" flags.

A simple example of using the `wasmtime compile` command:

```text
$ wasmtime compile input.wasm
$ wasmtime input.cwasm
```
2021-04-01 19:38:18 -07:00
Alex Crichton
211731b876 Update wasm-tools crates (#2773)
Brings in some fuzzing-related bug-fixes
2021-03-25 18:44:31 -05:00
Alex Crichton
654156714c Deduplicate function signatures in wasm modules (#2772)
Currently wasmtime will generate a `SignatureIndex`-per-type in the
module itself, even if the module itself declares the same type multiple
times. To make matters worse if the same type is declared across
multiple modules used in a module-linking-using-module then the
signature will be recorded each time it's declared.

This commit adds a simple map to module translation to deduplicate these
function types. This should improve the performance of module-linking
graphs where the same function type may be declared in a number of
modules. For modules that don't use module linking this adds an extra
map that's not used too often, but the time spent managing it should be
dwarfed by other compile tasks.
2021-03-25 18:44:22 -05:00