Commit Graph

45 Commits

Author SHA1 Message Date
Alex Crichton
2af358dd9c Add a VMComponentContext type and create it on instantiation (#4215)
* Add a `VMComponentContext` type and create it on instantiation

This commit fills out the `wasmtime-runtime` crate's support for
`VMComponentContext` and creates it as part of the instantiation
process. This moves a few maps that were temporarily allocated in an
`InstanceData` into the `VMComponentContext` and additionally reads the
canonical options data from there instead.

This type still won't be used in its "full glory" until the lowering of
host functions is completely implemented, however, which will be coming
in a future commit.

* Remove `DerefMut` implementation

* Rebase conflicts
2022-06-03 13:34:50 -05:00
Alex Crichton
3ed6fae7b3 Add trampoline compilation support for lowered imports (#4206)
* Add trampoline compilation support for lowered imports

This commit adds support to the component model implementation for
compiling trampolines suitable for calling host imports. Currently this
is purely just the compilation side of things, modifying the
wasmtime-cranelift crate and additionally filling out a new
`VMComponentOffsets` type (similar to `VMOffsets`). The actual creation
of a `VMComponentContext` is still not performed and will be a
subsequent PR.

Internally though some tests are actually possible with this where we at
least assert that compilation of a component and creation of everything
in-memory doesn't panic or trip any assertions, so some tests are added
here for that as well.

* Fix some test errors
2022-06-03 10:01:42 -05:00
Alex Crichton
2a4851ad2b Change some VMContext pointers to () pointers (#4190)
* Change some `VMContext` pointers to `()` pointers

This commit is motivated by my work on the component model
implementation for imported functions. Currently all context pointers in
wasm are `*mut VMContext` but with the component model my plan is to
make some pointers instead along the lines of `*mut VMComponentContext`.
In doing this though one worry I have is breaking what has otherwise
been a core invariant of Wasmtime for quite some time, subtly
introducing bugs by accident.

To help assuage my worry I've opted here to erase knowledge of
`*mut VMContext` where possible. Instead where applicable a context
pointer is simply known as `*mut ()` and the embedder doesn't actually
know anything about this context beyond the value of the pointer. This
will help prevent Wasmtime from accidentally ever trying to interpret
this context pointer as an actual `VMContext` when it might instead be a
`VMComponentContext`.

Overall this was a pretty smooth transition. The main change here is
that the `VMTrampoline` (now sporting more docs) has its first argument
changed to `*mut ()`. The second argument, the caller context, is still
configured as `*mut VMContext` though because all functions are always
called from wasm still. Eventually for component-to-component calls I
think we'll probably "fake" the second argument as the same as the first
argument, losing track of the original caller, as an intentional way of
isolating components from each other.

Along the way there are a few host locations which do actually assume
that the first argument is indeed a `VMContext`. These are valid
assumptions that are upheld from a correct implementation, but I opted
to add a "magic" field to `VMContext` to assert this in debug mode. This
new "magic" field is inintialized during normal vmcontext initialization
and it's checked whenever a `VMContext` is reinterpreted as an
`Instance` (but only in debug mode). My hope here is to catch any future
accidental mistakes, if ever.

* Use a VMOpaqueContext wrapper

* Fix typos
2022-06-01 11:00:43 -05:00
Alex Crichton
a02a609528 Make ValRaw fields private (#4186)
* Make `ValRaw` fields private

Force accessing to go through constructors and accessors to localize the
knowledge about little-endian-ness. This is spawned since I made a
mistake in #4039 about endianness.

* Fix some tests

* Component model changes
2022-05-24 19:14:29 -05:00
Alex Crichton
51d82aebfd Store the ValRaw type in little-endian format (#4035)
* Store the `ValRaw` type in little-endian format

This commit changes the internal representation of the `ValRaw` type to
an unconditionally little-endian format instead of its current
native-endian format. The documentation and various accessors here have
been updated as well as the associated trampolines that read `ValRaw`
to always work with little-endian values, converting to the host
endianness as necessary.

The motivation for this change originally comes from the implementation
of the component model that I'm working on. One aspect of the component
model's canonical ABI is how variants are passed to functions as
immediate arguments. For example for a component model function:

```
foo: function(x: expected<i32, f64>)
```

This translates to a core wasm function:

```wasm
(module
  (func (export "foo") (param i32 i64)
    ;; ...
  )
)
```

The first `i32` parameter to the core wasm function is the discriminant
of whether the result is an "ok" or an "err". The second `i64`, however,
is the "join" operation on the `i32` and `f64` payloads. Essentially
these two types are unioned into one type to get passed into the function.

Currently in the implementation of the component model my plan is to
construct a `*mut [ValRaw]` to pass through to WebAssembly, always
invoking component exports through host trampolines. This means that the
implementation for `Result<T, E>` needs to do the correct "join"
operation here when encoding a particular case into the corresponding
`ValRaw`.

I personally found this particularly tricky to do structurally. The
solution that I settled on with fitzgen was that if `ValRaw` was always
stored in a little endian format then we could employ a trick where when
encoding a variant we first set all the `ValRaw` slots to zero, then the
associated case we have is encoding. Afterwards the `ValRaw` values are
already encoded into the correct format as if they'd been "join"ed.

For example if we were to encode `Ok(1i32)` then this would produce
`ValRaw { i32: 1 }`, which memory-wise is equivalent to `ValRaw { i64: 1 }`
if the other bytes in the `ValRaw` are guaranteed to be zero. Similarly
storing `ValRaw { f64 }` is equivalent to the storage required for
`ValRaw { i64 }` here in the join operation.

Note, though, that this equivalence relies on everything being
little-endian. Otherwise the in-memory representations of `ValRaw { i32: 1 }`
and `ValRaw { i64: 1 }` are different.

That motivation is what leads to this change. It's expected that this is
a low-to-zero cost change in the sense that little-endian platforms will
see no change and big-endian platforms are already required to
efficiently byte-swap loads/stores as WebAssembly requires that.
Additionally the `ValRaw` type is an esoteric niche use case primarily
used for accelerating the C API right now, so it's expected that not
many users will have to update for this change.

* Track down some more endianness conversions
2022-04-14 13:09:32 -05:00
Alex Crichton
453feb6f82 Remove some dead code (#3970)
This commit removes methods that are never used between crates or trait
impls like `Clone` which may have been used one day but are no longer used.
2022-03-30 13:51:34 -05:00
Alex Crichton
c22033bf93 Delete historical interruptable support in Wasmtime (#3925)
* Delete historical interruptable support in Wasmtime

This commit removes the `Config::interruptable` configuration along with
the `InterruptHandle` type from the `wasmtime` crate. The original
support for adding interruption to WebAssembly was added pretty early on
in the history of Wasmtime when there was no other method to prevent an
infinite loop from the host. Nowadays, however, there are alternative
methods for interruption such as fuel or epoch-based interruption.

One of the major downsides of `Config::interruptable` is that even when
it's not enabled it forces an atomic swap to happen when entering
WebAssembly code. This technically could be a non-atomic swap if the
configuration option isn't enabled but that produces even more branch-y
code on entry into WebAssembly which is already something we try to
optimize. Calling into WebAssembly is on the order of a dozens of
nanoseconds at this time and an atomic swap, even uncontended, can add
up to 5ns on some platforms.

The main goal of this PR is to remove this atomic swap on entry into
WebAssembly. This is done by removing the `Config::interruptable` field
entirely, moving all existing consumers to epochs instead which are
suitable for the same purposes. This means that the stack overflow check
is no longer entangled with the interruption check and perhaps one day
we could continue to optimize that further as well.

Some consequences of this change are:

* Epochs are now the only method of remote-thread interruption.
* There are no more Wasmtime traps that produces the `Interrupted` trap
  code, although we may wish to move future traps to this so I left it
  in place.
* The C API support for interrupt handles was also removed and bindings
  for epoch methods were added.
* Function-entry checks for interruption are a tiny bit less efficient
  since one check is performed for the stack limit and a second is
  performed for the epoch as opposed to the `Config::interruptable`
  style of bundling the stack limit and the interrupt check in one. It's
  expected though that this is likely to not really be measurable.
* The old `VMInterrupts` structure is renamed to `VMRuntimeLimits`.
2022-03-14 15:25:11 -05:00
Alex Crichton
a25f7bdba5 Don't copy VMBuiltinFunctionsArray into each VMContext (#3741)
* Don't copy `VMBuiltinFunctionsArray` into each `VMContext`

This is another PR along the lines of "let's squeeze all possible
performance we can out of instantiation". Before this PR we would copy,
by value, the contents of `VMBuiltinFunctionsArray` into each
`VMContext` allocated. This array of function pointers is modestly-sized
but growing over time as we add various intrinsics. Additionally it's
the exact same for all `VMContext` allocations.

This PR attempts to speed up instantiation slightly by instead storing
an indirection to the function array. This means that calling a builtin
intrinsic is a tad bit slower since it requires two loads instead of one
(one to get the base pointer, another to get the actual address).
Otherwise though `VMContext` initialization is now simply setting one
pointer instead of doing a `memcpy` from one location to another.

With some macro-magic this commit also replaces the previous
implementation with one that's more `const`-friendly which also gets us
compile-time type-checks of libcalls as well as compile-time
verification that all libcalls are defined.

Overall, as with #3739, the win is very modest here. Locally I measured
a speedup from 1.9us to 1.7us taken to instantiate an empty module with
one function. While small at these scales it's still a 10% improvement!

* Review comments
2022-01-28 16:24:34 -06:00
Dan Gohman
881c19473d Use ptr::cast instead of as casts in several places. (#3507)
`ptr::cast` has the advantage of being unable to silently cast
`*const T` to `*mut T`. This turned up several places that were
performing such casts, which this PR also fixes.
2022-01-21 13:03:17 -08:00
Chris Fallin
8a55b5c563 Add epoch-based interruption for cooperative async timeslicing.
This PR introduces a new way of performing cooperative timeslicing that
is intended to replace the "fuel" mechanism. The tradeoff is that this
mechanism interrupts with less precision: not at deterministic points
where fuel runs out, but rather when the Engine enters a new epoch. The
generated code instrumentation is substantially faster, however, because
it does not need to do as much work as when tracking fuel; it only loads
the global "epoch counter" and does a compare-and-branch at backedges
and function prologues.

This change has been measured as ~twice as fast as fuel-based
timeslicing for some workloads, especially control-flow-intensive
workloads such as the SpiderMonkey JS interpreter on Wasm/WASI.

The intended interface is that the embedder of the `Engine` performs an
`engine.increment_epoch()` call periodically, e.g. once per millisecond.
An async invocation of a Wasm guest on a `Store` can specify a number of
epoch-ticks that are allowed before an async yield back to the
executor's event loop. (The initial amount and automatic "refills" are
configured on the `Store`, just as for fuel.) This call does only
signal-safe work (it increments an `AtomicU64`) so could be invoked from
a periodic signal, or from a thread that wakes up once per period.
2022-01-20 13:58:17 -08:00
Alex Crichton
bfdbd10a13 Add *_unchecked variants of Func APIs for the C API (#3350)
* Add `*_unchecked` variants of `Func` APIs for the C API

This commit is what is hopefully going to be my last installment within
the saga of optimizing function calls in/out of WebAssembly modules in
the C API. This is yet another alternative approach to #3345 (sorry) but
also contains everything necessary to make the C API fast. As in #3345
the general idea is just moving checks out of the call path in the same
style of `TypedFunc`.

This new strategy takes inspiration from previously learned attempts
effectively "just" exposes how we previously passed `*mut u128` through
trampolines for arguments/results. This storage format is formalized
through a new `ValRaw` union that is exposed from the `wasmtime` crate.
By doing this it made it relatively easy to expose two new APIs:

* `Func::new_unchecked`
* `Func::call_unchecked`

These are the same as their checked equivalents except that they're
`unsafe` and they work with `*mut ValRaw` rather than safe slices of
`Val`. Working with these eschews type checks and such and requires
callers/embedders to do the right thing.

These two new functions are then exposed via the C API with new
functions, enabling C to have a fast-path of calling/defining functions.
This fast path is akin to `Func::wrap` in Rust, although that API can't
be built in C due to C not having generics in the same way that Rust
has.

For some benchmarks, the benchmarks here are:

* `nop` - Call a wasm function from the host that does nothing and
  returns nothing.
* `i64` - Call a wasm function from the host, the wasm function calls a
  host function, and the host function returns an `i64` all the way out to
  the original caller.
* `many` - Call a wasm function from the host, the wasm calls
   host function with 5 `i32` parameters, and then an `i64` result is
   returned back to the original host
* `i64` host - just the overhead of the wasm calling the host, so the
  wasm calls the host function in a loop.
* `many` host - same as `i64` host, but calling the `many` host function.

All numbers in this table are in nanoseconds, and this is just one
measurement as well so there's bound to be some variation in the precise
numbers here.

| Name      | Rust | C (before) | C (after) |
|-----------|------|------------|-----------|
| nop       | 19   | 112        | 25        |
| i64       | 22   | 207        | 32        |
| many      | 27   | 189        | 34        |
| i64 host  | 2    | 38         | 5         |
| many host | 7    | 75         | 8         |

The main conclusion here is that the C API is significantly faster than
before when using the `*_unchecked` variants of APIs. The Rust
implementation is still the ceiling (or floor I guess?) for performance
The main reason that C is slower than Rust is that a little bit more has
to travel through memory where on the Rust side of things we can
monomorphize and inline a bit more to get rid of that. Overall though
the costs are way way down from where they were originally and I don't
plan on doing a whole lot more myself at this time. There's various
things we theoretically could do I've considered but implementation-wise
I think they'll be much more weighty.

* Tweak `wasmtime_externref_t` API comments
2021-09-24 14:05:45 -05:00
Alex Crichton
63a3bbbf5a Change VMMemoryDefinition::current_length to usize (#3134)
* Change VMMemoryDefinition::current_length to `usize`

This commit changes the definition of
`VMMemoryDefinition::current_length` to `usize` from its previous
definition of `u32`. This is a pretty impactful change because it also
changes the cranelift semantics of "dynamic" heaps where the bound
global value specifier must now match the pointer type for the platform
rather than the index type for the heap.

The motivation for this change is that the `current_length` field (or
bound for the heap) is intended to reflect the current size of the heap.
This is bound by `usize` on the host platform rather than `u32` or`
u64`. The previous choice of `u32` couldn't represent a 4GB memory
because we couldn't put a number representing 4GB into the
`current_length` field. By using `usize`, which reflects the host's
memory allocation, this should better reflect the size of the heap and
allows Wasmtime to support a full 4GB heap for a wasm program (instead
of 4GB minus one page).

This commit also updates the legalization of the `heap_addr` clif
instruction to appropriately cast the address to the platform's pointer
type, handling bounds checks along the way. The practical impact for
today's targets is that a `uextend` is happening sooner than it happened
before, but otherwise there is no intended impact of this change. In the
future when 64-bit memories are supported there will likely need to be
fancier logic which handles offsets a bit differently (especially in the
case of a 64-bit memory on a 32-bit host).

The clif `filetest` changes should show the differences in codegen, and
the Wasmtime changes are largely removing casts here and there.

Closes #3022

* Add tests for memory.size at maximum memory size

* Add a dfg helper method
2021-08-02 13:09:40 -05:00
Dan Gohman
784a380e5f Add comments about vmctx pointers in various datastructures. (#2925)
This forward-ports the relevant parts of #1396.
2021-07-27 09:33:27 -05:00
Alex Crichton
a273add815 Simplify the list of builtin intrinsics Wasmtime needs
This commit slims down the list of builtin intrinsics. It removes the
duplicated intrinsics for imported and locally defined items, instead
always using one intrinsic for both. This was previously inconsistently
applied where some intrinsics got two copies (one for imported one for
local) and other intrinsics got only one copy. This does add an extra
branch in intrinsics since they need to determine whether something is
local or not, but that's generally much lower cost than the intrinsics
themselves.

This also removes the `memory32_size` intrinsic, instead inlining the
codegen directly into the clif IR. This matches what the `table.size`
instruction does and removes the need for a few functions on a
`wasmtime_runtime::Instance`.
2021-06-23 10:30:31 -07:00
Ulrich Weigand
83007b79e3 Fix access to VMMemoryDefinition::current_length on big-endian (#3013)
The current_length member is defined as "usize" in Rust code,
but generated wasm code refers to it as if it were "u32".
While this happens to mostly work on little-endian machines
(as long as the length is < 4GB), it will always fail on
big-endian machines.

Fixed by making current_length "u32" in Rust as well, and
ensuring the actual memory size is always less than 4GB.
2021-06-23 11:45:32 -05:00
Alex Crichton
7a1b7cdf92 Implement RFC 11: Redesigning Wasmtime's APIs (#2897)
Implement Wasmtime's new API as designed by RFC 11. This is quite a large commit which has had lots of discussion externally, so for more information it's best to read the RFC thread and the PR thread.
2021-06-03 09:10:53 -05:00
Alex Crichton
c77ea0c5c7 Add some more #[inline] annotations for trivial functions (#2817)
Looking at some profiles these or their related functions were all
showing up, so this commit adds `#[inline]` to allow cross-crate
inlining by default.
2021-04-08 12:23:54 -05:00
Peter Huene
b58afbf849 Refactor module instantiation in the runtime.
This commit refactors module instantiation in the runtime to allow for
different instance allocation strategy implementations.

It adds an `InstanceAllocator` trait with the current implementation put behind
the `OnDemandInstanceAllocator` struct.

The Wasmtime API has been updated to allow a `Config` to have an instance
allocation strategy set which will determine how instances get allocated.

This change is in preparation for an alternative *pooling* instance allocator
that can reserve all needed host process address space in advance.

This commit also makes changes to the `wasmtime_environ` crate to represent
compiled modules in a way that reduces copying at instantiation time.
2021-03-04 18:18:50 -08:00
Alex Crichton
0e41861662 Implement limiting WebAssembly execution with fuel (#2611)
* Consume fuel during function execution

This commit adds codegen infrastructure necessary to instrument wasm
code to consume fuel as it executes. Currently nothing is really done
with the fuel, but that'll come in later commits.

The focus of this commit is to implement the codegen infrastructure
necessary to consume fuel and account for fuel consumed correctly.

* Periodically check remaining fuel in wasm JIT code

This commit enables wasm code to periodically check to see if fuel has
run out. When fuel runs out an intrinsic is called which can do what it
needs to do in the result of fuel running out. For now a trap is thrown
to have at least some semantics in synchronous stores, but another
planned use for this feature is for asynchronous stores to periodically
yield back to the host based on fuel running out.

Checks for remaining fuel happen in the same locations as interrupt
checks, which is to say the start of the function as well as loop
headers.

* Improve codegen by caching `*const VMInterrupts`

The location of the shared interrupt value and fuel value is through a
double-indirection on the vmctx (load through the vmctx and then load
through that pointer). The second pointer in this chain, however, never
changes, so we can alter codegen to account for this and remove some
extraneous load instructions and hopefully reduce some register
pressure even maybe.

* Add tests fuel can abort infinite loops

* More fuzzing with fuel

Use fuel to time out modules in addition to time, using fuzz input to
figure out which.

* Update docs on trapping instructions

* Fix doc links

* Fix a fuzz test

* Change setting fuel to adding fuel

* Fix a doc link

* Squelch some rustdoc warnings
2021-01-29 08:57:17 -06:00
Yury Delendik
3580205f12 [Cranelift][Atomics] Add address folding for atomic notify/wait. (#2556)
* fold address in wasm wait and notify ops

* add atomics addr folding tests
2021-01-08 11:55:21 -06:00
Alex Crichton
3887881800 Refactor how signatures/trampolines are stored in Store
This commit refactors where trampolines and signature information is
stored within a `Store`, namely moving them from
`wasmtime_runtime::Instance` instead to `Store` itself. The goal here is
to remove an allocation inside of an `Instance` and make them a bit
cheaper to create. Additionally this should open up future possibilities
like not creating duplicate trampolines for signatures already in the
`Store` when using `Func::new`.
2020-11-02 07:54:18 -08:00
Alex Crichton
e659d5cecd Add initial support for the multi-memory proposal (#2263)
This commit adds initial (gated) support for the multi-memory wasm
proposal. This was actually quite easy since almost all of wasmtime
already expected multi-memory to be implemented one day. The only real
substantive change is the `memory.copy` intrinsic changes, which now
accounts for the source/destination memories possibly being different.
2020-10-13 19:13:52 -05:00
Alex Crichton
3d2e0e55f2 Remove the local field of Module (#2091)
This was added long ago at this point to assist with caching, but
caching has moved to a different level such that this wonky second level
of a `Module` isn't necessary. This commit removes the `ModuleLocal`
type to simplify accessors and generally make it easier to work with.
2020-08-04 12:29:16 -05:00
Nick Fitzgerald
3555f97906 wasmtime: Implement table.fill
Part of #929
2020-07-02 16:59:07 -07:00
Nick Fitzgerald
bffd54c016 wasmtime: Implement global.{get,set} for externref globals (#1969)
* wasmtime: Implement `global.{get,set}` for externref globals

We use libcalls to implement these -- unlike `table.{get,set}`, for which we
create inline JIT fast paths -- because no known toolchain actually uses
externref globals.

Part of #929

* wasmtime: Enable `{extern,func}ref` globals in the API
2020-07-02 16:04:01 -05:00
Nick Fitzgerald
8c5f59c0cf wasmtime: Implement table.get and table.set
These instructions have fast, inline JIT paths for the common cases, and only
call out to host VM functions for the slow paths. This required some changes to
`cranelift-wasm`'s `FuncEnvironment`: instead of taking a `FuncCursor` to insert
an instruction sequence within the current basic block,
`FuncEnvironment::translate_table_{get,set}` now take a `&mut FunctionBuilder`
so that they can create whole new basic blocks. This is necessary for
implementing GC read/write barriers that involve branching (e.g. checking for
null, or whether a store buffer is at capacity).

Furthermore, it required that the `load`, `load_complex`, and `store`
instructions handle loading and storing through an `r{32,64}` rather than just
`i{32,64}` addresses. This involved making `r{32,64}` types acceptable
instantiations of the `iAddr` type variable, plus a few new instruction
encodings.

Part of #929
2020-06-30 12:00:57 -07:00
Nick Fitzgerald
58bb5dd953 wasmtime: Add support for func.ref and table.grow with funcrefs
`funcref`s are implemented as `NonNull<VMCallerCheckedAnyfunc>`.

This should be more efficient than using a `VMExternRef` that points at a
`VMCallerCheckedAnyfunc` because it gets rid of an indirection, dynamic
allocation, and some reference counting.

Note that the null function reference is *NOT* a null pointer; it is a
`VMCallerCheckedAnyfunc` that has a null `func_ptr` member.

Part of #929
2020-06-24 10:08:13 -07:00
Nick Fitzgerald
bbd99c5bfa reference types: Implement the table.size and table.grow instructions (#1894)
Part of #929
2020-06-18 08:57:18 -05:00
Alex Crichton
c9a0ba81a0 Implement interrupting wasm code, reimplement stack overflow (#1490)
* Implement interrupting wasm code, reimplement stack overflow

This commit is a relatively large change for wasmtime with two main
goals:

* Primarily this enables interrupting executing wasm code with a trap,
  preventing infinite loops in wasm code. Note that resumption of the
  wasm code is not a goal of this commit.

* Additionally this commit reimplements how we handle stack overflow to
  ensure that host functions always have a reasonable amount of stack to
  run on. This fixes an issue where we might longjmp out of a host
  function, skipping destructors.

Lots of various odds and ends end up falling out in this commit once the
two goals above were implemented. The strategy for implementing this was
also lifted from Spidermonkey and existing functionality inside of
Cranelift. I've tried to write up thorough documentation of how this all
works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.

A brief summary of how this works is that each function and each loop
header now checks to see if they're interrupted. Interrupts and the
stack overflow check are actually folded into one now, where function
headers check to see if they've run out of stack and the sentinel value
used to indicate an interrupt, checked in loop headers, tricks functions
into thinking they're out of stack. An interrupt is basically just
writing a value to a location which is read by JIT code.

When interrupts are delivered and what triggers them has been left up to
embedders of the `wasmtime` crate. The `wasmtime::Store` type has a
method to acquire an `InterruptHandle`, where `InterruptHandle` is a
`Send` and `Sync` type which can travel to other threads (or perhaps
even a signal handler) to get notified from. It's intended that this
provides a good degree of flexibility when interrupting wasm code. Note
though that this does have a large caveat where interrupts don't work
when you're interrupting host code, so if you've got a host import
blocking for a long time an interrupt won't actually be received until
the wasm starts running again.

Some fallout included from this change is:

* Unix signal handlers are no longer registered with `SA_ONSTACK`.
  Instead they run on the native stack the thread was already using.
  This is possible since stack overflow isn't handled by hitting the
  guard page, but rather it's explicitly checked for in wasm now. Native
  stack overflow will continue to abort the process as usual.

* Unix sigaltstack management is now no longer necessary since we don't
  use it any more.

* Windows no longer has any need to reset guard pages since we no longer
  try to recover from faults on guard pages.

* On all targets probestack intrinsics are disabled since we use a
  different mechanism for catching stack overflow.

* The C API has been updated with interrupts handles. An example has
  also been added which shows off how to interrupt a module.

Closes #139
Closes #860
Closes #900

* Update comment about magical interrupt value

* Store stack limit as a global value, not a closure

* Run rustfmt

* Handle review comments

* Add a comment about SA_ONSTACK

* Use `usize` for type of `INTERRUPTED`

* Parse human-readable durations

* Bring back sigaltstack handling

Allows libstd to print out stack overflow on failure still.

* Add parsing and emission of stack limit-via-preamble

* Fix new example for new apis

* Fix host segfault test in release mode

* Fix new doc example
2020-04-21 11:03:28 -07:00
Alex Crichton
7eea5d8d43 Optimize codegen in Func::wrap (#1491)
This commit optimizes the codegen of `Func::wrap` such that if you do
something like `Func::wrap(&store, || {})` then the shim generated
contains zero code (as expected). In general this means that the extra
tidbits generated by wasmtime are all eligible to be entirely optimized
away so long as you don't actually rely on something.
2020-04-10 12:52:06 -05:00
Alex Crichton
3e2be43502 Pre-generate trampoline functions (#957)
* Refactor wasmtime_runtime::Export

Instead of an enumeration with variants that have data fields have an
enumeration where each variant has a struct, and each struct has the
data fields. This allows us to store the structs in the `wasmtime` API
and avoid lots of `panic!` calls and various extraneous matches.

* Pre-generate trampoline functions

The `wasmtime` crate supports calling arbitrary function signatures in
wasm code, and to do this it generates "trampoline functions" which have
a known ABI that then internally convert to a particular signature's ABI
and call it. These trampoline functions are currently generated
on-the-fly and are cached in the global `Store` structure. This,
however, is suboptimal for a few reasons:

* Due to how code memory is managed each trampoline resides in its own
  64kb allocation of memory. This means if you have N trampolines you're
  using N * 64kb of memory, which is quite a lot of overhead!

* Trampolines are never free'd, even if the referencing module goes
  away. This is similar to #925.

* Trampolines are a source of shared state which prevents `Store` from
  being easily thread safe.

This commit refactors how trampolines are managed inside of the
`wasmtime` crate and jit/runtime internals. All trampolines are now
allocated in the same pass of `CodeMemory` that the main module is
allocated into. A trampoline is generated per-signature in a module as
well, instead of per-function. This cache of trampolines is stored
directly inside of an `Instance`. Trampolines are stored based on
`VMSharedSignatureIndex` so they can be looked up from the internals of
the `ExportFunction` value.

The `Func` API has been updated with various bits and pieces to ensure
the right trampolines are registered in the right places. Overall this
should ensure that all trampolines necessary are generated up-front
rather than lazily. This allows us to remove the trampoline cache from
the `Compiler` type, and move one step closer to making `Compiler`
threadsafe for usage across multiple threads.

Note that as one small caveat the `Func::wrap*` family of functions
don't need to generate a trampoline at runtime, they actually generate
the trampoline at compile time which gets passed in.

Also in addition to shuffling a lot of code around this fixes one minor
bug found in `code_memory.rs`, where `self.position` was loaded before
allocation, but the allocation may push a new chunk which would cause
`self.position` to be zero instead.

* Pass the `SignatureRegistry` as an argument to where it's needed.

This avoids the need for storing it in an `Arc`.

* Ignore tramoplines for functions with lots of arguments

Co-authored-by: Dan Gohman <sunfish@mozilla.com>
2020-03-12 16:17:48 -05:00
Nick Fitzgerald
674a6208d8 Implement data.drop and memory.init and get the rest of the bulk memory spec tests passing (#1264)
* Enable the already-passing `bulk-memoryoperations/imports.wast` test

* Implement support for the `memory.init` instruction and passive data

This adds support for passive data segments and the `memory.init` instruction
from the bulk memory operations proposal. Passive data segments are stored on
the Wasm module and then `memory.init` instructions copy their contents into
memory.

* Implement the `data.drop` instruction

This allows wasm modules to deallocate passive data segments that it doesn't
need anymore. We keep track of which segments have not been dropped on an
`Instance` and when dropping them, remove the entry from the instance's hash
map. The module always needs all of the segments for new instantiations.

* Enable final bulk memory operations spec test

This requires special casing an expected error message for an `assert_trap`,
since the expected error message contains the index of an uninitialized table
element, but our trap implementation doesn't save that diagnostic information
and shepherd it out.
2020-03-10 09:30:11 -05:00
Nick Fitzgerald
ef0cabf8b4 Address review feedback 2020-02-26 14:37:28 -08:00
Nick Fitzgerald
44c28612fb Implement the memory.fill instruction from the bulk memory proposal 2020-02-26 14:35:09 -08:00
Nick Fitzgerald
98ecef1700 Implement the memory.copy instruction from the bulk memory proposal 2020-02-26 14:35:09 -08:00
Nick Fitzgerald
cb97e4ec8e Implement table.init and elem.drop from the bulk memory proposal 2020-02-26 14:35:09 -08:00
Nick Fitzgerald
33b4a37bcb Add support for table.copy
This adds support for the `table.copy` instruction from the bulk memory
proposal. It also supports multiple tables, which were introduced by the
reference types proposal.

Part of #928
2020-02-26 14:30:43 -08:00
Alex Crichton
c8ab1e293e Improve robustness of cache loading/storing (#974)
* Improve robustness of cache loading/storing

Today wasmtime incorrectly loads compiled compiled modules from the
global cache when toggling settings such as optimizations. For example
if you execute `wasmtime foo.wasm` that will cache globally an
unoptimized version of the wasm module. If you then execute `wasmtime -O
foo.wasm` it would then reload the unoptimized version from cache, not
realizing the compilation settings were different, and use that instead.
This can lead to very surprising behavior naturally!

This commit updates how the cache is managed in an attempt to make it
much more robust against these sorts of issues. This takes a leaf out of
rustc's playbook and models the cache with a function that looks like:

    fn load<T: Hash>(
        &self,
        data: T,
        compute: fn(T) -> CacheEntry,
    ) -> CacheEntry;

The goal here is that it guarantees that all the `data` necessary to
`compute` the result of the cache entry is hashable and stored into the
hash key entry. This was previously open-coded and manually managed
where items were hashed explicitly, but this construction guarantees
that everything reasonable `compute` could use to compile the module is
stored in `data`, which is itself hashable.

This refactoring then resulted in a few workarounds and a few fixes,
including the original issue:

* The `Module` type was split into `Module` and `ModuleLocal` where only
  the latter is hashed. The previous hash function for a `Module` left
  out items like the `start_func` and didn't hash items like the imports
  of the module. Omitting the `start_func` was fine since compilation
  didn't actually use it, but omitting imports seemed uncomfortable
  because while compilation didn't use the import values it did use the
  *number* of imports, which seems like it should then be put into the
  cache key. The `ModuleLocal` type now derives `Hash` to guarantee that
  all of its contents affect the hash key.

* The `ModuleTranslationState` from `cranelift-wasm` doesn't implement
  `Hash` which means that we have a manual wrapper to work around that.
  This will be fixed with an upstream implementation, since this state
  affects the generated wasm code. Currently this is just a map of
  signatures, which is present in `Module` anyway, so we should be good
  for the time being.

* Hashing `dyn TargetIsa` was also added, where previously it was not
  fully hashed. Previously only the target name was used as part of the
  cache key, but crucially the flags of compilation were omitted (for
  example the optimization flags). Unfortunately the trait object itself
  is not hashable so we still have to manually write a wrapper to hash
  it, but we likely want to add upstream some utilities to hash isa
  objects into cranelift itself. For now though we can continue to add
  hashed fields as necessary.

Overall the goal here was to use the compiler to expose what we're not
hashing, and then make sure we organize data and write the right code to
ensure everything is hashed, and nothing more.

* Update crates/environ/src/module.rs

Co-Authored-By: Peter Huene <peterhuene@protonmail.com>

* Fix lightbeam

* Fix compilation of tests

* Update the expected structure of the cache

* Revert "Update the expected structure of the cache"

This reverts commit 2b53fee426a4e411c313d8c1e424841ba304a9cd.

* Separate the cache dir a bit

* Add a test the cache is busted with opt levels

* rustfmt

Co-authored-by: Peter Huene <peterhuene@protonmail.com>
2020-02-26 16:18:02 -06:00
Alex Crichton
47d6db0be8 Reel in unsafety around InstanceHandle (#856)
* Reel in unsafety around `InstanceHandle`

This commit is an attempt, or at least is targeted at being a start, at
reeling in the unsafety around the `InstanceHandle` type. Currently this
type represents a sort of moral `Rc<Instance>` but is a bit more
specialized since the underlying memory is allocated through mmap.

Additionally, though, `InstanceHandle` exposes a fundamental flaw in its
safety by safetly allowing mutable access so long as you have `&mut
InstanceHandle`. This type, however, is trivially created by simply
cloning a `InstanceHandle` to get an owned reference. This means that
`&mut InstanceHandle` does not actually provide any guarantees about
uniqueness, so there's no more safety than `&InstanceHandle` itself.

This commit removes all `&mut self` APIs from `InstanceHandle`,
additionally removing some where `&self` was `unsafe` and `&mut self`
was safe (since it was trivial to subvert this "safety"). In doing so
interior mutability patterns are now used much more extensively through
structures such as `Table` and `Memory`. Additionally a number of
methods were refactored to be a bit clearer and use helper functions
where possible.

This is a relatively large commit unfortunately, but it snowballed very
quickly into touching quite a few places. My hope though is that this
will prevent developers working on wasmtime internals as well as
developers still yet to migrate to the `wasmtime` crate from falling
into trivial unsafe traps by accidentally using `&mut` when they can't.
All existing users relying on `&mut` will need to migrate to some form
of interior mutability, such as using `RefCell` or `Cell`.

This commit also additionally marks `InstanceHandle::new` as an `unsafe`
function. The rationale for this is that the `&mut`-safety is only the
beginning for the safety of `InstanceHandle`. In general the wasmtime
internals are extremely unsafe and haven't been audited for appropriate
usage of `unsafe`. Until that's done it's hoped that we can warn users
with this `unsafe` constructor and otherwise push users to the
`wasmtime` crate which we know is safe.

* Fix windows build

* Wrap up mutable memory state in one structure

Rather than having separate fields

* Use `Cell::set`, not `Cell::replace`, where possible

* Add a helper function for offsets from VMContext

* Fix a typo from merging

* rustfmt

* Use try_from, not as

* Tweak style of some setters
2020-01-24 14:20:35 -06:00
Dan Gohman
9a88d3d894 Replace the global-exports mechanism with a caller-vmctx mechanism. (#789)
* Replace the global-exports mechanism with a caller-vmctx mechanism.

This eliminates the global exports mechanism, and instead adds a
caller-vmctx argument to wasm functions so that WASI can obtain the
memory and other things from the caller rather than looking them up in a
global registry.

This replaces #390.

* Fixup some merge conflicts

* Rustfmt

* Ensure VMContext is aligned to 16 bytes

With the removal of `global_exports` it "just so happens" that this
isn't happening naturally any more.

* Fixup some bugs with double vmctx in wasmtime crate

* Trampoline stub needed adjusting
* Use pointer type instead of always using I64 for caller vmctx
* Don't store `ir::Signature` in `Func` since we don't know the pointer
  size at creation time.
* Skip the first 2 arguments in IR signatures since that's the two vmctx
  parameters.

* Update cranelift to 0.56.0

* Handle more merge conflicts

* Rustfmt

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2020-01-21 14:50:59 -08:00
XAMPPRocky
907e7aac01 Clippy fixes (#692) 2019-12-24 12:50:07 -08:00
Alex Crichton
39e57e3e9a Migrate back to std:: stylistically (#554)
* Migrate back to `std::` stylistically

This commit moves away from idioms such as `alloc::` and `core::` as
imports of standard data structures and types. Instead it migrates all
crates to uniformly use `std::` for importing standard data structures
and types. This also removes the `std` and `core` features from all
crates to and removes any conditional checking for `feature = "std"`

All of this support was previously added in #407 in an effort to make
wasmtime/cranelift "`no_std` compatible". Unfortunately though this
change comes at a cost:

* The usage of `alloc` and `core` isn't idiomatic. Especially trying to
  dual between types like `HashMap` from `std` as well as from
  `hashbrown` causes imports to be surprising in some cases.
* Unfortunately there was no CI check that crates were `no_std`, so none
  of them actually were. Many crates still imported from `std` or
  depended on crates that used `std`.

It's important to note, however, that **this does not mean that wasmtime
will not run in embedded environments**. The style of the code today and
idioms aren't ready in Rust to support this degree of multiplexing and
makes it somewhat difficult to keep up with the style of `wasmtime`.
Instead it's intended that embedded runtime support will be added as
necessary. Currently only `std` is necessary to build `wasmtime`, and
platforms that natively need to execute `wasmtime` will need to use a
Rust target that supports `std`. Note though that not all of `std` needs
to be supported, but instead much of it could be configured off to
return errors, and `wasmtime` would be configured to gracefully handle
errors.

The goal of this PR is to move `wasmtime` back to idiomatic usage of
features/`std`/imports/etc and help development in the short-term.
Long-term when platform concerns arise (if any) they can be addressed by
moving back to `no_std` crates (but fixing the issues mentioned above)
or ensuring that the target in Rust has `std` available.

* Start filling out platform support doc
2019-11-18 22:04:06 -08:00
Dan Gohman
061b453255 Remove unneeded extern crate, macro_use, and tidy uses. 2019-11-08 17:55:38 -08:00
Dan Gohman
1a0ed6e388 Use the more-asserts crate in more places.
This provides assert_le, assert_lt, and so on, which can print the
values of the operands.
2019-11-08 15:24:53 -08:00
Dan Gohman
22641de629 Initial reorg.
This is largely the same as #305, but updated for the current tree.
2019-11-08 06:35:40 -08:00