`VMExternRef` is a reference-counted box for any kind of data that is
external and opaque to running Wasm. Sometimes it might hold a Wasmtime
thing, other times it might hold something from a Wasmtime embedder and is
opaque even to us. It is morally equivalent to `Rc<dyn Any>` in Rust, but
additionally always fits in a pointer-sized word. `VMExternRef` is
non-nullable, but `Option<VMExternRef>` is a null pointer.
The one part of `VMExternRef` that can't ever be opaque to us is the
reference count. Even when we don't know what's inside an `VMExternRef`, we
need to be able to manipulate its reference count as we add and remove
references to it. And we need to do this from compiled Wasm code, so it must
be `repr(C)`!
`VMExternRef` itself is just a pointer to an `VMExternData`, which holds the
opaque, boxed value, its reference count, and its vtable pointer.
The `VMExternData` struct is *preceded* by the dynamically-sized value boxed
up and referenced by one or more `VMExternRef`s:
```ignore
,-------------------------------------------------------.
| |
V |
+----------------------------+-----------+-----------+ |
| dynamically-sized value... | ref_count | value_ptr |---'
+----------------------------+-----------+-----------+
| VMExternData |
+-----------------------+
^
+-------------+ |
| VMExternRef |-------------------+
+-------------+ |
|
+-------------+ |
| VMExternRef |-------------------+
+-------------+ |
|
... ===
|
+-------------+ |
| VMExternRef |-------------------'
+-------------+
```
The `value_ptr` member always points backwards to the start of the
dynamically-sized value (which is also the start of the heap allocation for
this value-and-`VMExternData` pair). Because it is a `dyn` pointer, it is
fat, and also points to the value's `Any` vtable.
The boxed value and the `VMExternRef` footer are held a single heap
allocation. The layout described above is used to make satisfying the
value's alignment easy: we just need to ensure that the heap allocation used
to hold everything satisfies its alignment. It also ensures that we don't
need a ton of excess padding between the `VMExternData` and the value for
values with large alignment.
* Reactor support.
This implements the new WASI ABI described here:
https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md
It adds APIs to `Instance` and `Linker` with support for running
WASI programs, and also simplifies the process of instantiating
WASI API modules.
This currently only includes Rust API support.
* Add comments and fix a typo in a comment.
* Fix a rustdoc warning.
* Tidy an unneeded `mut`.
* Factor out instance initialization with `NewInstance`.
This also separates instantiation from initialization in a manner
similar to https://github.com/bytecodealliance/lucet/pull/506.
* Update fuzzing oracles for the API changes.
* Remove `wasi_linker` and clarify that Commands/Reactors aren't connected to WASI.
* Move Command/Reactor semantics into the Linker.
* C API support.
* Fix fuzzer build.
* Update usage syntax from "::" to "=".
* Remove `NewInstance` and `start()`.
* Elaborate on Commands and Reactors and add a spec link.
* Add more comments.
* Fix wat syntax.
* Fix wat.
* Use the `Debug` formatter to format an anyhow::Error.
* Fix wat.
* Update `Table::grow`'s return to be the previous size
This brings it in line with `Memory::grow` and the `table.grow`
instruction which return the size of the table previously, not the size
of the table currently.
* Comment successful return
Previously we initialized trap handling (signals/etc) once-per-instance
but that's a bit too granular since we only need to do this as
one-time per-program initialization. This moves the initialization to
`Store` instead which means that we'll call this at least once per
thread, which some platforms may need (none currently do, they all only
need per-program initialization, but Fuchsia will need per-thread
initialization).
This commit fixes an issue in Wasmtime where Wasmtime would accidentally
"handle" non-wasm segfaults while executing host imports of wasm
modules. If a host import segfaulted then Wasmtime would recognize that
wasm code is on the stack, so it'd longjmp out of the wasm code. This
papers over real bugs though in host code and erroneously classified
segfaults as wasm traps.
The fix here was to add a check to our wasm signal handler for if the
faulting address falls in JIT code itself. Actually threading through
all the right information for that check to happen is a bit tricky,
though, so this involved some refactoring:
* A closure parameter to `catch_traps` was added. This closure is
responsible for classifying addresses as whether or not they fall in
JIT code. Anything returning `false` means that the trap won't get
handled and we'll forward to the next signal handler.
* To avoid passing tons of context all over the place, the start
function is now no longer automatically invoked by `InstanceHandle`.
This avoids the need for passing all sorts of trap-handling contextual
information like the maximum stack size and "is this a jit address"
closure. Instead creators of `InstanceHandle` (like wasmtime) are now
responsible for invoking the start function.
* To avoid excessive use of `transmute` with lifetimes since the
traphandler state now has a lifetime the per-instance custom signal
handler is now replaced with a per-store custom signal handler. I'm
not entirely certain the purpose of the custom signal handler, though,
so I'd look for feedback on this part.
A new test has been added which ensures that if a host function
segfaults we don't accidentally try to handle it, and instead we
correctly report the segfault.
* Revamp memory management of `InstanceHandle`
This commit fixes a known but in Wasmtime where an instance could still
be used after it was freed. Unfortunately the fix here is a bit of a
hammer, but it's the best that we can do for now. The changes made in
this commit are:
* A `Store` now stores all `InstanceHandle` objects it ever creates.
This keeps all instances alive unconditionally (along with all host
functions and such) until the `Store` is itself dropped. Note that a
`Store` is reference counted so basically everything has to be dropped
to drop anything, there's no longer any partial deallocation of instances.
* The `InstanceHandle` type's own reference counting has been removed.
This is largely redundant with what's already happening in `Store`, so
there's no need to manage two reference counts.
* Each `InstanceHandle` no longer tracks its dependencies in terms of
instance handles. This set was actually inaccurate due to dynamic
updates to tables and such, so we needed to revamp it anyway.
* Initialization of an `InstanceHandle` is now deferred until after
`InstanceHandle::new`. This allows storing the `InstanceHandle` before
side-effectful initialization, such as copying element segments or
running the start function, to ensure that regardless of the result of
instantiation the underlying `InstanceHandle` is still available to
persist in storage.
Overall this should fix a known possible way to safely segfault Wasmtime
today (yay!) and it should also fix some flaikness I've seen on CI.
Turns out one of the spec tests
(bulk-memory-operations/partial-init-table-segment.wast) exercises this
functionality and we were hitting sporating use-after-free, but only on
Windows.
* Shuffle some APIs around
* Comment weak cycle
* Implement interrupting wasm code, reimplement stack overflow
This commit is a relatively large change for wasmtime with two main
goals:
* Primarily this enables interrupting executing wasm code with a trap,
preventing infinite loops in wasm code. Note that resumption of the
wasm code is not a goal of this commit.
* Additionally this commit reimplements how we handle stack overflow to
ensure that host functions always have a reasonable amount of stack to
run on. This fixes an issue where we might longjmp out of a host
function, skipping destructors.
Lots of various odds and ends end up falling out in this commit once the
two goals above were implemented. The strategy for implementing this was
also lifted from Spidermonkey and existing functionality inside of
Cranelift. I've tried to write up thorough documentation of how this all
works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.
A brief summary of how this works is that each function and each loop
header now checks to see if they're interrupted. Interrupts and the
stack overflow check are actually folded into one now, where function
headers check to see if they've run out of stack and the sentinel value
used to indicate an interrupt, checked in loop headers, tricks functions
into thinking they're out of stack. An interrupt is basically just
writing a value to a location which is read by JIT code.
When interrupts are delivered and what triggers them has been left up to
embedders of the `wasmtime` crate. The `wasmtime::Store` type has a
method to acquire an `InterruptHandle`, where `InterruptHandle` is a
`Send` and `Sync` type which can travel to other threads (or perhaps
even a signal handler) to get notified from. It's intended that this
provides a good degree of flexibility when interrupting wasm code. Note
though that this does have a large caveat where interrupts don't work
when you're interrupting host code, so if you've got a host import
blocking for a long time an interrupt won't actually be received until
the wasm starts running again.
Some fallout included from this change is:
* Unix signal handlers are no longer registered with `SA_ONSTACK`.
Instead they run on the native stack the thread was already using.
This is possible since stack overflow isn't handled by hitting the
guard page, but rather it's explicitly checked for in wasm now. Native
stack overflow will continue to abort the process as usual.
* Unix sigaltstack management is now no longer necessary since we don't
use it any more.
* Windows no longer has any need to reset guard pages since we no longer
try to recover from faults on guard pages.
* On all targets probestack intrinsics are disabled since we use a
different mechanism for catching stack overflow.
* The C API has been updated with interrupts handles. An example has
also been added which shows off how to interrupt a module.
Closes#139Closes#860Closes#900
* Update comment about magical interrupt value
* Store stack limit as a global value, not a closure
* Run rustfmt
* Handle review comments
* Add a comment about SA_ONSTACK
* Use `usize` for type of `INTERRUPTED`
* Parse human-readable durations
* Bring back sigaltstack handling
Allows libstd to print out stack overflow on failure still.
* Add parsing and emission of stack limit-via-preamble
* Fix new example for new apis
* Fix host segfault test in release mode
* Fix new doc example
* Compute instance exports on demand.
Instead having instances eagerly compute a Vec of Externs, and bumping
the refcount for each Extern, compute Externs on demand.
This also enables `Instance::get_export` to avoid doing a linear search.
This also means that the closure returned by `get0` and friends now
holds an `InstanceHandle` to dynamically hold the instance live rather
than being scoped to a lifetime.
* Compute module imports and exports on demand too.
And compute Extern::ty on demand too.
* Add a utility function for computing an ExternType.
* Add a utility function for looking up a function's signature.
* Add a utility function for computing the ValType of a Global.
* Rename wasmtime_environ::Export to EntityIndex.
This helps differentiate it from other Export types in the tree, and
describes what it is.
* Fix a typo in a comment.
* Simplify module imports and exports.
* Make `Instance::exports` return the export names.
This significantly simplifies the public API, as it's relatively common
to need the names, and this avoids the need to do a zip with
`Module::exports`.
This also changes `ImportType` and `ExportType` to have public members
instead of private members and accessors, as I find that simplifies the
usage particularly in cases where there are temporary instances.
* Remove `Instance::module`.
This doesn't quite remove `Instance`'s `module` member, it gets a step
closer.
* Use a InstanceHandle utility function.
* Don't consume self in the `Func::get*` methods.
Instead, just create a closure containing the instance handle and the
export for them to call.
* Use `ExactSizeIterator` to avoid needing separate `num_*` methods.
* Rename `Extern::func()` etc. to `into_func()` etc.
* Revise examples to avoid using `nth`.
* Add convenience methods to instance for getting specific extern types.
* Use the convenience functions in more tests and examples.
* Avoid cloning strings for `ImportType` and `ExportType`.
* Remove more obviated clone() calls.
* Simplify `Func`'s closure state.
* Make wasmtime::Export's fields private.
This makes them more consistent with ExportType.
* Fix compilation error.
* Make a lifetime parameter explicit, and use better lifetime names.
Instead of 'me, use 'instance and 'module to make it clear what the
lifetime is.
* More lifetime cleanups.
- Undo temporary changes to default features (`all-arch`) and a
signal-handler test.
- Remove `SIGTRAP` handler: no longer needed now that we've found an
"undefined opcode" option on ARM64.
- Rename pp.rs to pretty_print.rs in machinst/.
- Only use empty stack-probe on non-x86. As per a comment in
rust-lang/compiler-builtins [1], LLVM only supports stack probes on
x86 and x86-64. Thus, on any other CPU architecture, we cannot refer
to `__rust_probestack`, because it does not exist.
- Rename arm64 to aarch64.
- Use `target` directive in vcode filetests.
- Run the flags verifier, but without encinfo, when using new backends.
- Clean up warning overrides.
- Fix up use of casts: use u32::from(x) and siblings when possible,
u32::try_from(x).unwrap() when not, to avoid silent truncation.
- Take immutable `Function` borrows as input; we don't actually
mutate the input IR.
- Lots of other miscellaneous cleanups.
[1] cae3e6ea23/src/probestack.rs (L39)
* Consolidate trap/frame information
This commit removes `TrapRegistry` in favor of consolidating this
information in the `FRAME_INFO` we already have in the `wasmtime` crate.
This allows us to keep information generally in one place and have one
canonical location for "map this PC to some original wasm stuff". The
intent for this is to next update with enough information to go from a
program counter to a position in the original wasm file.
* Expose module offset information in `FrameInfo`
This commit implements functionality for `FrameInfo`, the wasm stack
trace of a `Trap`, to return the module/function offset. This allows
knowing the precise wasm location of each stack frame, instead of only
the main trap itself. The intention here is to provide more visibility
into the wasm source when something traps, so you know precisely where
calls were and where traps were, in order to assist in debugging.
Eventually we might use this information for mapping back to native
source languages as well (given sufficient debug information).
This change makes a previously-optional artifact of compilation always
computed on the cranelift side of things. This `ModuleAddressMap` is
then propagated to the same store of information other frame information
is stored within. This also removes the need for passing a `SourceLoc`
with wasm traps or to wasm trap creation, since the backtrace's wasm
frames will be able to infer their own `SourceLoc` from the relevant
program counters.
This commit adds a few odds and ends required to build wasmtime on ARM64
with the new backend. In particular, it adds:
- Support for the `Arm64Call` relocation type.
- Support for fetching the trap PC when a signal is received.
- A hook for `SIGTRAP`, which is sent by the `brk` opcode (in contrast to
x86's `SIGILL`).
With the patch sequence up to and including this patch applied,
`wasmtime` can now compile and successfully execute code on arm64. Not
all tests pass yet, but basic Wasm/WASI tests work correctly.
This commit optimizes the codegen of `Func::wrap` such that if you do
something like `Func::wrap(&store, || {})` then the shim generated
contains zero code (as expected). In general this means that the extra
tidbits generated by wasmtime are all eligible to be entirely optimized
away so long as you don't actually rely on something.
* Option for host managed memory
* Rename Allocator to MemoryCreator
* Create LinearMemory and MemoryCreator traits in api
* Leave only one as_ptr function in LinearMemory trait
* Memory creator test
* Update comments/docs for LinearMemory and MemoryCreator traits
* Add guard page to the custom memory example
* Remove mut from LinearMemory trait as_ptr
* Host_memory_grow test
* Increase the size of the sigaltstack.
Rust's stack overflow handler installs a sigaltstack stack with size
SIGSTKSZ, which is too small for some of the things we do in signal
handlers, and as of this writing lacks a guard page. Install bigger
sigaltstack stacks so that we have enough space, and have a guard page.
* Wasmtime 0.15.0 and Cranelift 0.62.0. (#1398)
* Bump more ad-hoc versions.
* Add build.rs to wasi-common's Cargo.toml.
* Update the env var name in more places.
* Remove a redundant echo.
* Bump Wasmtime to 0.14.0.
* Update the publish script for the wiggle crate wiggle.
* More fixes.
* Fix lightbeam depenency version.
* cargo update
* Cargo update wasi-tests too.
And add cargo update to the version-bump scripts.
* Remove C++ dependency from `wasmtime`
This commit removes the last wads of C++ that we have in wasmtime,
meaning that building wasmtime no longer requires a C++ compiler. It
still does require a C toolchain for some minor purposes, but hopefully
we can remove that over time too!
The motivation for doing this is to consolidate all our signal-handling
code into one location in one language so you don't have to keep
crossing back and forth when understanding what's going on. This also
allows us to remove some extra cruft that wasn't necessary from the C++
original implementation. Additionally this should also make building
wasmtime a bit more portable since it's often easier to acquire a C
toolchain than it is to acquire a C++ toolchain. (e.g. if you're
cross-compiling to a musl target)
* Typos
* Enable jitdump profiling support by default
This the result of some of the investigation I was doing for #1017. I've
done a number of refactorings here which culminated in a number of
changes that all amount to what I think should result in jitdump support being
enabled by default:
* Pass in a list of finished functions instead of just a range to
ensure that we're emitting jit dump data for a specific module rather
than a whole `CodeMemory` which may have other modules.
* Define `ProfilingStrategy` in the `wasmtime` crate to have everything
locally-defined
* Add support to the C API to enable profiling
* Documentation added for profiling with jitdump to the book
* Split out supported/unsupported files in `jitdump.rs` to avoid having
lots of `#[cfg]`.
* Make dependencies optional that are only used for `jitdump`.
* Move initialization up-front to `JitDumpAgent::new()` instead of
deferring it to the first module.
* Pass around `Arc<dyn ProfilingAgent>` instead of
`Option<Arc<Mutex<Box<dyn ProfilingAgent>>>>`
The `jitdump` Cargo feature is now enabled by default which means that
our published binaries, C API artifacts, and crates will support
profiling at runtime by default. The support I don't think is fully
fleshed out and working but I think it's probably in a good enough spot
we can get users playing around with it!
* Remove `WrappedCallable` indirection
At this point `Func` has evolved quite a bit since inception and the
`WrappedCallable` trait I don't believe is needed any longer. This
should help clean up a few entry points by having fewer traits in play.
* Remove the `Callable` trait
This commit removes the `wasmtime::Callable` trait, changing the
signature of `Func::new` to take an appropriately typed `Fn`.
Additionally the function now always takes `&Caller` like `Func::wrap`
optionally can, to empower `Func::new` to have the same capabilities of
`Func::wrap`.
* Add a test for an already-fixed issue
Closes#849
* rustfmt
* Update more locations for `Callable`
* rustfmt
* Remove a stray leading borrow
* Review feedback
* Remove unneeded `wasmtime_call_trampoline` shim
* Refactor wasmtime_runtime::Export
Instead of an enumeration with variants that have data fields have an
enumeration where each variant has a struct, and each struct has the
data fields. This allows us to store the structs in the `wasmtime` API
and avoid lots of `panic!` calls and various extraneous matches.
* Pre-generate trampoline functions
The `wasmtime` crate supports calling arbitrary function signatures in
wasm code, and to do this it generates "trampoline functions" which have
a known ABI that then internally convert to a particular signature's ABI
and call it. These trampoline functions are currently generated
on-the-fly and are cached in the global `Store` structure. This,
however, is suboptimal for a few reasons:
* Due to how code memory is managed each trampoline resides in its own
64kb allocation of memory. This means if you have N trampolines you're
using N * 64kb of memory, which is quite a lot of overhead!
* Trampolines are never free'd, even if the referencing module goes
away. This is similar to #925.
* Trampolines are a source of shared state which prevents `Store` from
being easily thread safe.
This commit refactors how trampolines are managed inside of the
`wasmtime` crate and jit/runtime internals. All trampolines are now
allocated in the same pass of `CodeMemory` that the main module is
allocated into. A trampoline is generated per-signature in a module as
well, instead of per-function. This cache of trampolines is stored
directly inside of an `Instance`. Trampolines are stored based on
`VMSharedSignatureIndex` so they can be looked up from the internals of
the `ExportFunction` value.
The `Func` API has been updated with various bits and pieces to ensure
the right trampolines are registered in the right places. Overall this
should ensure that all trampolines necessary are generated up-front
rather than lazily. This allows us to remove the trampoline cache from
the `Compiler` type, and move one step closer to making `Compiler`
threadsafe for usage across multiple threads.
Note that as one small caveat the `Func::wrap*` family of functions
don't need to generate a trampoline at runtime, they actually generate
the trampoline at compile time which gets passed in.
Also in addition to shuffling a lot of code around this fixes one minor
bug found in `code_memory.rs`, where `self.position` was loaded before
allocation, but the allocation may push a new chunk which would cause
`self.position` to be zero instead.
* Pass the `SignatureRegistry` as an argument to where it's needed.
This avoids the need for storing it in an `Arc`.
* Ignore tramoplines for functions with lots of arguments
Co-authored-by: Dan Gohman <sunfish@mozilla.com>
* Enable the already-passing `bulk-memoryoperations/imports.wast` test
* Implement support for the `memory.init` instruction and passive data
This adds support for passive data segments and the `memory.init` instruction
from the bulk memory operations proposal. Passive data segments are stored on
the Wasm module and then `memory.init` instructions copy their contents into
memory.
* Implement the `data.drop` instruction
This allows wasm modules to deallocate passive data segments that it doesn't
need anymore. We keep track of which segments have not been dropped on an
`Instance` and when dropping them, remove the entry from the instance's hash
map. The module always needs all of the segments for new instantiations.
* Enable final bulk memory operations spec test
This requires special casing an expected error message for an `assert_trap`,
since the expected error message contains the index of an uninitialized table
element, but our trap implementation doesn't save that diagnostic information
and shepherd it out.
* rename PassiveElemIndex to ElemIndex and same for PassiveDataIndex (#1411)
* rename PassiveDataIndex to DataIndex
* rename PassiveElemIndex to ElemIndex
* Apply renamings to wasmtime as well
* Run rustfmt
Co-authored-by: csmoe <csmoe@msn.com>
* Add a version to a path dependeency for publishing on crates.io.
* Add a README.md for wasmtime-profiling.
* Add versions to the wasmtime-profiling dependencies.
Essentially, table and memory out of bounds errors are no longer link errors,
but traps after linking. This means that the partail writes / inits are visible.
This adds support for the `table.copy` instruction from the bulk memory
proposal. It also supports multiple tables, which were introduced by the
reference types proposal.
Part of #928
* Improve robustness of cache loading/storing
Today wasmtime incorrectly loads compiled compiled modules from the
global cache when toggling settings such as optimizations. For example
if you execute `wasmtime foo.wasm` that will cache globally an
unoptimized version of the wasm module. If you then execute `wasmtime -O
foo.wasm` it would then reload the unoptimized version from cache, not
realizing the compilation settings were different, and use that instead.
This can lead to very surprising behavior naturally!
This commit updates how the cache is managed in an attempt to make it
much more robust against these sorts of issues. This takes a leaf out of
rustc's playbook and models the cache with a function that looks like:
fn load<T: Hash>(
&self,
data: T,
compute: fn(T) -> CacheEntry,
) -> CacheEntry;
The goal here is that it guarantees that all the `data` necessary to
`compute` the result of the cache entry is hashable and stored into the
hash key entry. This was previously open-coded and manually managed
where items were hashed explicitly, but this construction guarantees
that everything reasonable `compute` could use to compile the module is
stored in `data`, which is itself hashable.
This refactoring then resulted in a few workarounds and a few fixes,
including the original issue:
* The `Module` type was split into `Module` and `ModuleLocal` where only
the latter is hashed. The previous hash function for a `Module` left
out items like the `start_func` and didn't hash items like the imports
of the module. Omitting the `start_func` was fine since compilation
didn't actually use it, but omitting imports seemed uncomfortable
because while compilation didn't use the import values it did use the
*number* of imports, which seems like it should then be put into the
cache key. The `ModuleLocal` type now derives `Hash` to guarantee that
all of its contents affect the hash key.
* The `ModuleTranslationState` from `cranelift-wasm` doesn't implement
`Hash` which means that we have a manual wrapper to work around that.
This will be fixed with an upstream implementation, since this state
affects the generated wasm code. Currently this is just a map of
signatures, which is present in `Module` anyway, so we should be good
for the time being.
* Hashing `dyn TargetIsa` was also added, where previously it was not
fully hashed. Previously only the target name was used as part of the
cache key, but crucially the flags of compilation were omitted (for
example the optimization flags). Unfortunately the trait object itself
is not hashable so we still have to manually write a wrapper to hash
it, but we likely want to add upstream some utilities to hash isa
objects into cranelift itself. For now though we can continue to add
hashed fields as necessary.
Overall the goal here was to use the compiler to expose what we're not
hashing, and then make sure we organize data and write the right code to
ensure everything is hashed, and nothing more.
* Update crates/environ/src/module.rs
Co-Authored-By: Peter Huene <peterhuene@protonmail.com>
* Fix lightbeam
* Fix compilation of tests
* Update the expected structure of the cache
* Revert "Update the expected structure of the cache"
This reverts commit 2b53fee426a4e411c313d8c1e424841ba304a9cd.
* Separate the cache dir a bit
* Add a test the cache is busted with opt levels
* rustfmt
Co-authored-by: Peter Huene <peterhuene@protonmail.com>
Patch adds support for the perf jitdump file specification.
With this patch it should be possible to see profile data for code
generated and maped at runtime. Specifically the patch adds support
for the JIT_CODE_LOAD and the JIT_DEBUG_INFO record as described in
the specification. Dumping jitfiles is enabled with the --jitdump
flag. When the -g flag is also used there is an attempt to dump file
and line number information where this option would be most useful
when the WASM file already includes DWARF debug information.
The generation of the jitdump files has been tested on only a few wasm
files. This patch is expected to be useful/serviceable where currently
there is no means for jit profiling, but future patches may benefit
line mapping and add support for additional jitdump record types.
Usage Example:
Record
sudo perf record -k 1 -e instructions:u target/debug/wasmtime -g
--jitdump test.wasm
Combine
sudo perf inject -v -j -i perf.data -o perf.jit.data
Report
sudo perf report -i perf.jit.data -F+period,srcline
* Add API to statically assert signature of a `Func`
This commit add a family of APIs to `Func` named `getN` where `N` is the
number of arguments. Each function will attempt to statically assert the
signature of a `Func` and, if matching, returns a corresponding closure
which can be used to invoke the underlying function.
The purpose of this commit is to add a highly optimized way to enter a
wasm module, performing type checks up front and avoiding all the costs
of boxing and unboxing arguments within a `Val`. In general this should
be much more optimized than the previous `call` API for entering a wasm
module, if the signature is statically known.
* rustfmt
* Remove stray debugging
Previously `Instance` was always allocated with `mmap`. This was done to
future-proof `Instance` for allowing storing the memory itself inline
with an `Instance` allocation, but this can actually be done with
`alloc`/`dealloc` since they take an alignment. By using `malloc`/`free`
we can avoid fragmentation as well as hook into standard leak tracking
mechanisms.
* Generate trampolines based on signatures
Instead of generating a trampoline-per-function generate a
trampoline-per-signature. This should hopefully greatly increase the
cache hit rate on trampolines within a module and avoid generating a
function-per-function.
* Update crates/runtime/src/traphandlers.rs
Co-Authored-By: Sergei Pepyakin <s.pepyakin@gmail.com>
Co-authored-by: Sergei Pepyakin <s.pepyakin@gmail.com>
* Update cranelift to 0.58.0
* Update `wasmprinter` dep to require 0.2.1
We already had it in the lock file, but this ensures we won't ever go back down.
* Ensure that our error messages match `assert_invalid`'s
The bulk of this work was done in
https://github.com/bytecodealliance/wasmparser/pull/186 but now we can test it
at the `wasmtime` level as well.
Fixes#492
* Stop feeling guilty about not matching `assert_malformed` messages
Remove the "TODO" and stop printing warning messages. These would just be busy
work to implement, and getting all the messages the exact same relies on using
the same structure as the spec interpreter's parser, which means that where you
have a helper function and they don't, then things go wrong, and vice versa. Not
worth it.
Fixes#492
* Enable (but ignore) the reference-types proposal tests
* Match test suite directly, instead of roundabout starts/endswith
* Enable (but ignore) bulk memory operations proposal test suite
* Remove the `jit_function_registry` global state
This commit removes on the final pieces of global state in wasmtime
today, the `jit_function_registry` module. The purpose of this module is
to help translate a native backtrace with native program counters into a
wasm backtrace with module names, function names, and wasm module
indices. To that end this module retained a global map of function
ranges to this metadata information for each compiled function.
It turns out that we already had a `NAMES` global in the `wasmtime`
crate for symbolicating backtrace addresses, so this commit moves that
global into its own file and restructures the internals to account for
program counter ranges as well. The general set of changes here are:
* Remove `jit_function_registry`
* Remove `NAMES`
* Create a new `frame_info` module which has a singleton global
registering compiled module's frame information.
* Update traps to use the `frame_info` module to symbolicate pcs,
directly extracting a `FrameInfo` from the module.
* Register and unregister information on a module level instead of on a
per-function level (at least in terms of locking granluarity).
This commit leaves the new `FRAME_INFO` global variable as the only
remaining "critical" global variable in `wasmtime`, which only exists
due to the API of `Trap` where it doesn't take in any extra context when
capturing a stack trace through which we could hang off frame
information. I'm thinking though that this is ok, and we can always
tweak the API of `Trap` in the future if necessary if we truly need to
accomodate this.
* Remove a lazy_static dep
* Add some comments and restructure