This commit is intended to do almost everything necessary for processing
the alias section of module linking. Most of this is internal
refactoring, the highlights being:
* Type contents are now stored separately from a `wasmtime_env::Module`.
Given that modules can freely alias types and have them used all over
the place, it seemed best to have one canonical location to type
storage which everywhere else points to (with indices). A new
`TypeTables` structure is produced during compilation which is shared
amongst all member modules in a wasm blob.
* Instantiation is heavily refactored to account for module linking. The
main gotcha here is that imports are now listed as "initializers". We
have a sort of pseudo-bytecode-interpreter which interprets the
initialization of a module. This is more complicated than just
matching imports at this point because in the module linking proposal
the module, alias, import, and instance sections may all be
interleaved. This means that imports aren't guaranteed to show up at
the beginning of the address space for modules/instances.
Otherwise most of the changes here largely fell out from these two
design points. Aliases are recorded as initializers in this scheme.
Copying around type information and/or just knowing type information
during compilation is also pretty easy since everything is just a
pointer into a `TypeTables` and we don't have to actually copy any types
themselves. Lots of various refactorings were necessary to accomodate
these changes.
Tests are hoped to cover a breadth of functionality here, but not
necessarily a depth. There's still one more piece of the module linking
proposal missing which is exporting instances/modules, which will come
in a future PR.
It's also worth nothing that there's one large TODO which isn't
implemented in this change that I plan on opening an issue for.
With module linking when a set of modules comes back from compilation
each modules has all the trampolines for the entire set of modules. This
is quite a lot of duplicate trampolines across module-linking modules.
We'll want to refactor this at some point to instead have only one set
of trampolines per set of module linking modules and have them shared
from there. I figured it was best to separate out this change, however,
since it's purely related to resource usage, and doesn't impact
non-module-linking modules at all.
cc #2094
* Provide filename/line number information in `Trap`
This commit extends the `Trap` type and `Store` to retain DWARF debug
information found in a wasm file unconditionally, if it's present. This
then enables us to print filenames and line numbers which point back to
actual source code when a trap backtrace is printed. Additionally the
`FrameInfo` type has been souped up to return filename/line number
information as well.
The implementation here is pretty simplistic currently. The meat of all
the work happens in `gimli` and `addr2line`, and otherwise wasmtime is
just schlepping around bytes of dwarf debuginfo here and there!
The general goal here is to assist with debugging when using wasmtime
because filenames and line numbers are generally orders of magnitude
better even when you already have a stack trace. Another nicety here is
that backtraces will display inlined frames (learned through debug
information), improving the experience in release mode as well.
An example of this is that with this file:
```rust
fn main() {
panic!("hello");
}
```
we get this stack trace:
```
$ rustc foo.rs --target wasm32-wasi -g
$ cargo run foo.wasm
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/wasmtime foo.wasm`
thread 'main' panicked at 'hello', foo.rs:2:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: failed to run main module `foo.wasm`
Caused by:
0: failed to invoke command default
1: wasm trap: unreachable
wasm backtrace:
0: 0x6c1c - panic_abort::__rust_start_panic::abort::h2d60298621b1ccbf
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/panic_abort/src/lib.rs:77:17
- __rust_start_panic
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/panic_abort/src/lib.rs:32:5
1: 0x68c7 - rust_panic
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:626:9
2: 0x65a1 - std::panicking::rust_panic_with_hook::h2345fb0909b53e12
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:596:5
3: 0x1436 - std::panicking::begin_panic::{{closure}}::h106f151a6db8c8fb
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:506:9
4: 0xda8 - std::sys_common::backtrace::__rust_end_short_backtrace::he55aa13f22782798
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/sys_common/backtrace.rs:153:18
5: 0x1324 - std::panicking::begin_panic::h1727e7d1d719c76f
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:505:12
6: 0xfde - foo::main::h2db1313a64510850
at /Users/acrichton/code/wasmtime/foo.rs:2:5
7: 0x11d5 - core::ops::function::FnOnce::call_once::h20ee1cc04aeff1fc
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/ops/function.rs:227:5
8: 0xddf - std::sys_common::backtrace::__rust_begin_short_backtrace::h054493e41e27e69c
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/sys_common/backtrace.rs:137:18
9: 0x1d5a - std::rt::lang_start::{{closure}}::hd83784448d3fcb42
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/rt.rs:66:18
10: 0x69d8 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h564d3dad35014917
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/ops/function.rs:259:13
- std::panicking::try::do_call::hdca4832ace5a8603
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:381:40
- std::panicking::try::ha8624a1a6854b456
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:345:19
- std::panic::catch_unwind::h71421f57cf2bc688
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panic.rs:382:14
- std::rt::lang_start_internal::h260050c92cd470af
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/rt.rs:51:25
11: 0x1d0c - std::rt::lang_start::h0b4bcf3c5e498224
at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/rt.rs:65:5
12: 0xffc - <unknown>!__original_main
13: 0x393 - __muloti4
at /cargo/registry/src/github.com-1ecc6299db9ec823/compiler_builtins-0.1.35/src/macros.rs:269
```
This is relatively noisy by default but there's filenames and line
numbers! Additionally frame 10 can be seen to have lots of frames
inlined into it. All information is always available to the embedder but
we could try to handle the `__rust_begin_short_backtrace` and
`__rust_end_short_backtrace` markers to trim the backtrace by default as
well.
The only gotcha here is that it looks like `__muloti4` is out of place.
That's because the libc that Rust ships with doesn't have dwarf
information, although I'm not sure why we land in that function for
symbolizing it...
* Add a configuration switch for debuginfo
* Control debuginfo by default with `WASM_BACKTRACE_DETAILS`
* Try cpp_demangle on demangling as well
* Rename to WASMTIME_BACKTRACE_DETAILS
This commit adds lots of plumbing to get the type section from the
module linking proposal plumbed all the way through to the `wasmtime`
crate and the `wasmtime-c-api` crate. This isn't all that useful right
now because Wasmtime doesn't support imported/exported
modules/instances, but this is all necessary groundwork to getting that
exported at some point. I've added some light tests but I suspect the
bulk of the testing will come in a future commit.
One major change in this commit is that `SignatureIndex` no longer
follows type type index space in a wasm module. Instead a new
`TypeIndex` type is used to track that. Function signatures, still
indexed by `SignatureIndex`, are then packed together tightly.
This commit is intended to be the first of many in implementing the
module linking proposal. At this time this builds on #2059 so it
shouldn't land yet. The goal of this commit is to compile bare-bones
modules which use module linking, e.g. those with nested modules.
My hope with module linking is that almost everything in wasmtime only
needs mild refactorings to handle it. The goal is that all per-module
structures are still per-module and at the top level there's just a
`Vec` containing a bunch of modules. That's implemented currently where
`wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of
which one it's pointing to. This should enable
serialization/deserialization of any module in a nested modules
scenario, no matter how you got it.
Tons of features of the module linking proposal are missing from this
commit. For example instantiation flat out doesn't work, nor does
import/export of modules or instances. That'll be coming as future
commits, but the purpose here is to start laying groundwork in Wasmtime
for handling lots of modules in lots of places.
After compilation there's actually no need to hold onto the native
signature for a wasm function type, so this commit moves out the
`ir::Signature` value from a `Module` into a separate field that's
deallocated when compilation is finished. This simplifies the
`SignatureRegistry` because it only needs to track wasm functino types
and it also means less work is done for `Func::wrap`.
Some platform ABIs require i32 values to be zero- or sign-extended
to the full register width. The extension is implemented by the
cranelift codegen backend, but this happens only if the appropriate
"uext" or "sext" attribute is present in the cranelift IR.
For calls to builtin functions, that IR is synthesized by the code
in func_environ.rs -- to ensure correct codegen for the target ABI,
this code needs to add those attributes as necessary.
This commit reduces the size of `InstructionAddressMap` from 24 bytes to
8 bytes by dropping the `code_len` field and reducing `code_offset` to
`u32` instead of `usize`. The intention is to primarily make the
in-memory version take up less space, and the hunch is that the
`code_len` is largely not necessary since most entries in this map are
always adjacent to one another. The `code_len` field is now implied by
the `code_offset` field of the next entry in the map.
This isn't as big of an improvement to serialized module size as #2321
or #2322, primarily because of the switch to variable-length encoding.
Despite this though it shaves about 10MB off the encoded size of the
module from #2318
This fixes an issue where `ensure_inserted_block()` wasn't called before
we do some block manipulation in the Wasmtime translation of some
table-related instructions. It looks like `ensure_inserted_block()` is
otherwise called on most instructions being added, so we just need to
call it explicitly it seems here.
Closes#2347
Turns out this wasn't needed anywhere! Additionally we can construct it
from `InstructionAddressMap` anyway. There's so many pieces of trap
information that it's best to keep these structures small as well.
This commit compresses `FunctionAddressMap` by performing a simple
coalescing of adjacent `InstructionAddressMap` descriptors if they
describe the same source location. This is intended to handle the common
case where a sequene of machine instructions describes a high-level wasm
instruction.
For the module on #2318 this reduces the cache entry size from 306MB to
161MB.
This commit adds initial (gated) support for the multi-memory wasm
proposal. This was actually quite easy since almost all of wasmtime
already expected multi-memory to be implemented one day. The only real
substantive change is the `memory.copy` intrinsic changes, which now
accounts for the source/destination memories possibly being different.
* Validate modules while translating
This commit is a change to cranelift-wasm to validate each function body
as it is translated. Additionally top-level module translation functions
will perform module validation. This commit builds on changes in
wasmparser to perform module validation interwtwined with parsing and
translation. This will be necessary for future wasm features such as
module linking where the type behind a function index, for example, can
be far away in another module. Additionally this also brings a nice
benefit where parsing the binary only happens once (instead of having an
up-front serial validation step) and validation can happen in parallel
for each function.
Most of the changes in this commit are plumbing to make sure everything
lines up right. The major functional change here is that module
compilation should be faster by validating in parallel (or skipping
function validation entirely in the case of a cache hit). Otherwise from
a user-facing perspective nothing should be that different.
This commit does mean that cranelift's translation now inherently
validates the input wasm module. This means that the Spidermonkey
integration of cranelift-wasm will also be validating the function as
it's being translated with cranelift. The associated PR for wasmparser
(bytecodealliance/wasmparser#62) provides the necessary tools to create
a `FuncValidator` for Gecko, but this is something I'll want careful
review for before landing!
* Read function operators until EOF
This way we can let the validator take care of any issues with
mismatched `end` instructions and/or trailing operators/bytes.
This commit extracts the two implementations of `Compiler` into two
separate crates, `wasmtime-cranelfit` and `wasmtime-lightbeam`. The
`wasmtime-jit` crate then depends on these two and instantiates them
appropriately. The goal here is to start reducing the weight of the
`wasmtime-environ` crate, which currently serves as a common set of
types between all `wasmtime-*` crates. Long-term I'd like to remove the
dependency on Cranelift from `wasmtime-environ`, but that's going to
take a lot more work.
In the meantime I figure it's a good way to get started by separating
out the lightbeam/cranelift function compilers from the
`wasmtime-environ` crate. We can continue to iterate on moving things
out in the future, too.