wasmtime

Author	SHA1	Message	Date
Alex Crichton	038383dc42	Implement support for outer core type aliases (#4385) Fill in the gaps of the implementation left after #4380.	2022-07-07 09:38:27 -07:00
Nick Fitzgerald	7000b0a4cf	wasmtime: Add criterion micro benchmarks for traps (#4398 ) * wasmtime: Rename host->wasm trampolines As we introduce new types of trampolines, having clear names for our existing trampolines will be helpful. * Fix typo in docs for `VMCOMPONENT_MAGIC` * wasmtime: Add criterion micro benchmarks for traps	2022-07-07 00:20:40 +00:00
Alex Crichton	41ba851a95	Bump versions of wasm-tools crates (#4380 ) * Bump versions of wasm-tools crates Note that this leaves new features in the component model, outer type aliases for core wasm types, unimplemented for now. * Move to crates.io-based versions of tools	2022-07-05 14:23:03 -05:00
Alex Crichton	76a2545a7f	Implement nested instance exports for components (#4364 ) This commit adds support to Wasmtime for components which themselves export instances. The support here adds new APIs for how instance exports are accessed in the embedding API. For now this is mostly just a first-pass where the API is somewhat confusing and has a lot of lifetimes. I'm hoping that over time we can figure out how to simplify this but for now it should at least be expressive enough for exploring the exports of an instance.	2022-07-05 16:04:54 +00:00
wasmtime-publish	7c428bbd62	Bump Wasmtime to 0.40.0 (#4378 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-07-05 09:10:52 -05:00
Alex Crichton	f0278c5db7	Implement `canon lower` of a `canon lift` function in the same component (#4347) * Implement `canon lower` of a `canon lift` function in the same component This commit implements the "degenerate" logic for implementing a function within a component that is lifted and then immediately lowered again. In this situation the lowered function will immediately generate a trap and doesn't need to implement anything else. The implementation in this commit is somewhat heavyweight but I think is probably justified more so by future additions to the component model than by what exactly is here right now. It's not expected that this "always trap" functionality will really be used all that often since it would generally mean a buggy component, but the functionality plumbed through here is hopefully going to be useful for implementing component-to-component adapter trampolines. Specifically this commit implements a strategy where the `canon.lower`'d function is generated by Cranelift and simply has a single trap instruction when called, doing nothing else. The main complexity comes from juggling around all the data associated with these functions, primarily plumbing through the traps into the `ModuleRegistry` to ensure that the global `is_wasm_trap_pc` function returns `true` and at runtime when we look up information about the trap it's all readily available (e.g. translating the trapping pc to a `TrapCode`). * Fix non-component build * Fix some offset calculations * Only create one "always trap" per signature Use an internal map to deduplicate during compilation.	2022-06-29 16:35:37 +00:00
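The "only create one always-trap per signature" step above can be pictured as a small cache keyed by signature. This is a minimal sketch, not the code from the commit: `SignatureIndex`, `AlwaysTrapId`, and `AlwaysTrapCache` are hypothetical names, and "compiling" is reduced to recording the signature.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a core wasm signature index.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SignatureIndex(pub u32);

/// Hypothetical handle to one compiled "always trap" function body.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AlwaysTrapId(pub usize);

/// Deduplicates "always trap" functions so each signature is compiled once.
#[derive(Default)]
pub struct AlwaysTrapCache {
    by_signature: HashMap<SignatureIndex, AlwaysTrapId>,
    // One entry per body that was actually "compiled" (assumption: compiling
    // is modeled as remembering which signature the body was built for).
    compiled: Vec<SignatureIndex>,
}

impl AlwaysTrapCache {
    /// Returns the existing body for `sig`, compiling a new one only on a miss.
    pub fn get_or_compile(&mut self, sig: SignatureIndex) -> AlwaysTrapId {
        let compiled = &mut self.compiled;
        *self.by_signature.entry(sig).or_insert_with(|| {
            compiled.push(sig);
            AlwaysTrapId(compiled.len() - 1)
        })
    }

    /// Number of distinct bodies compiled so far.
    pub fn compiled_count(&self) -> usize {
        self.compiled.len()
    }
}
```

Requesting the same signature twice returns the same id, so the number of compiled bodies grows with the number of distinct signatures rather than with the number of lowered functions.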
Alex Crichton	eef1758d19	Implement a first-class error for reexported component functions (#4348) Currently I don't know how we can reasonably implement this. Given all the signatures of how we call functions and how functions are called on the host there's no real feasible way that I know of to hook these two up "seamlessly". This means that a component which reexports an imported function can't be run in Wasmtime. One of the main reasons for this is that when calling a component function Wasmtime wants to lower arguments first and then have them lifted when the host is called. With a reexport though there's not actually anything to lower into so we'd sort of need something similar to a table on the side or maybe a linear memory and that seems like it'd get quite complicated quite quickly for not really all that much benefit. As such, for now this simply returns a first-class error (rather than the current panic) in situations like this.	2022-06-29 09:05:40 -05:00
Alex Crichton	c1b3962f7b	Implement lowered-then-lifted functions (#4327 ) * Implement lowered-then-lifted functions This commit is a few features bundled into one, culminating in the implementation of lowered-then-lifted functions for the component model. It's probably not going to be used all that often but this is possible within a valid component so Wasmtime needs to do something relatively reasonable. The main things implemented in this commit are: * Component instances are now assigned a `RuntimeComponentInstanceIndex` to differentiate each one. This will be used in the future to detect fusion (one instance lowering a function from another instance). For now it's used to allocate separate `VMComponentFlags` for each internal component instance. * The `CoreExport<FuncIndex>` of lowered functions was changed to a `CoreDef` since technically a lowered function can use another lowered function as the callee. This ended up being not too difficult to plumb through as everything else was already in place. * A need arose to compile host-to-wasm trampolines which weren't already present. Currently wasm in a component is always entered through a host-to-wasm trampoline but core wasm modules are the source of all the trampolines. In the case of a lowered-then-lifted function there may not actually be any core wasm modules, so component objects now contain necessary trampolines not otherwise provided by the core wasm objects. This feature required splitting a new function into the `Compiler` trait for creating a host-to-wasm trampoline. After doing this core wasm compilation was also updated to leverage this which further enabled compiling trampolines in parallel as opposed to the previous synchronous compilation. * Review comments	2022-06-28 18:50:08 +00:00
Alex Crichton	3339dd1f01	Implement the post-return attribute (#4297 ) This commit implements the `post-return` feature of the canonical ABI in the component model. This attribute is an optionally-specified function which is to be executed after the return value has been processed by the caller to optionally clean-up the return value. This enables, for example, returning an allocated string and the host then knows how to clean it up to prevent memory leaks in the original module. The API exposed in this PR changes the prior `TypedFunc::call` API in behavior but not in its signature. Previously the `TypedFunc::call` method would set the `may_enter` flag on the way out, but now that operation is deferred until a new `TypedFunc::post_return` method is called. This means that once a method on an instance is invoked then nothing else can be done on the instance until the `post_return` method is called. Note that the method must be called irrespective of whether the `post-return` canonical ABI option was specified or not. Internally wasm will be invoked if necessary. This is a pretty wonky and unergonomic API to work with. For now I couldn't think of a better alternative that improved on the ergonomics. In the theory that the raw Wasmtime bindings for a component may not be used all that heavily (instead `wit-bindgen` would largely be used) I'm hoping that this isn't too much of an issue in the future. cc #4185	2022-06-23 14:36:21 -05:00
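The call/`post_return` protocol described above behaves like a two-state machine guarded by the `may_enter` flag. The sketch below models only that flag handling under stated assumptions: `InstanceFlags`, `CallError`, and the `needs_post_return` field are hypothetical, and the "wasm call" is a placeholder addition, not Wasmtime's `TypedFunc` implementation.

```rust
/// Hypothetical errors for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    /// The instance was re-entered before `post_return` was called.
    CannotEnter,
    /// `post_return` was called without a preceding call.
    NoPendingReturn,
}

/// Models the per-instance flags that gate entry into a component.
pub struct InstanceFlags {
    may_enter: bool,
    needs_post_return: bool,
}

impl InstanceFlags {
    pub fn new() -> Self {
        Self { may_enter: true, needs_post_return: false }
    }

    /// Stand-in for a typed call: entry is refused while a previous
    /// result is still waiting for `post_return`.
    pub fn call(&mut self, arg: u32) -> Result<u32, CallError> {
        if !self.may_enter {
            return Err(CallError::CannotEnter);
        }
        self.may_enter = false;
        self.needs_post_return = true;
        // Placeholder for invoking wasm and lifting the result.
        Ok(arg.wrapping_add(1))
    }

    /// Re-enables entry. Per the commit text this must be called whether or
    /// not the `post-return` canonical option was specified.
    pub fn post_return(&mut self) -> Result<(), CallError> {
        if !self.needs_post_return {
            return Err(CallError::NoPendingReturn);
        }
        self.needs_post_return = false;
        self.may_enter = true;
        Ok(())
    }
}
```

A second `call` without an intervening `post_return` fails in this model, which mirrors the restriction that nothing else can be done on the instance until `post_return` runs.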
Alex Crichton	651f40855f	Add support for nested components (#4285 ) * Add support for nested components This commit is an implementation of a number of features of the component model including: * Defining nested components * Outer aliases to components and modules * Instantiating nested components The implementation here is intended to be a foundational pillar of Wasmtime's component model support since recursion and nested components are the bread-and-butter of the component model. At a high level the intention for the component model implementation in Wasmtime has long been that the recursive nature of components is "erased" at compile time to something that's more optimized and efficient to process. This commit ended up exemplifying this quite well where the vast majority of the internal changes here are in the "compilation" phase of a component rather than the runtime instantiation phase. The support in the `wasmtime` crate, the runtime instantiation support, only had minor updates here while the internals of translation have seen heavy updates. The `translate` module was greatly refactored here in this commit. Previously it would, as a component is parsed, create a final `Component` to hand off to trampoline compilation and get persisted at runtime. Instead now it's a thin layer over `wasmparser` which simply records a list of `LocalInitializer` entries for how to instantiate the component and its index spaces are built. This internal representation of the instantiation of a component is pretty close to the binary format intentionally. Instead of performing dataflow legwork the `translate` phase of a component is now responsible for two primary tasks: 1. All components and modules are discovered within a component. They're assigned `Static{Component,Module}Index` depending on where they're found and a `{Module,}Translation` is prepared for each one. This "flattens" the recursive structure of the binary into an indexed list processable later. 2. 
The lexical scope of components is managed here to implement outer module and component aliases. This is a significant design implementation because when closing over an outer component or module that item may actually be imported or something like the result of a previous instantiation. This means that the capture of modules and components is both a lexical concern as well as a runtime concern. The handling of the "runtime" bits are handled in the next phase of compilation. The next and currently final phase of compilation is a new pass where much of the historical code in `translate.rs` has been moved to (but heavily refactored). The goal of compilation is to produce one "flat" list of initializers for a component (as happens prior to this PR) and to achieve this an "inliner" phase runs which runs through the instantiation process at compile time to produce a list of initializers. This `inline` module is the main addition as part of this PR and is now the workhorse for dataflow analysis and tracking what's actually referring to what. During the `inline` phase the local initializers recorded in the `translate` phase are processed, in sequence, to instantiate a component. Definitions of items are tracked to correspond to their root definition which allows seeing across instantiation argument boundaries and such. Handling "upvars" for component outer aliases is handled in the `inline` phase as well by creating state for a component whenever a component is defined as was recorded during the `translate` phase. Finally this phase is chiefly responsible for doing all string-based name resolution at compile time that it can. This means that at runtime no string maps will need to be consulted for item exports and such. The final result of inlining is a list of "global initializers" which is a flat list processed during instantiation time. These are almost identical to the initializers that were processed prior to this PR. 
There are certainly still more gaps of the component model to implement but this should be a major leg up in terms of functionality that Wasmtime implements. This commit, however, leaves behind a "hole" which is not intended to be filled in at this time, namely importing and exporting components at the "root" level from and to the host. This is tracked and explained in more detail as part of #4283. cc #4185 as this completes a number of items there * Tweak code to work on stable without warning * Review comments	2022-06-21 13:48:56 -05:00
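The first task of the `translate` phase, flattening the recursive component structure into indexed lists, can be sketched as a tree walk. Everything here is illustrative: `ComponentDef` and `Flattened` are hypothetical types, names stand in for real translations, and the pre-order numbering is an assumption rather than the order Wasmtime actually uses.

```rust
/// Hypothetical parsed component: its own core modules plus nested components.
pub struct ComponentDef {
    pub name: &'static str,
    pub modules: Vec<&'static str>,
    pub children: Vec<ComponentDef>,
}

/// Flat lists where an item's position plays the role of a static index.
#[derive(Default, Debug)]
pub struct Flattened {
    pub components: Vec<&'static str>,
    pub modules: Vec<&'static str>,
}

/// Walks the nested structure once and records every component and module.
pub fn flatten(root: &ComponentDef) -> Flattened {
    let mut out = Flattened::default();
    visit(root, &mut out);
    out
}

fn visit(component: &ComponentDef, out: &mut Flattened) {
    // Assumption: indices are handed out in pre-order as items are discovered.
    out.components.push(component.name);
    out.modules.extend(component.modules.iter().copied());
    for child in &component.children {
        visit(child, out);
    }
}
```

Later phases can then refer to any nested component or module by its position in these lists instead of re-walking the tree.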
Pure White	258dc9de42	fix(wasmtime):`Config` methods should be idempotent (#4252) This commit refactored `Config` to use a separate `CompilerConfig` field instead of operating on `CompilerBuilder` directly to make all its methods idempotent. Fixes #4189	2022-06-13 08:54:31 -05:00
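One way to read the idempotency fix: setters only record the desired value in a plain data structure, and the builder is produced from that record later. The sketch below illustrates the idea with hypothetical `Config` and `CompilerConfig` types and a string-keyed settings map; it does not reproduce Wasmtime's actual fields or methods.

```rust
use std::collections::BTreeMap;

/// Hypothetical record of compiler settings; the last write per key wins.
#[derive(Default, Clone, Debug, PartialEq, Eq)]
pub struct CompilerConfig {
    settings: BTreeMap<String, String>,
}

/// Hypothetical configuration that defers building the compiler.
#[derive(Default)]
pub struct Config {
    compiler: CompilerConfig,
}

impl Config {
    /// Records a setting. Calling this twice with the same arguments leaves
    /// the configuration unchanged, which is what makes it idempotent.
    pub fn set(&mut self, key: &str, value: &str) -> &mut Self {
        self.compiler.settings.insert(key.to_string(), value.to_string());
        self
    }

    /// Stand-in for constructing the real builder from the recorded settings.
    pub fn build(&self) -> Vec<String> {
        self.compiler
            .settings
            .iter()
            .map(|(k, v)| format!("{}={}", k, v))
            .collect()
    }
}
```

Because nothing is applied to a builder until `build`, repeating or overriding a setter cannot leave partially applied state behind.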
Andrew Brown	0dcda643ea	runtime: vmoffsets must be checked in reverse order (#4253) When adding shared memory, memories owned by the module were added to an `owned_memories` array placed immediately after the `defined_memories` array. When checking the size of each array with `region_sizes`, the sizes of `defined_memories` and `owned_memories` were checked in this order. But `region_sizes` is iterating through the fields in the reverse order. This change reverses the field order to fix the associated fuzz bug.	2022-06-09 19:53:11 -05:00
Alex Crichton	7d7ddceb17	Update wasm-tools crates (#4246) This commit updates the wasm-tools family of crates, notably pulling in the refactorings and updates from bytecodealliance/wasm-tools#621 for the latest iteration of the component model. This commit additionally updates all support for the component model for these changes, notably: * Many bits and pieces of type information were refactored. Many `FooTypeIndex` namings are now `TypeFooIndex`. Additionally there is now `TypeIndex` as well as `ComponentTypeIndex` for the two type index spaces in a component. * A number of new sections are now processed to handle the core and component variants. * Internal maps were split such as the `funcs` map into `component_funcs` and `funcs` (same for `instances`). * Canonical options are now processed individually instead of one bulk `into` definition. Overall this was not a major update to the internals of handling the component model in Wasmtime. Instead this was mostly a surface-level refactoring to make sure that everything lines up with the new binary format for components. * All text syntax used in tests was updated to the new syntax.	2022-06-09 11:16:07 -05:00
Andrew Brown	2b52f47b83	Add shared memories (#4187 ) * Add shared memories This change adds the ability to use shared memories in Wasmtime when the [threads proposal] is enabled. Shared memories are annotated as `shared` in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are protected from concurrent access during `memory.size` and `memory.grow`. [threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md In order to implement this in Wasmtime, there are two main cases to cover: - a program may simply create a shared memory and possibly export it; this means that Wasmtime itself must be able to create shared memories - a user may create a shared memory externally and pass it in as an import during instantiation; this is the case when the program contains code like `(import "env" "memory" (memory 1 1 shared))`--this case is handled by a new Wasmtime API type--`SharedMemory` Because of the first case, this change allows any of the current memory-creation mechanisms to work as-is. Wasmtime can still create either static or dynamic memories in either on-demand or pooling modes, and any of these memories can be considered shared. When shared, the `Memory` runtime container will lock appropriately during `memory.size` and `memory.grow` operations; since all memories use this container, it is an ideal place for implementing the locking once and once only. The second case is covered by the new `SharedMemory` structure. It uses the same `Mmap` allocation under the hood as non-shared memories, but allows the user to perform the allocation externally to Wasmtime and share the memory across threads (via an `Arc`). The pointer address to the actual memory is carefully wired through and owned by the `SharedMemory` structure itself. 
This means that there are differing views of where to access the pointer (i.e., `VMMemoryDefinition`): for owned memories (the default), the `VMMemoryDefinition` is stored directly by the `VMContext`; in the `SharedMemory` case, however, this `VMContext` must point to this separate structure. To ensure that the `VMContext` can always point to the correct `VMMemoryDefinition`, this change alters the `VMContext` structure. Since a `SharedMemory` owns its own `VMMemoryDefinition`, the `defined_memories` table in the `VMContext` becomes a sequence of pointers--in the shared memory case, they point to the `VMMemoryDefinition` owned by the `SharedMemory` and in the owned memory case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a new table, `owned_memories`. This change adds an additional indirection (through the `*mut VMMemoryDefinition` pointer) that could add overhead. Using an imported memory as a proxy, we measured a 1-3% overhead of this approach on the `pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will special-case the owned memory access (i.e., load a pointer directly to the `owned_memories` entry) for `memory.size` so that only shared memories (and imported memories, as before) incur the indirection cost. 
review: remove thread feature check * review: swap wasmtime-types dependency for existing wasmtime-environ use * review: remove unused VMMemoryUnion * review: reword cross-engine error message * review: improve tests * review: refactor to separate prevent Memory <-> SharedMemory conversion * review: into_shared_memory -> as_shared_memory * review: remove commented out code * review: limit shared min/max to 32 bits * review: skip imported memories * review: imported memories are not owned * review: remove TODO * review: document unsafe send + sync * review: add limiter assertion * review: remove TODO * review: improve tests * review: fix doc test * fix: fixes based on discussion with Alex This changes several key parts: - adds memory indexes to imports and exports - makes `VMMemoryDefinition::current_length` an atomic usize * review: add `Extern::SharedMemory` * review: remove TODO * review: atomically load from VMMemoryDescription in JIT-generated code * review: add test probing the last available memory slot across threads * fix: move assertion to new location due to rebase * fix: doc link * fix: add TODOs to c-api * fix: broken doc link * fix: modify pooling allocator messages in tests * review: make owned_memory_index panic instead of returning an option * review: clarify calculation of num_owned_memories * review: move 'use' to top of file * review: change 'const [u8]' to 'mut [u8]' * review: remove TODO * review: avoid hard-coding memory index * review: remove 'preallocation' parameter from 'Memory::_new' * fix: component model memory length * review: check that shared memory plans are static * review: ignore growth limits for shared memory * review: improve atomic store comment * review: add FIXME for memory growth failure * review: add comment about absence of bounds-checked 'memory.size' * review: make 'current_length()' doc comment more precise * review: more comments related to memory.size non-determinism * review: make 'vmmemory' unreachable for shared 
memory * review: move code around * review: thread plan through to 'wrap()' * review: disallow shared memory allocation with the pooling allocator	2022-06-08 12:13:40 -05:00
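The locking and atomic-length behavior described in this commit can be illustrated without Wasmtime itself. The sketch below assumes a hypothetical `SharedMemorySketch` that tracks only the length: growth is serialized by a lock and the current length lives in an atomic `usize`, as the message states for `VMMemoryDefinition::current_length`. The real backing allocation and the exact lock type are not modeled.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Size of a WebAssembly page in bytes.
pub const WASM_PAGE_SIZE: usize = 65536;

/// Hypothetical length-only model of a memory shared across threads.
pub struct SharedMemorySketch {
    /// Current length in bytes, readable without taking the lock.
    current_length: AtomicUsize,
    maximum_pages: usize,
    /// Serializes concurrent `grow` calls (assumption: a plain mutex).
    grow_lock: Mutex<()>,
}

impl SharedMemorySketch {
    pub fn new(minimum_pages: usize, maximum_pages: usize) -> Arc<Self> {
        Arc::new(Self {
            current_length: AtomicUsize::new(minimum_pages * WASM_PAGE_SIZE),
            maximum_pages,
            grow_lock: Mutex::new(()),
        })
    }

    /// Analogue of `memory.size`: the current size in pages.
    pub fn size(&self) -> usize {
        self.current_length.load(Ordering::SeqCst) / WASM_PAGE_SIZE
    }

    /// Analogue of `memory.grow`: returns the previous size in pages, or
    /// `None` if the request would exceed the declared maximum.
    pub fn grow(&self, delta_pages: usize) -> Option<usize> {
        let _guard = self.grow_lock.lock().unwrap();
        let old_pages = self.size();
        let new_pages = old_pages.checked_add(delta_pages)?;
        if new_pages > self.maximum_pages {
            return None;
        }
        self.current_length
            .store(new_pages * WASM_PAGE_SIZE, Ordering::SeqCst);
        Some(old_pages)
    }
}
```

Sharing the value through an `Arc` mirrors how the commit describes `SharedMemory` being passed across threads; every thread observes the same length because there is a single atomic cell behind the pointer.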
wasmtime-publish	55946704cb	Bump Wasmtime to 0.39.0 (#4225 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-06-06 09:12:47 -05:00
Alex Crichton	2af358dd9c	Add a `VMComponentContext` type and create it on instantiation (#4215 ) * Add a `VMComponentContext` type and create it on instantiation This commit fills out the `wasmtime-runtime` crate's support for `VMComponentContext` and creates it as part of the instantiation process. This moves a few maps that were temporarily allocated in an `InstanceData` into the `VMComponentContext` and additionally reads the canonical options data from there instead. This type still won't be used in its "full glory" until the lowering of host functions is completely implemented, however, which will be coming in a future commit. * Remove `DerefMut` implementation * Rebase conflicts	2022-06-03 13:34:50 -05:00
Alex Crichton	3ed6fae7b3	Add trampoline compilation support for lowered imports (#4206 ) * Add trampoline compilation support for lowered imports This commit adds support to the component model implementation for compiling trampolines suitable for calling host imports. Currently this is purely just the compilation side of things, modifying the wasmtime-cranelift crate and additionally filling out a new `VMComponentOffsets` type (similar to `VMOffsets`). The actual creation of a `VMComponentContext` is still not performed and will be a subsequent PR. Internally though some tests are actually possible with this where we at least assert that compilation of a component and creation of everything in-memory doesn't panic or trip any assertions, so some tests are added here for that as well. * Fix some test errors	2022-06-03 10:01:42 -05:00
Alex Crichton	b49c5c878e	Implement module imports into components (#4208) * Implement module imports into components As a step towards implementing function imports into a component this commit implements importing modules into a component. This fills out missing pieces of functionality such as exporting modules as well. The previous translation code had initial support for translating imported modules but some of the AST type information was restructured with feedback from this implementation, namely splitting the `InstantiateModule` initializer into separate upvar/import variants to clarify that the item orderings for imports are resolved differently at runtime. Much of this commit is also adding infrastructure for any imports at all into a component. For example a `Linker` type (analogous to `wasmtime::Linker`) was added here as well. For now this type is quite limited due to the inability to define host functions (it can only work with instances and instances-of-modules) but it's enough to start writing `.wast` tests which exercise lots of module-related functionality. Fix a warning	2022-06-03 09:33:18 -05:00
Alex Crichton	9f5f978baa	Fix double-counting imports in `VMOffsets` calculations (#4209 ) * Fix double-counting imports in `VMOffsets` calculations This fixes an oversight in the initial creation of `VMOffsets` for a module to avoid double-counting imported globals, tables, and memories for calculating the size of the `VMContext`. Prior to this PR imported items are accidentally also counted as defined items for sizing calculations meaning that when a memory is imported but not defined, for example, the `VMContext` will have a space for an inline `VMMemoryDefinition` when it doesn't need to. Auditing where all this relates to it appears that the only issue from this mistake is that `VMContext` is a bit larger than it would otherwise need to be. Extra slots are uninitialized memory but nothing in Wasmtime ever actually accesses the memory either, so it should be harmless to have extra space here. Nevertheless it seems better to shrink the size as much as possible to avoid wasting space where we can. * Fix tests	2022-06-02 13:39:38 -05:00
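The sizing mistake described above comes down to which count is multiplied by the per-definition size. A minimal sketch, assuming a hypothetical `MemoryCounts` type and an illustrative 16-byte definition size (neither is taken from Wasmtime):

```rust
/// Hypothetical size in bytes of one inline memory definition.
pub const MEMORY_DEFINITION_SIZE: u32 = 16;

/// Counts for a module's memory index space.
pub struct MemoryCounts {
    /// Memories that come from imports.
    pub imported: u32,
    /// All memories in the index space (imported plus defined).
    pub total: u32,
}

impl MemoryCounts {
    /// Only memories defined by the module itself need inline space.
    pub fn defined(&self) -> u32 {
        self.total - self.imported
    }
}

/// Sizing that reserves inline space for defined memories only.
pub fn inline_memory_bytes(counts: &MemoryCounts) -> u32 {
    counts.defined() * MEMORY_DEFINITION_SIZE
}

/// The over-reserving variant: imported memories are counted as defined too.
pub fn inline_memory_bytes_double_counted(counts: &MemoryCounts) -> u32 {
    counts.total * MEMORY_DEFINITION_SIZE
}
```

For a module that imports one memory and defines none, the first function reserves nothing while the second reserves one unused slot, which matches the "a bit larger than it would otherwise need to be" outcome described in the message.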
Alex Crichton	0cf0230432	Add dataflow processing to component translation for imports (#4205) This commit enhances the processing of components to track all the dataflow for the processing of `canon.lower`'d functions. At the same time this fills out a few other missing details to component processing such as aliasing from some kinds of component instances and similar. The major changes contained within this are the updates to the `info` submodule which has the AST of component type information. This has been significantly refactored to prepare for representing lowered functions and implementing those. The major change is from an `Instantiation` list to an `Initializer` list which abstractly represents a few other initialization actions. This work is split off from my main work to implement component imports of host functions. This is incomplete in the sense that it doesn't actually finish everything necessary to define host functions and import them into components. Instead this is only the changes necessary at the translation layer (so far). Consequently this commit does not have tests and also namely doesn't actually include the `VMComponentContext` initialization and usage. The full body of work is still a bit too messy to PR just yet so I'm hoping that this is a slimmed-down-enough piece to adequately be reviewed.	2022-06-01 16:27:49 -05:00
Alex Crichton	f638b390b6	Refactor some internals of wasmtime-cranelift (#4202) * Split `wasm_to_host_trampoline` into pieces In the upcoming component model support for imports my plan is to reuse some of these pieces but not the entirety of the current `wasm_to_host_trampoline`. In an effort to make that diff smaller this commit splits up the function preemptively into pieces to get reused later. * Delete unused `for_each_libcall` macros Came across this when working in the object support for cranelift. * Refactor some object creation details This commit refactors some of the internals around creating an object file in the wasmtime-cranelift integration. The old `ObjectBuilder` is now named `ModuleTextBuilder` and is only used to create the text section rather than other sections too. This helps maintain the invariant that the unwind information section is placed directly after the text section without having an odd API for doing this. Additionally the unwind information creation is moved externally from the `ModuleTextBuilder` to a standalone structure. This separate structure is currently in use in the component model work I'm doing although I may change that to using the `ModuleTextBuilder` instead. In any case it seemed nice to encapsulate all of the unwinding information into one standalone structure. Finally, the insertion of native debug information has been refactored to happen in a new `append_dwarf` method to keep all the dwarf-related stuff together in one place as much as possible. * Fix a doctest * Fix a typo	2022-06-01 15:39:53 -05:00
Alex Crichton	2a4851ad2b	Change some `VMContext` pointers to `()` pointers (#4190) * Change some `VMContext` pointers to `()` pointers This commit is motivated by my work on the component model implementation for imported functions. Currently all context pointers in wasm are `*mut VMContext` but with the component model my plan is to make some pointers instead along the lines of `*mut VMComponentContext`. In doing this though one worry I have is breaking what has otherwise been a core invariant of Wasmtime for quite some time, subtly introducing bugs by accident. To help assuage my worry I've opted here to erase knowledge of `*mut VMContext` where possible. Instead where applicable a context pointer is simply known as `*mut ()` and the embedder doesn't actually know anything about this context beyond the value of the pointer. This will help prevent Wasmtime from accidentally ever trying to interpret this context pointer as an actual `VMContext` when it might instead be a `VMComponentContext`. Overall this was a pretty smooth transition. The main change here is that the `VMTrampoline` (now sporting more docs) has its first argument changed to `*mut ()`. The second argument, the caller context, is still configured as `*mut VMContext` though because all functions are always called from wasm still. Eventually for component-to-component calls I think we'll probably "fake" the second argument as the same as the first argument, losing track of the original caller, as an intentional way of isolating components from each other. Along the way there are a few host locations which do actually assume that the first argument is indeed a `VMContext`. These are valid assumptions that are upheld from a correct implementation, but I opted to add a "magic" field to `VMContext` to assert this in debug mode. This new "magic" field is initialized during normal vmcontext initialization and it's checked whenever a `VMContext` is reinterpreted as an `Instance` (but only in debug mode). 
My hope here is to catch any future accidental mistakes, if ever. * Use a VMOpaqueContext wrapper * Fix typos	2022-06-01 11:00:43 -05:00
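The "magic" field check can be sketched as follows. This is an illustration under assumptions, not Wasmtime's definitions: the struct layouts, the `value` payload, the helper names `magic_of` and `as_vmcontext`, and the concrete magic constants are all chosen for the example.

```rust
/// Assumed magic values; the exact constants are illustrative.
pub const VMCONTEXT_MAGIC: u32 = u32::from_le_bytes(*b"core");
pub const VMCOMPONENT_MAGIC: u32 = u32::from_le_bytes(*b"comp");

/// Hypothetical core context: the magic is the first field.
#[repr(C)]
pub struct VMContext {
    magic: u32,
    pub value: u64,
}

impl VMContext {
    pub fn new(value: u64) -> Self {
        Self { magic: VMCONTEXT_MAGIC, value }
    }
}

/// Hypothetical component context with a different magic in the same slot.
#[repr(C)]
pub struct VMComponentContext {
    magic: u32,
}

impl VMComponentContext {
    pub fn new() -> Self {
        Self { magic: VMCOMPONENT_MAGIC }
    }
}

/// Reads the leading magic word of an opaque context pointer.
///
/// Safety: `opaque` must point to a live value whose first field is a `u32`.
pub unsafe fn magic_of(opaque: *mut ()) -> u32 {
    *opaque.cast::<u32>()
}

/// Reinterprets an opaque pointer as a `VMContext`, checking the magic
/// in debug builds only.
///
/// Safety: `opaque` must really point to a live, uniquely borrowed `VMContext`.
pub unsafe fn as_vmcontext<'a>(opaque: *mut ()) -> &'a mut VMContext {
    debug_assert_eq!(magic_of(opaque), VMCONTEXT_MAGIC);
    &mut *opaque.cast::<VMContext>()
}
```

In a release build the assertion compiles away, so the check costs nothing there; in debug builds a component context passed where a core context is expected fails fast instead of being silently misread.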
Alex Crichton	140b83597b	components: Implement the ability to call component exports (#4039 ) * components: Implement the ability to call component exports This commit is an implementation of the typed method of calling component exports. This is intended to represent the most efficient way of calling a component in Wasmtime, similar to what `TypedFunc` represents today for core wasm. Internally this contains all the traits and implementations necessary to invoke component exports with any type signature (e.g. arbitrary parameters and/or results). The expectation is that for results we'll reuse all of this infrastructure except in reverse (arguments and results will be swapped when defining imports). Some features of this implementation are: * Arbitrary type hierarchies are supported * The Rust-standard `Option`, `Result`, `String`, `Vec<T>`, and tuple types all map down to the corresponding type in the component model. * Basic utf-16 string support is implemented as proof-of-concept to show what handling might look like. This will need further testing and benchmarking. * Arguments can be behind "smart pointers", so for example `&Rc<Arc<[u8]>>` corresponds to `list<u8>` in interface types. * Bulk copies from linear memory never happen unless explicitly instructed to do so. The goal of this commit is to create the ability to actually invoke wasm components. This represents what is expected to be the performance threshold for these calls where it ideally should be optimal how WebAssembly is invoked. One major missing piece of this is a `#[derive]` of some sort to generate Rust types for arbitrary `.wit` types such as custom records, variants, flags, unions, etc. The current trait impls for tuples and `Result<T, E>` are expected to have fleshed out most of what such a derive would look like. There are some downsides and missing pieces to this commit and method of calling components, however, such as: Passing `&[u8]` to WebAssembly is currently not optimal. 
Ideally this compiles down to a `memcpy`-equivalent somewhere but that currently doesn't happen due to all the bounds checks of copying data into memory. I have been unsuccessful so far at getting these bounds checks to be removed. * There is no finalization at this time (the "post return" functionality in the canonical ABI). Implementing this should be relatively straightforward but at this time requires `wasmparser` changes to catch up with the current canonical ABI. * There is no guarantee that results of a wasm function will be validated. As results are consumed they are validated but this means that if function returns an invalid string which the host doesn't look at then no trap will be generated. This is probably not the intended semantics of hosts in the component model. * At this time there's no support for memory64 memories, just a bunch of `FIXME`s to get around to. It's expected that this won't be too onerous, however. Some extra care will need to ensure that the various methods related to size/alignment all optimize to the same thing they do today (e.g. constants). * The return value of a typed component function is either `T` or `Value<T>`, and it depends on the ABI details of `T` and whether it takes up more than one return value slot or not. This is an ABI-implementation detail which is being forced through to the API layer which is pretty unfortunate. For example if you say the return value of a function is `(u8, u32)` then it's a runtime type-checking error. I don't know of a great way to solve this at this time. Overall I'm feeling optimistic about this trajectory of implementing value lifting/lowering in Wasmtime. While there are a number of downsides none seem completely insurmountable. There's naturally still a good deal of work with the component model but this should be a significant step up towards implementing and testing the component model. 
* Review comments * Write tests for calling functions This commit adds a new test file for actually executing functions and testing their results. This is not written as a `.wast` test yet since it's not 100% clear if that's the best way to do that for now (given that dynamic signatures aren't supported yet). The tests themselves could all largely be translated to `.wast` testing in the future, though, if supported. Along the way a number of minor issues were fixed with lowerings with the bugs exposed here. * Fix an endian mistake * Fix a typo and the `memory.fill` instruction	2022-05-24 17:02:31 -05:00
Alex Crichton	fcf6208750	Initial skeleton of some component model processing (#4005 ) * Initial skeleton of some component model processing This commit is the first of what will likely be many to implement the component model proposal in Wasmtime. This will be structured as a series of incremental commits, most of which haven't been written yet. My hope is to make this incremental and over time to make this easier to review and easier to test each step in isolation. Here much of the skeleton of how components are going to work in Wasmtime is sketched out. This is not a complete implementation of the component model so it's not all that useful yet, but some things you can do are: * Process the type section into a representation amenable for working with in Wasmtime. * Process the module section and register core wasm modules. * Process the instance section for core wasm modules. * Process core wasm module imports. * Process core wasm instance aliasing. * Ability to compile a component with core wasm embedded. * Ability to instantiate a component with no imports. * Ability to get functions from this component. This is already starting to diverge from the previous module linking representation where a `Component` will try to avoid unnecessary metadata about the component and instead internally only have the bare minimum necessary to instantiate the module. My hope is we can avoid constructing most of the index spaces during instantiation only for it to all get thrown away. Additionally I'm predicting that we'll need to see through processing where possible to know how to generate adapters and where they are fused. At this time you can't actually call a component's functions, and that's the next PR that I would like to make. * Add tests for the component model support This commit uses the recently updated wasm-tools crates to add tests for the component model added in the previous commit. This involved updating the `wasmtime-wast` crate for component-model changes. 
Currently the component support there is quite primitive, but enough to at least instantiate components and verify the internals of Wasmtime are all working correctly. Additionally some simple tests for the embedding API have also been added.	2022-05-20 15:33:18 -05:00
Alex Crichton	89ccc56e46	Update the wasm-tools family of crates (#4165 ) * Update the wasm-tools family of crates This commit updates these crates as used by Wasmtime for the recently published versions to pull in changes necessary to support the component model. I've split this out from #4005 to make it clear what's impacted here and #4005 can simply rebase on top of this to pick up the necessary changes. * More test fixes	2022-05-19 14:13:04 -05:00
Alex Crichton	ccf834b473	Fix an issue where massive memory images are created (#4112 ) This commit fixes an issue introduced in #4046 where the checks for ensuring that the memory initialization image for a module was constrained in its size failed to trigger and a very small module could produce an arbitrarily large memory image. The bug in question was that if a module only had empty data segments at arbitrarily small and large addresses then the loop which checks whether or not the image is allowed was skipped entirely since it was seen that the memory had no data size. The fix here is to skip segments that are empty to ensure that if the validation loop is skipped then no data segments will be processed to create the image (and the module won't end up having an image in the end).	2022-05-09 11:04:56 -05:00
wasmtime-publish	9a6854456d	Bump Wasmtime to 0.38.0 (#4103 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-05-05 13:43:02 -05:00
Alex Crichton	7fdc616368	Remove the `Paged` memory initialization variant (#4046 ) * Remove the `Paged` memory initialization variant This commit simplifies the `MemoryInitialization` enum by removing the `Paged` variant. The `Paged` variant was originally added for uffd, but that support has now been removed in #4040. This is no longer necessary but is still used as an intermediate step of becoming a `Static` variant of initialized memory (which copy-on-write uses). As a result this commit largely modifies the static initialization of memory steps and folds the two methods together. * Apply suggestions from code review Co-authored-by: Peter Huene <peter@huene.dev> Co-authored-by: Peter Huene <peter@huene.dev>	2022-05-05 09:44:48 -05:00
Alex Crichton	871a9d93f2	Update some dependencies in `Cargo.lock` (#4081 ) * Run a `cargo update` over our dependencies This'll notably fix a `cargo audit` error where we have a pinned version of the `regex` crate which has a CVE assigned to it. * Update to `object` and `hashbrown` crates Prune some duplicate versions showing up from the previous `cargo update`	2022-04-28 11:12:58 -05:00
Alex Crichton	d147802d51	Update wasm-tools crates (#3997 ) * Update wasm-tools crates This commit updates the wasm-tools family of crates as used in Wasmtime. Notably this brings in the update which removes module linking support as well as a number of internal refactorings around names and such within wasmparser itself. This updates all of the wasm translation support which binds to wasmparser as appropriate. Other crates all had API-compatible changes for at least what Wasmtime used so no further changes were necessary beyond updating version requirements. * Update a test expectation	2022-04-05 14:32:33 -05:00
wasmtime-publish	78a595ac88	Bump Wasmtime to 0.37.0 (#3994 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-04-05 09:24:28 -05:00
Alex Crichton	7b5176baea	Upgrade all crates to the Rust 2021 edition (#3991 ) * Upgrade all crates to the Rust 2021 edition I've personally started using the new format strings for things like `panic!("some message {foo}")` or similar and have been upgrading crates on a case-by-case basis, but I think it probably makes more sense to go ahead and blanket upgrade everything so 2021 features are always available. * Fix compile of the C API * Fix a warning * Fix another warning	2022-04-04 12:27:12 -05:00
Alex Crichton	c89dc55108	Add a two-week delay to Wasmtime's release process (#3955 ) * Bump to 0.36.0 * Add a two-week delay to Wasmtime's release process This commit is a proposal to update Wasmtime's release process with a two-week delay from branching a release until it's actually officially released. We've had two issues lately that came up which led to this proposal: * In #3915 it was realized that changes just before the 0.35.0 release weren't enough for an embedding use case, but the PR didn't meet the expectations for a full patch release. * At Fastly we were about to start rolling out a new version of Wasmtime when over the weekend the fuzz bug #3951 was found. This led to the desire internally to have a "must have been fuzzed for this long" period of time for Wasmtime changes which we felt were better reflected in the release process itself rather than something about Fastly's own integration with Wasmtime. This commit updates the automation for releases to unconditionally create a `release-X.Y.Z` branch on the 5th of every month. The actual release from this branch is then performed on the 20th of every month, roughly two weeks later. This should provide a period of time to ensure that all changes in a release are fuzzed for at least two weeks and avoid any further surprises. This should also help with any last-minute changes made just before a release if they need tweaking since backporting to a not-yet-released branch is much easier. Overall there are some new properties about Wasmtime with this proposal as well: * The `main` branch will always have a section in `RELEASES.md` which is listed as "Unreleased" for us to fill out. * The `main` branch will always be a version ahead of the latest release. For example it will be bumped pre-emptively as part of the release process on the 5th where if `release-2.0.0` was created then the `main` branch will have 3.0.0 Wasmtime. * Dates for major versions are automatically updated in the `RELEASES.md` notes. 
The associated documentation for our release process is updated and the various scripts should all be updated now as well with this commit. * Add notes on a security patch * Clarify security fixes shouldn't be previewed early on CI	2022-04-01 13:11:10 -05:00
Alex Crichton	453feb6f82	Remove some dead code (#3970 ) This commit removes methods that are never used between crates or trait impls like `Clone` which may have been used one day but are no longer used.	2022-03-30 13:51:34 -05:00
Alex Crichton	d1d10dc8da	Refactor the `TypeTables` type (#3971 ) * Remove duplicate `TypeTables` type This was once needed historically but it is no longer needed. * Make the internals of `TypeTables` private Instead of reaching internally for the `wasm_signatures` map an `Index` implementation now exists to indirect accesses through the type of the index being accessed. For the component model this table of types will grow a number of other tables and this'll assist in consuming sites not having to worry so much about which map they're reaching into.	2022-03-30 13:51:25 -05:00
Alex Crichton	76b82910c9	Remove the module linking implementation in Wasmtime (#3958 ) * Remove the module linking implementation in Wasmtime This commit removes the experimental implementation of the module linking WebAssembly proposal from Wasmtime. The module linking is no longer intended for core WebAssembly but is instead incorporated into the component model now at this point. This means that very large parts of Wasmtime's implementation of module linking are no longer applicable and would change greatly with an implementation of the component model. The main purpose of this is to remove Wasmtime's reliance on the support for module-linking in `wasmparser` and tooling crates. With this reliance removed we can move over to the `component-model` branch of `wasmparser` and use the updated support for the component model. Additionally given the trajectory of the component model proposal the embedding API of Wasmtime will not look like what it looks like today for WebAssembly. For example the core wasm `Instance` will not change and instead a `Component` is likely to be added instead. Some more rationale for this is in #3941, but the basic idea is that I feel that it's not going to be viable to develop support for the component model on a non-`main` branch of Wasmtime. Additionally I don't think it's viable, for the same reasons as `wasm-tools`, to support the old module linking proposal and the new component model at the same time. This commit takes a moment to not only delete the existing module linking implementation but some abstractions are also simplified. For example module serialization is a bit simpler now that there's only one module. Additionally instantiation is much simpler since the only initializer we have to deal with are imports and nothing else. Closes #3941 * Fix doc link * Update comments	2022-03-23 14:57:34 -05:00
Alex Crichton	c22033bf93	Delete historical interruptable support in Wasmtime (#3925 ) * Delete historical interruptable support in Wasmtime This commit removes the `Config::interruptable` configuration along with the `InterruptHandle` type from the `wasmtime` crate. The original support for adding interruption to WebAssembly was added pretty early on in the history of Wasmtime when there was no other method to prevent an infinite loop from the host. Nowadays, however, there are alternative methods for interruption such as fuel or epoch-based interruption. One of the major downsides of `Config::interruptable` is that even when it's not enabled it forces an atomic swap to happen when entering WebAssembly code. This technically could be a non-atomic swap if the configuration option isn't enabled but that produces even more branch-y code on entry into WebAssembly which is already something we try to optimize. Calling into WebAssembly is on the order of dozens of nanoseconds at this time and an atomic swap, even uncontended, can add up to 5ns on some platforms. The main goal of this PR is to remove this atomic swap on entry into WebAssembly. This is done by removing the `Config::interruptable` field entirely, moving all existing consumers to epochs instead which are suitable for the same purposes. This means that the stack overflow check is no longer entangled with the interruption check and perhaps one day we could continue to optimize that further as well. Some consequences of this change are: * Epochs are now the only method of remote-thread interruption. * There are no more Wasmtime traps that produce the `Interrupted` trap code, although we may wish to move future traps to this so I left it in place. * The C API support for interrupt handles was also removed and bindings for epoch methods were added. 
* Function-entry checks for interruption are a tiny bit less efficient since one check is performed for the stack limit and a second is performed for the epoch as opposed to the `Config::interruptable` style of bundling the stack limit and the interrupt check in one. It's expected though that this is likely to not really be measurable. * The old `VMInterrupts` structure is renamed to `VMRuntimeLimits`.	2022-03-14 15:25:11 -05:00
Alex Crichton	2f4419cc6c	Implement runtime checks for compilation settings (#3899 ) * Implement runtime checks for compilation settings This commit fills out a few FIXME annotations by implementing run-time checks that when a `Module` is created it has compatible codegen settings for the current host (as `Module` is proof of "this code can run"). This is done by implementing new `Engine`-level methods which validate compiler settings. These settings are validated on `Module::new` as well as when loading serialized modules. Settings are split into two categories, one for "shared" top-level settings and one for ISA-specific settings. Both categories now have allow-lists hardcoded into `Engine` which indicate the acceptable values for each setting (if applicable). ISA-specific settings are checked with the Rust standard library's `std::is_x86_feature_detected!` macro. Other macros for other platforms are not stable at this time but can be added here if necessary. Closes #3897 * Fix fall-through logic to actually be correct * Use a `OnceCell`, not an `AtomicBool` * Fix some broken tests	2022-03-09 09:46:25 -06:00
wasmtime-publish	9137b4a50e	Bump Wasmtime to 0.35.0 (#3885 ) [automatically-tag-and-release-this-commit] Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-03-07 15:18:34 -06:00
Alex Crichton	2a6969d2bd	Shrink the size of the anyfunc table in `VMContext` (#3850 ) * Shrink the size of the anyfunc table in `VMContext` This commit shrinks the size of the `VMCallerCheckedAnyfunc` table allocated into a `VMContext` to be the size of the number of "escaped" functions in a module rather than the number of functions in a module. Escaped functions include exports, table elements, etc, and are typically an order of magnitude smaller than the number of functions in general. This should greatly shrink the `VMContext` for some modules which while we aren't necessarily having any problems with that today shouldn't cause any problems in the future. The original motivation for this was that this came up during the recent lazy-table-initialization work and while it no longer has a direct performance benefit since tables aren't initialized at all on instantiation it should still improve long-running instances theoretically with smaller `VMContext` allocations as well as better locality between anyfuncs. * Fix some tests * Remove redundant hash set * Use a helper for pushing function type information * Use a more descriptive `is_escaping` method * Clarify a comment * Fix condition	2022-02-28 10:11:04 -06:00
Alex Crichton	15bb0c6903	Remove the `ModuleLimits` pooling configuration structure (#3837 ) * Remove the `ModuleLimits` pooling configuration structure This commit is an attempt to improve the usability of the pooling allocator by removing the need to configure a `ModuleLimits` structure. Internally this structure has limits on all forms of wasm constructs but this largely bottoms out in the size of an allocation for an instance in the instance pooling allocator. Maintaining this list of limits can be cumbersome as modules may get tweaked over time and there's otherwise no real reason to limit the number of globals in a module since the main goal is to limit the memory consumption of a `VMContext` which can be done with a memory allocation limit rather than fine-tuned control over each maximum and minimum. The new approach taken in this commit is to remove `ModuleLimits`. Some fields, such as `tables`, `table_elements`, `memories`, and `memory_pages` are moved to `InstanceLimits` since they're still enforced at runtime. A new field `size` is added to `InstanceLimits` which indicates, in bytes, the maximum size of the `VMContext` allocation. If the size of a `VMContext` for a module exceeds this value then instantiation will fail. This involved adding a few more checks to `{Table, Memory}::new_static` to ensure that the minimum size is able to fit in the allocation, since previously modules were validated at compile time of the module that everything fit and that validation no longer happens (it happens at runtime). A consequence of this commit is that Wasmtime will have no built-in way to reject modules at compile time if they'll fail to be instantiated within a particular pooling allocator configuration. Instead a module must attempt instantiation to see if a failure happens. 
* Fix benchmark compiles * Fix some doc links * Fix a panic by ensuring modules have limited tables/memories * Review comments * Add back validation at `Module` time instantiation is possible This allows for getting an early signal at compile time that a module will never be instantiable in an engine with matching settings. * Provide a better error message when sizes are exceeded Improve the error message when an instance size exceeds the maximum by providing a breakdown of where the bytes are all going and why the large size is being requested. * Try to fix test in qemu * Flag new test as 64-bit only Sizes are all specific to 64-bit right now	2022-02-25 09:11:51 -06:00
Nick Fitzgerald	bad9a35418	`wasm-mutate` fuzz targets (#3836 ) * fuzzing: Add a custom mutator based on `wasm-mutate` * fuzz: Add a version of the `compile` fuzz target that uses `wasm-mutate` * Update `wasmparser` dependencies	2022-02-23 12:14:11 -08:00
Chris Fallin	43d31c5bf7	memfd: make "dense image" heuristic limit configurable. (#3831 ) In #3820 we see an issue with the new heuristics that control use of memfd: it's entirely possible for a reasonable Wasm module produced by a snapshotting system to have a relatively sparse heap (less than 50% filled). A system that avoids memfd because of this would have an undesirable performance reduction on such modules. Ultimately we should try to implement a hybrid scheme where we support outlier/leftover initializers, but for now this PR makes the "always allow dense" limit configurable. This way, embedders that want to ensure that memfd is used can do so, if they have other knowledge about the maximum heap size allowed in their system. (Partially addresses #3820 but let's leave it open to track the hybrid idea)	2022-02-22 12:40:43 -06:00
bjorn3	bbd52772de	Make VMOffset calculation more readable (#3793 ) * Fix typo * Move vmoffset field size and field name together The previous code was quite confusing about what applied to which field. The new code also makes it easier to move fields around and insert and delete fields. * Move builtin_functions before all variable sized fields This allows the offset to be calculated at compile time * Add cadd and cmul convenience functions * Remove comment * Change fields! syntax as per review * Add implicit u32::from to fields!	2022-02-22 09:48:53 -06:00
Alex Crichton	b62fe21914	Update memfd image construction to avoid excessively large images (#3819 ) * Update memfd image construction to avoid excessively large images Previously memfd-based image construction had a hard limit of a 1GB memory image but this meant that tiny wasm modules could allocate up to 1GB of memory which became a bit excessive especially in terms of memory usage during fuzzing. To fix this the conversion to a static memory image has been updated to first do a conversion to paged memory initialization, which is sparse, followed by a second conversion to static memory initialization. The sparse construction for the paged step should make it such that the upper/lower bounds of the initialization image are easily computed, and then afterwards this limit can be checked against some heuristics to determine if we're willing to commit to building up a whole static image for that module. The heuristics have been tweaked from "must be less than 1GB" to one of two conditions must be true: * Either the total memory image size is at most twice the size of the original paged data itself. * Otherwise the memory image size must be smaller than a reasonable threshold, currently 1MB. We'll likely need to tweak this over time and it's still possible to cause a lot of extra memory consumption, but for now this should be enough to appease the fuzzers. Closes #3815 * Review comments	2022-02-17 10:37:17 -06:00
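The two-condition heuristic described in b62fe21914 can be sketched roughly as follows. This is only an illustration of the rule as worded in the commit message, not the code from the PR: the function name, the parameter names, and the constant name are all hypothetical, and the 1MB figure is taken from the message's "currently 1MB".

```rust
/// Hypothetical constant: the "reasonable threshold, currently 1MB"
/// mentioned in the commit message.
const MAX_SPARSE_IMAGE_SIZE: u64 = 1 << 20;

/// Decide whether to commit to building a full static memory image.
///
/// `data_size` is the total number of bytes in the sparse (paged)
/// initialization data; `image_size` is the span the static image would
/// cover, from the lowest to the highest initialized address.
/// Both names are assumptions made for this sketch.
fn should_build_static_image(data_size: u64, image_size: u64) -> bool {
    // Condition 1: the image is dense enough, i.e. at most twice the
    // size of the data it is built from.
    if image_size <= data_size.saturating_mul(2) {
        return true;
    }
    // Condition 2: otherwise the image must be small in absolute terms.
    image_size < MAX_SPARSE_IMAGE_SIZE
}
```

Under this sketch a module with 4KB of data spread over a 1GB span is rejected, while the same 4KB spread over 512KB is accepted by the second condition.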
Chris Fallin	1c014d129a	Cranelift: ensure ISA level needed for SIMD is present when SIMD is enabled. (#3816 ) Addresses #3809: when we are asked to create a Cranelift backend with shared flags that indicate support for SIMD, we should check that the ISA level needed for our SIMD lowerings is present.	2022-02-16 17:29:30 -08:00
Alex Crichton	b438617e12	Further minor optimizations to instantiation (#3791 ) * Shrink the size of `FuncData` Before this commit on a 64-bit system the `FuncData` type had a size of 88 bytes and after this commit it has a size of 32 bytes. A `FuncData` is required for all host functions in a store, including those inserted from a `Linker` into a store used during linking. This means that instantiation ends up creating a nontrivial number of these types and pushing them into the store. Looking at some profiles there were some surprisingly expensive movements of `FuncData` from the stack to a vector for moves-by-value generated by Rust. Shrinking this type enables more efficient code to be generated and additionally means less storage is needed in a store's function array. For instantiating the spidermonkey and rustpython modules this improves instantiation by 10% since they each import a fair number of host functions and the speedup here is relative to the number of items imported. * Use `ptr::copy_nonoverlapping` during initialization Previously `ptr::copy` was used for copying imports into place which translates to `memmove`, but `ptr::copy_nonoverlapping` can be used here since it's statically known these areas don't overlap. While this doesn't end up having a performance difference it's something I kept noticing while looking at the disassembly of `initialize_vmcontext` so I figured I'd go ahead and implement. * Indirect shared signature ids in the VMContext This commit is a small improvement for the instantiation time of modules by avoiding copying a list of `VMSharedSignatureIndex` entries into each `VMContext`, instead building one inside of a module and sharing that amongst all instances. This involves less lookups at instantiation time and less movement of data during instantiation. 
The downside is that type-checks on `call_indirect` now involve an additional load, but I'm assuming that these are somewhat pessimized enough as-is that the runtime impact won't be much there. For instantiation performance this is a 5-10% win with rustpython/spidermonkey instantiation. This should also reduce the size of each `VMContext` for an instantiation since signatures are no longer stored inline but shared amongst all instances with one module. Note that one subtle change here is that the array of `VMSharedSignatureIndex` was previously indexed by `TypeIndex`, and now it's indexed by `SignatureIndex` which is a deduplicated form of `TypeIndex`. This is done because we already had a list of those lying around in `Module`, so it was easier to reuse that than to build a separate array and store it somewhere. * Reserve space in `Store<T>` with `InstancePre` This commit updates the instantiation process to reserve space in a `Store<T>` for the functions that an `InstancePre<T>`, as part of instantiation, will insert into it. Using an `InstancePre<T>` to instantiate allows pre-computing the number of host functions that will be inserted into a store, and by pre-reserving space we can avoid costly reallocations during instantiation by ensuring the function vector has enough space to fit everything during the instantiation process. Overall this makes instantiation of rustpython/spidermonkey about 8% faster locally. * Fix tests * Use checked arithmetic	2022-02-11 09:55:08 -06:00
Alex Crichton	c0c368d151	Use mmap'd `.cwasm` as a source for memory initialization images (#3787 ) Skip memfd creation with precompiled modules This commit updates the memfd support internally to not actually use a memfd if a compiled module originally came from disk via the `wasmtime::Module::deserialize_file` API. In this situation we already have a file descriptor open and there's no need to copy a module's heap image to a new file descriptor. To facilitate a new source of `mmap` the currently-memfd-specific-logic of creating a heap image is generalized to a new form of `MemoryInitialization` which is attempted for all modules at module-compile-time. This means that the serialized artifact to disk will have the memory image in its entirety waiting for us. Furthermore the memory image is ensured to be padded and aligned carefully to the target system's page size, notably meaning that the data section in the final object file is page-aligned and the size of the data section is also page aligned. This means that when a precompiled module is mapped from disk we can reuse the underlying `File` to mmap all initial memory images. This means that the offset-within-the-memory-mapped-file can differ for memfd-vs-not, but that's just another piece of state to track in the memfd implementation. In the limit this waters down the term "memfd" for this technique of quickly initializing memory because we no longer use memfd unconditionally (only when the backing file isn't available). This does however open up an avenue in the future to porting this support to other OSes because while `memfd_create` is Linux-specific both macOS and Windows support mapping a file with copy-on-write. This porting isn't done in this PR and is left for a future refactoring. Closes #3758 * Enable "memfd" support on all unix systems Cordon off the Linux-specific bits and enable the memfd support to compile and run on platforms like macOS which have a Linux-like `mmap`. 
This only works if a module is mapped from a precompiled module file on disk, but that's better than not supporting it at all! * Fix linux compile * Use `Arc<File>` instead of `MmapVecFileBacking` * Use a named struct instead of mysterious tuples * Comment about unsafety in `Module::deserialize_file` * Fix tests * Fix uffd compile * Always align data segments No need to have conditional alignment since their sizes are all aligned anyway * Update comment in build.rs * Use rustix, not `region` * Fix some confusing logic/names around memory indexes These functions all work with memory indexes, not specifically defined memory indexes.	2022-02-10 15:40:40 -06:00
Alex Crichton	520a7f26d7	Move function names out of `Module` (#3789 ) * Move function names out of `Module` This commit moves function names in a module out of the `wasmtime_environ::Module` type and into separate sections stored in the final compiled artifact. Spurred on by #3787 to look at module load times I noticed that a huge amount of time was spent in deserializing this map. The `spidermonkey.wasm` file, for example, has a 3MB name section which is a lot of unnecessary data to deserialize at module load time. The names of functions are now split out into their own dedicated section of the compiled artifact and metadata about them is stored in a more compact format at runtime by avoiding a `BTreeMap` and instead using a sorted array. Overall this improves deserialize times by up to 80% for modules with large name sections since the name section is no longer deserialized at load time and it's lazily paged in as names are actually referenced. * Fix a typo * Fix compiled module determinism Need to not only sort afterwards but also first to ensure the data of the name section is consistent.	2022-02-10 14:34:48 -06:00
Chris Fallin	39a52ceb4f	Implement lazy funcref table and anyfunc initialization. (#3733 ) During instance initialization, we build two sorts of arrays eagerly: - We create an "anyfunc" (a `VMCallerCheckedAnyfunc`) for every function in an instance. - We initialize every element of a funcref table with an initializer to a pointer to one of these anyfuncs. Most instances will not touch (via call_indirect or table.get) all funcref table elements. And most anyfuncs will never be referenced, because most functions are never placed in tables or used with `ref.func`. Thus, both of these initialization tasks are quite wasteful. Profiling shows that a significant fraction of the remaining instance-initialization time after our other recent optimizations is going into these two tasks. This PR implements two basic ideas: - The anyfunc array can be lazily initialized as long as we retain the information needed to do so. For now, in this PR, we just recreate the anyfunc whenever a pointer is taken to it, because doing so is fast enough; in the future we could keep some state to know whether the anyfunc has been written yet and skip this work if redundant. This technique allows us to leave the anyfunc array as uninitialized memory, which can be a significant savings. Filling it with initialized anyfuncs is very expensive, but even zeroing it is expensive: e.g. in a large module, it can be >500KB. - A funcref table can be lazily initialized as long as we retain a link to its corresponding instance and function index for each element. A zero in a table element means "uninitialized", and a slowpath does the initialization. Funcref tables are a little tricky because funcrefs can be null. We need to distinguish "element was initially non-null, but user stored explicit null later" from "element never touched" (ie the lazy init should not blow away an explicitly stored null). 
We solve this by stealing the LSB from every funcref (anyfunc pointer): when the LSB is set, the funcref is initialized and we don't hit the lazy-init slowpath. We insert the bit on storing to the table and mask it off after loading. We do have to set up a precomputed array of `FuncIndex`s for the table in order for this to work. We do this as part of the module compilation. This PR also refactors the way that the runtime crate gains access to information computed during module compilation. Performance effect measured with in-tree benches/instantiation.rs, using SpiderMonkey built for WASI, and with memfd enabled: ``` BEFORE: sequential/default/spidermonkey.wasm time: [68.569 us 68.696 us 68.856 us] sequential/pooling/spidermonkey.wasm time: [69.406 us 69.435 us 69.465 us] parallel/default/spidermonkey.wasm: with 1 background thread time: [69.444 us 69.470 us 69.497 us] parallel/default/spidermonkey.wasm: with 16 background threads time: [183.72 us 184.31 us 184.89 us] parallel/pooling/spidermonkey.wasm: with 1 background thread time: [69.018 us 69.070 us 69.136 us] parallel/pooling/spidermonkey.wasm: with 16 background threads time: [326.81 us 337.32 us 347.01 us] WITH THIS PR: sequential/default/spidermonkey.wasm time: [6.7821 us 6.8096 us 6.8397 us] change: [-90.245% -90.193% -90.142%] (p = 0.00 < 0.05) Performance has improved. sequential/pooling/spidermonkey.wasm time: [3.0410 us 3.0558 us 3.0724 us] change: [-95.566% -95.552% -95.537%] (p = 0.00 < 0.05) Performance has improved. parallel/default/spidermonkey.wasm: with 1 background thread time: [7.2643 us 7.2689 us 7.2735 us] change: [-89.541% -89.533% -89.525%] (p = 0.00 < 0.05) Performance has improved. parallel/default/spidermonkey.wasm: with 16 background threads time: [147.36 us 148.99 us 150.74 us] change: [-18.997% -18.081% -17.285%] (p = 0.00 < 0.05) Performance has improved. 
parallel/pooling/spidermonkey.wasm: with 1 background thread time: [3.1009 us 3.1021 us 3.1033 us] change: [-95.517% -95.511% -95.506%] (p = 0.00 < 0.05) Performance has improved. parallel/pooling/spidermonkey.wasm: with 16 background threads time: [49.449 us 50.475 us 51.540 us] change: [-85.423% -84.964% -84.465%] (p = 0.00 < 0.05) Performance has improved. ``` So an improvement of something like 80-95% for a very large module (7420 functions in its one funcref table, 31928 functions total).	2022-02-09 13:56:53 -08:00
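The LSB-tagging scheme described in 39a52ceb4f (lazy funcref tables) can be illustrated with a small sketch. This is not Wasmtime's implementation: `LazyTable`, its fields, and its methods are hypothetical names, plain `usize` addresses stand in for anyfunc pointers, and the `initial` vector stands in for the precomputed `FuncIndex` array plus the anyfunc lookup. The sketch assumes every stored address has a clear low bit, which holds for pointers that are at least 2-byte aligned.

```rust
/// Bit stolen from every table slot to mean "this slot has been initialized".
const INIT_BIT: usize = 1;

/// Hypothetical stand-in for a lazily initialized funcref table.
struct LazyTable {
    /// Raw slots: 0 means "never touched"; anything else is `addr | INIT_BIT`.
    slots: Vec<usize>,
    /// Precomputed initial value for each element (0 models a null funcref).
    initial: Vec<usize>,
}

impl LazyTable {
    /// Creating the table only zeroes the slots; no element is initialized.
    fn new(initial: Vec<usize>) -> Self {
        LazyTable { slots: vec![0; initial.len()], initial }
    }

    /// Store a funcref address, inserting the tag bit so that even an
    /// explicitly stored null (0) is distinguishable from "never touched".
    fn set(&mut self, index: usize, addr: usize) {
        debug_assert_eq!(addr & INIT_BIT, 0, "address must have a clear low bit");
        self.slots[index] = addr | INIT_BIT;
    }

    /// Load a funcref address, taking the lazy-init slow path only when the
    /// tag bit is absent, and masking the tag off before returning.
    fn get(&mut self, index: usize) -> usize {
        let raw = self.slots[index];
        if raw & INIT_BIT == 0 {
            // Slow path: first access, so fill in the precomputed value.
            let addr = self.initial[index];
            self.slots[index] = addr | INIT_BIT;
            return addr;
        }
        raw & !INIT_BIT
    }
}
```

The key property is that storing null before the first read does not get overwritten by lazy initialization, because the stored slot value is `INIT_BIT` rather than 0.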

338 Commits