wasmtime

Author	SHA1	Message	Date
Alex Crichton	18dd82ba7d	Improve signature lookup happening during instantiation (#2818 ) This commit is intended to be a perf improvement for instantiation of modules with lots of functions. Previously the `lookup_shared_signature` callback was showing up quite high in profiles as part of instantiation. As some background, this callback is used to translate from a module's `SignatureIndex` to a `VMSharedSignatureIndex` which the instance stores. This callback is called for two reasons, one is to translate all of the module's own types into `VMSharedSignatureIndex` for the purposes of `call_indirect` (the translation of that loads from this table to compare indices). The second reason is that a `VMCallerCheckedAnyfunc` is prepared for all functions and this embeds a `VMSharedSignatureIndex` inside of it. The slow part today is that the lookup callback was called once-per-function and each lookup involved hashing a full `WasmFuncType`. Albeit our hash algorithm is still Rust's default SipHash algorithm which is quite slow, but we also shouldn't need to re-hash each signature if we see it multiple times anyway. The fix applied in this commit is to change this lookup callback to an `enum` where one variant is that there's a table to lookup from. This table is a `PrimaryMap` which means that lookup is quite fast. The only thing we need to do is to prepare the table ahead of time. Currently this happens on the instantiation path because in my measurments the creation of the table is quite fast compared to the rest of instantiation. If this becomes an issue, though, we can look into creating the table as part of `SigRegistry::register_module` and caching it somewhere (I'm not entirely sure where but I'm sure we can figure it out). There's in generally not a ton of efficiency around the `SigRegistry` type. I'm hoping though that this fixes the next-lowest-hanging-fruit in terms of performance without complicating the implementation too much. I tried a few variants and this change seemed like the best balance between simplicity and still a nice performance gain. Locally I measured an improvement in instantiation time for a large-ish module by reducing the time from ~3ms to ~2.6ms per instance.	2021-04-08 15:04:18 -05:00
Peter Huene	45a500701f	Merge pull request #2811 from peterhuene/improve-store-registration Refactor store frame information.	2021-04-07 18:16:10 -07:00
Peter Huene	ad9fa11d48	Code review feedback. * Remove `once-cell` dependency. * Remove function address `BTreeMap` from `CompiledModule` in favor of binary searching finished functions directly. * Use `with_capacity` when populating `CompiledModule` finished functions and trampolines.	2021-04-07 16:37:04 -07:00
Peter Huene	875cb92cf0	Refactor store frame information. This commit refactors the store frame information to eliminate the copying of data out from `CompiledModule`. It also moves the population of a `BTreeMap` out of the frame information and into `CompiledModule` where it is only ever calculated once rather than at every new module instantiation into a `Store`. The map is also lazy-initialized so the cost of populating the map is incurred only when a trap occurs. This should help improve instantiation time of modules with a large number of functions and functions with lots of instructions.	2021-04-07 12:47:04 -07:00
Alex Crichton	195bf0e29a	Fully support multiple returns in Wasmtime (#2806 ) * Fully support multiple returns in Wasmtime For quite some time now Wasmtime has "supported" multiple return values, but only in the mose bare bones ways. Up until recently you couldn't get a typed version of functions with multiple return values, and never have you been able to use `Func::wrap` with functions that return multiple values. Even recently where `Func::typed` can call functions that return multiple values it uses a double-indirection by calling a trampoline which calls the real function. The underlying reason for this lack of support is that cranelift's ABI for returning multiple values is not possible to write in Rust. For example if a wasm function returns two `i32` values there is no Rust (or C!) function you can write to correspond to that. This commit, however fixes that. This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and `WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for everything except how multiple values are returned. For multiple return values we simply define our own version of the ABI which Wasmtime implements, which is that for N return values the first is returned as if the function only returned that and the latter N-1 return values are returned via an out-ptr that's the last parameter to the function. These custom ABIs provides the ability for Wasmtime to bind these in Rust meaning that `Func::wrap` can now wrap functions that return multiple values and `Func::typed` no longer uses trampolines when calling functions that return multiple values. Although there's lots of internal changes there's no actual changes in the API surface area of Wasmtime, just a few more impls of more public traits which means that more types are supported in more places! Another change made with this PR is a consolidation of how the ABI of each function in a wasm module is selected. The native `SystemV` ABI, for example, is more efficient at returning multiple values than the wasmtime version of the ABI (since more things are in more registers). To continue to take advantage of this Wasmtime will now classify some functions in a wasm module with the "fast" ABI. Only functions that are not reachable externally from the module are classified with the fast ABI (e.g. those not exported, used in tables, or used with `ref.func`). This should enable purely internal functions of modules to have a faster calling convention than those which might be exposed to Wasmtime itself. Closes #1178 * Tweak some names and add docs * "fix" lightbeam compile * Fix TODO with dummy environ * Unwind info is a property of the target, not the ABI * Remove lightbeam unused imports * Attempt to fix arm64 * Document new ABIs aren't stable * Fix filetests to use the right target * Don't always do 64-bit stores with cranelift This was overwriting upper bits when 32-bit registers were being stored into return values, so fix the code inline to do a sized store instead of one-size-fits-all store. * At least get tests passing on the old backend * Fix a typo * Add some filetests with mixed abi calls * Get `multi` example working * Fix doctests on old x86 backend * Add a mixture of wasmtime/system_v tests	2021-04-07 12:34:26 -05:00
Chris Fallin	6bec13da04	Bump versions: Wasmtime to 0.26.0, Cranelift to 0.73.0.	2021-04-05 10:48:42 -07:00
Chris Fallin	8d78212a15	Merge pull request #2718 from cfallin/new-backend Switch default to new x86_64 backend.	2021-04-05 09:38:08 -07:00
Alex Crichton	04bf6e5bbb	Move some scopes around to fix a leak on raising a trap (#2803 ) Some recent refactorings accidentally had a local `Store` on the stack when a longjmp was initiated, bypassing its destructor and causing `Store` to leak. Closes #2802	2021-04-05 10:29:18 -05:00
Chris Fallin	cb48ea406e	Switch default to new x86_64 backend. This PR switches the default backend on x86, for both the `cranelift-codegen` crate and for Wasmtime, to the new (`MachInst`-style, `VCode`-based) backend that has been under development and testing for some time now. The old backend is still available by default in builds with the `old-x86-backend` feature, or by requesting `BackendVariant::Legacy` from the appropriate APIs. As part of that switch, it adds some more runtime-configurable plumbing to the testing infrastructure so that tests can be run using the appropriate backend. `clif-util test` is now capable of parsing a backend selector option from filetests and instantiating the correct backend. CI has been updated so that the old x86 backend continues to run its tests, just as we used to run the new x64 backend separately. At some point, we will remove the old x86 backend entirely, once we are satisfied that the new backend has not caused any unforeseen issues and we do not need to revert.	2021-04-02 11:35:53 -07:00
Peter Huene	0ddfe97a09	Change how flags are stored in serialized modules. This commit changes how both the shared flags and ISA flags are stored in the serialized module to detect incompatibilities when a serialized module is instantiated. It improves the error reporting when a compiled module has mismatched shared flags.	2021-04-01 21:39:57 -07:00
Peter Huene	4ad0099da4	Update `wat` crate. Update the `wat` crate to latest version and use `Error::set_path` in `Module::from_file` to properly record the path associated with errors.	2021-04-01 20:11:26 -07:00
Peter Huene	3da03bcfcf	Code review feedback. * Expand doc comment on `Engine::precompile_module`. * Add FIXME comment regarding a future ISA flag compatibility check before doing a JIT from `Module::from_binary`. * Remove no-longer-needed CLI groups from the `compile` command.	2021-04-01 19:38:20 -07:00
Peter Huene	d1313b1291	Code review feedback. * Move `Module::compile` to `Engine::precompile_module`. * Remove `Module::deserialize` method. * Make `Module::serialize` the same format as `Engine::precompile_module`. * Make `Engine::precompile_module` return a `Vec<u8>`. * Move the remaining serialization-related code to `serialization.rs`.	2021-04-01 19:38:19 -07:00
Peter Huene	abf3bf29f9	Add a `wasmtime settings` command to print Cranelift settings. This commit adds the `wasmtime settings` command to print out available Cranelift settings for a target (defaults to the host). The compile command has been updated to remove the Cranelift ISA options in favor of encouraging users to use `wasmtime settings` to discover what settings are available. This will reduce the maintenance cost for syncing the compile command with Cranelift ISA flags.	2021-04-01 19:38:19 -07:00
Peter Huene	a474524d3b	Code review feedback. * Removed `Config::cranelift_clear_cpu_flags`. * Renamed `Config::cranelift_other_flag` to `Config::cranelift::flag_set`. * Renamed `--cranelift-flag` to `--cranelift-set`. * Renamed `--cranelift-preset` to `--cranelift-enable`.	2021-04-01 19:38:19 -07:00
Peter Huene	1ce2a87149	Code review feedback. * Remove `Config::for_target` in favor of setter `Config::target`. * Remove explicit setting of Cranelift flags in `Config::new` in favor of calling the `Config` methods that do the same thing. * Serialize the package version independently of the data when serializing a module. * Use struct deconstructing in module serialization to ensure tunables and features aren't missed. * Move common log initialization in the CLI into `CommonOptions`.	2021-04-01 19:38:19 -07:00
Peter Huene	7e02940c72	Add Wasmtime to AOT header. Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>	2021-04-01 19:38:18 -07:00
Peter Huene	29d366db7b	Add a compile command to Wasmtime. This commit adds a `compile` command to the Wasmtime CLI. The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules. With the `all-arch` feature enabled, AOT compilation can be performed for non-native architectures (i.e. cross-compilation). The `Module::compile` method has been added to perform AOT compilation. A few of the CLI flags relating to "on by default" Wasm features have been changed to be "--disable-XYZ" flags. A simple example of using the `wasmtime compile` command: ```text $ wasmtime compile input.wasm $ wasmtime input.cwasm ```	2021-04-01 19:38:18 -07:00
Alex Crichton	a301202b7d	Remove the type-driven ability for duplicates in a `Linker` (#2789 ) When `Linker` was first created it was attempted to be created with the ability to instantiate any wasm modules, including those with duplicate import strings of different types. In an effort to support this a `Linker` supports defining the same names twice so long as they're defined with differently-typed values. This ended up causing wast testsuite failures module linking is enabled, however, because the wrong error message is returned. While it would be possible to fix this there's already the possibility for confusing error messages today due to the `Linker` trying to take on this type-level complexity. In a way this is yet-another type checker for wasm imports, but sort of a bad one because it only supports things like globals/functions, and otherwise you can only define one `Memory`, for example, with a particular name. This commit completely removes this feature from `Linker` to simplify the implementation and make error messages more straightforward. This means that any error message coming from a `Linker` is purely "this thing wasn't defined" rather than a hybrid of "maybe the types didn't match?". I think this also better aligns with the direction that we see conventional wasm modules going which is that duplicate imports are not ever present.	2021-03-29 17:26:02 -05:00
Alex Crichton	7d8931c517	Compile fewer trampolines with module linking (#2774 ) Previously each module in a module-linking-using-module would compile all the trampolines for all signatures for all modules. In forest-like situations with lots of modules this would cause quite a few trampolines to get compiled. The original intention was to have one global list of trampolines for all modules in the module-linking graph that they could all share. With the current design of module linking, however, the intention is for modules to be relatively isolated from one another which would make achieving this difficult. In lieu of total sharing (which would be good for the global scope anyway but we also don't do that right now) this commit implements an alternative strategy where each module simply compiles its own trampolines that it itself can reach. This should mean that module-linking modules behave more similarly to standalone modules in terms of trampoline duplication. If we ever do global trampoline deduplication we can likely batch this all together into one, but for now this should fix the performance issues seen in fuzzing. Closes #2525	2021-03-25 19:11:02 -05:00
Alex Crichton	211731b876	Update wasm-tools crates (#2773 ) Brings in some fuzzing-related bug-fixes	2021-03-25 18:44:31 -05:00
Alex Crichton	9476581ae6	Instantiate fewer instances when fuzzing (#2768 ) This commit fixes an issue where when module linking was enabled for fuzzing (which it is) import types of modules show as imports of instances. In an attempt to satisfy the dummy values of such imports the fuzzing integration would create instances for each import. This would, however, count towards instance limits and isn't always desired. This commit refactors the creation of dummy import values to decompose imports of instances into imports of each individual item. This should retain the pre-module-linking behavior of dummy imports for various fuzzers.	2021-03-25 13:35:38 -05:00
Alex Crichton	30d9164b6e	Fix a number of warnings cropping up on nightly Rust (#2767 ) Various small issues here and there, nothing major	2021-03-25 13:19:37 -05:00
Alex Crichton	3f694ae319	Use stable Rust on CI to test the x64 backend (#2766 ) * Use stable Rust on CI to test the x64 backend This commit leverages the newly-released 1.51.0 compiler to test the new backend on Windows and Linux with a stable compiler instead of a nightly compiler. This isolates the nightly build to just the nightly documentation generation and fuzzing, both of which rely on nightly for the best results right now. * Use updated stable in book build job * Run rustfmt for new stable * Silence new warnings for wasi-nn * Allow some dead code in the x64 backend Looks like new rustc is better about emitting some dead-code warnings * Update rust in peepmatic job * Fix a test in the pooling allocator * Remove `package.metdata.docs.rs` temporarily Needs resolution of https://github.com/rust-lang/cargo/pull/9300 first * Fix a warning in a wasi-nn example	2021-03-25 13:18:59 -05:00
Alex Crichton	d4b54ee0a8	More optimizations for calling into WebAssembly (#2759 ) * Combine stack-based cleanups for faster wasm calls This commit is an extension of #2757 where the goal is to optimize entry into WebAssembly. Currently wasmtime has two stack-based cleanups when entering wasm, one for the externref activation table and another for stack limits getting reset. This commit fuses these two cleanups together into one and moves some code around which enables less captures for fewer closures and such to speed up calls in to wasm a bit more. Overall this drops the execution time from 88ns to 80ns locally for me. This also updates the atomic orderings when updating the stack limit from `SeqCst` to `Relaxed`. While `SeqCst` is a reasonable starting point the usage here should be safe to use `Relaxed` since we're not using the atomics to actually protect any memory, it's simply receiving signals from other threads. * Determine whether a pc is wasm via a global map The macOS implementation of traps recently changed to using mach ports for handlers instead of signal handlers. This means that a previously relied upon invariant, each thread fixes its own trap, was broken. The macOS implementation worked around this by maintaining a global map from thread id to thread local information, however, to solve the problem. This global map is quite slow though. It involves taking a lock and updating a hash map on all calls into WebAssembly. In my local testing this accounts for >70% of the overhead of calling into WebAssembly on macOS. Naturally it'd be great to remove this! This commit fixes this issue and removes the global lock/map that is updated on all calls into WebAssembly. The fix is to maintain a global map of wasm modules and their trap addresses in the `wasmtime` crate. Doing so is relatively simple since we're already tracking this information at the `Store` level. Once we've got a global map then the macOS implementation can use this from a foreign thread and everything works out. Locally this brings the overhead, on macOS specifically, of calling into wasm from 80ns to ~20ns. * Fix compiles * Review comments	2021-03-24 11:41:33 -05:00
Alex Crichton	c95971ab59	Optimize calling a WebAssembly function (#2757 ) This commit implements a few optimizations, mainly inlining, that should improve the performance of calling a WebAssembly function. This code path can be quite hot depending on the embedding case and we hadn't really put much effort into optimizing the nitty gritty. The predominant optimization here is adding `#[inline]` to trivial functions so performance is improved without having to compile with LTO. Another optimization is to call `lazy_per_thread_init` when traps are initialized per-thread (when a `Store` is created) rather than each time a function is called. The next optimization is to change the unwind reason in the `CallThreadState` to `MaybeUninit` to avoid extra checks in the default case about whether we need to drop its variants (since in the happy path we never need to drop it). The final optimization is to optimize out a few checks when `async` support is disabled for a small speed boost. In a small benchmark where wasmtime calls a simple wasm function my macOS computer dropped from 110ns to 86ns overhead, a 20% decrease. The macOS overhead is still largely dominated by the global lock acquisition and hash table management for traps right now, but I suspect the Linux overhead is much better (should be on the order of ~30 or so ns). We still have a long way to go to compete with SpiderMonkey which, in testing, seem to have ~6ns overhead in calling the same wasm function on my computer.	2021-03-23 15:22:37 -05:00
Peter Huene	4471d27567	Merge pull request #2741 from peterhuene/refactor-fiber-stacks Split out fiber stacks from fibers.	2021-03-22 11:05:16 -07:00
Benjamin Bouvier	6e6713ae0b	cranelift: add support for the Mac aarch64 calling convention This bumps target-lexicon and adds support for the AppleAarch64 calling convention. Specifically for WebAssembly support, we only have to worry about the new stack slots convention. Stack slots don't need to be at least 8-bytes, they can be as small as the data type's size. For instance, if we need stack slots for (i32, i32), they can be located at offsets (+0, +4). Note that they still need to be properly aligned on the data type they're containing, though, so if we need stack slots for (i32, i64), we can't start the i64 slot at the +4 offset (it must start at the +8 offset). Added one test that was failing on the Mac M1, as well as other tests stressing different yet similar situations.	2021-03-22 10:06:13 +01:00
Peter Huene	e6dda413a4	Code review feedback. * Add assert to `StackPool::deallocate` to ensure the fiber stack given to it comes from the pool. * Remove outdated comment about windows and stacks as the allocator now returns fiber stacks. * Remove conditional compilation around `stack_size` in the allocators as it was just clutter.	2021-03-20 00:05:08 -07:00
Peter Huene	f8f51afac1	Split out fiber stacks from fibers. This commit splits out a `FiberStack` from `Fiber`, allowing the instance allocator trait to return `FiberStack` rather than raw stack pointers. This keeps the stack creation mostly in `wasmtime_fiber`, but now the on-demand instance allocator can make use of it. The instance allocators no longer have to return a "not supported" error to indicate that the store should allocate its own fiber stack. This includes a bunch of cleanup in the instance allocator to scope stacks to the new "async" feature in the runtime. Closes #2708.	2021-03-18 20:21:02 -07:00
Benjamin Bouvier	5fecdfa491	Mach ports continued + support aarch64-apple unwinding (#2723 ) * Switch macOS to using mach ports for trap handling This commit moves macOS to using mach ports instead of signals for handling traps. The motivation for this is listed in #2456, namely that once mach ports are used in a process that means traditional UNIX signal handlers won't get used. This means that if Wasmtime is integrated with Breakpad, for example, then Wasmtime's trap handler never fires and traps don't work. The `traphandlers` module is refactored as part of this commit to split the platform-specific bits into their own files (it was growing quite a lot for one inline `cfg_if!`). The `unix.rs` and `windows.rs` files remain the same as they were before with a few minor tweaks for some refactored interfaces. The `macos.rs` file is brand new and lifts almost its entire implementation from SpiderMonkey, adapted for Wasmtime though. The main gotcha with mach ports is that a separate thread is what services the exception. Some unsafe magic allows this separate thread to read non-`Send` and temporary state from other threads, but is hoped to be safe in this context. The unfortunate downside is that calling wasm on macOS now involves taking a global lock and modifying a global hash map twice-per-call. I'm not entirely sure how to get out of this cost for now, but hopefully for any embeddings on macOS it's not the end of the world. Closes #2456 * Add a sketch of arm64 apple support * store: maintain CallThreadState mapping when switching fibers * cranelift/aarch64: generate unwind directives to disable pointer auth Aarch64 post ARMv8.3 has a feature called pointer authentication, designed to fight ROP/JOP attacks: some pointers may be signed using new instructions, adding payloads to the high (previously unused) bits of the pointers. More on this here: https://lwn.net/Articles/718888/ Unwinders on aarch64 need to know if some pointers contained on the call frame contain an authentication code or not, to be able to properly authenticate them or use them directly. Since native code may have enabled it by default (as is the case on the Mac M1), and the default is that this configuration value is inherited, we need to explicitly disable it, for the only kind of supported pointers (return addresses). To do so, we set the value of a non-existing dwarf pseudo register (34) to 0, as documented in https://github.com/ARM-software/abi-aa/blob/master/aadwarf64/aadwarf64.rst#note-8. This is done at the function granularity, in the spirit of Cranelift compilation model. Alternatively, a single directive could be generated in the CIE, generating less information per module. * Make exception handling work on Mac aarch64 too * fibers: use a breakpoint instruction after the final call in wasmtime_fiber_start Co-authored-by: Alex Crichton <alex@alexcrichton.com>	2021-03-17 09:43:22 -05:00
Nick Fitzgerald	d081ef9c2e	Bump Wasmtime to 0.25.0; Cranelift to 0.72.0	2021-03-16 11:02:56 -07:00
Alex Crichton	2697a18d2f	Redo the statically typed `Func` API (#2719 ) * Redo the statically typed `Func` API This commit reimplements the `Func` API with respect to statically typed dispatch. Previously `Func` had a `getN` and `getN_async` family of methods which were implemented for 0 to 16 parameters. The return value of these functions was an `impl Fn(..)` closure with the appropriate parameters and return values. There are a number of downsides with this approach that have become apparent over time: * The addition of `_async` doubled the API surface area (which is quite large here due to one-method-per-number-of-parameters). The [documentation of `Func`][old-docs] are quite verbose and feel "polluted" with all these getters, making it harder to understand the other methods that can be used to interact with a `Func`. * These methods unconditionally pay the cost of returning an owned `impl Fn` with a `'static` lifetime. While cheap, this is still paying the cost for cloning the `Store` effectively and moving data into the closed-over environment. * Storage of the return value into a struct, for example, always requires `Box`-ing the returned closure since it otherwise cannot be named. * Recently I had the desire to implement an "unchecked" path for invoking wasm where you unsafely assert the type signature of a wasm function. Doing this with today's scheme would require doubling (again) the API surface area for both async and synchronous calls, further polluting the documentation. The main benefit of the previous scheme is that by returning a `impl Fn` it was quite easy and ergonomic to actually invoke the function. In practice, though, examples would often have something akin to `.get0::<()>()?()?` which is a lot of things to interpret all at once. Note that `get0` means "0 parameters" yet a type parameter is passed. There's also a double function invocation which looks like a lot of characters all lined up in a row. Overall, I think that the previous design is starting to show too many cracks and deserves a rewrite. This commit is that rewrite. The new design in this commit is to delete the `getN{,_async}` family of functions and instead have a new API: impl Func { fn typed<P, R>(&self) -> Result<&Typed<P, R>>; } impl Typed<P, R> { fn call(&self, params: P) -> Result<R, Trap>; async fn call_async(&self, params: P) -> Result<R, Trap>; } This should entirely replace the current scheme, albeit by slightly losing ergonomics use cases. The idea behind the API is that the existence of `Typed<P, R>` is a "proof" that the underlying function takes `P` and returns `R`. The `Func::typed` method peforms a runtime type-check to ensure that types all match up, and if successful you get a `Typed` value. Otherwise an error is returned. Once you have a `Typed` then, like `Func`, you can either `call` or `call_async`. The difference with a `Typed`, however, is that the params/results are statically known and hence these calls can be much more efficient. This is a much smaller API surface area from before and should greatly simplify the `Func` documentation. There's still a problem where `Func::wrapN_async` produces a lot of functions to document, but that's now the sole offender. It's a nice benefit that the statically-typed-async verisons are now expressed with an `async` function rather than a function-returning-a-future which makes it both more efficient and easier to understand. The type `P` and `R` are intended to either be bare types (e.g. `i32`) or tuples of any length (including 0). At this time `R` is only allowed to be `()` or a bare `i32`-style type because multi-value is not supported with a native ABI (yet). The `P`, however, can be any size of tuples of parameters. This is also where some ergonomics are lost because instead of `f(1, 2)` you now have to write `f.call((1, 2))` (note the double-parens). Similarly `f()` becomes `f.call(())`. Overall I feel that this is a better tradeoff than before. While not universally better due to the loss in ergonomics I feel that this design is much more flexible in terms of what you can do with the return value and also understanding the API surface area (just less to take in). [old-docs]: https://docs.rs/wasmtime/0.24.0/wasmtime/struct.Func.html#method.get0 * Rename Typed to TypedFunc * Implement multi-value returns through `Func::typed` * Fix examples in docs * Fix some more errors * More test fixes * Rebasing and adding `get_typed_func` * Updating tests * Fix typo * More doc tweaks * Tweak visibility on `Func::invoke` * Fix tests again	2021-03-11 14:43:34 -06:00
Alex Crichton	918c012d00	Fix some issues around TLS management with async (#2709 ) This commit fixes a few issues around managing the thread-local state of a wasmtime thread. We intentionally only have a singular TLS variable in the whole world, and the problem is that when stack-switching off an async thread we were not restoring the previous TLS state. This is necessary in two cases: * Futures aren't guaranteed to be polled/completed in a stack-like fashion. If a poll sees that a future isn't ready then we may resume execution in a previous wasm context that ends up needing the TLS information. * Futures can also cross threads (when the whole store crosses threads) and we need to save/restore TLS state from the thread we're coming from and the thread that we're going to. The stack switching issue necessitates some more glue around suspension and resumption of a stack to ensure we save/restore the TLS state on both sides. The thread issue, however, also necessitates that we use `#[inline(never)]` on TLS access functions and never have TLS borrows live across a function which could result in running arbitrary code (as was the case for the `tls::set` function.	2021-03-11 11:32:33 -06:00
Peter Huene	54c07d8f16	Implement shared host functions. (#2625 ) * Implement defining host functions at the Config level. This commit introduces defining host functions at the `Config` rather than with `Func` tied to a `Store`. The intention here is to enable a host to define all of the functions once with a `Config` and then use a `Linker` (or directly with `Store::get_host_func`) to use the functions when instantiating a module. This should help improve the performance of use cases where a `Store` is short-lived and redefining the functions at every module instantiation is a noticeable performance hit. This commit adds `add_to_config` to the code generation for Wasmtime's `Wasi` type. The new method adds the WASI functions to the given config as host functions. This commit adds context functions to `Store`: `get` to get a context of a particular type and `set` to set the context on the store. For safety, `set` cannot replace an existing context value of the same type. `Wasi::set_context` was added to set the WASI context for a `Store` when using `Wasi::add_to_config`. * Add `Config::define_host_func_async`. * Make config "async" rather than store. This commit moves the concept of "async-ness" to `Config` rather than `Store`. Note: this is a breaking API change for anyone that's already adopted the new async support in Wasmtime. Now `Config::new_async` is used to create an "async" config and any `Store` associated with that config is inherently "async". This is needed for async shared host functions to have some sanity check during their execution (async host functions, like "async" `Func`, need to be called with the "async" variants). * Update async function tests to smoke async shared host functions. This commit updates the async function tests to also smoke the shared host functions, plus `Func::wrap0_async`. This also changes the "wrap async" method names on `Config` to `wrap$N_host_func_async` to slightly better match what is on `Func`. * Move the instance allocator into `Engine`. This commit moves the instantiated instance allocator from `Config` into `Engine`. This makes certain settings in `Config` no longer order-dependent, which is how `Config` should ideally be. This also removes the confusing concept of the "default" instance allocator, instead opting to construct the on-demand instance allocator when needed. This does alter the semantics of the instance allocator as now each `Engine` gets its own instance allocator rather than sharing a single one between all engines created from a configuration. * Make `Engine::new` return `Result`. This is a breaking API change for anyone using `Engine::new`. As creating the pooling instance allocator may fail (likely cause is not enough memory for the provided limits), instead of panicking when creating an `Engine`, `Engine::new` now returns a `Result`. * Remove `Config::new_async`. This commit removes `Config::new_async` in favor of treating "async support" as any other setting on `Config`. The setting is `Config::async_support`. * Remove order dependency when defining async host functions in `Config`. This commit removes the order dependency where async support must be enabled on the `Config` prior to defining async host functions. The check is now delayed to when an `Engine` is created from the config. * Update WASI example to use shared `Wasi::add_to_config`. This commit updates the WASI example to use `Wasi::add_to_config`. As only a single store and instance are used in the example, it has no semantic difference from the previous example, but the intention is to steer users towards defining WASI on the config and only using `Wasi::add_to_linker` when more explicit scoping of the WASI context is required.	2021-03-11 10:14:03 -06:00
Peter Huene	f8cc824396	Merge pull request #2518 from peterhuene/add-allocator Implement the pooling instance allocator.	2021-03-08 12:20:31 -08:00
Peter Huene	7a93132ffa	Code review feedback. * Improve comments. * Drop old table element after updating the table. * Extract out the same `cfg_if!` to a single constant.	2021-03-08 09:04:13 -08:00
Dan Gohman	8bd1c33fec	Add a comment documenting fuel consumption rates. (#2711 )	2021-03-08 09:03:08 -06:00
Pat Hickey	ccdf6ec0b1	Merge pull request #2701 from bytecodealliance/pch/wiggle_async wiggle: support for Rust async	2021-03-05 10:43:55 -08:00
Peter Huene	a7190764e1	More code review changes. * Add more overflow checks in table/memory initialization. * Comment for `with_allocation_strategy` to explain ignored `Config` options. * Fix Wasmtime `Table` to not panic for type mismatches in `fill`/`copy`. * Add tests for that fix.	2021-03-05 00:49:06 -08:00
Peter Huene	a4084db096	More feedback changes. * Don't reexport types from `wasmtime_runtime` from the `wasmtime` crate. * Add more comments.	2021-03-04 22:27:27 -08:00
Peter Huene	ff840b3d3b	More PR feedback changes. * More use of `anyhow`. * Change `make_accessible` into `protect_linear_memory` to better demonstrate what it is used for; this will make the uffd implementation make a little more sense. * Remove `create_memory_map` in favor of just creating the `Mmap` instances in the pooling allocator. This also removes the need for `MAP_NORESERVE` in the uffd implementation. * Moar comments. * Remove `BasePointerIterator` in favor of `impl Iterator`. * The uffd implementation now only monitors linear memory pages and will only receive faults on pages that could potentially be accessible and never on a statically known guard page. * Stop allocating memory or table pools if the maximum limit of the memory or table is 0.	2021-03-04 20:14:40 -08:00
Peter Huene	a464465e2f	Code review feedback changes. * Add `anyhow` dependency to `wasmtime-runtime`. * Revert `get_data` back to `fn`. * Remove `DataInitializer` and box the data in `Module` translation instead. * Improve comments on `MemoryInitialization`. * Remove `MemoryInitialization::OutOfBounds` in favor of proper bulk memory semantics. * Use segmented memory initialization except for when the uffd feature is enabled on Linux. * Validate modules with the allocator after translation. * Updated various functions in the runtime to return `anyhow::Result`. * Use a slice when copying pages instead of `ptr::copy_nonoverlapping`. * Remove unnecessary casts in `OnDemandAllocator::deallocate`. * Better document the `uffd` feature. * Use WebAssembly page-sized pages in the paged initialization. * Remove the stack pool from the uffd handler and simply protect just the guard pages.	2021-03-04 18:19:46 -08:00
Peter Huene	4e83392070	Fix bad merge. Fix a bad merge with the `async` feature that accidentally removed the allocation of fiber stacks via the instance allocator.	2021-03-04 18:19:46 -08:00
Peter Huene	505437e353	Code cleanup. Last minute code clean up to fix some comments and rename `address_space_size` to `memory_reservation_size` to better describe what the option is doing.	2021-03-04 18:19:46 -08:00
Peter Huene	a481e11e63	Add the `uffd` feature to the wasmtime crate docs.	2021-03-04 18:19:46 -08:00
Peter Huene	f5c4d87c45	Implement on-demand memory initialization for the uffd feature. This commit implements copying paged initialization data upon a fault of a linear memory page. If the initialization data is "paged", then the appropriate pages are copied into the Wasm page (or zeroed if the page is not present in the initialization data). If the initialization data is not "paged", the Wasm page is zeroed so that module instantiation can initialize the pages.	2021-03-04 18:19:45 -08:00
Peter Huene	a2c439117a	Implement user fault handling with `userfaultfd` on Linux. This commit implements the `uffd` feature which turns on support for utilizing the `userfaultfd` system call on Linux for the pooling instance allocator. By handling page faults in userland, we are able to detect guard page accesses without having to constantly change memory page protections. This should help reduce the number of syscalls as well as kernel lock contentions when many threads are allocating and deallocating instances. Additionally, the user fault handler can lazy initialize linear memories of an instance (implementation to come).	2021-03-04 18:18:52 -08:00
Peter Huene	e71ccbf9bc	Implement the pooling instance allocator. This commit implements the pooling instance allocator. The allocation strategy can be set with `Config::with_allocation_strategy`. The pooling strategy uses the pooling instance allocator to preallocate a contiguous region of memory for instantiating modules that adhere to various limits. The intention of the pooling instance allocator is to reserve as much of the host address space needed for instantiating modules ahead of time and to reuse committed memory pages wherever possible.	2021-03-04 18:18:51 -08:00
Peter Huene	16ca5e16d9	Implement allocating fiber stacks for an instance allocator. This commit implements allocating fiber stacks in an instance allocator. The on-demand instance allocator doesn't support custom stacks, so the implementation will use the allocation from `wasmtime-fiber` for the fiber stacks. In the future, the pooling instance allocator will return custom stacks to use on Linux and macOS. On Windows, the native fiber implementation will always be used.	2021-03-04 18:18:51 -08:00

1 2 3 4 5

201 Commits