Commit Graph

422 Commits

Author SHA1 Message Date
Alex Crichton
dbd000c1ce Change asm to __asm__ in helpers.c (#6188)
This is an attempt to fix #6177 since according to [this reference][1]
some modes of compilation require `__asm__` instead of `asm`.

[1]: https://en.cppreference.com/w/c/language/asm
2023-04-10 17:47:25 +00:00
Nick Fitzgerald
2e48babf23 cranelift-wasm: Add a bounds-checking optimization for dynamic memories and guard pages (#6031)
* cranelift-wasm: Add a bounds-checking optimization for dynamic memories and guard pages

This is a new special case for when we know that there are enough guard pages to
cover the memory access's offset and access size.

The precise should-we-trap condition is

    index + offset + access_size > bound

However, if we instead check only the partial condition

    index > bound

then the most out of bounds that the access can be, while that partial check
still succeeds, is `offset + access_size`.

However, when we have a guard region that is at least as large as `offset +
access_size`, we can rely on the virtual memory subsystem handling these
out-of-bounds errors at runtime. Therefore, the partial `index > bound` check is
sufficient for this heap configuration.

Additionally, this has the advantage that a series of Wasm loads that use the
same dynamic index operand but different static offset immediates -- which is a
common code pattern when accessing multiple fields in the same struct that is in
linear memory -- will all emit the same `index > bound` check, which we can GVN.
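
As a rough sketch (not the actual cranelift-wasm code; the name and
parameters here are illustrative), the decision boils down to:

    /// Sketch: may the `offset + access_size` part of the bounds check be
    /// elided, leaving only the GVN-able `index > bound` comparison?
    fn can_use_partial_check(offset: u64, access_size: u64, guard_size: u64) -> bool {
        // The partial check can overshoot by at most `offset + access_size`
        // bytes; if the guard region is at least that large, the virtual
        // memory subsystem catches the access at runtime instead.
        offset.checked_add(access_size).map_or(false, |n| n <= guard_size)
    }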

* cranelift: Add WAT tests for accessing dynamic memories with the same index but different offsets

The bounds check comparison is GVN'd, but we still branch on values we should
know are always true if we get this far in the code. These are actual `br_if`s
in the non-Spectre code, and `select_spectre_guard`s that we should know will
always go a certain way when Spectre mitigations are enabled.

Improving the non-Spectre case is pretty straightforward: walk the dominator
tree and remember which values we've already branched on at each point;
any further conditional branches on those same values can then be simplified
into direct jumps.
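
A self-contained sketch of that idea, with a plain `u32` standing in for a
Cranelift SSA value and branches assumed to arrive in dominator-tree
pre-order (the real analysis must also track which branch edge dominates;
this sketch elides that, and all names here are illustrative):

    use std::collections::HashSet;

    struct Branch {
        cond: u32,       // SSA value the conditional branch tests
        redundant: bool, // may be rewritten into an unconditional jump
    }

    fn mark_redundant_branches(branches_in_domtree_preorder: &mut [Branch]) {
        let mut seen: HashSet<u32> = HashSet::new();
        for br in branches_in_domtree_preorder.iter_mut() {
            if seen.contains(&br.cond) {
                // A dominating branch already tested this value, so its
                // outcome is known on this path and the branch can fold.
                br.redundant = true;
            } else {
                seen.insert(br.cond);
            }
        }
    }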

Improving the Spectre case requires something that is morally the same, but has
a few snags:

* We don't have actual `br_if`s to determine whether the bounds checking
  condition succeeded or not. We need to instead reason about dominating
  `select_spectre_guard; {load, store}` instruction pairs.

* We have to be SUPER careful about reasoning "through" `select_spectre_guard`s.
  Our general rule is never to do that, since it could break the speculative
  execution sandboxing that the instruction is designed for.
2023-03-17 19:06:19 +00:00
Alex Crichton
28371bfd40 Validate faulting addresses are valid to fault on (#6028)
* Validate faulting addresses are valid to fault on

This commit adds a defense-in-depth measure to Wasmtime which is
intended to mitigate the impact of CVEs such as GHSA-ff4p-7xrq-q5r8.
Currently Wasmtime will catch `SIGSEGV` signals for WebAssembly code so
long as the instruction which faulted is an allow-listed instruction
(aka has a trap code listed for it). With the recent security issue,
however, the problem was that a wasm guest could exploit a compiler bug
to access memory outside of its sandbox. If the access was successful
there's no real way to detect that, but if the access was unsuccessful
then Wasmtime would happily swallow the `SIGSEGV` and report a nominal
trap. To embedders, this might look like nothing is going awry.

The new strategy implemented here in this commit is to attempt to be
more robust towards these sorts of failures. When a `SIGSEGV` is raised
the faulting pc is recorded but additionally the address of the
inaccessible location is also recorded. After the WebAssembly stack is
unwound and control returns to Wasmtime, which has access to a `Store`,
Wasmtime will now use this inaccessible faulting address to translate it
to a wasm address. This process should be guaranteed to succeed as
WebAssembly should only be able to access a well-defined region of
memory for all linear memories in a `Store`.

If no linear memory in a `Store` could contain the faulting address,
then Wasmtime now prints a scary message and aborts the process. The
purpose of this is to catch these sorts of bugs, make them very loud
errors, and hopefully mitigate impact. This would continue to not
mitigate the impact of a guest successfully loading data outside of its
sandbox, but if a guest was doing a sort of probing strategy trying to
find valid addresses then any invalid access would turn into a process
crash which would immediately be noticed by embedders.
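
A hedged sketch of that validation step; the types and names below are
illustrative, not Wasmtime's actual internal API:

    struct LinearMemory {
        base: usize,
        accessible_len: usize, // includes guard pages that may legally fault
    }

    fn check_faulting_addr(memories: &[LinearMemory], fault_addr: usize) {
        let legal = memories
            .iter()
            .any(|m| fault_addr >= m.base && fault_addr < m.base + m.accessible_len);
        if !legal {
            // No linear memory could have produced this fault: assume a
            // sandbox escape and fail as loudly as possible.
            eprintln!("fault at {fault_addr:#x} outside all linear memories");
            std::process::abort();
        }
    }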

While I was here I went ahead and additionally took a stab at #3120.
Traps due to `SIGSEGV` will now report the size of linear memory and the
address that was being accessed in addition to the bland "access out of
bounds" error. While this is still somewhat bland in the context of a
high level source language it's hopefully at least a little bit more
actionable for some. I'll note though that this isn't a guaranteed
contextual message since only the default configuration for Wasmtime
generates `SIGSEGV` on out-of-bounds memory accesses. Dynamically
bounds-checked configurations, for example, don't do this.

Testing-wise I unfortunately am not aware of a great way to test this.
The closest equivalent would be something like an `unsafe` method
`Config::allow_wasm_sandbox_escape`. In lieu of adding tests, though, I
can confirm that during development the crash message works just
fine, as it took a while on macOS to figure out where the faulting address
was recorded in the exception information, which meant I had lots of
instances of recording an address of a trap not accessible from wasm.

* Fix tests

* Review comments

* Fix compile after refactor

* Fix compile on macOS

* Fix trap test for s390x

s390x rounds faulting addresses to 4k boundaries.
2023-03-17 14:52:54 +00:00
Alex Crichton
5ae8575296 x64: Take SIGFPE signals for divide traps (#6026)
* x64: Take SIGFPE signals for divide traps

Prior to this commit Wasmtime would configure `avoid_div_traps=true`
unconditionally for Cranelift. This, for the division-based
instructions, would change emitted code to explicitly trap on trap
conditions instead of letting the `div` x86 instruction trap.

There's no specific reason for Wasmtime, however, to specifically avoid
traps in the `div` instruction. This means that the extra generated
branches on x86 aren't necessary since the `div` and `idiv` instructions
already trap for similar conditions as wasm requires.

This commit instead disables the `avoid_div_traps` setting for
Wasmtime's usage of Cranelift. Subsequently the codegen rules were
updated slightly:

* When `avoid_div_traps=true`, traps are no longer emitted for `div`
  instructions.
* The `udiv`/`urem` instructions now list their trap as divide-by-zero
  instead of integer overflow.
* The lowering for `sdiv` was updated to still explicitly check for zero
  but the integer overflow case is deferred to the instruction itself.
* The lowering of `srem` no longer checks for zero and the listed trap
  for the `div` instruction is a divide-by-zero.

This means that the codegen for `udiv` and `urem` no longer have any
branches. The codegen for `sdiv` removes one branch but keeps the
zero-check to differentiate the two kinds of traps. The codegen for
`srem` removes one branch but keeps the -1 check since the semantics of
`srem` mismatch with the semantics of `idiv` with a -1 divisor
(specifically for INT_MIN).
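
To make the `srem` special case concrete, here is a sketch of the
semantics the lowering must preserve (illustrative, not the actual
lowering code):

    fn wasm_i32_rem_s(lhs: i32, rhs: i32) -> Result<i32, &'static str> {
        if rhs == 0 {
            // After this commit, reported via the hardware fault (SIGFPE)
            // rather than an explicit branch.
            return Err("integer divide by zero");
        }
        if rhs == -1 {
            // `idiv` faults for i32::MIN / -1, but wasm defines the
            // remainder as 0 for any lhs, so the lowering branches here.
            return Ok(0);
        }
        Ok(lhs % rhs)
    }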

This is unlikely to have really all that much of a speedup but was
something I noticed during #6008 which seemed like it'd be good to clean
up. Plus Wasmtime's signal handling was already set up to catch
`SIGFPE`, it was just never firing.

* Remove the `avoid_div_traps` cranelift setting

With no known users currently removing this should be possible and helps
simplify the x64 backend.

* x64: GC more support for avoid_div_traps

Remove the `validate_sdiv_divisor*` pseudo-instructions and clean up
some of the ISLE rules now that `div` is allowed to itself trap
unconditionally.

* x64: Store div trap code in instruction itself

* Keep divisors in registers, not in memory

Don't accidentally fold multiple traps together

* Handle EXC_ARITHMETIC on macos

* Update emit tests

* Update winch and tests
2023-03-16 00:18:45 +00:00
Alex Crichton
8bb183f16e Implement the relaxed SIMD proposal (#5892)
* Initial support for the Relaxed SIMD proposal

This commit adds initial scaffolding and support for the Relaxed SIMD
proposal for WebAssembly. Codegen is supported on the x64 and
AArch64 backends at this time.

The purpose of this commit is to get all the boilerplate out of the way
in terms of plumbing through a new feature, adding tests, etc. The tests
are copied from the upstream repository at this time while the
WebAssembly/testsuite repository hasn't been updated.

A summary of changes made in this commit are:

* Lowerings for all relaxed simd opcodes have been added, currently all
  exhibiting deterministic behavior. This means that few lowerings are
  optimal on the x86 backend, but on the AArch64 backend, for example,
  all lowerings should be optimal.

* Support is added to codegen to, eventually, conditionally generate
  different code based on input codegen flags. This is intended to
  enable codegen to more efficient instructions on x86 by default, for
  example, while still allowing embedders to force
  architecture-independent semantics and behavior. One good example of
  this is the `f32x4.relaxed_madd` instruction, which when deterministic
  forces the `fma` instruction, but otherwise, if the backend doesn't
  have support for `fma`, intermediate operations are performed
  instead (see the sketch after this list).

* Lowerings of `iadd_pairwise` for `i16x8` and `i32x4` were added to the
  x86 backend as they're now exercised by the deterministic lowerings of
  relaxed simd instructions.

* Sample codegen tests were added for x86 and aarch64 for some relaxed
  simd instructions.

* Wasmtime embedder support for the relaxed-simd proposal and forcing
  determinism have been added to `Config` and the CLI.

* Support has been added to the `*.wast` runtime execution for the
  `(either ...)` matcher used in the relaxed-simd proposal.

* Tests for relaxed-simd are run both with a default `Engine` as well as
  a "force deterministic" `Engine` to test both configurations.

* All tests from the upstream repository were copied into Wasmtime.
  These tests should be deleted when WebAssembly/testsuite is updated.
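
As a sketch of the `f32x4.relaxed_madd` example referenced above, the
per-lane behavior an embedder can observe is roughly (names here are
illustrative):

    fn relaxed_madd_lane(a: f32, b: f32, c: f32, deterministic: bool, has_fma: bool) -> f32 {
        if deterministic || has_fma {
            // Fused multiply-add: a single rounding step, matching the
            // `fma` machine instruction (or a libcall in its absence).
            a.mul_add(b, c)
        } else {
            // Unfused fallback: two rounding steps, cheaper on hardware
            // without FMA support.
            a * b + c
        }
    }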

* x64: Add x86-specific lowerings for relaxed simd

This commit builds on the prior commit and adds an array of `x86_*`
instructions to Cranelift which have semantics that match their
corresponding x86 equivalents. Translation for relaxed simd is then
additionally updated to conditionally generate different CLIF for
relaxed simd instructions depending on whether the target is x86 or not.
This means that for AArch64 no changes are made but for x86 most relaxed
instructions now lower to some x86-equivalent with slightly different
semantics than the "deterministic" lowering.

* Add libcall support for fma to Wasmtime

This will be required to implement the `f32x4.relaxed_madd` instruction
(and others) when an x86 host doesn't specify the `has_fma` feature.

* Ignore relaxed-simd tests on s390x and riscv64

* Enable relaxed-simd tests on s390x

* Update cranelift/codegen/meta/src/shared/instructions.rs

Co-authored-by: Andrew Brown <andrew.brown@intel.com>

* Add a FIXME from review

* Add notes about deterministic semantics

* Don't default `has_native_fma` to `true`

* Review comments and rebase fixes

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
2023-03-07 15:52:41 +00:00
Alex Crichton
f91640ffab Fix a panic due to a race in unpark and park (#5871)
* Remove globals from parking spot tests

Use `std::thread::scope` to keep everything local to just the tests.

* Fix a panic due to a race in `unpark` and `park`

This commit fixes a panic in the `ParkingSpot` implementation where an
`unpark` signal may not get acknowledged when a waiter times out,
causing the waiter to remove itself from the internal map but panic
thinking that it missed an unpark signal.

The fix in this commit is to consume unpark signals when a timeout
happens. This can lead to another possible race I've detailed in the
comments which I believe is allowed by the specification of park/unpark
in wasm.

* Update crates/runtime/src/parking_spot.rs

Co-authored-by: Andrew Brown <andrew.brown@intel.com>

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
2023-02-23 23:20:05 +00:00
Alphyr
cb150d37ce Update dependencies (#5513) 2023-02-14 19:45:15 +00:00
Koute
e40a838beb Prevent trampoline entrypoints from being stripped out during LTO (#5773)
This works around a `rustc` bug where compiling with LTO
will sometimes strip out some of the trampoline entrypoint
symbols resulting in a linking failure.
2023-02-14 09:16:27 -06:00
Nick Fitzgerald
317cc51337 Rename VMCallerCheckedAnyfunc to VMCallerCheckedFuncRef (#5738)
At some point what is now `funcref` was called `anyfunc` and the spec changed,
but we didn't update our internal names. This does that.

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-07 22:09:02 +00:00
Alex Crichton
91b8a2c527 Always allocate Instance memory with malloc (#5656)
This commit removes the pooling of `Instance` allocations from the
pooling instance allocator. This means that the allocation of `Instance`
(and `VMContext`) memory, now always happens through the system `malloc`
and `free` instead of optionally being part of the pooling instance
allocator. Along the way this refactors the `InstanceAllocator` trait so
the pooling and on-demand allocators can share more structure with this
new property of the implementation.

The main rationale for this commit is to reduce the RSS of long-lived
programs which allocate instances with the pooling instance allocator
and aren't using the "next available" allocation strategy. In this
situation the memory for an instance is never decommitted until the end
of the program, meaning that eventually all instance slots will become
occupied and resident. This has the effect of Wasmtime slowly eating
more and more memory over time as each slot gets an instance allocated.
By switching to the system allocator this should reduce the current RSS
workload from O(used slots) to O(active slots), which is more in line
with expectations.
2023-02-01 19:37:45 +00:00
Alex Crichton
8ffbb9cfd7 Reimplement the pooling instance allocation strategy (#5661)
* Reimplement the pooling instance allocation strategy

This commit is a reimplementation of the strategy by which the pooling
instance allocator selects a slot for a module. Previously there was a
choice amongst three different algorithms: "reuse affinity", "next
available", and "random". The default was "reuse affinity" but some new
data has come to light which shows that this may not always be a good
default.

Notably the pooling allocator will retain some memory per-slot in the
pooling instance allocator, for example instance data or memory data
if-so-configured. This means that a currently unused, but previously
used, slot can contribute to the RSS usage of a program using Wasmtime.
Consequently the RSS impact here is O(max slots) which can be
counter-intuitive for embedders. This particularly affects "reuse
affinity" because the algorithm for picking a slot when there are no
affine slots is "pick a random slot", which means eventually all slots
will get used.

In discussions about possible ways to tackle this, an alternative to
"pick a strategy" arose and is now implemented in this commit.
Concretely the new allocation algorithm for a slot is now:

* First pick the most recently used affine slot, if one exists.
* Otherwise if the number of affine slots to other modules is above some
  threshold N then pick the least-recently used affine slot.
* Otherwise pick a slot that's affine to nothing.

The "N" in this algorithm is configurable and setting it to 0 is the
same as the old "next available" strategy while setting it to infinity
is the same as the "reuse affinity" algorithm. Setting it to something
in the middle provides a knob to allow a modest "cache" of affine slots
while not allowing the total set of slots used to grow too much beyond
the maximal concurrent set of modules. The "random" strategy is now no
longer possible and was removed to help simplify the allocator.
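
A sketch of the resulting policy (illustrative; the real index allocator
maintains ordered lists rather than scanning slots, and
`max_unused_warm_slots` stands in for the configurable "N"):

    enum Slot { AffineToThisModule, AffineToOther, Cold }

    fn pick_slot(slots: &[Slot], max_unused_warm_slots: usize) -> Option<usize> {
        // 1. Prefer a slot already affine to this module (most recently
        //    used first, in the real allocator).
        if let Some(i) = slots.iter().position(|s| matches!(s, Slot::AffineToThisModule)) {
            return Some(i);
        }
        // 2. If too many slots are warm with other modules' state, evict
        //    one (least recently used first, in the real allocator).
        let warm = slots.iter().filter(|s| matches!(s, Slot::AffineToOther)).count();
        if warm > max_unused_warm_slots {
            return slots.iter().position(|s| matches!(s, Slot::AffineToOther));
        }
        // 3. Otherwise take a cold slot, leaving the warm "cache" intact.
        slots.iter().position(|s| matches!(s, Slot::Cold))
    }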

* Resolve rustdoc warnings in `wasmtime-runtime` crate

* Remove `max_cold` as it duplicates the `slot_state.len()`

* More descriptive names

* Add a comment and debug assertion

* Add some list assertions
2023-02-01 11:43:51 -06:00
Alex Crichton
4ad86752de Fix libcall relocations for precompiled modules (#5608)
* Fix libcall relocations for precompiled modules

This commit fixes some asserts and support for relocation libcalls in
precompiled modules loaded from disk. In doing so this reworks how mmaps
are managed for files from disk. All non-file-backed `Mmap` entries are
read/write but file-backed versions were readonly. This commit changes
this such that all `Mmap` objects, even if they're file-backed, start as
read/write. The file-based versions all use copy-on-write to preserve
the private-ness of the mapping.

This is not intended to change anything functionally. It does leave
some more memory writable after a module is loaded, but the
text section, for example, is still left as read/execute when loading is
finished. Additionally this makes modules compiled in memory more
consistent with modules loaded from disk.

* Update a comment

* Force images to become readonly during publish

This marks compiled images as entirely readonly during the
`CodeMemory::publish` step which happens just before the text section
becomes executable. This ensures that all images, no matter where they
come from, are guaranteed frozen before they start executing.
2023-01-25 12:09:15 -06:00
Szczepan Ćwikliński
86790d36df Fix compile errors on FreeBSD x64/arm64 (#5606)
* Fix compile error on FreeBSD x64

* Fix compile on FreeBSD arm64

* Update Cargo.lock for ittapi

* vet: certify diff for ittapi libraries

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
2023-01-20 18:42:03 +00:00
Alex Crichton
9b896d2a70 Resolve libcall relocations for older CPUs (#5567)
* Resolve libcall relocations for older CPUs

Long ago Wasmtime used to have logic for resolving relocations
post-compilation for libcalls which I ended up removing during
refactorings last year. As #5563 points out, however, it's possible to
get Wasmtime to panic by disabling SSE features which forces Cranelift
to use libcalls for some floating-point operations instead. Note that
this also requires disabling SIMD because SIMD support has a baseline of
SSE 4.2.

This commit pulls back the old implementations of various libcalls and
reimplements logic necessary to have them work on CPUs without SSE 4.2

Closes #5563

* Fix log message in `wast` support

* Fix offset listed in relocations

Be sure to factor in the offset of the function itself

* Review comments
2023-01-18 09:04:10 -06:00
Alex Crichton
d9fdbfd50e Use the sym operator for inline assembly (#5459)
* Use the `sym` operator for inline assembly

Avoids extra `#[no_mangle]` functions and undue symbols being exposed
from Wasmtime. This is a newly stabilized feature in Rust 1.66.0. I've
also added a `rust-version` entry to the `wasmtime` crate to try to head
off possible reports in the future about odd error messages or usage of
unstable features if the rustc version is too old.

* Fix a s390x warning

* Add `rust-version` annotation to Wasmtime crate

As the other main entrypoint for embedders.
2022-12-16 20:12:24 +00:00
Alex Crichton
03715dda9d Tidy up some internals of instance allocation (#5346)
* Simplify the `ModuleRuntimeInfo` trait slightly

Fold two functions into one as they're only called from one location
anyway.

* Remove ModuleRuntimeInfo::signature

This is redundant as the array mapping is already stored within the
`VMContext` so that can be consulted rather than having a separate trait
function for it. This required altering the `Global` creation slightly
to work correctly in this situation.

* Remove a now-dead constant

* Shared `VMOffsets` across instances

This commit removes the computation of `VMOffsets` to being per-module
instead of per-instance. The `VMOffsets` structure is also quite large
so this shaves off 112 bytes per instance which isn't a huge impact but
should help lower the cost of instantiating small modules.

* Remove `InstanceAllocator::adjust_tunables`

This is no longer needed or necessary with the pooling allocator.

* Fix compile warning

* Fix a vtune warning

* Fix pooling tests

* Fix another test warning
2022-12-01 22:22:08 +00:00
Alex Crichton
ed6769084b Add a WasmBacktrace::new() constructor (#5341)
* Add a `WasmBacktrace::new()` constructor

This commit adds a method of manually capturing a backtrace of
WebAssembly frames within a `Store`. The new constructor can be called
with any `AsContext` values, primarily `&Store` and `&Caller`, during
host functions to inspect the calling state.

For now this does not respect the `Config::wasm_backtrace` option and
instead unconditionally captures the backtrace. It's hoped that this can
continue to adapt to needs of embedders by making it more configurable
in the future if necessary.

Closes #5339

* Split `new` into `capture` and `force_capture`
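
A sketch of what capturing looks like from a host function after the
split, assuming a `caller: Caller<'_, T>` is in scope:

    // `capture` respects `Config::wasm_backtrace`; `force_capture` does not.
    let trace = wasmtime::WasmBacktrace::capture(&caller);
    for frame in trace.frames() {
        println!("{:?}", frame.func_name());
    }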
2022-12-01 22:19:07 +00:00
Alex Crichton
e0b9663e44 Remove some custom error types in Wasmtime (#5347)
* Remove some custom error types in Wasmtime

These types are mostly cumbersome to work with nowadays that `anyhow` is
used everywhere else. This commit removes `InstantiationError` and
`SetupError` in favor of using `anyhow::Error` throughout. This can
eventually culminate in creation of specific errors for embedders to
downcast to but for now this should be general enough.

* Fix Windows build
2022-12-01 14:47:10 -06:00
Alex Crichton
86acb9a438 Use workspace inheritance for some more dependencies (#5349)
Deduplicate some dependency directives through `[workspace.dependencies]`
2022-11-29 22:32:56 +00:00
Dan Gohman
d6d3c49972 Update to cap-std 1.0, io-lifetimes 1.0. (#5330)
The main change here is that io-lifetimes 1.0 switches to use the I/O safety
feature in the standard library rather than providing its own copy.

This also updates to windows-sys 0.42.0 and rustix 0.36.
2022-11-28 15:31:18 -08:00
Alex Crichton
951bdcb2cf Clear affine slots when dropping a Module (#5321)
* Clear affine slots when dropping a `Module`

This commit implements a resource usage optimization for Wasmtime with
the pooling instance allocator by ensuring that when a `Module` is
dropped its backing virtual memory mappings are all removed. Currently
when a `Module` is dropped it releases a strong reference to its
internal memory image but the memory image may stick around in
individual pooling instance allocator slots. When using the `Random`
allocation strategy, for example, this means that the memory images
could stick around for a long time.

While not a pressing issue this has resource usage implications for
Wasmtime. Namely removing a `Module` does not guarantee the memfd, if in
use for a memory image, is closed and deallocated within the kernel.
Unfortunately simply closing the memfd is not sufficient, as the
mappings into the address space additionally all need to be removed for
the kernel to release the resources for the memfd. This means that to
release all kernel-level resources for a `Module` all slots which have
the memory image mapped in must have the slot reset.

This problem isn't particularly present when using the `NextAvailable`
allocation strategy since the number of lingering memfds is proportional
to the maximum concurrent size of wasm instances. With the `Random` and
`ReuseAffinity` strategies, however, it's much more prominent because
the number of lingering memfds can reach the total number of slots
available. This can appear as a leak of kernel-level memory which can
cause other system instability.

To fix this issue this commit adds necessary instrumentation to `Drop
for Module` to purge all references to the module in the pooling
instance allocator. All index allocation strategies now maintain
affinity tracking to ensure that regardless of the strategy in use a
module that is dropped will remove all its memory mappings. A new
allocation method was added to the index allocator for allocating an
index without setting affinity and only allocating affine slots. This is
used to iterate over all the affine slots without holding the global
index lock for an unnecessarily long time while mappings are removed.

* Review comments
2022-11-28 08:58:02 -06:00
Alex Crichton
6ce2ac19b8 Refactor shared memory internals, expose embedder methods (#5311)
This commit refactors the internals of `wasmtime_runtime::SharedMemory`
a bit to expose the necessary functions to invoke from the
`wasmtime::SharedMemory` layer. Notably some items are moved out of the
`RwLock` from prior, such as the type and the `VMMemoryDefinition`.
Additionally the organization around the `atomic_*` methods has been
redone to ensure that the `wasmtime`-layer abstraction has a single
method to call into which everything else uses as well.
2022-11-22 08:51:55 -08:00
Harald Hoyer
8ce98e3c12 fix: atomic wait does not sleep long enough (#5315)
From the documentation of `CondVar::wait_timeout`:

> The semantics of this function are equivalent to wait except that the thread
> will be blocked for roughly no longer than `dur`. This method should not be
> used for precise timing due to anomalies such as preemption or platform
> differences that might not cause the maximum amount of time waited to be
> precisely `dur`.

Therefore, go back to sleep if the thread has not slept long enough.
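
A minimal sketch of that fix over a plain condvar (the real
implementation also returns early when an actual notify arrives):

    use std::sync::{Condvar, Mutex};
    use std::time::{Duration, Instant};

    fn sleep_at_least(cond: &Condvar, lock: &Mutex<()>, dur: Duration) {
        let start = Instant::now();
        let mut guard = lock.lock().unwrap();
        loop {
            let elapsed = start.elapsed();
            if elapsed >= dur {
                return;
            }
            // `wait_timeout` may wake early, so re-wait for the remainder.
            guard = cond.wait_timeout(guard, dur - elapsed).unwrap().0;
        }
    }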

Signed-off-by: Harald Hoyer <harald@profian.com>

Signed-off-by: Harald Hoyer <harald@profian.com>
2022-11-22 09:36:29 -06:00
Harald Hoyer
c74706aa59 feat: implement memory.atomic.notify,wait32,wait64 (#5255)
* feat: implement memory.atomic.notify,wait32,wait64

Added the parking_spot crate, which provides the needed registry for the
operations.

Signed-off-by: Harald Hoyer <harald@profian.com>

* fix: change trap message for HeapMisaligned

The threads spec test wants "unaligned atomic"
instead of "misaligned memory access".

Signed-off-by: Harald Hoyer <harald@profian.com>

* tests: add test for atomic wait on non-shared memory

Signed-off-by: Harald Hoyer <harald@profian.com>

* tests: add tests/spec_testsuite/proposals/threads

without pooling and reference types.
Also "shared_memory" is added to the "spectest" interface.

Signed-off-by: Harald Hoyer <harald@profian.com>

* tests: add atomics_notify.wast

checking that notify with 0 waiters returns 0 on shared and non-shared
memory.

Signed-off-by: Harald Hoyer <harald@profian.com>

* tests: add tests for atomic wait on shared memory

- return 2 - timeout for 0
- return 2 - timeout for 1000ns
- return 1 - invalid value

Signed-off-by: Harald Hoyer <harald@profian.com>

* fixup! feat: implement memory.atomic.notify,wait32,wait64

Signed-off-by: Harald Hoyer <harald@profian.com>

* fixup! feat: implement memory.atomic.notify,wait32,wait64

Signed-off-by: Harald Hoyer <harald@profian.com>

Signed-off-by: Harald Hoyer <harald@profian.com>
2022-11-21 18:23:06 +00:00
Alex Crichton
7a31c5b07c Deduplicate listings of traps in Wasmtime (#5299)
This commit replaces `wasmtime_environ::TrapCode` with `wasmtime::Trap`.
This is possible with past refactorings which slimmed down the `Trap`
definition in the `wasmtime` crate to a simple `enum`. This means that
there's one less place that all the various trap opcodes need to be
listed in Wasmtime.
2022-11-18 22:04:38 +00:00
Alex Crichton
7ec626b898 Use deterministic randomness fuzzing the pooling allocator (#5247)
This commit updates the index allocation performed in the pooling
allocator with a few refactorings:

* With `cfg(fuzzing)` a deterministic rng is now used to improve
  reproducibility of fuzz test cases.
* The `Mutex` was pushed inside of `IndexAllocator`, renamed from
  `PoolingAllocationState`.
* Randomness is now always done through a `SmallRng` stored in the
  `IndexAllocator` instead of using `thread_rng`.
* The `is_empty` method has been removed in favor of an `Option`-based
  return on `alloc`.

This refactoring is additionally intended to encapsulate more
implementation details of `IndexAllocator` to more easily allow for
alternate implementations in the future such as lock-free approaches
(possibly).
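
A sketch of the rng selection described above (assuming the `rand`
crate; the seed value is illustrative):

    use rand::{rngs::SmallRng, SeedableRng};

    fn make_rng() -> SmallRng {
        if cfg!(fuzzing) {
            // Fixed seed: fuzz failures replay deterministically.
            SmallRng::seed_from_u64(0)
        } else {
            // Production: seed from the OS.
            SmallRng::from_entropy()
        }
    }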
2022-11-10 20:53:04 +00:00
Alex Crichton
92f6fe36cc Fix CI after CVE fixes (#5245)
* Fix CI after CVE fixes

Alas we can't run CI ahead of time so this fixes various minor build
issues from the merging of the recent CVE fixes. Note that I plan to
publish the advisories once CI issues are sorted out.

* Fix mmap/free of zero bytes
2022-11-10 13:35:15 -06:00
Alex Crichton
3535acbf3b Merge pull request from GHSA-wh6w-3828-g9qf
* Unconditionally use `MemoryImageSlot`

This commit removes the internal branching within the pooling instance
allocator that would sometimes use a `MemoryImageSlot` and sometimes not.
Instead this is now unconditionally used in all situations on all
platforms. This fixes an issue where the state of a slot could get
corrupted if modules being instantiated switched from having images to
not having an image or vice versa.

The bulk of this commit is the removal of the `memory-init-cow`
compile-time feature in addition to adding Windows support to the
`cow.rs` file.

* Fix compile on Unix

* Add a stricter assertion for static memory bounds

Double-check that when a memory is allocated the configuration required
is satisfied by the pooling allocator.
2022-11-10 11:34:38 -06:00
Nick Fitzgerald
47fa1ad6a8 Rework bounds checking for atomic operations (#5239)
Before, we would do a `heap_addr` to translate the given Wasm memory address
into a native memory address and pass it into the libcall that implemented the
atomic operation, which would then treat the address as a Wasm memory address
and pass it to `validate_atomic_addr` to be bounds checked a second time. This
is a bit nonsensical, as we are validating a native memory address as if it were
a Wasm memory address.

Now, we no longer do a `heap_addr` to translate the Wasm memory address to a
native memory address. Instead, we pass the Wasm memory address to the libcall,
and the libcall is responsible for doing the bounds check (by calling
`validate_atomic_addr` with the correct type of memory address now).
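
A hedged sketch of the new libcall shape (names and signature are
illustrative, not the actual Wasmtime code):

    unsafe fn atomic_op_libcall(
        memory_base: *mut u8,
        memory_len: usize,
        wasm_addr: usize, // a Wasm memory address, not a native pointer
        access_size: usize,
    ) -> Result<*mut u8, &'static str> {
        if wasm_addr % access_size != 0 {
            return Err("unaligned atomic");
        }
        let end = wasm_addr
            .checked_add(access_size)
            .ok_or("out of bounds memory access")?;
        if end > memory_len {
            return Err("out of bounds memory access");
        }
        // Only now is the Wasm address translated to a native address.
        Ok(memory_base.add(wasm_addr))
    }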
2022-11-09 16:19:43 -08:00
Chris Fallin
d59caf39b6 Wasmtime+Cranelift: strip out some dead x86-32 code. (#5226)
* Wasmtime+Cranelift: strip out some dead x86-32 code.

I was recently pointed to fastly/Viceroy#200 where it seems some folks
are trying to compile Wasmtime (via Viceroy) for Windows x86-32 and the
failures may not be loud enough. I've tried to reproduce this
cross-compiling to i686-pc-windows-gnu from Linux and hit build failures
(as expected) in several places.  Nevertheless, while trying to discern
what others may be attempting, I noticed some dead x86-32-specific code
in our repo, and figured it would be a good idea to clean this up.
Otherwise, it (i) sends some mixed messages -- "hey look, this codebase
does support x86-32" -- and (ii) keeps untested code around, which is
generally not great.

This PR removes x86-32-specific cases in traphandlers and unwind code,
and Cranelift's native feature detection. It adds helpful compile-error
messages in a few cases. If we ever support x86-32 (contributors
welcome! The big missing piece is Cranelift support; see #1980), these
compile errors and git history should be enough to recover any knowledge
we are now encoding in the source.

I left the x86-32 support in `wasmtime-fiber` alone because that seems
like a bit of a special case -- foundation library, separate from the
rest of Wasmtime, with specific care to provide a (presumably working)
full 32-bit version.

* Remove some extraneous compile_error!s, already covered by others.
2022-11-08 23:03:17 +00:00
Alex Crichton
50cffad0d3 Implement support for dynamic memories in the pooling allocator (#5208)
* Implement support for dynamic memories in the pooling allocator

This is a continuation of the thrust in #5207 for reducing page faults
and lock contention when using the pooling allocator. To that end this
commit implements support for efficient memory management in the pooling
allocator when using wasm that is instrumented with bounds checks.

The `MemoryImageSlot` type now avoids unconditionally shrinking memory
back to its initial size during the `clear_and_remain_ready` operation,
instead deferring optional resizing of memory to the subsequent call to
`instantiate` when the slot is reused. The instantiation portion then
takes the "memory style" as an argument which dictates whether the
accessible memory must be precisely fit or whether it's allowed to
exceed the maximum. This in effect enables skipping a call to `mprotect`
to shrink the heap when dynamic memory checks are enabled.

In terms of page fault and contention this should improve the situation
by:

* Fewer calls to `mprotect` since once a heap grows it stays grown and
  it never shrinks. This means that a write lock is taken within the
  kernel much more rarely from before (only asymptotically now, not
  N-times-per-instance).

* Accessed memory after a heap growth operation will not fault if it was
  previously paged in by a prior instance and set to zero with `memset`.
  Unlike #5207 which requires a 6.0 kernel to see this optimization this
  commit enables the optimization for any kernel.

The major cost of choosing this strategy is naturally the performance
hit of the wasm itself. This is being looked at in PRs such as #5190 to
improve Wasmtime's story here.

This commit does not implement any new configuration options for
Wasmtime but instead reinterprets existing configuration options. The
pooling allocator no longer unconditionally sets
`static_memory_bound_is_maximum` and then implements support necessary
for this memory type. The other change in this commit is that the
`Tunables::static_memory_bound` configuration option no longer gates
the creation of a `MemoryPool`, which will now appropriately size itself to
`instance_limits.memory_pages` if the `static_memory_bound` is too small.
This is done to accommodate fuzzing more easily, where the
`static_memory_bound` will become small during fuzzing and otherwise the
configuration would be rejected and require manual handling. The spirit
of the `MemoryPool` is one of large virtual address space reservations
anyway so it seemed reasonable to interpret the configuration this way.

* Skip zero memory_size cases

These are causing errors to happen when fuzzing and otherwise in theory
shouldn't be too interesting to optimize for anyway since they likely
aren't used in practice.
2022-11-08 14:43:08 -06:00
Alex Crichton
d3a6181939 Add support for keeping pooling allocator pages resident (#5207)
When new wasm instances are created repeatedly in high-concurrency
environments one of the largest bottlenecks is the contention on
kernel-level locks having to do with the virtual memory. It's expected
that usage in this environment is leveraging the pooling instance
allocator with the `memory-init-cow` feature enabled which means that
the kernel level VM lock is acquired in operations such as:

1. Growing a heap with `mprotect` (write lock)
2. Faulting in memory during usage (read lock)
3. Resetting a heap's contents with `madvise` (read lock)
4. Shrinking a heap with `mprotect` when reusing a slot (write lock)

Rapid usage of these operations can lead to detrimental performance
especially on otherwise heavily loaded systems, worsening the more
frequent the above operations are. This commit is aimed at addressing
the (2) case above, reducing the number of page faults that are
fulfilled by the kernel.

Currently these page faults happen for three reasons:

* When memory is first accessed after the heap is grown.
* When the initial linear memory image is accessed for the first time.
* When the initial zero'd heap contents, not part of the linear memory
  image, are accessed.

This PR is attempting to address the latter of these cases, and to a
lesser extent the first case as well. Specifically this PR provides the
ability to partially reset a pooled linear memory with `memset` rather
than `madvise`. This is done to have the same effect of resetting
contents to zero but namely has a different effect on paging, notably
keeping the pages resident in memory rather than returning them to the
kernel. This means that reuse of a linear memory slot on a page that was
previously `memset` will not trigger a page fault since everything
remains paged into the process.
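
A sketch of such a reset under these assumptions (Unix-only, using the
`libc` crate; the function name and knob are illustrative):

    unsafe fn reset_slot(base: *mut u8, accessible: usize, keep_resident: usize) {
        let by_memset = accessible.min(keep_resident);
        // Zero the first chunk by hand: these pages stay resident, so the
        // next instance touching them takes no page fault.
        std::ptr::write_bytes(base, 0, by_memset);
        if accessible > by_memset {
            // Hand the remainder back to the kernel as before.
            libc::madvise(
                base.add(by_memset).cast(),
                accessible - by_memset,
                libc::MADV_DONTNEED,
            );
        }
    }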

The end result is that any access to linear memory which has been
touched by `memset` will no longer page fault on reuse. On more recent
kernels (6.0+) this also means pages which were zero'd by `memset`, made
inaccessible with `PROT_NONE`, and then made accessible again with
`PROT_READ | PROT_WRITE` will not page fault. This can be common when a
wasm instances grows its heap slightly, uses that memory, but then it's
shrunk when the memory is reused for the next instance. Note that this
kernel optimization requires a 6.0+ kernel.

This same optimization is furthermore applied to both async stacks with
the pooling memory allocator in addition to table elements. The defaults
of Wasmtime are not changing with this PR, instead knobs are being
exposed for embedders to turn if they so desire. This is currently being
experimented with at Fastly and I may come back and alter the defaults
of Wasmtime if it seems suitable after our measurements.
2022-11-04 20:56:34 +00:00
Alex Crichton
b14551d7ca Refactor configuration for the pooling allocator (#5205)
This commit changes the APIs in the `wasmtime` crate for configuring the
pooling allocator. I plan on adding a few more configuration options in
the near future and the current structure was feeling unwieldy for
adding these new abstractions.

The previous `struct`-based API has been replaced with a builder-style
API in a similar shape as to `Config`. This is done to help make it
easier to add more configuration options in the future through adding
more methods, as opposed to adding more fields, which could break prior
initializations.
2022-11-04 20:06:45 +00:00
Alex Crichton
cd53bed898 Implement AOT compilation for components (#5160)
* Pull `Module` out of `ModuleTextBuilder`

This commit is the first in what will likely be a number towards
preparing for serializing a compiled component to bytes, a precompiled
artifact. To that end my rough plan is to merge all of the compiled
artifacts for a component into one large object file instead of having
lots of separate object files and lots of separate mmaps to manage. To
that end I plan on eventually using `ModuleTextBuilder` to build one
large text section for all core wasm modules and trampolines, meaning
that `ModuleTextBuilder` is no longer specific to one module. I've
extracted out functionality such as function name calculation as well as
relocation resolving (now a closure passed in) in preparation for this.

For now this just keeps tests passing, and the trajectory for this
should become more clear over the following commits.

* Remove component-specific object emission

This commit removes the `ComponentCompiler::emit_obj` function in favor
of `Compiler::emit_obj`, now renamed `append_code`. This involved
significantly refactoring code emission to take a flat list of functions
into `append_code` and the caller is responsible for weaving together
various "families" of functions and un-weaving them afterwards.

* Consolidate ELF parsing in `CodeMemory`

This commit moves the ELF file parsing and section iteration from
`CompiledModule` into `CodeMemory` so one location keeps track of
section ranges and such. This is in preparation for sharing much of this
code with components which needs all the same sections to get tracked
but won't be using `CompiledModule`. A small side benefit from this is
that the section parsing done in `CodeMemory` and `CompiledModule` is no
longer duplicated.

* Remove separately tracked traps in components

Previously components would generate an "always trapping" function
and the metadata around which pc was allowed to trap was handled
manually for components. With recent refactorings the Wasmtime-standard
trap section in object files is now being generated for components as
well which means that can be reused instead of custom-tracking this
metadata. This commit removes the manual tracking for the `always_trap`
functions and plumbs the necessary bits around to make components look
more like modules.

* Remove a now-unnecessary `Arc` in `Module`

Not expected to have any measurable impact on performance, but
complexity-wise this should make it a bit easier to understand the
internals since there's no longer any need to store this somewhere else
than its owner's location.

* Merge compilation artifacts of components

This commit is a large refactoring of the component compilation process
to produce a single artifact instead of multiple binary artifacts. The
core wasm compilation process is refactored as well to share as much
code as necessary with the component compilation process.

This method of representing a compiled component necessitated a few
medium-sized changes internally within Wasmtime:

* A new data structure was created, `CodeObject`, which represents
  metadata about a single compiled artifact. This is then stored as an
  `Arc` within a component and a module. For `Module` this is always
  uniquely owned and represents a shuffling around of data from one
  owner to another. For a `Component`, however, this is shared amongst
  all loaded modules and the top-level component.

* The "module registry" which is used for symbolicating backtraces and
  for trap information has been updated to account for a single region
  of loaded code holding possibly multiple modules. This involved adding
  a second-level `BTreeMap` for now. This will likely slow down
  instantiation slightly but if it poses an issue in the future this
  should be able to be represented with a more clever data structure.

This commit additionally solves a number of longstanding issues with
components such as compiling only one host-to-wasm trampoline per
signature instead of possibly once-per-module. Additionally the
`SignatureCollection` registration now happens once-per-component
instead of once-per-module-within-a-component.

* Fix compile errors from prior commits

* Support AOT-compiling components

This commit adds support for AOT-compiled components in the same manner
as `Module`, specifically adding:

* `Engine::precompile_component`
* `Component::serialize`
* `Component::deserialize`
* `Component::deserialize_file`

Internally the support for components looks quite similar to `Module`.
All the prior commits to this made adding the support here
(unsurprisingly) easy. Components are represented as a single object
file as are modules, and the functions for each module are all piled
into the same object file next to each other (as are areas such as data
sections). Support was also added here to quickly differentiate compiled
components vs compiled modules via the `e_flags` field in the ELF
header.

* Prevent serializing exported modules on components

The current representation of a module within a component means that the
implementation of `Module::serialize` will not work if the module is
exported from a component. The reason for this is that `serialize`
doesn't actually do anything and simply returns the underlying mmap as a
list of bytes. The mmap, however, has `.wasmtime.info` describing
component metadata as opposed to this module's metadata. While rewriting
this section could be implemented, it's not easy to do and is
otherwise seen as not a super important feature right now anyway.

* Fix windows build

* Fix an unused function warning

* Update crates/environ/src/compilation.rs

Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>

Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
2022-11-02 15:26:26 +00:00
Christopher Serr
9a8a710d8b Add missing Win32_Foundation feature (#5134)
This is necessary for the `wasmtime-runtime` crate to compile on Windows.
2022-10-26 20:42:31 +00:00
Nick Fitzgerald
a2f846f124 Don't re-capture backtraces when propagating traps through host frames (#5049)
* Add a benchmark for traps with many Wasm<-->host calls on the stack

* Add a test for expected Wasm stack traces with Wasm<-->host calls on the stack when we trap

* Don't re-capture backtraces when propagating traps through host frames

This fixes some accidentally quadratic code where we would re-capture a Wasm
stack trace (takes `O(n)` time) every time we propagated a trap through a host
frame back to Wasm (can happen `O(n)` times). And `O(n) * O(n) = O(n^2)`, of
course. Whoops. After this commit, trapping with a call stack that is `n`
frames deep of Wasm-to-host-to-Wasm calls captures just a single backtrace and
is therefore a proper `O(n)` time operation, as it is intended to be.

Now we explicitly track whether we need to capture a Wasm backtrace or not when
raising a trap. This unfortunately isn't as straightforward as one might hope,
however, because of the split between `wasmtime::Trap` and
`wasmtime_runtime::Trap`. We need to decide whether or not to capture a Wasm
backtrace inside `wasmtime_runtime` but in order to determine whether to do that
or not we need to reflect on the `anyhow::Error` and see if it is a
`wasmtime::Trap` that already has a backtrace or not. This can't be done the
straightforward way because it would introduce a cyclic dependency between the
`wasmtime` and `wasmtime-runtime` crates. We can't merge those two `Trap`
types-- at least not without effectively merging the whole `wasmtime` and
`wasmtime-runtime` crates together, which would be a good idea in a perfect
world but would be a *ton* of ocean boiling from where we currently are --
because `wasmtime::Trap` does symbolication of stack traces which relies on
module registration information data that resides inside the `wasmtime` crate
and therefore can't be moved into `wasmtime-runtime`. We resolve this problem by
adding a boolean to `wasmtime_runtime::raise_user_trap` that controls whether we
should capture a Wasm backtrace or not, and then determine whether we need a
backtrace or not at each of that function's call sites, which are in `wasmtime`
and therefore can do the reflection to determine whether the user trap already
has a backtrace or not. Phew!

Fixes #5037

* debug assert that we don't record unnecessary backtraces for traps

* Add assertions around `needs_backtrace`

Unfortunately we can't do

    debug_assert_eq!(needs_backtrace, trap.inner.backtrace.get().is_some());

because `needs_backtrace` doesn't consider whether Wasm backtraces have been
disabled via config.

* Consolidate `needs_backtrace` calculation followed by calling `raise_user_trap` into one place
2022-10-13 07:22:46 -07:00
Yuyi Wang
6bcc430855 Initial work to build for Windows ARM64 (#4990)
* Make wasmtime build for windows-aarch64

* Add check for win arm64 build.

* Fix checks for winarm64 key in workflows.

* Add target in windows arm64 build.

* Add tracking issue for Windows ARM64 trap handling
2022-10-02 19:45:42 -07:00
yuyang-ok
cdecc858b4 add riscv64 backend for cranelift. (#4271)
Add a RISC-V 64 (`riscv64`, RV64GC) backend.

Co-authored-by: yuyang <756445638@qq.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
Co-authored-by: Afonso Bordado <afonsobordado@az8.co>
2022-09-27 17:30:31 -07:00
Alex Crichton
84994203a1 Increase the sigaltstack stack size (#4964)
This commit updates the `MIN_STACK_SIZE` constant for Unix platforms
when allocating a sigaltstack from 16k to 64k. The signal handler
captures a wasm `Backtrace` which involves memory allocations and it was
recently discovered that, at least in debug mode, jemalloc can take up
to 16k of stack space for an allocation. To give the signal handler
enough headroom, the sigaltstack size is increased here.
2022-09-26 22:48:26 +00:00
Alex Crichton
f12ef84cdc Remove handling_trap variable (#4963)
This historically was used to guard against recursive faults but
later refactorings have made this variable somewhat obsolete. The code
that it still protects is not the "meat" of trap handling. Instead
`jmp_buf_if_trap` is changed to behave more like a "take" operation, so once a
"take" succeeds the "meat" can no longer be entered recursively.

Overall this shouldn't affect anything, it's just a small internal
cleanup.
2022-09-26 22:41:47 +00:00
Alex Crichton
7b311004b5 Leverage Cargo's workspace inheritance feature (#4905)
* Leverage Cargo's workspace inheritance feature

This commit is an attempt to reduce the complexity of the Cargo
manifests in this repository with Cargo's workspace-inheritance feature
becoming stable in Rust 1.64.0. This feature allows specifying fields in
the root workspace `Cargo.toml` which are then reused throughout the
workspace. For example this PR shares definitions such as:

* All of the Wasmtime-family of crates now use `version.workspace =
  true` to have a single location which defines the version number.
* All crates use `edition.workspace = true` to have one default edition
  for the entire workspace.
* Common dependencies are listed in `[workspace.dependencies]` to avoid
  typing the same version number in a lot of different places (e.g. the
  `wasmparser = "0.89.0"` directive is now in just one spot).

Currently the workspace-inheritance feature doesn't allow having two
different versions to inherit, so all of the Cranelift-family of crates
still manually specify their version. The inter-crate dependencies,
however, are shared amongst the root workspace.

This feature can be seen as a method of "preprocessing" of sorts for
Cargo manifests. This will help us develop Wasmtime but shouldn't have
any actual impact on the published artifacts -- everything's dependency
lists are still the same.

* Fix wasi-crypto tests
2022-09-26 11:30:01 -05:00
Dan Gohman
6f50ddaaf2 Update to cap-std 0.26. (#4940)
* Update to cap-std 0.26.

This is primarily to pull in bytecodealliance/cap-std#271, the fix for #4936,
compilation on Rust nightly on Windows.

It also updates to rustix 0.35.10, to pull in bytecodealliance/rustix#403,
the fix for bytecodealliance/rustix#402, compilation on newer versions of
the libc crate, which changed a public function from `unsafe` to safe.

Fixes #4936.

* Update the system-interface audit for 0.23.

* Update the libc supply-chain config version.
2022-09-21 14:56:38 -05:00
Anton Kirilov
d8b290898c Initial forward-edge CFI implementation (#3693)
* Initial forward-edge CFI implementation

Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.

Copyright (c) 2022, Arm Limited.

* Refactor `from_artifacts` to avoid second `make_executable` (#1)

This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.

* Address the code review feedback

Copyright (c) 2022, Arm Limited.

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-09-08 09:35:58 -05:00
Alex Crichton
65930640f8 Bump Wasmtime to 2.0.0 (#4874)
This commit replaces #4869 and represents the actual version bump that
should have happened had I remembered to bump the in-tree version of
Wasmtime to 1.0.0 prior to the branch-cut date. Alas!
2022-09-06 13:49:56 -05:00
Xuran
bca4dae8b0 feat: add a knob for reset stack (#4813)
* feat: add a knob for reset stack

* Touch up documentation of `async_stack_zeroing`

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-09-01 16:09:46 +00:00
Nick Fitzgerald
ff0e84ecf4 Wasmtime: fix stack walking across frames from different stores (#4779)
We were previously implicitly assuming that all Wasm frames in a stack used the
same `VMRuntimeLimits` as the previous frame we walked, but this is not true
when Wasm in store A calls into the host which then calls into Wasm in store B:

    | ...             |
    | Host            |  |
    +-----------------+  | stack
    | Wasm in store A |  | grows
    +-----------------+  | down
    | Host            |  |
    +-----------------+  |
    | Wasm in store B |  V
    +-----------------+

Trying to walk this stack would previously result in a runtime panic.

The solution is to push the maintenance of our list of saved Wasm FP/SP/PC
registers that allow us to identify contiguous regions of Wasm frames on the
stack deeper into `CallThreadState`. The saved registers list is now maintained
whenever updating the `CallThreadState` linked list by making the
`CallThreadState::prev` field private and only accessible via a getter and
setter, where the setter always maintains our invariants.
2022-08-30 18:28:00 +00:00
Alex Crichton
5add267b87 Fix a soundness issue with lowering variants (#4723)
* Fix a compile error on nightly Rust

It looks like Rust nightly has gotten a bit more strict about
attributes-on-expressions and previously accepted code is no longer
accepted. This commit updates the generated code for a macro to a form
which is accepted by rustc.

* Fix a soundness issue with lowering variants

This commit fixes a soundness issue lowering variants in the component
model where host memory could be leaked to the guest module by accident.
In reviewing code recently for `Val::lower` I noticed that the variant
lowering was extending the payload with `ValRaw::u32(0)` to
appropriately fit the size of the variant. In reading this it appeared
incorrect to me due to the fact that it should be `ValRaw::u64(0)` since
up to 64-bits can be read. Additionally this implementation was also
incorrect because the lowered representation of the payload itself was
not zero-extended to 64 bits to accommodate other variants.

It turned out these issues were benign because with the dynamic
surface area to the component model the arguments were all initialized
to 0 anyway. The static version of the API, however, does not initialize
arguments to 0 and I wanted to initially align these two implementations
so I updated the variant implementation of lowering for dynamic values
and removed the zero-ing of arguments.

To test this change I updated the `debug` mode of adapter module
generation to assert that the upper bits of values in wasm are always
zero when the value is casted down (during `stack_get` which only
happens with variants). I then threaded through the `debug` boolean
configuration parameter into the dynamic and static fuzzers.

To my surprise this new assertion tripped even after the fix was
applied. It turns out, though, that there was other leakage of bits
through other means that I was previously unaware of. At the primitive
level lowerings of types like `u32` will have a `Lower` representation
of `ValRaw` and the lowering is simply `dst.write(ValRaw::i32(self))`,
or the equivalent thereof. The problem, that the fuzzers detected, with
this pattern is that the `ValRaw` type is 16-bytes, and
`ValRaw::i32(X)` only initializes the first 4. This meant that all the
lowerings for all primitives were writing up to 12 bytes of garbage from
the host for the wasm module to read.

It turned out that this write of a `ValRaw` was sometimes 16 bytes and
sometimes the appropriate size depending on the number of optimizations
in play. With enough inlining for example `dst.write(ValRaw::i32(self))`
would only write 4 bytes, as expected. In debug mode though without
inlining 16 bytes would be written, including the garbage from the upper
bits.

To solve this issue I ended up taking a somewhat different approach. I
primarily updated the `ValRaw` constructors to simply always extend the
values internally to 64-bits, meaning that the low 8 bytes of a `ValRaw`
is always initialized. This prevents any undefined data from leaking
from the host into a wasm module, and means that values are also
zero-extended even if they're only used in 32-bit contexts outside of a
variant. This felt like the best fix for now, though, in terms of
not really having a performance impact while additionally not requiring
a rewrite of all lowerings.
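
An illustrative model of that constructor-level fix (`ValRaw` here is a
simplified stand-in for the real 16-byte union):

    #[repr(C)]
    union ValRaw {
        i32: i32,
        i64: i64,
        f64: u64,
        v128: u128,
    }

    impl ValRaw {
        fn i32(x: i32) -> ValRaw {
            // Extend to 64 bits so the low 8 bytes are always initialized
            // and no undefined host bytes can leak into the guest.
            ValRaw { i64: x as i64 }
        }
    }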

This solution ended up also neatly removing the "zero out the entire
payload" logic that was previously require. Now after a payload is
lowered only the tail end of the payload, up to the size of the variant,
is zeroed out. This means that each lowered argument is written to at
most once which should hopefully be a small performance boost for
calling into functions as well.
2022-08-16 22:33:24 +00:00
Alex Crichton
cc955e4e7e Rename MmapVec::drain to split_off (#4673)
* Rename `MmapVec::drain` to `split_off`

As suggested on #4609

* Fix tests

* Make MmapVec::split_off work like Vec::split_off

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2022-08-15 21:00:12 +00:00
Alex Crichton
c1c48b4386 Don't be clever about representing non-CoW images (#4691)
This commit fixes a build warning on Rust 1.63 when the `memory-init-cow`
feature is disabled in the `wasmtime-runtime` crate. Some "tricks" were
previously used to have the `MemoryImage` type be an empty `enum {}`, but that
wreaks havoc with warnings so this commit instead just makes it a unit
struct and makes all methods panic (as they shouldn't be hit anyway).
2022-08-11 18:16:28 +00:00
Nick Fitzgerald
0b1f51f804 Remove unnecessary parens around expression (#4647)
Fixes a compiler warning.
2022-08-08 15:48:03 -07:00