wasmtime

Author	SHA1	Message	Date
Jamey Sharp	8bbd9bb228	aarch64: Test instruction selection for `bmask` (#5396 ) I copied the `bmask` precise-output tests from x64 and used CRANELIFT_TEST_BLESS=1 to generate this test. I don't know aarch64 well enough to decide if this output is correct. However, for I128 it is identical to the previous I128-only precise-output tests, and the existing runtests for bmask pass on aarch64, so I think it's likely correct.	2022-12-08 10:22:23 -08:00
Jamey Sharp	0456c1d213	cranelift-isle: Factor constraint/binding relation (#5383 ) It turns out that during codegen I'll want to know which bindings were added for a particular constraint. Factoring that out and making sure to use it everywhere that constraints and bindings are created ensures that these will always stay in sync. It also simplifies the implementation of `normalize_equivalence_classes`, which needs to create bindings for constraints but doesn't care what they are. Also make `add_pattern_constraints` non-recursive and reuse allocations.	2022-12-08 02:26:51 +00:00
Jamey Sharp	8726eeefb3	cranelift-isle: Add "partial" flag for constructors (#5392 ) * cranelift-isle: Add "partial" flag for constructors Instead of tying fallibility of constructors to whether they're either internal or pure, this commit assumes all constructors are infallible unless tagged otherwise with a "partial" flag. Internal constructors without the "partial" flag are not allowed to use constructors which have the "partial" flag on the right-hand side of any rules, because they have no way to report last-minute match failures. Multi-constructors should never be "partial"; they report match failures with an empty iterator instead. In turn this means you can't use partial constructors on the right-hand side of internal multi-constructor rules. However, you can use the same constructors on the left-hand side with `if` or `if-let` instead. In many cases, ISLE can already trivially prove that an internal constructor always returns `Some`. With this commit, those cases are largely unchanged, except for removing all the `Option`s and `Some`s from the generated code for those terms. However, for internal non-partial constructors where ISLE could not prove that, it now emits an `unreachable!` panic as the last-resort, instead of returning `None` like it used to do. Among the existing backends, here's how many constructors have these panic cases: - x64: 14% (53/374) - aarch64: 15% (41/277) - riscv64: 23% (26/114) - s390x: 47% (268/567) It's often possible to rewrite rules so that ISLE can tell the panic can never be hit. Just ensure that there's a lowest-priority rule which has no constraints on the left-hand side. But in many of these constructors, it's difficult to statically prove the unhandled cases are unreachable because that's only down to knowledge about how they're called or other preconditions. So this commit does not try to enforce that all terms have a last-resort fallback rule. 
* Check term flags while translating expressions Instead of doing it in a separate pass afterward. This involved threading all the term flags (pure, multi, partial) through the recursive `translate_expr` calls, so I extracted the flags to a new struct so they can all be passed together. * Validate multi-term usage Now that I've threaded the flags through `translate_expr`, it's easy to check this case too, so let's just do it. * Extract `ReturnKind` to use in `ExternalSig` There are only three legal states for the combination of `multi` and `infallible`, so replace those fields of `ExternalSig` with a three-state enum. * Remove `Option` wrapper from multi-extractors too If we'd had any external multi-constructors this would correct their signatures as well. * Update ISLE tests * Tag prelude constructors as pure where appropriate I believe the only reason these weren't marked `pure` before was because that would have implied that they're also partial. Now that those two states are specified separately we apply this flag more places. * Fix my changes to aarch64 `lower_bmask` and `imm` terms	2022-12-07 17:16:03 -08:00
Alex Crichton	c9527e0af6	Remove references to wasm-bindgen in documentation (#5394 ) These references are really, really old and are no longer applicable. In general the `wasm-*.md` documentation needs a lot of updates but this applies at least a small band-aid to remove the `#[wasm_bindgen]` references which are likely more harmful than helpful.	2022-12-07 16:41:50 -06:00
Chris Fallin	8c55b81300	Optimizations to egraph framework (#5391 ) * Optimizations to egraph framework: - Save elaborated results by canonical value, not latest value (union value). Previously we were artificially skipping and re-elaborating some values we already had because we were not finding them in the map. - Make some changes to handling of icmp results: when icmp became I8-typed (when bools went away), many uses became `(uextend $I32 (icmp $I8 ...))`, and so patterns in lowering backends were no longer matching. This PR includes an x64-specific change to match `(brz (uextend (icmp ...)))` and similarly for `brnz`, but it also takes advantage of the ability to write rules easily in the egraph mid-end to rewrite selects with icmp inputs appropriately. - Extend constprop to understand selects in the egraph mid-end. With these changes, bz2.wasm sees a ~1% speedup, and spidermonkey.wasm with a fib.js input sees a 16.8% speedup: ``` $ time taskset 1 target/release/wasmtime run --allow-precompiled --dir=. ./spidermonkey.base.cwasm ./fib.js 1346269 taskset 1 target/release/wasmtime run --allow-precompiled --dir=. ./fib.js 2.14s user 0.01s system 99% cpu 2.148 total $ time taskset 1 target/release/wasmtime run --allow-precompiled --dir=. ./spidermonkey.egraphs.cwasm ./fib.js 1346269 taskset 1 target/release/wasmtime run --allow-precompiled --dir=. ./fib.js 1.78s user 0.01s system 99% cpu 1.788 total ``` * Review feedback.	2022-12-07 13:23:13 -08:00
Trevor Elliott	c5379051c4	Enable the ssa verifier in debug builds (#5354 ) Enable regalloc2's SSA verifier in debug builds to check for any outstanding reuse of virtual registers in def constraints. As fuzzing enables debug_assertions, this will enable the SSA verifier when fuzzing as well.	2022-12-07 12:22:51 -08:00
Nick Fitzgerald	f0c4b6f3a1	Cranelift: Implement `iadd_cout` on x64 for 32- and 64-bit integers (#5285 ) * Split the `iadd_cout` runtests by type * Implement `iadd_cout` for 32- and 64-bit values on x64 * Delete trailing whitespace in `riscv/lower.isle`	2022-12-07 19:54:14 +00:00
Alex Crichton	7f53525ad9	Fix built with latest `wit-parser` crate (#5393 ) A mistake was made in the publication of `wit-parser` where a breaking change was made without bumping its major version, causing build issues on `main` if `wit-parser` is updated. This commit updates `wit-parser` to the latest and we'll handle breaking changes better next time. Closes #5390	2022-12-07 10:47:50 -06:00
Jamey Sharp	29b23d41b6	ISLE rule cleanups (#5389 ) * cranelift-codegen: Use ISLE matching, not same_value The `same_value` function just wrapped an equality test into an external constructor, but we can do that with ISLE's equality constraints instead. * riscv64: Remove custom condition-code tests The `lower_icmp` term exists solely to decide whether to sign-extend or zero-extend the comparison operands, based on whether the condition code requires a signed comparison. It additionally tested whether the condition code was == or !=, but produced the same result as for other unsigned comparisons. We already have `signed_cond_code` in the ISLE prelude, which classifies the total-ordering condition codes according to whether they're signed. It also lumps == and != in the "unsigned" camp, as desired. So this commit uses the existing method from the prelude instead of riscv64-local definitions. Because this version has no constraints on the left-hand side of the rule in the unsigned case, ISLE generates Rust that always returns `Some`. That shows that the current use of `unwrap` is justified, at the only Rust-side call-site of `constructor_lower_icmp`, which is in cranelift/codegen/src/isa/riscv64/lower/isle.rs. * ISLE prelude: make offset32 infallible This extractor always returns `Some`, so it doesn't need to be fallible.	2022-12-07 02:55:59 +00:00
Chris Fallin	0eb22429d1	Fuzzing: add `use_egraphs` option back to fuzzing config generator. (#5388 ) This PR reverts #5128 (commit `b3333bf9ea`), adding back the ability for the fuzzing config generator to set the `use_egraphs` Cranelift option. This will start to fuzz the egraphs-based optimization framework again, now that #5382 has landed.	2022-12-07 00:47:58 +00:00
Trevor Elliott	ab6c8e1a1a	Bump regalloc2 to version 0.5.1 (#5387 ) Bump regalloc2 to version 0.5.1.	2022-12-06 15:38:03 -08:00
Chris Fallin	f980defe17	egraph support: rewrite to work in terms of CLIF data structures. (#5382 ) * egraph support: rewrite to work in terms of CLIF data structures. This work rewrites the "egraph"-based optimization framework in Cranelift to operate on aegraphs (acyclic egraphs) represented in the CLIF itself rather than as a separate data structure to which and from which we translate the CLIF. The basic idea is to add a new kind of value, a "union", that is like an alias but refers to two other values rather than one. This allows us to represent an eclass of enodes (values) as a tree. The union node allows for a value to have multiple representations: either constituent value could be used, and (in well-formed CLIF produced by correct optimization rules) they must be equivalent. Like the old egraph infrastructure, we take advantage of acyclicity and eager rule application to do optimization in a single pass. Like before, we integrate GVN (during the optimization pass) and LICM (during elaboration). Unlike the old egraph infrastructure, everything stays in the DataFlowGraph. "Pure" enodes are represented as instructions that have values attached, but that are not placed into the function layout. When entering "egraph" form, we remove them from the layout while optimizing. When leaving "egraph" form, during elaboration, we can place an instruction back into the layout the first time we elaborate the enode; if we elaborate it more than once, we clone the instruction. The implementation performs two passes overall: - One, a forward pass in RPO (to see defs before uses), that (i) removes "pure" instructions from the layout and (ii) optimizes as it goes. As before, we eagerly optimize, so we form the entire union of optimized forms of a value before we see any uses of that value. This lets us rewrite uses to use the most "up-to-date" form of the value and canonicalize and optimize that form. 
The eager rewriting and acyclic representation make each other work (we could not eagerly rewrite if there were cycles; and acyclicity does not miss optimization opportunities only because the first time we introduce a value, we immediately produce its "best" form). This design choice is also what allows us to avoid the "parent pointers" and fixpoint loop of traditional egraphs. This forward optimization pass keeps a scoped hashmap to "intern" nodes (thus performing GVN), and also interleaves on a per-instruction level with alias analysis. The interleaving with alias analysis allows alias analysis to see the most optimized form of each address (so it can see equivalences), and allows the next value to see any equivalences (reuses of loads or stored values) that alias analysis uncovers. - Two, a forward pass in domtree preorder, that "elaborates" pure enodes back into the layout, possibly in multiple places if needed. This tracks the loop nest and hoists nodes as needed, performing LICM as it goes. Note that by doing this in forward order, we avoid the "fixpoint" that traditional LICM needs: we hoist a def before its uses, so when we place a node, we place it in the right place the first time rather than moving later. This PR replaces the old (a)egraph implementation. It removes both the cranelift-egraph crate and the logic in cranelift-codegen that uses it. On `spidermonkey.wasm` running a simple recursive Fibonacci microbenchmark, this work shows 5.5% compile-time reduction and 7.7% runtime improvement (speedup). Most of this implementation was done in (very productive) pair programming sessions with Jamey Sharp, thus: Co-authored-by: Jamey Sharp <jsharp@fastly.com> * Review feedback. * Review feedback. * Review feedback. * Bugfix: cprop rule: `(x + k1) - k2` becomes `x - (k2 - k1)`, not `x - (k1 - k2)`. Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2022-12-06 14:58:57 -08:00
Alex Crichton	08d44e3746	Change how wasm DWARF is inserted into artifacts (#5358 ) This commit fixes a bug with components by changing how DWARF information from a wasm binary is copied over to the final compiled artifact. Note that this is not the Wasmtime-generated DWARF but rather the native wasm DWARF itself used in backtraces. Previously the wasm dwarf was inserted into sections `.<id>.wasm` where `<id>` was `debug_info`, `debug_str`, etc -- one per `gimli::SectionId` as found in the original wasm module. This does not work with components, however, where modules did not correctly separate their debug information into separate sections or otherwise disambiguate. The fix in this commit is to instead smash all the debug information together into one large section and store offsets into that giant section. This is similar to the `name`-section scraping or the trap metadata section where one section contains all the data for all the modules in a component. This simplifies the object file parsing by only looking for one section name and doesn't add all that much complexity to serializing and looking up dwarf information as well.	2022-12-06 14:29:13 -06:00
Rainy Sinclair	51b6a0436c	Run differential fuzzing in non-trapping mode 90% of the time (#5385 )	2022-12-06 20:18:57 +00:00
Alex Crichton	2329ecc341	Add a `wasmtime::component::bindgen!` macro (#5317 ) * Import Wasmtime support from the `wit-bindgen` repo This commit imports the `wit-bindgen-gen-host-wasmtime-rust` crate from the `wit-bindgen` repository into the upstream Wasmtime repository. I've chosen to not import the full history here since the crate is relatively small and doesn't have a ton of complexity. While the history of the crate is quite long the current iteration of the crate's history is relatively short so there's not a ton of import there anyway. The thinking is that this can now continue to evolve in-tree. * Refactor `wasmtime-component-macro` a bit Make room for a `wit_bindgen` macro to slot in. * Add initial support for a `bindgen` macro * Add tests for `wasmtime::component::bindgen!` * Improve error forgetting `async` feature * Add end-to-end tests for bindgen * Add an audit of `unicase` * Add a license to the test-helpers crate * Add vet entry for `pulldown-cmark` * Update publish script with new crate * Try to fix publish script * Update audits * Update lock file	2022-12-06 13:06:00 -06:00
Trevor Elliott	293bb5b334	riscv64: Only emit jumps at the end of basic blocks (#5381 ) This PR fixes two bugs in the riscv64 backend, where branch instructions were emitted in the middle of a basic block: (1) constant emission, where the constants are inlined into the vcode and are jumped over at runtime; and (2) the BrTableCheck pseudo-instruction, which was always emitted before a BrTable instruction, and would handle jumping to the default label. The first bug was resolved by introducing two new pseudo-instructions, LoadConst32 and LoadConst64. Both of these instructions serve to delay the original encoding to emission time, after regalloc2 has run. The second bug was fixed by removing the BrTableCheck instruction. As it was always emitted directly before BrTable, it was easier to remove it and merge the two into a single instruction.	2022-12-06 10:54:10 -08:00
Chris Fallin	feaa7ca75f	Alias analysis: refactor for use by other driver loops. (#5380 ) * Alias analysis: refactor for use by other driver loops. This PR pulls the core of the alias analysis infrastructure into a `process_inst()` method that operates on a single instruction, and allows another compiler pass to apply store-to-load forwarding and redundant-load elimination interleaved with other work. The existing behavior remains unchanged; the pass's toplevel loop calls this extracted method. This refactor is a prerequisite for using the alias analysis as part of a refactored egraph-based optimization framework. * Review feedback: remove unneeded mut.	2022-12-06 18:30:02 +00:00
Alex Crichton	4a0cefb1aa	Fix a fuzz failure due to changing errors (#5384 ) Fix the `instantiate-many` fuzzer from a recent regression introduced in #5347 where an error message changed slightly. Ideally a concrete error type would be tested for here but that's deferred to a future PR.	2022-12-06 17:41:32 +00:00
Trevor Elliott	353a681671	Avoid reusing a register during constant loading (#5379 ) Avoid reusing a register when loading a constant, allocating a temporary instead.	2022-12-05 18:37:53 -08:00
Alex Crichton	4933762d81	Add release notes for 3.0.1 and update some versions (#5364 ) * Add release notes for 3.0.1 * Update some version directives for crates in Wasmtime * Mark anything with `publish = false` as version 0.0.0 * Mark the icache coherence crate with the same version as Wasmtime * Fix manifest directives	2022-12-06 01:26:39 +00:00
Trevor Elliott	7d28d586da	riscv64: Don't reuse registers when loading constants (#5376 ) Rework the constant loading functions in the riscv64 backend to generate fresh temporaries instead of reusing the destination register.	2022-12-05 16:51:52 -08:00
Saúl Cabrera	28cfa57533	cranelift: Small documentation fixes (#5377 ) * `translate_operator` doesn't return a boolean. * `from_base_offset` doesn't panic if offset is smaller than base.	2022-12-06 00:46:58 +00:00
Trevor Elliott	817c2b205c	riscv64: Use a temporary when translating shift amount (#5375 ) Use a temporary when translating the shift amount, instead of reusing the destination register.	2022-12-05 20:54:14 +00:00
Trevor Elliott	b475b9bd19	Terminate blocks with a single branch in riscv64 (#5374 ) Ensure that we're terminating blocks with a single branch instruction, when testing I128 values against zero.	2022-12-05 20:13:28 +00:00
Alex Crichton	46e0ad4f62	Update release notes for 4.0.0 (#5373 )	2022-12-05 10:31:51 -06:00
Anton Romanov	29d4d1063f	[codegen] Fixed mutability of domtree reference (#5371 )	2022-12-05 08:19:32 -08:00
Trevor Elliott	6aea8e0d7e	Don't reuse destination registers when lowering splat on aarch64 (#5370 )	2022-12-05 08:18:49 -08:00
wasmtime-publish	a28d4d3c89	Bump Wasmtime to 5.0.0 (#5372 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-12-05 08:38:57 -06:00
Trevor Elliott	2e9b0802ab	aarch64: Rework amode compilation to produce SSA code (#5369 ) Rework the compilation of amodes in the aarch64 backend to stop reusing registers and instead generate fresh virtual registers for intermediates. This resolves some SSA checker violations with the aarch64 backend, and as a nice side-effect removes some unnecessary movs in the generated code.	2022-12-02 01:23:15 +00:00
Trevor Elliott	d54a27d0ea	Allocate temporary intermediates when loading constants on aarch64 (#5366 ) As loading constants on aarch64 can take up to 4 instructions, we need to plumb through some additional registers. Rather than pass a fixed list of registers in, pass an allocation function.	2022-12-01 22:29:36 +00:00
Alex Crichton	03715dda9d	Tidy up some internals of instance allocation (#5346 ) * Simplify the `ModuleRuntimeInfo` trait slightly Fold two functions into one as they're only called from one location anyway. * Remove ModuleRuntimeInfo::signature This is redundant as the array mapping is already stored within the `VMContext` so that can be consulted rather than having a separate trait function for it. This required altering the `Global` creation slightly to work correctly in this situation. * Remove a now-dead constant * Shared `VMOffsets` across instances This commit removes the computation of `VMOffsets` to being per-module instead of per-instance. The `VMOffsets` structure is also quite large so this shaves off 112 bytes per instance which isn't a huge impact but should help lower the cost of instantiating small modules. * Remove `InstanceAllocator::adjust_tunables` This is no longer needed or necessary with the pooling allocator. * Fix compile warning * Fix a vtune warning * Fix pooling tests * Fix another test warning	2022-12-01 22:22:08 +00:00
Alex Crichton	ed6769084b	Add a `WasmBacktrace::new()` constructor (#5341 ) * Add a `WasmBacktrace::new()` constructor This commit adds a method of manually capturing a backtrace of WebAssembly frames within a `Store`. The new constructor can be called with any `AsContext` values, primarily `&Store` and `&Caller`, during host functions to inspect the calling state. For now this does not respect the `Config::wasm_backtrace` option and instead unconditionally captures the backtrace. It's hoped that this can continue to adapt to needs of embedders by making it more configurable in the future if necessary. Closes #5339 * Split `new` into `capture` and `force_capture`	2022-12-01 22:19:07 +00:00
Alex Crichton	e0b9663e44	Remove some custom error types in Wasmtime (#5347 ) * Remove some custom error types in Wasmtime These types are mostly cumbersome to work with nowadays that `anyhow` is used everywhere else. This commit removes `InstantiationError` and `SetupError` in favor of using `anyhow::Error` throughout. This can eventually culminate in creation of specific errors for embedders to downcast to but for now this should be general enough. * Fix Windows build	2022-12-01 14:47:10 -06:00
Nick Fitzgerald	4510a4a805	Cranelift: mark post-legalization trapping blocks as cold (#5367 ) Trapping is a rare event.	2022-12-01 12:46:26 -08:00
Nick Fitzgerald	1eeec7b698	cranelift-wasm: Remove `ModuleTranslationState` (#5365 ) * cranelift-wasm: Remove `ModuleTranslationState` We were putting data into it, but never reading data out of it. Can be removed. * cranelift-wasm: move `funct_state.rs` sub module to `state.rs` Since it is the only submodule of `state` it can just be the whole `state` module.	2022-12-01 19:04:36 +00:00
Nam Junghyun	ebb693aa18	Move precompiled module detection into wasmtime (#5342 ) * Treat `-` as an alias to `/dev/stdin` This applies to unix targets only, as Windows does not have an appropriate alternative. * Add tests for piped modules from stdin This applies to unix targets only, as Windows does not have an appropriate alternative. * Move precompiled module detection into wasmtime Previously, wasmtime-cli checked the module to be loaded is precompiled or not, by pre-opening the given file path to check if the "\x7FELF" header exists. This commit moves this branch into the `Module::from_trusted_file`, which is only invoked with `--allow-precompiled` flag on CLI. The initial motivation of the commit is, feeding a module to wasmtime from piped inputs, is blocked by the pre-opening of the module. The `Module::from_trusted_file`, assumes the --allow-precompiled flag so there is no piped inputs, happily mmap-ing the module to test if the header exists. If --allow-precompiled is not supplied, the existing `Module::from_file` will be used, without the additional header check as the precompiled modules are intentionally not allowed on piped inputs for security measures. One caveat of this approach is that the user may be confused if he or she tries to execute a precompiled module without --allow-precompiled, as wasmtime shows an 'input bytes aren't valid utf-8' error, not directly getting what's going wrong. So this commit includes a hack-ish workaround for this. Thanks to @jameysharp for suggesting this idea with a detailed guidance.	2022-12-01 09:13:39 -08:00
Trevor Elliott	37c3c5b1e0	Remove an unnecessary debug trace (#5359 )	2022-11-30 20:37:20 -08:00
Trevor Elliott	c16f2956db	Allocate a temporary for 64-bit constant loads in the s390x backend (#5357 ) Avoid reusing a destination virtual register for 64-bit constants in the s390x backend. This change addresses a case identified by the regalloc2 ssa validator, as the destination register was written to twice when constants were generated via the MachInst::gen_constant function.	2022-11-30 17:01:14 -08:00
Jamey Sharp	0e65f87e37	cranelift-isle: Reject unreachable rules (#5322 ) Some of our ISLE rules can never fire because there's a higher-priority rule that will always fire instead. Sometimes the worst that can happen is we generate sub-optimal output. That's not so bad but we'd still like to know about it so we can fix it. In other cases there might be instructions which can't be lowered in isolation. If a general rule for lowering one of the instructions is higher-priority than the rule for lowering the combined sequence, then lowering the combined sequence will always fail. Either way, this is always a bug, so make it a fatal error if we can detect it.	2022-11-30 15:06:00 -08:00
Trevor Elliott	d8dbabfe6b	Don't reuse registers in the x64 div lowering (#5356 ) Introduce a temporary for an intermediate value in the lowering of div in the x64 backend. Additionally, add a src argument to the shift_r smart constructor, which is why the diff got larger than just the div lowering.	2022-11-30 22:44:59 +00:00
Trevor Elliott	87b63174b1	Don't reuse registers in make_i64x2_from_lanes (#5355 ) Avoid reusing output registers in make_i64x2_from_lanes by threading the output name instead, and using smart constructors for x64_pinsrd instead of constructing the instructions directly.	2022-11-30 14:37:01 -08:00
Nick Fitzgerald	79f7fa6079	Cranelift: implement `heap_{load,store}` instruction legalization (#5351 ) * Cranelift: implement `heap_{load,store}` instruction legalization This does not remove `heap_addr` yet, but it does factor out the common bounds-check-and-compute-the-native-address functionality that is shared between all of `heap_{addr,load,store}`. Finally, this adds a missing optimization for when we can dedupe explicit bounds checks for static memories and Spectre mitigations. * Cranelift: Enable `heap_load_store_*` run tests on all targets	2022-11-30 19:12:49 +00:00
Alex Crichton	830885383f	Implement inline stack probes for AArch64 (#5353 ) * Turn off probestack by default in Cranelift The probestack feature is not implemented for the aarch64 and s390x backends and currently the on-by-default status requires the aarch64 and s390x implementations to be a stub. Turning off probestack by default allows the s390x and aarch64 backends to panic with an error message to avoid providing a false sense of security. When the probestack option is implemented for all backends, however, it may be reasonable to re-enable. * aarch64: Improve codegen for AMode fallback Currently the final fallback for finalizing an `AMode` will generate both a constant-loading instruction as well as an `add` instruction to the base register into the same temporary. This commit improves the codegen by removing the `add` instruction and folding the final add into the finalized `AMode`. This changes the `extendop` used but both registers are 64-bit so shouldn't be affected by the extending operation. * aarch64: Implement inline stack probes This commit implements inline stack probes for the aarch64 backend in Cranelift. The support here is modeled after the x64 support where unrolled probes are used up to a particular threshold after which a loop is generated. The instructions here are similar in spirit to x64 except that unlike x64 the stack pointer isn't modified during the unrolled loop to avoid needing to re-adjust it back up at the end of the loop. * Enable inline probestack for AArch64 and Riscv64 This commit enables inline probestacks for the AArch64 and Riscv64 architectures in the same manner that x86_64 has it enabled now. Some more testing was additionally added since on Unix platforms we should be guaranteed that Rust's stack overflow message is now printed too. 
* Enable probestack for aarch64 in cranelift-fuzzgen * Address review comments * Remove implicit stack overflow traps from x64 backend This commit removes implicit `StackOverflow` traps inserted by the x64 backend for stack-based operations. This was historically required when stack overflow was detected with page faults but Wasmtime no longer requires that since it's not suitable for wasm modules which call host functions. Additionally no other backend implements this form of implicit trap-code additions so this is intended to synchronize the behavior of all the backends. This fixes a test added prior for aarch64 to properly abort the process instead of accidentally being caught by Wasmtime. * Fix a style issue	2022-11-30 12:30:00 -06:00
Peter Huene	8bc7550211	wasmtime: enable stack probing for x86_64 targets. (#5350 ) * wasmtime: enable stack probing for x86_64 targets. This commit unconditionally enables stack probing for x86_64 targets. On Windows, stack probing is always required because of the way Windows commits stack pages (via guard page access). Fixes #5340. * Remove SIMD types from test case.	2022-11-30 09:57:53 -06:00
Timothy Chen	67fc5389b0	Remove sig data arg and ret fields to reduce size (#5319 ) * Remove sig data arg and ret fields to reduce size * Update cranelift/codegen/src/machinst/abi.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * Update cranelift/codegen/src/machinst/abi.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix offsets * Add comment Co-authored-by: Jamey Sharp <jamey@minilop.net>	2022-11-30 07:19:41 -08:00
Benjamin Bouvier	2bb1fb08fa	Flush icache on android aarch64 too (#5331 )	2022-11-30 07:15:34 -08:00
Thibault Charbonnier	e7cb82af89	c-api: add wasm_config_parallel_compilation_set (#5298 )	2022-11-29 23:03:05 +00:00
Alex Crichton	86acb9a438	Use workspace inheritance for some more dependencies (#5349 ) Deduplicate some dependency directives through `[workspace.dependencies]`	2022-11-29 22:32:56 +00:00
Nick Fitzgerald	2ad3f78624	Cranelift: fix `heap_{load,store}` test generator script (#5348 ) Was missing some '$' characters and so was comparing string literals against string literals instead of variable values against string literals. Regenerated tests to fix them and add missing tests.	2022-11-29 20:53:14 +00:00
Nick Fitzgerald	913a2ec8c8	Cranelift: consider heap's guard pages when legalizing `heap_addr` (#5335 ) * Cranelift: consider heap's guard pages when legalizing `heap_addr` Fixes #5328 * Update comment to align more directly with implementation * Add legalization tests for `heap_addr` and offset guard pages	2022-11-29 19:54:25 +00:00
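
The constant-propagation bugfix noted in commit f980defe17 (`(x + k1) - k2` becomes `x - (k2 - k1)`, not `x - (k1 - k2)`) can be checked with a few lines of Rust. This is only an illustrative sketch: the function names `original`, `folded_fixed`, and `folded_buggy` are hypothetical and are not part of Cranelift, and wrapping 64-bit arithmetic is assumed as a stand-in for CLIF integer semantics.

```rust
/// The unoptimized expression `(x + k1) - k2`, using wrapping arithmetic.
fn original(x: i64, k1: i64, k2: i64) -> i64 {
    x.wrapping_add(k1).wrapping_sub(k2)
}

/// The corrected rewrite: fold the two constants as `k2 - k1`.
fn folded_fixed(x: i64, k1: i64, k2: i64) -> i64 {
    x.wrapping_sub(k2.wrapping_sub(k1))
}

/// The buggy rewrite: the constants are subtracted in the wrong order.
fn folded_buggy(x: i64, k1: i64, k2: i64) -> i64 {
    x.wrapping_sub(k1.wrapping_sub(k2))
}

fn main() {
    let (x, k1, k2) = (10, 3, 5);

    // (10 + 3) - 5 = 8, and 10 - (5 - 3) = 8.
    assert_eq!(original(x, k1, k2), folded_fixed(x, k1, k2));

    // 10 - (3 - 5) = 12, which differs from the original value.
    assert_ne!(original(x, k1, k2), folded_buggy(x, k1, k2));

    // The identity also holds when the intermediate sum wraps around.
    assert_eq!(original(i64::MAX, 1, 2), folded_fixed(i64::MAX, 1, 2));

    println!("original = {}", original(x, k1, k2));
    println!("fixed    = {}", folded_fixed(x, k1, k2));
    println!("buggy    = {}", folded_buggy(x, k1, k2));
}
```

The two rewrites agree only when `k1 == k2` (or when `k1 - k2` happens to equal its own negation modulo 2^64), so a test with distinct constants is enough to tell them apart.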

10630 Commits