wasmtime

Author	SHA1	Message	Date
Jan-Justin van Tonder	db8fe0108f	cranelift: Add big and little endian memory accesses to interpreter (#5893 ) * Added `mem_flags` parameter to `State::checked_{load,store}` as the means for determining the endianness, typically derived from an instruction. * Added `native_endianness` property to `InterpreterState` as fallback when determining endianness, such as in cases where there are no memory flags avaiable or set. * Added `to_be` and `to_le` methods to `DataValue`. * Added `AtomicCas` and `AtomicRmw` to list of instructions with retrievable memory flags for `InstructionData::memflags`. * Enabled `atomic-{cas,rmw}-subword-{big,little}.clif` for interpreter run tests.	2023-03-02 11:57:01 +00:00
Chris Fallin	7b8854f803	egraphs: fix handling of effectful-but-idempotent ops and GVN. (#5800 ) * Revert "egraphs: disable GVN of effectful idempotent ops (temporarily). (#5808)" This reverts commit `c7e2571866`. * egraphs: fix handling of effectful-but-idempotent ops and GVN. This PR addresses #5796: currently, ops that are effectful, i.e., remain in the side-effecting skeleton (which we keep in the `Layout` while the egraph exists), but are idempotent and thus mergeable by a GVN pass, are not handled properly. GVN is still possible on effectful but idempotent ops precisely because our GVN does not create partial redundancies: it removes an instruction only when it is dominated by an identical instruction. An isntruction will not be "hoisted" to a point where it could execute in the optimized code but not in the original. However, there are really two parts to the egraph implementation that produce this effect: the deduplication on insertion into the egraph, and the elaboration with a scoped hashmap. The deduplication lets us give a single name (value ID) to all copies of an identical instruction, and then elaboration will re-create duplicates if GVN should not hoist or merge some of them. Because deduplication need not worry about dominance or scopes, we use a simple (non-scoped) hashmap to dedup/intern ops as "egraph nodes". When we added support for GVN'ing effectful but idempotent ops (#5594), we kept the use of this simple dedup'ing hashmap, but these ops do not get elaborated; instead they stay in the side-effecting skeleton. Thus, we inadvertently created potential for weird code-motion effects. The proposal in #5796 would solve this in a clean way by treating these ops as pure again, and keeping them out of the skeleton, instead putting "force" pseudo-ops in the skeleton. However, this is a little more complex than I would like, and I've realized that @jameysharp's earlier suggestion is much simpler: we can keep an actual scoped hashmap separately just for the effectful-but-idempotent ops, and use it to GVN while we build the egraph. In effect, we're fusing a separate GVN pass with the egraph pass (but letting it interact corecursively with egraph rewrites. This is in principle similar to how we keep a separate map for loads and fuse this pass with the egraph rewrite pass as well. Note that we can use a `ScopedHashMap` here without the "context" (as needed by `CtxHashMap`) because, as noted by @jameysharp, in practice the ops we want to GVN have all their args inline. Equality on the `InstructinoData` itself is conservative: two insts whose struct contents compare shallowly equal are definitely identical, but identical insts in a deep-equality sense may not compare shallowly equal, due to list indirection. This is fine for GVN, because it is still sound to skip any given GVN opportunity (and keep the original instructions). Fixes #5796. * Add comments from review.	2023-03-02 02:10:42 +00:00
Afonso Bordado	2dd6064005	fuzzgen: Generate multiple functions per testcase (#5765 ) * fuzzgen: Generate multiple functions per testcase * fuzzgen: Fix typo Co-authored-by: Jamey Sharp <jamey@minilop.net> --------- Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-02-28 18:47:09 +00:00
Afonso Bordado	fc080c739e	fuzzgen: Add `AtomicRMW` (#5861 )	2023-02-23 18:34:28 +00:00
Trevor Elliott	80c147d9c0	Rework br_table to use BlockCall (#5731 ) Rework br_table to use BlockCall, allowing us to avoid adding new nodes during ssa construction to hold block arguments. Additionally, many places where we previously matched on InstructionData to extract branch destinations can be replaced with a use of branch_destination or branch_destination_mut.	2023-02-16 09:23:27 -08:00
Trevor Elliott	19f337e29b	Move the default block to the front of the underlying jump table storage (#5770 ) The new api on JumpTableData makese it easy to keep the default label first, and that shrinks the diff in #5731 a bit.	2023-02-13 20:50:29 +00:00
Trevor Elliott	d99783fc91	Move default blocks into jump tables (#5756 ) Move the default block off of the br_table instrution, and into the JumpTable that it references.	2023-02-10 08:53:30 -08:00
Trevor Elliott	15fe9c7c93	Inline jump tables in parsed br_table instructions (#5755 ) As jump tables are used by at most one br_table instruction, inline their definition in those instructions instead of requiring them to be declared as function-level metadata.	2023-02-09 14:24:04 -08:00
bjorn3	202d3af16a	Remove the unused sigid argument purpose (#5753 )	2023-02-09 09:18:39 -08:00
Trevor Elliott	b0b3f67cb0	Move jump tables to the DataFlowGraph (#5745 ) Move the storage for jump tables off of FunctionStencil and onto DataFlowGraph. This change is in service of #5731, making it easier to access the jump table data in the context of helpers like inst_values.	2023-02-07 21:21:35 -08:00
Trevor Elliott	d71c9458dc	Make `DataFlowGraph::blocks` public (#5740 ) Similar to when we exposed the DataFlowGraph::insts field through a restrictive newtype, expose DataFlowGraph::blocks through an interface that allows a restrictive set of operations. This field being public now allows us to avoid a rematch in ssa construction, and simplifies the implementation of adding a block argument to a block referenced by a br_table instruction.	2023-02-07 17:11:14 -08:00
Trevor Elliott	3343cf80e9	Add assertions for matches that used to use analyze_branch (#5733 ) Following up from #5730, add debug assertions to ensure that new branch instructions don't slip through matches that used to use analyze_branch.	2023-02-07 14:51:18 -08:00
Trevor Elliott	2c8425998b	Refactor matches that used to consume BranchInfo (#5734 ) Explicitly borrow the instruction data, and use a mutable borrow to avoid rematch.	2023-02-07 13:29:42 -08:00
Chris Fallin	673b448cfe	Cranelift DFG: make inst clone deep-clone varargs lists. (#5727 ) When investigating #5716, I found that rematerialization of a `call`, in addition to blowing up for other reasons, caused aliasing of the varargs list (the `EntityList` in the `ListPool`), such that editing the args of the second copy of the call instruction inadvertently updated the first as well. This PR modifies `DataFlowGraph::clone_inst` so that it always clones the varargs list if present. This shouldn't have any functional impact on Cranelift today, because we don't rematerialize any instructions with varargs; but it's important to get it right to avoid a bug later!	2023-02-07 01:21:09 +00:00
Trevor Elliott	c8a6adf825	Remove analyze_branch and BranchInfo (#5730 ) We don't have overlap in behavior for branch instructions anymore, so we can remove analyze_branch and instead match on the InstructionData directly. Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-02-06 17:06:57 -08:00
Nick Fitzgerald	bdfb746548	Cranelift: Introduce the `return_call` and `return_call_indirect` instructions (#5679 ) * Cranelift: Introduce the `tail` calling convention This is an unstable-ABI calling convention that we will eventually use to support Wasm tail calls. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * Cranelift: Introduce the `return_call` and `return_call_indirect` instructions These will be used to implement tail calls for Wasm and any other language targeting CLIF. The `return_call_indirect` instruction differs from the Wasm instruction of the same name by taking a native address callee rather than a Wasm function index. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * Cranelift: Implement verification rules for `return_call[_indirect]` They must: * have the same return types between the caller and callee, * have the same calling convention between caller and callee, * and that calling convention must support tail calls. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * cargo fmt --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2023-02-01 21:20:35 +00:00
Trevor Elliott	e82995f03c	Add a convenience function for displaying a BlockCall (#5677 ) Add a display method to BlockCall that returns a std::fmt::Displayable result. Rework the display code in the write module of cranelift-codegen to use this method instead.	2023-01-31 14:26:10 -08:00
Trevor Elliott	a5698cedf8	cranelift: Remove brz and brnz (#5630 ) Remove the brz and brnz instructions, as their behavior is now redundant with brif.	2023-01-30 20:34:56 +00:00
Trevor Elliott	b58a197d33	cranelift: Add a conditional branch instruction with two targets (#5446 ) Add a conditional branch instruction with two targets: brif. This instruction will eventually replace brz and brnz, as it encompasses the behavior of both. This PR also changes the InstructionData layout for instruction formats that hold BlockCall values, taking the same approach we use for Value arguments. This allows branch_destination to return a slice to the BlockCall values held in the instruction, rather than requiring that we pattern match on InstructionData to fetch the then/else blocks. Function generation for fuzzing has been updated to generate uses of brif, and I've run the cranelift-fuzzgen target locally for hours without triggering any new failures.	2023-01-24 14:37:16 -08:00
Trevor Elliott	7cea73a81d	Refactor BranchInfo::Table to no longer have an optional default branch (#5593 )	2023-01-18 17:17:03 -08:00
Trevor Elliott	1e6c13d83e	cranelift: Rework block instructions to use BlockCall (#5464 ) Add a new type BlockCall that represents the pair of a block name with arguments to be passed to it. (The mnemonic here is that it looks a bit like a function call.) Rework the implementation of jump, brz, and brnz to use BlockCall instead of storing the block arguments as varargs in the instruction's ValueList. To ensure that we're processing block arguments from BlockCall values in instructions, three new functions have been introduced on DataFlowGraph that both sets of arguments: inst_values - returns an iterator that traverses values in the instruction and block arguments map_inst_values - applies a function to each value in the instruction and block arguments overwrite_inst_values - overwrite all values in an instruction and block arguments with values from the iterator Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-01-17 16:31:15 -08:00
Ayomide Bamidele	b47e644c3d	Remove vconcat and vsplit clif instructions (#5465 ) Fixes #5463. * remove vsplit instruction * remove vconcat instruction * remove unsused half/double vector helper functions * remove unused operand constraints * delete + inline Type::half_vector method	2022-12-20 00:41:55 +00:00
Trevor Elliott	25bf8e0e67	Make DataFlowGraph::insts public, but restricted (#5450 ) We have some operations defined on DataFlowGraph purely to work around borrow-checker issues with InstructionData and other data on DataFlowGraph. Part of the problem is that indexing the DFG directly hides the fact that we're only indexing the insts field of the DFG. This PR makes the insts field of the DFG public, but wraps it in a newtype that only allows indexing. This means that the borrow checker is better able to tell when operations on memory held by the DFG won't conflict, which comes up frequently when mutating ValueLists held by InstructionData.	2022-12-16 10:46:09 -08:00
Nick Fitzgerald	c0b587ac5f	Remove heaps from core Cranelift, push them into `cranelift-wasm` (#5386 ) * cranelift-wasm: translate Wasm loads into lower-level CLIF operations Rather than using `heap_{load,store,addr}`. * cranelift: Remove the `heap_{addr,load,store}` instructions These are now legalized in the `cranelift-wasm` frontend. * cranelift: Remove the `ir::Heap` entity from CLIF * Port basic memory operation tests to .wat filetests * Remove test for verifying CLIF heaps * Remove `heap_addr` from replace_branching_instructions_and_cfg_predecessors.clif test * Remove `heap_addr` from readonly.clif test * Remove `heap_addr` from `table_addr.clif` test * Remove `heap_addr` from the simd-fvpromote_low.clif test * Remove `heap_addr` from simd-fvdemote.clif test * Remove `heap_addr` from the load-op-store.clif test * Remove the CLIF heap runtest * Remove `heap_addr` from the global_value.clif test * Remove `heap_addr` from fpromote.clif runtests * Remove `heap_addr` from fdemote.clif runtests * Remove `heap_addr` from memory.clif parser test * Remove `heap_addr` from reject_load_readonly.clif test * Remove `heap_addr` from reject_load_notrap.clif test * Remove `heap_addr` from load_readonly_notrap.clif test * Remove `static-heap-without-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` to generating `.wat` tests. * Remove `static-heap-with-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` over to `.wat` tests. * Remove more heap tests These will be subsumed by porting `make-heap-load-store-tests.sh` over to `.wat` tests. * Remove `heap_addr` from `simple-alias.clif` test * Remove `heap_addr` from partial-redundancy.clif test * Remove `heap_addr` from multiple-blocks.clif test * Remove `heap_addr` from fence.clif test * Remove `heap_addr` from extends.clif test * Remove runtests that rely on heaps Heaps are not a thing in CLIF or the interpreter anymore * Add generated load/store `.wat` tests * Enable memory-related wasm features in `.wat` tests * Remove CLIF heap from fcmp-mem-bug.clif test * Add a mode for compiling `.wat` all the way to assembly in filetests * Also generate WAT to assembly tests in `make-load-store-tests.sh` * cargo fmt * Reinstate `f{de,pro}mote.clif` tests without the heap bits * Remove undefined doc link * Remove outdated SVG and dot file from docs * Add docs about `None` returns for base address computation helpers * Factor out `env.heap_access_spectre_mitigation()` to a local * Expand docs for `FuncEnvironment::heaps` trait method * Restore f{de,pro}mote+load clif runtests with stack memory	2022-12-15 00:26:45 +00:00
Ulrich Weigand	e913cf3647	Remove IFLAGS/FFLAGS types (#5406 ) All instructions using the CPU flags types (IFLAGS/FFLAGS) were already removed. This patch completes the cleanup by removing all remaining instructions that define values of CPU flags types, as well as the types themselves. Specifically, the following features are removed: - The IFLAGS and FFLAGS types and the SpecialType category. - Special handling of IFLAGS and FFLAGS in machinst/isle.rs and machinst/lower.rs. - The ifcmp, ifcmp_imm, ffcmp, iadd_ifcin, iadd_ifcout, iadd_ifcarry, isub_ifbin, isub_ifbout, and isub_ifborrow instructions. - The writes_cpu_flags instruction property. - The flags verifier pass. - Flags handling in the interpreter. All of these features are currently unused; no functional change intended by this patch. This addresses https://github.com/bytecodealliance/wasmtime/issues/3249.	2022-12-09 13:42:03 -08:00
Chris Fallin	f980defe17	egraph support: rewrite to work in terms of CLIF data structures. (#5382 ) * egraph support: rewrite to work in terms of CLIF data structures. This work rewrites the "egraph"-based optimization framework in Cranelift to operate on aegraphs (acyclic egraphs) represented in the CLIF itself rather than as a separate data structure to which and from which we translate the CLIF. The basic idea is to add a new kind of value, a "union", that is like an alias but refers to two other values rather than one. This allows us to represent an eclass of enodes (values) as a tree. The union node allows for a value to have multiple representations: either constituent value could be used, and (in well-formed CLIF produced by correct optimization rules) they must be equivalent. Like the old egraph infrastructure, we take advantage of acyclicity and eager rule application to do optimization in a single pass. Like before, we integrate GVN (during the optimization pass) and LICM (during elaboration). Unlike the old egraph infrastructure, everything stays in the DataFlowGraph. "Pure" enodes are represented as instructions that have values attached, but that are not placed into the function layout. When entering "egraph" form, we remove them from the layout while optimizing. When leaving "egraph" form, during elaboration, we can place an instruction back into the layout the first time we elaborate the enode; if we elaborate it more than once, we clone the instruction. The implementation performs two passes overall: - One, a forward pass in RPO (to see defs before uses), that (i) removes "pure" instructions from the layout and (ii) optimizes as it goes. As before, we eagerly optimize, so we form the entire union of optimized forms of a value before we see any uses of that value. This lets us rewrite uses to use the most "up-to-date" form of the value and canonicalize and optimize that form. The eager rewriting and acyclic representation make each other work (we could not eagerly rewrite if there were cycles; and acyclicity does not miss optimization opportunities only because the first time we introduce a value, we immediately produce its "best" form). This design choice is also what allows us to avoid the "parent pointers" and fixpoint loop of traditional egraphs. This forward optimization pass keeps a scoped hashmap to "intern" nodes (thus performing GVN), and also interleaves on a per-instruction level with alias analysis. The interleaving with alias analysis allows alias analysis to see the most optimized form of each address (so it can see equivalences), and allows the next value to see any equivalences (reuses of loads or stored values) that alias analysis uncovers. - Two, a forward pass in domtree preorder, that "elaborates" pure enodes back into the layout, possibly in multiple places if needed. This tracks the loop nest and hoists nodes as needed, performing LICM as it goes. Note that by doing this in forward order, we avoid the "fixpoint" that traditional LICM needs: we hoist a def before its uses, so when we place a node, we place it in the right place the first time rather than moving later. This PR replaces the old (a)egraph implementation. It removes both the cranelift-egraph crate and the logic in cranelift-codegen that uses it. On `spidermonkey.wasm` running a simple recursive Fibonacci microbenchmark, this work shows 5.5% compile-time reduction and 7.7% runtime improvement (speedup). Most of this implementation was done in (very productive) pair programming sessions with Jamey Sharp, thus: Co-authored-by: Jamey Sharp <jsharp@fastly.com> * Review feedback. * Review feedback. * Review feedback. * Bugfix: cprop rule: `(x + k1) - k2` becomes `x - (k2 - k1)`, not `x - (k1 - k2)`. Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2022-12-06 14:58:57 -08:00
Saúl Cabrera	28cfa57533	cranelift: Small documentation fixes (#5377 ) * `translate_operator` doesn't return a boolean. * `from_base_offset` doesn't panic if offset is smaller than base.	2022-12-06 00:46:58 +00:00
Nick Fitzgerald	d0d3245a35	Cranelift: Add `heap_load` and `heap_store` instructions (#5300 ) * Cranelift: Define `heap_load` and `heap_store` instructions * Cranelift: Implement interpreter support for `heap_load` and `heap_store` * Cranelift: Add a suite runtests for `heap_{load,store}` There are so many knobs we can twist for heaps and I wanted to exhaustively test all of them, so I wrote a script to generate the tests. I've checked in the script in case we want to make any changes in the future, but I don't think it is worth adding this to CI to check that scripts are up to date or anything like that. * Review feedback	2022-11-21 23:00:39 +00:00
Nick Fitzgerald	3b6544dc66	Fix warnings in `cranelift-codegen` docs builds (#5292 )	2022-11-17 21:13:24 +00:00
Nick Fitzgerald	c2a7ea7e24	Cranelift: de-duplicate bounds checks in legalizations (#5190 ) * Cranelift: Add the `DataFlowGraph::display_value_inst` convenience method * Cranelift: Add some `trace!` logs to some parts of legalization * Cranelift: de-duplicate bounds checks in legalizations When both (1) "dynamic" memories that need explicit bounds checks and (2) spectre mitigations that perform bounds checks are enabled, reuse the same bounds checks between the two legalizations. This reduces the overhead of explicit bounds checks and spectre mitigations over using virtual memory guard pages with spectre mitigations from ~1.9-2.1x overhead to ~1.6-1.8x overhead. That is about a 14-19% speed up for when dynamic memories and spectre mitigations are enabled. <details> ``` execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm Δ = 3422455129.47 ± 120159.49 (confidence = 99%) virtual-memory-guards.so is 2.09x to 2.09x faster than bounds-checks.so! [6563931659 6564063496.07 6564301535] bounds-checks.so [3141492675 3141608366.60 3141895249] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm Δ = 338716136.87 ± 1.38 (confidence = 99%) virtual-memory-guards.so is 2.08x to 2.08x faster than bounds-checks.so! [651961494 651961495.47 651961497] bounds-checks.so [313245357 313245358.60 313245362] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 22742944.07 ± 331.73 (confidence = 99%) virtual-memory-guards.so is 1.87x to 1.87x faster than bounds-checks.so! [48841295 48841567.33 48842139] bounds-checks.so [26098439 26098623.27 26099479] virtual-memory-guards.so ``` </details> <details> ``` execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm Δ = 2465900207.27 ± 146476.61 (confidence = 99%) virtual-memory-guards.so is 1.78x to 1.78x faster than de-duped-bounds-checks.so! [5607275431 5607442989.13 5607838342] de-duped-bounds-checks.so [3141445345 3141542781.87 3141711213] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm Δ = 234253620.20 ± 2.33 (confidence = 99%) virtual-memory-guards.so is 1.75x to 1.75x faster than de-duped-bounds-checks.so! [547498977 547498980.93 547498985] de-duped-bounds-checks.so [313245357 313245360.73 313245363] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 16605659.13 ± 315.78 (confidence = 99%) virtual-memory-guards.so is 1.64x to 1.64x faster than de-duped-bounds-checks.so! [42703971 42704284.40 42704787] de-duped-bounds-checks.so [26098432 26098625.27 26099234] virtual-memory-guards.so ``` </details> <details> ``` execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm Δ = 104462517.13 ± 7.32 (confidence = 99%) de-duped-bounds-checks.so is 1.19x to 1.19x faster than bounds-checks.so! [651961493 651961500.80 651961532] bounds-checks.so [547498981 547498983.67 547498989] de-duped-bounds-checks.so execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm Δ = 956556982.80 ± 103034.59 (confidence = 99%) de-duped-bounds-checks.so is 1.17x to 1.17x faster than bounds-checks.so! [6563930590 6564019842.40 6564243651] bounds-checks.so [5607307146 5607462859.60 5607677763] de-duped-bounds-checks.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 6137307.87 ± 247.75 (confidence = 99%) de-duped-bounds-checks.so is 1.14x to 1.14x faster than bounds-checks.so! [48841303 48841472.93 48842000] bounds-checks.so [42703965 42704165.07 42704718] de-duped-bounds-checks.so ``` </details> * Update test expectations * Add a test for deduplicating bounds checks between dynamic memories and spectre mitigations * Define a struct for the Spectre comparison instead of using a tuple * More trace logging for heap legalization	2022-11-15 08:47:22 -08:00
Trevor Elliott	aeceea28e2	Remove trapif and trapff (#5162 ) This branch removes the trapif and trapff instructions, in favor of using an explicit comparison and trapnz. This moves us closer to removing iflags and fflags, but introduces the need to implement instructions like iadd_cout in the x64 and aarch64 backends.	2022-11-03 09:25:11 -07:00
Alex Crichton	2afaac5181	Return `anyhow::Error` from host functions instead of `Trap`, redesign `Trap` (#5149 ) * Return `anyhow::Error` from host functions instead of `Trap` This commit refactors how errors are modeled when returned from host functions and additionally refactors how custom errors work with `Trap`. At a high level functions in Wasmtime that previously worked with `Result<T, Trap>` now work with `Result<T>` instead where the error is `anyhow::Error`. This includes functions such as: * Host-defined functions in a `Linker<T>` * `TypedFunc::call` * Host-related callbacks like call hooks Errors are now modeled primarily as `anyhow::Error` throughout Wasmtime. This subsequently removes the need for `Trap` to have the ability to represent all host-defined errors as it previously did. Consequently the `From` implementations for any error into a `Trap` have been removed here and the only embedder-defined way to create a `Trap` is to use `Trap::new` with a custom string. After this commit the distinction between a `Trap` and a host error is the wasm backtrace that it contains. Previously all errors in host functions would flow through a `Trap` and get a wasm backtrace attached to them, but now this only happens if a `Trap` itself is created meaning that arbitrary host-defined errors flowing from a host import to the other side won't get backtraces attached. Some internals of Wasmtime itself were updated or preserved to use `Trap::new` to capture a backtrace where it seemed useful, such as when fuel runs out. The main motivation for this commit is that it now enables hosts to thread a concrete error type from a host function all the way through to where a wasm function was invoked. Previously this could not be done since the host error was wrapped in a `Trap` that didn't provide the ability to get at the internals. A consequence of this commit is that when a host error is returned that isn't a `Trap` we'll capture a backtrace and then won't have a `Trap` to attach it to. To avoid losing the contextual information this commit uses the `Error::context` method to attach the backtrace as contextual information to ensure that the backtrace is itself not lost. This is a breaking change for likely all users of Wasmtime, but it's hoped to be a relatively minor change to workaround. Most use cases can likely change `-> Result<T, Trap>` to `-> Result<T>` and otherwise explicit creation of a `Trap` is largely no longer necessary. * Fix some doc links * add some tests and make a backtrace type public (#55) * Trap: avoid a trailing newline in the Display impl which in turn ends up with three newlines between the end of the backtrace and the `Caused by` in the anyhow Debug impl * make BacktraceContext pub, and add tests showing downcasting behavior of anyhow::Error to traps or backtraces * Remove now-unnecesary `Trap` downcasts in `Linker::module` * Fix test output expectations * Remove `Trap::i32_exit` This commit removes special-handling in the `wasmtime::Trap` type for the i32 exit code required by WASI. This is now instead modeled as a specific `I32Exit` error type in the `wasmtime-wasi` crate which is returned by the `proc_exit` hostcall. Embedders which previously tested for i32 exits now downcast to the `I32Exit` value. * Remove the `Trap::new` constructor This commit removes the ability to create a trap with an arbitrary error message. The purpose of this commit is to continue the prior trend of leaning into the `anyhow::Error` type instead of trying to recreate it with `Trap`. A subsequent simplification to `Trap` after this commit is that `Trap` will simply be an `enum` of trap codes with no extra information. This commit is doubly-motivated by the desire to always use the new `BacktraceContext` type instead of sometimes using that and sometimes using `Trap`. Most of the changes here were around updating `Trap::new` calls to `bail!` calls instead. Tests which assert particular error messages additionally often needed to use the `:?` formatter instead of the `{}` formatter because the prior formats the whole `anyhow::Error` and the latter only formats the top-most error, which now contains the backtrace. * Merge `Trap` and `TrapCode` With prior refactorings there's no more need for `Trap` to be opaque or otherwise contain a backtrace. This commit parse down `Trap` to simply an `enum` which was the old `TrapCode`. All various tests and such were updated to handle this. The main consequence of this commit is that all errors have a `BacktraceContext` context attached to them. This unfortunately means that the backtrace is printed first before the error message or trap code, but given all the prior simplifications that seems worth it at this time. * Rename `BacktraceContext` to `WasmBacktrace` This feels like a better name given how this has turned out, and additionally this commit removes having both `WasmBacktrace` and `BacktraceContext`. * Soup up documentation for errors and traps * Fix build of the C API Co-authored-by: Pat Hickey <pat@moreproductive.org>	2022-11-02 16:29:31 +00:00
Afonso Bordado	4867813f77	cranelift: Remove `copy` instruction (#5125 )	2022-10-25 17:27:33 -07:00
Chris Fallin	e62e530b7c	egraphs: fix fill-in-the-types logic for multiple projections of one value. (#5112 ) In particular, this was found to happen in #5099 because a `Result` projection node was not deduplicating across two separate `isplit`s that created it. (This is a separate issue we should also fix; `needs_dedup` is I think overly conservative because `Result` can project out a single value from a pure or impure node, but the projection itself should be treated like any other pure operator.) In any case, if we have a value `v0` and two separate `Result { value: v0, result: N, ty }` nodes, each of these will fill in the type `ty` for the `N`th output of `v0`, and the second will idempotently overwrite the first; we should loosen the assert so that it allows this case. Fixes #5099. Fixes #5100.	2022-10-25 05:22:28 +00:00
Trevor Elliott	ec12415b1f	cranelift: Remove redundant branch and select instructions (#5097 ) As discussed in the 2022/10/19 meeting, this PR removes many of the branch and select instructions that used iflags, in favor if using brz/brnz and select in their place. Additionally, it reworks selectif_spectre_guard to take an i8 input instead of an iflags input. For reference, the removed instructions are: br_icmp, brif, brff, trueif, trueff, and selectif.	2022-10-24 16:14:35 -07:00
Chris Fallin	86e77953f8	Fix some egraph-related issues. (#5088 ) This fixes #5086 by addressing two separate issues: - The `ValueDataPacked::set_type()` helper had an embarrassing bitfield-manipulation bug that would mangle the rest of a `ValueDef` when setting its type. This is not normally used, only when the egraph elaboration fills in types after-the-fact on a multi-value node. - The lowering rules for `isplit` on aarch64 and s390x were dispatching on the first output type, rather than the input type. When only the second output is used (as in the example in #5086), the first output type actually remains `INVALID` (and this is fine because it's never used).	2022-10-21 10:24:48 -07:00
Chris Fallin	c392e461a3	egraphs: a few miscellaneous compile-time optimizations. (#5072 ) * egraphs: a few miscellaneous compile-time optimizations. These optimizations together are worth about a 2% compile-time reduction, as measured on one core with spidermonkey.wasm as an input, using `hyperfine` on `wasmtime compile`. The changes included are: - Some better pre-allocation (blockparams and side-effects concatenated list vecs); - Avoiding the indirection of storing list-of-types for every Pure and Inst node, when almost all nodes produce only a single result; instead, store arity and single type if it exists, and allow result projection nodes to fill in types otherwise; - Pack the `MemoryState` enum into one `u32` (this together with the above removal of the type slice allows `Node` to shrink from 48 bytes to 32 bytes); - always-inline an accessor (`entry` on `CtxHash`) that wasn't (`always(inline)` appears to be load-bearing, rather than just `inline`); - Split the update-analysis path into two hotpaths, one for the union case and one for the new-node case (and the former can avoid recomputing for the contained node when replacing a node with node-and-child eclass entry). * Review feedback. * Fix test build. * Fix to lowering when unused output with invalid type is present.	2022-10-19 11:05:00 -07:00
Trevor Elliott	32a7593c94	cranelift: Remove booleans (#5031 ) Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements. Fixes #3205 Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com> Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Co-authored-by: Chris Fallin <chris@cfallin.org>	2022-10-17 16:00:27 -07:00
Afonso Bordado	766ecb561e	fuzzgen: Always generate reachable blocks (#5034 ) * fuzzgen: Always reachable blocks * fuzzgen: Rename BlockTerminator * fuzzgen: Rename `finalize_block` * fuzzgen: Use `cloned` instead of map clone Thanks @jameysharp! Co-authored-by: Jamey Sharp <jamey@minilop.net> * fuzzgen: `rustfmt` * fuzzgen: Document paramless targets * fuzzgen: Add `BlockTerminatorKind` * fuzzen: Update BrTable/Switch comment * fuzzgen: Minor cleanup Co-authored-by: Jamey Sharp <jamey@minilop.net>	2022-10-17 12:51:20 -07:00
Nick Fitzgerald	03d77d4d6b	Cranelift: Derive `Copy` for `InstructionData` (#5043 ) * Cranelift: Derive `Copy` for `InstructionData` And update `clone` calls to be copies. * Add a test for `InstructionData`'s size	2022-10-12 07:58:27 -07:00
Chris Fallin	2be12a5167	egraph-based midend: draw the rest of the owl (productionized). (#4953 ) * egraph-based midend: draw the rest of the owl. * Rename `egg` submodule of cranelift-codegen to `egraph`. * Apply some feedback from @jsharp during code walkthrough. * Remove recursion from find_best_node by doing a single pass. Rather than recursively computing the lowest-cost node for a given eclass and memoizing the answer at each eclass node, we can do a single forward pass; because every eclass node refers only to earlier nodes, this is sufficient. The behavior may slightly differ from the earlier behavior because we cannot short-circuit costs to zero once a node is elaborated; but in practice this should not matter. * Make elaboration non-recursive. Use an explicit stack instead (with `ElabStackEntry` entries, alongside a result stack). * Make elaboration traversal of the domtree non-recursive/stack-safe. * Work analysis logic in Cranelift-side egraph glue into a general analysis framework in cranelift-egraph. * Apply static recursion limit to rule application. * Fix aarch64 wrt dynamic-vector support -- broken rebase. * Topo-sort cranelift-egraph before cranelift-codegen in publish script, like the comment instructs me to! * Fix multi-result call testcase. * Include `cranelift-egraph` in `PUBLISHED_CRATES`. * Fix atomic_rmw: not really a load. * Remove now-unnecessary PartialOrd/Ord derivations. * Address some code-review comments. * Review feedback. * Review feedback. * No overlap in mid-end rules, because we are defining a multi-constructor. * rustfmt * Review feedback. * Review feedback. * Review feedback. * Review feedback. * Remove redundant `mut`. * Add comment noting what rules can do. * Review feedback. * Clarify comment wording. * Update `has_memory_fence_semantics`. * Apply @jameysharp's improved loop-level computation. Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix suggestion commit. * Fix off-by-one in new loop-nest analysis. * Review feedback. * Review feedback. * Review feedback. * Use `Default`, not `std::default::Default`, as per @fitzgen Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> * Apply @fitzgen's comment elaboration to a doc-comment. Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> * Add stat for hitting the rewrite-depth limit. * Some code motion in split prelude to make the diff a little clearer wrt `main`. * Take @jameysharp's suggested `try_into()` usage for blockparam indices. Co-authored-by: Jamey Sharp <jamey@minilop.net> * Take @jameysharp's suggestion to avoid double-match on load op. Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix suggestion (add import). * Review feedback. * Fix stack_load handling. * Remove redundant can_store case. * Take @jameysharp's suggested improvement to FuncEGraph::build() logic Co-authored-by: Jamey Sharp <jamey@minilop.net> * Tweaks to FuncEGraph::build() on top of suggestion. * Take @jameysharp's suggested clarified condition Co-authored-by: Jamey Sharp <jamey@minilop.net> * Clean up after suggestion (unused variable). * Fix loop analysis. * loop level asserts * Revert constant-space loop analysis -- edge cases were incorrect, so let's go with the simple thing for now. * Take @jameysharp's suggestion re: result_tys Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix up after suggestion * Take @jameysharp's suggestion to use fold rather than reduce Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fixup after suggestion * Take @jameysharp's suggestion to remove elaborate_eclass_use's return value. * Clarifying comment in terminator insts. Co-authored-by: Jamey Sharp <jamey@minilop.net> Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	2022-10-11 18:15:53 -07:00
Nick Fitzgerald	e2f1ced0b6	Cranelift: Make `Opcode` represented as a `u8` instead of `u16` and remove vestigial conversion impls (#5042 ) * Cranelift: Make `Opcode` represented as a `u8` instead of `u16` * Cranelift: Remove unused conversion impls for `Opcode` These are vestigial, left over from Peepmatic.	2022-10-11 12:57:12 -07:00
Jun Ryung Ju	39fbff92c3	cranelift: Added fp and, or, xor, not ops to interpreter. (#4999 ) * cranelift: Added fp and, or, xor, not ops to interpreter. * Formatting. * Removed archtecture dependent test on float-bitops.	2022-10-06 18:24:45 -07:00
Afonso Bordado	65a3af72c7	fuzzgen: Statistics framework (#4868 ) * cranelift: Add non user trap codes function * cranelift: Add Fuzzgen stats * cranelift: Use `once_cell` and cleanup some stuff * fuzzgen: Remove total_inputs metric * fuzzgen: Filter empty trap codes	2022-09-27 16:04:57 +00:00
Jamey Sharp	e694a6f5d4	Allocate less while constructing cranelift-fuzzgen tests (#4863 ) * Improve panic message if typevar_operand is None * cranelift-fuzzgen: Don't allocate for each choice I don't think the performance of test-case generation is at all important here. I'm actually doing this in preparation for a bigger refactor where I want to be able to borrow the list of valid choices for a given opcode without worrying about lifetimes. * cranelift-fuzzgen: Remove next_func_index It's only used locally within `generate_funcrefs`, so it doesn't need to be in the FunctionBuilder struct. Also there's already a local counter that I think is good enough for this. As far as I know, the function indexes only need to be distinct, not contiguous. * cranelift-fuzzgen: Separate resources from config The function-global variables, blocks, etc that are generated before generating instructions are all owned collections without any lifetime parameters. By contrast, the Unstructured and Config are both borrowed. Separating them will make it easier to borrow from the owned resources.	2022-09-07 12:19:55 -07:00
Jamey Sharp	3d6d49daba	cranelift: Remove of/nof overflow flags from icmp (#4879 ) * cranelift: Remove of/nof overflow flags from icmp Neither Wasmtime nor cg-clif use these flags under any circumstances. From discussion on #3060 I see it's long been unclear what purpose these flags served. Fixes #3060, fixes #4406, and fixes #4875... by deleting all the code that could have been buggy. This changes the cranelift-fuzzgen input format by removing some IntCC options, so I've gone ahead and enabled I128 icmp tests at the same time. Since only the of/nof cases were failing before, I expect these to work. * Restore trapif tests It's still useful to validate that iadd_ifcout's iflags result can be forwarded correctly to trapif, and for that purpose it doesn't really matter what condition code is checked.	2022-09-07 08:38:41 -07:00
Jamey Sharp	9856664f1f	Make DataValue, not Ieee32/64, respect IEEE754 (#4860 ) * cranelift-codegen: Remove all uses of DataValue This type is only used by the interpreter, cranelift-fuzzgen, and filetests. I haven't found another convenient crate for those to all depend on where this type can live instead, but this small refactor at least makes it obvious that code generation does not in any way depend on the implementation of this type. * Make DataValue, not Ieee32/64, respect IEEE754 This fixes #4857 by partially reverting #4849. It turns out that Ieee32 and Ieee64 need bitwise equality semantics so they can be used as hash-table keys. Moving the IEEE754 semantics up a layer to DataValue makes sense in conjunction with #4855, where we introduced a DataValue::bitwise_eq alternative implementation of equality for those cases where users of DataValue still want the bitwise equality semantics. * cranelift-interpreter: Use eq/ord from DataValue This fixes #4828, again, now that the comparison operators on DataValue have the right IEEE754 semantics. * Add regression test from issue #4857	2022-09-03 00:26:14 +00:00
Afonso Bordado	3afb711a51	cranelift: Document Ieee{32,64} implementation (#4854 )	2022-09-02 19:02:22 +00:00
Afonso Bordado	f30a7eb0c9	cranelift: Implement PartialEq on Ieee{32,64} (#4849 ) * cranelift: Add `fcmp` tests Some of these are disabled on aarch64 due to not being implemented yet. * cranelift: Implement float PartialEq for Ieee{32,64} (fixes #4828) Previously `PartialEq` was auto derived. This means that it was implemented in terms of PartialEq in a u32. This is not correct for floats because `NaN != NaN`. PartialOrd was manually implemented in `6d50099816`, but it seems like it was an oversight to leave PartialEq out until now. The test suite depends on the previous behaviour so we adjust it to keep comparing bits instead of floats. * cranelift: Disable `fcmp ord` tests on aarch64 * cranelift: Disable `fcmp ueq` tests on aarch64	2022-09-02 10:42:42 -07:00
Chris Fallin	955d4e4ba1	AArch64: port load and store operations to ISLE. (#4785 ) This retains `lower_amode` in the handwritten code (@akirilov-arm reports that there is an upcoming patch to port this), but tweaks it slightly to take a `Value` rather than an `Inst`.	2022-08-29 17:45:55 -07:00

1 2 3 4 5 ...

253 Commits