wasmtime

Author	SHA1	Message	Date
bjorn3	108f7917c8	Support plugging external profilers into the Cranelift timing infrastructure (#5749 ) * Remove no-std code for cranelift_codegen::timings no-std mode isn't supported by Cranelift anymore * Simplify define_passes macro * Add egraph opt timings * Replace the add_to_current api with PassTimes::add * Omit a couple of unused time measurements * Reduce divergence between run and run_passes a bit * Introduce a Profiler trait This allows plugging in external profilers into the Cranelift profiling framework. * Add Pass::description method * Remove duplicate usage of the compile pass timing * Rustfmt	2023-03-10 19:33:56 +00:00
Jamey Sharp	915801551b	Delete old cranelift-preopt crate (#5642 ) Most of these optimizations are in the egraph `cprop.isle` rules now, making a separate crate unnecessary. Also I think the `udiv` optimizations here are straight-up wrong (doing signed instead of unsigned division, and panicking instead of preserving traps on division by zero) so I'm guessing this crate isn't seriously used anywhere. At the least, bjorn3 confirms that cg_clif doesn't use this, and I've verified that Wasmtime doesn't either. Closes #1090.	2023-01-26 21:32:33 +00:00
Nick Fitzgerald	c0b587ac5f	Remove heaps from core Cranelift, push them into `cranelift-wasm` (#5386 ) * cranelift-wasm: translate Wasm loads into lower-level CLIF operations Rather than using `heap_{load,store,addr}`. * cranelift: Remove the `heap_{addr,load,store}` instructions These are now legalized in the `cranelift-wasm` frontend. * cranelift: Remove the `ir::Heap` entity from CLIF * Port basic memory operation tests to .wat filetests * Remove test for verifying CLIF heaps * Remove `heap_addr` from replace_branching_instructions_and_cfg_predecessors.clif test * Remove `heap_addr` from readonly.clif test * Remove `heap_addr` from `table_addr.clif` test * Remove `heap_addr` from the simd-fvpromote_low.clif test * Remove `heap_addr` from simd-fvdemote.clif test * Remove `heap_addr` from the load-op-store.clif test * Remove the CLIF heap runtest * Remove `heap_addr` from the global_value.clif test * Remove `heap_addr` from fpromote.clif runtests * Remove `heap_addr` from fdemote.clif runtests * Remove `heap_addr` from memory.clif parser test * Remove `heap_addr` from reject_load_readonly.clif test * Remove `heap_addr` from reject_load_notrap.clif test * Remove `heap_addr` from load_readonly_notrap.clif test * Remove `static-heap-without-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` to generating `.wat` tests. * Remove `static-heap-with-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` over to `.wat` tests. * Remove more heap tests These will be subsumed by porting `make-heap-load-store-tests.sh` over to `.wat` tests. 
* Remove `heap_addr` from `simple-alias.clif` test * Remove `heap_addr` from partial-redundancy.clif test * Remove `heap_addr` from multiple-blocks.clif test * Remove `heap_addr` from fence.clif test * Remove `heap_addr` from extends.clif test * Remove runtests that rely on heaps Heaps are not a thing in CLIF or the interpreter anymore * Add generated load/store `.wat` tests * Enable memory-related wasm features in `.wat` tests * Remove CLIF heap from fcmp-mem-bug.clif test * Add a mode for compiling `.wat` all the way to assembly in filetests * Also generate WAT to assembly tests in `make-load-store-tests.sh` * cargo fmt * Reinstate `f{de,pro}mote.clif` tests without the heap bits * Remove undefined doc link * Remove outdated SVG and dot file from docs * Add docs about `None` returns for base address computation helpers * Factor out `env.heap_access_spectre_mitigation()` to a local * Expand docs for `FuncEnvironment::heaps` trait method * Restore f{de,pro}mote+load clif runtests with stack memory	2022-12-15 00:26:45 +00:00
Nick Fitzgerald	f2e1eaa847	cranelift-filetest: Add support for Wasm-to-CLIF translation filetests (#5412 ) This adds support for `.wat` tests in `cranelift-filetest`. The test runner translates the WAT to Wasm and then uses `cranelift-wasm` to translate the Wasm to CLIF. These tests are always precise output tests. The test expectations can be updated by running tests with the `CRANELIFT_TEST_BLESS=1` environment variable set, similar to our compile precise output tests. The test's expected output is contained in the last comment in the test file. The tests allow for configuring the kinds of heaps used to implement Wasm linear memory via TOML in a `;;!` comment at the start of the test. To get ISA and Cranelift flags parsing available in the filetests crate, I had to move the `parse_sets_and_triple` helper from the `cranelift-tools` binary crate to the `cranelift-reader` crate, where I think it logically fits. Additionally, I had to make some more bits of `cranelift-wasm`'s dummy environment `pub` so that I could properly wrap and compose it with the environment used for the `.wat` tests. I don't think this is a big deal, but if we eventually want to clean this stuff up, we can probably remove the dummy environments completely, remove `translate_module`, and fold them into these new test environments and test runner (since Wasmtime isn't using those things anyways).	2022-12-12 19:31:29 +00:00
Chris Fallin	2be12a5167	egraph-based midend: draw the rest of the owl (productionized). (#4953 ) * egraph-based midend: draw the rest of the owl. * Rename `egg` submodule of cranelift-codegen to `egraph`. * Apply some feedback from @jsharp during code walkthrough. * Remove recursion from find_best_node by doing a single pass. Rather than recursively computing the lowest-cost node for a given eclass and memoizing the answer at each eclass node, we can do a single forward pass; because every eclass node refers only to earlier nodes, this is sufficient. The behavior may slightly differ from the earlier behavior because we cannot short-circuit costs to zero once a node is elaborated; but in practice this should not matter. * Make elaboration non-recursive. Use an explicit stack instead (with `ElabStackEntry` entries, alongside a result stack). * Make elaboration traversal of the domtree non-recursive/stack-safe. * Work analysis logic in Cranelift-side egraph glue into a general analysis framework in cranelift-egraph. * Apply static recursion limit to rule application. * Fix aarch64 wrt dynamic-vector support -- broken rebase. * Topo-sort cranelift-egraph before cranelift-codegen in publish script, like the comment instructs me to! * Fix multi-result call testcase. * Include `cranelift-egraph` in `PUBLISHED_CRATES`. * Fix atomic_rmw: not really a load. * Remove now-unnecessary PartialOrd/Ord derivations. * Address some code-review comments. * Review feedback. * Review feedback. * No overlap in mid-end rules, because we are defining a multi-constructor. * rustfmt * Review feedback. * Review feedback. * Review feedback. * Review feedback. * Remove redundant `mut`. * Add comment noting what rules can do. * Review feedback. * Clarify comment wording. * Update `has_memory_fence_semantics`. * Apply @jameysharp's improved loop-level computation. Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix suggestion commit. * Fix off-by-one in new loop-nest analysis. 
* Review feedback. * Review feedback. * Review feedback. * Use `Default`, not `std::default::Default`, as per @fitzgen Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> * Apply @fitzgen's comment elaboration to a doc-comment. Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> * Add stat for hitting the rewrite-depth limit. * Some code motion in split prelude to make the diff a little clearer wrt `main`. * Take @jameysharp's suggested `try_into()` usage for blockparam indices. Co-authored-by: Jamey Sharp <jamey@minilop.net> * Take @jameysharp's suggestion to avoid double-match on load op. Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix suggestion (add import). * Review feedback. * Fix stack_load handling. * Remove redundant can_store case. * Take @jameysharp's suggested improvement to FuncEGraph::build() logic Co-authored-by: Jamey Sharp <jamey@minilop.net> * Tweaks to FuncEGraph::build() on top of suggestion. * Take @jameysharp's suggested clarified condition Co-authored-by: Jamey Sharp <jamey@minilop.net> * Clean up after suggestion (unused variable). * Fix loop analysis. * loop level asserts * Revert constant-space loop analysis -- edge cases were incorrect, so let's go with the simple thing for now. * Take @jameysharp's suggestion re: result_tys Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fix up after suggestion * Take @jameysharp's suggestion to use fold rather than reduce Co-authored-by: Jamey Sharp <jamey@minilop.net> * Fixup after suggestion * Take @jameysharp's suggestion to remove elaborate_eclass_use's return value. * Clarifying comment in terminator insts. Co-authored-by: Jamey Sharp <jamey@minilop.net> Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	2022-10-11 18:15:53 -07:00
Afonso Bordado	7a9078d9cc	cranelift: Allow `call` and `call_indirect` in runtests (#4667 ) * cranelift: Change test runner order Changes the ordering of runtests to run per target and then per function. This change doesn't do a lot by itself, but helps future refactorings of runtests. * cranelift: Rename SingleFunctionCompiler to TestCaseCompiler * cranelift: Skip runtests per target instead of per run * cranelift: Deduplicate test names With the upcoming changes to the runtest infrastructure we require unique ExtNames for all tests. Note that for test names we have a 16 character limit on test names, and must be unique within those 16 characters. * cranelift: Add TestFileCompiler to runtests TestFileCompiler allows us to compile the entire file once, and then call the trampolines for each test. The previous code was compiling the function for each invocation of a test. * cranelift: Deduplicate ExtName for avg_round tests * cranelift: Rename functions as they are defined. The JIT internally only deals with User functions, and cannot link test name funcs. This also caches trampolines by signature. * cranelift: Preserve original name when reporting errors. * cranelift: Rename aarch64 test functions * cranelift: Add `call` and `call_indirect` tests! * cranelift: Add pauth runtests for aarch64 * cranelift: Rename duplicate s390x tests * cranelift: Delete `i128_bricmp_of` function from i128-bricmp It looks like we forgot to delete it when it was moved to `i128-bricmp-overflow`, and since it didn't have a run invocation it was never compiled. However, s390x does not support this, and panics when lowering. * cranelift: Add `colocated` call tests * cranelift: Rename more `s390x` tests * cranelift: Add pauth + sign_return_address call tests * cranelift: Undeduplicate test names With the latest main changes we now support unlimited length test names. 
This commit reverts: 52274676ff631c630f9879dd32e756566d3e700f 7989edc172493547cdf63e180bb58365e8a43a42 25c8a8395527d98976be6a34baa3b0b214776739 792e8cfa8f748077f9d80fe7ee5e958b7124e83b * cranelift: Add LibCall tests * cranelift: Revert more test names These weren't auto reverted by the previous revert. * cranelift: Disable libcall tests for aarch64 * cranelift: Runtest fibonacci tests * cranelift: Misc cleanup	2022-08-26 12:42:16 -07:00
Chris Fallin	0824abbae4	Add a basic alias analysis with redundant-load elim and store-to-load forwarding opts. (#4163 ) This PR adds a basic alias analysis, and optimizations that use it. This is a "mid-end optimization": it operates on CLIF, the machine-independent IR, before lowering occurs. The alias analysis (or maybe more properly, a sort of memory-value analysis) determines when it can prove a particular memory location is equal to a given SSA value, and when it can, it replaces any loads of that location. This subsumes two common optimizations: * Redundant load elimination: when the same memory address is loaded two times, and it can be proven that no intervening operations will write to that memory, then the second load is redundant and its result must be the same as the first. We can use the first load's result and remove the second load. * Store-to-load forwarding: when a load can be proven to access exactly the memory written by a preceding store, we can replace the load's result with the store's data operand, and remove the load. Both of these optimizations rely on a "last store" analysis that is a sort of coloring mechanism, split across disjoint categories of abstract state. The basic idea is that every memory-accessing operation is put into one of N disjoint categories; it is disallowed for memory to ever be accessed by an op in one category and later accessed by an op in another category. (The frontend must ensure this.) Then, given this, we scan the code and determine, for each memory-accessing op, when a single prior instruction is a store to the same category. This "colors" the instruction: it is, in a sense, a static name for that version of memory. This analysis provides an important invariant: if two operations access memory with the same last-store, then no other store can alias in the time between that last store and these operations. 
This must-not-alias property, together with a check that the accessed address is exactly the same (same SSA value and offset), and other attributes of the access (type, extension mode) are the same, let us prove that the results are the same. Given last-store info, we scan the instructions and build a table from "memory location" key (last store, address, offset, type, extension) to known SSA value stored in that location. A store inserts a new mapping. A load may also insert a new mapping, if we didn't already have one. Then when a load occurs and an entry already exists for its "location", we can reuse the value. This will be either RLE or St-to-Ld depending on where the value came from. Note that this does work across basic blocks: the last-store analysis is a full iterative dataflow pass, and we are careful to check dominance of a previously-defined value before aliasing to it at a potentially redundant load. So we will do the right thing if we only have a "partially redundant" load (loaded already but only in one predecessor block), but we will also correctly reuse a value if there is a store or load above a loop and a redundant load of that value within the loop, as long as no potentially-aliasing stores happen within the loop.	2022-05-20 13:19:32 -07:00
Nick Fitzgerald	d2d0a0f36b	Remove Peepmatic!!! Peepmatic was an early attempt at a DSL for peephole optimizations, with the idea that maybe sometime in the future we could use it for instruction selection as well. It didn't really pan out, however: * Peepmatic wasn't quite flexible enough, and adding new operators or snippets of code implemented externally in Rust was a bit of a pain. * The performance was never competitive with the hand-written peephole optimizers. It was very size efficient, but that came at the cost of run-time efficiency. Everything was table-based and interpreted, rather than generating any Rust code. Ultimately, because of these reasons, we never turned Peepmatic on by default. These days, we just landed the ISLE domain-specific language, and it is better suited than Peepmatic for all the things that Peepmatic was originally designed to do. It is more flexible and easy to integrate with external Rust code. It has better time efficiency, meeting or even beating hand-written code. I think a small part of the reason why ISLE excels in these things is because its design was informed by Peepmatic's failures. I still plan on continuing Peepmatic's mission to make Cranelift's peephole optimizer passes generated from DSL rewrite rules, but using ISLE instead of Peepmatic. Thank you Peepmatic, rest in peace!	2021-11-17 13:04:17 -08:00
Benjamin Bouvier	43a86f14d5	Remove more old backend ISA concepts (#3402 ) This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate. Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't need an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!	2021-10-04 10:36:12 +02:00
Benjamin Bouvier	bae4ec6427	Remove ancient register allocation (#3401 )	2021-09-30 21:27:23 +02:00
Afonso Bordado	f4ff7c350a	cranelift: Add heap support to filetest infrastructure (#3154 ) * cranelift: Add heap support to filetest infrastructure * cranelift: Explicit heap pointer placement in filetest annotations * cranelift: Add documentation about the Heap directive * cranelift: Clarify that heap filetests pointers must be laid out sequentially * cranelift: Use wrapping add when computing bound pointer * cranelift: Better error messages when invalid signatures are found for heap file tests.	2021-08-24 09:28:41 -07:00
Afonso Bordado	7453bd5f0d	Cranelift CLIF-level differential fuzzer (#3038 ) * cranelift: Initial fuzzer implementation * cranelift: Generate multiple test cases in fuzzer * cranelift: Separate function generator in fuzzer * cranelift: Insert random instructions in fuzzer * cranelift: Rename gen_testcase * cranelift: Implement div for unsigned values in interpreter * cranelift: Run all test cases in fuzzer * cranelift: Comment options in function_runner * cranelift: Improve fuzzgen README.md * cranelift: Fuzzgen remove unused variable * cranelift: Fuzzer code style fixes Thanks! @bjorn3 * cranelift: Fix nits in CLIF fuzzer Thanks @cfallin! * cranelift: Implement Arbitrary for TestCase * cranelift: Remove gen_testcase * cranelift: Move fuzzers to wasmtime fuzz directory * cranelift: CLIF-Fuzzer ignore tests that produce traps * cranelift: CLIF-Fuzzer create new fuzz target to validate generated testcases * cranelift: Store clif-fuzzer config in a separate struct * cranelift: Generate variables upfront per function * cranelift: Prevent publishing of fuzzgen crate	2021-07-01 06:32:01 -07:00
Andrew Brown	c9e8889d47	Update clippy annotation to use latest version (#2375 )	2020-11-09 09:24:59 -06:00
Nick Fitzgerald	31cbbd1d20	clif-util: Use `anyhow::Error` for errors instead of `String` Also does the same for `cranelift-filetests`.	2020-09-14 18:29:00 -07:00
Nick Fitzgerald	05bf9ea3f3	Rename "Stackmap" to "StackMap" And "stackmap" to "stack_map". This commit is purely mechanical.	2020-08-07 10:08:44 -07:00
Nick Fitzgerald	6aac4c891e	cranelift: Better document and test stack maps	2020-06-08 15:05:20 -07:00
Chris Fallin	48573b52b2	Merge `vcode` filetest mode into `compile`. I hadn't realized before that the filetest backend for `test vcode` is doing essentially what `compile` is doing, but for new (`MachInst`) backends: it is just getting a disassembly and running it through filecheck. There's no reason not to reuse `test compile` for the AArch64 tests as well. This was motivated by the desire to have "this IR compiles successfully" tests work on both x86 and AArch64. It seems this should work fine by adding multiple `target` directives when a test case should be compile-tested on multiple architectures.	2020-05-22 17:28:48 -07:00
Nick Fitzgerald	52c6ece5f3	peepmatic: Make peepmatic optional to enable Rather than outright replacing parts of our existing peephole optimizations passes, this makes peepmatic an optional cargo feature that can be enabled. This allows us to take a conservative approach with enabling peepmatic everywhere, while also allowing us to get it in-tree and make it easier to collaborate on improving it quickly.	2020-05-14 07:52:23 -07:00
Andrew Brown	b26ca3cbdd	Add `test interpret` support to filetests	2020-05-07 16:51:09 -07:00
Andrew Brown	38dff29179	Add ability to call CLIF functions with arbitrary arguments in filetests This resolves the work started in https://github.com/bytecodealliance/cranelift/pull/1231 and https://github.com/bytecodealliance/wasmtime/pull/1436. Cranelift filetests currently have the ability to run CLIF functions with a signature like `() -> b*` and check that the result is true under the `test run` directive. This PR adds the ability to call functions with arbitrary arguments and non-boolean returns and either print the result or check against a list of expected results: - `run` commands look like `; run: %add(2, 2) == 4` or `; run: %add(2, 2) != 5` and verify that the executed CLIF function returns the expected value - `print` commands look like `; print: %add(2, 2)` and print the result of the function to stdout To make this work, this PR compiles a single Cranelift `Function` into a `CompiledFunction` using a `SingleFunctionCompiler`. Because we will not know the signature of the function until runtime, we use a `Trampoline` to place the values in the appropriate location for the calling convention; this should look a lot like what @alexcrichton is doing with `VMTrampoline` in wasmtime (see `3b7cb6ee64/crates/api/src/func.rs (L510-L526)`, `3b7cb6ee64/crates/jit/src/compiler.rs (L260)`). To avoid re-compiling `Trampoline`s for the same function signatures, `Trampoline`s are cached in the `SingleFunctionCompiler`.	2020-04-30 11:21:00 -07:00
Peter Huene	f7e9f86ba9	Refactor unwind generation in Cranelift. This commit makes the following changes to unwind information generation in Cranelift: * Remove frame layout change implementation in favor of processing the prologue and epilogue instructions when unwind information is requested. This also means this work is no longer performed for Windows, which didn't utilize it. It also helps simplify the prologue and epilogue generation code. * Remove the unwind sink implementation that required each unwind information to be represented in final form. For FDEs, this meant writing a complete frame table per function, which wastes 20 bytes or so for each function with duplicate CIEs. This also enables Cranelift users to collect the unwind information and write it as a single frame table. * For System V calling convention, the unwind information is no longer stored in code memory (it's only a requirement for Windows ABI to do so). This allows for more compact code memory for modules with a lot of functions. * Deletes some duplicate code relating to frame table generation. Users can now simply use gimli to create a frame table from each function's unwind information. Fixes #1181.	2020-04-16 11:15:32 -07:00
Chris Fallin	402303f67a	ARM64 backend, part 10 / 11: filetest support for VCode tests. This patch adds support for filetests with the `vcode` type. This allows test cases to feed CLIF into the new backend, produce VCode output with machine instructions, and then perform matching against the pretty-printed text representation of the VCode. Tests for the new ARM64 backend using this infrastructure will come in a followup patch.	2020-04-11 17:52:56 -07:00
Yury Delendik	bd88155483	Refactor unwind; add FDE support. (#1320 ) * Refactor unwind * add FDE support * use sink directly in emit functions * pref off all unwinding generation with feature	2020-01-13 10:32:55 -06:00
Peter Huene	8923bac7e8	Implement emitting Windows unwind information for fastcall functions. (#1155 ) * Implement emitting Windows unwind information for fastcall functions. This commit implements emitting Windows unwind information for x64 fastcall calling convention functions. The unwind information can be used to construct a Windows function table at runtime for JIT'd code, enabling stack walking and unwinding by the operating system. * Address code review feedback. This commit addresses code review feedback: * Remove unnecessary unsafe code. * Emit the unwind information always as little endian. * Fix comments. A dependency from cranelift-codegen to the byteorder crate was added. The byteorder crate is a no-dependencies crate with a reasonable abstraction for writing binary data for a specific endianness. * Address code review feedback. * Disable default features for the `byteorder` crate. * Add a comment regarding the Windows ABI unwind code numerical values. * Panic if we encounter a Windows function with a prologue greater than 256 bytes in size.	2019-11-05 13:14:30 -08:00
Andrew Brown	c3cc225de9	Add filetest for verifying emitted rodata (i.e. `test rodata`)	2019-08-26 16:12:06 -07:00
Andrew Brown	ff3c44385c	Add `test run` to cranelift-filetests to allow executing CLIF (#890 ) * Add ability to run CLIF IR using `clif-util run [-v] {file}` and add `test run` to cranelift-filetests to allow executing CLIF This re-factors the compile/execute parts to a FunctionRunner that is shared between cranelift-filetests and clif-util. CLIF can now be run using `clif-util run` as well as during `clif-util test` for files with a `test run` header. As before, only functions suffixed with a `run` comment are executed. The `run: fn(...) == ...` expression syntax is left for a subsequent change.	2019-08-21 18:03:09 +02:00
Carmen Kwan	19257f80c1	Add reference types R32 and R64 -Add resumable_trap, safepoint, isnull, and null instructions -Add Stackmap struct and StackmapSink trait Co-authored-by: Mir Ahmed <mirahmed753@gmail.com> Co-authored-by: Dan Gohman <sunfish@mozilla.com>	2019-08-16 11:35:16 -07:00
Benjamin Bouvier	d7d48d5cc6	Add the dyn keyword before trait objects;	2019-06-24 11:42:26 +02:00
lazypassion	747ad3c4c5	moved crates in lib/ to src/, renamed crates, modified some files' text (#660 )	2019-01-28 15:56:54 -08:00

29 Commits