wasmtime

Author	SHA1	Message	Date
Trevor Elliott	f138fc0ed3	Bump regalloc2 to 0.5.0 (#5345 ) * Bump the regalloc2 dependency to 0.5.0 * Replace preg_set_from_machine_env with PRegSet::from * Vet the regalloc2 update	2022-11-29 11:25:35 -08:00
Jamey Sharp	3b76874834	cranelift-isle: Fix representation for overlap checks (#5337 ) Ulrich Weigand identified two bugs in this code due to it falsely claiming there were unreachable rules in the s390x backend. The fixes are: - Add constraints for pure constructors. I didn't notice that a constructor which is declared pure (which currently implies that it is fallible), when used on the left-hand side of a rule, can cause the rule to fail to match. Therefore, any constructors on the left-hand side must be noted as additional constraints on the rule, so that overlap checking can see them. - Ignore subset-overlaps for rules with equality constraints This eliminates false positives when checking for unreachable rules. It introduces false negatives instead but we prefer to fail to detect an error instead of claiming that valid input is wrong. We can implement a more accurate check later.	2022-11-29 11:02:12 -08:00
Afonso Bordado	ec342c20e3	cranelift: Add `iadd_cout` lowerings for aarch64 (#5177 ) * cranelift: Add `iadd_cout`/`isub_bout` i128 tests * aarch64: Add `iadd_cout` lowerings * fuzzgen: Add `iadd_cout`	2022-11-29 10:58:44 -08:00
Jamey Sharp	ff5abfd993	cranelift-isle: Minor error-handling cleanups (#5338 ) - Remove remaining references to Miette - Borrow implementation of `line_starts` from codespan-reporting - Clean up a use of `Result` that no longer conflicts with a local definition - When printing plain errors, add a blank line between errors for readability	2022-11-29 03:07:05 +00:00
Trevor Elliott	a5a0645aff	Don't allow reuse_def constraints in the s390x Loop instruction (#5336 )	2022-11-28 17:52:11 -08:00
Trevor Elliott	368004428a	Fix rule shadowing instances in x64 and aarch64 backends (#5334 ) Fix shadowing identified in #5322 for imul and swiden_high/swiden_low/uwiden_high/uwiden_low combinations in the x64 backend, and remove some redundant rules from the aarch64 dynamic neon ruleset. Additionally, add tests to the x64 backend showing that the imul specializations are firing.	2022-11-28 15:48:34 -08:00
Nick Fitzgerald	58a5089e48	Cranelift: log number of CLIF insts/blocks to optimize/lower (#5333 )	2022-11-28 19:35:29 +00:00
Nick Fitzgerald	6fe69d00ca	Cranelift: add debug logs counting how many vcode instructions and blocks we lower to (#5332 )	2022-11-28 18:57:02 +00:00
Trevor Elliott	54a6d2f79a	Generate more fixed_nonallocatable constraints, and add debug assertions (#5132 ) Add assertions to the OperandCollector that show we're not using pinned vregs, and use reg_fixed_nonallocatable constraints when a real register is used with other constraint generation functions like reg_use etc.	2022-11-28 10:31:56 -08:00
Rodrigo Batista de Moraes	28cf995fd3	cranelift-frontend: make `FunctionBuilder::finalize` consume self (#5316 )	2022-11-23 23:41:52 +00:00
Jamey Sharp	044b57f334	cranelift-isle: Rewrite error reporting (#5318 ) There were several issues with ISLE's existing error reporting implementation. - When using Miette for more readable error reports, it would panic if errors were reported from multiple files in the same run. - Miette is pretty heavy-weight for what we're doing, with a lot of dependencies. - The `Error::Errors` enum variant led to normalization steps in many places, to avoid using that variant to represent a single error. This commit: - replaces Miette with codespan-reporting - gets rid of a bunch of cargo-vet exemptions - replaces the `Error::Errors` variant with a new `Errors` type - removes source info from `Error` variants so they're easy to construct - adds source info only when formatting `Errors` - formats `Errors` with a custom `Debug` impl - shares common code between ISLE's callers, islec and cranelift-codegen - includes a source snippet even with fancy-errors disabled I tried to make this a series of smaller commits but I couldn't find any good split points; everything was too entangled with everything else.	2022-11-23 14:20:48 -08:00
Timothy Chen	48ee42efc2	Refactor Sigdata methods with sigset (#5307 ) * Refactor sigdata methods * Update cranelift/codegen/src/machinst/abi.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * Address comments Co-authored-by: Jamey Sharp <jamey@minilop.net>	2022-11-22 09:03:51 -08:00
Nick Fitzgerald	d0d3245a35	Cranelift: Add `heap_load` and `heap_store` instructions (#5300 ) * Cranelift: Define `heap_load` and `heap_store` instructions * Cranelift: Implement interpreter support for `heap_load` and `heap_store` * Cranelift: Add a suite runtests for `heap_{load,store}` There are so many knobs we can twist for heaps and I wanted to exhaustively test all of them, so I wrote a script to generate the tests. I've checked in the script in case we want to make any changes in the future, but I don't think it is worth adding this to CI to check that scripts are up to date or anything like that. * Review feedback	2022-11-21 23:00:39 +00:00
Trevor Elliott	54cfa4df34	cranelift: Fix implicit pointer argument register use (#5301 ) * Fix arg handling to write to VRegs instead of physical regs * Make is_included_in_clobbers required, and handle Args on x64 and riscv64	2022-11-18 16:47:03 -08:00
Jamey Sharp	54207d343e	cranelift-isle: Specialize for Term at rule root (#5295 ) In #5174 we decided it doesn't make sense for a rule to have a bind-pattern at the root of its left-hand side. There's no Rust value corresponding to the root value of such a term, because it actually represents a function declaration with one or more arguments. This commit takes that to its logical conclusion. `sema::Rule` previously had an `lhs` field whose value must always be a `Pattern::Term` variant, and anyone using that structure had to deal with the possibility of finding the wrong variant there. Now the relevant fields from that variant are stored directly in `Rule` instead. Also, the (tiny!) portion of `translate_pattern` which applied when the pattern was the root term is now inlined in `collect_rules`. Because `translate_pattern` no longer has to special-case the root term, we can delete its `rule_term` and `is_root` arguments. That brings it down to a more manageable four arguments, which means many calls fit on one line now.	2022-11-18 11:21:08 -08:00
Jun Ryung Ju	e5f93d9ec0	cranelift: Support `bnot`, `band`, `bor`, `bxor` for x86_64. (#5036 ) * Support `bnot`, `band`, `bor`, `bxor` for x86_64. * Fix-up to handle `B{8,16,32,64}` type on bitops * Fix-up conflict.	2022-11-18 07:45:54 -08:00
Jamey Sharp	9a44ef7443	cranelift-isle: Unify expressions and bindings (#5294 ) As it turns out, that distinction was not necessary for this representation. Removing it eliminates some complexity around wrapping expressions as bindings and vice versa. It also clears up some confusion about which category to put certain constructs in (arguments and extractors) by refusing to have different categories. While I was writing this patch I also realized that `add_match_variant` and `normalize_equivalence_classes` both need to do fundamentally the same things with enum variants, so I refactored them to share code and make their relationship clearer. Finally, I reviewed all the comments in this file and fixed some places where they could be more clear.	2022-11-17 16:00:59 -08:00
Nick Fitzgerald	3b6544dc66	Fix warnings in `cranelift-codegen` docs builds (#5292 )	2022-11-17 21:13:24 +00:00
Alex Crichton	9bf2a8e663	Remove some dead code in the cranelift-wasm crate (#5290 ) * Remove some dead code in the cranelift-wasm crate * Remove some more dead code	2022-11-17 16:28:11 +00:00
Timothy Chen	de6e4a4e20	Shrink the size of SigData in Cranelift (#5261 ) * Shrink the size of SigData in Cranelift * Update cranelift/codegen/src/machinst/abi.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * Change ret arg length to u16 * Add test Co-authored-by: Jamey Sharp <jamey@minilop.net>	2022-11-17 00:15:19 +00:00
Trevor Elliott	4780bd5902	Don't use %rcx directly with CoffTlsGetAddr (#5278 ) Avoid naming %rcx as written by the CoffTlsGetAddr pseudo-instruction in the x64 backend, and instead emit a fixed-def constraint for a fresh VReg and %rcx.	2022-11-16 11:32:09 -08:00
Trevor Elliott	07bd8bf34a	Remove unnecessary moves in x64 gen_memcpy (#5277 ) Remove some unnecessary moves in the x64 gen_memcpy implementation -- the call instruction that's generated will already constrain the args to those registers.	2022-11-16 10:33:00 -08:00
Afonso Bordado	a793648eb2	cranelift: Fix `fdemote` on the interpreter (#5158 ) * cranelift: Cleanup `fdemote`/`fpromote` tests * cranelift: Fix `fdemote`/`fpromote` instruction docs The verifier fails if the input and output types are the same for these instructions * cranelift: Fix `fdemote`/`fpromote` in the interpreter * fuzzgen: Add `fdemote`/`fpromote`	2022-11-15 22:22:00 +00:00
Trevor Elliott	a007e02bd2	Add fixed_nonallocatable constraints when appropriate (#5253 ) Plumb the set of allocatable registers through the OperandCollector and use it to validate uses of fixed-nonallocatable registers, like %rsp on x86_64.	2022-11-15 12:49:17 -08:00
Nick Fitzgerald	f6ae67f3f0	Cranelift(aarch64): Use an existing extractor instead of a new pure constructor (#5273 )	2022-11-15 20:40:44 +00:00
Nick Fitzgerald	d335dc8d5a	Cranelift: Do not optimize heap bounds checking comparison in legalization (#5272 ) That optimization is only for 12-bit immediates in Aarch64, which is now handled in backend lowering, so we can simplify this code a bit now.	2022-11-15 19:54:52 +00:00
Nick Fitzgerald	9967782726	Cranelift(Aarch64): Optimize lowering of `icmp`s with immediates (#5252 ) We can encode more constants into 12-bit immediates if we do the following rewrite for comparisons with odd constants: A >= B + 1 ==> A - 1 >= B ==> A > B	2022-11-15 09:18:55 -08:00
Nick Fitzgerald	c2a7ea7e24	Cranelift: de-duplicate bounds checks in legalizations (#5190 ) * Cranelift: Add the `DataFlowGraph::display_value_inst` convenience method * Cranelift: Add some `trace!` logs to some parts of legalization * Cranelift: de-duplicate bounds checks in legalizations When both (1) "dynamic" memories that need explicit bounds checks and (2) spectre mitigations that perform bounds checks are enabled, reuse the same bounds checks between the two legalizations. This reduces the overhead of explicit bounds checks and spectre mitigations over using virtual memory guard pages with spectre mitigations from ~1.9-2.1x overhead to ~1.6-1.8x overhead. That is about a 14-19% speed up for when dynamic memories and spectre mitigations are enabled. <details> ``` execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm Δ = 3422455129.47 ± 120159.49 (confidence = 99%) virtual-memory-guards.so is 2.09x to 2.09x faster than bounds-checks.so! [6563931659 6564063496.07 6564301535] bounds-checks.so [3141492675 3141608366.60 3141895249] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm Δ = 338716136.87 ± 1.38 (confidence = 99%) virtual-memory-guards.so is 2.08x to 2.08x faster than bounds-checks.so! [651961494 651961495.47 651961497] bounds-checks.so [313245357 313245358.60 313245362] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 22742944.07 ± 331.73 (confidence = 99%) virtual-memory-guards.so is 1.87x to 1.87x faster than bounds-checks.so! [48841295 48841567.33 48842139] bounds-checks.so [26098439 26098623.27 26099479] virtual-memory-guards.so ``` </details> <details> ``` execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm Δ = 2465900207.27 ± 146476.61 (confidence = 99%) virtual-memory-guards.so is 1.78x to 1.78x faster than de-duped-bounds-checks.so! 
[5607275431 5607442989.13 5607838342] de-duped-bounds-checks.so [3141445345 3141542781.87 3141711213] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm Δ = 234253620.20 ± 2.33 (confidence = 99%) virtual-memory-guards.so is 1.75x to 1.75x faster than de-duped-bounds-checks.so! [547498977 547498980.93 547498985] de-duped-bounds-checks.so [313245357 313245360.73 313245363] virtual-memory-guards.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 16605659.13 ± 315.78 (confidence = 99%) virtual-memory-guards.so is 1.64x to 1.64x faster than de-duped-bounds-checks.so! [42703971 42704284.40 42704787] de-duped-bounds-checks.so [26098432 26098625.27 26099234] virtual-memory-guards.so ``` </details> <details> ``` execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm Δ = 104462517.13 ± 7.32 (confidence = 99%) de-duped-bounds-checks.so is 1.19x to 1.19x faster than bounds-checks.so! [651961493 651961500.80 651961532] bounds-checks.so [547498981 547498983.67 547498989] de-duped-bounds-checks.so execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm Δ = 956556982.80 ± 103034.59 (confidence = 99%) de-duped-bounds-checks.so is 1.17x to 1.17x faster than bounds-checks.so! [6563930590 6564019842.40 6564243651] bounds-checks.so [5607307146 5607462859.60 5607677763] de-duped-bounds-checks.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 6137307.87 ± 247.75 (confidence = 99%) de-duped-bounds-checks.so is 1.14x to 1.14x faster than bounds-checks.so! [48841303 48841472.93 48842000] bounds-checks.so [42703965 42704165.07 42704718] de-duped-bounds-checks.so ``` </details> * Update test expectations * Add a test for deduplicating bounds checks between dynamic memories and spectre mitigations * Define a struct for the Spectre comparison instead of using a tuple * More trace logging for heap legalization	2022-11-15 08:47:22 -08:00
Trevor Elliott	dece901d16	Use regalloc constraints for sse blend operations (#5251 ) Instead of using xmm0 explicitly for the mask argument to instructions like blendvpd, use regalloc constraints to constrain it to xmm0 instead.	2022-11-14 16:44:34 -08:00
Afonso Bordado	ff46bbaebf	cranelift: Fix `iadd_carry`/`iadd_cout` in the interpreter (#5176 )	2022-11-14 10:18:28 -08:00
Denys Zadorozhnyi	d3692c2f2b	fix typo in caller_conv arg name in ABIMachineSpec::gen_call; (#5259 )	2022-11-14 09:02:07 -08:00
Jamey Sharp	70c72ee2a4	cranelift-isle: New IR and revised overlap checks (#5195 ) * cranelift-isle: New IR and revised overlap checks * Improve error reporting * Avoid "unused argument" warnings a nicer way * Remove unused fields * Minimize diff and "fix" error handling I had tried to use Miette "right" and made things worse somehow. Among other changes, revert all my changes to unrelated parts of `error.rs` and `error_miette.rs`. * Review comments: Rename "unmatchable" to "unreachable" * Review comments: newtype wrappers, not type aliases * Review comments: more comments on overlap checks * Review comments: Clarify `normalize_equivalence_classes` * Review comments: use union-find instead of linked list This saves about 50 lines of code in the trie_again module. The union-find implementation is about twice as long as that, counting comments and doc-tests, but that's a worth-while tradeoff. However, this makes `normalize_equivalence_classes` slower, because now finding all elements of an equivalence class takes time linear in the total size of all equivalence classes. If that ever turns out to be a problem in practice we can find some way to optimize `remove_set_of`. * Review comments: Hide constraints HashMap We want to enforce that consumers of this representation can't observe non-deterministic ordering in any of its public types. * Review comments: Normalize equivalence classes incrementally I'm not sure whether this is a good idea. It doesn't make the logic particularly simpler, and I think it will do more work if three or more binding sites with enum-variant constraints get set equal to each other. * More comments and other clarifications * Revert "Review comments: Normalize equivalence classes incrementally" * Even more comments	2022-11-14 02:29:22 +00:00
Jamey Sharp	95ca72a37a	cranelift-isle: Misc sema cleanups (#5242 ) This mostly amounts to factoring out duplicated code and turning various uses of `unwrap_or_continue!` into iterator chains.	2022-11-11 01:53:05 +00:00
Trevor Elliott	0367fbc2d4	cranelift: Rework pinned register lowering (#5249 ) Rework pinned register lowering to avoid the use of pinned virtual registers, instead using the MovFromPReg and MovToPReg pseudo instructions.	2022-11-10 16:19:25 -08:00
Alex Crichton	0548952319	Update wasm-tools crates (#5248 ) No major updates, just keeping up-to-date.	2022-11-10 21:23:20 +00:00
Jamey Sharp	f0fccbd18a	cranelift-isle: Helpers to get type/term by name (#5241 ) This is a common pattern in sema, so factor it out. Since this version uses `intern` instead of `intern_mut`, it might be a tiny bit faster when errors occur due to not writing names into maps then. When no error occurs, ISLE should do exactly the same work with or without this commit.	2022-11-10 09:51:49 -08:00
Nick Fitzgerald	47fa1ad6a8	Rework bounds checking for atomic operations (#5239 ) Before, we would do a `heap_addr` to translate the given Wasm memory address into a native memory address and pass it into the libcall that implemented the atomic operation, which would then treat the address as a Wasm memory address and pass it to `validate_atomic_addr` to be bounds checked a second time. This is a bit nonsensical, as we are validating a native memory address as if it were a Wasm memory address. Now, we no longer do a `heap_addr` to translate the Wasm memory address to a native memory address. Instead, we pass the Wasm memory address to the libcall, and the libcall is responsible for doing the bounds check (by calling `validate_atomic_addr` with the correct type of memory address now).	2022-11-09 16:19:43 -08:00
Jamey Sharp	86679489ef	cranelift-isle: if-let patterns aren't root terms (#5233 ) The `is_root` flag to `translate_pattern` just determines whether the `rule_term` argument is used, which begs a larger cleanup. But that cleanup is less clear if `is_root` is set anywhere aside from the call in `collect_rules`. So I wanted to get confirmation that this particular use of that flag is incorrect first. These two arguments (`is_root` and `rule_term`) are used to prevent expansion of a term as an internal extractor ("macro") if: - that term is also an internal constructor - and it's the root term on the left-hand side of the current rule - and the pattern we're currently translating has no parents. I'm not sure what it should mean to use the term you're currently defining as the root pattern on the left-hand side of an if-let in the same rule, but I don't think it should have this particular special treatment.	2022-11-09 15:32:33 -08:00
Jamey Sharp	54998715ea	cranelift-isle: Save variable names for later use (#5221 ) It's nice to be able to report these names after sema analysis completes so rule authors can recognize which names they used. This isn't used anywhere yet, but I'm planning to use it during codegen, and the rule-verification folks wanted something like this for debugging output.	2022-11-09 15:21:15 -08:00
Jamey Sharp	d38631a724	cranelift-isle: Don't panic on too-large rule priorities (#5236 ) Found with ISLE's fuzzer.	2022-11-09 20:36:02 +00:00
Nick Fitzgerald	fc62d4ad65	Cranelift: Make `heap_addr` return calculated `base + index + offset` (#5231 ) * Cranelift: Make `heap_addr` return calculated `base + index + offset` Rather than return just the `base + index`. (Note: I've chosen to use the nomenclature "index" for the dynamic operand and "offset" for the static immediate.) This moves the addition of the `offset` into `heap_addr`, instead of leaving it for the subsequent memory operation, so that we can Spectre-guard the full address, and not allow speculative execution to read the first 4GiB of memory. Before this commit, we were effectively doing load(spectre_guard(base + index) + offset) Now we are effectively doing load(spectre_guard(base + index + offset)) Finally, this also corrects `heap_addr`'s documented semantics to say that it returns an address that will trap on access if `index + offset + access_size` is out of bounds for the given heap, rather than saying that the `heap_addr` itself will trap. This matches the implemented behavior for static memories, and after https://github.com/bytecodealliance/wasmtime/pull/5190 lands (which is blocked on this commit) will also match the implemented behavior for dynamic memories. * Update heap_addr docs * Factor out `offset + size` to a helper	2022-11-09 19:53:51 +00:00
Jamey Sharp	33a192556e	cranelift-isle: Do fewer term lookups (#5232 ) While checking the call graph of extractors during semantic validation, save `TermId` instead of `Sym`. The types are both just integer indexes, but the `TermId` is more useful here. Saving it avoids needing to check for failed map lookups twice, which simplifies the implementation.	2022-11-09 11:24:38 -08:00
Trevor Elliott	b077854b57	Generate SSA code from returns (#5172 ) Modify return pseudo-instructions to have pairs of registers: virtual and real. This allows us to constrain the virtual registers to the real ones specified by the abi, instead of directly emitting moves to those real registers.	2022-11-08 16:00:49 -08:00
Chris Fallin	d59caf39b6	Wasmtime+Cranelift: strip out some dead x86-32 code. (#5226 ) * Wasmtime+Cranelift: strip out some dead x86-32 code. I was recently pointed to fastly/Viceroy#200 where it seems some folks are trying to compile Wasmtime (via Viceroy) for Windows x86-32 and the failures may not be loud enough. I've tried to reproduce this cross-compiling to i686-pc-windows-gnu from Linux and hit build failures (as expected) in several places. Nevertheless, while trying to discern what others may be attempting, I noticed some dead x86-32-specific code in our repo, and figured it would be a good idea to clean this up. Otherwise, it (i) sends some mixed messages -- "hey look, this codebase does support x86-32" -- and (ii) keeps untested code around, which is generally not great. This PR removes x86-32-specific cases in traphandlers and unwind code, and Cranelift's native feature detection. It adds helpful compile-error messages in a few cases. If we ever support x86-32 (contributors welcome! The big missing piece is Cranelift support; see #1980), these compile errors and git history should be enough to recover any knowledge we are now encoding in the source. I left the x86-32 support in `wasmtime-fiber` alone because that seems like a bit of a special case -- foundation library, separate from the rest of Wasmtime, with specific care to provide a (presumably working) full 32-bit version. * Remove some extraneous compile_error!s, already covered by others.	2022-11-08 23:03:17 +00:00
Nick Fitzgerald	fd7b903f33	Cranelift: Use a custom enum instead of boolean for the ISLE target (#5228 ) Easier to read and doesn't require `/* is_lower = */`-style comments at call sites.	2022-11-08 21:44:02 +00:00
Trevor Elliott	70bca801ab	cranelift: Resize with `types::INVALID` instead of `types::I8` (#5227 )	2022-11-08 20:42:20 +00:00
Trevor Elliott	d94173ea09	Add a VRegAllocator to separate VReg allocation from VCode (#5222 ) Remove the dependency on VCode for VReg allocation. This will simplify the changes in #5172, as that PR introduces the need to allocate temporary registers from the ABI context. This change also allows us to remove some fields from VCode: reftyped_vregs_set and have_ref_values.	2022-11-08 10:05:02 -08:00
Ulrich Weigand	3e5938e65a	Support big- and little-endian lane order with bitcast (#5196 ) Add a MemFlags operand to the bitcast instruction, where only the `big` and `little` flags are accepted. These define the lane order to be used when casting between types of different lane counts. Update all users to pass an appropriate MemFlags argument. Implement lane swaps where necessary in the s390x back-end. This is the final part necessary to fix https://github.com/bytecodealliance/wasmtime/issues/4566.	2022-11-07 14:41:10 -08:00
Afonso Bordado	9814e8bfeb	fuzzgen: Add a few more ops (#5201 ) Adds `bitselect`,`select` and `select_spectre_guard`	2022-11-07 09:08:26 -08:00
wasmtime-publish	08ef518c95	Bump Wasmtime to 4.0.0 (#5209 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-11-06 13:32:34 -06:00
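
The comparison rewrite mentioned in commit 9967782726 (`A >= B + 1 ==> A > B`) can be illustrated with a small standalone sketch. This is not Cranelift's actual ISLE lowering rule: the names `fits_imm12`, `Cmp`, and `rewrite_uge` are hypothetical, and the sketch assumes unsigned comparisons and the usual AArch64 arithmetic-immediate form (a 12-bit value, optionally shifted left by 12 bits).

```rust
/// Hypothetical check: does `value` fit an AArch64 arithmetic immediate,
/// i.e. a 12-bit unsigned value, optionally shifted left by 12 bits?
fn fits_imm12(value: u64) -> bool {
    value < 0x1000 || ((value & 0xfff) == 0 && value < 0x100_0000)
}

/// Comparison kinds used by this sketch (unsigned only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cmp {
    UnsignedGe,
    UnsignedGt,
}

/// Hypothetical helper: given `a >= c` (unsigned) with constant `c`, return
/// an equivalent comparison whose constant is encodable when possible.
///
/// If `c` is not encodable but `c - 1` is, `a >= c` becomes `a > c - 1`.
/// A non-encodable `c` is at least 0x1000, so `c - 1` cannot underflow.
fn rewrite_uge(c: u64) -> (Cmp, u64) {
    if !fits_imm12(c) && fits_imm12(c - 1) {
        (Cmp::UnsignedGt, c - 1)
    } else {
        (Cmp::UnsignedGe, c)
    }
}
```

For example, `rewrite_uge(0x1001)` yields `(Cmp::UnsignedGt, 0x1000)`: the odd constant `0x1001` is not encodable, while `0x1000` is encodable as a shifted 12-bit immediate. Constants that already fit, or whose predecessor does not fit either, are returned unchanged.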

4328 Commits