wasmtime

Author	SHA1	Message	Date
Trevor Elliott	80c77da334	x64: Lower bitcast, fabs, and fneg in ISLE (#4729 ) * Add tests for bitcast * Migrate bitcast to ISLE * Add tests for fabs * Lower fabs in ISLE * Add tests for fneg * Lower fneg in ISLE	2022-08-18 17:59:23 -07:00
Trevor Elliott	8b6019909b	x64: Lower widening and narrowing operations in ISLE (#4722 ) Lower uwiden_high, uwiden_low, swiden_high, swiden_low, snarrow, and unarrow in ISLE.	2022-08-18 11:53:24 -07:00
Trevor Elliott	0a71df6a37	x64: Refactor vector_all_ones, and remove buggy sse_cmp_op (#4728 ) The sse_cmp_op rule had cases that would produce SseOperand values that aren't legal to use with MInst.XmmRmR, and was only used in vector_all_ones when constructing an XmmRmR value. Additionally, vector_all_ones always called sse_cmp_op with the same type, so the other cases were redundant. The solution in this PR is to remove sse_cmp_op entirely and inline a call to x64_pcmpeqd directly in vector_all_ones, and remove the unused argument from vector_all_ones.	2022-08-17 21:30:52 +00:00
Jamey Sharp	c569e7bea5	Remove unreachable x64 lowerings for iadd_imm (#4726 ) All of the `*_imm` instructions are rewritten during legalization to an explicit `iconst` plus the general form of the operator, so backends never see them. Therefore these ISLE rules in the x64 backend can never match anything.	2022-08-16 22:54:48 +00:00
Trevor Elliott	fbfceaec98	x64: Migrate iadd_pairwise to ISLE (#4718 ) * Add a test for iadd_pairwise with swiden input * Implement iadd_pairwise for swiden_{low,high} input * Add a test case for iadd_pairwise with uwiden input * Implement iadd_pairwise with uwiden	2022-08-16 12:21:06 -07:00
Trevor Elliott	3c1490dd59	x64: Lower fcvt_to_{u,s}int{,_sat} in ISLE (#4704 ) https://github.com/bytecodealliance/wasmtime/pull/4704	2022-08-16 09:03:50 -07:00
Trevor Elliott	498e7156b4	Remove the handling of `cmpps` in `produces_const` (#4714 ) https://github.com/bytecodealliance/wasmtime/pull/4714	2022-08-15 15:48:01 -07:00
Nick Fitzgerald	e0d4934ef4	Cranelift: Remove the `ABICaller` trait (#4711 ) * Cranelift: Remove the `ABICaller` trait It has only one implementation: the `ABICallerImpl` struct. We can just use that directly rather than having extra, unnecessary layers of generics and abstractions. * Cranelift: Rename `ABICallerImpl` to `Caller`	2022-08-15 20:41:08 +00:00
Trevor Elliott	1d0f6fa4fb	Fix a bug in produces_const (#4709 ) https://github.com/bytecodealliance/wasmtime/pull/4709	2022-08-15 19:00:33 +00:00
Nick Fitzgerald	f0c60f46a8	Cranelift: Remove `ABICallee` trait (#4701 ) * Cranelift: Remove `ABICallee` trait It has only one implementation: the `ABICalleeImpl` struct. By using that directly we can avoid unnecessary layers of generics and abstractions as well as a couple `Box`es that were previously putting the single implementation into a `Box<dyn>`. * Cranelift: Rename `ABICalleeImpl` to `AbiCallee` * Fix comments as per review * Rename `AbiCallee` to `Callee`	2022-08-15 18:27:05 +00:00
Afonso Bordado	c6d2a3f94e	cranelift: Add `ireduce`/`iconcat`/`isplit` to the clif fuzzer (#4703 ) * cranelift: Add ireduce to fuzzer * cranelift: Add iconcat/isplit to fuzzer	2022-08-15 09:18:08 -07:00
Benjamin Bouvier	8a9b1a9025	Implement an incremental compilation cache for Cranelift (#4551 ) This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime. After the suggestion of Chris, `Function` has been split into mostly two parts: - on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`. - on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on. Most changes are here to accommodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache: - most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set. - user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`. 
- some refactorings have been made for function names: - `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name. - The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with. The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions. A basic fuzz target has been introduced that tries to do the bare minimum: - check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function. - check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache. - This last check is less efficient and less likely to happen, so probably should be rethought a bit. Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip. Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. 
When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement. Fixes #4155.	2022-08-12 16:47:43 +00:00
Nick Fitzgerald	532fb22af6	Cranelift: Remove the `LowerCtx` trait (#4697 ) The trait had only one implementation: the `Lower` struct. It is easier to just use that directly, and not introduce unnecessary layers of generics and abstractions. Once upon a time, there was hope that we would have other implementations of the `LowerCtx` trait, that did things like lower CLIF to SMTLIB for verification. However, this is not practical these days given the way that the trait has evolved over time, and our verification efforts are focused on ISLE now anyways, and we're actually making some progress on that front (much more than anyone ever did on a second `LowerCtx` trait implementation!)	2022-08-11 16:54:17 -07:00
Afonso Bordado	3ea1813173	x64: Add native lowering for scalar `fma` (#4539 ) Use `vfmadd213{ss,sd}` for these lowerings.	2022-08-11 22:48:16 +00:00
Trevor Elliott	0c2e0494bd	x64: Lower fcvt_from_uint in ISLE (#4684 ) * Add a test for the existing behavior of fcvt_from_uint * Migrate the I8, I16, I32 cases of fcvt_from_uint * Implement the I64 case of fcvt_from_uint * Add a test for the existing behavior of fcvt_from_uint.f64x2 * Migrate fcvt_from_uint.f64x2 to ISLE * Lower the last case of `fcvt_from_uint` * Add a test for `fcvt_from_uint` * Finish lowering fcvt_from_uint * Format	2022-08-11 12:28:41 -07:00
Afonso Bordado	e4adc46e6d	cranelift: Fix shifts and implement rotates in interpreter (#4519 ) * cranelift: Fix shifts and implement rotates in interpreter * x64: Implement `rotl`/`rotr` for some small type combinations	2022-08-11 12:15:52 -07:00
Afonso Bordado	c5bc368cfe	cranelift: Add COFF TLS Support (#4546 ) * cranelift: Implement COFF TLS Relocations * cranelift: Emit SecRel relocations * cranelift: Handle _tls_index symbol in backend	2022-08-11 09:33:40 -07:00
Trevor Elliott	a25d52046b	x64: Migrate fcvt_from_sint and fcvt_low_from_sint to ISLE (#4650 ) https://github.com/bytecodealliance/wasmtime/pull/4650	2022-08-10 10:49:02 -07:00
Trevor Elliott	63c2d1e0c3	x64: Remove unnecessary register use when comparing against constants (#4645 ) https://github.com/bytecodealliance/wasmtime/pull/4645	2022-08-09 23:53:51 +00:00
bjorn3	a4aa7258de	Remove some dead code from the abi code (#4653 ) These were originally used by the old backend framework as part of legalizing function signatures for the respective ABI.	2022-08-09 12:21:55 -07:00
Chris Fallin	de8d44d0e5	Cranelift: MachBuffer: apply branch peephole opts one last time at buffer tail. (#4652 ) The `MachBuffer` applies a set of peephole-optimization rules to do branch threading, leverage fallthrough paths, eliminate empty blocks, and flip conditional branches where needed to make branches more efficient starting from naive always-branch-at-end-of-BB code. This works by applying the rules at every label-bind, which is equivalent to applying them at the end of every basic block, where branches are usually inserted. However, this misses one case: the end of the buffer! Currently we don't optimize any redundant or foldable branches at the very end of the machine code. This usually doesn't matter when the function ends in an epilogue with `ret` as the last instruction. However, when cold blocks exist, it can actually matter. Thanks to @mchesser for pointing out this issue in #4636.	2022-08-09 10:38:48 -07:00
Trevor Elliott	ed7dfd3925	x64: Peephole optimization for `x < 0` (#4625 ) https://github.com/bytecodealliance/wasmtime/pull/4625 Fixes #4607	2022-08-09 09:45:53 -07:00
Michael Chesser	8aee85ebaa	Propagate cold annotations to edge blocks (#4636 ) Update the lowering stage to mark edge blocks as cold if either the predecessor or successor block is cold.	2022-08-09 05:05:57 +00:00
Trevor Elliott	0c2a48f682	x64: Migrate selectif and selectif_spectre_guard to ISLE (#4619 ) https://github.com/bytecodealliance/wasmtime/pull/4619	2022-08-05 09:36:11 -07:00
Trevor Elliott	cd847d071d	x64: Migrate br_table to ISLE (#4615 ) https://github.com/bytecodealliance/wasmtime/pull/4615	2022-08-04 22:12:37 +00:00
Trevor Elliott	dc8362ceec	x64: Finish migrating brz and brnz to ISLE (#4614 ) https://github.com/bytecodealliance/wasmtime/pull/4614	2022-08-04 12:58:43 -07:00
Trevor Elliott	1fc11bbe51	x64: Migrate brff and I128 branching instructions to ISLE (#4599 ) https://github.com/bytecodealliance/wasmtime/pull/4599	2022-08-04 08:58:50 -07:00
Trevor Elliott	301be7438e	x64: Begin migrating branch instructions to ISLE (#4587 ) https://github.com/bytecodealliance/wasmtime/pull/4587	2022-08-03 20:28:52 +00:00
Ulrich Weigand	b9dd48e34b	[s390x, abi_impl] Support struct args using explicit pointers (#4585 ) This adds support for StructArgument on s390x. The ABI for this platform requires that the address of the buffer holding the copy of the struct argument is passed from caller to callee as hidden pointer, using a register or overflow stack slot. To implement this, I've added an optional "pointer" field to ABIArg::StructArg, and code to handle the pointer both in common abi_impl code and the s390x back-end. One notable change necessary to make this work involved the "copy_to_arg_order" mechanism. Currently, for struct args we only need to copy the data (and that needs to happen before setting up any other args), while for non-struct args we only need to set up the appropriate registers or stack slots. This order is ensured by sorting the arguments appropriately into a "copy_to_arg_order" list. However, for struct args with explicit pointers we need to both copy the data (again, before everything else), and set up a register or stack slot. Since we now need to touch the argument twice, we cannot solve the ordering problem by a simple sort. Instead, the abi_impl common code now provides two callbacks, emit_copy_regs_to_buffer and emit_copy_regs_to_arg, and expects the back end to first call emit_copy_regs_to_buffer for all args, and then call emit_copy_regs_to_arg for all args. This required updates to all back ends. In the s390x back end, in addition to the new ABI code, I'm now adding code to actually copy the struct data, using the MVC instruction (for small buffers) or a memcpy libcall (for larger buffers). This also requires a bit of new infrastructure: - MVC is the first memory-to-memory instruction we use, which needed a bit of memory argument tweaking - We also need to set up the infrastructure to emit libcalls. (This implements the first half of issue #4565.)	2022-08-03 19:00:07 +00:00
Anton Kirilov	a897742593	Initial back-edge CFI implementation (#3606 ) Give the user the option to sign and to authenticate function return addresses with the operations introduced by the Pointer Authentication extension to the Arm instruction set architecture. Copyright (c) 2021, Arm Limited.	2022-08-03 11:08:29 -07:00
Afonso Bordado	709716bb8e	cranelift: Implement scalar FMA on x86 (#4460 ) x86 does not have dedicated instructions for scalar FMA, so lower to a libcall, which seems to be what LLVM does.	2022-08-03 10:29:10 -07:00
Nick Fitzgerald	55215bbd1e	Use a `SmallVec` for `ABIArgSlot`s (#4586 ) These are always length 1 for Wasm benchmarks. <h3>Sightglass Benchmark Results</h3> ``` compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm Δ = 328624015.86 ± 40274677.93 (confidence = 99%) main.so is 0.88x to 0.91x faster than slots-smallvec.so! slots-smallvec.so is 1.10x to 1.13x faster than main.so! [3070752447 3203778792.55 3446269274] main.so [2503544039 2875154776.69 3197966713] slots-smallvec.so compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 9685705.06 ± 3221286.87 (confidence = 99%) main.so is 0.91x to 0.96x faster than slots-smallvec.so! slots-smallvec.so is 1.05x to 1.09x faster than main.so! [129356493 145594942.79 165038803] main.so [118555011 135909237.73 188780619] slots-smallvec.so compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm No difference in performance. [79080493 86757564.46 112649639] main.so [78083384 85934125.69 94992743] slots-smallvec.so ```	2022-08-02 17:40:36 -07:00
Nick Fitzgerald	ab1cf3df2d	Use a `SmallVec` for `ABIArg`s (#4584 ) Instead of a regular `Vec`. These vectors are usually very small, for example here is the histogram of sizes when running Sightglass's `pulldown-cmark` benchmark: ``` ;; Number of samples = 10332 ;; Min = 0 ;; Max = 11 ;; ;; Mean = 2.496128532713901 ;; Standard deviation = 2.2859559855427243 ;; Variance = 5.225594767838607 ;; ;; Each ∎ is a count of 62 ;; 0 .. 1 [ 3134 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 1 .. 2 [ 2032 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 2 .. 3 [ 159 ]: ∎∎ 3 .. 4 [ 838 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎ 4 .. 5 [ 970 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 5 .. 6 [ 2566 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 6 .. 7 [ 303 ]: ∎∎∎∎ 7 .. 8 [ 272 ]: ∎∎∎∎ 8 .. 9 [ 40 ]: 9 .. 10 [ 18 ]: ``` By using a `SmallVec` with capacity of 6 we avoid the vast majority of heap allocations and get some nice benchmark wins of up to ~1.11x faster compilation. <h3>Sightglass Benchmark Results</h3> ``` compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm Δ = 340361395.90 ± 63384608.15 (confidence = 99%) main.so is 0.88x to 0.92x faster than smallvec.so! smallvec.so is 1.09x to 1.13x faster than main.so! [3101467423 3425524333.41 4060621653] main.so [2820915877 3085162937.51 3375167352] smallvec.so compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm Δ = 988446098.59 ± 184075718.89 (confidence = 99%) main.so is 0.88x to 0.92x faster than smallvec.so! smallvec.so is 1.09x to 1.13x faster than main.so! [9006994951 9948091070.66 11792481990] main.so [8192243090 8959644972.07 9801848982] smallvec.so compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm Δ = 7854567.87 ± 2215491.16 (confidence = 99%) main.so is 0.89x to 0.94x faster than smallvec.so! smallvec.so is 1.07x to 1.12x faster than main.so! 
[80354527 93864666.76 119789198] main.so [77554917 86010098.89 94726994] smallvec.so compilation :: cycles :: benchmarks/bz2/benchmark.wasm Δ = 22810509.85 ± 6434024.63 (confidence = 99%) main.so is 0.89x to 0.94x faster than smallvec.so! smallvec.so is 1.07x to 1.12x faster than main.so! [233358190 272593088.57 347880715] main.so [225227821 249782578.72 275097380] smallvec.so compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 10849521.41 ± 4324757.85 (confidence = 99%) main.so is 0.90x to 0.96x faster than smallvec.so! smallvec.so is 1.04x to 1.10x faster than main.so! [133875427 156859544.47 222455440] main.so [126073854 146010023.06 181611647] smallvec.so compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 31508176.97 ± 12559561.91 (confidence = 99%) main.so is 0.90x to 0.96x faster than smallvec.so! smallvec.so is 1.04x to 1.10x faster than main.so! [388788638 455536988.31 646034523] main.so [366132033 424028811.34 527419755] smallvec.so ```	2022-08-02 15:53:44 -07:00
Nick Fitzgerald	42bba452a6	Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573 ) * Cranelift: Add instructions for getting the current stack/frame pointers and return address This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535 * x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead We just special case getting operands from `Amode`s now. * Fix s390x `get_return_address`; require `preserve_frame_pointers=true` * Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp * Handle non-allocatable registers in Amode::with_allocs * Use "stack" instead of "r15" on s390x * r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`	2022-08-02 14:37:17 -07:00
Chris Fallin	8dddd6f1f7	Cranelift: Remove `ifcmp_sp` opcode. (#4578 ) This was temporarily added back in #3502 due to a need from Lucet; now that Lucet is EOL, the opcode is no longer needed and we can remove it.	2022-08-02 13:15:39 -07:00
Chris Fallin	43f1765272	Cranelift: remove Baldrdash support and related features. (#4571 ) * Cranelift: remove Baldrdash support and related features. As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team has recently determined that their current form of integration with Cranelift is too hard to maintain, and they have chosen to remove it from their codebase. If and when they decide to build updated support for Cranelift, they will adopt different approaches to several details of the integration. In the meantime, after discussion with the SpiderMonkey folks, they agree that it makes sense to remove the bits of Cranelift that exist to support the integration ("Baldrdash"), as they will not need them. Many of these bits are difficult-to-maintain special cases that are not actually tested in Cranelift proper: for example, the Baldrdash integration required Cranelift to emit function bodies without prologues/epilogues, and instead communicate very precise information about the expected frame size and layout, then stitched together something post-facto. This was brittle and caused a lot of incidental complexity ("fallthrough returns", the resulting special logic in block-ordering); this is just one example. As another example, one particular Baldrdash ABI variant processed stack args in reverse order, so our ABI code had to support both traversal orders. We had a number of other Baldrdash-specific settings as well that did various special things. This PR removes Baldrdash ABI support, the `fallthrough_return` instruction, and pulls some threads to remove now-unused bits as a result of those two, with the understanding that the SpiderMonkey folks will build new functionality as needed in the future and we can perhaps find cleaner abstractions to make it all work. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425 * Review feedback. * Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations. 
The debugger tests invoke `wasmtime` from within each test case under the control of a debugger (gdb or lldb). Some of these tests started to inexplicably fail in CI with unrelated changes, and the failures were only inconsistently reproducible locally. It seems to be cache related: if we disable cached compilation on the nested `wasmtime` invocations, the tests consistently pass. * Review feedback.	2022-08-02 19:37:56 +00:00
Benjamin Bouvier	ff37c9d8a4	[cranelift] Rejigger the `compile` API (#4540 ) * Move `emit_to_memory` to `MachCompileResult` This small refactoring makes it clearer to me that emitting to memory doesn't require anything else from the compilation `Context`. While it's a trivial change, it's a small public API change that shouldn't cause too much trouble, and doesn't seem RFC-worthy. Happy to hear different opinions about this, though! * hide the MachCompileResult behind a method * Add a `CompileError` wrapper type that references a `Function` * Rename MachCompileResult to CompiledCode * Additionally remove the last unsafe API in cranelift-codegen	2022-08-02 12:05:40 -07:00
Trevor Elliott	586ec95c11	ISLE: Allow shadowing in let expressions (#4562 ) * Support shadowing in isle * Re-run the isle build.rs if the examples change * Print error messages when isle tests fail * Move run tests * Refactor `let` uses that don't need to introduce unique names	2022-08-01 21:10:28 +00:00
Trevor Elliott	25782b527e	x64: Migrate trapif and trapff to ISLE (#4545 ) https://github.com/bytecodealliance/wasmtime/pull/4545	2022-08-01 11:24:11 -07:00
Benjamin Bouvier	8d0224341c	cranelift: Introduce a feature to enable `trace` logs (#4484 ) * Don't use `log::trace` directly but a feature-enabled `trace` macro * Don't emit disassembly based on the log level	2022-08-01 11:19:15 +02:00
Trevor Elliott	29d4edc76b	x64: Migrate call and call_indirect to ISLE (#4542 ) https://github.com/bytecodealliance/wasmtime/pull/4542	2022-07-28 13:10:03 -07:00
Afonso Bordado	0508932174	cranelift: Align Scalar and SIMD shift semantics (#4520 ) * cranelift: Reorganize test suite Group some SIMD operations by instruction. * cranelift: Deduplicate some shift tests Also, new tests with the mod behaviour * aarch64: Lower shifts with mod behaviour * x64: Lower shifts with mod behaviour * wasmtime: Don't mask SIMD shifts	2022-07-27 17:54:00 +00:00
Trevor Elliott	7ac6134894	x64: Shrink Inst from 72 to 48 bytes (#4514 ) https://github.com/bytecodealliance/wasmtime/pull/4514	2022-07-27 10:39:22 -07:00
Afonso Bordado	02c3b47db2	x64: Implement SIMD `fma` (#4474 ) * x64: Add VEX Instruction Encoder This uses a similar builder pattern to the EVEX Encoder. Does not yet support memory accesses. * x64: Add FMA Flag * x64: Implement SIMD `fma` * x64: Use 4 register Vex Inst * x64: Reorder VEX pretty print args	2022-07-25 22:01:02 +00:00
Trevor Elliott	9e9e043174	x64: Migrate the return and fallthrough_return lowerings to ISLE (#4518 ) https://github.com/bytecodealliance/wasmtime/pull/4518	2022-07-25 21:28:52 +00:00
Trevor Elliott	ee7e4f4c6b	x64: Port func_addr and symbol_value to ISLE (#4485 ) https://github.com/bytecodealliance/wasmtime/pull/4485	2022-07-25 11:11:16 -07:00
Afonso Bordado	af62037f62	cranelift: Restrict `br_table` to `i32` indices (#4510 ) * cranelift: Restrict `br_table` to `i32` indices In #4498 it was proposed that we should only accept `i32` indices to `br_table`. The rationale for this is that larger types lead the users to a false sense of flexibility (since we don't support jump tables larger than u32's), and narrower types are not well tested paths that would be safer if we removed them. * cranelift: Reduce directly from i128 to i32 in Switch	2022-07-22 23:32:40 +00:00
Nick Fitzgerald	b24c561ceb	cranelift: Don't log CLIF and assembly at debug level (#4503 ) Too verbose. Only log them at trace level.	2022-07-21 15:31:05 -07:00
Trevor Elliott	06407dd337	Add a test to prevent x64 Inst size slipping further (#4489 ) * Add a test to prevent x64 Inst size slipping further * Enable the test for 64-bit pointers only	2022-07-21 00:01:33 +00:00
Trevor Elliott	b519c975cb	x64: Port fdemote and fvdemote to ISLE (#4449 ) https://github.com/bytecodealliance/wasmtime/pull/4449	2022-07-18 14:26:23 -07:00


551 Commits
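
The stencil/parameters split described in commit 8a9b1a9025 (#4551) can be illustrated with a small standalone sketch. Everything below is hypothetical and is not Cranelift's API: `Stencil`, `Params`, `Compiled`, `Cache`, `compile_stencil`, and `finalize` are made-up stand-ins for `FunctionStencil`, `FunctionParameters`, and `CompiledCodeBase<Stencil>`. The commit message says the real cache uses a sha256 hash of the stencil; this sketch swaps in the standard library's `DefaultHasher` only to stay dependency-free. The idea shown is that external names live outside the hashed part, so two functions that differ only in those names share one cache entry and differ only in the final fixup step.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `FunctionStencil`: the hashed part of a function.
/// External names appear only as indices such as "@0".
#[derive(Hash)]
struct Stencil {
    ops: Vec<String>,
}

/// Hypothetical stand-in for `FunctionParameters`: the table of external
/// names, which is deliberately not part of the cache key.
struct Params {
    names: Vec<String>,
}

/// Hypothetical stand-in for compiled code that still needs fixups.
#[derive(Clone)]
struct Compiled {
    code: Vec<String>,
}

/// Placeholder for the expensive compilation step (assumption: it depends
/// only on the stencil).
fn compile_stencil(stencil: &Stencil) -> Compiled {
    Compiled { code: stencil.ops.clone() }
}

/// Apply fixups: replace each "@i" reference with the i-th external name.
/// Indices are visited in reverse so that "@10" is handled before "@1".
fn finalize(compiled: &Compiled, params: &Params) -> Vec<String> {
    compiled
        .code
        .iter()
        .map(|line| {
            let mut out = line.clone();
            for (i, name) in params.names.iter().enumerate().rev() {
                out = out.replace(&format!("@{}", i), name);
            }
            out
        })
        .collect()
}

#[derive(Default)]
struct Cache {
    entries: HashMap<u64, Compiled>,
    hits: usize,
    misses: usize,
}

impl Cache {
    fn compile(&mut self, stencil: &Stencil, params: &Params) -> Vec<String> {
        // The key is computed from the stencil alone. As in the commit
        // message, no equality check is done beyond the hash.
        let mut hasher = DefaultHasher::new();
        stencil.hash(&mut hasher);
        let key = hasher.finish();

        let compiled = match self.entries.get(&key).cloned() {
            Some(hit) => {
                self.hits += 1;
                hit
            }
            None => {
                self.misses += 1;
                let fresh = compile_stencil(stencil);
                self.entries.insert(key, fresh.clone());
                fresh
            }
        };
        finalize(&compiled, params)
    }
}

fn main() {
    let mut cache = Cache::default();
    let stencil = Stencil { ops: vec!["call @0".to_string(), "ret".to_string()] };

    // Same stencil, different external name table: the second call reuses
    // the cached entry and only the fixup step differs.
    let a = cache.compile(&stencil, &Params { names: vec!["foo".to_string()] });
    let b = cache.compile(&stencil, &Params { names: vec!["bar".to_string()] });

    println!("{:?} {:?} hits={} misses={}", a, b, cache.hits, cache.misses);
}
```

Keeping the name table out of the key is what lets a change in the `UserExternalNameRef -> UserExternalName` mapping still hit the cache, which is the behaviour the commit's fuzz target is described as checking.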