wasmtime

Author	SHA1	Message	Date
Trevor Elliott	0c2e0494bd	x64: Lower fcvt_from_uint in ISLE (#4684 ) * Add a test for the existing behavior of fcvt_from_uint * Migrate the I8, I16, I32 cases of fcvt_from_uint * Implement the I64 case of fcvt_from_uint * Add a test for the existing behavior of fcvt_from_uint.f64x2 * Migrate fcvt_from_uint.f64x2 to ISLE * Lower the last case of `fcvt_from_uint` * Add a test for `fcvt_from_uint` * Finish lowering fcvt_from_uint * Format	2022-08-11 12:28:41 -07:00
Afonso Bordado	e4adc46e6d	cranelift: Fix shifts and implement rotates in interpreter (#4519 ) * cranelift: Fix shifts and implement rotates in interpreter * x64: Implement `rotl`/`rotr` for some small type combinations	2022-08-11 12:15:52 -07:00
Ulrich Weigand	67870d1518	s390x: Support both big- and little-endian vector lane order (#4682 ) This implements the s390x back-end portion of the solution for https://github.com/bytecodealliance/wasmtime/issues/4566 We now support both big- and little-endian vector lane order in code generation. The order used for a function is determined by the function's ABI: if it uses a Wasmtime ABI, it will use little-endian lane order, and big-endian lane order otherwise. (This ensures that all raw_bitcast instructions generated by both wasmtime and other cranelift frontends can always be implemented as a no-op.) Lane order affects the implementation of a number of operations: - Vector immediates - Vector memory load / store (in big- and little-endian variants) - Operations explicitly using lane numbers (insertlane, extractlane, shuffle, swizzle) - Operations implicitly using lane numbers (iadd_pairwise, narrow/widen, promote/demote, fcvt_low, vhigh_bits) In addition, when calling a function using a different lane order, we need to lane-swap all vector values passed or returned in registers. A small number of changes to common code were also needed: - Ensure we always select a Wasmtime calling convention on s390x in crates/cranelift (func_signature). - Fix vector immediates for filetests/runtests. In PR #4427, I attempted to fix this by byte-swapping the V128 value, but with the new scheme, we'd instead need to perform a per-lane byte swap. Since we do not know the actual type in write_to_slice and read_from_slice, this isn't easily possible. Revert this part of PR #4427 again, and instead just mark the memory buffer as little-endian when emitting the trampoline; the back-end will then emit correct code to load the constant. - Change a runtest in simd-bitselect-to-vselect.clif to no longer make little-endian lane order assumptions. 
- Remove runtests in simd-swizzle.clif that make little-endian lane order assumptions by relying on implicit type conversion when using a non-i16x8 swizzle result type (this feature should probably be removed anyway). Tested with both wasmtime and cg_clif.	2022-08-11 12:10:46 -07:00
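The lane swap described in this entry can be illustrated outside of Cranelift. The following is a minimal sketch, assuming a 128-bit vector held as 16 bytes; `swap_lanes` is a hypothetical helper written for illustration and is not a function in the s390x back end. It reverses the order of the lanes while leaving the bytes inside each lane untouched, which is different from a plain byte swap of the whole value.

```rust
/// Reverse the order of the lanes in a 128-bit vector, keeping the bytes
/// inside each lane in their original order.
///
/// `lane_bytes` is the lane width in bytes and must divide 16.
/// Hypothetical helper for illustration only; not part of Cranelift.
pub fn swap_lanes(v: [u8; 16], lane_bytes: usize) -> [u8; 16] {
    assert!(
        lane_bytes > 0 && 16 % lane_bytes == 0,
        "lane width must divide 16"
    );
    let lanes = 16 / lane_bytes;
    let mut out = [0u8; 16];
    for lane in 0..lanes {
        // Lane `lane` in the input becomes lane `lanes - 1 - lane` in the output.
        let src = lane * lane_bytes;
        let dst = (lanes - 1 - lane) * lane_bytes;
        out[dst..dst + lane_bytes].copy_from_slice(&v[src..src + lane_bytes]);
    }
    out
}
```

Swapping twice with the same lane width gives back the original value, and a single 16-byte lane is left unchanged.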
Afonso Bordado	c5bc368cfe	cranelift: Add COFF TLS Support (#4546 ) * cranelift: Implement COFF TLS Relocations * cranelift: Emit SecRel relocations * cranelift: Handle _tls_index symbol in backend	2022-08-11 09:33:40 -07:00
Afonso Bordado	268ddf2f6c	cranelift: Implement pinned reg in interpreter (#4375 )	2022-08-10 21:33:45 +00:00
Afonso Bordado	11f0b003eb	cranelift: Build a runtest case from fuzzer TestCase's (#4590 ) * cranelift: Build a runtest case from fuzzer TestCase's * cranelift: Add a default expected output for a fuzzgen case	2022-08-10 21:17:11 +00:00
bjorn3	54f9587569	Don't use libtest harness for filetests (#4655 ) We are using our own test harness for filetests and embedding it in libtest isn't useful. It only hides test output until the end and results in unnecessary noise.	2022-08-10 13:48:15 -07:00
Ulrich Weigand	be36dd6b1e	s390x: Enable object backend (#4680 ) This enables the object backend for s390x, in particular the processing of all required relocations. This uncovered a bug: we need to use PLT relocations for the target of calls, which we currently do not. Fixed by adding a new S390xPLTRel32Dbl reloc type and using it where needed.	2022-08-10 20:07:54 +00:00
Jamey Sharp	ecb91c0b06	List preset's settings in generated comment (#4679 ) Figuring out which boolean settings go into each preset is not easy by inspecting the DSL source (e.g. meta/src/isa/x86.rs). This patch extends the comments in the Rust that's generated by that DSL to list the names of the settings together with the name of the preset.	2022-08-10 19:56:23 +00:00
Trevor Elliott	a25d52046b	x64: Migrate fcvt_from_sint and fcvt_low_from_sint to ISLE (#4650 ) https://github.com/bytecodealliance/wasmtime/pull/4650	2022-08-10 10:49:02 -07:00
bjorn3	f8c0a88299	Fix sret for AArch64 (#4634 ) * Fix sret for AArch64 AArch64 requires the struct return address argument to be stored in the x8 register. This register is never used for regular arguments. * Add extra sret tests for x86_64	2022-08-10 10:34:51 -07:00
Ulrich Weigand	50fcab2984	s390x: Implement tls_value (#4616 ) Implement the tls_value for s390 in the ELF general-dynamic mode. Notable differences to the x86_64 implementation are: - We use a __tls_get_offset libcall instead of __tls_get_addr. - The current thread pointer (stored in a pair of access registers) needs to be added to the result of __tls_get_offset. - __tls_get_offset has a variant ABI that requires that the address of the GOT (global offset table) be passed in %r12. This means we need a new libcall entry for __tls_get_offset. In addition, we also need a way to access _GLOBAL_OFFSET_TABLE_. The latter is a "magic" symbol with a well-known name defined by the ABI and recognized by the linker. This patch introduces a new ExternalName::KnownSymbol variant to support such names (originally due to @afonso360). We also need to emit a relocation on a symbol placed in a constant pool, as well as an extra relocation on the call to __tls_get_offset required for TLS linker optimization. Needed by the cg_clif frontend.	2022-08-10 10:02:07 -07:00
Afonso Bordado	30e2a9bd29	cranelift: Upgrade libm to 0.2.4 (#4670 ) * cranelift: Upgrade libm to 0.2.4 This resolves an issue with incorrect fmaf on the x86_64-pc-windows-gnu target under some inputs. See: #4517 * supply-chain: Vet `libm` 0.2.4	2022-08-10 16:08:39 +00:00
Trevor Elliott	63c2d1e0c3	x64: Remove unnecessary register use when comparing against constants (#4645 ) https://github.com/bytecodealliance/wasmtime/pull/4645	2022-08-09 23:53:51 +00:00
Afonso Bordado	4d2a2cfae6	cranelift: Use `cranelift-jit` in runtests (#4453 ) * cranelift: Use JIT in runtests Using `cranelift-jit` in run tests allows us to perform relocations and libcalls. This is important since some instruction lowerings fall back to libcalls when an extension is missing, or when it's too complicated to implement manually. This is also a first step to being able to test `call`s between functions in the runtest suite. It should also make it easier to eventually test TLS relocations, symbol resolution and ABIs. Another benefit of this is that we also get to test the JIT more, since it now runs the runtests, and gets some fuzzing via `fuzzgen` (which uses the `SingleFunctionCompiler`). This change causes regressions in terms of runtime for the filetests. I haven't done any serious benchmarking but what I've been seeing is that it now takes about 3 seconds to run the testsuite while it previously took around 2 seconds. * Add FMA tests for X86	2022-08-09 14:54:25 -07:00
Afonso Bordado	97b2680f20	cranelift: Remove legalized_to_pointer from function generator (#4665 )	2022-08-09 21:47:26 +00:00
Afonso Bordado	d5de91b953	cranelift: Fuzz cold blocks (#4654 )	2022-08-09 19:43:08 +00:00
bjorn3	a4aa7258de	Remove some dead code from the abi code (#4653 ) These were originally used by the old backend framework as part of legalizing function signatures for the respective ABI.	2022-08-09 12:21:55 -07:00
Trevor Elliott	6b6fc9ec3e	ISLE: Fix a bug with extractor ordering (#4661 ) https://github.com/bytecodealliance/wasmtime/pull/4661 Co-authored-by: Chris Fallin <chris@cfallin.org>	2022-08-09 19:19:32 +00:00
Chris Fallin	953f83e6ac	Cranelift: disallow marking entry block 'cold'. (#4659 ) This is a nonsensical constraint: the entry block must come first in the compiled code's layout, so it cannot also be sunk to the end of the function. This PR modifies the CLIF verifier to disallow this situation entirely. It also adds an assert during final block-order computation to catch the problem (and avoid a silent miscompile) even if the verifier is disabled. Fixes #4656.	2022-08-09 11:52:30 -07:00
Chris Fallin	de8d44d0e5	Cranelift: MachBuffer: apply branch peephole opts one last time at buffer tail. (#4652 ) The `MachBuffer` applies a set of peephole-optimization rules to do branch threading, leverage fallthrough paths, eliminate empty blocks, and flip conditional branches where needed to make branches more efficient starting from naive always-branch-at-end-of-BB code. This works by applying the rules at every label-bind, which is equivalent to applying them at the end of every basic block, where branches are usually inserted. However, this misses one case: the end of the buffer! Currently we don't optimize any redundant or foldable branches at the very end of the machine code. This usually doesn't matter when the function ends in an epilogue with `ret` as the last instruction. However, when cold blocks exist, it can actually matter. Thanks to @mchesser for pointing out this issue in #4636.	2022-08-09 10:38:48 -07:00
Trevor Elliott	ed7dfd3925	x64: Peephole optimization for `x < 0` (#4625 ) https://github.com/bytecodealliance/wasmtime/pull/4625 Fixes #4607	2022-08-09 09:45:53 -07:00
Afonso Bordado	a36a52a017	cranelift: Print error message when basic blocks are invalid (#4591 )	2022-08-09 09:28:41 -07:00
Afonso Bordado	dd6e790090	cranelift: Fuzz Argument Extensions in clif-fuzzer (#4589 )	2022-08-09 09:03:38 -07:00
Michael Chesser	8aee85ebaa	Propagate cold annotations to edge blocks (#4636 ) Update the lowering stage to mark edge blocks as cold if either the predecessor or successor block is cold.	2022-08-09 05:05:57 +00:00
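The rule stated in this entry is small enough to write down directly. This is a sketch under assumptions: `EdgeBlock` and `edge_block_is_cold` are hypothetical names for illustration, not the types used in Cranelift's lowering code.

```rust
/// Hypothetical, simplified description of an edge block: the block inserted
/// on the edge between a predecessor and a successor.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EdgeBlock {
    /// Whether the predecessor block is marked cold.
    pub pred_cold: bool,
    /// Whether the successor block is marked cold.
    pub succ_cold: bool,
}

/// An edge block is treated as cold if either end of the edge is cold.
pub fn edge_block_is_cold(edge: EdgeBlock) -> bool {
    edge.pred_cold || edge.succ_cold
}
```

Only an edge between two hot blocks stays hot under this rule.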
Chris Fallin	863659e04f	VCode emission: account for RA spill/reload/moves in worst-case block size. (#4644 ) To determine whether we need to insert a "veneer island" of branch-range extension veneers, we need to know ahead of emitting a basic block the worst-case size of that block. This is because veneers only go between blocks (we could plop one in the middle of a block but that would require another jump around it and would probably pessimize some code significantly), and we can't back up once we emit a block. To compute this worst-case size, we take the number of instructions and multiply by the largest possible size of one pseudoinst (e.g., on aarch64, this is 44 bytes; it explicitly excludes the `EmitIsland` pseudo-op which is used before large jumptable inline offset tables are emitted). This is conservative, but it always works, and veneers are somewhat rare in practice (function body >1MiB on aarch64 for example). Unfortunately this logic didn't account for the spill/reload/move instructions inserted by the register allocator, and in one example in issue #4629, a block had only one instruction but 482 edge-moves (!). This came at just the wrong time as we were approaching the 1MiB limit on aarch64. This PR fixes that issue, and fixes the logic to actually look at the correct next block (next in `final_order` rather than numerically next), as a bonus correctness fix. Fixes #4629.	2022-08-08 13:57:18 -07:00
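The arithmetic behind this fix can be sketched as follows. The function and parameter names are hypothetical and the island decision is simplified; only the 44-byte worst-case instruction size, the 1 MiB range, and the one-instruction block with 482 edge moves come from the entry above.

```rust
/// Worst-case size in bytes of one basic block. Every instruction, including
/// each spill, reload, or move inserted by the register allocator, is assumed
/// to occupy `worst_case_inst_size` bytes.
/// Hypothetical helper; not the actual VCode emission code.
pub fn worst_case_block_size(
    num_insts: u32,
    num_regalloc_edits: u32,
    worst_case_inst_size: u32,
) -> u32 {
    num_insts
        .saturating_add(num_regalloc_edits)
        .saturating_mul(worst_case_inst_size)
}

/// Simplified island decision: an island is needed before the next block if
/// emitting that block at its worst-case size could move the buffer past
/// `deadline`, the offset at which a pending branch fixup goes out of range.
pub fn needs_island(cur_offset: u32, next_block_worst_case: u32, deadline: u32) -> bool {
    cur_offset.saturating_add(next_block_worst_case) > deadline
}
```

With the numbers from the entry, a block of one instruction and 482 edge moves has a worst case of (1 + 482) * 44 = 21252 bytes, whereas counting only the single instruction gives 44 bytes. Near the 1 MiB limit that difference decides whether an island is emitted in time.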
Damian Heaton	e463890f26	Port `AvgRound` & `SqmulRoundSat` to ISLE (AArch64) (#4639 ) Ported the existing implementations of the following opcodes on AArch64 to ISLE: - `AvgRound` - Also introduced support for `i64x2` vectors, as per the docs. - `SqmulRoundSat` Copyright (c) 2022 Arm Limited	2022-08-08 11:35:43 -07:00
Damian Heaton	47a67d752b	Split `Fmla` and `Bsl` out into new `VecRRRMod` op (#4638 ) Separates the following opcodes for AArch64 into a separate `VecALUModOp` enum, which is emitted via the `VecRRRMod` instruction. This separates vector ALU instructions which modify a register from instructions which write to a new register: - `Bsl` - `Fmla` Addresses [a discussion](https://github.com/bytecodealliance/wasmtime/pull/4608#discussion_r937975581) in #4608. Copyright (c) 2022 Arm Limited	2022-08-08 11:33:13 -07:00
Chris Fallin	c5e3c0cafb	AArch64: don't assert inst within worst-case size when island emitted. (#4627 ) We assert after emitting each instruction that its size was less than the "worst-case size", which is used to determine when we need to proactively emit an island so pending branch fixups don't go out of bounds. However, the `EmitIsland` pseudo-inst itself can cause an arbitrarily large island to be emitted; this should not have to fit within the worst-case size (because island size is explicitly accounted for by the threshold computation). This PR fixes the assert accordingly. Fixes #4626.	2022-08-05 17:27:56 -07:00
Nick Fitzgerald	95e72db458	Some little Cranelift logging things (#4624 ) * Cranelift: Don't print "skipped TEST can't run aarch64" on x64, etc It's way too noisy. Move it to the logs. * Cranelift: Enable Cranelift trace logs in `clif-util` by default * cranelift-filetest: use `log::warn!` for warnings Instead of `println!` * rustfmt	2022-08-05 13:25:24 -07:00
Damian Heaton	eb332b8369	Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64) (#4608 ) * Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64) Ported the existing implementations of the following opcodes to ISLE on AArch64: - `fma` - Introduced missing support for `fma` on vector values, as per the docs. - `valltrue` - `vanytrue` Also fixed `fcmp` on scalar values in the interpreter, and enabled interpreter tests in `simd-fma.clif`. This introduces the `FMLA` machine instruction. Copyright (c) 2022 Arm Limited * Add comments for `Fmla` and `Bsl` Copyright (c) 2022 Arm Limited	2022-08-05 09:47:56 -07:00
Nick Fitzgerald	1ed7b43e62	Cranelift: Remove unused `ABICaller::signature` method (#4621 ) And the `ABICallerImpl::ir_sig` field that was used to implement that method. This removes 56 bytes from the size of `ABICallerImpl` and gives us speed ups to compilation of about 7% on all benchmarks. ``` compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 8205119.48 ± 4069474.25 (confidence = 99%) main.so is 0.91x to 0.97x faster than feature.so! feature.so is 1.03x to 1.10x faster than main.so! [117729152 132258110.36 167484097] main.so [107486500 124052990.88 138008797] feature.so compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm Δ = 4645258.32 ± 1981104.59 (confidence = 99%) main.so is 0.92x to 0.97x faster than feature.so! feature.so is 1.03x to 1.08x faster than main.so! [76562171 85504479.28 93116863] main.so [75180650 80859220.96 90591978] feature.so compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm Δ = 150575617.54 ± 65021102.57 (confidence = 99%) main.so is 0.92x to 0.97x faster than feature.so! feature.so is 1.03x to 1.08x faster than main.so! [2573089039 2843117485.10 3175982602] main.so [2559784932 2692541867.56 3143529008] feature.so ```	2022-08-05 09:46:46 -07:00
Trevor Elliott	0c2a48f682	x64: Migrate selectif and selectif_spectre_guard to ISLE (#4619 ) https://github.com/bytecodealliance/wasmtime/pull/4619	2022-08-05 09:36:11 -07:00
wasmtime-publish	412fa04911	Bump Wasmtime to 0.41.0 (#4620 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-08-04 20:02:19 -05:00
Ulrich Weigand	f552a53654	s390x: Implement bitrev (#4617 ) Since we do not have an instruction for this, this is a simple open-coded implementation. Needed by the cg_clif frontend.	2022-08-04 16:24:55 -07:00
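The entry does not show the emitted sequence. As an illustration of what an open-coded bit reversal can look like, here is the textbook mask-and-shift approach for a 64-bit value; treating this as the shape of the s390x lowering is an assumption, and `bitrev64` is a hypothetical name.

```rust
/// Reverse the bits of a 64-bit value without a dedicated instruction.
/// Textbook approach, shown for illustration; not taken from the s390x back end.
pub fn bitrev64(mut x: u64) -> u64 {
    // Swap adjacent bits.
    x = ((x >> 1) & 0x5555_5555_5555_5555) | ((x & 0x5555_5555_5555_5555) << 1);
    // Swap adjacent pairs of bits.
    x = ((x >> 2) & 0x3333_3333_3333_3333) | ((x & 0x3333_3333_3333_3333) << 2);
    // Swap adjacent nibbles, which completes the reversal inside each byte.
    x = ((x >> 4) & 0x0f0f_0f0f_0f0f_0f0f) | ((x & 0x0f0f_0f0f_0f0f_0f0f) << 4);
    // Reverse the byte order to complete the reversal of the whole value.
    x.swap_bytes()
}
```

The result can be checked against the standard library's `u64::reverse_bits`.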
Trevor Elliott	cd847d071d	x64: Migrate br_table to ISLE (#4615 ) https://github.com/bytecodealliance/wasmtime/pull/4615	2022-08-04 22:12:37 +00:00
Ulrich Weigand	b17b1eb25d	[s390x, abi_impl] Add i128 support (#4598 ) This adds full i128 support to the s390x target, including new filetests and enabling the existing i128 runtest on s390x. The ABI requires that i128 is passed and returned via implicit pointer, but the front end still generates direct i128 types in call. This means we have to implement ABI support to implicitly convert i128 types to pointers when passing arguments. To do so, we add a new variant ABIArg::ImplicitArg. This acts like StructArg, except that the value type is the actual target type, not a pointer type. The required conversions have to be inserted in the prologue and at function call sites. Note that when dereferencing the implicit pointer in the prologue, we may require a temp register: the pointer may be passed on the stack so it needs to be loaded first, but the value register may be in the wrong class for pointer values. In this case, we use the "stack limit" register, which should be available at this point in the prologue. For return values, we use a mechanism similar to the one used for supporting multiple return values in the Wasmtime ABI. The only difference is that the hidden pointer to the return buffer must be the first, not last, argument in this case. (This implements the second half of issue #4565.)	2022-08-04 20:41:26 +00:00
Trevor Elliott	dc8362ceec	x64: Finish migrating brz and brnz to ISLE (#4614 ) https://github.com/bytecodealliance/wasmtime/pull/4614	2022-08-04 12:58:43 -07:00
Teymour Aldridge	ad223c5234	Add `try_use_var` method to `cranelift-frontend`. (#4588 ) * Add `try_use_var` method to `cranelift-frontend`. - Unlike `use_var`, this method does not panic if the variable has not been defined before use * Add `try_declare_var` and `try_def_var`. - Also implement Error for error enums. * Use `write!` macro. * Add `write!` use I missed.	2022-08-04 16:19:15 +00:00
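The difference between a panicking accessor and its `try_` counterpart can be shown with a toy model. Everything below is a stand-in written for illustration: `ToyBuilder`, `ToyUseError`, and the `i64` values are assumptions and do not reproduce the `cranelift-frontend` types or signatures. Only the method names `use_var`, `try_use_var`, and `def_var` are taken from the entry above.

```rust
use std::collections::HashMap;
use std::fmt;

/// Toy variable handle (stand-in, not the cranelift-frontend type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Variable(pub u32);

/// Toy error enum that implements `Error`, as the entry describes for the
/// real error enums.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyUseError {
    NotDefined(Variable),
}

impl fmt::Display for ToyUseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyUseError::NotDefined(v) => write!(f, "variable {} used before definition", v.0),
        }
    }
}

impl std::error::Error for ToyUseError {}

/// Toy builder that maps variables to plain integers.
#[derive(Default)]
pub struct ToyBuilder {
    values: HashMap<Variable, i64>,
}

impl ToyBuilder {
    /// Record a value for `var`.
    pub fn def_var(&mut self, var: Variable, value: i64) {
        self.values.insert(var, value);
    }

    /// Non-panicking lookup: returns an error if `var` has no value yet.
    pub fn try_use_var(&self, var: Variable) -> Result<i64, ToyUseError> {
        self.values
            .get(&var)
            .copied()
            .ok_or(ToyUseError::NotDefined(var))
    }

    /// Panicking lookup, written in terms of the fallible one.
    pub fn use_var(&self, var: Variable) -> i64 {
        self.try_use_var(var).unwrap_or_else(|e| panic!("{}", e))
    }
}
```

A caller that cannot rule out an undefined variable can match on the `Result` from `try_use_var` instead of risking a panic in `use_var`.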
Trevor Elliott	1fc11bbe51	x64: Migrate brff and I128 branching instructions to ISLE (#4599 ) https://github.com/bytecodealliance/wasmtime/pull/4599	2022-08-04 08:58:50 -07:00
Damian Heaton	12a9705fbc	Port `Shuffle` to ISLE (AArch64) (#4596 ) * Port `Shuffle` to ISLE (AArch64) Ported the existing implementation of `Shuffle` for AArch64 to ISLE. Copyright (c) 2022 Arm Limited * Cleanup by shadowing `rn`, `rn2`, and `_` Copyright (c) 2022 Arm Limited	2022-08-04 08:43:23 -07:00
Nick Fitzgerald	70ce288dc7	Save exit Wasm FP and PC in component-to-host trampolines (#4601 ) * Wasmtime: Add a pointer to `VMRuntimeLimits` in component contexts * Save exit Wasm FP and PC in component-to-host trampolines Fixes #4535 * Add comment about why we deref the trampoline's FP * Update some tests to use new `vmruntime_limits_*` methods	2022-08-04 10:27:30 -05:00
Jamey Sharp	f69acd6187	Upgrade regalloc2 -> 0.3.2 (#4603 ) Includes a modest improvement in memory usage and performance by removing analysis that was only used during fuzzing.	2022-08-04 00:06:13 +00:00
Trevor Elliott	301be7438e	x64: Begin migrating branch instructions to ISLE (#4587 ) https://github.com/bytecodealliance/wasmtime/pull/4587	2022-08-03 20:28:52 +00:00
Ulrich Weigand	b9dd48e34b	[s390x, abi_impl] Support struct args using explicit pointers (#4585 ) This adds support for StructArgument on s390x. The ABI for this platform requires that the address of the buffer holding the copy of the struct argument is passed from caller to callee as a hidden pointer, using a register or overflow stack slot. To implement this, I've added an optional "pointer" field to ABIArg::StructArg, and code to handle the pointer both in common abi_impl code and the s390x back-end. One notable change necessary to make this work involved the "copy_to_arg_order" mechanism. Currently, for struct args we only need to copy the data (and that needs to happen before setting up any other args), while for non-struct args we only need to set up the appropriate registers or stack slots. This order is ensured by sorting the arguments appropriately into a "copy_to_arg_order" list. However, for struct args with explicit pointers we need to both copy the data (again, before everything else), and set up a register or stack slot. Since we now need to touch the argument twice, we cannot solve the ordering problem by a simple sort. Instead, the abi_impl common code now provides two callbacks, emit_copy_regs_to_buffer and emit_copy_regs_to_arg, and expects the back end to first call copy..to_buffer for all args, and then call copy.._to_arg for all args. This required updates to all back ends. In the s390x back end, in addition to the new ABI code, I'm now adding code to actually copy the struct data, using the MVC instruction (for small buffers) or a memcpy libcall (for larger buffers). This also requires a bit of new infrastructure: - MVC is the first memory-to-memory instruction we use, which needed a bit of memory argument tweaking - We also need to set up the infrastructure to emit libcalls. (This implements the first half of issue #4565.)	2022-08-03 19:00:07 +00:00
Anton Kirilov	a897742593	Initial back-edge CFI implementation (#3606 ) Give the user the option to sign and to authenticate function return addresses with the operations introduced by the Pointer Authentication extension to the Arm instruction set architecture. Copyright (c) 2021, Arm Limited.	2022-08-03 11:08:29 -07:00
Afonso Bordado	709716bb8e	cranelift: Implement scalar FMA on x86 (#4460 ) x86 does not have dedicated instructions for scalar FMA, lower to a libcall which seems to be what llvm does.	2022-08-03 10:29:10 -07:00
Nick Fitzgerald	55215bbd1e	Use a `SmallVec` for `ABIArgSlot`s (#4586 ) These are always length 1 for Wasm benchmarks. <h3>Sightglass Benchmark Results</h3> ``` compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm Δ = 328624015.86 ± 40274677.93 (confidence = 99%) main.so is 0.88x to 0.91x faster than slots-smallvec.so! slots-smallvec.so is 1.10x to 1.13x faster than main.so! [3070752447 3203778792.55 3446269274] main.so [2503544039 2875154776.69 3197966713] slots-smallvec.so compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 9685705.06 ± 3221286.87 (confidence = 99%) main.so is 0.91x to 0.96x faster than slots-smallvec.so! slots-smallvec.so is 1.05x to 1.09x faster than main.so! [129356493 145594942.79 165038803] main.so [118555011 135909237.73 188780619] slots-smallvec.so compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm No difference in performance. [79080493 86757564.46 112649639] main.so [78083384 85934125.69 94992743] slots-smallvec.so ```	2022-08-02 17:40:36 -07:00
Nick Fitzgerald	ab1cf3df2d	Use a `SmallVec` for `ABIArg`s (#4584 ) Instead of a regular `Vec`. These vectors are usually very small, for example here is the histogram of sizes when running Sightglass's `pulldown-cmark` benchmark: ``` ;; Number of samples = 10332 ;; Min = 0 ;; Max = 11 ;; ;; Mean = 2.496128532713901 ;; Standard deviation = 2.2859559855427243 ;; Variance = 5.225594767838607 ;; ;; Each ∎ is a count of 62 ;; 0 .. 1 [ 3134 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 1 .. 2 [ 2032 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 2 .. 3 [ 159 ]: ∎∎ 3 .. 4 [ 838 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎ 4 .. 5 [ 970 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 5 .. 6 [ 2566 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 6 .. 7 [ 303 ]: ∎∎∎∎ 7 .. 8 [ 272 ]: ∎∎∎∎ 8 .. 9 [ 40 ]: 9 .. 10 [ 18 ]: ``` By using a `SmallVec` with capacity of 6 we avoid the vast majority of heap allocations and get some nice benchmark wins of up to ~1.11x faster compilation. <h3>Sightglass Benchmark Results</h3> ``` compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm Δ = 340361395.90 ± 63384608.15 (confidence = 99%) main.so is 0.88x to 0.92x faster than smallvec.so! smallvec.so is 1.09x to 1.13x faster than main.so! [3101467423 3425524333.41 4060621653] main.so [2820915877 3085162937.51 3375167352] smallvec.so compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm Δ = 988446098.59 ± 184075718.89 (confidence = 99%) main.so is 0.88x to 0.92x faster than smallvec.so! smallvec.so is 1.09x to 1.13x faster than main.so! [9006994951 9948091070.66 11792481990] main.so [8192243090 8959644972.07 9801848982] smallvec.so compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm Δ = 7854567.87 ± 2215491.16 (confidence = 99%) main.so is 0.89x to 0.94x faster than smallvec.so! smallvec.so is 1.07x to 1.12x faster than main.so! 
[80354527 93864666.76 119789198] main.so [77554917 86010098.89 94726994] smallvec.so compilation :: cycles :: benchmarks/bz2/benchmark.wasm Δ = 22810509.85 ± 6434024.63 (confidence = 99%) main.so is 0.89x to 0.94x faster than smallvec.so! smallvec.so is 1.07x to 1.12x faster than main.so! [233358190 272593088.57 347880715] main.so [225227821 249782578.72 275097380] smallvec.so compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 10849521.41 ± 4324757.85 (confidence = 99%) main.so is 0.90x to 0.96x faster than smallvec.so! smallvec.so is 1.04x to 1.10x faster than main.so! [133875427 156859544.47 222455440] main.so [126073854 146010023.06 181611647] smallvec.so compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 31508176.97 ± 12559561.91 (confidence = 99%) main.so is 0.90x to 0.96x faster than smallvec.so! smallvec.so is 1.04x to 1.10x faster than main.so! [388788638 455536988.31 646034523] main.so [366132033 424028811.34 527419755] smallvec.so ```	2022-08-02 15:53:44 -07:00
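The claim that an inline capacity of 6 avoids most heap allocations can be checked against the histogram quoted in this entry. The sketch below assumes that a bucket labelled `n .. n+1` counts vectors of length exactly `n`, and that a vector stays inline when its length is at most the inline capacity. The constant and function names are hypothetical.

```rust
/// (vector length, number of samples), copied from the quoted histogram.
/// Assumes bucket `n .. n+1` holds vectors of length exactly `n`.
pub const ABI_ARG_LEN_HISTOGRAM: [(usize, u64); 10] = [
    (0, 3134),
    (1, 2032),
    (2, 159),
    (3, 838),
    (4, 970),
    (5, 2566),
    (6, 303),
    (7, 272),
    (8, 40),
    (9, 18),
];

/// Returns (samples that fit in `inline_cap` inline slots, total samples).
pub fn inline_hits(hist: &[(usize, u64)], inline_cap: usize) -> (u64, u64) {
    let total: u64 = hist.iter().map(|&(_, count)| count).sum();
    let inline: u64 = hist
        .iter()
        .filter(|&&(len, _)| len <= inline_cap)
        .map(|&(_, count)| count)
        .sum();
    (inline, total)
}
```

Under these assumptions, 10002 of the 10332 sampled vectors (roughly 97%) fit in six inline slots.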
Nick Fitzgerald	edf7f9f2bb	wasmtime: Add lots of logging for `externref`s and `table_ops` fuzz target (#4583 ) I essentially add these same logs back in every time I'm debugging something related to this fuzz target or `externref`s in general. Probably like 5 times I've added roughly these logs. We should just make them available whenever we need them via `RUST_LOG=wasmtime_runtime=trace`. This also changes a couple `if let`s to `unwrap`s that are now infallible after	2022-08-02 15:06:44 -07:00


3870 Commits