I looked through some of our smaller dependencies to vet them and add an
audit for them. These were the ones that were all "obviously correct" to
me; I ran out of steam before reviewing the other crates.
Add a function_alignment function to the TargetIsa trait, and use it to align functions when generating objects. Additionally, collect the maximum alignment required for pc-relative constants in functions and pass that value out. Use the max of these two values when padding functions for alignment.
This fixes a bug on x86_64 where rip-relative loads to sse registers could cause a segfault, as functions weren't always guaranteed to be aligned to 16-byte addresses.
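Roughly the shape of the change, as a hedged sketch: only the `function_alignment` method name comes from the description above; the pared-down trait and the helper are illustrative, not the real Cranelift API.
```
// A pared-down sketch, not the actual Cranelift trait.
pub trait TargetIsa {
    /// Minimum alignment, in bytes, for function starts on this target.
    fn function_alignment(&self) -> u32;
}

/// When padding a function in the emitted object, honor both the ISA's
/// requirement and the strictest alignment needed by any pc-relative
/// constant in the function (16 bytes for rip-relative SSE loads).
fn effective_alignment(isa: &dyn TargetIsa, max_constant_align: u32) -> u32 {
    isa.function_alignment().max(max_constant_align)
}
```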
Fixes #4812
* Cranelift: Deduplicate ABI signatures during lowering
This commit creates the `SigSet` type which interns and deduplicates the ABI
signatures that we create from `ir::Signature`s. The ABI signatures are now
referred to indirectly via a `Sig` (which is a `cranelift_entity` ID), and we
pass around a `SigSet` to anything that needs to access the actual underlying
`SigData` (which is what `ABISig` used to be).
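The interning pattern, as an illustrative sketch: the real code uses `cranelift_entity` maps and the actual `SigData` contents, so everything here is simplified.
```
use std::collections::HashMap;

// Simplified stand-in for the real ABI signature contents.
#[derive(Clone, PartialEq, Eq, Hash)]
struct SigData {
    params: Vec<u8>,
    rets: Vec<u8>,
}

// Compact ID handed around instead of the signature data itself.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Sig(u32);

#[derive(Default)]
struct SigSet {
    sigs: Vec<SigData>,           // `Sig(i)` indexes into this vector
    dedup: HashMap<SigData, Sig>, // maps a signature to its existing ID
}

impl SigSet {
    /// Intern a signature, returning the existing ID if we've seen it.
    fn intern(&mut self, data: SigData) -> Sig {
        if let Some(&sig) = self.dedup.get(&data) {
            return sig;
        }
        let sig = Sig(self.sigs.len() as u32);
        self.sigs.push(data.clone());
        self.dedup.insert(data, sig);
        sig
    }

    fn get(&self, sig: Sig) -> &SigData {
        &self.sigs[sig.0 as usize]
    }
}
```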
I had to change a couple methods to return a `SmallInstVec` instead of emitting
directly to work around what would otherwise be shared and exclusive borrows of
the lowering context overlapping. I don't expect any of these to heap allocate
in practice.
This does not remove the often-unnecessary allocations caused by
`ensure_struct_return_ptr_is_returned`. That is left for follow up work.
This also opens the door for further shuffling of signature data into more
efficient representations in the future, now that we have `SigSet` to store it
all in one place and it is threaded through all the code. We could potentially
move each signature's parameter and return vectors into one big vector shared
between all signatures, for example, which could cut down on allocations and
shrink the size of `SigData` since those `SmallVec`s have pretty large inline
capacity.
Overall, this refactoring gives a 1-7% speedup for compilation on
`pulldown-cmark`:
```
compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
Δ = 8754213.66 ± 7526266.23 (confidence = 99%)
dedupe.so is 1.01x to 1.07x faster than main.so!
[191003295 234620642.20 280597986] dedupe.so
[197626699 243374855.86 321816763] main.so
compilation :: cycles :: benchmarks/bz2/benchmark.wasm
No difference in performance.
[170406200 194299792.68 253001201] dedupe.so
[172071888 193230743.11 223608329] main.so
compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm
No difference in performance.
[3870997347 4437735062.59 5216007266] dedupe.so
[4019924063 4424595349.24 4965088931] main.so
```
* Use full path instead of import to avoid warnings in some build configurations
Such warnings would otherwise cause CI to fail.
* Move `SigSet` into `VCode`
* components: Limit the recursive size of types in Wasmtime
This commit is aimed at fixing #4814 by placing a hard limit on the
maximal recursive depth a type may have in the component model. The
component model theoretically allows infinite recursion, but many
operations within the component model are naturally written as recursion
over the structure of a type, which can lead to stack overflow with
deeply recursive types. Some examples of recursive operations are:
* Lifting and lowering a type - currently the recursion here is modeled
in Rust directly with `#[derive]` implementations as well as the
implementations for the `Val` type.
* Compilation of adapter trampolines which iterates over the type
structure recursively.
* Historically, various calculations like the size of a type, the
flattened representation of a type, etc. were all done recursively.
Many of these are more efficiently done via other means, but it was
still natural to implement them recursively at first.
By placing a hard limit on type recursion Wasmtime won't be able to load
some otherwise-valid modules. The hope, though, is that no human-written
program is likely to ever reach this limit. This limit can be revised
and/or the locations with recursion revised if it's ever reached.
The implementation of this feature is done by generalizing the current
flattened-representation calculation which now keeps track of a type's
depth and size. The size calculation isn't used just yet but I plan to
use it in fixing #4816 and it was natural enough to write here as well.
The depth is checked after a type is translated and if it exceeds the
maximum then an error is returned.
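A minimal sketch of that depth/size bookkeeping; the names and the limit value are hypothetical, not Wasmtime's actual internals.
```
const MAX_TYPE_DEPTH: u32 = 100; // illustrative limit

#[derive(Clone, Copy)]
struct TypeInfo {
    depth: u32, // nesting depth of this type
    size: u32,  // byte size; unused so far, per the note above
}

/// Info for a composite type is derived from the already-computed info of
/// its fields as the type section is translated, so this check itself
/// never recurses over a (possibly adversarial) type structure.
fn record_info(fields: &[TypeInfo]) -> Result<TypeInfo, &'static str> {
    let depth = 1 + fields.iter().map(|f| f.depth).max().unwrap_or(0);
    let size = fields.iter().map(|f| f.size).sum();
    if depth > MAX_TYPE_DEPTH {
        return Err("type exceeds maximum nesting depth");
    }
    Ok(TypeInfo { depth, size })
}
```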
Additionally the `Arbitrary for Type` implementation was updated to
prevent generation of a type that's too-recursive.
Closes #4814
* Remove unused size calculation
* Bump up just under the limit
This commit is a (second?) attempt at improving the generation of
adapter modules to avoid excessively large functions for fuzz-generated
inputs.
The first iteration of adapters simply translated an entire type
inline per-function. This proved problematic however since the size of
the adapter function was on the order of the overall size of a type,
which can be exponential for a type that is otherwise defined in linear
size.
The second iteration of adapters performed a split where memory-based
types would always be translated with individual functions. The theory
here was that once a type was memory-based it was large enough to not
warrant inline translation in the original function and a separate
outlined function could be shared and otherwise used to deduplicate
portions of the original giant function. This again proved problematic,
however, since the splitting heuristic was quite naive and didn't take
into account large stack-based types.
This third iteration in this commit replaces the previous system with a
similar but slightly more general one. Each adapter function now has a
concept of fuel which is decremented each time a layer of a type is
translated. When fuel runs out further translations are deferred to
outlined functions. The fuel counter should hopefully provide a sort of
reasonable upper bound on the size of a function and the outlined
functions should ideally provide the ability to be called from multiple
places and therefore deduplicate what would otherwise be a massive
function.
This final iteration is another attempt at guaranteeing that an adapter
module is linear in size with respect to the input type section of the
original module. Additionally this iteration uniformly handles stack and
memory-based translations which means that stack-based translations
can't go wild in their function size and memory-based translations may
benefit slightly from having at least a little bit of inlining
internally.
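A hedged sketch of the fuel mechanism; the names are illustrative, not the actual adapter-generation code.
```
// Illustrative names only; this is not the real adapter generator.
struct Translator {
    fuel: u32, // decremented once per layer of a type translated inline
}

enum Strategy {
    Inline,  // keep translating in the current adapter function
    Outline, // defer to a separate function that can be shared and reused
}

impl Translator {
    /// Spend one unit of fuel per layer of a type translated inline;
    /// once fuel runs out, defer further layers to outlined functions,
    /// bounding the size of any single adapter function.
    fn next_layer(&mut self) -> Strategy {
        if self.fuel > 0 {
            self.fuel -= 1;
            Strategy::Inline
        } else {
            Strategy::Outline
        }
    }
}
```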
The immediate impact of this is that the `component_api` fuzzer seems to
be running at a faster rate than before. Otherwise #4825 is sufficient
to invalidate preexisting fuzz-bugs and this PR is hopefully the final
nail in the coffin to prevent further timeouts for small inputs cropping
up.
Closes #4816
Add a new pseudo-instruction, `XmmUnaryRmRImm`, to handle instructions like `roundss` that only use their first register argument for the instruction's result. This has the added benefit of allowing the ISLE wrappers for those instructions to take an `XmmMem` argument, allowing for more cases where loads may be merged.
* x64: clean up regalloc-related semantics on several instructions.
This PR removes all uses of "modify" operands on instructions in the x64
backend, and also removes all uses of "pinned vregs", or vregs that are
explicitly tied to particular physical registers. In place of both of
these mechanisms, which are legacies of the old regalloc design and
supported via compatibility code, the backend now uses operand
constraints. This is more flexible as it allows the regalloc to see the
liveranges and constraints without "reverse-engineering" move instructions.
Eventually, after removing all such uses (including in other backends
and by the ABI code), we can remove the compatibility code in regalloc2,
significantly simplifying its liverange-construction frontend and
thus allowing for higher confidence in correctness as well as possibly a
bit more compilation speed.
Curiously, there are a few extra move instructions now; they likely
result from poor splitting decisions, and I can try to chase these down later.
* Fix cranelift-codegen tests.
* Review feedback.
* cranelift: Implement `bnot` in interpreter
* cranelift: Register all functions in test file for interpreter
* cranelift: Relax signature checking for bools and vectors
We were previously implicitly assuming that all Wasm frames in a stack used the
same `VMRuntimeLimits` as the previous frame we walked, but this is not true
when Wasm in store A calls into the host which then calls into Wasm in store B:
```
| ...             |
| Host            |  |
+-----------------+  | stack
| Wasm in store A |  | grows
+-----------------+  | down
| Host            |  |
+-----------------+  |
| Wasm in store B |  V
+-----------------+
```
Trying to walk this stack would previously result in a runtime panic.
The solution is to push the maintenance of our list of saved Wasm FP/SP/PC
registers, which allows us to identify contiguous regions of Wasm frames on
the stack, deeper into `CallThreadState`. The saved registers list is now maintained
whenever updating the `CallThreadState` linked list by making the
`CallThreadState::prev` field private and only accessible via a getter and
setter, where the setter always maintains our invariants.
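A simplified sketch of that invariant; the real type uses raw pointers and unsafe code, and the names and fields here are illustrative.
```
#[derive(Clone, Copy, Default)]
struct SavedWasmRegs {
    fp: usize,
    sp: usize,
    pc: usize,
}

struct CallThreadState {
    saved: SavedWasmRegs,
    // Private: only reachable via the getter/setter below, so the
    // saved-registers list can never get out of sync with the list.
    prev: Option<Box<CallThreadState>>,
}

impl CallThreadState {
    fn prev(&self) -> Option<&CallThreadState> {
        self.prev.as_deref()
    }

    /// The single place the linked list is updated; the saved Wasm
    /// FP/SP/PC bookkeeping would be swapped in and out here.
    fn set_prev(&mut self, prev: Option<Box<CallThreadState>>) {
        // ...maintain `self.saved` to uphold the invariants...
        self.prev = prev;
    }
}
```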
Lower `nop` in ISLE in the x64 backend, and remove the final `Ok(())` from the `lower` function so that any case not handled in ISLE panics.
Ported the existing implementation of `fcmp` for AArch64 to ISLE.
This also ports the `lower_vector_comparison` method to ISLE.
Copyright (c) 2022 Arm Limited
* Add android aarch64 support into c-api
* Remove target test and clean up CMake script c-api
* Deduplicate ExternalProject_Add in c-api Android support
The x64 lowering of `vany_true` both sinks mergeable loads and uses the
original register. This PR fixes the lowering to force the value into a
register first. Ideally we should solve the issue by catching this in
the ISLE type system, as described in #4745, but this resolves the issue
for now.
Fixes #4807.
This retains `lower_amode` in the handwritten code (@akirilov-arm
reports that there is an upcoming patch to port this), but tweaks it
slightly to take a `Value` rather than an `Inst`.
The version of the `arbitrary` crate used in fuzz targets needs to be
the same as the version used in `libfuzzer-sys`. That's why the latter
crate re-exports the former.
But we need to make sure to consistently use the re-exported version.
That's most easily done if that's the only version we have available.
However, `fuzz/Cargo.toml` declared a direct dependency on `arbitrary`,
making it available for import, and leading to that version being used
in a couple places.
There were two copies of `arbitrary` built before, even though they were
the same version: one with the `derive` feature turned on, through the
direct dependency, and one with it turned off when imported through
`libfuzzer-sys`. So I haven't specifically tested this but fuzzer builds
might be slightly faster now.
I have not removed the build-dep on `arbitrary`, because `build.rs` is
not invoked by libFuzzer and so it doesn't matter what version of
`arbitrary` it uses.
Our other crates, like `cranelift-fuzzgen` and `wasmtime-fuzzing`, can
still accidentally use a different version of `arbitrary` than the fuzz
targets which rely on them. This commit only fixes the direct cases
within `fuzz/**`.
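For illustration, a fuzz target importing `arbitrary` through the re-export: the `libfuzzer_sys::arbitrary` re-export and the `fuzz_target!` macro are real libfuzzer-sys API, while the target itself is a toy.
```
#![no_main]

// Import `arbitrary` through the libfuzzer-sys re-export so the fuzz
// target and libfuzzer-sys agree on one version; `self` is included so
// the paths generated by `#[derive(Arbitrary)]` resolve.
use libfuzzer_sys::{
    arbitrary::{self, Arbitrary},
    fuzz_target,
};

// A toy structured input; a real target would define its own.
#[derive(Arbitrary, Debug)]
struct Input {
    a: u32,
    b: u32,
}

fuzz_target!(|input: Input| {
    let _ = input.a.wrapping_add(input.b);
});
```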
Ensure that constants generated for the memory case of XmmMem values are always 16 bytes, so that we never accidentally perform an unaligned load.
Fixes #4761
* cranelift: Change test runner order
Changes the ordering of runtests to run per target and then per function.
This change doesn't do a lot by itself, but helps future refactorings of runtests.
* cranelift: Rename SingleFunctionCompiler to TestCaseCompiler
* cranelift: Skip runtests per target instead of per run
* cranelift: Deduplicate test names
With the upcoming changes to the runtest infrastructure we require unique `ExtName`s for all tests.
Note that test names are limited to 16 characters and must be unique within those first 16 characters.
* cranelift: Add TestFileCompiler to runtests
TestFileCompiler allows us to compile the entire file once, and then call the trampolines for each test.
The previous code was compiling the function for each invocation of a test.
* cranelift: Deduplicate ExtName for avg_round tests
* cranelift: Rename functions as they are defined.
The JIT internally only deals with `User` functions, and cannot link functions with testcase names.
This also caches trampolines by signature.
* cranelift: Preserve original name when reporting errors.
* cranelift: Rename aarch64 test functions
* cranelift: Add `call` and `call_indirect` tests!
* cranelift: Add pauth runtests for aarch64
* cranelift: Rename duplicate s390x tests
* cranelift: Delete `i128_bricmp_of` function from i128-bricmp
It looks like we forgot to delete it when it was moved to
`i128-bricmp-overflow`, and since it didn't have a run invocation
it was never compiled. Now that the whole file is compiled at once,
it would be; however, s390x does not support it and panics when lowering.
* cranelift: Add `colocated` call tests
* cranelift: Rename *more* `s390x` tests
* cranelift: Add pauth + sign_return_address call tests
* cranelift: Undeduplicate test names
With the latest main changes we now support *unlimited* length test names.
This commit reverts:
52274676ff631c630f9879dd32e756566d3e700f
7989edc172493547cdf63e180bb58365e8a43a42
25c8a8395527d98976be6a34baa3b0b214776739
792e8cfa8f748077f9d80fe7ee5e958b7124e83b
* cranelift: Add LibCall tests
* cranelift: Revert more test names
These weren't auto reverted by the previous revert.
* cranelift: Disable libcall tests for aarch64
* cranelift: Runtest fibonacci tests
* cranelift: Misc cleanup
* Implement the remaining socket-related WASI functions.
The original WASI specification included `sock_read`, `sock_write`, and
`shutdown`. Now that we have some sockets support, implement these
additional functions, to make it easier for people porting existing code
to WASI.
It's expected that this will all be subsumed by the wasi-sockets
proposal, but for now, this is a relatively small change which should
hopefully unblock people trying to use the current `accept` support.
* Update to system-interface 0.22, which has fixes for Windows.
* Make wasi-cap-std-sync's dependency on system-interface private.
Change some `pub` functions which exposed system-interface types to be
non-`pub`.
And, change `from_sysif_fdflags` functions to `get_fd_flags` functions
that take `impl AsFilelike` arguments instead of system-interface types.
With these changes, system-interface is no longer exposed in the
public API.
* Add a public API for `is_read_write` too.
Implementors using types that implement `AsFilelike` may want to use the
same `is_read_write` logic without explicitly depending on
system-interface, so provide a public function for that.
POSIX specifies that functions like `nanosleep` use the REALTIME clock,
so allow WASI `poll_oneoff` calls to use the REALTIME clock, at least
for non-absolute intervals. POSIX also specifies that these timeouts must
not be affected by subsequent `clock_settime` calls, which means they
behave the same way as MONOTONIC clock requests, so we can implement them
as monotonic requests.
Lower extractlane, scalar_to_vector and splat in ISLE.
This PR also makes some changes to the `SinkableLoad` API:
* change the return type of `sink_load` to `RegMem`, as more functions are available for dealing with `RegMem`
* add `reg_mem_to_reg_mem_imm` and register it as an automatic conversion
This change is a follow-on from #4515 to add the ability to configure
the `differential` fuzz target by limiting which engines and modules are
used for fuzzing. This is incredibly useful when troubleshooting, e.g.,
when an engine is more prone to failure, we can target that engine
exclusively. The effect of this configuration is visible in the
statistics now printed out from #4739.
Engines are configured using the `ALLOWED_ENGINES` environment variable.
We can either subtract from the set of allowed engines (e.g.,
`ALLOWED_ENGINES=-v8`) or build up a set of allowed engines (e.g.,
`ALLOWED_ENGINES=wasmi,spec`), but not both at the same time.
`ALLOWED_ENGINES` only configures the left-hand side engine; the
right-hand side is always Wasmtime. When omitted, `ALLOWED_ENGINES`
defaults to [`wasmtime`, `wasmi`, `spec`, `v8`].
The generated WebAssembly modules are configured using
`ALLOWED_MODULES`. This environment variable works the same way as above,
but the available options are: [`wasm-smith`, `single-inst`].
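A hypothetical sketch of those semantics; the real parsing code in the fuzz target surely differs in detail.
```
/// Resolve an `ALLOWED_ENGINES`-style spec against a default set: either
/// subtract `-name` entries from the defaults, or treat the entries as an
/// allowlist (the real implementation rejects mixing the two forms).
fn allowed(defaults: &[&str], spec: &str) -> Vec<String> {
    let items: Vec<&str> = spec.split(',').filter(|s| !s.is_empty()).collect();
    if !items.is_empty() && items.iter().all(|s| s.starts_with('-')) {
        // Subtractive form, e.g. `ALLOWED_ENGINES=-v8`.
        defaults
            .iter()
            .filter(|d| !items.contains(&format!("-{d}").as_str()))
            .map(|d| d.to_string())
            .collect()
    } else {
        // Allowlist form, e.g. `ALLOWED_ENGINES=wasmi,spec`.
        items.iter().map(|s| s.to_string()).collect()
    }
}

fn main() {
    let defaults = ["wasmtime", "wasmi", "spec", "v8"];
    assert_eq!(allowed(&defaults, "-v8"), ["wasmtime", "wasmi", "spec"]);
    assert_eq!(allowed(&defaults, "wasmi,spec"), ["wasmi", "spec"]);
}
```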
This was likely a copy-paste from the `ast::Pattern` case, but here it
is checking a term name in `ast::Expr` and so should say "... in
expression", not "... in pattern".
Lower `shuffle` and `swizzle` in ISLE.
This PR surfaced a bug with the lowering of `shuffle` when avx512vl and avx512vbmi are enabled: we use `vpermi2b` as the implementation, but panic if the immediate shuffle mask contains any out-of-bounds values. The behavior when the avx512 extensions are not present is that out-of-bounds values are turned into `0` in the result.
I've resolved this by detecting when the shuffle immediate has out-of-bounds indices in the avx512-enabled lowering, and generating an additional mask to zero out the lanes where those indices occur. This brings the avx512 case into line with the semantics of the `shuffle` op: 94bcbe8446/cranelift/codegen/meta/src/shared/instructions.rs (L1495-L1498)
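A sketch of the index check behind that fix, under the assumption that indices at or above 32 (i.e. past the two 16-byte inputs) are the out-of-bounds ones; the mask-combining itself would happen in the emitted vector code.
```
/// Build a lane mask that zeroes any result lane whose shuffle-immediate
/// index is out of bounds, matching CLIF `shuffle` semantics; returns
/// `None` when no extra masking is needed.
fn oob_lane_mask(imm: &[u8; 16]) -> Option<[u8; 16]> {
    if imm.iter().all(|&idx| idx < 32) {
        return None; // every index selects a real input byte
    }
    let mut mask = [0u8; 16];
    for (m, &idx) in mask.iter_mut().zip(imm) {
        *m = if idx < 32 { 0xff } else { 0x00 };
    }
    Some(mask)
}
```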
* Port `Fcopysign`..`FcvtToSintSat` to ISLE (AArch64)
Ported the existing implementations of the following opcodes to ISLE on
AArch64:
- `Fcopysign`
- Also introduced missing support for `fcopysign` on vector values, as
per the docs.
- This introduces the vector encoding for the `SLI` machine
instruction.
- `FcvtToUint`
- `FcvtToSint`
- `FcvtFromUint`
- `FcvtFromSint`
- `FcvtToUintSat`
- `FcvtToSintSat`
Copyright (c) 2022 Arm Limited
* Document helpers and abstract conversion checks
* x64: Mask shift amounts for small types
* cranelift: Disable i128 shifts in fuzzer again
They are fixed, but we had a bunch of fuzzgen issues come in, and we don't want to accidentally mark them as fixed.
* cranelift: Avoid masking shifts for 32 and 64 bit cases
* cranelift: Add const shift tests and fix them
* cranelift: Remove const `rotl` cases
Now that `put_masked_in_imm8_gpr` works properly we can simplify `rotl`/`rotr`.
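The masking rule, as a small illustrative helper; this is not the actual `put_masked_in_imm8_gpr` implementation.
```
/// x86 shift instructions already wrap the shift amount modulo 32 or 64,
/// so only the narrow i8/i16 cases need an explicit mask before lowering
/// (hence the 32- and 64-bit cases can skip it).
fn masked_shift_amount(ty_bits: u32, amt: u32) -> u32 {
    debug_assert!(ty_bits.is_power_of_two());
    amt & (ty_bits - 1) // & 7 for i8, & 15 for i16
}
```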
In order to keep the `ExternalName` enum small, the `TestcaseName`
struct was limited to 17 bytes: a 1 byte length and a 16 byte buffer.
Due to alignment, that made `ExternalName` 20 bytes.
That fixed-size buffer means that the names of functions in Cranelift
filetests are truncated to fit, which limits our ability to give tests
meaningful names. And I think meaningful names are important in tests.
This patch replaces the inline `TestcaseName` buffer with a
heap-allocated slice. We don't care about performance for test names, so
an indirection out to the heap is fine in that case. But we do care
somewhat about the size of `ExternalName` when it's used during
compiles.
On 64-bit systems, `Box<[u8]>` is 16 bytes, so `TestcaseName` gets one
byte smaller. Unfortunately, its alignment is 8 bytes, so `ExternalName`
grows from 20 to 24 bytes.
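The two layouts, sketched for illustration (simplified; not the exact Cranelift definitions):
```
// Before: a 1-byte length plus a fixed 16-byte buffer = 17 bytes,
// 1-byte aligned, which padded `ExternalName` out to 20 bytes.
struct TestcaseNameInline {
    length: u8,
    ascii: [u8; 16],
}

// After: a heap-allocated slice. `Box<[u8]>` is a fat pointer (pointer
// plus length), 16 bytes on 64-bit targets but 8-byte aligned, which is
// what grows `ExternalName` from 20 to 24 bytes.
struct TestcaseNameBoxed(Box<[u8]>);
```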
According to `valgrind --tool=dhat`, this change has very little effect
on compiler performance. Building wasmtime with `--no-default-features
--release`, and compiling the pulldown-cmark benchmark from Sightglass,
I measured these differences between `main` and this patch:
- total number of allocations didn't change (`ExternalName::TestCase` is
not used in normal compiles)
- 592 more bytes allocated over the process lifetime, out of 171.5MiB
- 320 more bytes allocated at peak heap size, out of 12MiB
- 0.24% more instructions executed
- 16,987 more bytes written
- 12,120 _fewer_ bytes read