Commit Graph

10164 Commits

Author SHA1 Message Date
Alex Crichton
755cd4311e Update max tuple size in component api fuzzing (#4675)
Fixes a build failure on #4673 where tuples of length 16 don't implement
`Debug` from the standard library.
2022-08-11 20:24:48 +00:00
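For context on the build failure above: the standard library only implements `Debug` for tuples of up to 12 elements, so a 16-element tuple cannot be formatted with `{:?}`. A minimal illustration (not the fuzzer's code):

```
fn main() {
    // Tuples of up to 12 elements implement `Debug`...
    let small = (1, 2, 3);
    println!("{:?}", small);

    // ...but a 16-element tuple does not, so formatting it fails to compile:
    let large = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
    // println!("{:?}", large); // error: `Debug` is not implemented for tuples this large
    let _ = large;
}
```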
Alex Crichton
380db48ce6 Enable the memory-init-cow feature when building the C API (#4690)
This feature was accidentally disabled by default when building the C API.
2022-08-11 20:09:46 +00:00
Trevor Elliott
0c2e0494bd x64: Lower fcvt_from_uint in ISLE (#4684)
* Add a test for the existing behavior of fcvt_from_uint

* Migrate the I8, I16, I32 cases of fcvt_from_uint

* Implement the I64 case of fcvt_from_uint

* Add a test for the existing behavior of fcvt_from_uint.f64x2

* Migrate fcvt_from_uint.f64x2 to ISLE

* Lower the last case of `fcvt_from_uint`

* Add a test for `fcvt_from_uint`

* Finish lowering fcvt_from_uint

* Format
2022-08-11 12:28:41 -07:00
Andrew Brown
c4fd6a95da [fuzz] Remove unnecessary allocation (#4689)
This resolves a comment @jameysharp made in a previous PR.
2022-08-11 19:26:33 +00:00
Afonso Bordado
e4adc46e6d cranelift: Fix shifts and implement rotates in interpreter (#4519)
* cranelift: Fix shifts and implement rotates in interpreter

* x64: Implement `rotl`/`rotr` for some small type combinations
2022-08-11 12:15:52 -07:00
Ulrich Weigand
67870d1518 s390x: Support both big- and little-endian vector lane order (#4682)
This implements the s390x back-end portion of the solution for
https://github.com/bytecodealliance/wasmtime/issues/4566

We now support both big- and little-endian vector lane order
in code generation.  The order used for a function is determined
by the function's ABI: if it uses a Wasmtime ABI, it will use
little-endian lane order, and big-endian lane order otherwise.
(This ensures that all raw_bitcast instructions generated by
both wasmtime and other cranelift frontends can always be
implemented as a no-op.)

Lane order affects the implementation of a number of operations:
- Vector immediates
- Vector memory load / store (in big- and little-endian variants)
- Operations explicitly using lane numbers
  (insertlane, extractlane, shuffle, swizzle)
- Operations implicitly using lane numbers
  (iadd_pairwise, narrow/widen, promote/demote, fcvt_low, vhigh_bits)

In addition, when calling a function using a different lane order,
we need to lane-swap all vector values passed or returned in registers.

A small number of changes to common code were also needed:

- Ensure we always select a Wasmtime calling convention on s390x
  in crates/cranelift (func_signature).

- Fix vector immediates for filetests/runtests.  In PR #4427,
  I attempted to fix this by byte-swapping the V128 value, but
  with the new scheme, we'd instead need to perform a per-lane
  byte swap.  Since we do not know the actual type in write_to_slice
  and read_from_slice, this isn't easily possible.

  Revert this part of PR #4427 again, and instead just mark the
  memory buffer as little-endian when emitting the trampoline;
  the back-end will then emit correct code to load the constant.

- Change a runtest in simd-bitselect-to-vselect.clif to no longer
  make little-endian lane order assumptions.

- Remove runtests in simd-swizzle.clif that make little-endian
  lane order assumptions by relying on implicit type conversion
  when using a non-i16x8 swizzle result type (this feature should
  probably be removed anyway).

Tested with both wasmtime and cg_clif.
2022-08-11 12:10:46 -07:00
Alex Crichton
c1c48b4386 Don't be clever about representing non-CoW images (#4691)
This commit fixes a build warning on Rust 1.63 when the `memory-init-cow`
feature is disabled in the `wasmtime-runtime` crate. Some "tricks" were
previously used to make the `MemoryImage` type an empty `enum {}`, but that
wreaks havoc with warnings, so this commit instead just makes it a unit
struct and makes all methods panic (as they shouldn't be hit anyway).
2022-08-11 18:16:28 +00:00
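A hedged sketch of the shape described above (names and methods are illustrative, not the `wasmtime-runtime` crate's exact code): a unit struct with panicking, never-reached methods replaces the uninhabited enum when copy-on-write support is compiled out.

```
pub struct MemoryImage {
    _private: (),
}

impl MemoryImage {
    pub fn new() -> Option<MemoryImage> {
        // Without copy-on-write support there is never an image to create.
        None
    }

    pub fn len(&self) -> usize {
        // Can't be reached because `new` never returns an instance.
        unreachable!("memory-init-cow support is disabled")
    }
}
```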
Afonso Bordado
c5bc368cfe cranelift: Add COFF TLS Support (#4546)
* cranelift: Implement COFF TLS Relocations

* cranelift: Emit SecRel relocations

* cranelift: Handle _tls_index symbol in backend
2022-08-11 09:33:40 -07:00
Benjamin Bouvier
a40b253792 Uncomment unwind stack frame tests that now pass on aarch64 (#4687)
Thanks to #4431 and @fitzgen who implemented it!
2022-08-11 15:09:04 +00:00
Andrew Brown
c3e31c9946 [fuzz] Document Wasm-JS conversions (#4683)
During differential execution against V8, Wasm values need to be
converted back and forth from JS values. This change documents the
location in the specification where this is defined.
2022-08-10 23:43:43 +00:00
Afonso Bordado
268ddf2f6c cranelift: Implement pinned reg in interpreter (#4375) 2022-08-10 21:33:45 +00:00
Afonso Bordado
11f0b003eb cranelift: Build a runtest case from fuzzer `TestCase`s (#4590)
* cranelift: Build a runtest case from fuzzer `TestCase`s

* cranelift: Add a default expected output for a fuzzgen case
2022-08-10 21:17:11 +00:00
Alex Crichton
597eb6f4ce Limit the type hierarchies in component fuzzing (#4668)
* Limit the type hierarchies in component fuzzing

For now `wasmparser` has a hard limit on the size of tuples and such at
1000 recursive types within the tuple itself. Respect this limit by
limiting the width of recursive types generated for the `component_api`
fuzzer. This commit unifies this new requirement with the preexisting
`TupleArray` and `NonEmptyArray` types into one `VecInRange<T, L, H>`
which allows expressing all of these various requirements in one type.

* Fix a compile error on `main`

* Review comments
2022-08-10 20:49:51 +00:00
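A hypothetical sketch of what a `VecInRange<T, L, H>`-style helper can look like with the `arbitrary` crate (the fuzzer's actual definition may differ): a vector whose generated length always falls between `L` and `H`, covering both the old "non-empty" and the new "bounded width" requirements.

```
use arbitrary::{Arbitrary, Result, Unstructured};

#[derive(Debug)]
struct VecInRange<T, const L: u32, const H: u32>(Vec<T>);

impl<'a, T: Arbitrary<'a>, const L: u32, const H: u32> Arbitrary<'a> for VecInRange<T, L, H> {
    fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self> {
        // Pick a length in `L..=H`, then fill the vector with arbitrary elements.
        let len = u.int_in_range(L..=H)?;
        let mut items = Vec::with_capacity(len as usize);
        for _ in 0..len {
            items.push(T::arbitrary(u)?);
        }
        Ok(VecInRange(items))
    }
}
```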
bjorn3
54f9587569 Don't use libtest harness for filetests (#4655)
We are using our own test harness for filetests and embedding it in
libtest isn't useful. It only hides test output until the end and
results in unnecessary noise.
2022-08-10 13:48:15 -07:00
Dan Gohman
918debfe59 Stop returning NOTCAPABLE errors from WASI calls. (#4666)
* Stop returning `NOTCAPABLE` errors from WASI calls.

`ENOTCAPABLE` is an error code used as part of the rights system inherited
from CloudABI. There is a set of flags associated with each file
descriptor listing which operations can be performed with the file
descriptor, and if an attempt is made to perform an operation with a
file descriptor that isn't permitted by its rights flags, it fails with
`ENOTCAPABLE`.

WASI is removing the rights system. For example, WebAssembly/wasi-libc#294
removed support for translating `ENOTCAPABLE` into POSIX error codes, on
the assumption that engines should stop using it.

So as another step to migrating away from the rights system, remove uses
of the `ENOTCAPABLE` error.

* Update crates/wasi-common/src/file.rs

Co-authored-by: Jamey Sharp <jamey@minilop.net>

* Update crates/wasi-common/src/dir.rs

Co-authored-by: Jamey Sharp <jamey@minilop.net>

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2022-08-10 13:44:23 -07:00
Ulrich Weigand
be36dd6b1e s390x: Enable object backend (#4680)
This enables the object backend for s390x, in particular the
processing of all required relocations.

This uncovered a bug: we need to use PLT relocations for the
target of calls, which we currently do not.  Fixed by adding
a new S390xPLTRel32Dbl reloc type and using it where needed.
2022-08-10 20:07:54 +00:00
Jamey Sharp
ecb91c0b06 List preset's settings in generated comment (#4679)
Figuring out which boolean settings go into each preset is not easy by
inspecting the DSL source (e.g. meta/src/isa/x86.rs). This patch extends
the comments in the Rust that's generated by that DSL to list the names
of the settings together with the name of the preset.
2022-08-10 19:56:23 +00:00
Trevor Elliott
a25d52046b x64: Migrate fcvt_from_sint and fcvt_low_from_sint to ISLE (#4650)
https://github.com/bytecodealliance/wasmtime/pull/4650
2022-08-10 10:49:02 -07:00
bjorn3
f8c0a88299 Fix sret for AArch64 (#4634)
* Fix sret for AArch64

AArch64 requires the struct return address argument to be stored in the x8
register. This register is never used for regular arguments.

* Add extra sret tests for x86_64
2022-08-10 10:34:51 -07:00
Ulrich Weigand
50fcab2984 s390x: Implement tls_value (#4616)
Implement the `tls_value` instruction for s390x in ELF general-dynamic mode.

Notable differences to the x86_64 implementation are:
- We use a __tls_get_offset libcall instead of __tls_get_addr.
- The current thread pointer (stored in a pair of access registers)
  needs to be added to the result of __tls_get_offset.
- __tls_get_offset has a variant ABI that requires the address of
  the GOT (global offset table) is passed in %r12.

This means we need a new libcall entry for __tls_get_offset.
In addition, we also need a way to access _GLOBAL_OFFSET_TABLE_.
The latter is a "magic" symbol with a well-known name defined
by the ABI and recognized by the linker.  This patch introduces
a new ExternalName::KnownSymbol variant to support such names
(originally due to @afonso360).

We also need to emit a relocation on a symbol placed in a
constant pool, as well as an extra relocation on the call
to __tls_get_offset required for TLS linker optimization.

Needed by the cg_clif frontend.
2022-08-10 10:02:07 -07:00
Andrew Brown
354daf5b93 [fuzz] Fix issues with single-inst module generator (#4674)
* [fuzz] Fix signature of `i64.extend32_s` single-instruction test

This single-instruction test incorrectly attempted to convert an `i32`
to an `i64`; the correct signature is `i64 -> i64`. See the [WebAssembly
specification](https://webassembly.github.io/spec/core/bikeshed/#a7-index-of-instructions).

* [fuzz] Fix typo in single-instruction function generator

Previously, the `unary!` macro created functions that used two operands
instead of the expected one.
2022-08-10 16:47:02 +00:00
Alex Crichton
96a2ba70b4 Update 0.40.0 release notes (#4660)
* Update 0.40.0 release notes

Not a ton happened in terms of user-facing improvements here so I
outlined some internal changes as well. The cumulative effect of
improving compile times is Sightglass showing 30-40% improvements for
major benchmarks. Additionally I wrote down a note indicating that this
is likely the last `0.*` release and the next release of Wasmtime on
September 20 is planned to be 1.0.

* Remove perf-related relnotes

* Call out s390x simd at the top-level
2022-08-10 16:23:27 +00:00
Afonso Bordado
30e2a9bd29 cranelift: Upgrade libm to 0.2.4 (#4670)
* cranelift: Upgrade libm to 0.2.4

This resolves an issue where `fmaf` produced incorrect results on the x86_64-pc-windows-gnu target for some inputs.

See: #4517

* supply-chain: Vet `libm` 0.2.4
2022-08-10 16:08:39 +00:00
Alex Crichton
fd28d94352 Shield compiled modules from their appended metadata (#4609)
This commit fixes #4600 in a somewhat roundabout fashion. Currently the
`main` branch of Wasmtime exhibits unusual behavior:

* If `./ci/run-tests.sh` is run then the `cache_accounts_for_opt_level`
  test does not fail.
* If `cargo test -p wasmtime --lib` is run, however, then the test
  fails.

This test is indeed being run as part of `./ci/run-tests.sh` and it's
also passing in CI. The exact failure is that part of the debuginfo
support we have takes an existing ELF image, copies it, and then appends
some information to inform profilers/gdb about the image. This code is
all quite old at this point and not 100% optimal, but that's at least
where we're at.

The problem is that the appended `ProgramHeader64` is not aligned
correctly during `cargo test -p wasmtime --lib`, which triggers the panic
that causes the test to fail. The reason, however, that this test
passes with `./ci/run-tests.sh` is that the alignment of
`ProgramHeader64` is 1 instead of 8. The reason for that is that the
`object` crate has an `unaligned` feature which forcibly unaligns all
primitives to 1 byte instead of their natural alignment. During `cargo
test -p wasmtime --lib` this feature is not enabled but during
`./ci/run-tests.sh` this feature is enabled. The feature is currently
enabled through inclusion of the `backtrace` crate which only happens
for some tests in some crates.

The alignment issue explains why the test fails in a single-crate test run
but passes in the whole-workspace test run. The next issue I investigated
was if this test ever passed. It turns out that on v0.39.0 this test
passed, and the regression to main was introduced during #4571. That
PR, however, has nothing to do with any of this! The reason that it
showed up as causing a "regression" is that it changed Cranelift
settings, which changed the size of the serialized metadata at the
end of a Wasmtime cache object.

Wasmtime compiled artifacts are ELF images with Wasmtime-specific
metadata appended after them. This appended metadata was making its way
all the way through to the gdbjit image itself, which meant that while the
end of the ELF file itself was properly aligned, the space after the
Wasmtime metadata was not aligned. This metadata changes in size over
time as Cranelift settings change which explains why #4571 was the
"source" of the regression.

The fix in this commit is to discard the extra Wasmtime metadata when
creating an `MmapVec` representing the underlying ELF image. This is
already supported with `MmapVec::drain` so it was relatively easy to
insert that. This means that the gdbjit image starts with just the ELF
file itself which is always aligned at the end, which gets the test
passing with/without the `unaligned` feature in the `object` crate.
2022-08-10 09:58:34 -05:00
Andrew Brown
7fa89c4a4f [fuzz] Fix order of operands passed in to wasm-spec-interpreter (#4672)
In #4671, the meta-differential fuzz target was finding errors when
running certain Wasm modules (specifically `shr_s` in that case).
@conrad-watt diagnosed the issue as a missing reversal in the operands
passed to the spec interpreter. This change fixes #4671 and adds an
additional unit test to keep it fixed.
2022-08-10 09:55:33 -05:00
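Since shifts are not commutative, swapping operands is observable; a tiny Rust analogue (not the fuzzer's code) of the kind of mismatch the differential target reported:

```
fn main() {
    // i32.shr_s with operands in the intended order: -8 >> 1 == -4.
    let (lhs, rhs) = (-8i32, 1i32);
    assert_eq!(lhs.wrapping_shr(rhs as u32), -4);

    // With the operands reversed the result differs, which is how the
    // differential fuzzer noticed the missing reversal.
    assert_ne!(rhs.wrapping_shr(lhs as u32), -4);
}
```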
Trevor Elliott
63c2d1e0c3 x64: Remove unnecessary register use when comparing against constants (#4645)
https://github.com/bytecodealliance/wasmtime/pull/4645
2022-08-09 23:53:51 +00:00
Afonso Bordado
4d2a2cfae6 cranelift: Use cranelift-jit in runtests (#4453)
* cranelift: Use JIT in runtests

Using `cranelift-jit` in run tests allows us to perform relocations and
libcalls. This is important since some instruction lowerings fall back
to libcalls when an extension is missing, or when it's too complicated
to implement manually.

This is also a first step to being able to test `call`s between functions
in the runtest suite. It should also make it easier to eventually test
TLS relocations, symbol resolution and ABIs.

Another benefit of this is that we also get to test the JIT more, since
it now runs the runtests, and gets some fuzzing via `fuzzgen` (which
uses the `SingleFunctionCompiler`).

This change causes regressions in terms of runtime for the filetests.
I haven't done any serious benchmarking but what I've been seeing is
that it now takes roughly 3 seconds to run the testsuite while it
previously took around 2 seconds.

* Add FMA tests for X86
2022-08-09 14:54:25 -07:00
Afonso Bordado
97b2680f20 cranelift: Remove legalized_to_pointer from function generator (#4665) 2022-08-09 21:47:26 +00:00
Nick Fitzgerald
b17a734a57 Fix unused result that is #[must_use] (#4663)
Fixes this compiler warning:

```
warning: unused return value of `Box::<T>::from_raw` that must be used
   --> crates/bench-api/src/lib.rs:351:9
    |
351 |         Box::from_raw(state as *mut BenchState);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
```
2022-08-09 13:17:43 -07:00
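The usual fix for this warning (a sketch with a placeholder type, not the bench-api crate's exact code) is to drop the reconstituted `Box` explicitly so the intent to deallocate is visible:

```
struct BenchState; // placeholder for the crate's real state type

/// Reconstitute the `Box` from the raw pointer and drop it immediately,
/// freeing the allocation without tripping the `must_use` lint.
unsafe fn free_state(state: *mut BenchState) {
    drop(Box::from_raw(state));
}

fn main() {
    let raw = Box::into_raw(Box::new(BenchState));
    unsafe { free_state(raw) };
}
```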
Alex Crichton
bd70dbebbd Deduplicate some size/align calculations (#4658)
This commit is an effort to reduce the amount of complexity around
managing the size/alignment calculations of types in the canonical ABI.
Previously the logic for the size/alignment of a type was spread out
across a number of locations. While each individual calculation is not
really the most complicated thing in the world, having the duplication in
so many places was constantly worrying me.

I've opted in this commit to centralize all of this within the runtime
at least, and now there's only one "duplicate" of this information in
the fuzzing infrastructure which is to some degree less important to
deduplicate. This commit introduces a new `CanonicalAbiInfo` type to
house all abi size/align information for both memory32 and memory64.
This new type is then used pervasively throughout fused adapter
compilation, dynamic `Val` management, and typed functions. This type
was also able to reduce the complexity of the macro-generated code
meaning that even `wasmtime-component-macro` is performing less math
than it was before.

One other major feature of this commit is that this ABI information is
now saved within a `ComponentTypes` structure. This avoids frequently
re-querying size/align information recursively and instead effectively
caches it. This was a worry I had for the fused adapter compiler which
frequently sought out size/align information and would recursively
descend each type tree each time. The `fact-valid-module` fuzzer is now
nearly 10x faster in terms of iterations/s which I suspect is due to
this caching.
2022-08-09 14:52:20 -05:00
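A hedged sketch of the kind of data such a centralized type caches and how record layout composes under the canonical ABI (field and function names here are illustrative, not the crate's API):

```
/// Cached size/alignment for one type, for both 32-bit and 64-bit memories.
#[derive(Clone, Copy, Debug)]
struct CanonicalAbiInfo {
    size32: u32,
    align32: u32,
    size64: u32,
    align64: u32,
}

fn align_to(offset: u32, align: u32) -> u32 {
    (offset + align - 1) / align * align
}

/// Record layout in the canonical ABI: each field starts at the running
/// offset aligned to that field's alignment; the record's alignment is the
/// maximum field alignment and its size is aligned up to that.
fn record_abi32(fields: &[CanonicalAbiInfo]) -> CanonicalAbiInfo {
    let (mut offset, mut align) = (0u32, 1u32);
    for f in fields {
        offset = align_to(offset, f.align32) + f.size32;
        align = align.max(f.align32);
    }
    CanonicalAbiInfo {
        size32: align_to(offset, align),
        align32: align,
        size64: 0, // the 64-bit computation is analogous and omitted here
        align64: 0,
    }
}

fn main() {
    let u8_info = CanonicalAbiInfo { size32: 1, align32: 1, size64: 1, align64: 1 };
    let u32_info = CanonicalAbiInfo { size32: 4, align32: 4, size64: 4, align64: 4 };
    let record = record_abi32(&[u8_info, u32_info]);
    assert_eq!((record.size32, record.align32), (8, 4)); // padding after the u8
}
```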
Afonso Bordado
d5de91b953 cranelift: Fuzz cold blocks (#4654) 2022-08-09 19:43:08 +00:00
bjorn3
a4aa7258de Remove some dead code from the abi code (#4653)
These were originally used by the old backend framework as part of
legalizing function signatures for the respective ABI.
2022-08-09 12:21:55 -07:00
Trevor Elliott
6b6fc9ec3e ISLE: Fix a bug with extractor ordering (#4661)
https://github.com/bytecodealliance/wasmtime/pull/4661

Co-authored-by: Chris Fallin <chris@cfallin.org>
2022-08-09 19:19:32 +00:00
Chris Fallin
953f83e6ac Cranelift: disallow marking entry block 'cold'. (#4659)
This is a nonsensical constraint: the entry block must come first in the
compiled code's layout, so it cannot also be sunk to the end of the
function.

This PR modifies the CLIF verifier to disallow this situation entirely.
It also adds an assert during final block-order computation to catch the
problem (and avoid a silent miscompile) even if the verifier is
disabled.

Fixes #4656.
2022-08-09 11:52:30 -07:00
Alex Crichton
66025636fd Remove a layer of recursion in adapter compilation (#4657)
In #4640 a feature was added to adapter modules that whenever
translation goes through memory it instead goes through a helper
function as opposed to inlining it directly. The generation of the
helper function happened recursively at compile time, however, and sure
enough oss-fuzz has found an input which blows the host stack at compile
time.

This commit removes the compile-time recursion from the adapter compiler
when translating these helper functions by deferring the translation to
a worklist which is processed after the original function is translated.
This makes the stack-based recursion instead heap-based, removing the
stack overflow.
2022-08-09 12:59:53 -05:00
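A hedged sketch of the worklist pattern described above (types are placeholders, not the adapter compiler's real ones): discovered helpers are queued rather than generated recursively, so deep type nesting consumes heap instead of host stack.

```
use std::collections::VecDeque;

struct Helper; // placeholder for "a memory translation helper to generate"

struct AdapterCompiler {
    worklist: VecDeque<Helper>,
}

impl AdapterCompiler {
    fn compile_adapter(&mut self) {
        // While translating the adapter itself, helpers are only *recorded*...
        self.worklist.push_back(Helper);

        // ...and generated afterwards. Helpers may enqueue further helpers,
        // so the loop runs until the queue drains; no host-stack recursion.
        while let Some(helper) = self.worklist.pop_front() {
            self.compile_helper(helper);
        }
    }

    fn compile_helper(&mut self, _helper: Helper) {
        // May push more work onto `self.worklist` instead of recursing.
    }
}
```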
Chris Fallin
de8d44d0e5 Cranelift: MachBuffer: apply branch peephole opts one last time at buffer tail. (#4652)
The `MachBuffer` applies a set of peephole-optimization rules to do
branch threading, leverage fallthrough paths, eliminate empty blocks,
and flip conditional branches where needed to make branches more
efficient starting from naive always-branch-at-end-of-BB code.

This works by applying the rules at every label-bind, which is
equivalent to applying them at the end of every basic block, where
branches are usually inserted.

However, this misses one case: the end of the buffer! Currently we
don't optimize any redundant or foldable branches at the very end of
the machine code.

This usually doesn't matter when the function ends in an epilogue with
`ret` as the last instruction. However, when cold blocks exist, it can
actually matter.

Thanks to @mchesser for pointing out this issue in #4636.
2022-08-09 10:38:48 -07:00
Trevor Elliott
ed7dfd3925 x64: Peephole optimization for x < 0 (#4625)
https://github.com/bytecodealliance/wasmtime/pull/4625

Fixes #4607
2022-08-09 09:45:53 -07:00
Afonso Bordado
a36a52a017 cranelift: Print error message when basic blocks are invalid (#4591) 2022-08-09 09:28:41 -07:00
Afonso Bordado
dd6e790090 cranelift: Fuzz Argument Extensions in clif-fuzzer (#4589) 2022-08-09 09:03:38 -07:00
Alex Crichton
867f5c1244 Update behavior of zero-length lists/strings (#4648)
The spec was expected to change to not bounds-check 0-byte lists/strings
but has since been updated to match `memory.copy` which does indeed
check the pointer for 0-byte copies.
2022-08-09 09:26:33 -05:00
Michael Chesser
8aee85ebaa Propagate cold annotations to edge blocks (#4636)
Update the lowering stage to mark edge blocks as cold if either the
predecessor or successor block is cold.
2022-08-09 05:05:57 +00:00
Nick Fitzgerald
0b1f51f804 Remove unnecessary parens around expression (#4647)
Fixes a compiler warning.
2022-08-08 15:48:03 -07:00
Nick Fitzgerald
e81ad3c7eb cli-flags: Don't ignore the first flag in CommonOptions::parse_from_str (#4642) 2022-08-08 15:25:15 -07:00
Alex Crichton
c816a52746 Reuse locals in adapter trampolines (#4646)
This commit implements a scheme I've been meaning to work on in the
adapter compiler where instead of always generating a fresh local for
all operations locals may now be reused. Locals generated are explicitly
free'd when their lexical scope has ended, allowing reuse in translation
of later types in the adapter.

This also implements a new scheme for initializing locals where
previously a local could simply be generated, but now the local must be
fused with its initializer where a `local.{tee,set}` instruction is
always generated. This should help prevent a bug I ran into with strings
where a local's initialization was forgotten, which meant
that when it was used during a loop it may have had a stale value from
before.

Modeling this in Rust isn't possible at compile time unfortunately so I
opted for the next best thing, runtime panics. If a local is
accidentally not released back to the pool of free locals then it will
panic. The fuzzer for simply generating and validating adapter modules
should be good at exercising this, and it has already weeded out a few
forgotten frees.
2022-08-08 21:18:04 +00:00
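A simplified sketch of the runtime-panic scheme described above (not the adapter compiler's actual types): locals come from a free list, must be explicitly returned, and a forgotten free panics when the handle is dropped.

```
struct LocalPool {
    free: Vec<u32>,
    next_index: u32,
}

struct Local {
    index: u32,
    freed: bool,
}

impl LocalPool {
    /// Reuse a previously freed local if one exists, otherwise mint a new one.
    fn alloc(&mut self) -> Local {
        let index = match self.free.pop() {
            Some(i) => i,
            None => {
                let i = self.next_index;
                self.next_index += 1;
                i
            }
        };
        Local { index, freed: false }
    }

    /// Explicitly return a local once its lexical scope ends.
    fn free(&mut self, mut local: Local) {
        local.freed = true;
        self.free.push(local.index);
    }
}

impl Drop for Local {
    fn drop(&mut self) {
        // Catch forgotten frees at runtime, e.g. while fuzzing adapter generation.
        assert!(self.freed, "local {} was never returned to the pool", self.index);
    }
}
```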
Chris Fallin
863659e04f VCode emission: account for RA spill/reload/moves in worst-case block size. (#4644)
To determine whether we need to insert a "veneer island" of
branch-range extension veneers, we need to know ahead of emitting a
basic block the worst-case size of that block. This is because veneers
only go between blocks (we could plop one in the middle of a block but
that would require another jump around it and would probably pessimize
some code significantly), and we can't back up once we emit a block.

To compute this worst-case size, we take the number of instructions
and multiply by the largest possible size of one pseudoinst (e.g., on
aarch64, this is 44 bytes; it explicitly excludes the `EmitIsland`
pseudo-op which is used before large jumptable inline offset tables
are emitted). This is conservative, but it always works, and veneers
are somewhat rare in practice (needed only for function bodies over
roughly 1MiB on aarch64, for example).

Unfortunately this logic didn't account for the spill/reload/move
instructions inserted by the register allocator, and in one example in
issue #4629, a block had only one instruction but 482
edge-moves (!). This came at just the wrong time as we were
approaching the 1MiB limit on aarch64.

This PR fixes that issue, and fixes the logic to actually look at the
correct next block (next in `final_order` rather than numerically
next), as a bonus correctness fix.

Fixes #4629.
2022-08-08 13:57:18 -07:00
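A back-of-the-envelope illustration of the fix described above (constants are taken from the commit message; the helper is illustrative, not Cranelift's actual code):

```
/// Worst-case emitted size of a block: count the block's own instructions
/// *plus* the register allocator's inserted spills/reloads/moves, times the
/// largest possible encoding of a single pseudo-instruction.
fn worst_case_block_size(num_insts: usize, num_ra_edits: usize, max_inst_bytes: usize) -> usize {
    (num_insts + num_ra_edits) * max_inst_bytes
}

fn main() {
    let max_inst_bytes = 44; // largest aarch64 pseudo-instruction size cited above
    // The block from issue #4629: one instruction but 482 edge moves.
    let with_ra_edits = worst_case_block_size(1, 482, max_inst_bytes);
    let without = worst_case_block_size(1, 0, max_inst_bytes);
    // Ignoring RA edits underestimates the block by over 21,000 bytes, which
    // matters when deciding whether a veneer island must be emitted before
    // the branch range on aarch64 is exceeded.
    assert_eq!(with_ra_edits - without, 482 * 44);
}
```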
Nick Fitzgerald
ec47335b9c wasmtime: Add a Config::native_unwind_info method (#4643)
This method configures whether native unwind information (e.g. `.eh_frame` on
Linux) is generated or not.

This helps integrate with third-party stack capturing tools, such as the system
unwinder or the `backtrace` crate. It does not affect whether Wasmtime can
capture stack traces in Wasm code that it is running or not.

Unwind info is always enabled on Windows, since the Windows ABI requires it.

This configuration option defaults to true.

Additionally, we deprecate `Config::wasm_backtrace` since stack traces can
always be captured cheaply as of
https://github.com/bytecodealliance/wasmtime/pull/4431.

Fixes https://github.com/bytecodealliance/wasmtime/issues/4554
2022-08-08 13:54:51 -07:00
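A usage sketch of the new option (assuming the method follows the usual `Config` builder pattern of taking a `bool` and returning `&mut Self`):

```
use wasmtime::{Config, Engine};

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // Defaults to true; setting it to false skips generating native unwind
    // info such as `.eh_frame`, while Wasm backtraces keep working.
    config.native_unwind_info(true);
    let _engine = Engine::new(&config)?;
    Ok(())
}
```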
Damian Heaton
e463890f26 Port AvgRound & SqmulRoundSat to ISLE (AArch64) (#4639)
Ported the existing implementations of the following opcodes on AArch64
to ISLE:
- `AvgRound`
  - Also introduced support for `i64x2` vectors, as per the docs.
- `SqmulRoundSat`

Copyright (c) 2022 Arm Limited
2022-08-08 11:35:43 -07:00
Damian Heaton
47a67d752b Split Fmla and Bsl out into new VecRRRMod op (#4638)
Splits the following AArch64 opcodes out into a new `VecALUModOp` enum,
which is emitted via the `VecRRRMod` instruction. This separates vector ALU
instructions that modify a register in place from instructions that write to a new register:
- `Bsl`
- `Fmla`

Addresses [a discussion](https://github.com/bytecodealliance/wasmtime/pull/4608#discussion_r937975581) in #4608.

Copyright (c) 2022 Arm Limited
2022-08-08 11:33:13 -07:00
Alex Crichton
866ec46613 Implement roundtrip fuzzing of component adapters (#4640)
* Improve the `component_api` fuzzer on a few dimensions

* Update the generated component to use an adapter module. This involves
  two core wasm instances communicating with each other to test that
  data flows through everything correctly. The intention here is to fuzz
  the fused adapter compiler. String encoding options have been plumbed
  here to exercise differences in string encodings.

* Use `Cow<'static, ...>` and `static` declarations for each static test
  case to try to cut down on rustc codegen time.

* Add `Copy` to derivation of fuzzed enums to make `derive(Clone)`
  smaller.

* Use `Store<Box<dyn Any>>` to try to cut down on codegen by
  monomorphizing fewer `Store<T>` implementations.

* Add debug logging to print out what's flowing in and what's flowing
  out for debugging failures.

* Improve `Debug` representation of dynamic value types to more closely
  match their Rust counterparts.

* Fix a variant issue with adapter trampolines

Previously the offset of the payload was calculated as the discriminant
aligned up to the alignment of a singular case, but instead this needs
to be aligned up to the alignment of all cases to ensure all cases start
at the same location.

* Fix a copy/paste error when copying masked integers

A 32-bit load was actually doing a 16-bit load by accident since it was
copied from the 16-bit load-and-mask case.

* Fix f32/i64 conversions in adapter modules

The adapter previously erroneously converted the f32 to f64 and then to
i64, where instead it should go from f32 to i32 to i64 (see the sketch
after this entry).

* Fix zero-sized flags in adapter modules

This commit corrects the size calculation for zero-sized flags in
adapter modules.

cc #4592

* Fix a variant size calculation bug in adapters

This fixes the same issue found with variants during normal host-side
fuzzing earlier, where the size of a variant needs to be the sum of the
discriminant and the maximum case size, aligned up.

* Implement memory growth in libc bump realloc

Some fuzz-generated test cases are copying lists large enough to exceed
one page of memory so bake in a `memory.grow` to the bump allocator as
well.

* Avoid adapters of exponential size

This commit is an attempt to avoid adapters being exponentially sized
with respect to the type hierarchy of the input. Previously all
adaptation was done inline within each adapter which meant that if
something was structured as `tuple<T, T, T, T, ...>` the translation of
`T` would be inlined N times. For very deeply nested types this can
quickly create an exponentially sized adapter with types of the form:

    (type $t0 (list u8))
    (type $t1 (tuple $t0 $t0))
    (type $t2 (tuple $t1 $t1))
    (type $t3 (tuple $t2 $t2))
    ;; ...

where the translation of `t3` already contains 8 different copies of the
translation of `t0`.

This commit changes the translation of types through memory to almost
always go through a helper function. The hope here is that it doesn't
lose too much performance because types already reside in memory.

This can still lead to exponentially sized adapter modules to a lesser
degree: if the translation all happens on the "stack", e.g. via
`variant`s and their flat representation, then many copies of one
translation could still be made. For now this commit at least gets the
problem under control for fuzzing, so the fuzzer no longer trivially finds
type hierarchies that take over a minute to codegen the adapter module.

One of the main tricky parts of this implementation is that when a
function is generated the index that it will be placed at in the final
module is not known at that time. To solve this the encoded form of the
`Call` instruction is saved in a relocation-style format where the
`Call` isn't encoded but instead saved into a different area for
encoding later. When the entire adapter module is encoded to wasm these
pseudo-`Call` instructions are encoded as real instructions at that
time.

* Fix some memory64 issues with string encodings

Just before #4623 I introduced a few mistakes related to 64-bit
memories and mixing 32/64-bit memories.

* Actually insert into the `translate_mem_funcs` map

This... was the whole point of having the map!

* Assert memory growth succeeds in bump allocator
2022-08-08 18:01:45 +00:00
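On the f32/i64 fix in the list above: a small Rust illustration (not the adapter's code) of why the two conversion paths disagree, assuming the lowering reinterprets bits as in the canonical ABI's flat representation:

```
fn main() {
    let x: f32 = 1.5;

    // Intended path: reinterpret the f32 bits as a 32-bit integer, then
    // zero-extend to 64 bits.
    let correct = x.to_bits() as u64; // 0x3FC0_0000

    // Buggy path: promote to f64 first and take those bits.
    let buggy = (x as f64).to_bits(); // 0x3FF8_0000_0000_0000

    assert_ne!(correct, buggy);
}
```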
Alex Crichton
650979ae40 Implement strings in adapter modules (#4623)
* Implement strings in adapter modules

This commit is a hefty addition to Wasmtime's support for the component
model. This implements the final remaining type (in the current type
hierarchy) unimplemented in adapter module trampolines: strings. Strings
are the most complicated type to implement in adapter trampolines
because they are highly structured chunks of data in memory (according
to specific encodings). Additionally each lift/lower operation can
choose its own encoding for strings meaning that Wasmtime, the host, may
have to convert between any pairwise ordering of string encodings.

The `CanonicalABI.md` in the component-model repo in general specifies
all the fiddly bits of string encoding so there's not a ton of wiggle
room for Wasmtime to get creative. This PR largely "just" implements
that. The high-level architecture of this implementation is:

* Fused adapters are first identified to determine src/dst string
  encodings. This statically fixes what transcoding operation is being
  performed.

* The generated adapter will be responsible for managing calls to
  `realloc` and performing bounds checks. The adapter itself does not
  perform memory copies or validation of string contents, however.
  Instead each transcoding operation is modeled as an imported function
  into the adapter module.  This means that the adapter module
  dynamically, during compile time, determines what string transcoders
  are needed. Note that an imported transcoder is not only parameterized
  over the transcoding operation but additionally which memory is the
  source and which is the destination.

* The imported core wasm functions are modeled as a new
  `CoreDef::Transcoder` structure. These transcoders end up being small
  Cranelift-compiled trampolines. The Cranelift-compiled trampoline will
  load the actual base pointer of memory and add it to the relative
  pointers passed as function arguments. This trampoline then calls a
  transcoder "libcall" which enters Rust-defined functions for actual
  transcoding operations.

* Each possible transcoding operation is implemented in Rust with a
  unique name and a unique signature depending on the needs of the
  transcoder. I've tried to document inline what each transcoder does.

This means that the `Module::translate_string` in adapter modules is by
far the largest translation method. The main reason for this is due to
the management around calling the imported transcoder functions in the
face of validating string pointer/lengths and performing the dance of
`realloc`-vs-transcode at the right time. I've tried to ensure that each
individual case in transcoding is documented well enough to understand
what's going on as well.

Additionally in this PR is a full implementation in the host for the
`latin1+utf16` encoding which means that both lifting and lowering host
strings now works with this encoding.

Currently the implementation of each transcoder function is likely far
from optimal. Where possible I've leaned on the standard library itself
and for latin1-related things I'm leaning on the `encoding_rs` crate. I
initially tried to implement everything with `encoding_rs` but was
unable to uniformly do so easily. For now I settled on trying to get a
known-correct (even in the face of endianness) implementation for all of
these transcoders. If and when performance becomes an issue it should be
possible to implement more optimized versions of each of these
transcoding operations.

Testing this commit has been somewhat difficult and my general plan,
like with the `(list T)` type, is to rely heavily on fuzzing to cover
the various cases here. In this PR though I've added a simple test that
pushes some statically known strings through all the pairs of encodings
between source and destination. I've attempted to pick "interesting"
strings that one way or another stress the various paths in each
transcoding operation to ideally get full branch coverage there.
Additionally a suite of "negative" tests have also been added to ensure
that validity of encoding is actually checked.

* Fix a temporarily commented out case

* Fix wasmtime-runtime tests

* Update deny.toml configuration

* Add `BSD-3-Clause` for the `encoding_rs` crate
* Remove some unused licenses

* Add an exemption for `encoding_rs` for now

* Split up the `translate_string` method

Move out all the closures and package up captured state into smaller
lists of arguments.

* Test out-of-bounds for zero-length strings
2022-08-08 16:01:57 +00:00
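As a flavor of what one of these transcoders does, here is a deliberately simplified latin1-to-utf8 sketch (the real host transcoders also cover utf16, validation, and endianness, partly via `encoding_rs`): Latin-1 bytes map one-to-one onto the first 256 Unicode code points, so re-encoding each byte as a `char` suffices.

```
/// Simplified illustration only: transcode a Latin-1 byte slice to UTF-8.
fn latin1_to_utf8(src: &[u8]) -> String {
    // Every Latin-1 byte value is the same Unicode scalar value, so `u8 as char`
    // is lossless; `collect` re-encodes the chars as UTF-8.
    src.iter().map(|&b| b as char).collect()
}

fn main() {
    assert_eq!(latin1_to_utf8(&[0x68, 0x69, 0xE9]), "hi\u{e9}");
}
```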