7c67378ab6c3348743f313b7aa78cc17f20d1640
7 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
bc8e36a6af |
Refactor and optimize the flat type calculations (#4708)
* Optimize flat type representation calculations Previously calculating the flat type representation would be done recursively for an entire type tree every time it was visited. Additionally the flat type representation was entirely built only to be thrown away if it was too large at the end. This chiefly presented a source of recursion based on the type structure in the component model which fuzzing does not like as it reports stack overflows. This commit overhauls the representation of flat types in Wasmtime by caching the representation for each type in the compile-time `ComponentTypesBuilder` structure. This avoids recalculating each time the flat representation is queried and additionally allows opportunity to have more short-circuiting to avoid building overly-large vectors. * Remove duplicate flat count calculation in wasmtime Roughly share the infrastructure in the `wasmtime-environ` crate, namely the non-recursive and memoizing nature of the calculation. * Fix component fuzz build * Fix example compile |
||
|
|
bd70dbebbd |
Deduplicate some size/align calculations (#4658)
This commit is an effort to reduce the amount of complexity around managing the size/alignment calculations of types in the canonical ABI. Previously the logic for the size/alignment of a type was spread out across a number of locations. While each individual calculation is not really the most complicated thing in the world having the duplication in so many places was constantly worrying me. I've opted in this commit to centralize all of this within the runtime at least, and now there's only one "duplicate" of this information in the fuzzing infrastructure which is to some degree less important to deduplicate. This commit introduces a new `CanonicalAbiInfo` type to house all abi size/align information for both memory32 and memory64. This new type is then used pervasively throughout fused adapter compilation, dynamic `Val` management, and typed functions. This type was also able to reduce the complexity of the macro-generated code meaning that even `wasmtime-component-macro` is performing less math than it was before. One other major feature of this commit is that this ABI information is now saved within a `ComponentTypes` structure. This avoids recursive querying of size/align information frequently and instead effectively caching it. This was a worry I had for the fused adapter compiler which frequently sought out size/align information and would recursively descend each type tree each time. The `fact-valid-module` fuzzer is now nearly 10x faster in terms of iterations/s which I suspect is due to this caching. |
||
|
|
866ec46613 |
Implement roundtrip fuzzing of component adapters (#4640)
* Improve the `component_api` fuzzer on a few dimensions * Update the generated component to use an adapter module. This involves two core wasm instances communicating with each other to test that data flows through everything correctly. The intention here is to fuzz the fused adapter compiler. String encoding options have been plumbed here to exercise differences in string encodings. * Use `Cow<'static, ...>` and `static` declarations for each static test case to try to cut down on rustc codegen time. * Add `Copy` to derivation of fuzzed enums to make `derive(Clone)` smaller. * Use `Store<Box<dyn Any>>` to try to cut down on codegen by monomorphizing fewer `Store<T>` implementation. * Add debug logging to print out what's flowing in and what's flowing out for debugging failures. * Improve `Debug` representation of dynamic value types to more closely match their Rust counterparts. * Fix a variant issue with adapter trampolines Previously the offset of the payload was calculated as the discriminant aligned up to the alignment of a singular case, but instead this needs to be aligned up to the alignment of all cases to ensure all cases start at the same location. * Fix a copy/paste error when copying masked integers A 32-bit load was actually doing a 16-bit load by accident since it was copied from the 16-bit load-and-mask case. * Fix f32/i64 conversions in adapter modules The adapter previously erroneously converted the f32 to f64 and then to i64, where instead it should go from f32 to i32 to i64. * Fix zero-sized flags in adapter modules This commit corrects the size calculation for zero-sized flags in adapter modules. cc #4592 * Fix a variant size calculation bug in adapters This fixes the same issue found with variants during normal host-side fuzzing earlier where the size of a variant needs to align up the summation of the discriminant and the maximum case size. * Implement memory growth in libc bump realloc Some fuzz-generated test cases are copying lists large enough to exceed one page of memory so bake in a `memory.grow` to the bump allocator as well. * Avoid adapters of exponential size This commit is an attempt to avoid adapters being exponentially sized with respect to the type hierarchy of the input. Previously all adaptation was done inline within each adapter which meant that if something was structured as `tuple<T, T, T, T, ...>` the translation of `T` would be inlined N times. For very deeply nested types this can quickly create an exponentially sized adapter with types of the form: (type $t0 (list u8)) (type $t1 (tuple $t0 $t0)) (type $t2 (tuple $t1 $t1)) (type $t3 (tuple $t2 $t2)) ;; ... where the translation of `t4` has 8 different copies of translating `t0`. This commit changes the translation of types through memory to almost always go through a helper function. The hope here is that it doesn't lose too much performance because types already reside in memory. This can still lead to exponentially sized adapter modules to a lesser degree where if the translation all happens on the "stack", e.g. via `variant`s and their flat representation then many copies of one translation could still be made. For now this commit at least gets the problem under control for fuzzing where fuzzing doesn't trivially find type hierarchies that take over a minute to codegen the adapter module. One of the main tricky parts of this implementation is that when a function is generated the index that it will be placed at in the final module is not known at that time. To solve this the encoded form of the `Call` instruction is saved in a relocation-style format where the `Call` isn't encoded but instead saved into a different area for encoding later. When the entire adapter module is encoded to wasm these pseudo-`Call` instructions are encoded as real instructions at that time. * Fix some memory64 issues with string encodings Introduced just before #4623 I had a few mistakes related to 64-bit memories and mixing 32/64-bit memories. * Actually insert into the `translate_mem_funcs` map This... was the whole point of having the map! * Assert memory growth succeeds in bump allocator |
||
|
|
650979ae40 |
Implement strings in adapter modules (#4623)
* Implement strings in adapter modules This commit is a hefty addition to Wasmtime's support for the component model. This implements the final remaining type (in the current type hierarchy) unimplemented in adapter module trampolines: strings. Strings are the most complicated type to implement in adapter trampolines because they are highly structured chunks of data in memory (according to specific encodings). Additionally each lift/lower operation can choose its own encoding for strings meaning that Wasmtime, the host, may have to convert between any pairwise ordering of string encodings. The `CanonicalABI.md` in the component-model repo in general specifies all the fiddly bits of string encoding so there's not a ton of wiggle room for Wasmtime to get creative. This PR largely "just" implements that. The high-level architecture of this implementation is: * Fused adapters are first identified to determine src/dst string encodings. This statically fixes what transcoding operation is being performed. * The generated adapter will be responsible for managing calls to `realloc` and performing bounds checks. The adapter itself does not perform memory copies or validation of string contents, however. Instead each transcoding operation is modeled as an imported function into the adapter module. This means that the adapter module dynamically, during compile time, determines what string transcoders are needed. Note that an imported transcoder is not only parameterized over the transcoding operation but additionally which memory is the source and which is the destination. * The imported core wasm functions are modeled as a new `CoreDef::Transcoder` structure. These transcoders end up being small Cranelift-compiled trampolines. The Cranelift-compiled trampoline will load the actual base pointer of memory and add it to the relative pointers passed as function arguments. This trampoline then calls a transcoder "libcall" which enters Rust-defined functions for actual transcoding operations. * Each possible transcoding operation is implemented in Rust with a unique name and a unique signature depending on the needs of the transcoder. I've tried to document inline what each transcoder does. This means that the `Module::translate_string` in adapter modules is by far the largest translation method. The main reason for this is due to the management around calling the imported transcoder functions in the face of validating string pointer/lengths and performing the dance of `realloc`-vs-transcode at the right time. I've tried to ensure that each individual case in transcoding is documented well enough to understand what's going on as well. Additionally in this PR is a full implementation in the host for the `latin1+utf16` encoding which means that both lifting and lowering host strings now works with this encoding. Currently the implementation of each transcoder function is likely far from optimal. Where possible I've leaned on the standard library itself and for latin1-related things I'm leaning on the `encoding_rs` crate. I initially tried to implement everything with `encoding_rs` but was unable to uniformly do so easily. For now I settled on trying to get a known-correct (even in the face of endianness) implementation for all of these transcoders. If an when performance becomes an issue it should be possible to implement more optimized versions of each of these transcoding operations. Testing this commit has been somewhat difficult and my general plan, like with the `(list T)` type, is to rely heavily on fuzzing to cover the various cases here. In this PR though I've added a simple test that pushes some statically known strings through all the pairs of encodings between source and destination. I've attempted to pick "interesting" strings that one way or another stress the various paths in each transcoding operation to ideally get full branch coverage there. Additionally a suite of "negative" tests have also been added to ensure that validity of encoding is actually checked. * Fix a temporarily commented out case * Fix wasmtime-runtime tests * Update deny.toml configuration * Add `BSD-3-Clause` for the `encoding_rs` crate * Remove some unused licenses * Add an exemption for `encoding_rs` for now * Split up the `translate_string` method Move out all the closures and package up captured state into smaller lists of arguments. * Test out-of-bounds for zero-length strings |
||
|
|
ed8908efcf |
implement fuzzing for component types (#4537)
This addresses #4307. For the static API we generate 100 arbitrary test cases at build time, each of which includes 0-5 parameter types, a result type, and a WAT fragment containing an imported function and an exported function. The exported function calls the imported function, which is implemented by the host. At runtime, the fuzz test selects a test case at random and feeds it zero or more sets of arbitrary parameters and results, checking that values which flow host-to-guest and guest-to-host make the transition unchanged. The fuzz test for the dynamic API follows a similar pattern, the only difference being that test cases are generated at runtime. Signed-off-by: Joel Dice <joel.dice@fermyon.com> |
||
|
|
893fadb485 |
components: Fix support for 0-sized flags (#4560)
This commit goes through and updates support in the various argument passing routines to support 0-sized flags. A bit of a degenerate case but clarified in WebAssembly/component-model#76 as intentional. |
||
|
|
7c67e620c4 |
support dynamic function calls in component model (#4442)
* support dynamic function calls in component model This addresses #4310, introducing a new `component::values::Val` type for representing component values dynamically, as well as `component::types::Type` for representing the corresponding interface types. It also adds a `call` method to `component::func::Func`, which takes a slice of `Val`s as parameters and returns a `Result<Val>` representing the result. Note that I've moved `post_return` and `call_raw` from `TypedFunc` to `Func` since there was nothing specific to `TypedFunc` about them, and I wanted to reuse them. The code in both is unchanged beyond the trivial tweaks to make them fit in their new home. Signed-off-by: Joel Dice <joel.dice@fermyon.com> * order variants and match cases more consistently Signed-off-by: Joel Dice <joel.dice@fermyon.com> * implement lift for String, Box<str>, etc. This also removes the redundant `store` parameter from `Type::load`. Signed-off-by: Joel Dice <joel.dice@fermyon.com> * implement code review feedback This fixes a few issues: - Bad offset calculation when lowering - Missing variant padding - Style issues regarding `types::Handle` - Missed opportunities to reuse `Lift` and `Lower` impls It also adds forwarding `Lift` impls for `Box<[T]>`, `Vec<T>`, etc. Signed-off-by: Joel Dice <joel.dice@fermyon.com> * move `new_*` methods to specific `types` structs Per review feedback, I've moved `Type::new_record` to `Record::new_val` and added a `Type::unwrap_record` method; likewise for the other kinds of types. Signed-off-by: Joel Dice <joel.dice@fermyon.com> * make tuple, option, and expected type comparisons recursive These types should compare as equal across component boundaries as long as their type parameters are equal. Signed-off-by: Joel Dice <joel.dice@fermyon.com> * improve error diagnostic in `Type::check` We now distinguish between more failure cases to provide an informative error message. Signed-off-by: Joel Dice <joel.dice@fermyon.com> * address review feedback - Remove `WasmStr::to_str_from_memory` and `WasmList::get_from_memory` - add `try_new` methods to various `values` types - avoid using `ExactSizeIterator::len` where we can't trust it - fix over-constrained bounds on forwarded `ComponentType` impls Signed-off-by: Joel Dice <joel.dice@fermyon.com> * rearrange code per review feedback - Move functions from `types` to `values` module so we can make certain struct fields private - Rename `try_new` to just `new` Signed-off-by: Joel Dice <joel.dice@fermyon.com> * remove special-case equality test for tuples, options, and expecteds Instead, I've added a FIXME comment and will open an issue to do recursive structural equality testing. Signed-off-by: Joel Dice <joel.dice@fermyon.com> |