Implement roundtrip fuzzing of component adapters (#4640)
* Improve the `component_api` fuzzer on a few dimensions

  * Update the generated component to use an adapter module. This involves
    two core wasm instances communicating with each other to test that data
    flows through everything correctly. The intention here is to fuzz the
    fused adapter compiler. String encoding options have been plumbed here
    to exercise differences in string encodings.
  * Use `Cow<'static, ...>` and `static` declarations for each static test
    case to try to cut down on rustc codegen time.
  * Add `Copy` to derivation of fuzzed enums to make `derive(Clone)` smaller.
  * Use `Store<Box<dyn Any>>` to try to cut down on codegen by
    monomorphizing fewer `Store<T>` implementations.
  * Add debug logging to print out what's flowing in and what's flowing out
    for debugging failures.
  * Improve `Debug` representation of dynamic value types to more closely
    match their Rust counterparts.

* Fix a variant issue with adapter trampolines

  Previously the offset of the payload was calculated as the discriminant
  aligned up to the alignment of a singular case, but instead this needs to
  be aligned up to the alignment of all cases to ensure all cases start at
  the same location.

* Fix a copy/paste error when copying masked integers

  A 32-bit load was actually doing a 16-bit load by accident since it was
  copied from the 16-bit load-and-mask case.

* Fix f32/i64 conversions in adapter modules

  The adapter previously erroneously converted the f32 to f64 and then to
  i64, where instead it should go from f32 to i32 to i64.

* Fix zero-sized flags in adapter modules

  This commit corrects the size calculation for zero-sized flags in adapter
  modules.

  cc #4592

* Fix a variant size calculation bug in adapters

  This fixes the same issue found with variants during normal host-side
  fuzzing earlier, where the size of a variant needs to align up the sum of
  the discriminant and the maximum case size.

* Implement memory growth in libc bump realloc

  Some fuzz-generated test cases are copying lists large enough to exceed
  one page of memory so bake in a `memory.grow` to the bump allocator as
  well.

* Avoid adapters of exponential size

  This commit is an attempt to avoid adapters being exponentially sized
  with respect to the type hierarchy of the input. Previously all
  adaptation was done inline within each adapter which meant that if
  something was structured as `tuple<T, T, T, T, ...>` the translation of
  `T` would be inlined N times. For very deeply nested types this can
  quickly create an exponentially sized adapter with types of the form:

      (type $t0 (list u8))
      (type $t1 (tuple $t0 $t0))
      (type $t2 (tuple $t1 $t1))
      (type $t3 (tuple $t2 $t2))
      ;; ...

  where the translation of `t3` has 8 different copies of translating `t0`.

  This commit changes the translation of types through memory to almost
  always go through a helper function. The hope here is that it doesn't
  lose too much performance because types already reside in memory.

  This can still lead to exponentially sized adapter modules to a lesser
  degree: if the translation all happens on the "stack", e.g. via
  `variant`s and their flat representation, then many copies of one
  translation could still be made. For now this commit at least gets the
  problem under control for fuzzing, where fuzzing doesn't trivially find
  type hierarchies that take over a minute to codegen the adapter module.

  One of the main tricky parts of this implementation is that when a
  function is generated the index that it will be placed at in the final
  module is not known at that time. To solve this the encoded form of the
  `Call` instruction is saved in a relocation-style format where the `Call`
  isn't encoded but instead saved into a different area for encoding later.
  When the entire adapter module is encoded to wasm these pseudo-`Call`
  instructions are encoded as real instructions at that time.

* Fix some memory64 issues with string encodings

  Introduced just before #4623 I had a few mistakes related to 64-bit
  memories and mixing 32/64-bit memories.

* Actually insert into the `translate_mem_funcs` map

  This... was the whole point of having the map!

* Assert memory growth succeeds in bump allocator
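
The two variant fixes in the message above come down to one layout rule: the payload starts at the discriminant size aligned up to the largest alignment among all cases, and the total size is that offset plus the largest case size, aligned up to the variant's own alignment. The following is a minimal sketch of that rule, not the adapter compiler's code; `Layout`, `align_to`, and `variant_layout` are hypothetical names, and it assumes the discriminant's alignment equals its size.

```rust
/// Size and alignment of one case payload, in bytes (hypothetical type).
#[derive(Clone, Copy)]
struct Layout {
    size: u32,
    align: u32, // assumed to be a power of two
}

/// Rounds `n` up to the next multiple of `align` (a power of two).
fn align_to(n: u32, align: u32) -> u32 {
    (n + align - 1) & !(align - 1)
}

/// Returns `(payload_offset, total_size, alignment)` for a variant whose
/// discriminant occupies `disc_size` bytes (1, 2, or 4).
fn variant_layout(disc_size: u32, cases: &[Layout]) -> (u32, u32, u32) {
    // Use the maximum over *all* cases, not the alignment of a single case,
    // so that every case's payload starts at the same offset.
    let max_case_align = cases.iter().map(|c| c.align).max().unwrap_or(1);
    let max_case_size = cases.iter().map(|c| c.size).max().unwrap_or(0);
    // Assumption: the discriminant is aligned to its own size.
    let align = disc_size.max(max_case_align);
    let payload_offset = align_to(disc_size, max_case_align);
    // The sum of offset and largest payload is aligned up to the variant's alignment.
    let size = align_to(payload_offset + max_case_size, align);
    (payload_offset, size, align)
}

fn main() {
    // One case holding a `u8`, one holding a `u64`, one-byte discriminant.
    let cases = [Layout { size: 1, align: 1 }, Layout { size: 8, align: 8 }];
    let (offset, size, align) = variant_layout(1, &cases);
    println!("payload offset {offset}, size {size}, align {align}");
}
```

Aligning the payload offset to a single case's alignment would place the `u8` case at offset 1 and the `u64` case at offset 8, which is the mismatch the trampoline fix describes.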
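
The relocation-style handling of `Call` described in the commit message can be illustrated with a small standalone sketch. It is not the adapter compiler's implementation: `FuncId`, `Chunk`, and `finish` are hypothetical names, and the bytes are hand-encoded using the core wasm binary format (`0x10` is the `call` opcode, followed by an unsigned LEB128 function index).

```rust
/// Hypothetical handle for a helper function whose final index is not yet known.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FuncId(usize);

/// A function body kept as already-encoded bytes interleaved with deferred calls.
enum Chunk {
    Bytes(Vec<u8>),
    Call(FuncId),
}

/// Appends `n` to `out` as an unsigned LEB128 integer.
fn leb128_u32(mut n: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (n & 0x7f) as u8;
        n >>= 7;
        if n == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

/// Encodes the body once every helper's final function index is known.
/// `final_index[id.0]` is the index assigned to helper `id`.
fn finish(body: &[Chunk], final_index: &[u32]) -> Vec<u8> {
    let mut out = Vec::new();
    for chunk in body {
        match chunk {
            Chunk::Bytes(bytes) => out.extend_from_slice(bytes),
            Chunk::Call(id) => {
                out.push(0x10); // `call` opcode
                leb128_u32(final_index[id.0], &mut out);
            }
        }
    }
    out
}

fn main() {
    // local.get 0; call <helper 1>; end
    let body = vec![
        Chunk::Bytes(vec![0x20, 0x00]),
        Chunk::Call(FuncId(1)),
        Chunk::Bytes(vec![0x0b]),
    ];
    // In this example helper 1 ends up at function index 200.
    let bytes = finish(&body, &[7, 200]);
    println!("{bytes:02x?}");
}
```

Deferring the encoding matters because the LEB128 index has a variable width, so a placeholder of fixed size could not simply be overwritten in place.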
@@ -8,7 +8,8 @@

 use arbitrary::{Arbitrary, Unstructured};
 use proc_macro2::{Ident, TokenStream};
-use quote::{format_ident, quote};
+use quote::{format_ident, quote, ToTokens};
+use std::borrow::Cow;
 use std::fmt::{self, Debug, Write};
 use std::iter;
 use std::ops::Deref;
@@ -328,7 +329,7 @@ fn variant_size_and_alignment<'a>(
     }
 }

-fn make_import_and_export(params: &[Type], result: &Type) -> Box<str> {
+fn make_import_and_export(params: &[Type], result: &Type) -> String {
     let params_lowered = params
         .iter()
         .flat_map(|ty| ty.lowered())
@@ -400,7 +401,6 @@ fn make_import_and_export(params: &[Type], result: &Type) -> Box<str> {
             )"#
         )
     }
-    .into()
 }

 fn make_rust_name(name_counter: &mut u32) -> Ident {
@@ -509,7 +509,7 @@ pub fn rust_type(ty: &Type, name_counter: &mut u32, declarations: &mut TokenStre
             let name = make_rust_name(name_counter);

             declarations.extend(quote! {
-                #[derive(ComponentType, Lift, Lower, PartialEq, Debug, Clone, Arbitrary)]
+                #[derive(ComponentType, Lift, Lower, PartialEq, Debug, Copy, Clone, Arbitrary)]
                 #[component(enum)]
                 enum #name {
                     #cases
@@ -677,13 +677,17 @@ fn write_component_type(
 #[derive(Debug)]
 pub struct Declarations {
     /// Type declarations (if any) referenced by `params` and/or `result`
-    pub types: Box<str>,
+    pub types: Cow<'static, str>,
     /// Parameter declarations used for the imported and exported functions
-    pub params: Box<str>,
+    pub params: Cow<'static, str>,
     /// Result declaration used for the imported and exported functions
-    pub result: Box<str>,
+    pub result: Cow<'static, str>,
     /// A WAT fragment representing the core function import and export to use for testing
-    pub import_and_export: Box<str>,
+    pub import_and_export: Cow<'static, str>,
+    /// String encoding to use for host -> component
+    pub encoding1: StringEncoding,
+    /// String encoding to use for component -> host
+    pub encoding2: StringEncoding,
 }

 impl Declarations {
@@ -694,7 +698,44 @@ impl Declarations {
             params,
             result,
             import_and_export,
+            encoding1,
+            encoding2,
         } = self;
+        let mk_component = |name: &str, encoding: StringEncoding| {
+            format!(
+                r#"
+                (component ${name}
+                    (import "echo" (func $f (type $sig)))
+
+                    (core instance $libc (instantiate $libc))
+
+                    (core func $f_lower (canon lower
+                        (func $f)
+                        (memory $libc "memory")
+                        (realloc (func $libc "realloc"))
+                        string-encoding={encoding}
+                    ))
+
+                    (core instance $i (instantiate $m
+                        (with "libc" (instance $libc))
+                        (with "host" (instance (export "{IMPORT_FUNCTION}" (func $f_lower))))
+                    ))
+
+                    (func (export "echo") (type $sig)
+                        (canon lift
+                            (core func $i "echo")
+                            (memory $libc "memory")
+                            (realloc (func $libc "realloc"))
+                            string-encoding={encoding}
+                        )
+                    )
+                )
+            "#
+            )
+        };
+
+        let c1 = mk_component("c1", *encoding2);
+        let c2 = mk_component("c2", *encoding1);

         format!(
             r#"
@@ -704,18 +745,6 @@ impl Declarations {
                     {REALLOC_AND_FREE}
                 )

-                (core instance $libc (instantiate $libc))
-
-                {types}
-
-                (import "{IMPORT_FUNCTION}" (func $f {params} {result}))
-
-                (core func $f_lower (canon lower
-                    (func $f)
-                    (memory $libc "memory")
-                    (realloc (func $libc "realloc"))
-                ))
-
                 (core module $m
                     (memory (import "libc" "memory") 1)
                     (func $realloc (import "libc" "realloc") (param i32 i32 i32 i32) (result i32))
@@ -723,18 +752,16 @@ impl Declarations {
                     {import_and_export}
                 )

-                (core instance $i (instantiate $m
-                    (with "libc" (instance $libc))
-                    (with "host" (instance (export "{IMPORT_FUNCTION}" (func $f_lower))))
-                ))
+                {types}

-                (func (export "echo") {params} {result}
-                    (canon lift
-                        (core func $i "echo")
-                        (memory $libc "memory")
-                        (realloc (func $libc "realloc"))
-                    )
-                )
+                (type $sig (func {params} {result}))
+                (import "{IMPORT_FUNCTION}" (func $f (type $sig)))
+
+                {c1}
+                {c2}
+                (instance $c1 (instantiate $c1 (with "echo" (func $f))))
+                (instance $c2 (instantiate $c2 (with "echo" (func $c1 "echo"))))
+                (export "echo" (func $c2 "echo"))
             )"#,
         )
         .into()
@@ -748,6 +775,10 @@ pub struct TestCase {
     pub params: Box<[Type]>,
     /// The type of the result to be returned by the function
     pub result: Type,
+    /// String encoding to use from host-to-component.
+    pub encoding1: StringEncoding,
+    /// String encoding to use from component-to-host.
+    pub encoding2: StringEncoding,
 }

 impl TestCase {
@@ -781,7 +812,9 @@ impl TestCase {
             types: types.into(),
             params,
             result,
-            import_and_export,
+            import_and_export: import_and_export.into(),
+            encoding1: self.encoding1,
+            encoding2: self.encoding2,
         }
     }
 }
@@ -795,6 +828,36 @@ impl<'a> Arbitrary<'a> for TestCase {
                 .take(MAX_ARITY)
                 .collect::<arbitrary::Result<Box<[_]>>>()?,
             result: input.arbitrary()?,
+            encoding1: input.arbitrary()?,
+            encoding2: input.arbitrary()?,
         })
     }
 }
+
+#[derive(Copy, Clone, Debug, Arbitrary)]
+pub enum StringEncoding {
+    Utf8,
+    Utf16,
+    Latin1OrUtf16,
+}
+
+impl fmt::Display for StringEncoding {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        match self {
+            StringEncoding::Utf8 => fmt::Display::fmt(&"utf8", f),
+            StringEncoding::Utf16 => fmt::Display::fmt(&"utf16", f),
+            StringEncoding::Latin1OrUtf16 => fmt::Display::fmt(&"latin1+utf16", f),
+        }
+    }
+}
+
+impl ToTokens for StringEncoding {
+    fn to_tokens(&self, tokens: &mut TokenStream) {
+        let me = match self {
+            StringEncoding::Utf8 => quote!(Utf8),
+            StringEncoding::Utf16 => quote!(Utf16),
+            StringEncoding::Latin1OrUtf16 => quote!(Latin1OrUtf16),
+        };
+        tokens.extend(quote!(component_fuzz_util::StringEncoding::#me));
+    }
+}
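
To show how the `Display` impl in the diff feeds the `string-encoding={encoding}` option of the generated WAT, here is a standalone sketch. The enum is re-declared locally without the `Arbitrary` and `ToTokens` pieces so it compiles with the standard library alone. `canon_lower_fragment` is a hypothetical helper that reproduces only the `canon lower` fragment, and `f.write_str` is a simplification of the `fmt::Display::fmt(&"...", f)` calls in the diff (it ignores padding flags).

```rust
use std::fmt;

/// Local copy of the enum from the diff, minus the derive macros that
/// need external crates.
#[derive(Copy, Clone, Debug)]
enum StringEncoding {
    Utf8,
    Utf16,
    Latin1OrUtf16,
}

impl fmt::Display for StringEncoding {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Simplified: writes the option name directly.
        f.write_str(match self {
            StringEncoding::Utf8 => "utf8",
            StringEncoding::Utf16 => "utf16",
            StringEncoding::Latin1OrUtf16 => "latin1+utf16",
        })
    }
}

/// Hypothetical helper: builds just the `canon lower` fragment of the
/// component text for a given encoding.
fn canon_lower_fragment(encoding: StringEncoding) -> String {
    format!(
        r#"(core func $f_lower (canon lower
    (func $f)
    (memory $libc "memory")
    (realloc (func $libc "realloc"))
    string-encoding={encoding}
))"#
    )
}

fn main() {
    for encoding in [
        StringEncoding::Utf8,
        StringEncoding::Utf16,
        StringEncoding::Latin1OrUtf16,
    ] {
        println!("{}\n", canon_lower_fragment(encoding));
    }
}
```

In the diff, `c1` is built with `encoding2` and `c2` with `encoding1`, so when the two differ the adapter between the components has to transcode strings as they cross.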