[fuzz] Add a meta-differential fuzz target (#4515)

* [fuzz] Add `Module` enum, refactor `ModuleConfig`

This change adds a way to create either a single-instruction module or a regular (big) `wasm-smith` module. It also includes some slight refactoring in preparation for the use of this new code.

* [fuzz] Add `DiffValue` for differential evaluation

In order to evaluate functions with randomly-generated values, we needed a common way to generate these values. Using the Wasmtime `Val` type is not great because we would like to be able to implement various traits on the new value type, e.g., to convert `Into` and `From` boxed values of other engines we differentially fuzz against. This new type, `DiffValue`, gives us a common ground for all the conversions and comparisons between the other engine types.

* [fuzz] Add interface for differential engines

In order to randomly choose an engine to fuzz against, we expect all of the engines to meet a common interface. The traits in this commit allow us to instantiate a module from its binary form, evaluate exported functions, and (possibly) hash the exported items of the instance.

This change has some missing pieces, though:
 - the `wasm-spec-interpreter` needs some work to be able to create instances, evaluate a function by name, and expose exported items
 - the `v8` engine is not implemented yet due to the complexity of its Rust lifetimes

* [fuzz] Use `ModuleFeatures` instead of existing configuration

When attempting to use both wasm-smith and single-instruction modules, there is a mismatch in how we communicate what an engine must be able to support. In the first case, we could use the `ModuleConfig`, a wrapper for wasm-smith's `SwarmConfig`, but single-instruction modules do not have a `SwarmConfig`; the many options simply don't apply. Here, we instead add `ModuleFeatures` and adapt a `ModuleConfig` to that. `ModuleFeatures` then becomes the way to communicate what features an engine must support to evaluate functions in a module.

* [fuzz] Add a new fuzz target using the meta-differential oracle

This change adds the `differential_meta` target to the list of fuzz targets. I expect that sometime soon this could replace the other `differential*` targets, as it checks almost everything those targets check. The major missing piece is that currently it only chooses single-instruction modules instead of also generating arbitrary modules using `wasm-smith`.

Also, this change adds the concept of an ignorable error: some differential engines will choke on certain inputs (e.g., `wasmi` might have an old opcode mapping), which we do not want to flag as fuzz bugs. Here we wrap those errors in `DiffIgnoreError` and then use a new helper trait, `DiffIgnorable`, to downcast and inspect the `anyhow` error so that we only panic on non-ignorable errors; the ignorable errors are converted to one of the `arbitrary::Error` variants, which we already ignore.

* [fuzz] Compare `DiffValue` NaNs more leniently

Because arithmetic NaNs can contain arbitrary payload bits, the check that two differential executions produce the same result should relax the comparison of the `F32` and `F64` types (and eventually `V128` as well... TODO). This change notes several considerations, however, so that in the future we can make the comparison a bit stricter, e.g., re: canonical NaNs. This change, however, just matches the current logic used by other fuzz targets.

* review: allow hashing to mutate the instance state

@alexcrichton requested that the interface be adapted to accommodate Wasmtime's API, in which even reading from an instance could trigger mutation of the store.

* review: refactor where configurations are made compatible

See @alexcrichton's [suggestion](https://github.com/bytecodealliance/wasmtime/pull/4515#discussion_r928974376).

* review: convert `DiffValueType` using `TryFrom`

See @alexcrichton's [comment](https://github.com/bytecodealliance/wasmtime/pull/4515#discussion_r928962394).

* review: adapt target implementation to Wasmtime-specific RHS

This change is joint work with @alexcrichton to adapt the structure of the fuzz target to his comments [here](https://github.com/bytecodealliance/wasmtime/pull/4515#pullrequestreview-1073247791).

This change:
 - removes `ModuleFeatures` and the `Module` enum (for big and small modules)
 - upgrades `SingleInstModule` to filter out cases that are not valid for a given `ModuleConfig`
 - adds `DiffEngine::name()`
 - constructs each `DiffEngine` using a `ModuleConfig`, eliminating `DiffIgnoreError` completely
 - adds printing of an execution rate to the `differential_meta` target

Still TODO:
 - `get_exported_function_signatures` could be re-written in terms of the Wasmtime API instead of `wasmparser`
 - the fuzzer eventually crashes, we think due to signal handler interference between OCaml and Wasmtime
 - the spec interpreter has several cases that we skip for now but could be fuzzed with further work

Co-authored-by: Alex Crichton <alex@alexcrichton.com>

* fix: avoid SIGSEGV by explicitly initializing OCaml runtime first

* review: use Wasmtime's API to retrieve exported functions

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-08-18 17:22:58 -07:00
parent 8b6019909b
commit 5ec92d59d2
14 changed files with 1046 additions and 53 deletions
--- a/crates/fuzzing/src/generators/config.rs
+++ b/crates/fuzzing/src/generators/config.rs
@@ -92,6 +92,8 @@ impl Config {
            limits.tables = 1;
            limits.table_elements = 1_000;

+            limits.size = 1_000_000;
+
            match &mut self.wasmtime.memory_config {
                MemoryConfig::Normal(config) => {
                    config.static_memory_maximum_size = Some(limits.memory_pages * 0x10000);
@@ -101,6 +103,34 @@ impl Config {
        }
    }

+    /// Force `self` to be a configuration compatible with `other`. This is
+    /// useful for differential execution to avoid unhelpful fuzz crashes when
+    /// one engine has a feature enabled and the other does not.
+    pub fn make_compatible_with(&mut self, other: &Self) {
+        // Use the same `wasm-smith` configuration as `other` because this is
+        // used for determining what Wasm features are enabled in the engine
+        // (see `to_wasmtime`).
+        self.module_config = other.module_config.clone();
+
+        // Use the same allocation strategy between the two configs.
+        //
+        // Ideally this wouldn't be necessary, but, during differential
+        // evaluation, if the `lhs` is using ondemand and the `rhs` is using the
+        // pooling allocator (or vice versa), then the module may have been
+        // generated in such a way that is incompatible with the other
+        // allocation strategy.
+        //
+        // We can remove this in the future when it's possible to access the
+        // fields of `wasm_smith::Module` to constrain the pooling allocator
+        // based on what was actually generated.
+        self.wasmtime.strategy = other.wasmtime.strategy.clone();
+        if let InstanceAllocationStrategy::Pooling { .. } = &other.wasmtime.strategy {
+            // Also use the same memory configuration when using the pooling
+            // allocator.
+            self.wasmtime.memory_config = other.wasmtime.memory_config.clone();
+        }
+    }
+
    /// Uses this configuration and the supplied source of data to generate
    /// a wasm module.
    ///
@@ -112,13 +142,7 @@ impl Config {
        input: &mut Unstructured<'_>,
        default_fuel: Option<u32>,
    ) -> arbitrary::Result<wasm_smith::Module> {
-        let mut module = wasm_smith::Module::new(self.module_config.config.clone(), input)?;
-
-        if let Some(default_fuel) = default_fuel {
-            module.ensure_termination(default_fuel);
-        }
-
-        Ok(module)
+        self.module_config.generate(input, default_fuel)
    }

    /// Indicates that this configuration should be spec-test-compliant,
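
The effect of `make_compatible_with` in the hunk above can be illustrated with a minimal, standalone sketch. The `Config` and `Strategy` types below are simplified, hypothetical stand-ins (they are not the fuzzing crate's types): `features` plays the role of `module_config` and `memory_pages` the role of `memory_config`. The memory setting is copied only when `other` uses the pooling strategy, mirroring the `if let` in the diff.

```rust
/// Hypothetical stand-in for `InstanceAllocationStrategy`.
#[derive(Clone, Debug, PartialEq)]
enum Strategy {
    OnDemand,
    Pooling { instance_count: u32 },
}

/// Hypothetical, heavily simplified stand-in for the fuzzing `Config`.
#[derive(Clone, Debug, PartialEq)]
struct Config {
    /// Stands in for `module_config` (the enabled Wasm features).
    features: Vec<&'static str>,
    /// Stands in for `wasmtime.strategy`.
    strategy: Strategy,
    /// Stands in for `wasmtime.memory_config`.
    memory_pages: u32,
}

impl Config {
    /// Force `self` to agree with `other` on the settings that decide
    /// whether a generated module can be instantiated by both sides.
    fn make_compatible_with(&mut self, other: &Self) {
        // Always share the module configuration and allocation strategy.
        self.features = other.features.clone();
        self.strategy = other.strategy.clone();
        // Only share the memory configuration when `other` is pooling.
        if let Strategy::Pooling { .. } = &other.strategy {
            self.memory_pages = other.memory_pages;
        }
    }
}
```

With an on-demand `other`, the memory setting of `self` is left alone; with a pooling `other`, all three fields end up identical.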
--- a/crates/fuzzing/src/generators/module_config.rs
+++ b/crates/fuzzing/src/generators/module_config.rs
@@ -1,4 +1,4 @@
-//! Generate a configuration for generating a Wasm module.
+//! Generate a Wasm module and the configuration for generating it.

 use arbitrary::{Arbitrary, Unstructured};
 use wasm_smith::SwarmConfig;
@@ -36,3 +36,26 @@ impl<'a> Arbitrary<'a> for ModuleConfig {
        Ok(ModuleConfig { config })
    }
 }
+
+impl ModuleConfig {
+    /// Uses this configuration and the supplied source of data to generate a
+    /// Wasm module.
+    ///
+    /// If a `default_fuel` is provided, the resulting module will be configured
+    /// to ensure termination; as doing so will add an additional global to the
+    /// module, the pooling allocator, if configured, must also have its globals
+    /// limit updated.
+    pub fn generate(
+        &self,
+        input: &mut Unstructured<'_>,
+        default_fuel: Option<u32>,
+    ) -> arbitrary::Result<wasm_smith::Module> {
+        let mut module = wasm_smith::Module::new(self.config.clone(), input)?;
+
+        if let Some(default_fuel) = default_fuel {
+            module.ensure_termination(default_fuel);
+        }
+
+        Ok(module)
+    }
+}
--- a/crates/fuzzing/src/generators/single_inst_module.rs
+++ b/crates/fuzzing/src/generators/single_inst_module.rs
@@ -1,6 +1,7 @@
 //! Generate Wasm modules that contain a single instruction.

-use arbitrary::{Arbitrary, Unstructured};
+use super::ModuleConfig;
+use arbitrary::Unstructured;
 use wasm_encoder::{
    CodeSection, ExportKind, ExportSection, Function, FunctionSection, Instruction, Module,
    TypeSection, ValType,
@@ -13,17 +14,38 @@ const FUNCTION_NAME: &'static str = "test";
 ///
 /// By explicitly defining the parameter and result types (versus generating the
 /// module directly), we can more easily generate values of the right type.
-#[derive(Clone, Debug)]
+#[derive(Clone)]
 pub struct SingleInstModule<'a> {
    instruction: Instruction<'a>,
    parameters: &'a [ValType],
    results: &'a [ValType],
+    feature: fn(&ModuleConfig) -> bool,
 }

 impl<'a> SingleInstModule<'a> {
-    /// Generate a binary Wasm module with a single exported function, `test`,
+    /// Choose a single-instruction module that matches `config`.
+    pub fn new(u: &mut Unstructured<'a>, config: &mut ModuleConfig) -> arbitrary::Result<&'a Self> {
+        // To avoid skipping modules unnecessarily during fuzzing, fix up the
+        // `ModuleConfig` to match the inherent limits of a single-instruction
+        // module.
+        config.config.min_funcs = 1;
+        config.config.max_funcs = 1;
+        config.config.min_tables = 0;
+        config.config.max_tables = 0;
+        config.config.min_memories = 0;
+        config.config.max_memories = 0;
+
+        // Only select instructions that match the `ModuleConfig`.
+        let instructions = &INSTRUCTIONS
+            .iter()
+            .filter(|i| (i.feature)(config))
+            .collect::<Vec<_>>();
+        u.choose(&instructions[..]).copied()
+    }
+
+    /// Encode a binary Wasm module with a single exported function, `test`,
    /// that executes the single instruction.
-    pub fn encode(&self) -> Vec<u8> {
+    pub fn to_bytes(&self) -> Vec<u8> {
        let mut module = Module::new();

        // Encode the type section.
@@ -61,12 +83,6 @@ impl<'a> SingleInstModule<'a> {
    }
 }

-impl<'a> Arbitrary<'a> for &SingleInstModule<'_> {
-    fn arbitrary(u: &mut Unstructured<'a>) -> arbitrary::Result<Self> {
-        u.choose(&INSTRUCTIONS)
-    }
-}
-
 // MACROS
 //
 // These macros make it a bit easier to define the instructions available for
@@ -91,39 +107,52 @@ macro_rules! valtype {

 macro_rules! binary {
    ($inst:ident, $rust_ty:tt) => {
-        binary! { $inst, valtype!($rust_ty), valtype!($rust_ty) }
+        binary! { $inst, $rust_ty, $rust_ty }
    };
-    ($inst:ident, $arguments_ty:expr,  $result_ty:expr) => {
+    ($inst:ident, $arguments_ty:tt,  $result_ty:tt) => {
        SingleInstModule {
            instruction: Instruction::$inst,
-            parameters: &[$arguments_ty, $arguments_ty],
-            results: &[$result_ty],
+            parameters: &[valtype!($arguments_ty), valtype!($arguments_ty)],
+            results: &[valtype!($result_ty)],
+            feature: |_| true,
        }
    };
 }

 macro_rules! compare {
    ($inst:ident, $rust_ty:tt) => {
-        binary! { $inst, valtype!($rust_ty), ValType::I32 }
+        binary! { $inst, $rust_ty, i32 }
    };
 }

 macro_rules! unary {
    ($inst:ident, $rust_ty:tt) => {
-        unary! { $inst, valtype!($rust_ty), valtype!($rust_ty) }
+        unary! { $inst, $rust_ty, $rust_ty }
    };
-    ($inst:ident, $argument_ty:expr, $result_ty:expr) => {
+    ($inst:ident, $argument_ty:tt, $result_ty:tt) => {
        SingleInstModule {
            instruction: Instruction::$inst,
-            parameters: &[$argument_ty],
-            results: &[$result_ty],
+            parameters: &[valtype!($argument_ty)],
+            results: &[valtype!($result_ty)],
+            feature: |_| true,
+        }
+    };
+    ($inst:ident, $argument_ty:tt, $result_ty:tt, $feature:expr) => {
+        SingleInstModule {
+            instruction: Instruction::$inst,
+            parameters: &[valtype!($argument_ty)],
+            results: &[valtype!($result_ty)],
+            feature: $feature,
        }
    };
 }

 macro_rules! convert {
    ($inst:ident, $from_ty:tt -> $to_ty:tt) => {
-        unary! { $inst, valtype!($from_ty), valtype!($to_ty) }
+        unary! { $inst, $from_ty, $to_ty }
+    };
+    ($inst:ident, $from_ty:tt -> $to_ty:tt, $feature:expr) => {
+        unary! { $inst, $from_ty, $to_ty, $feature }
    };
 }

@@ -172,7 +201,7 @@ static INSTRUCTIONS: &[SingleInstModule] = &[
    binary!(I64Rotr, i64),
    // Integer comparison.
    unary!(I32Eqz, i32),
-    unary!(I64Eqz, ValType::I64, ValType::I32),
+    unary!(I64Eqz, i64, i32),
    compare!(I32Eq, i32),
    compare!(I64Eq, i64),
    compare!(I32Ne, i32),
@@ -236,11 +265,11 @@ static INSTRUCTIONS: &[SingleInstModule] = &[
    compare!(F32Ge, f32),
    compare!(F64Ge, f64),
    // Integer conversions ("to integer").
-    unary!(I32Extend8S, i32),
-    unary!(I32Extend16S, i32),
-    unary!(I64Extend8S, i64),
-    unary!(I64Extend16S, i64),
-    convert!(I64Extend32S, i64 -> i64),
+    unary!(I32Extend8S, i32, i32, |c| c.config.sign_extension_enabled),
+    unary!(I32Extend16S, i32, i32, |c| c.config.sign_extension_enabled),
+    unary!(I64Extend8S, i64, i64, |c| c.config.sign_extension_enabled),
+    unary!(I64Extend16S, i64, i64, |c| c.config.sign_extension_enabled),
+    convert!(I64Extend32S, i64 -> i64, |c| c.config.sign_extension_enabled),
    convert!(I32WrapI64, i64 -> i32),
    convert!(I64ExtendI32S, i32 -> i64),
    convert!(I64ExtendI32U, i32 -> i64),
@@ -252,14 +281,14 @@ static INSTRUCTIONS: &[SingleInstModule] = &[
    convert!(I64TruncF32U, f32 -> i64),
    convert!(I64TruncF64S, f64 -> i64),
    convert!(I64TruncF64U, f64 -> i64),
-    convert!(I32TruncSatF32S, f32 -> i32),
-    convert!(I32TruncSatF32U, f32 -> i32),
-    convert!(I32TruncSatF64S, f64 -> i32),
-    convert!(I32TruncSatF64U, f64 -> i32),
-    convert!(I64TruncSatF32S, f32 -> i64),
-    convert!(I64TruncSatF32U, f32 -> i64),
-    convert!(I64TruncSatF64S, f64 -> i64),
-    convert!(I64TruncSatF64U, f64 -> i64),
+    convert!(I32TruncSatF32S, f32 -> i32, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I32TruncSatF32U, f32 -> i32, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I32TruncSatF64S, f64 -> i32, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I32TruncSatF64U, f64 -> i32, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I64TruncSatF32S, f32 -> i64, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I64TruncSatF32U, f32 -> i64, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I64TruncSatF64S, f64 -> i64, |c| c.config.saturating_float_to_int_enabled),
+    convert!(I64TruncSatF64U, f64 -> i64, |c| c.config.saturating_float_to_int_enabled),
    convert!(I32ReinterpretF32, f32 -> i32),
    convert!(I64ReinterpretF64, f64 -> i64),
    // Floating-point conversions ("to float").
@@ -287,8 +316,9 @@ mod test {
            instruction: Instruction::I32Add,
            parameters: &[ValType::I32, ValType::I32],
            results: &[ValType::I32],
+            feature: |_| true,
        };
-        let wasm = sut.encode();
+        let wasm = sut.to_bytes();
        let wat = wasmprinter::print_bytes(wasm).unwrap();
        assert_eq!(
            wat,
@@ -307,7 +337,7 @@ mod test {
    #[test]
    fn instructions_encode_to_valid_modules() {
        for inst in INSTRUCTIONS {
-            assert!(wat::parse_bytes(&inst.encode()).is_ok());
+            assert!(wat::parse_bytes(&inst.to_bytes()).is_ok());
        }
    }
 }
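
The feature-gating pattern introduced in this file (a `feature: fn(&ModuleConfig) -> bool` predicate stored next to each instruction and applied as a filter in `SingleInstModule::new`) can be sketched without any external crates. The `Features` and `Entry` types and the instruction names below are hypothetical simplifications for illustration only; the real code filters `INSTRUCTIONS` against a `ModuleConfig` and then picks one entry with `Unstructured::choose`.

```rust
/// Hypothetical stand-in for the relevant `ModuleConfig` flags.
struct Features {
    sign_extension: bool,
    saturating_float_to_int: bool,
}

/// One table entry: an instruction name plus a predicate that says
/// whether a given feature set permits the instruction.
struct Entry {
    name: &'static str,
    feature: fn(&Features) -> bool,
}

/// Non-capturing closures coerce to `fn` pointers, so the predicates can
/// live directly in a `static` table.
static ENTRIES: &[Entry] = &[
    Entry { name: "i32.add", feature: |_| true },
    Entry { name: "i32.extend8_s", feature: |f| f.sign_extension },
    Entry { name: "i32.trunc_sat_f32_s", feature: |f| f.saturating_float_to_int },
];

/// Return the names of all instructions permitted by `features`,
/// preserving table order.
fn allowed(features: &Features) -> Vec<&'static str> {
    ENTRIES
        .iter()
        .filter(|e| (e.feature)(features))
        .map(|e| e.name)
        .collect()
}
```

Storing a plain function pointer rather than a boxed closure keeps the table `static` and allocation-free. It is also presumably why the diff drops `Debug` from the derive list on `SingleInstModule`, since a function pointer with a reference parameter does not support a derived `Debug`.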
--- a/crates/fuzzing/src/generators/value.rs
+++ b/crates/fuzzing/src/generators/value.rs
@@ -0,0 +1,177 @@
+//! Generate Wasm values, primarily for differential execution.
+
+use arbitrary::{Arbitrary, Unstructured};
+use std::hash::Hash;
+
+/// A value passed to and from evaluation. Note that reference types are not
+/// (yet) supported.
+#[derive(Clone, Debug)]
+#[allow(missing_docs)]
+pub enum DiffValue {
+    I32(i32),
+    I64(i64),
+    F32(u32),
+    F64(u64),
+    V128(u128),
+}
+
+impl DiffValue {
+    fn ty(&self) -> DiffValueType {
+        match self {
+            DiffValue::I32(_) => DiffValueType::I32,
+            DiffValue::I64(_) => DiffValueType::I64,
+            DiffValue::F32(_) => DiffValueType::F32,
+            DiffValue::F64(_) => DiffValueType::F64,
+            DiffValue::V128(_) => DiffValueType::V128,
+        }
+    }
+
+    /// Generate a [`DiffValue`] of the given `ty` type.
+    ///
+    /// This function will bias the returned value 50% of the time towards one
+    /// of a set of known values (e.g., NaN, -1, 0, infinity, etc.).
+    pub fn arbitrary_of_type(
+        u: &mut Unstructured<'_>,
+        ty: DiffValueType,
+    ) -> arbitrary::Result<Self> {
+        use DiffValueType::*;
+        let val = match ty {
+            I32 => DiffValue::I32(biased_arbitrary_value(u, KNOWN_I32_VALUES)?),
+            I64 => DiffValue::I64(biased_arbitrary_value(u, KNOWN_I64_VALUES)?),
+            F32 => {
+                // TODO once `to_bits` is stable as a `const` function, move
+                // this to a `const` definition.
+                let known_f32_values = &[
+                    f32::NAN.to_bits(),
+                    f32::INFINITY.to_bits(),
+                    f32::NEG_INFINITY.to_bits(),
+                    f32::MIN.to_bits(),
+                    (-1.0f32).to_bits(),
+                    (0.0f32).to_bits(),
+                    (1.0f32).to_bits(),
+                    f32::MAX.to_bits(),
+                ];
+                DiffValue::F32(biased_arbitrary_value(u, known_f32_values)?)
+            }
+            F64 => {
+                // TODO once `to_bits` is stable as a `const` function, move
+                // this to a `const` definition.
+                let known_f64_values = &[
+                    f64::NAN.to_bits(),
+                    f64::INFINITY.to_bits(),
+                    f64::NEG_INFINITY.to_bits(),
+                    f64::MIN.to_bits(),
+                    (-1.0f64).to_bits(),
+                    (0.0f64).to_bits(),
+                    (1.0f64).to_bits(),
+                    f64::MAX.to_bits(),
+                ];
+                DiffValue::F64(biased_arbitrary_value(u, known_f64_values)?)
+            }
+            V128 => DiffValue::V128(biased_arbitrary_value(u, KNOWN_U128_VALUES)?),
+        };
+        arbitrary::Result::Ok(val)
+    }
+}
+
+const KNOWN_I32_VALUES: &[i32] = &[i32::MIN, -1, 0, 1, i32::MAX];
+const KNOWN_I64_VALUES: &[i64] = &[i64::MIN, -1, 0, 1, i64::MAX];
+const KNOWN_U128_VALUES: &[u128] = &[u128::MIN, 1, u128::MAX];
+
+/// Helper function to pick a known value from the list of `known_values` half
+/// the time.
+fn biased_arbitrary_value<'a, T>(
+    u: &mut Unstructured<'a>,
+    known_values: &[T],
+) -> arbitrary::Result<T>
+where
+    T: Arbitrary<'a> + Copy,
+{
+    let pick_from_known_values: bool = u.arbitrary()?;
+    if pick_from_known_values {
+        Ok(*u.choose(known_values)?)
+    } else {
+        u.arbitrary()
+    }
+}
+
+impl<'a> Arbitrary<'a> for DiffValue {
+    fn arbitrary(u: &mut Unstructured<'a>) -> arbitrary::Result<Self> {
+        let ty: DiffValueType = u.arbitrary()?;
+        DiffValue::arbitrary_of_type(u, ty)
+    }
+}
+
+impl Hash for DiffValue {
+    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
+        self.ty().hash(state);
+        match self {
+            DiffValue::I32(n) => n.hash(state),
+            DiffValue::I64(n) => n.hash(state),
+            DiffValue::F32(n) => n.hash(state),
+            DiffValue::F64(n) => n.hash(state),
+            DiffValue::V128(n) => n.hash(state),
+        }
+    }
+}
+
+/// Implement equality checks. Note that floating-point values are not compared
+/// bit-for-bit in the case of NaNs: because Wasm floating-point numbers may be
+/// [arithmetic NaNs with arbitrary payloads] and Wasm operations are [not
+/// required to propagate NaN payloads], we simply check that both sides are
+/// NaNs here. We could be more strict, though: we could check that the NaN
+/// signs are equal and that [canonical NaN payloads remain canonical].
+///
+/// [arithmetic NaNs with arbitrary payloads]:
+///     https://webassembly.github.io/spec/core/bikeshed/index.html#floating-point%E2%91%A0
+/// [not required to propagate NaN payloads]:
+///     https://webassembly.github.io/spec/core/bikeshed/index.html#floating-point-operations%E2%91%A0
+/// [canonical NaN payloads remain canonical]:
+///     https://webassembly.github.io/spec/core/bikeshed/index.html#nan-propagation%E2%91%A0
+impl PartialEq for DiffValue {
+    fn eq(&self, other: &Self) -> bool {
+        match (self, other) {
+            (Self::I32(l0), Self::I32(r0)) => l0 == r0,
+            (Self::I64(l0), Self::I64(r0)) => l0 == r0,
+            (Self::V128(l0), Self::V128(r0)) => l0 == r0,
+            (Self::F32(l0), Self::F32(r0)) => {
+                let l0 = f32::from_bits(*l0);
+                let r0 = f32::from_bits(*r0);
+                l0 == r0 || (l0.is_nan() && r0.is_nan())
+            }
+            (Self::F64(l0), Self::F64(r0)) => {
+                let l0 = f64::from_bits(*l0);
+                let r0 = f64::from_bits(*r0);
+                l0 == r0 || (l0.is_nan() && r0.is_nan())
+            }
+            _ => false,
+        }
+    }
+}
+
+/// Enumerate the supported value types.
+#[derive(Clone, Debug, Arbitrary, Hash)]
+#[allow(missing_docs)]
+pub enum DiffValueType {
+    I32,
+    I64,
+    F32,
+    F64,
+    V128,
+}
+
+impl TryFrom<wasmtime::ValType> for DiffValueType {
+    type Error = &'static str;
+    fn try_from(ty: wasmtime::ValType) -> Result<Self, Self::Error> {
+        use wasmtime::ValType::*;
+        match ty {
+            I32 => Ok(Self::I32),
+            I64 => Ok(Self::I64),
+            F32 => Ok(Self::F32),
+            F64 => Ok(Self::F64),
+            V128 => Ok(Self::V128),
+            FuncRef => Err("unable to convert reference types"),
+            ExternRef => Err("unable to convert reference types"),
+        }
+    }
+}
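
The lenient NaN comparison implemented by `PartialEq for DiffValue` can be exercised in isolation. The sketch below uses a reduced, hypothetical `Value` enum with only two variants; it follows the same rule as the diff (floats are stored as raw bits, and two floats are equal if they compare equal as floats or are both NaN), but it is not the crate's type.

```rust
/// Hypothetical, reduced version of `DiffValue`: floats are stored as
/// their raw bit patterns, as in the diff above.
#[derive(Clone, Copy, Debug)]
enum Value {
    I32(i32),
    F32(u32),
}

impl PartialEq for Value {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (Value::I32(l), Value::I32(r)) => l == r,
            (Value::F32(l), Value::F32(r)) => {
                let l = f32::from_bits(*l);
                let r = f32::from_bits(*r);
                // Any two NaNs are considered equal, regardless of sign
                // or payload bits.
                l == r || (l.is_nan() && r.is_nan())
            }
            // Values of different types never compare equal.
            _ => false,
        }
    }
}

/// Two NaN bit patterns that differ in both sign and payload.
fn sample_nans() -> (Value, Value) {
    (Value::F32(0x7fc0_0000), Value::F32(0xffc0_0001))
}
```

Because the float branch compares with `==` on `f32`, `+0.0` and `-0.0` also compare equal even though their bit patterns differ. The `Hash` implementation in the diff hashes the raw bits, so two values that are equal under this rule may still hash differently.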