[fuzz] Add a meta-differential fuzz target (#4515)

* [fuzz] Add `Module` enum, refactor `ModuleConfig`

This change adds a way to create either a single-instruction module or a regular (big) `wasm-smith` module. It has some slight refactorings in preparation for the use of this new code.

* [fuzz] Add `DiffValue` for differential evaluation

In order to evaluate functions with randomly-generated values, we needed a common way to generate these values. Using the Wasmtime `Val` type is not great because we would like to be able to implement various traits on the new value type, e.g., to convert `Into` and `From` boxed values of other engines we differentially fuzz against. This new type, `DiffValue`, gives us a common ground for all the conversions and comparisons between the other engine types.

* [fuzz] Add interface for differential engines

In order to randomly choose an engine to fuzz against, we expect all of the engines to meet a common interface. The traits in this commit allow us to instantiate a module from its binary form, evaluate exported functions, and (possibly) hash the exported items of the instance.

This change has some missing pieces, though:
 - the `wasm-spec-interpreter` needs some work to be able to create instances, evaluate a function by name, and expose exported items
 - the `v8` engine is not implemented yet due to the complexity of its Rust lifetimes

* [fuzz] Use `ModuleFeatures` instead of existing configuration

When attempting to use both wasm-smith and single-instruction modules, there is a mismatch in how we communicate what an engine must be able to support. In the first case, we could use the `ModuleConfig`, a wrapper for wasm-smith's `SwarmConfig`, but single-instruction modules do not have a `SwarmConfig`--the many options simply don't apply. Here, we instead add `ModuleFeatures` and adapt a `ModuleConfig` to that. `ModuleFeatures` then becomes the way to communicate what features an engine must support to evaluate functions in a module.

* [fuzz] Add a new fuzz target using the meta-differential oracle

This change adds the `differential_meta` target to the list of fuzz targets. I expect that sometime soon this could replace the other `differential*` targets, as it checks almost all the things those check. The major missing piece is that currently it only chooses single-instruction modules instead of also generating arbitrary modules using `wasm-smith`.

Also, this change adds the concept of an ignorable error: some differential engines will choke on certain inputs (e.g., `wasmi` might have an old opcode mapping), which we do not want to flag as fuzz bugs. Here we wrap those errors in `DiffIgnoreError` and then use a new helper trait, `DiffIgnorable`, to downcast and inspect the `anyhow` error to only panic on non-ignorable errors; the ignorable errors are converted to one of the `arbitrary::Error` variants, which we already ignore.

* [fuzz] Compare `DiffValue` NaNs more leniently

Because arithmetic NaNs can contain arbitrary payload bits, checking that two differential executions produce the same result should relax the comparison of the `F32` and `F64` types (and eventually `V128` as well... TODO). This change adds several considerations, however, so that in the future we can make the comparison a bit stricter, e.g., re: canonical NaNs. This change, however, just matches the current logic used by other fuzz targets.

* review: allow hashing to mutate the instance state

@alexcrichton requested that the interface be adapted to accommodate Wasmtime's API, in which even reading from an instance could trigger mutation of the store.

* review: refactor where configurations are made compatible

See @alexcrichton's [suggestion](https://github.com/bytecodealliance/wasmtime/pull/4515#discussion_r928974376).

* review: convert `DiffValueType` using `TryFrom`

See @alexcrichton's [comment](https://github.com/bytecodealliance/wasmtime/pull/4515#discussion_r928962394).

* review: adapt target implementation to Wasmtime-specific RHS

This change is joint work with @alexcrichton to adapt the structure of the fuzz target to his comments [here](https://github.com/bytecodealliance/wasmtime/pull/4515#pullrequestreview-1073247791).

This change:
 - removes `ModuleFeatures` and the `Module` enum (for big and small modules)
 - upgrades `SingleInstModule` to filter out cases that are not valid for a given `ModuleConfig`
 - adds `DiffEngine::name()`
 - constructs each `DiffEngine` using a `ModuleConfig`, eliminating `DiffIgnoreError` completely
 - prints an execution rate to the `differential_meta` target

Still TODO:
 - `get_exported_function_signatures` could be re-written in terms of the Wasmtime API instead of `wasmparser`
 - the fuzzer crashes eventually, we think due to the signal handler interference between OCaml and Wasmtime
 - the spec interpreter has several cases that we skip for now but could be fuzzed with further work

Co-authored-by: Alex Crichton <alex@alexcrichton.com>

* fix: avoid SIGSEGV by explicitly initializing OCaml runtime first

* review: use Wasmtime's API to retrieve exported functions

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
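
As an illustration of the lenient NaN comparison described in the commit message, here is a minimal sketch, not the code from this commit. It assumes floats are stored as raw bit patterns; the `SketchValue` enum and its variants are hypothetical stand-ins for `DiffValue`. Two float values compare equal if their bits match exactly or if both are NaN, whatever their payload bits.

```rust
/// Hypothetical stand-in for `DiffValue` (an assumption for illustration);
/// floats are kept as raw bits so NaN payloads are preserved.
#[derive(Debug, Clone, Copy)]
pub enum SketchValue {
    I32(i32),
    I64(i64),
    F32(u32),
    F64(u64),
}

impl PartialEq for SketchValue {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (Self::I32(l), Self::I32(r)) => l == r,
            (Self::I64(l), Self::I64(r)) => l == r,
            // Floats: identical bits, or both sides are NaN with any payload.
            (Self::F32(l), Self::F32(r)) => {
                l == r || (f32::from_bits(*l).is_nan() && f32::from_bits(*r).is_nan())
            }
            (Self::F64(l), Self::F64(r)) => {
                l == r || (f64::from_bits(*l).is_nan() && f64::from_bits(*r).is_nan())
            }
            // Values of different types are never equal.
            _ => false,
        }
    }
}
```

With this definition, two NaNs with different payloads (e.g., bit patterns `0x7fc0_0000` and `0x7fc0_0001`) compare equal, while a NaN and an infinity do not. Comparing non-NaN floats by bits rather than by value is a choice made for this sketch; it distinguishes `+0.0` from `-0.0`.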
2022-08-18 17:22:58 -07:00
parent 8b6019909b
commit 5ec92d59d2
14 changed files with 1046 additions and 53 deletions
--- a/crates/fuzzing/src/oracles.rs
+++ b/crates/fuzzing/src/oracles.rs
@@ -10,14 +10,23 @@
 //! When an oracle finds a bug, it should report it to the fuzzing engine by
 //! panicking.

+#[cfg(feature = "fuzz-spec-interpreter")]
+pub mod diff_spec;
+pub mod diff_wasmi;
+pub mod diff_wasmtime;
 pub mod dummy;
+pub mod engine;
 mod stacks;

-use crate::generators;
+use self::diff_wasmtime::WasmtimeInstance;
+use self::engine::DiffInstance;
+use crate::generators::{self, DiffValue};
 use arbitrary::Arbitrary;
 use log::debug;
 pub use stacks::check_stacks;
 use std::cell::Cell;
+use std::collections::hash_map::DefaultHasher;
+use std::hash::Hasher;
 use std::rc::Rc;
 use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};
 use std::sync::{Arc, Condvar, Mutex};
@@ -240,9 +249,10 @@ fn compile_module(
            if let generators::InstanceAllocationStrategy::Pooling { .. } =
                &config.wasmtime.strategy
            {
-                // When using the pooling allocator, accept failures to compile when arbitrary
-                // table element limits have been exceeded as there is currently no way
-                // to constrain the generated module table types.
+                // When using the pooling allocator, accept failures to compile
+                // when arbitrary table element limits have been exceeded as
+                // there is currently no way to constrain the generated module
+                // table types.
                let string = e.to_string();
                if string.contains("minimum element size") {
                    return None;
@@ -250,7 +260,7 @@ fn compile_module(

                // Allow modules-failing-to-compile which exceed the requested
                // size for each instance. This is something that is difficult
-                // to control and ensure it always suceeds, so we simply have a
+                // to control and ensure it always succeeds, so we simply have a
                // "random" instance size limit and if a module doesn't fit we
                // move on to the next fuzz input.
                if string.contains("instance allocation for this module requires") {
@@ -263,7 +273,17 @@ fn compile_module(
    }
 }

-fn instantiate_with_dummy(store: &mut Store<StoreLimits>, module: &Module) -> Option<Instance> {
+/// Create a Wasmtime [`Instance`] from a [`Module`] and fill in all imports
+/// with dummy values (e.g., zeroed values, immediately-trapping functions).
+/// Also, this function catches certain fuzz-related instantiation failures and
+/// returns `None` instead of panicking.
+///
+/// TODO: we should implement tracing versions of these dummy imports that
+/// record a trace of the order that imported functions were called in and with
+/// what values. Like the results of exported functions, calls to imports should
+/// also yield the same values for each configuration, and we should assert
+/// that.
+pub fn instantiate_with_dummy(store: &mut Store<StoreLimits>, module: &Module) -> Option<Instance> {
    // Creation of imports can fail due to resource limit constraints, and then
    // instantiation can naturally fail for a number of reasons as well. Bundle
    // the two steps together to match on the error below.
@@ -279,12 +299,14 @@ fn instantiate_with_dummy(store: &mut Store<StoreLimits>, module: &Module) -> Op
    // expected that fuzz-generated programs try to allocate lots of
    // stuff.
    if store.data().0.oom.get() {
+        log::debug!("failed to instantiate: OOM");
        return None;
    }

    // Allow traps which can happen normally with `unreachable` or a
    // timeout or such
-    if e.downcast_ref::<Trap>().is_some() {
+    if let Some(trap) = e.downcast_ref::<Trap>() {
+        log::debug!("failed to instantiate: {}", trap);
        return None;
    }

@@ -296,11 +318,13 @@ fn instantiate_with_dummy(store: &mut Store<StoreLimits>, module: &Module) -> Op
        // rather than positional-based resolution
        || string.contains("incompatible import type")
    {
+        log::debug!("failed to instantiate: {}", string);
        return None;
    }

    // Also allow failures to instantiate as a result of hitting instance limits
    if string.contains("concurrent instances has been reached") {
+        log::debug!("failed to instantiate: {}", string);
        return None;
    }

@@ -308,6 +332,55 @@ fn instantiate_with_dummy(store: &mut Store<StoreLimits>, module: &Module) -> Op
    panic!("failed to instantiate: {:?}", e);
 }

+/// Evaluate the function identified by `name` in two different engine
+/// instances--`lhs` and `rhs`.
+///
+/// # Panics
+///
+/// This will panic if the evaluation is different between engines (e.g.,
+/// results are different, hashed instance is different, one side traps, etc.).
+pub fn differential(
+    lhs: &mut dyn DiffInstance,
+    rhs: &mut WasmtimeInstance,
+    name: &str,
+    args: &[DiffValue],
+) -> anyhow::Result<()> {
+    log::debug!("Evaluating: {}({:?})", name, args);
+    let lhs_results = lhs.evaluate(name, args);
+    log::debug!(" -> results on {}: {:?}", lhs.name(), &lhs_results);
+    let rhs_results = rhs.evaluate(name, args);
+    log::debug!(" -> results on {}: {:?}", rhs.name(), &rhs_results);
+    match (lhs_results, rhs_results) {
+        // If the evaluation succeeds, we compare the results.
+        (Ok(lhs_results), Ok(rhs_results)) => assert_eq!(lhs_results, rhs_results),
+        // Both sides failed--this is an acceptable result (e.g., both sides
+        // trap at a divide by zero). We could compare the error strings perhaps
+        // (since the `lhs` and `rhs` could be failing for different reasons)
+        // but this seems good enough for now.
+        (Err(_), Err(_)) => {}
+        // A real bug is found if only one side fails.
+        (Ok(_), Err(_)) => panic!("only the `rhs` ({}) failed for this input", rhs.name()),
+        (Err(_), Ok(_)) => panic!("only the `lhs` ({}) failed for this input", lhs.name()),
+    };
+
+    let hash = |i: &mut dyn DiffInstance| -> anyhow::Result<u64> {
+        let mut hasher = DefaultHasher::new();
+        i.hash(&mut hasher)?;
+        Ok(hasher.finish())
+    };
+
+    if lhs.is_hashable() && rhs.is_hashable() {
+        log::debug!("Hashing instances:");
+        let lhs_hash = hash(lhs)?;
+        log::debug!(" -> hash of {}: {:?}", lhs.name(), lhs_hash);
+        let rhs_hash = hash(rhs)?;
+        log::debug!(" -> hash of {}: {:?}", rhs.name(), rhs_hash);
+        assert_eq!(lhs_hash, rhs_hash);
+    }
+
+    Ok(())
+}
+
 /// Instantiate the given Wasm module with each `Config` and call all of its
 /// exports. Modulo OOM, non-canonical NaNs, and usage of Wasm features that are
 /// or aren't enabled for different configs, we should get the same results when