Commit Graph

24 Commits

Author SHA1 Message Date
Alex Crichton
fc6328ae06 Temporarily disable SIMD fuzzing on CI (#3376)
We've got a large crop of fuzz-bugs from fuzzing with SIMD enabled on
oss-fuzz, and at this point the fuzz stats from oss-fuzz say that
fuzzers like v8 are spending less than 50% of their time actually
fuzzing, presumably because they're mostly hitting known crashes and
such. While we fix those other issues this disables simd for fuzzing
with v8 so we can try to weed out other issues.
2021-09-20 14:17:19 -05:00
Alex Crichton
1532516a36 Use relative call instructions between wasm functions (#3275)
* Use relative `call` instructions between wasm functions

This commit is a relatively major change to the way that Wasmtime
generates code for Wasm modules and how functions call each other.
Prior to this commit all calls between functions, even if they were
defined in the same module, were done indirectly through a register. To
implement this the backend would emit an absolute 8-byte relocation near
all function calls, load that address into a register, and then call it.
While this technique is simple to implement and easy to get right, it
has two primary downsides:

* Function calls are always indirect which means they are more difficult
  to predict, resulting in worse performance.

* Generating a relocation-per-function call requires expensive
  relocation resolution at module-load time, which can be a large
  contributing factor to how long it takes to load a precompiled module.

To fix these issues, while compromising somewhat on the previously
simple implementation technique, this commit switches wasm calls within
a module to use what Cranelift calls the `colocated` flag, which
basically means that a relative call instruction is used with a
relocation that's resolved relative to the pc of the call instruction
itself.
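
As a rough sketch of what setting that flag looks like at the Cranelift
IR layer (written against an older Cranelift API; the signature,
namespace, and index below are made up for illustration):

```rust
use cranelift_codegen::ir::{types, AbiParam, ExtFuncData, ExternalName, FuncRef, Function, Signature};
use cranelift_codegen::isa::CallConv;

// Hypothetical helper: import a callee into `func` as a colocated function.
fn import_colocated_callee(func: &mut Function) -> FuncRef {
    // A made-up signature for the sketch: fn(i64) -> i64.
    let mut sig = Signature::new(CallConv::SystemV);
    sig.params.push(AbiParam::new(types::I64));
    sig.returns.push(AbiParam::new(types::I64));
    let sigref = func.import_signature(sig);

    // `colocated: true` tells Cranelift the callee lives in the same text
    // section, so `call` can lower to a pc-relative instruction (`call rel32`
    // on x86_64, `bl` on AArch64) rather than loading an absolute 8-byte
    // relocated address into a register first.
    func.import_function(ExtFuncData {
        name: ExternalName::user(0, 42), // namespace/index are placeholders
        signature: sigref,
        colocated: true,
    })
}
```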

With the `colocated` flag switched to `true` this commit is then able
to move much of the relocation resolution from `wasmtime_jit::link`
into `wasmtime_cranelift::obj` at object-construction time. This
frontloads all relocation work, which means that there are actually no
relocations related to function calls in the final image, solving both
of our points above.

The main gotcha in implementing this technique is that there are
hardware limitations on relative function calls which mean we can't
blindly use them. AArch64, for example, can only reach +/- 64 MB
from the `bl` instruction to the target, which means that if the
function we're calling is farther away than that we would fail to
resolve that relocation. On x86_64 the limit is +/- 2GB, which is much
larger but theoretically still feasible to hit. Consequently the main
increase in implementation complexity is handling this issue.

This issue is actually already present in Cranelift itself, and is
internally one of the invariants handled by the `MachBuffer` type: when
generating a function, relative jumps between basic blocks have similar
restrictions. This commit adds new methods to the `MachBackend` trait
and updates the implementation of `MachBuffer` to account for all these
new branches. Specifically the changes to `MachBuffer` are:

* For AArch64 the `LabelUse::Branch26` value now supports veneers, and
  AArch64 calls use this to resolve relocations.

* The `emit_island` function has been rewritten internally to handle
  some cases which didn't come up before, such as:

  * When emitting an island the deadline is now recalculated, where
    previously it was always set infinitely far in the future. This was
    previously ok since only `Branch19` supported veneers, and once
    promoted no further veneers were supported, so without multiple
    layers of promotion the lack of a new deadline was fine.

  * When emitting an island all pending fixups had veneers forced if
    their branch target wasn't known yet. This was generally ok for
    19-bit fixups since the only kind getting a veneer was a 19-bit
    fixup, but with mixed kinds it's a bit odd to force veneers for a
    26-bit fixup just because a nearby 19-bit fixup needed a veneer.
    Instead fixups are now re-enqueued unless they're known to be
    out-of-bounds. This may run the risk of generating more islands for
    19-bit branches but it should also reduce the number of islands for
    between-function calls.

  * Otherwise the internal logic was tweaked to ideally be a bit
    simpler, but that's a pretty subjective criterion in compilers...

I've added some simple testing of this for now. A synthetic compiler
option was created to simply insert zero padding between functions, and
test cases implement various forms of calls that at least need veneers.
A test is also included for x86_64, but it is unfortunately pretty slow
because it requires generating 2GB of output. I'm hoping for now it's
not too bad, but we can disable the test if it's prohibitive, and
otherwise the relevant portions are commented to remind us to run the
ignored test whenever these parts of the code change.

The end result of this commit is that, for a large module I'm working
with, the number of relocations dropped to zero, meaning that nothing
actually needs to be done to the text section when it's loaded into
memory (yay!). I haven't run final benchmarks yet, but this is the last
remaining source of significant slowdown when loading modules, once I
land a number of other PRs, both active ones and ones I only have
locally for now.

* Fix arm32

* Review comments
2021-09-01 13:27:38 -05:00
Alex Crichton
e68aa99588 Implement the memory64 proposal in Wasmtime (#3153)
* Implement the memory64 proposal in Wasmtime

This commit implements the WebAssembly [memory64 proposal][proposal] in
both Wasmtime and Cranelift. Cranelift ended up needing very little work
here since most of it was already prepared for 64-bit memories at one
point or another. Most of the work in Wasmtime is largely refactoring,
changing a bunch of `u32` values to something wider.

A number of internal and public interfaces are changing as a result of
this commit, for example:

* Accessors on `wasmtime::Memory` that work with pages now all return
  `u64` unconditionally rather than `u32`. This makes it possible to
  accommodate 64-bit memories with this API, but we may also want to
  consider `usize` here at some point since the host can't grow past
  `usize`-limited pages anyway.

* The `wasmtime::Limits` structure is removed in favor of
  minimum/maximum methods on table/memory types.

* Many libcall intrinsics called by jit code now unconditionally take
  `u64` arguments instead of `u32`. Return values are `usize`, however,
  since the return value, if successful, is always bounded by host
  memory while arguments can come from any guest.

* The `heap_addr` clif instruction now takes a 64-bit offset argument
  instead of a 32-bit one. It turns out that the legalization of
  `heap_addr` already worked with 64-bit offsets, so this change was
  fairly trivial to make.

* The runtime implementation of mmap-based linear memories has changed
  to largely work in `usize` quantities in its API and in bytes instead
  of pages. This simplifies various aspects and reflects that
  mmap-memories are always bound by `usize` since that's what the host
  is using to address things, and additionally most calculations care
  about bytes rather than pages except for the very edge where we're
  going to/from wasm.

Overall I've tried to minimize the number of `as` casts, using checked
`try_from` and checked arithmetic with either error handling or explicit
`unwrap()` calls to tell us about bugs in the future. Most locations
have relatively obvious things to do with various implications on
various hosts, and I think they should all be roughly the right shape,
but time will tell. I mostly relied on the compiler complaining that
various types didn't match up to figure out the type-casting, and I
manually audited some of the more obvious locations. I suspect we have a
number of hidden locations that will panic on 32-bit hosts if 64-bit
modules try to run there, but otherwise I think we should be generally
ok (famous last words). In any case I naturally wouldn't want to enable
this by default until we've fuzzed it for some time.

In terms of the actual underlying implementation, no one should expect
memory64 to be all that fast. Right now it's implemented with
"dynamic" heaps which have a few consequences:

* All memory accesses are bounds-checked. I'm not sure how aggressively
  Cranelift tries to optimize out bounds checks, but I suspect not a ton
  since we haven't stressed this much historically.

* Heaps are always precisely sized. This means that every call to
  `memory.grow` will incur a `memcpy` of memory from the old heap to the
  new. We probably want to at least look into `mremap` on Linux and
  otherwise try to implement schemes where dynamic heaps have some
  reserved pages to grow into to help amortize the cost of
  `memory.grow`.

The memory64 spec test suite is now scheduled to run on CI, but as with
all the other spec test suites it's really not all that comprehensive.
I've tried adding more tests for basic things as I've had to implement
guards for them, but I wouldn't really consider the testing adequate
from just this PR itself. I did take care in one test that actually
allocates a 4GB+ heap to avoid running it in the pooling allocator or
under emulation, because otherwise it may fail or take excessively
long.

[proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md
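
For illustration, here's a minimal sketch of what 64-bit memories look
like from the embedding API once enabled (method names taken from the
`wasmtime` crate; exact signatures have shifted a bit between versions):

```rust
use wasmtime::{Config, Engine, Instance, Module, Store};

fn main() -> anyhow::Result<()> {
    // memory64 is off by default; opt in explicitly.
    let mut config = Config::new();
    config.wasm_memory64(true);
    let engine = Engine::new(&config)?;

    // A module exporting an i64-indexed (64-bit) linear memory.
    let module = Module::new(&engine, r#"(module (memory (export "mem") i64 1))"#)?;
    let mut store = Store::new(&engine, ());
    let instance = Instance::new(&mut store, &module, &[])?;

    // Page-based accessors now return `u64` rather than `u32`.
    let memory = instance.get_memory(&mut store, "mem").unwrap();
    let pages: u64 = memory.size(&store);
    assert_eq!(pages, 1);
    Ok(())
}
```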

* Fix some tests

* More test fixes

* Fix wasmtime tests

* Fix doctests

* Revert to 32-bit immediate offsets in `heap_addr`

This commit updates the generation of addresses in wasm code to always
use 32-bit offsets for `heap_addr`, and if the calculated offset is
bigger than 32 bits we emit a manual add with an overflow check.

* Disable memory64 for spectest fuzzing

* Fix wrong offset being added to heap addr

* More comments!

* Clarify bytes/pages
2021-08-12 09:40:20 -05:00
Alex Crichton
bb85366a3b Enable simd fuzzing on oss-fuzz (#3152)
* Enable simd fuzzing on oss-fuzz

This commit generally enables the simd feature while fuzzing, which
should affect almost all fuzzers. For fuzzers that just throw random
data at the wall and see what sticks, this means that they'll now be
able to throw simd-shaped data at the wall and have it stick. For
wasm-smith-based fuzzers this commit also updates wasm-smith to 0.6.0,
which allows further configuring the `SwarmConfig` after generation,
notably allowing `instantiate-swarm` to generate simd-using modules with
`wasm-smith`. This should much more reliably feed simd-related things
into the fuzzers.

Finally, this commit updates wasmtime's fuzzers to avoid the general
`wasm_smith::Module` generator and instead use a Wasmtime-specific
default configuration which enables the various features we have
implemented.

* Allow dummy table creation to fail

Table creation for imports may exceed the memory limit on the store,
which we'll want to gracefully recover from rather than failing the
fuzzers.
2021-08-05 16:24:42 -05:00
Alex Crichton
7ce46043dc Add guard pages to the front of linear memories (#2977)
* Add guard pages to the front of linear memories

This commit implements a safety feature for Wasmtime to place guard
pages before the allocation of all linear memories. Guard pages placed
after linear memories are typically present for performance (at least)
because they can help elide bounds checks. Guard pages before a linear
memory, however, are never strictly needed for performance or features.
The intention of a preceding guard page is to help insulate against bugs
in Cranelift or other code generators, such as CVE-2021-32629.

This commit adds a `Config::guard_before_linear_memory` configuration
option, defaulting to `true`, which indicates whether guard pages should
be present both before and after linear memories. Guard region sizes
continue to be controlled by the
`{static,dynamic}_memory_guard_size` methods.
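
A minimal sketch of configuring this (the new option defaults to `true`;
the guard size below is just an illustrative value):

```rust
use wasmtime::{Config, Engine};

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // Place a guard region before each linear memory as well as after it.
    config.guard_before_linear_memory(true);
    // Guard sizes continue to be governed by the existing knobs.
    config.static_memory_guard_size(2 << 30); // 2 GiB trailing guard
    let _engine = Engine::new(&config)?;
    Ok(())
}
```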

The implementation here affects both on-demand allocated memories and
the pooling allocator for memories. For on-demand memories this adjusts
the size of the allocation as well as the calculations for the base
pointer of the wasm memory. For the pooling allocator this will place a
single extra guard region at the very start of the allocation for
memories. Since linear memories in the pooling allocator are contiguous,
every memory already had a preceding guard region: the trailing guard
region of the previous memory. Only the first memory needed this extra
guard.

I've attempted to write some tests covering all this, but it's somewhat
tricky to test because the settings are pretty far removed from the
actual behavior. I think, though, that the tests added here should help
cover various use cases and give us confidence in tweaking the various
`Config` settings beyond their defaults.

Note that this also contains a semantic change where
`InstanceLimits::memory_reservation_size` has been removed. Instead this
field is now inferred from the `static_memory_maximum_size` and guard
size settings. This should hopefully remove some duplication in these
settings, canonicalizing on the guard-size/static-size settings as the
way to control memory sizes and virtual reservations.

* Update config docs

* Fix a typo

* Fix benchmark

* Fix wasmtime-runtime tests

* Fix some more tests

* Try to fix uffd failing test

* Review items

* Tweak 32-bit defaults

Makes the pooling allocator a bit more reasonable by default on 32-bit
with these settings.
2021-06-18 09:57:08 -05:00
Nick Fitzgerald
824ce7bf89 deps: Update Arbitrary to 1.0; libfuzzer-sys to 0.4.0; wasm-smith to 0.4.0 2021-02-25 15:34:02 -08:00
Alex Crichton
0e41861662 Implement limiting WebAssembly execution with fuel (#2611)
* Consume fuel during function execution

This commit adds the codegen infrastructure necessary to instrument
wasm code to consume fuel as it executes. Currently nothing is really
done with the fuel, but that'll come in later commits.

The focus of this commit is to implement that instrumentation and to
account for consumed fuel correctly.

* Periodically check remaining fuel in wasm JIT code

This commit enables wasm code to periodically check whether fuel has
run out. When fuel runs out an intrinsic is called which can do whatever
is needed in response. For now a trap is thrown to have at least some
semantics in synchronous stores, but another planned use for this
feature is for asynchronous stores to periodically yield back to the
host when fuel runs out.

Checks for remaining fuel happen in the same locations as interrupt
checks, which is to say the start of the function as well as loop
headers.
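
A rough sketch of what opting in to fuel looks like from the embedding
API (shown against the `Store`-based API; exact method names have
changed a bit across wasmtime versions):

```rust
use wasmtime::{Config, Engine, Store};

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // Instrument generated code to decrement a fuel counter as it runs.
    config.consume_fuel(true);
    let engine = Engine::new(&config)?;

    let mut store = Store::new(&engine, ());
    // Give the store a fuel budget; once it's exhausted, executing wasm
    // traps (or, for async stores, can be made to yield back to the host).
    store.add_fuel(10_000)?;

    // ... instantiate and call wasm here; an infinite loop would trap
    // once the 10_000 units are consumed ...
    println!("fuel consumed so far: {:?}", store.fuel_consumed());
    Ok(())
}
```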

* Improve codegen by caching `*const VMInterrupts`

The location of the shared interrupt value and fuel value is through a
double-indirection on the vmctx (load through the vmctx and then load
through that pointer). The second pointer in this chain, however, never
changes, so we can alter codegen to account for this and remove some
extraneous load instructions and hopefully reduce some register
pressure even maybe.

* Add tests that fuel can abort infinite loops

* More fuzzing with fuel

Use fuel to time out modules in addition to time, using fuzz input to
figure out which.

* Update docs on trapping instructions

* Fix doc links

* Fix a fuzz test

* Change setting fuel to adding fuel

* Fix a doc link

* Squelch some rustdoc warnings
2021-01-29 08:57:17 -06:00
Alex Crichton
dccaa64962 Add knobs to limit memories/tables in a Store
Fuzzing has turned up that module linking can create large numbers of
tables and memories in addition to instances. For example if N instances
are allowed and M tables are allowed per instance, then currently
wasmtime allows MxN tables (which is quite a lot). This is causing some
wasm-smith-generated modules to exceed resource limits while fuzzing!

This commit adds corresponding `max_tables` and `max_memories`
functions to sit alongside the `max_instances` configuration.
Additionally fuzzing now by default configures all of these to a
somewhat low value to avoid too much resource usage while fuzzing.
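
A hypothetical sketch of what configuring these knobs might look like,
assuming the setters live on `wasmtime::Config` alongside
`max_instances` as the message suggests (exact locations and signatures
are assumptions):

```rust
use wasmtime::{Config, Engine};

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // Cap how many instances, tables, and memories a single Store may
    // create; fuzzing configures these to fairly low values.
    config.max_instances(100);
    config.max_tables(100);
    config.max_memories(100);
    let _engine = Engine::new(&config)?;
    Ok(())
}
```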
2021-01-28 08:47:00 -08:00
Alex Crichton
b73b831892 Replace binaryen -ttf based fuzzing with wasm-smith (#2336)
This commit removes the binaryen support for fuzzing from wasmtime,
instead switching over to `wasm-smith`. In general it's great to have
what fuzzing we can, but our binaryen support suffers from a few issues:

* The Rust crate, binaryen-sys, seems largely unmaintained at this
  point. While we could likely take ownership and/or send PRs to update
  the crate, the maintenance burden would largely fall on us.

* Currently the binaryen-sys crate doesn't support fuzzing anything
  beyond MVP wasm, but we're interested at least in features like bulk
  memory and reference types. Additionally we'll also be interested in
  features like module-linking. New features would require either
  implementation work in binaryen or the binaryen-sys crate to support.

* We have 4-5 fuzz-bugs right now related to timeouts simply in
  generating a module for wasmtime to fuzz. One investigation along
  these lines in the past revealed a bug in binaryen itself, and in any
  case these bugs would otherwise need to be investigated, reported,
  and possibly fixed by us in upstream binaryen.

Overall I'm not sure at this point if maintaining binaryen fuzzing is
worth it with the advent of `wasm-smith`, which has similar goals for
wasm module generation but is much more readily maintainable on our
end.

Additionally in this commit I've added a fuzzer based on wasm-smith's
`SwarmConfig` generator, which should expand the coverage of tested
modules.

Closes #2163
2020-10-29 10:02:59 -05:00
Nick Fitzgerald
98e899f6b3 fuzz: Add a fuzz target for table.{get,set} operations
This new fuzz target exercises sequences of `table.get`s, `table.set`s, and
GCs.

It already found a couple of bugs:

* Some leaks due to ref count cycles between stores and host-defined functions
  closing over those stores.

* If there are no live references for a PC, Cranelift can avoid emitting
  an associated stack map. This was running afoul of a debug assertion.
2020-06-30 12:00:57 -07:00
Alex Crichton
5fa4d36b0d Disable Cranelift debug verifier when fuzzing (#1851)
* Add CLI flags for internal cranelift options

This commit adds two flags to the `wasmtime` CLI:

* `--enable-cranelift-debug-verifier`
* `--enable-cranelift-nan-canonicalization`

These previously weren't exposed from the command line but have been
useful to me at least for reproducing slowdowns found during fuzzing on
the CLI.

* Disable Cranelift debug verifier when fuzzing

This commit disables Cranelift's debug verifier for our fuzz targets.
We've gotten a good number of timeouts on OSS-Fuzz, and I've recently
had some discussion over at google/oss-fuzz#3944 about this issue and
what we can do. The result of that discussion was that there are two
primary ways we can speed up our fuzzers:

* One is independent of Wasmtime, which is to tweak the flags used to
  compile code. The conclusion was that one flag was passed to LLVM
  which significantly increased runtime for very little benefit. This
  has now been disabled in rust-fuzz/cargo-fuzz#229.

* The other way is to reduce the number of debug checks we run while
  fuzzing wasmtime itself. To put this in perspective, a test case which
  took ~100ms to instantiate was taking 50 *seconds* to instantiate in
  the fuzz target. This 500x slowdown was caused by a ton of
  multiplicative factors, but two major contributors were NaN
  canonicalization and cranelift's debug verifier. I suspect the NaN
  canonicalization itself isn't too pricy but when paired with the debug
  verifier in float-heavy code it can create lots of IR to verify.

This commit is specifically tackling this second point in an attempt to
avoid slowing down our fuzzers too much. The intent here is that we'll
disable the cranelift debug verifier for now but leave all other checks
enabled. If the debug verifier gets a speed boost we can try re-enabling
it, but otherwise it seems like for now it's not catching any bugs while
creating lots of noise about timeouts that aren't relevant.
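
Concretely, the change amounts to something like this sketch of a
fuzzer-side config helper (the helper name is made up; the `Config`
method is the real knob being turned off):

```rust
use wasmtime::Config;

// A sketch of a fuzzer-side helper: keep other internal checks on, but
// leave Cranelift's (expensive) debug verifier off while fuzzing.
fn fuzz_config() -> Config {
    let mut config = Config::new();
    config.cranelift_debug_verifier(false);
    config
}
```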

It's not great that we have to turn off internal checks since that's
what fuzzing is supposed to trigger, but given the timeout on OSS-Fuzz
and the multiplicative effects of all the slowdowns we have when
fuzzing, I'm not sure we can afford the massive slowdown of the debug verifier.
2020-06-10 12:50:21 -05:00
Alex Crichton
363cd2d20f Expose memory-related options in Config (#1513)
* Expose memory-related options in `Config`

This commit was initially motivated by looking more into #1501, but it
ended up ballooning a bit after finding a few issues. The high-level
items in this commit are:

* New configuration options via `wasmtime::Config` are exposed to
  configure the tunable limits of how memories are allocated and such.
* The `MemoryCreator` trait has been updated to accurately reflect the
  required allocation characteristics that JIT code expects.
* A bug has been fixed in the cranelift wasm code generation where, if
  no guard page was present, bounds checks weren't performed accurately.

The new `Config` methods allow tuning the memory allocation
characteristics of wasmtime. Currently 64-bit platforms will reserve 6GB
chunks of memory for each linear memory, but by tweaking various config
options you can change how this is allocated, perhaps at the cost of
slower JIT code since it needs more bounds checks. The methods are
intended to be pretty thoroughly documented as to the effect they have
on the JIT code and what values you may wish to select. These new
methods have been added to the spectest fuzzer to ensure that various
configuration values for these methods don't affect correctness.
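
A rough sketch of the kind of tuning these new methods enable (method
names assumed from the current `wasmtime::Config`; the values are
illustrative, not recommendations):

```rust
use wasmtime::{Config, Engine};

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // Shrink the up-front virtual reservation per "static" linear memory;
    // memories that don't fit are allocated dynamically instead.
    config.static_memory_maximum_size(1 << 30); // 1 GiB
    // A smaller guard region reserves less address space but forces the
    // JIT to emit more explicit bounds checks.
    config.dynamic_memory_guard_size(64 << 10); // 64 KiB
    let _engine = Engine::new(&config)?;
    Ok(())
}
```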

The `MemoryCreator` trait previously only allocated memories with a
`MemoryType`, but this didn't actually reflect the guarantees that JIT
code expected. JIT code is generated with an assumption about the
minimum size of the guard region, as well as whether memory is static or
dynamic (whether the base pointer can be relocated). These properties
must be upheld by custom allocation engines for JIT code to perform
correctly, so extra parameters have been added to
`MemoryCreator::new_memory` to reflect this.

Finally, fuzzing with `Config` turned up an issue where, if no guard
pages were present, the wasm code wouldn't correctly bounds-check memory
accesses. The issue here was that with a guard page we only need to
bounds-check the first byte of access, but without a guard page we need
to bounds-check the last byte of access. This meant that the code
generation needed to account for the size of the memory operation
(load/store) and use this as the offset-to-check in the no-guard-page
scenario. I've attempted to make the various comments in cranelift a bit
more exhaustive too to hopefully make it a bit clearer for future
readers!

Closes #1501

* Review comments

* Update a comment
2020-04-29 17:10:00 -07:00
Alex Crichton
c9a0ba81a0 Implement interrupting wasm code, reimplement stack overflow (#1490)
* Implement interrupting wasm code, reimplement stack overflow

This commit is a relatively large change for wasmtime with two main
goals:

* Primarily this enables interrupting executing wasm code with a trap,
  preventing infinite loops in wasm code. Note that resumption of the
  wasm code is not a goal of this commit.

* Additionally this commit reimplements how we handle stack overflow to
  ensure that host functions always have a reasonable amount of stack to
  run on. This fixes an issue where we might longjmp out of a host
  function, skipping destructors.

Various odds and ends fall out of this commit as a consequence of
implementing the two goals above. The strategy for implementing this was
also lifted from SpiderMonkey and existing functionality inside of
Cranelift. I've tried to write up thorough documentation of how this all
works in `crates/environ/src/cranelift.rs`, where the gnarly-ish bits
are.

A brief summary of how this works: each function and each loop header
now checks whether execution has been interrupted. Interrupts and the
stack overflow check are actually folded into one now: function headers
check whether they've run out of stack, and the sentinel value used to
indicate an interrupt, checked in loop headers, tricks functions into
thinking they're out of stack. Requesting an interrupt is basically just
writing a value to a location which is read by JIT code.

When interrupts are delivered, and what triggers them, has been left up
to embedders of the `wasmtime` crate. The `wasmtime::Store` type has a
method to acquire an `InterruptHandle`, where `InterruptHandle` is a
`Send` and `Sync` type which can travel to other threads (or perhaps
even a signal handler) and request an interrupt from there. It's
intended that this provides a good degree of flexibility when
interrupting wasm code. Note though that this has one large caveat:
interrupts aren't observed while host code is running, so if you've got
a host import blocking for a long time, an interrupt won't actually be
received until the wasm starts running again.
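
A rough sketch of the embedder-facing side (combining the interrupt API
of this era with the later `Store<T>` constructor; this interrupt
mechanism was eventually superseded by epoch-based interruption):

```rust
use std::time::Duration;
use wasmtime::{Config, Engine, Store};

fn main() -> anyhow::Result<()> {
    let mut config = Config::new();
    // Interruption must be enabled so function/loop headers emit checks.
    config.interruptable(true);
    let engine = Engine::new(&config)?;
    let store = Store::new(&engine, ());

    // The handle is Send + Sync, so it can travel to another thread (or
    // even a signal handler) and request an interrupt from there.
    let handle = store.interrupt_handle()?;
    std::thread::spawn(move || {
        std::thread::sleep(Duration::from_secs(1));
        handle.interrupt(); // running wasm traps at its next check
    });

    // ... run wasm on this thread; a long-running loop now traps instead
    // of spinning forever ...
    Ok(())
}
```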

Some fallout included from this change is:

* Unix signal handlers are no longer registered with `SA_ONSTACK`.
  Instead they run on the native stack the thread was already using.
  This is possible since stack overflow isn't handled by hitting the
  guard page, but rather it's explicitly checked for in wasm now. Native
  stack overflow will continue to abort the process as usual.

* Unix sigaltstack management is now no longer necessary since we don't
  use it any more.

* Windows no longer has any need to reset guard pages since we no longer
  try to recover from faults on guard pages.

* On all targets probestack intrinsics are disabled since we use a
  different mechanism for catching stack overflow.

* The C API has been updated with interrupts handles. An example has
  also been added which shows off how to interrupt a module.

Closes #139
Closes #860
Closes #900

* Update comment about magical interrupt value

* Store stack limit as a global value, not a closure

* Run rustfmt

* Handle review comments

* Add a comment about SA_ONSTACK

* Use `usize` for type of `INTERRUPTED`

* Parse human-readable durations

* Bring back sigaltstack handling

Allows libstd to print out stack overflow on failure still.

* Add parsing and emission of stack limit-via-preamble

* Fix new example for new apis

* Fix host segfault test in release mode

* Fix new doc example
2020-04-21 11:03:28 -07:00
Alex Crichton
6dde222992 Add a spec test fuzzer for Config (#1509)
* Add a spec test fuzzer for Config

This commit adds a new fuzzer which is intended to run on oss-fuzz. The
fuzzer creates an arbitrary `Config` which *should* pass the spec tests
and then asserts that it does so. The goal here is to weed out any
accidental bugs in global configuration which could cause
non-spec-compliant behavior.

* Move implementation to `fuzzing` crate
2020-04-15 08:29:12 -05:00
teapotd
2180e9ce16 fuzzing: Enable NaN canonicalization (#1334)
* Method to enable NaN canonicalization in Config (see the sketch below)

* Use fuzz_default_config in DifferentialConfig

* Enable NaN canonicalization for fuzzing
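
For reference, a minimal sketch of the knob in question (method name
assumed from the `wasmtime::Config` API):

```rust
use wasmtime::Config;

// Force generated code to canonicalize NaN bit patterns so differential
// fuzzing doesn't report spurious mismatches on NaN payloads.
fn nan_canonicalizing_config() -> Config {
    let mut config = Config::new();
    config.cranelift_nan_canonicalization(true);
    config
}
```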
2020-03-31 09:22:08 -05:00
Alex Crichton
b0cf8c021f Turn off binaryen in fuzzing by default
... but turn it back on in CI by default. The `binaryen-sys` crate
builds binaryen from source, which is a drag on CI for a few reasons:

* Binaryen is quite large and takes a good deal of time to build
* The debug build directory for binaryen is 4GB

In an effort to both save time and disk space on the builders this
commit adds a `binaryen` feature to the `wasmtime-fuzz` crate. This
feature is enabled specifically when running the fuzzers on CI, but it
is disabled during the typical `cargo test --all` command. This means
that the test builders should save an extra 4G of space and be a bit
speedier now that they don't build a giant wad of C++.

We'll need to update the OSS-fuzz integration to enable the `binaryen`
feature when executing `cargo fuzz build`, and I'll do that once this
gets closer to landing.
2020-03-17 09:51:59 -07:00
Nick Fitzgerald
3accccd5f7 fuzzing: Enable Cranelift's IR verifier for differential fuzzing 2020-03-16 16:21:45 -07:00
Nick Fitzgerald
4866fa0e6a Limit rayon to one thread during fuzzing
This should enable more deterministic execution.
2020-02-28 18:35:09 -08:00
Nick Fitzgerald
f2fef600c6 Implement Arbitrary::size_hint for WasmOptTtf 2020-02-28 15:48:24 -08:00
Nick Fitzgerald
1bf8de35f3 Add initial differential fuzzing
Part of #611
2020-01-17 16:17:04 -08:00
Nick Fitzgerald
adcc047f4a Update fuzz crates (#826)
* deps: Update to arbitrary 0.3.x and libfuzzer-sys 0.2.0

* ci: Use cargo-fuzz 0.7.x in CI
2020-01-15 23:05:37 -06:00
Nick Fitzgerald
0cde30197d fuzzing: Add initial API call fuzzer
We only generate *valid* sequences of API calls. To do this, we keep track of
what objects we've already created in earlier API calls via the `Scope` struct.

To generate even-more-pathological sequences of API calls, we use [swarm
testing]:

> In swarm testing, the usual practice of potentially including all features
> in every test case is abandoned. Rather, a large “swarm” of randomly
> generated configurations, each of which omits some features, is used, with
> configurations receiving equal resources.

[swarm testing]: https://www.cs.utah.edu/~regehr/papers/swarm12.pdf
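
As a hypothetical sketch of the "only valid calls" idea (not the
fuzzer's actual `Scope` type), using the `arbitrary` crate to only
propose calls whose arguments refer to objects created earlier:

```rust
use arbitrary::{Result, Unstructured};

/// Hypothetical stand-in for the fuzzer's `Scope`: counts of objects
/// created so far, so later calls can only reference existing ones.
#[derive(Default)]
struct Scope {
    modules: usize,
}

enum ApiCall {
    ModuleNew,
    InstanceNew { module_id: usize },
}

fn arbitrary_call(u: &mut Unstructured<'_>, scope: &mut Scope) -> Result<ApiCall> {
    // Only propose instantiation once at least one module exists.
    if scope.modules > 0 && u.arbitrary::<bool>()? {
        let module_id = u.int_in_range(0..=scope.modules - 1)?;
        Ok(ApiCall::InstanceNew { module_id })
    } else {
        scope.modules += 1;
        Ok(ApiCall::ModuleNew)
    }
}
```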

There are more public APIs and instance-introspection APIs than this
fuzzer exercises right now. We will need a better generator of valid
Wasm than `wasm-opt -ttf` to really get the most out of those
currently-unexercised APIs, since the Wasm modules generated by
`wasm-opt -ttf` don't import and export a huge variety of things.
2019-12-10 15:14:12 -08:00
Nick Fitzgerald
bab59a2cd2 Fuzzing: Add test case logging and regression test template
When the test case that causes the failure can successfully be disassembled to
WAT, we get logs like this:

```
[2019-11-26T18:48:46Z INFO  wasmtime_fuzzing] Wrote WAT disassembly to: /home/fitzgen/wasmtime/crates/fuzzing/target/scratch/8437-0.wat
[2019-11-26T18:48:46Z INFO  wasmtime_fuzzing] If this fuzz test fails, copy `/home/fitzgen/wasmtime/crates/fuzzing/target/scratch/8437-0.wat` to `wasmtime/crates/fuzzing/tests/regressions/my-regression.wat` and add the following test to `wasmtime/crates/fuzzing/tests/regressions.rs`:

    ```
    #[test]
    fn my_fuzzing_regression_test() {
        let data = wat::parse_str(
            include_str!("./regressions/my-regression.wat")
        ).unwrap();
        oracles::instantiate(data, CompilationStrategy::Auto)
    }
    ```
```

If the test case cannot be disassembled to WAT, then we get logs like this:

```
[2019-11-26T18:48:46Z INFO  wasmtime_fuzzing] Wrote Wasm test case to: /home/fitzgen/wasmtime/crates/fuzzing/target/scratch/8437-0.wasm
[2019-11-26T18:48:46Z INFO  wasmtime_fuzzing] Failed to disassemble Wasm into WAT:
    Bad magic number (at offset 0)

    Stack backtrace:
        Run with RUST_LIB_BACKTRACE=1 env variable to display a backtrace

[2019-11-26T18:48:46Z INFO  wasmtime_fuzzing] If this fuzz test fails, copy `/home/fitzgen/wasmtime/crates/fuzzing/target/scratch/8437-0.wasm` to `wasmtime/crates/fuzzing/tests/regressions/my-regression.wasm` and add the following test to `wasmtime/crates/fuzzing/tests/regressions.rs`:

    ```
    #[test]
    fn my_fuzzing_regression_test() {
        let data = include_bytes!("./regressions/my-regression.wasm");
        oracles::instantiate(data, CompilationStrategy::Auto)
    }
    ```
```
2019-11-26 10:54:21 -08:00
Nick Fitzgerald
58ba066758 Split our existing fuzz targets into separate generators and oracles
Part of #611
2019-11-21 15:52:02 -08:00