Files

Nick Fitzgerald c2a7ea7e24 Cranelift: de-duplicate bounds checks in legalizations (#5190)

* Cranelift: Add the `DataFlowGraph::display_value_inst` convenience method

* Cranelift: Add some `trace!` logs to some parts of legalization

* Cranelift: de-duplicate bounds checks in legalizations

When both (1) "dynamic" memories that need explicit bounds checks and (2)
spectre mitigations that perform bounds checks are enabled, reuse the same
bounds checks between the two legalizations.

This reduces the overhead of explicit bounds checks and spectre mitigations over
using virtual memory guard pages with spectre mitigations from ~1.9-2.1x
overhead to ~1.6-1.8x overhead. That is about a 14-19% speed up for when dynamic
memories and spectre mitigations are enabled.

<details>

```
execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 3422455129.47 ± 120159.49 (confidence = 99%)

  virtual-memory-guards.so is 2.09x to 2.09x faster than bounds-checks.so!

  [6563931659 6564063496.07 6564301535] bounds-checks.so
  [3141492675 3141608366.60 3141895249] virtual-memory-guards.so

execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm

  Δ = 338716136.87 ± 1.38 (confidence = 99%)

  virtual-memory-guards.so is 2.08x to 2.08x faster than bounds-checks.so!

  [651961494 651961495.47 651961497] bounds-checks.so
  [313245357 313245358.60 313245362] virtual-memory-guards.so

execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 22742944.07 ± 331.73 (confidence = 99%)

  virtual-memory-guards.so is 1.87x to 1.87x faster than bounds-checks.so!

  [48841295 48841567.33 48842139] bounds-checks.so
  [26098439 26098623.27 26099479] virtual-memory-guards.so
```

</details>

<details>

```
execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 2465900207.27 ± 146476.61 (confidence = 99%)

  virtual-memory-guards.so is 1.78x to 1.78x faster than de-duped-bounds-checks.so!

  [5607275431 5607442989.13 5607838342] de-duped-bounds-checks.so
  [3141445345 3141542781.87 3141711213] virtual-memory-guards.so

execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm

  Δ = 234253620.20 ± 2.33 (confidence = 99%)

  virtual-memory-guards.so is 1.75x to 1.75x faster than de-duped-bounds-checks.so!

  [547498977 547498980.93 547498985] de-duped-bounds-checks.so
  [313245357 313245360.73 313245363] virtual-memory-guards.so

execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 16605659.13 ± 315.78 (confidence = 99%)

  virtual-memory-guards.so is 1.64x to 1.64x faster than de-duped-bounds-checks.so!

  [42703971 42704284.40 42704787] de-duped-bounds-checks.so
  [26098432 26098625.27 26099234] virtual-memory-guards.so
```

</details>

<details>

```
execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm

  Δ = 104462517.13 ± 7.32 (confidence = 99%)

  de-duped-bounds-checks.so is 1.19x to 1.19x faster than bounds-checks.so!

  [651961493 651961500.80 651961532] bounds-checks.so
  [547498981 547498983.67 547498989] de-duped-bounds-checks.so

execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 956556982.80 ± 103034.59 (confidence = 99%)

  de-duped-bounds-checks.so is 1.17x to 1.17x faster than bounds-checks.so!

  [6563930590 6564019842.40 6564243651] bounds-checks.so
  [5607307146 5607462859.60 5607677763] de-duped-bounds-checks.so

execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 6137307.87 ± 247.75 (confidence = 99%)

  de-duped-bounds-checks.so is 1.14x to 1.14x faster than bounds-checks.so!

  [48841303 48841472.93 48842000] bounds-checks.so
  [42703965 42704165.07 42704718] de-duped-bounds-checks.so
```

</details>

* Update test expectations

* Add a test for deduplicating bounds checks between dynamic memories and spectre mitigations

* Define a struct for the Spectre comparison instead of using a tuple

* More trace logging for heap legalization
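The overhead and speed-up figures quoted in the commit message can be re-derived from the mean instruction counts in the benchmark output above. The following is a small sketch of that arithmetic only; the `ratio` and `summarize` names are illustrative, and the numbers are the middle (mean) values copied from the three benchmark runs.

```rust
/// Ratio of two mean instruction counts; 2.09 means "2.09x as many instructions".
fn ratio(slower: f64, faster: f64) -> f64 {
    slower / faster
}

/// Returns (benchmark, overhead before, overhead after, speed-up from de-duplication).
/// Every number below is a mean copied from the benchmark output in the commit message.
fn summarize() -> Vec<(&'static str, f64, f64, f64)> {
    vec![
        (
            "spidermonkey",
            ratio(6564063496.07, 3141608366.60), // bounds-checks vs. virtual-memory-guards
            ratio(5607442989.13, 3141542781.87), // de-duped-bounds-checks vs. virtual-memory-guards
            ratio(6564019842.40, 5607462859.60), // bounds-checks vs. de-duped-bounds-checks
        ),
        (
            "bz2",
            ratio(651961495.47, 313245358.60),
            ratio(547498980.93, 313245360.73),
            ratio(651961500.80, 547498983.67),
        ),
        (
            "pulldown-cmark",
            ratio(48841567.33, 26098623.27),
            ratio(42704284.40, 26098625.27),
            ratio(48841472.93, 42704165.07),
        ),
    ]
}

fn main() {
    for (name, before, after, speedup) in summarize() {
        println!("{name}: {before:.2}x -> {after:.2}x overhead, {speedup:.2}x speed-up");
    }
}
```

The three columns correspond to the three collapsed benchmark sections: overhead of explicit bounds checks before the change, overhead after it, and the resulting speed-up.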

2022-11-15 08:47:22 -08:00

bforest

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

codegen

Cranelift: de-duplicate bounds checks in legalizations (#5190)

2022-11-15 08:47:22 -08:00

docs

cranelift: Remove booleans (#5031)

2022-10-17 16:00:27 -07:00

egraph

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

entity

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

filetests

Cranelift: de-duplicate bounds checks in legalizations (#5190)

2022-11-15 08:47:22 -08:00

frontend

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

fuzzgen

fuzzgen: Add a few more ops (#5201)

2022-11-07 09:08:26 -08:00

interpreter

cranelift: Fix iadd_carry/iadd_cout in the interpreter (#5176)

2022-11-14 10:18:28 -08:00

isle

cranelift-isle: New IR and revised overlap checks (#5195)

2022-11-14 02:29:22 +00:00

jit

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

media

Check in the Crane and Ferris drawing so that people can remix it :-).

2018-09-13 15:30:39 -07:00

module

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

native

Wasmtime+Cranelift: strip out some dead x86-32 code. (#5226)

2022-11-08 23:03:17 +00:00

object

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

preopt

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

reader

Cranelift: Make heap_addr return calculated base + index + offset (#5231)

2022-11-09 19:53:51 +00:00

serde

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

src

Cranelift: Make heap_addr return calculated base + index + offset (#5231)

2022-11-09 19:53:51 +00:00

tests

cranelift: Remove iconst.i128 (#5075)

2022-10-24 12:43:28 -07:00

umbrella

Bump Wasmtime to 4.0.0 (#5209)

2022-11-06 13:32:34 -06:00

wasm

Update wasm-tools crates (#5248)

2022-11-10 21:23:20 +00:00

Cargo.toml

Leverage Cargo's workspace inheritance feature (#4905)

2022-09-26 11:30:01 -05:00

README.md

Cranelift: remove Baldrdash support and related features. (#4571)

2022-08-02 19:37:56 +00:00

rustc.md

Update outdated references to the Cranelift repository

2020-03-09 14:06:24 +01:00

README.md

Cranelift Code Generator

A Bytecode Alliance project

Cranelift is a low-level retargetable code generator. It translates a target-independent intermediate representation into executable machine code.

For more information, see the documentation.

For an example of how to use the JIT, see the JIT Demo, which implements a toy language.

For an example of how to use Cranelift to run WebAssembly code, see Wasmtime, which implements a standalone, embeddable VM using Cranelift.

Status

Cranelift currently supports enough functionality to run a wide variety of programs, including all the functionality needed to execute WebAssembly (MVP and various extensions like SIMD), although it needs to be used within an external WebAssembly embedding such as Wasmtime to be part of a complete WebAssembly implementation. It is also usable as a backend for non-WebAssembly use cases: for example, there is an effort to build a Rust compiler backend using Cranelift.

Cranelift is production-ready, and is used in production in several places, all within the context of Wasmtime. It is carefully fuzzed as part of Wasmtime with differential comparison against V8 and the executable Wasm spec, and the register allocator is separately fuzzed with symbolic verification. There is an active effort to formally verify Cranelift's instruction-selection backends. We take security seriously and have a security policy as part of the Bytecode Alliance.

Cranelift has three backends: x86-64, aarch64 (aka ARM64), and s390x (aka IBM Z). All three backends fully support enough functionality for Wasm MVP, and x86-64 and aarch64 fully support SIMD as well. On x86-64, Cranelift supports both the System V AMD64 ABI calling convention used on many platforms and the Windows x64 calling convention. On aarch64, Cranelift supports the standard Linux calling convention and also has specific support for macOS (i.e., M1 / Apple Silicon).

Cranelift's code quality is within range of the optimizing tiers of browser JIT engines. A recent paper includes third-party benchmarks of Cranelift, driven by Wasmtime, against V8 and an LLVM-based Wasm engine, WAVM (Fig 22). The code generated by Cranelift runs ~2% slower than that of V8 (TurboFan) and ~14% slower than that of WAVM (LLVM). Its compilation speed, in the same paper, is measured as approximately an order of magnitude faster than WAVM (LLVM). We continue to work to improve both measures.

The core codegen crates have minimal dependencies and are carefully written to handle malicious or arbitrary compiler input: in particular, they do not use callstack recursion.
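To illustrate what avoiding callstack recursion can look like, here is a minimal sketch of a depth-first traversal driven by an explicit, heap-allocated stack. The `Node` type and the `max_depth` and `deep_chain` functions are hypothetical and are not taken from Cranelift's sources; the sketch also assumes the arena describes a tree (no cycles).

```rust
/// Hypothetical arena-allocated expression node (not a Cranelift type).
/// Children are referenced by their index in the arena.
enum Node {
    Leaf,
    Unary(usize),
    Binary(usize, usize),
}

/// Computes the depth of the tree rooted at `root` without recursion.
/// Pending work lives in a `Vec` on the heap, so deeply nested input
/// grows that vector instead of the native call stack.
fn max_depth(arena: &[Node], root: usize) -> usize {
    let mut deepest = 0;
    let mut stack = vec![(root, 1usize)];
    while let Some((idx, depth)) = stack.pop() {
        deepest = deepest.max(depth);
        match &arena[idx] {
            Node::Leaf => {}
            Node::Unary(child) => stack.push((*child, depth + 1)),
            Node::Binary(lhs, rhs) => {
                stack.push((*lhs, depth + 1));
                stack.push((*rhs, depth + 1));
            }
        }
    }
    deepest
}

/// Builds a chain of `len` nodes (`len >= 1`): one leaf wrapped in `len - 1` unary nodes.
/// The root of the chain is the last element of the returned arena.
fn deep_chain(len: usize) -> Vec<Node> {
    let mut arena = vec![Node::Leaf];
    for i in 1..len {
        arena.push(Node::Unary(i - 1));
    }
    arena
}

fn main() {
    // A nesting depth that would be risky for a naive recursive walk.
    let arena = deep_chain(1_000_000);
    println!("depth = {}", max_depth(&arena, arena.len() - 1));
}
```

With this shape, adversarially deep input costs memory proportional to its size rather than risking a stack overflow.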

Cranelift performs some basic mitigations for Spectre attacks on heap bounds checks, table bounds checks, and indirect branch bounds checks; see #1032 for more.
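As a rough illustration of a Spectre-style guard on a heap bounds check, the sketch below computes a single out-of-bounds comparison and uses it both for the trap decision and to select a harmless address. This is a data-flow sketch under stated assumptions, not Cranelift's implementation: `GuardedAddr` and `guarded_heap_addr` are hypothetical names, address 0 is assumed to be a safe fallback, and rustc is free to compile the `if` to a branch, whereas a code generator would emit a conditional-move style instruction.

```rust
/// Hypothetical result of checking one heap access.
#[derive(Debug, PartialEq)]
struct GuardedAddr {
    /// True when `index + access_size` exceeds the heap bound; the caller must trap.
    out_of_bounds: bool,
    /// `base + index` when in bounds, otherwise 0 (assumed here to be a safe address).
    addr: u64,
}

/// Checks `index + access_size <= bound` exactly once and reuses the result
/// for both the explicit bounds check and the guarded address selection.
fn guarded_heap_addr(base: u64, index: u64, access_size: u64, bound: u64) -> GuardedAddr {
    // Single comparison; an overflowing end offset counts as out of bounds.
    let out_of_bounds = match index.checked_add(access_size) {
        Some(end) => end > bound,
        None => true,
    };
    // The address depends on the comparison result itself, so an
    // out-of-bounds index never produces `base + index`.
    let addr = if out_of_bounds { 0 } else { base.wrapping_add(index) };
    GuardedAddr { out_of_bounds, addr }
}

fn main() {
    // In-bounds 4-byte access at offset 8 of a 16-byte heap.
    println!("{:?}", guarded_heap_addr(0x1000, 8, 4, 16));
    // Out-of-bounds access: the selected address is 0 and the caller traps.
    println!("{:?}", guarded_heap_addr(0x1000, 16, 4, 16));
}
```

Sharing one comparison between the trap and the select is the idea behind de-duplicating the two checks.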

Cranelift's APIs are not yet considered stable, though we do follow semantic-versioning (semver) with minor-version patch releases.

Cranelift generally requires the latest stable Rust to build as a policy, and is tested as such, but we can incorporate fixes for compilation with older Rust versions on a best-effort basis.

Contributing

If you're interested in contributing to Cranelift: thank you! We have a contributing guide which will help you get involved in the Cranelift project.

Planned uses

Cranelift is designed to be a code generator for WebAssembly, but it is general enough to be useful elsewhere too. The initial planned uses that affected its design were:

* Wasmtime non-Web wasm engine.
* Debug build backend for the Rust compiler.
* WebAssembly compiler for the SpiderMonkey engine in Firefox (currently not planned anymore; SpiderMonkey team may re-assess in the future).
* Backend for the IonMonkey JavaScript JIT compiler in Firefox (currently not planned anymore; SpiderMonkey team may re-assess in the future).

Building Cranelift

Cranelift uses a conventional Cargo build process.

Cranelift consists of a collection of crates and uses a Cargo workspace, so for some cargo commands, such as cargo test, the --all flag is needed to tell cargo to visit all of the crates.

test-all.sh at the top level is a script which runs all the cargo tests and also performs code format, lint, and documentation checks.

Log configuration

Cranelift uses the log crate to log messages at various levels. It doesn't specify any maximal logging level, so embedders can choose what it should be; however, this can have an impact on Cranelift's code size. You can use log features to reduce the maximum logging level. For instance, if you want to limit the level of logging to warn messages and above in release mode:

[dependencies.log]
...
features = ["release_max_level_warn"]

Editor Support

Editor support for working with Cranelift IR (clif) files:

Vim: https://github.com/bytecodealliance/cranelift.vim