Files
wasmtime/cranelift
Jamey Sharp d35c508436 cranelift-frontend: Replace Vecs with ListPools (#5001)
* Elide redundant sentinel values

The `undef_variables` lists were a binding from Variable to Value, but
the Values were always equal to a suffix of the block's parameters. So
instead of storing another copy, we can just get them back from the
block parameters.

According to DHAT, this decreases total memory allocated and number of
bytes written, and increases number of bytes read and instructions
retired, but all by small fractions of a percent. According to
hyperfine, main is "1.00 ± 0.01 times faster".

* Use entity_impl for cranelift_frontend::Variable

Instead of hand-coding essentially the same thing.

* Keep undefined variables in a ListPool

According to DHAT, this improves every measure of performance
(instructions retired, total memory allocated, max heap size, bytes
read, and bytes written), although by fractions of a percent. According
to hyperfine the difference is nearly zero, but on Spidermonkey this
branch is "1.01 ± 0.00 times faster" than main.

* Elide redundant block IDs

In a list of predecessors, we previously kept both the jump instruction
that points to the current block, and the block where that instruction
resides. But we can look up the block from the instruction as long as we
have access to the current Layout, which we do everywhere that it was
necessary. So don't store the block, just store the instruction.

* Keep predecessor definitions in a ListPool

* Make append_jump_argument independent of self

This makes it easier to reason about borrow-checking issues.

* Reuse `results` instead of re-doing variable lookup

This eliminates three array lookups per predecessor by hanging on to the
results of earlier steps a little longer. This only works now because I
previously removed the need to borrow all of `self`, which otherwise
prevented keeping a borrow of self.results alive.

I had experimented with using `Vec::split_off` to copy the relevant
chunk of results to a temporary heap allocation, but the extra
allocation and copy was measurably slower. So it's important that this
is just a borrow.

* Cache single-predecessor block ID when sealing

Of the code in cranelift_frontend, `use_var` is the second-hottest path,
sitting close behind the `build` function that's used when inserting
every new instruction. This makes sense given that the operands of a new
instruction usually need to be looked up immediately before building the
instruction.

So making the single-predecessor loops in `find_var` and `use_var_local`
do fewer memory accesses and execute fewer instructions turns out to
have a measurable effect. It's still only a small fraction of a percent
overall since cranelift-frontend is only a few percent of total runtime.

This patch keeps a block ID in the SSABlockData, which is None unless
both the block is sealed and it has exactly one predecessor. Doing so
avoids two array lookups on each iteration of the two loops.

According to DHAT, compared with main, at this point this PR uses 0.3%
less memory at max heap, reads 0.6% fewer bytes, and writes 0.2% fewer
bytes.

According to Hyperfine, this PR is "1.01 ± 0.01 times faster" than main
when compiling Spidermonkey. On the other hand, Sightglass says main is
1.01x faster than this PR on the same benchmark by CPU cycles. In short,
actual effects are too small to measure reliably.
2022-10-03 14:29:12 -07:00
..
2022-09-27 09:52:58 -07:00

Cranelift Code Generator

A Bytecode Alliance project

Cranelift is a low-level retargetable code generator. It translates a target-independent intermediate representation into executable machine code.

Build Status Chat Minimum rustc 1.37 Documentation Status

For more information, see the documentation.

For an example of how to use the JIT, see the JIT Demo, which implements a toy language.

For an example of how to use Cranelift to run WebAssembly code, see Wasmtime, which implements a standalone, embeddable, VM using Cranelift.

Status

Cranelift currently supports enough functionality to run a wide variety of programs, including all the functionality needed to execute WebAssembly (MVP and various extensions like SIMD), although it needs to be used within an external WebAssembly embedding such as Wasmtime to be part of a complete WebAssembly implementation. It is also usable as a backend for non-WebAssembly use cases: for example, there is an effort to build a Rust compiler backend using Cranelift.

Cranelift is production-ready, and is used in production in several places, all within the context of Wasmtime. It is carefully fuzzed as part of Wasmtime with differential comparison against V8 and the executable Wasm spec, and the register allocator is separately fuzzed with symbolic verification. There is an active effort to formally verify Cranelift's instruction-selection backends. We take security seriously and have a security policy as a part of Bytecode Alliance.

Cranelift has three backends: x86-64, aarch64 (aka ARM64), and s390x (aka IBM Z). All three backends fully support enough functionality for Wasm MVP, and x86-64 and aarch64 fully support SIMD as well. On x86-64, Cranelift supports both the System V AMD64 ABI calling convention used on many platforms and the Windows x64 calling convention. On aarch64, Cranelift supports the standard Linux calling convention and also has specific support for macOS (i.e., M1 / Apple Silicon).

Cranelift's code quality is within range of competitiveness to browser JIT engines' optimizing tiers. A recent paper includes third-party benchmarks of Cranelift, driven by Wasmtime, against V8 and an LLVM-based Wasm engine, WAVM (Fig 22). The speed of Cranelift's generated code is ~2% slower than that of V8 (TurboFan), and ~14% slower than WAVM (LLVM). Its compilation speed, in the same paper, is measured as approximately an order of magnitude faster than WAVM (LLVM). We continue to work to improve both measures.

The core codegen crates have minimal dependencies and are carefully written to handle malicious or arbitrary compiler input: in particular, they do not use callstack recursion.

Cranelift performs some basic mitigations for Spectre attacks on heap bounds checks, table bounds checks, and indirect branch bounds checks; see #1032 for more.

Cranelift's APIs are not yet considered stable, though we do follow semantic-versioning (semver) with minor-version patch releases.

Cranelift generally requires the latest stable Rust to build as a policy, and is tested as such, but we can incorporate fixes for compilation with older Rust versions on a best-effort basis.

Contributing

If you're interested in contributing to Cranelift: thank you! We have a contributing guide which will help you getting involved in the Cranelift project.

Planned uses

Cranelift is designed to be a code generator for WebAssembly, but it is general enough to be useful elsewhere too. The initial planned uses that affected its design were:

  • Wasmtime non-Web wasm engine.
  • Debug build backend for the Rust compiler.
  • WebAssembly compiler for the SpiderMonkey engine in Firefox (currently not planned anymore; SpiderMonkey team may re-assess in the future).
  • Backend for the IonMonkey JavaScript JIT compiler in Firefox (currently not planned anymore; SpiderMonkey team may re-assess in the future).

Building Cranelift

Cranelift uses a conventional Cargo build process.

Cranelift consists of a collection of crates, and uses a Cargo Workspace, so for some cargo commands, such as cargo test, the --all is needed to tell cargo to visit all of the crates.

test-all.sh at the top level is a script which runs all the cargo tests and also performs code format, lint, and documentation checks.

Log configuration

Cranelift uses the log crate to log messages at various levels. It doesn't specify any maximal logging level, so embedders can choose what it should be; however, this can have an impact of Cranelift's code size. You can use log features to reduce the maximum logging level. For instance if you want to limit the level of logging to warn messages and above in release mode:

[dependency.log]
...
features = ["release_max_level_warn"]

Editor Support

Editor support for working with Cranelift IR (clif) files: