Files

Alex Crichton e68aa99588 Implement the memory64 proposal in Wasmtime (#3153)

* Implement the memory64 proposal in Wasmtime

This commit implements the WebAssembly [memory64 proposal][proposal] in
both Wasmtime and Cranelift. In terms of work done Cranelift ended up
needing very little work here since most of it was already prepared for
64-bit memories at one point or another. Most of the work in Wasmtime is
largely refactoring, changing a bunch of `u32` values to something else.

A number of internal and public interfaces are changing as a result of
this commit, for example:

* Accessors on `wasmtime::Memory` that work with pages now all return
`u64` unconditionally rather than `u32`. This makes it possible to
accommodate 64-bit memories with this API, but we may also want to
consider `usize` here at some point since the host can't grow past
`usize`-limited pages anyway.

* The `wasmtime::Limits` structure is removed in favor of
minimum/maximum methods on table/memory types.

* Many libcall intrinsics called by JIT code now unconditionally take
`u64` arguments instead of `u32`. Return values are `usize`, however,
since the return value, if successful, is always bounded by host
memory while arguments can come from any guest.

* The `heap_addr` clif instruction now takes a 64-bit offset argument
instead of a 32-bit one. It turns out that the legalization of
`heap_addr` already worked with 64-bit offsets, so this change was
fairly trivial to make.

* The runtime implementation of mmap-based linear memories has changed
to largely work in `usize` quantities in its API and in bytes instead
of pages. This simplifies various aspects and reflects that
mmap-memories are always bound by `usize` since that's what the host
is using to address things, and additionally most calculations care
about bytes rather than pages except for the very edge where we're
going to/from wasm.

Overall I've tried to minimize the number of `as` casts as much as possible,
using checked `try_from` and checked arithmetic with either error
handling or explicit `unwrap()` calls to tell us about bugs in the
future. Most locations have relatively obvious things to do with various
implications on various hosts, and I think they should all be roughly of
the right shape but time will tell. I mostly relied on the compiler
complaining that various types weren't aligned to figure out
type-casting, and I manually audited some of the more obvious locations.
I suspect we have a number of hidden locations that will panic on 32-bit
hosts if 64-bit modules try to run there, but otherwise I think we
should be generally ok (famous last words). In any case I wouldn't want
to enable this by default naturally until we've fuzzed it for some time.
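
As a minimal sketch of the checked-conversion style described above (not code from this commit), the following turns a guest-supplied page count into a host byte length without any `as` casts. The names `WASM_PAGE_SIZE` and `pages_to_host_bytes` are hypothetical and chosen only for illustration.

```rust
use std::convert::TryFrom;

/// Size of one WebAssembly page in bytes (64 KiB).
/// Hypothetical constant name, used only in this sketch.
const WASM_PAGE_SIZE: u64 = 0x1_0000;

/// Convert a guest-supplied page count into a host byte length.
///
/// Returns `None` if the multiplication overflows `u64` or if the
/// result does not fit in the host's `usize`, instead of silently
/// truncating the way an `as` cast would.
fn pages_to_host_bytes(pages: u64) -> Option<usize> {
    let bytes = pages.checked_mul(WASM_PAGE_SIZE)?;
    usize::try_from(bytes).ok()
}
```

A caller can then choose between propagating the `None` as an error and calling `unwrap()` where failure would indicate a bug.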

In terms of the actual underlying implementation, no one should expect
memory64 to be all that fast. Right now it's implemented with
"dynamic" heaps which have a few consequences:

* All memory accesses are bounds-checked. I'm not sure how aggressively
Cranelift tries to optimize out bounds checks, but I suspect not a ton
since we haven't stressed this much historically.

* Heaps are always precisely sized. This means that every call to
`memory.grow` will incur a `memcpy` of memory from the old heap to the
new. We probably want to at least look into `mremap` on Linux and
otherwise try to implement schemes where dynamic heaps have some
reserved pages to grow into to help amortize the cost of
`memory.grow`.
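
The two consequences listed above can be modelled roughly as follows. This is an illustrative sketch only, not Wasmtime's runtime code: `DynamicHeap`, `load_u8`, and `grow` are hypothetical names, and growth is expressed in bytes for simplicity.

```rust
use std::convert::TryFrom;

/// Hypothetical model of a precisely sized "dynamic" heap.
struct DynamicHeap {
    bytes: Vec<u8>,
}

impl DynamicHeap {
    fn new(len: usize) -> Self {
        DynamicHeap { bytes: vec![0; len] }
    }

    /// Every access is explicitly bounds-checked: the effective address
    /// must not overflow, must fit in `usize`, and must lie inside the heap.
    /// `None` stands in for a trap.
    fn load_u8(&self, addr: u64, offset: u64) -> Option<u8> {
        let effective = addr.checked_add(offset)?;
        let index = usize::try_from(effective).ok()?;
        self.bytes.get(index).copied()
    }

    /// Grow by `delta` bytes by allocating a new, exactly sized buffer
    /// and copying the old contents into it. Returns the old length.
    fn grow(&mut self, delta: usize) -> Option<usize> {
        let old_len = self.bytes.len();
        let new_len = old_len.checked_add(delta)?;
        let mut new_bytes = vec![0u8; new_len];
        new_bytes[..old_len].copy_from_slice(&self.bytes);
        self.bytes = new_bytes;
        Some(old_len)
    }
}
```

The copy in `grow` is the cost that reserving extra pages up front, or using `mremap` on Linux, would aim to avoid.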

The memory64 spec test suite is scheduled to now run on CI, but as with
all the other spec test suites it's really not all that comprehensive.
I've tried adding more tests for basic things as I've had to implement
guards for them, but I wouldn't really consider the testing adequate
from just this PR itself. I did try to take care in one test to actually
allocate a 4gb+ heap and then avoid running that in the pooling
allocator or in emulation because otherwise that may fail or take
excessively long.

[proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

* Fix some tests

* More test fixes

* Fix wasmtime tests

* Fix doctests

* Revert to 32-bit immediate offsets in `heap_addr`

This commit updates the generation of addresses in wasm code to always
use 32-bit offsets for `heap_addr`, and if the calculated offset is
bigger than 32 bits we emit a manual add with an overflow check.
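
The decision described here might be sketched like this. It is a simplified model, not the actual translation code in `cranelift-wasm`: the `Lowering` enum and both functions are hypothetical names introduced for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical description of how an access with a static `offset`
/// would be lowered.
#[derive(Debug, PartialEq)]
enum Lowering {
    /// The offset fits in 32 bits and is passed as the immediate.
    Immediate { offset: u32 },
    /// The offset is too large: it is added to the index by hand,
    /// with an overflow check, before the address is computed.
    ManualAdd { offset: u64 },
}

/// Pick a lowering strategy based on whether the offset fits in `u32`.
fn plan_lowering(offset: u64) -> Lowering {
    match u32::try_from(offset) {
        Ok(offset) => Lowering::Immediate { offset },
        Err(_) => Lowering::ManualAdd { offset },
    }
}

/// Model of the manual add: returns the adjusted index, or `None`
/// where the generated code would trap on overflow.
fn manual_add(index: u64, offset: u64) -> Option<u64> {
    index.checked_add(offset)
}
```

For 32-bit memories the offset always fits, so only memory64 modules with very large static offsets would take the `ManualAdd` path in this model.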

* Disable memory64 for spectest fuzzing

* Fix wrong offset being added to heap addr

* More comments!

* Clarify bytes/pages

2021-08-12 09:40:20 -05:00

bforest

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

codegen

Implement the memory64 proposal in Wasmtime (#3153)

2021-08-12 09:40:20 -05:00

docs

Fix an incorrect link.

2021-03-20 03:41:03 +09:00

entity

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

filetests

Revert IR changes

2021-08-05 09:35:32 +01:00

frontend

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

fuzzgen

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

interpreter

Revert IR changes

2021-08-05 09:35:32 +01:00

jit

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

media

Check in the Crane and Ferris drawing so that people can remix it :-).

2018-09-13 15:30:39 -07:00

module

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

native

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

object

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

peepmatic

Bump the wasm-tools crates (#3139)

2021-08-04 09:53:47 -05:00

preopt

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

reader

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

serde

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

src

cranelift: Move most debug-level logs to the trace level

2021-07-26 11:50:16 -07:00

tests

machinst x64: enable clif testing

2020-09-25 11:12:21 +02:00

umbrella

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

wasm

Implement the memory64 proposal in Wasmtime (#3153)

2021-08-12 09:40:20 -05:00

Cargo.toml

Bump to Wasmtime v0.29.0 and Cranelift 0.76.0.

2021-08-02 11:24:09 -07:00

README.md

Update README.md

2021-01-25 15:29:51 -08:00

rustc.md

Update outdated references to the Cranelift repository

2020-03-09 14:06:24 +01:00

spidermonkey.md

Convert top-level *.rst files to markdown.

2018-07-17 15:01:08 -07:00

README.md

Cranelift Code Generator

A Bytecode Alliance project

Cranelift is a low-level retargetable code generator. It translates a target-independent intermediate representation into executable machine code.

For more information, see the documentation.

For an example of how to use the JIT, see the JIT Demo, which implements a toy language.

For an example of how to use Cranelift to run WebAssembly code, see Wasmtime, which implements a standalone, embeddable VM using Cranelift.

Status

Cranelift currently supports enough functionality to run a wide variety of programs, including all the functionality needed to execute WebAssembly MVP functions, although it needs to be used within an external WebAssembly embedding to be part of a complete WebAssembly implementation.

The x86-64 backend is currently the most complete and stable; other architectures are in various stages of development. Cranelift currently supports both the System V AMD64 ABI calling convention used on many platforms and the Windows x64 calling convention. The performance of code produced by Cranelift is not yet impressive, though we have plans to fix that.

The core codegen crates have minimal dependencies, support no_std mode (see below), do not require any host floating-point support, and do not use callstack recursion.

Cranelift does not yet perform mitigations for Spectre or related security issues, though it may do so in the future. It does not currently make any security-relevant instruction timing guarantees. It has seen a fair amount of testing and fuzzing, although more work is needed before it would be ready for a production use case.

Cranelift's APIs are not yet stable.

Cranelift currently requires Rust 1.37 or later to build.

Contributing

If you're interested in contributing to Cranelift: thank you! We have a contributing guide which will help you get involved in the Cranelift project.

Planned uses

Cranelift is designed to be a code generator for WebAssembly, but it is general enough to be useful elsewhere too. The initial planned uses that affected its design are:

Building Cranelift

Cranelift uses a conventional Cargo build process.

Cranelift consists of a collection of crates and uses a Cargo Workspace, so for some cargo commands, such as cargo test, the --all flag is needed to tell cargo to visit all of the crates.

test-all.sh at the top level is a script which runs all the cargo tests and also performs code format, lint, and documentation checks.

Building with no_std

The following crates support `no_std`, although they do depend on liballoc:

cranelift-entity
cranelift-bforest
cranelift-codegen
cranelift-frontend
cranelift-native
cranelift-wasm
cranelift-module
cranelift-preopt
cranelift

To use no_std mode, disable the std feature and enable the core feature. This currently requires nightly Rust.

For example, to build `cranelift-codegen`:

cd cranelift-codegen
cargo build --no-default-features --features core

Or, when using cranelift-codegen as a dependency (in Cargo.toml):

[dependencies.cranelift-codegen]
...
default-features = false
features = ["core"]

no_std support is currently "best effort". We won't try to break it, and we'll accept patches fixing problems; however, we don't expect all developers to build and test no_std when submitting patches. Accordingly, the ./test-all.sh script does not test no_std.

There is a separate ./test-no_std.sh script that tests the no_std support in packages which support it.

It's important to note that cranelift still needs liballoc to compile. Thus, whatever environment is used must implement an allocator.

Also, to allow the use of HashMaps with no_std, an external crate called hashmap_core is pulled in (via the core feature). This is mostly the same as std::collections::HashMap, except that it doesn't have DoS protection. Just something to think about.

Log configuration

Cranelift uses the log crate to log messages at various levels. It doesn't specify any maximal logging level, so embedders can choose what it should be; however, this can have an impact on Cranelift's code size. You can use log features to reduce the maximum logging level. For instance, if you want to limit the level of logging to warn messages and above in release mode:

[dependencies.log]
...
features = ["release_max_level_warn"]

Editor Support

Editor support for working with Cranelift IR (clif) files:

Vim: https://github.com/bytecodealliance/cranelift.vim