Alex Crichton 453330b2db x64: Add rudimentary support for some AVX instructions (#5795)
* x64: Add rudimentary support for some AVX instructions

I was poking around Spidermonkey's wasm backend and saw that the various
assembler functions used are all `v*`-prefixed which look like they're
intended for use with AVX instructions. I looked at Cranelift and it
currently doesn't have support for many AVX-based instructions, so I
figured I'd take a crack at it!

The support added here is a bit of a mishmash when viewed alone, but my
general goal was to take a single instruction from the SIMD proposal for
WebAssembly and migrate all of its component instructions to AVX. I, by
random chance, picked a pretty complicated instruction of `f32x4.min`.
This wasm instruction is implemented on x64 with 4 unique SSE
instructions and ended up being a pretty good candidate.

Further digging about AVX-vs-SSE shows that there should be two major
benefits to using AVX over SSE:

* Primarily, AVX instructions largely use a three-operand form in which
  two input registers are operated on and a separate output register is
  specified. This is in contrast to SSE's predominant
  one-register-is-input-but-also-output pattern. This should give the
  register allocator more freedom and additionally remove the need for
  some moves between registers.

* As #4767 notes, the memory-based operations of VEX-encoded instructions
  (aka AVX instructions) do not have strict alignment requirements which
  means we would be able to sink loads and stores into individual
  instructions instead of having separate instructions.
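
To make the first point concrete, here is a toy Rust model (not Cranelift
code; `Reg`, `Inst`, `emit_sse` and `emit_avx` are names invented for this
sketch) of computing `dst = !src1 & src2` in both forms. The SSE form
overwrites its first operand, so preserving `src1` costs an extra move
whenever `dst` is a different register:

```rust
/// Registers are plain numbers in this toy model.
type Reg = u8;

#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// `movaps dst, src`: copy a whole vector register.
    Movaps { dst: Reg, src: Reg },
    /// SSE `andnps dst, src`: `dst = !dst & src`, so `dst` is also an input.
    Andnps { dst: Reg, src: Reg },
    /// AVX `vandnps dst, src1, src2`: `dst = !src1 & src2`.
    Vandnps { dst: Reg, src1: Reg, src2: Reg },
}

/// Compute `dst = !src1 & src2` with the destructive SSE form.
/// Assumes `dst != src2`; a real register allocator has to handle that case.
fn emit_sse(dst: Reg, src1: Reg, src2: Reg) -> Vec<Inst> {
    let mut out = Vec::new();
    if dst != src1 {
        // Copy the first input into the destination before clobbering it.
        out.push(Inst::Movaps { dst, src: src1 });
    }
    out.push(Inst::Andnps { dst, src: src2 });
    out
}

/// Compute `dst = !src1 & src2` with the three-operand AVX form.
fn emit_avx(dst: Reg, src1: Reg, src2: Reg) -> Vec<Inst> {
    vec![Inst::Vandnps { dst, src1, src2 }]
}
```

With three distinct registers, `emit_sse(0, 1, 2)` yields a `movaps`
followed by `andnps`, while `emit_avx(0, 1, 2)` is a single `vandnps`.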

So I set out on my journey to implement the instructions used by
`f32x4.min`. The first few were fairly easy. The machinst backends are
already of the shape "take these inputs and compute the output" where
the x86 requirement of a register being both input and output is
postprocessed in. This means that the `inst.isle` creation helpers for
SSE instructions were already of the correct form to use AVX. I chose to
add new `rule` branches for the instruction creation helpers, for
example `x64_andnps`. The new `rule` only applies if AVX is enabled and
emits an AVX instruction instead of an SSE instruction to achieve the
same goal. This means that no lowerings of clif
instructions were modified, instead just new instructions are being
generated.
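
As a rough illustration of that shape, the following is a Rust model rather
than the actual ISLE. `IsaFlags`, `XmmOp` and `lower_band_not` are
hypothetical names for this sketch; only `x64_andnps` is a helper named
above. The helper picks the encoding, and the lowering that calls it never
inspects the AVX flag itself:

```rust
/// Hypothetical stand-in for the target's ISA feature flags.
struct IsaFlags {
    has_avx: bool,
}

#[derive(Debug, PartialEq)]
enum XmmOp {
    /// Legacy SSE encoding: the destination doubles as the first input.
    Sse(&'static str),
    /// VEX encoding: the destination is a separate operand.
    Avx(&'static str),
}

/// Model of an instruction-creation helper with two "rules":
/// the AVX one applies only when the flag is set, otherwise SSE is used.
fn x64_andnps(flags: &IsaFlags) -> XmmOp {
    if flags.has_avx {
        XmmOp::Avx("vandnps")
    } else {
        XmmOp::Sse("andnps")
    }
}

/// Model of a lowering: it calls the helper and is unaware of AVX.
fn lower_band_not(flags: &IsaFlags) -> XmmOp {
    x64_andnps(flags)
}
```

Keeping the decision inside the helper is what lets every existing caller
pick up the AVX form without being edited.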

The VEX encoding was previously not heavily used in Cranelift. The only
current users are the FMA-style instructions that Cranelift has at this
time. These FMA instructions have one more operand than `vandnps`, for
example, so I split the existing `XmmRmRVex` into a few more variants to
fit the shape of the instructions that needed generating for
`f32x4.min`. This was accompanied then with more AVX opcode definitions,
more emission support, etc.
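
For readers unfamiliar with VEX, here is a standalone Rust sketch of the
register-to-register encoding of `vandnps` (VEX.128.0F.WIG 55 /r in Intel's
notation). It follows the architectural VEX layout and is not Cranelift's
emitter; `encode_vandnps` is a name made up for this example and memory
operands are deliberately left out:

```rust
/// Encode `vandnps dst, src1, src2` for XMM registers 0..=15.
fn encode_vandnps(dst: u8, src1: u8, src2: u8) -> Vec<u8> {
    assert!(dst < 16 && src1 < 16 && src2 < 16);
    let r = (dst >> 3) & 1; // high bit of ModRM.reg
    let b = (src2 >> 3) & 1; // high bit of ModRM.rm
    let vvvv = !src1 & 0xF; // first source, stored inverted
    let l: u8 = 0; // 128-bit vector length
    let pp: u8 = 0; // no implied legacy prefix

    let mut out = Vec::new();
    if b == 0 {
        // Two-byte form: C5, then [~R vvvv L pp].
        out.push(0xC5);
        out.push(((r ^ 1) << 7) | (vvvv << 3) | (l << 2) | pp);
    } else {
        // Three-byte form is required once the B extension bit is needed:
        // C4, then [~R ~X ~B mmmmm], then [W vvvv L pp].
        out.push(0xC4);
        // ~X is 1 because no index register is used; mmmmm = 1 selects 0F.
        out.push(((r ^ 1) << 7) | (1 << 6) | ((b ^ 1) << 5) | 0b00001);
        // W is 0 for this instruction.
        out.push((vvvv << 3) | (l << 2) | pp);
    }
    out.push(0x55); // opcode
    out.push(0xC0 | ((dst & 7) << 3) | (src2 & 7)); // ModRM, mod = 11
    out
}
```

The extra `vvvv` field is where the third operand lives, which is why the
same legacy opcode byte (`0x55`) can serve both `andnps` and `vandnps`.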

Upon implementing all of this it turned out that the test suite was
failing on my machine due to the memory-operand encodings of VEX
instructions not being supported. I didn't explicitly add those in
myself but some preexisting RIP-relative addressing was leaking into the
new instructions with existing tests. I opted to go ahead and fill out
the memory addressing modes of VEX encoding to get the tests passing
again.

All-in-all this PR adds new instructions to the x64 backend for a number
of AVX instructions, updates 5 existing instruction producers to use AVX
instructions conditionally, implements VEX memory operands, and adds
some simple tests for the new output of `f32x4.min`. The existing
runtest for `f32x4.min` caught a few intermediate bugs along the way and
I additionally added a plain `target x86_64` to that runtest to ensure
that it executes with and without AVX to test the various lowerings.
I'll also note that this, and future support, should be well-fuzzed
through Wasmtime's fuzzing which may explicitly disable AVX support
despite the machine having access to AVX, so non-AVX lowerings should be
well-tested into the future.

It's also worth mentioning that I am not an AVX or VEX or x64 expert.
Implementing the memory operand part for VEX was the hardest part of
this PR, and while I think it should be good, someone else should
definitely double-check me. Additionally I haven't added many
instructions to the x64 backend yet, so I may have missed obvious places
to test and am happy to follow up with anything to be more thorough if
necessary.

Finally I should note that this is just the tip of the iceberg when it
comes to AVX. My hope is to get some of the idioms sorted out to make it
easier for future PRs to add one-off instruction lowerings or such.

* Review feedback
2023-02-17 01:29:55 +00:00

wasmtime

A standalone runtime for WebAssembly

A Bytecode Alliance project


Guide | Contributing | Website | Chat

Installation

The Wasmtime CLI can be installed on Linux and macOS (locally) with a small install script:

curl https://wasmtime.dev/install.sh -sSf | bash

Windows or otherwise interested users can download installers and binaries directly from the GitHub Releases page.

Example

If you've got the Rust compiler installed then you can take some Rust source code:

fn main() {
    println!("Hello, world!");
}

and compile/run it with:

$ rustup target add wasm32-wasi
$ rustc hello.rs --target wasm32-wasi
$ wasmtime hello.wasm
Hello, world!

Features

  • Fast. Wasmtime is built on the optimizing Cranelift code generator to quickly generate high-quality machine code either at runtime or ahead-of-time. Wasmtime is optimized for efficient instantiation, low-overhead calls between the embedder and wasm, and scalability of concurrent instances.

  • Secure. Wasmtime's development is strongly focused on correctness and security. Building on top of Rust's runtime safety guarantees, each Wasmtime feature goes through careful review and consideration via an RFC process. Once features are designed and implemented, they undergo 24/7 fuzzing donated by Google's OSS Fuzz. As features stabilize they become part of a release, and when things go wrong we have a well-defined security policy in place to quickly mitigate and patch any issues. We follow best practices for defense-in-depth and integrate protections and mitigations for issues like Spectre. Finally, we're working to push the state-of-the-art by collaborating with academic researchers to formally verify critical parts of Wasmtime and Cranelift.

  • Configurable. Wasmtime uses sensible defaults, but can also be configured to provide more fine-grained control over things like CPU and memory consumption. Whether you want to run Wasmtime in a tiny environment or on massive servers with many concurrent instances, we've got you covered.

  • WASI. Wasmtime supports a rich set of APIs for interacting with the host environment through the WASI standard.

  • Standards Compliant. Wasmtime passes the official WebAssembly test suite, implements the official C API of wasm, and implements future proposals to WebAssembly as well. Wasmtime developers are intimately engaged with the WebAssembly standards process all along the way too.

Language Support

You can use Wasmtime from a variety of different languages through embeddings of the implementation.

Languages supported by the Bytecode Alliance:

Languages supported by the community:

Documentation

📚 Read the Wasmtime guide here! 📚

The Wasmtime guide is the best starting point to learn about what Wasmtime can do for you or help answer your questions about Wasmtime. If you're curious about contributing to Wasmtime, it can also help you do that!


It's Wasmtime.
