wasmtime

Author	SHA1	Message	Date
Chris Fallin	61dc38c065	Implement Spectre mitigations for table accesses and br_tables. (#4092) Currently, we have partial Spectre mitigation: we protect heap accesses with dynamic bounds checks. Specifically, we guard against errant accesses on the misspeculated path beyond the bounds-check conditional branch by adding a conditional move that is also dependent on the bounds-check condition. This data dependency on the condition is not speculated and thus will always pick the "safe" value (in the heap case, a NULL address) on the misspeculated path, until the pipeline flushes and recovers onto the correct path. This PR uses the same technique both for table accesses -- used to implement Wasm tables -- and for jumptables, used to implement Wasm `br_table` instructions. In the case of Wasm tables, the cmove picks the table base address on the misspeculated path. This is equivalent to reading the first table entry. This prevents loads of arbitrary data addresses on the misspeculated path. In the case of `br_table`, the cmove picks index 0 on the misspeculated path. This is safer than allowing a branch to an address loaded from an index under misspeculation (i.e., it preserves control-flow integrity even under misspeculation). The table mitigation is controlled by a Cranelift setting, on by default. The br_table mitigation is always on, because it is part of the single lowering pseudoinstruction. In both cases, the impact should be minimal: a single extra cmove in a (relatively) rarely-used operation. The table mitigation is architecture-independent (happens during legalization); the br_table mitigation has been implemented for both x64 and aarch64. (I don't know enough about s390x to implement this confidently there, but would happily review a PR to do the same on that platform.)	2022-05-02 11:19:16 -07:00
Chris Fallin	0af8737ec3	Add support for running the regalloc2 checker. (#4043) With these fixes, all this PR has to do is instantiate and run the checker on the `regalloc2::Output`. This is off by default, and is enabled by setting the `regalloc_checker` Cranelift option. This restores the old functionality provided by e.g. the `backtracking_checked` regalloc algorithm setting rather than `backtracking` when we were still on regalloc.rs.	2022-04-18 14:06:07 -07:00
Chris Fallin	a0318f36f0	Switch Cranelift over to regalloc2. (#3989) This PR switches Cranelift over to the new register allocator, regalloc2. See [this document](https://gist.github.com/cfallin/08553421a91f150254fe878f67301801) for a summary of the design changes. This switchover has implications for core VCode/MachInst types and the lowering pass. Overall, this change brings improvements to both compile time and speed of generated code (runtime), as reported in #3942:
```
Benchmark              Compilation (wallclock)  Execution (wallclock)
blake3-scalar          25% faster               28% faster
blake3-simd            no diff                  no diff
meshoptimizer          19% faster               17% faster
pulldown-cmark         17% faster               no diff
bz2                    15% faster               no diff
SpiderMonkey, fib(30)  21% faster               2% faster
clang.wasm             42% faster               N/A
```	2022-04-14 10:28:21 -07:00
bjorn3	37115c10e0	Implement Display for settings::Value	2021-07-03 14:34:42 +02:00
Chris Fallin	800cf25bb5	Make the CFG metadata computation conditional on a flag.	2021-05-24 13:01:15 -07:00
bjorn3	03fdbadfb4	Remove thiserror dependency from cranelift_codegen	2021-05-04 13:45:20 +02:00
Peter Huene	0ddfe97a09	Change how flags are stored in serialized modules. This commit changes how both the shared flags and ISA flags are stored in the serialized module to detect incompatibilities when a serialized module is instantiated. It improves the error reporting when a compiled module has mismatched shared flags.	2021-04-01 21:39:57 -07:00
Peter Huene	abf3bf29f9	Add a `wasmtime settings` command to print Cranelift settings. This commit adds the `wasmtime settings` command to print out available Cranelift settings for a target (defaults to the host). The compile command has been updated to remove the Cranelift ISA options in favor of encouraging users to use `wasmtime settings` to discover what settings are available. This will reduce the maintenance cost for syncing the compile command with Cranelift ISA flags.	2021-04-01 19:38:19 -07:00
Chris Fallin	2d5db92a9e	Rework/simplify unwind infrastructure and implement Windows unwind. Our previous implementation of unwind infrastructure was somewhat complex and brittle: it parsed generated instructions in order to reverse-engineer unwind info from prologues. It also relied on some fragile linkage to communicate instruction-layout information that VCode was not designed to provide. A much simpler, more reliable, and easier-to-reason-about approach is to embed unwind directives as pseudo-instructions in the prologue as we generate it. That way, we can say what we mean and just emit it directly. The usual reasoning that leads to the reverse-engineering approach is that metadata is hard to keep in sync across optimization passes; but here, (i) prologues are generated at the very end of the pipeline, and (ii) if we ever do a post-prologue-gen optimization, we can treat unwind directives as black boxes with unknown side-effects, just as we do for some other pseudo-instructions today. It turns out that it was easier to just build this for both x64 and aarch64 (since they share a factored-out ABI implementation), and wire up the platform-specific unwind-info generation for Windows and SystemV. Now we have simpler unwind on all platforms and we can delete the old unwind infra as soon as we remove the old backend. There were a few consequences to supporting Fastcall unwind in particular that led to a refactor of the common ABI. Windows only supports naming clobbered-register save locations within 240 bytes of the frame-pointer register, whatever one chooses that to be (RSP or RBP). We had previously saved clobbers below the fixed frame (and below nominal-SP). The 240-byte range has to include the old RBP too, so we're forced to place clobbers at the top of the frame, just below saved RBP/RIP. This is fine; we always keep a frame pointer anyway because we use it to refer to stack args. 
It does mean that offsets of fixed-frame slots (spillslots, stackslots) from RBP are no longer known before we do regalloc, so if we ever want to index these off of RBP rather than nominal-SP because we add support for `alloca` (dynamic frame growth), then we'll need a "nominal-BP" mode that is resolved after regalloc and clobber-save code is generated. I added a comment to this effect in `abi_impl.rs`. The above refactor touched both x64 and aarch64 because of shared code. This had a further effect in that the old aarch64 prologue generation subtracted from `sp` once to allocate space, then used stores to `[sp, offset]` to save clobbers. Unfortunately the offset only has 7-bit range, so if there are enough clobbered registers (and there can be -- aarch64 has 384 bytes of registers; at least one unit test hits this) the stores/loads will be out-of-range. I really don't want to synthesize large-offset sequences here; better to go back to the simpler pre-index/post-index `stp r1, r2, [sp, #-16]` form that works just like a "push". It's likely not much worse microarchitecturally (dependence chain on SP, but oh well) and it actually saves an instruction if there's no other frame to allocate. As a further advantage, it's much simpler to understand; simpler is usually better. This PR adds the new backend on Windows to CI as well.	2021-03-11 20:03:52 -08:00
Chris Fallin	6c94eb82aa	x86-64 Windows fastcall ABI support. This adds support for the "fastcall" ABI, which is the native C/C++ ABI on Windows platforms on x86-64. It is similar to but not exactly like System V; primarily, its argument register assignments are different, and it requires stack shadow space. Note that this also adjusts the handling of multi-register values in the shared ABI implementation, and with this change, adjusts handling of `i128`s on both Fastcall/x64 and SysV/x64 platforms. This was done to align with actual behavior by the "rustc ABI" on both platforms, as mapped out experimentally (Compiler Explorer link in comments). This behavior is gated under the `enable_llvm_abi_extensions` flag. Note also that this does not add x64 unwind info on Windows. That will come in a future PR (but is planned!).	2021-03-03 19:53:18 -08:00
Alex Crichton	503129ad91	Add a method to share `Config` across machines (#2608) With `Module::{serialize,deserialize}` it should be possible to share wasmtime modules across machines or CPUs. Serialization, however, embeds a hash of all configuration values, including cranelift compilation settings. By default wasmtime's selection of the native ISA would enable ISA flags according to CPU features available on the host, but the same CPU features may not be available across two machines. This commit adds a `Config::cranelift_clear_cpu_flags` method which allows clearing the target-specific ISA flags that are automatically inferred by default for the native CPU. Options can then be incrementally built back up as desired with the `cranelift_other_flag` method.	2021-01-26 15:59:12 -06:00
Yury Delendik	399ee0a54c	Serialize and deserialize compilation artifacts. (#2020) * Serialize and deserialize Module * Use bincode to serialize * Add wasm_module_serialize; docs * Simple tests	2020-07-21 15:05:50 -05:00
Chris Fallin	e694fb1312	Spectre mitigation on heap access overflow checks. This PR adds a conditional move following a heap bounds check through which the address to be accessed flows. This conditional move ensures that even if the branch is mispredicted (access is actually out of bounds, but speculation goes down in-bounds path), the actually accessed address is zero (a NULL pointer) rather than the out-of-bounds address. The mitigation is controlled by a flag that is off by default, but can be set by the embedding. Note that in order to turn it on by default, we would need to add conditional-move support to the current x86 backend; this does not appear to be present. Once the deprecated backend is removed in favor of the new backend, IMHO we should turn this flag on by default. Note that the mitigation is unnecessary when we use the "huge heap" technique on 64-bit systems, in which we allocate a range of virtual address space such that no 32-bit offset can reach other data. Hence, this only affects small-heap configurations.	2020-07-01 08:36:09 -07:00
Benjamin Bouvier	65ef26b989	Add a setting to choose a register allocator algorithm to use with MachBackend;	2020-04-22 14:47:18 +02:00
bjorn3	0a1bb3ba6c	Add TLS support for ELF and MachO (#1174) * Add TLS support * Add binemit and legalize tests * Spill all caller-saved registers when necessary	2020-02-25 17:50:04 -08:00
Benjamin Bouvier	dd497c19e1	Renames Settings ⚠️ (fixes #976) (#1321) This is a breaking API change: the following settings have been renamed: - jump_tables_enabled -> enable_jump_tables - colocated_libcalls -> use_colocated_libcalls - probestack_enabled -> enable_probestack - allones_funcaddrs -> emit_all_ones_funcaddrs	2020-01-13 14:42:49 -07:00
Josh Triplett	7e725cf880	Migrate from failure to thiserror The failure crate invents its own traits that don't use std::error::Error (because failure predates certain features added to Error); this prevents using ? on an error from failure in a function using Error. The thiserror crate integrates with the standard Error trait instead.	2019-10-30 17:15:09 -07:00
Peter Huene	9f506692c2	Fix clippy warnings. This commit fixes the current set of (stable) clippy warnings in the repo.	2019-10-24 17:20:12 -07:00
bjorn3	bb8fa40ef0	Rustfmt	2019-10-02 11:50:44 -07:00
bjorn3	10e226f9ff	Always use extern crate std in cranelift-codegen	2019-10-02 11:50:44 -07:00
julian-seward1	9e088e4164	Reorganise optimisation level settings, and make the insn shrink pass optional (#1044) This patch: * removes the "default" opt level, on the basis that it has no definition and is referred to nowhere in the compiler. * renames the "fastest" level to "none". The resulting set of transformations is unchanged. * renames the "best" level to "speed_and_size". The resulting set of transformations is unchanged. * adds a new level, "speed". This is the same as "speed_and_size" except that it omits transformations aimed only at reducing code size. Currently it omits only the insn shrinking pass.	2019-09-19 18:51:25 +02:00
Benjamin Bouvier	c1609b70e8	[codegen] Allow using the pinned register as the heap base via a setting;	2019-09-06 16:18:27 +02:00
Benjamin Bouvier	660b8b28b8	[codegen] Add a pinned register that's entirely under the control of the user;	2019-09-06 16:18:27 +02:00
Carmen Kwan	19257f80c1	Add reference types R32 and R64 -Add resumable_trap, safepoint, isnull, and null instructions -Add Stackmap struct and StackmapSink trait Co-authored-by: Mir Ahmed <mirahmed753@gmail.com> Co-authored-by: Dan Gohman <sunfish@mozilla.com>	2019-08-16 11:35:16 -07:00
Benjamin Bouvier	d8d3602257	Adds the libcall_call_conv setting and use it for libcall calls expansion;	2019-08-12 16:12:00 -07:00
Andrew Brown	f2c48009e8	Disable SIMD features by default	2019-07-16 17:07:44 -07:00
Benjamin Bouvier	563525b090	[meta] Remove mentions to Python in comments of the non-meta crate;	2019-07-05 17:50:17 +02:00
Benjamin Bouvier	d7d48d5cc6	Add the dyn keyword before trait objects;	2019-06-24 11:42:26 +02:00
Benjamin Bouvier	d6059d4605	[meta] Use the Rust crate for settings generation;	2019-05-03 12:01:12 +02:00
lazypassion	747ad3c4c5	moved crates in lib/ to src/, renamed crates, modified some files' text (#660)	2019-01-28 15:56:54 -08:00

30 Commits
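
The Spectre mitigation described in commits 61dc38c065 and e694fb1312 makes the accessed index (or address) data-dependent on the bounds-check condition, so that an out-of-bounds value collapses to a safe one (index 0, or a NULL address) even if the branch is mispredicted. The following is only a minimal sketch of that data flow in plain Rust, not Cranelift's implementation: the names `clamp_index_branchless` and `table_get` are hypothetical, and the Rust compiler gives no guarantee that this compiles to a conditional move, whereas Cranelift emits the cmove directly in generated machine code.

```rust
/// Hypothetical helper: returns `index` when it is in bounds, else 0.
/// The result is computed with mask arithmetic instead of a branch, which
/// models the data dependency on the bounds-check condition.
fn clamp_index_branchless(index: usize, len: usize) -> usize {
    // 1 when in bounds, 0 otherwise.
    let in_bounds = (index < len) as usize;
    // All ones when in bounds, all zeros otherwise.
    let mask = in_bounds.wrapping_neg();
    index & mask
}

/// Hypothetical table read: the ordinary bounds check rejects bad indices,
/// and the load itself only ever uses the clamped index.
fn table_get(table: &[u32], index: usize) -> Option<u32> {
    if index >= table.len() {
        // Architectural bounds check (a trap in the Wasm case).
        return None;
    }
    // Stands in for the cmove: out-of-bounds indices would become 0 here.
    let safe_index = clamp_index_branchless(index, table.len());
    Some(table[safe_index])
}

fn main() {
    let table = [10, 20, 30];
    println!("{:?}", table_get(&table, 1)); // prints "Some(20)"
    println!("{:?}", table_get(&table, 7)); // prints "None"
}
```

Clamping to index 0 mirrors the choice described for `br_table` and Wasm tables: the first entry is always a legitimate target, so nothing outside the table is reachable on the misspeculated path.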