wasmtime

Author	SHA1	Message	Date
Andrew Brown	a27a079d65	Replace ExtractLane format with BinaryImm8 Like https://github.com/bytecodealliance/wasmtime/pull/1762, this changes the name of the `ExtractLane` format to the more-general `BinaryImm8` and renames its immediate argument from `lane` to `imm`.	2020-05-29 19:56:27 -07:00
Andrew Brown	7d6e94b952	Replace InsertLane format with TernaryImm8 The InsertLane format has an ordering (`value().imm().value()`) and immediate name (`"lane"`) that make it awkward to use for other instructions. This changes the ordering (`value().value().imm()`) and uses the default name (`"imm"`) throughout the codebase.	2020-05-29 19:56:27 -07:00
teapotd	e430984ac4	Improve bitselect codegen with knowledge of operand origin (#1783) * Encode vselect using BLEND instructions on x86 * Legalize vselect to bitselect * Optimize bitselect to vselect for some operands * Add run tests for bitselect-vselect optimization * Address review feedback	2020-05-29 19:53:11 -07:00
teapotd	0f55bb4b8d	Always check if struct-return parameter is needed	2020-05-25 20:03:24 +02:00
Peter Huene	ce5f3e153b	Only update XMM save unwind operation offsets when using a FP. This commit prevents updating the XMM save unwind operation offsets when a frame pointer is not used, even though currently Cranelift always uses a frame pointer. This will prevent incorrect unwind information in the future when we start omitting frame pointers.	2020-05-21 16:46:30 -07:00
Peter Huene	2cd5ed1880	Address code review feedback.	2020-05-21 15:57:11 -07:00
Peter Huene	78c3091e84	Fix FPR saving and shadow space allocation for Windows x64. This commit fixes both how FPR callee-saved registers are saved and how the shadow space allocation occurs when laying out the stack for Windows x64 calling convention. Importantly, this commit removes the compiler limitation of stack size for Windows x64 that was imposed because FPR saves previously couldn't always be represented in the unwind information. The FPR saves are now performed without using stack slots, much like how the callee-saved GPRs are saved. The total CSR space is given to `layout_stack` so that it is included in the frame size and to offset the layout of spills and explicit slots. The FPR saves are now done via an RSP offset (post adjustment) and they always follow the GPR saves on the stack. A simpler calculation can now be made to determine the proper offsets of the FPR saves for representing the unwind information. Additionally, the shadow space is no longer treated as an incoming argument, but an explicit stack slot that gets laid out at the lowest address possible in the local frame. This prevents `layout_stack` from putting a spill or explicit slot in this reserved space. In the future, `layout_stack` should take advantage of the caller-provided shadow space for spills, but this commit does not attempt to address that. The shadow space is now omitted from the local frame for leaf functions. Fixes #1728. Fixes #1587. Fixes #1475.	2020-05-20 15:37:30 -07:00
whitequark	4ec16fa057	Legalize 64 bit shifts on x86_32 using PSLLQ/PSRLQ. Co-authored-by: iximeow <git@iximeow.net>	2020-05-09 03:28:19 -07:00
whitequark	2331403741	Extend X86 ABI to cover stack overflow checking on X86-32. In stark contrast with every reasonable architecture, X86-32 does not pass any parameters in registers. Because of that we have to resort to reading arguments from stack without being able to use the stack slot machinery. (This wouldn't have been avoidable even by pinning a register because there is a trampoline in wasmtime with the C ABI that Cranelift needs to be able to call.)	2020-05-09 03:27:06 -07:00
Benjamin Bouvier	fa54422854	Add a work-in-progress backend for x86_64 using the new instruction selection; Most of the work is credited to Julian Seward. Co-authored-by: Julian Seward <jseward@acm.org> Co-authored-by: Chris Fallin <cfallin@mozilla.com>	2020-05-05 16:35:41 +02:00
Andrew Brown	5f0286696c	Add x86 implementation of 8x16 `ishl` This involves some large mask tables that may hurt code size but reduce the number of instructions. See https://github.com/WebAssembly/simd/issues/117 for a more in-depth discussion on this.	2020-04-23 10:55:54 -07:00
Alex Crichton	c9a0ba81a0	Implement interrupting wasm code, reimplement stack overflow (#1490 ) * Implement interrupting wasm code, reimplement stack overflow This commit is a relatively large change for wasmtime with two main goals: * Primarily this enables interrupting executing wasm code with a trap, preventing infinite loops in wasm code. Note that resumption of the wasm code is not a goal of this commit. * Additionally this commit reimplements how we handle stack overflow to ensure that host functions always have a reasonable amount of stack to run on. This fixes an issue where we might longjmp out of a host function, skipping destructors. Lots of various odds and ends end up falling out in this commit once the two goals above were implemented. The strategy for implementing this was also lifted from Spidermonkey and existing functionality inside of Cranelift. I've tried to write up thorough documentation of how this all works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are. A brief summary of how this works is that each function and each loop header now checks to see if they're interrupted. Interrupts and the stack overflow check are actually folded into one now, where function headers check to see if they've run out of stack and the sentinel value used to indicate an interrupt, checked in loop headers, tricks functions into thinking they're out of stack. An interrupt is basically just writing a value to a location which is read by JIT code. When interrupts are delivered and what triggers them has been left up to embedders of the `wasmtime` crate. The `wasmtime::Store` type has a method to acquire an `InterruptHandle`, where `InterruptHandle` is a `Send` and `Sync` type which can travel to other threads (or perhaps even a signal handler) to get notified from. It's intended that this provides a good degree of flexibility when interrupting wasm code. 
Note though that this does have a large caveat where interrupts don't work when you're interrupting host code, so if you've got a host import blocking for a long time an interrupt won't actually be received until the wasm starts running again. Some fallout included from this change is: * Unix signal handlers are no longer registered with `SA_ONSTACK`. Instead they run on the native stack the thread was already using. This is possible since stack overflow isn't handled by hitting the guard page, but rather it's explicitly checked for in wasm now. Native stack overflow will continue to abort the process as usual. * Unix sigaltstack management is now no longer necessary since we don't use it any more. * Windows no longer has any need to reset guard pages since we no longer try to recover from faults on guard pages. * On all targets probestack intrinsics are disabled since we use a different mechanism for catching stack overflow. * The C API has been updated with interrupts handles. An example has also been added which shows off how to interrupt a module. Closes #139 Closes #860 Closes #900 * Update comment about magical interrupt value * Store stack limit as a global value, not a closure * Run rustfmt * Handle review comments * Add a comment about SA_ONSTACK * Use `usize` for type of `INTERRUPTED` * Parse human-readable durations * Bring back sigaltstack handling Allows libstd to print out stack overflow on failure still. * Add parsing and emission of stack limit-via-preamble * Fix new example for new apis * Fix host segfault test in release mode * Fix new doc example	2020-04-21 11:03:28 -07:00
Andrew Brown	3f47291f2e	Add x86 implementation of 8x16 `ushr` This involves some large mask tables that may hurt code size but reduce the number of instructions. See https://github.com/WebAssembly/simd/issues/117 for a more in-depth discussion on this.	2020-04-17 11:59:47 -07:00
Peter Huene	2fb7e9f3c2	Return error for register mapping failure. This commit removes a panic when a register mapping fails and instead returns an error from creating the unwind information.	2020-04-16 11:15:35 -07:00
Peter Huene	09a3f10a48	Move UnwindInfo definition out of x86 ABI. This commit moves the opaque definition of Windows x64 UnwindInfo out of the ISA and into a location that can be easily used by the top level `UnwindInfo` enum. This allows the `unwind` feature to be independent of the individual ISAs supported.	2020-04-16 11:15:34 -07:00
Peter Huene	f7e9f86ba9	Refactor unwind generation in Cranelift. This commit makes the following changes to unwind information generation in Cranelift: * Remove frame layout change implementation in favor of processing the prologue and epilogue instructions when unwind information is requested. This also means this work is no longer performed for Windows, which didn't utilize it. It also helps simplify the prologue and epilogue generation code. * Remove the unwind sink implementation that required each unwind information to be represented in final form. For FDEs, this meant writing a complete frame table per function, which wastes 20 bytes or so for each function with duplicate CIEs. This also enables Cranelift users to collect the unwind information and write it as a single frame table. * For System V calling convention, the unwind information is no longer stored in code memory (it's only a requirement for Windows ABI to do so). This allows for more compact code memory for modules with a lot of functions. * Deletes some duplicate code relating to frame table generation. Users can now simply use gimli to create a frame table from each function's unwind information. Fixes #1181.	2020-04-16 11:15:32 -07:00
Samrat Man Singh	4d34c22a1c	Use F64X2 as type when saving and restoring XMM registers When adding floating-point registers as callee-saved register to block- and function parameter lists add them as `F64X2` arguments.	2020-04-13 09:48:08 -07:00
iximeow	4cca510085	Windows FPRs preservation (#1216) Preserve FPRs as required by the Windows fastcall calling convention. This exposes an implementation limit due to Cranelift's approach to stack layout, which conflicts with expectations Windows makes in SEH layout - functions where the Cranelift user desires fastcall unwind information, that require preservation of an ABI-reserved FPR, that have a stack frame 240 bytes or larger, now produce an error when compiled. Several wasm spectests were disabled because they would trip this limit. This is a temporary constraint that should be fixed promptly. Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>	2020-04-10 13:27:20 -07:00
Andrew Brown	6fd0451bc3	Add TargetIsa::map_dwarf_register; fixes #1471 This exposes the functionality of `fde::map_reg` on the `TargetIsa` trait, avoiding compilation errors on architectures where register mapping is not yet supported. The change is conditionally compiled under the `unwind` feature.	2020-04-09 09:45:20 -07:00
Andrew Brown	a799f9f6b5	Skip extra work when calculating sizes for recipes with inferred REX prefixes As explained in the added documentation and #1342, if we prevent `infer_rex()` and `w()` from being used together then we don't need to check whether the W bit is set when calculating the size of a recipe. This should improve compile time for x86 very slightly since all `infer_rex()` instructions will no longer need this check.	2020-04-02 16:50:07 -07:00
Andrew Brown	a4c1147045	Skip extra work when inferring REX prefixes As explained in the added documentation and #1342, if we prevent `infer_rex()` and `w()` from being used together then we don't need to check whether the W bit is set when figuring out if a REX prefix is needed in `needs_rex()`. This should improve compile time for x86 very slightly since all `infer_rex()` instructions will no longer need this check.	2020-04-02 16:50:07 -07:00
Andrew Brown	e425bfcebd	Infer REX prefixes for SIMD load and store with displacement	2020-04-02 11:28:42 -07:00
Andrew Brown	dc874a5b3b	Infer REX prefixes for SIMD load_extend	2020-04-02 11:28:42 -07:00
Andrew Brown	d3df275003	Remove duplication of map_reg; fixes #1245 Both cranelift-codegen and wasmtime-debug need to map Cranelift registers to Gimli registers. Previously both crates had an almost-identical `map_reg` implementation. This change: - removes the wasmtime-debug implementation - improves the cranelift-codegen implementation with custom errors - exposes map_reg in `cranelift_codegen::isa::fde::map_reg` and subsequently `wasmtime_environ::isa::fde::map_reg`	2020-03-31 15:42:02 -07:00
Andrew Brown	0d63bd12d8	Infer REX prefix for SIMD operations; fixes #1127 - Convert recipes to have necessary size calculator - Add a missing binemit function, `put_dynrexmp3` - Modify the meta-encodings of x86 SIMD instructions to use `infer_rex()`, mostly through the `enc_both_inferred()` helper - Fix up tests that previously always emitted a REX prefix	2020-03-18 10:12:50 -07:00
Andrew Brown	8598295bc4	Remove FPR32; fixes #1303 Until #1306 is resolved (some spilling/regalloc issue with larger FPR register banks), this removes FPR32 support. Only Wasm's `i64x2.mul` was using this register class and that instruction is predicated on AVX512 support; for the time being, that instruction will have to make do with the 16 FPR registers.	2020-03-17 12:46:41 -07:00
Andrew Brown	965714d675	Add encoding functions for emitting EVEX formats Only the `reg, vvvv, rm` form is currently supported but it should not be difficult to add more forms.	2020-03-06 10:53:22 -08:00
Andrew Brown	079fcafcb1	Expand x86 registers to include 32 XMM registers The EVEX encoding format (e.g. in AVX-512) allows addressing 32 registers instead of 16. The FPR register class currently defines 16 registers, `%xmm0`-`%xmm15`; that class is kept as-is with this change. A larger class, FPR32, is added as a super-class of FPR using a larger bank of registers, `%xmm0`-`%xmm31`.	2020-03-06 10:53:22 -08:00
Andrew Brown	3f53bcb740	Remove dependency on hard-coded ordering of x86 register banks With this change, register banks can now be re-ordered and other components (e.g. unwinding, regalloc) will no longer break. The previous behavior assumed that GPR registers always started at `RegUnit` 0.	2020-03-06 10:53:22 -08:00
bjorn3	0a1bb3ba6c	Add TLS support for ELF and MachO (#1174) * Add TLS support * Add binemit and legalize tests * Spill all caller-saved registers when necessary	2020-02-25 17:50:04 -08:00
Andrew Brown	1a9dc743d1	Infer REX prefix for SIMD `load` instruction	2020-02-19 09:24:05 -08:00
Andrew Brown	936120dcf9	Infer REX prefix for SIMD `store` and `vconst` instructions	2020-02-19 09:24:05 -08:00
Peter Delevoryas	18b40d1101	Add ineg legalization for scalar integer types (#1385)	2020-02-14 13:16:02 -08:00
Ryan Hunt	832666c45e	Mass rename Ebb and relatives to Block (#1365 ) * Manually rename BasicBlock to BlockPredecessor BasicBlock is a pair of (Ebb, Inst) that is used to represent the basic block subcomponent of an Ebb that is a predecessor to an Ebb. Eventually we will be able to remove this struct, but for now it makes sense to give it a non-conflicting name so that we can start to transition Ebb to represent a basic block. I have not updated any comments that refer to BasicBlock, as eventually we will remove BlockPredecessor and replace with Block, which is a basic block, so the comments will become correct. * Manually rename SSABuilder block types to avoid conflict SSABuilder has its own Block and BlockData types. These along with associated identifier will cause conflicts in a later commit, so they are renamed to be more verbose here. * Automatically rename 'Ebb' to 'Block' in .rs Automatically rename 'EBB' to 'block' in .rs Automatically rename 'ebb' to 'block' in .rs Automatically rename 'extended basic block' to 'basic block' in .rs Automatically rename 'an basic block' to 'a basic block' in .rs Manually update comment for `Block` `Block`'s wikipedia article required an update. * Automatically rename 'an `Block`' to 'a `Block`' in .rs Automatically rename 'extended_basic_block' to 'basic_block' in .rs Automatically rename 'ebb' to 'block' in .clif Manually rename clif constant that contains 'ebb' as substring to avoid conflict * Automatically rename filecheck uses of 'EBB' to 'BB' 'regex: EBB' -> 'regex: BB' '$EBB' -> '$BB' * Automatically rename 'EBB' 'Ebb' to 'block' in .clif Automatically rename 'an block' to 'a block' in .clif Fix broken testcase when function name length increases Test function names are limited to 16 characters. This causes the new longer name to be truncated and fail a filecheck test. An outdated comment was also fixed.	2020-02-07 10:46:47 -06:00
Yury Delendik	169dbef784	Properly preserve and restore CFA state in FDE (#1373) * Properly preserve and restore CFA state in FDE	2020-02-03 14:08:40 -08:00
Ryan Hunt	a15bb9cfcb	Codegen: Use GPR regclass for reference types on x86	2020-01-23 13:37:11 -06:00
Andrew Brown	fd04ea2b06	Fix incorrect assertion for `insertlane` (#1355) Previously, the assertion checked for `lane > 0` when it should have been `lane >= 0`; since lane is unsigned, this half of the assertion can be entirely removed.	2020-01-17 14:39:31 -08:00
Andrew Brown	e1d513ab4b	Fix remaining clippy warnings (#1340) * clippy: allow complex encoding function * clippy: remove unnecessary main() function in doctest * clippy: remove redundant `Type` suffix on LaneType enum variants * clippy: ignore incorrect debug_assert_with_mut_call warning * clippy: fix FDE clippy warnings	2020-01-17 14:03:30 -06:00
Benjamin Bouvier	dd497c19e1	Renames Settings ⚠️ (fixes #976) (#1321) This is a breaking API change: the following settings have been renamed: - jump_tables_enabled -> enable_jump_tables - colocated_libcalls -> use_colocated_libcalls - probestack_enabled -> enable_probestack - allones_funcaddrs -> emit_all_ones_funcaddrs	2020-01-13 14:42:49 -07:00
Yury Delendik	bd88155483	Refactor unwind; add FDE support. (#1320) * Refactor unwind * add FDE support * use sink directly in emit functions * pref off all unwinding generation with feature	2020-01-13 10:32:55 -06:00
Andrew Brown	e8c3302bc5	Fix some additional clippy warnings	2020-01-10 08:38:40 -08:00
Sean Stangl	cf9e762f16	Add a DynRex recipe type for x86, decreasing the number of recipes (#1298) This patch adds a third mode for templates: REX inference is requestable at template instantiation time. This reduces the number of recipes by removing rex()/nonrex() redundancy for many instructions.	2019-12-19 15:49:34 -07:00
Andrew Brown	d4df756acf	Remove packed_struct dependency; closes #1271 and #1284 (#1282)	2019-12-12 17:01:31 -08:00
llogiq	0d8f8bc71f	Fix some clippy warnings (#1277)	2019-12-07 09:47:43 -08:00
iximeow	d804ab8b92	Track frame layout changes. (#1204) * Track frame layout changes.	2019-11-18 10:18:38 -08:00
Benjamin Bouvier	569a57fa7d	Hoist the stack alignment and Windows64 fastcall shadow stack space constants.	2019-11-15 13:58:47 +01:00
Sean Stangl	f8ae622003	Use a struct interface for creating and reading encoding bits on x86. #1156 (#1212)	2019-11-13 18:01:13 -07:00
Nick Fitzgerald	7e32fa2731	Try and assign directly to return registers; backtrack to use struct-return param (#1213) * Try and assign directly to return registers; backtrack to use struct-return param Rather than trying to count number of return registers that would be used by a given set of return values, optimistically assign the return values to registers. If we later find that we can't fit them all in registers, then backtrack and introduce the use of a struct-return pointer parameter. * Rename `rets2` and wrap it in an option so we avoid the clone for non-multi-value	2019-11-08 09:51:57 -08:00
Sean Stangl	a06f2c87c2	Pass Encoding to compute_size() for runtime Encoding inspection. #1156 In some cases, compute_size() is used to choose between various different Encodings before one has been assigned to an instruction. For x86, the REX.W bit is stored in the Encoding. To share recipes between REX/non-REX, that bit must be inspected by compute_size().	2019-11-08 09:08:07 -08:00
Benjamin Bouvier	143cb01489	Do not align the stack frame for leaf functions not using the stack.	2019-11-08 17:20:20 +01:00
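
The setting renames recorded in commit dd497c19e1 above can be read as a simple lookup table. The following is a minimal sketch in Rust for callers migrating old configuration; the helper name `renamed_setting` is hypothetical and not part of Cranelift's API. Only the four old/new name pairs are taken from the commit message.

```rust
/// Maps a pre-rename Cranelift setting name to its new name.
///
/// Hypothetical helper for illustration only: the four name pairs come
/// from the commit message of dd497c19e1, the function itself does not
/// exist in Cranelift.
pub fn renamed_setting(old: &str) -> Option<&'static str> {
    match old {
        "jump_tables_enabled" => Some("enable_jump_tables"),
        "colocated_libcalls" => Some("use_colocated_libcalls"),
        "probestack_enabled" => Some("enable_probestack"),
        "allones_funcaddrs" => Some("emit_all_ones_funcaddrs"),
        // Any other name was not affected by this rename.
        _ => None,
    }
}
```

A caller could pass each setting name through `renamed_setting` and fall back to the original name when it returns `None`.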

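Commit c9a0ba81a0 above describes folding interrupts into the stack overflow check: function headers compare the stack pointer against a stack limit, loop headers look for a sentinel value, and writing that sentinel into the limit makes functions believe they are out of stack. The sketch below models that idea in plain Rust. It is not wasmtime's implementation: the `StackLimit` type, its method names, and the choice of `usize::MAX` as the sentinel are assumptions made for illustration (the commit only states that `INTERRUPTED` is a `usize`), and a downward-growing stack is assumed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Assumed sentinel value: larger than any real stack pointer, so the
/// ordinary "out of stack" comparison fails once it is stored.
const INTERRUPTED: usize = usize::MAX;

/// Hypothetical model of the shared location read by compiled code.
/// Cloning yields a handle that can be sent to another thread.
#[derive(Clone)]
pub struct StackLimit {
    limit: Arc<AtomicUsize>,
}

impl StackLimit {
    pub fn new(limit: usize) -> Self {
        StackLimit { limit: Arc::new(AtomicUsize::new(limit)) }
    }

    /// Requests an interrupt by overwriting the limit with the sentinel.
    pub fn interrupt(&self) {
        self.limit.store(INTERRUPTED, Ordering::SeqCst);
    }

    /// Function-header check: trap when the stack pointer is below the
    /// limit (the stack is assumed to grow toward lower addresses).
    pub fn function_entry_traps(&self, stack_pointer: usize) -> bool {
        stack_pointer < self.limit.load(Ordering::SeqCst)
    }

    /// Loop-header check: trap only when the sentinel has been written.
    pub fn loop_header_traps(&self) -> bool {
        self.limit.load(Ordering::SeqCst) == INTERRUPTED
    }
}
```

Because the sentinel exceeds every plausible stack pointer, a single store is enough to make both checks trap, which is why one shared location can serve both purposes in this model.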

82 Commits