wasmtime

Author	SHA1	Message	Date
Chris Fallin	0c240991ae	Merge pull request #2346 from uweigand/abi-noframepointer machinst ABI: Pass fixed frame size to gen_clobber_restore	2020-11-03 09:00:59 -08:00
Julian Seward	5a5fb11979	CL/aarch64: implement the wasm SIMD `i32x4.dot_i16x8_s` instruction This patch implements, for aarch64, the following wasm SIMD extensions i32x4.dot_i16x8_s instruction https://github.com/WebAssembly/simd/pull/127 It also updates dependencies as follows, in order that the new instruction can be parsed, decoded, etc: wat to 1.0.27 wast to 26.0.1 wasmparser to 0.65.0 wasmprinter to 0.2.12 The changes are straightforward: * new CLIF instruction `widening_pairwise_dot_product_s` * translation from wasm into `widening_pairwise_dot_product_s` * new AArch64 instructions `smull`, `smull2` (part of the `VecRRR` group) * translation from `widening_pairwise_dot_product_s` to `smull ; smull2 ; addv` There is no testcase in this commit, because that is a separate repo. The implementation has been tested, nevertheless.	2020-11-03 14:25:04 +01:00
Ulrich Weigand	c9bc4edd08	machinst ABI: Pass fixed frame size to gen_clobber_restore The ABI common code currently passes the fixed frame size to the gen_clobber_save back-end routine, which is required to emit code to allocate the required stack space in the prologue. Similarly, the back-end needs to emit code to de-allocate the stack in the epilogue. However, at this point the back-end does not have access to that fixed frame size value any more. With targets that use a frame pointer, this does not matter, since de-allocation can be done simply by assigning the frame pointer back to the stack pointer. However, on targets that do not use a frame pointer, the frame size is required. To allow back-ends that option, this patch changes ABI common code to pass the fixed frame size to gen_clobber_restore as well (the same value as is passed to gen_clobber_save).	2020-11-03 11:15:03 +01:00
Ulrich Weigand	d02ae3940c	machinst ABI: Allow back-end to define stack alignment The common gen_prologue code currently assumes that the stack pointer has to be aligned to twice the word size. While this is true for many ABIs, it does not hold universally. This patch adds a new callback stack_align that back-ends can provide to define the specific stack alignment required by the ABI on that platform.	2020-11-03 09:43:55 +01:00
Andrew Brown	6d50099816	Rewrite interpreter generically (#2323) * Rewrite interpreter generically This change re-implements the Cranelift interpreter to use generic values; this makes it possible to do abstract interpretation of Cranelift instructions. In doing so, the interpretation state is extracted from the `Interpreter` structure and is accessed via a `State` trait; this makes it possible to not only more clearly observe the interpreter's state but also to interpret using a dummy state (e.g. `ImmutableRegisterState`). This addition made it possible to implement more of the Cranelift instructions (~70%, ignoring the x86-specific instructions). * Replace macros with closures	2020-11-02 12:28:07 -08:00
Chris Fallin	d1be8dcfc0	Merge pull request #2310 from akirilov-arm/vector_constants Cranelift AArch64: Improve code generation for vector constants	2020-11-01 21:56:40 -08:00
bjorn3	23aafa1054	Fix icmp_imm.i128 The immediate splitting code contained a bug causing both low and high to be equal for i128. This is the root cause for bjorn3/rustc_codegen_cranelift#1097 and likely the only bug preventing cg_clif from bootstrapping rustc.	2020-10-31 21:11:50 +01:00
Johnnie Birch	c32740ffcd	Updates comments on Int to Float conversion Int to float for unsigned ints has merged, but there were some comments on a different PR for the same pull request that are addressed in this PR	2020-10-30 16:49:30 -07:00
Anton Kirilov	207779fe1d	Cranelift AArch64: Improve code generation for vector constants In particular, introduce initial support for the MOVI and MVNI instructions, with 8-bit elements. Also, treat vector constants as 32- or 64-bit floating-point numbers, if their value allows it, by relying on the architectural zero extension. Finally, stop generating literal loads for 32-bit constants. Copyright (c) 2020, Arm Limited.	2020-10-30 13:16:12 +00:00
Andrew Brown	6c6d958f38	[machinst x64]: implement packed pmin/pmax	2020-10-28 16:03:53 -07:00
Andrew Brown	6725b6b129	[machinst x64]: implement bitmask	2020-10-28 15:16:36 -07:00
Andrew Brown	5b9a21e099	Add missing `SourceLoc` to newly-emitted instructions The changes in https://github.com/bytecodealliance/wasmtime/pull/2278 added `SourceLoc`s to several x64 `Inst` variants; between when that PR was last run in CI and when it was merged, new instructions were added that require this new parameter. This change adds the parameter in order to fix CI.	2020-10-28 14:33:09 -07:00
Johnnie Birch	8bbe6a25a9	Add support for packed float to signed int conversion Implements i32x4.trunc_sat_f32x4_s	2020-10-28 13:02:50 -07:00
Johnnie Birch	97392eae3d	Adds support for converting packed unsigned integer to packed float	2020-10-28 13:02:50 -07:00
Chris Fallin	c35904a8bf	Merge pull request #2278 from akirilov-arm/load_splat Introduce the Cranelift IR instruction `LoadSplat`	2020-10-28 12:54:03 -07:00
Leonardo Yvens	bde9555793	Add Trap::trap_code (#2309) * add Trap::trap_code * Add non-exhaustive wasmtime::TrapCode * wasmtime: Better document TrapCode * move and refactor test	2020-10-27 16:30:45 -05:00
Julian Seward	c15d9bd61b	CL/aarch64: implement the wasm SIMD pseudo-max/min and FP-rounding instructions This patch implements, for aarch64, the following wasm SIMD extensions Floating-point rounding instructions https://github.com/WebAssembly/simd/pull/232 Pseudo-Minimum and Pseudo-Maximum instructions https://github.com/WebAssembly/simd/pull/122 The changes are straightforward: * `build.rs`: the relevant tests have been enabled * `cranelift/codegen/meta/src/shared/instructions.rs`: new CLIF instructions `fmin_pseudo` and `fmax_pseudo`. The wasm rounding instructions do not need any new CLIF instructions. * `cranelift/wasm/src/code_translator.rs`: translation into CLIF; this is pretty much the same as any other unary or binary vector instruction (for the rounding and the pmin/max respectively) * `cranelift/codegen/src/isa/aarch64/lower_inst.rs`: - `fmin_pseudo` and `fmax_pseudo` are converted into a two instruction sequence, `fcmpgt` followed by `bsl` - the CLIF rounding instructions are converted to a suitable vector `frint{n,z,p,m}` instruction. * `cranelift/codegen/src/isa/aarch64/inst/mod.rs`: minor extension of `pub enum VecMisc2` to handle the rounding operations. And corresponding `emit` cases.	2020-10-26 10:37:07 +01:00
Yury Delendik	de4af90af6	machinst x64: New backend unwind (#2266) Addresses unwind for experimental x64 backend. The preliminary code enables backtrace on SystemV calling convention.	2020-10-23 15:19:41 -05:00
Julian Seward	2702942050	CL/aarch64 back end: implement the wasm SIMD `bitmask` instructions The `bitmask.{8x16,16x8,32x4}` instructions do not map neatly to any single AArch64 SIMD instruction, and instead need a sequence of around ten instructions. Because of this, this patch is somewhat longer and more complex than it would be for (eg) x64. Main changes are: * the relevant testsuite test (`simd_boolean.wast`) has been enabled on aarch64. * at the CLIF level, add a new instruction `vhigh_bits`, into which these wasm instructions are to be translated. * in the wasm->CLIF translation (code_translator.rs), translate into `vhigh_bits`. This is straightforward. * in the CLIF->AArch64 translation (lower_inst.rs), translate `vhigh_bits` into equivalent sequences of AArch64 instructions. There is a different sequence for each of the `{8x16, 16x8, 32x4}` variants. All other changes are AArch64-specific, and add instruction definitions needed by the previous step: * Add two new families of AArch64 instructions: `VecShiftImm` (vector shift by immediate) and `VecExtract` (effectively a double-length vector shift) * To the existing AArch64 family `VecRRR`, add a `zip1` variant. To the `VecLanesOp` family add an `addv` variant. * Add supporting code for the above changes to AArch64 instructions: - getting the register uses (`aarch64_get_regs`) - mapping the registers (`aarch64_map_regs`) - printing instructions - emitting instructions (`impl MachInstEmit for Inst`). The handling of `VecShiftImm` is a bit complex. - emission tests for new instructions and variants.	2020-10-23 05:26:25 +02:00
Yury Delendik	b10e027fef	Refactor UnwindInfo codes and frame_register (#2307) * Refactor UnwindInfo codes and frame_register * use isa word_size * fix filetests * Add comment about UnwindCode::PushRegister	2020-10-22 14:52:42 -05:00
Johnnie Birch	f27c0f3434	Adds support for signed packed integer conversion to float f32x4.convert_i32x4_s	2020-10-16 14:16:53 -07:00
Yury Delendik	3c68845813	Cranelift: refactoring of unwind info (#2289) * factor common code * move fde/unwind emit to more abstract level * code_len -> function_size * speedup block scanning * better function_size calculation * Rename UnwindCode enums	2020-10-15 08:34:50 -05:00
Andrew Brown	a26e9e9a20	[machinst x64]: lower load_splat using memory addressing	2020-10-14 09:43:33 -07:00
Andrew Brown	d990dd4c9a	[machinst x64]: add source locations to more instruction formats In order to register traps for `load_splat`, several instruction formats need knowledge of `SourceLoc`s; however, since the x64 backend does not correctly and completely register traps for `RegMem::Mem` variants I opened https://github.com/bytecodealliance/wasmtime/issues/2290 to discuss and resolve this issue. In the meantime, the current behavior (i.e. remaining largely unaware of `SourceLoc`s) is retained.	2020-10-14 09:43:33 -07:00
Anton Kirilov	e0b911a4df	Introduce the Cranelift IR instruction `LoadSplat` It corresponds to WebAssembly's `load*_splat` operations, which were previously represented as a combination of `Load` and `Splat` instructions. However, there are architectures such as Armv8-A that have a single machine instruction equivalent to the Wasm operations. In order to generate it, it is necessary to merge the `Load` and the `Splat` in the backend, which is not possible because the load may have side effects. The new IR instruction works around this limitation. The AArch64 backend leverages the new instruction to improve code generation. Copyright (c) 2020, Arm Limited.	2020-10-14 13:07:13 +01:00
Nick Fitzgerald	c2d01fe56f	Merge pull request #2257 from fitzgen/peepmatic-no-paths-in-linear-ir Peepmatic: Do not use paths in linear IR	2020-10-13 12:18:26 -07:00
Nick Fitzgerald	c015d69eb8	peepmatic: Do not use paths in linear IR Rather than using paths from the root instruction to the instruction we are matching against or checking if it is constant or whatever, use temporary variables. When we successfully match an instruction's opcode, we simultaneously define these temporaries for the instruction's operands. This is similar to how open-coding these matches in Rust would use `match` expressions with pattern matching to bind the operands to variables at the same time. This saves about 1.8% of instructions retired when Peepmatic is enabled.	2020-10-13 11:03:48 -07:00
Andrew Brown	1799b0947f	[machinst x64]: implement packed bitselect	2020-10-09 10:04:50 -07:00
Andrew Brown	95f0e96e62	[machinst x64]: implement packed not This begins to use `Inst` helper functions as discussed in #2252.	2020-10-09 10:04:50 -07:00
Andrew Brown	3c55523d40	[machinst x64]: implement packed and, and_not, xor, or	2020-10-09 10:04:50 -07:00
Benjamin Bouvier	e8c2a1763a	machinst x64: avoid emitting movzx when the input is an ALU 32-bits operation;	2020-10-09 18:49:27 +02:00
Benjamin Bouvier	3980a43cda	machinst x64: use the (base,offset) addressing mode even in the presence of a uextend;	2020-10-09 18:49:27 +02:00
Andrew Brown	c8cce5d2d7	[machinst x64]: enable packed saturated arithmetic	2020-10-08 08:46:20 -07:00
Benjamin Bouvier	116acb8dcd	machinst x64: emit nop of variable sizes;	2020-10-08 10:05:57 +02:00
Benjamin Bouvier	a470f1e0cd	machinst x64: remove dead code and allow(dead_code) annotation; The BranchTarget is always used as a label, so just use a plain MachLabel in this case.	2020-10-08 10:05:57 +02:00
Benjamin Bouvier	e32e6fb612	machinst x64: check SSE requirements for instructions against enabled features;	2020-10-08 09:21:51 +02:00
Benjamin Bouvier	c5bbc87498	machinst: allow passing constant information to the instruction emitter; A new associated type Info is added to MachInstEmit, which is the immutable counterpart to State. It can't easily be constructed from an ABICallee, since it would require adding an associated type to the latter, and doing so leaks the associated type in a lot of places in the code base and makes the code harder to read. Instead, the EmitInfo state can simply be passed to the `Vcode::emit` function directly.	2020-10-08 09:21:51 +02:00
Andrew Brown	3778fa025c	Switch DataValue to use Ieee32/Ieee64 As discussed in #2251, in order to be very confident that NaN signaling bits are correctly handled by the compiler, this switches `DataValue` to use Cranelift's `Ieee32` and `Ieee64` structures. This makes it a bit more inconvenient to interpret Cranelift FP operations but this should change to something like `rustc_apfloat` in the future.	2020-10-07 12:17:17 -07:00
Andrew Brown	ce44719e1f	refactor: change LowerCtx::get_immediate to return a DataValue This change abstracts away (from the perspective of the new backend) how immediate values are stored in InstructionData. It gathers large immediates from necessary places (e.g. constant pool) and delegates to `InstructionData::imm_value` for the rest. This refactor only touches original users of `LowerCtx::get_immediate` but a future change could do the same for any place the new backend is accessing InstructionData directly to retrieve immediates.	2020-10-07 12:17:17 -07:00
Andrew Brown	3a2025fdc7	Add InstructionData::imm_value()	2020-10-07 12:17:17 -07:00
Andrew Brown	6f6f79ef2b	refactor: move DataValue from cranelift-reader to cranelift-codegen This is no change to functionality; the move is necessary in order to return InstructionData immediates in a structured way (see next commit).	2020-10-07 12:17:17 -07:00
Benjamin Bouvier	84ac3feef8	machinst x64: use zero-latency move instructions for f32/f64; As found by @julian-seward1, movss/movsd aren't included in the zero-latency move instructions section of the Intel optimization manual. Use MOVAPS instead for those moves.	2020-10-07 10:55:44 +02:00
Chris Fallin	71768bb6cf	Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs. This PR updates the AArch64 ABI implementation so that it (i) properly respects that v8-v15 inclusive have callee-save lower halves, and caller-save upper halves, by conservatively approximating (to full registers) in the appropriate directions when generating prologue caller-saves and when informing the regalloc of clobbered regs across callsites. In order to prevent saving all of these vector registers in the prologue of every non-leaf function due to the above approximation, this also makes use of a new regalloc.rs feature to exclude call instructions' writes from the clobber set returned by register allocation. This is safe whenever the caller and callee have the same ABI (because anything the callee could clobber, the caller is allowed to clobber as well without saving it in the prologue). Fixes #2254.	2020-10-06 14:44:02 -07:00
Benjamin Bouvier	df8f85f4bc	machinst x64: remove non_camel_case_types;	2020-10-05 17:44:31 +02:00
Benjamin Bouvier	4a10a78e33	machinst x64: remove non_snake_case;	2020-10-05 17:44:31 +02:00
Johnnie Birch	7b4d173b90	Adds packed floating point min/max for X64 for the new backend Allows for simd_f32x4 and simd_f64x2 spec tests	2020-10-02 16:20:10 -07:00
Chris Fallin	3ca173e4bc	Fix arm32 build after some ABI framework changes. It turns out that while we don't have the partial/experimental arm32 backend tested on our CI yet, the Firefox build does at least rely on the backend to build, because it specifies the `arm32` feature to `cranelift-codegen`, even if it will never invoke the backend. Our previous old-framework arm32 stub at least compiled, so it didn't break Firefox. We should probably add a CI build check to ensure we don't bitrot what we have here, but this is the immediate fix to get us back to sanity.	2020-10-02 11:55:46 -07:00
Chris Fallin	b2f52910fb	Merge pull request #2224 from jgouly/sp_adjust arm64: Use SignedOffset rather than PreIndexed addressing mode for ca…	2020-10-02 09:18:00 -07:00
Andrew Brown	ca1b76421a	[machinst x64]: remove duplicate code to insert a lane	2020-10-02 08:29:31 -07:00
Andrew Brown	c42a097a0c	[machinst x64]: use `is64` instead of `w_bit`	2020-10-02 08:29:31 -07:00
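
For reference, the lane-wise behaviour of the `i32x4.dot_i16x8_s` instruction added in commit 5a5fb11979 can be sketched in plain Rust. This is only an illustration of the wasm SIMD semantics, not code from Cranelift: the function name `dot_i16x8_s` and the use of fixed-size arrays in place of 128-bit vectors are assumptions made for the example.

```rust
/// Illustrative model of wasm `i32x4.dot_i16x8_s` (hypothetical helper,
/// not part of Cranelift or wasmtime).
///
/// Each pair of adjacent signed 16-bit lanes is sign-extended to 32 bits,
/// multiplied lane-wise, and the two products are added into one 32-bit
/// output lane. The addition wraps on overflow.
pub fn dot_i16x8_s(a: [i16; 8], b: [i16; 8]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        // Widening multiplies: an i16 * i16 product always fits in an i32.
        let even = i32::from(a[2 * i]) * i32::from(b[2 * i]);
        let odd = i32::from(a[2 * i + 1]) * i32::from(b[2 * i + 1]);
        // The pairwise sum can overflow only when all four inputs are
        // i16::MIN; the result wraps in that case.
        out[i] = even.wrapping_add(odd);
    }
    out
}
```

Calling `dot_i16x8_s([1, 2, 3, 4, 5, 6, 7, 8], [1; 8])` pairs up adjacent lanes and yields `[3, 7, 11, 15]`. The commit message describes the AArch64 lowering as `smull ; smull2 ; addv`; the sketch models only the result, not that instruction sequence.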

1271 Commits