wasmtime

Author	SHA1	Message	Date
Dan Gohman	5dc449ec9e	Rename "local variables" to "explicit stack slots". The term "local variables" predated the SSA builder in the front-end crate, which also provides a way to implement source-language local variables. The name "explicit stack slot" makes it clear what this construct is.	2018-02-28 14:04:28 -08:00
Julian Seward	7054f25abb	Adds support to transform integer div and rem by constants into cheaper equivalents. Adds support for transforming integer division and remainder by constants into sequences that do not involve division instructions. * div/rem by constant powers of two are turned into right shifts, plus some fixups for the signed cases. * div/rem by constant non-powers of two are turned into double length multiplies by a magic constant, plus some fixups involving shifts, addition and subtraction, that depend on the constant, the word size and the signedness involved. * The following cases are transformed: div and rem, signed or unsigned, 32 or 64 bit. The only un-transformed cases are: unsigned div and rem by zero, signed div and rem by zero or -1. * This is all incorporated within a new transformation pass, "preopt", in lib/cretonne/src/preopt.rs. * In preopt.rs, fn do_preopt() is the main driver. It is designed to be extensible to transformations of other kinds of instructions. Currently it merely uses a helper to identify div/rem transformation candidates and another helper to perform the transformation. * In preopt.rs, fn get_div_info() pattern matches to find candidates, both cases where the second arg is an immediate, and cases where the second arg is an identifier bound to an immediate at its definition point. * In preopt.rs, fn do_divrem_transformation() does the heavy lifting of the transformation proper. It in turn uses magic{S,U}{32,64} to calculate the magic numbers required for the transformations. * There are many test cases for the transformation proper: filetests/preopt/div_by_const_non_power_of_2.cton filetests/preopt/div_by_const_power_of_2.cton filetests/preopt/rem_by_const_non_power_of_2.cton filetests/preopt/rem_by_const_power_of_2.cton filetests/preopt/div_by_const_indirect.cton preopt.rs also contains a set of tests for magic number generation. * The main (non-power-of-2) transformation requires instructions that return the high word of a double-length multiply. For this, instructions umulhi and smulhi have been added to the core instruction set. These will map directly to single instructions on most non-intel targets. * intel does not have an instruction exactly like that. For intel, instructions x86_umulx and x86_smulx have been added. These map to real instructions and return both result words. The intel legaliser will rewrite {s,u}mulhi into x86_{s,u}mulx uses that throw away the lower half word. Tests: filetests/isa/intel/legalize-mulhi.cton (new file) filetests/isa/intel/binary64.cton (added x86_{s,u}mulx encoding tests)	2018-02-28 11:41:36 -08:00
Dan Gohman	ab9298eafa	Make the `fst` recipe use the deref-safe register class as well.	2018-02-28 10:12:40 -08:00
Jakob Stoklund Olesen	8d388b2218	Fix stack pointer offsets for outgoing arguments. StackSlotKind::OutgoingArg stack slots have an offset that is relative to our own stack pointer, while all other stack slot kinds have offsets that are relative to the caller's stack pointer. Make sure we generate the right sp-relative offsets for outgoing arguments too.	2018-02-21 10:34:41 -08:00
Dan Gohman	10dcfcacdb	Remove support for entity variables in filecheck. Now that the parser doesn't renumber indices, there's no need for entity variables like $v0.	2018-02-20 17:27:46 -08:00
Dan Gohman	a5b00b173e	Don't renumber entities in the parser. This makes it easier to debug testcases: - the entity numbers in a .cton file match the entity numbers used within Cretonne. - serializing and deserializing doesn't cause indices to change. One disadvantage is that if a .cton file uses sparse entity numbers, deserializing to the in-memory form doesn't compact it. However, the text format is not intended to be performance-critical, so this isn't expected to be a big burden.	2018-02-20 17:27:46 -08:00
Jakob Stoklund Olesen	b9b1d0fcd5	Add a trapff instruction. This is the floating point equivalent of trapif: Trap when a given condition is in the floating-point flags. Define Intel encodings comparable to the trapif encodings.	2018-02-20 14:35:41 -08:00
Jakob Stoklund Olesen	ad896d9790	Add more legalization patterns for *_imm instructions. When the immediate value is out of range for the legal encodings, convert these instructions to an iconst followed by their register counterparts.	2018-02-20 10:47:46 -08:00
Jakob Stoklund Olesen	a9e799debb	Add an avoid_div_traps setting. This enables code generation that never causes a SIGFPE signal to be raised from a division instruction. Instead, division and remainder calculations are protected by explicit traps.	2018-02-16 13:10:29 -08:00
Jakob Stoklund Olesen	3ccc3f4f9b	Add a stack_check instruction. This instruction loads a stack limit from a global variable and compares it to the stack pointer, trapping if the stack has grown beyond the limit. Also add an expand_flags transform group containing legalization patterns for ISAs with CPU flags. Fixes #234.	2018-02-13 10:48:06 -08:00
Jakob Stoklund Olesen	60e70da0e6	Add Intel encodings for ifcmp_imm. The instruction set has variants with 8-bit and 32-bit signed immediate operands. Add a TODO to use a TEST instruction for the special case ifcmp_imm x, 0.	2018-02-13 10:38:46 -08:00
Jakob Stoklund Olesen	788a78caf4	Add Intel encodings for ifcmp_sp. Also generate an Into<RegUnit> implementation for the RU enums.	2018-02-09 14:32:29 -08:00
Jakob Stoklund Olesen	69f70fc61d	Add Intel encodings for trapif. This is implemented as a macro with a conditional jump over a ud2. This way, we don't have to split up EBBs at every conditional trap.	2018-02-08 15:15:15 -08:00
Julian Seward	6f8a54b6a5	Adds support for legalizing CLZ, CTZ and POPCOUNT on baseline x86_64 targets. Changes: * Adds a new generic instruction, SELECTIF, that does value selection (a la conditional move) similarly to existing SELECT, except that it is controlled by condition code input and flags-register inputs. * Adds a new Intel x86_64 variant, 'baseline', that supports SSE2 and nothing else. * Adds new Intel x86_64 instructions BSR and BSF. * Implements generic CLZ, CTZ and POPCOUNT on x86_64 'baseline' targets using the new BSR, BSF and SELECTIF instructions. * Implements SELECTIF on x86_64 targets using conditional-moves. * new test filetests/isa/intel/baseline_clz_ctz_popcount.cton (for legalization) * new test filetests/isa/intel/baseline_clz_ctz_popcount_encoding.cton (for encoding) * Allow lib/cretonne/meta/gen_legalizer.py to generate non-snake-caseified Rust without rustc complaining. Fixes #238.	2018-02-06 09:43:00 -08:00
Tyler McMullen	14e39db428	Add filetest for statically out-of-bound heap addresses.	2018-01-18 15:49:10 -08:00
Tyler McMullen	df210bfdea	Fix the Intel x64 PIC 'call' test, adding correct addend.	2018-01-18 14:23:00 -08:00
Dan Gohman	4f53cc1dad	Align IntelGOTPCRel4 with R_X86_64_GOTPCREL. Add an addend field to reloc_external, and use it to move the responsibility for accounting for the difference between the end of an instruction (where the PC is considered to be in PC-relative on intel) and the beginning of the immediate field into the encoding code. Specifically, this makes IntelGOTPCRel4 directly correspond to R_X86_64_GOTPCREL, instead of also carrying an implicit `- 4`.	2017-12-15 16:17:32 -06:00
Dan Gohman	76e31cc1ad	Rename GotPCRel4 to GOTPCRel4. This emphasizes that GOT is being used as an abbreviation rather than the word "got".	2017-12-15 16:17:32 -06:00
Pat Hickey	ed81bc21be	filetests: add filetests for intel PIC encodings	2017-12-12 19:29:52 -08:00
Jakob Stoklund Olesen	a7eb13a151	Expand unknown instructions to runtime library calls.	2017-12-08 10:37:50 -08:00
Jakob Stoklund Olesen	f03729d742	Fix generated code for ISA predicates on encoding recipes. The generated code had syntax errors and inverted logic. Add an SSE 4.1 requirement to the floating point rounding instructions.	2017-12-08 10:37:50 -08:00
Tyler McMullen	7988d0c54c	Add 8-bit variation of adjust_sp_imm for 32-bit and 64-bit Intel.	2017-12-05 11:49:12 -08:00
Tyler McMullen	5783ea2c9a	Account for return address when reserving stack space for CSRs.	2017-12-05 11:49:12 -08:00
Tyler McMullen	a75248d2cf	Move the initial stack pointer adjustment to after the CSR pushes.	2017-12-05 11:49:12 -08:00
Tyler McMullen	ebcbd54f61	Add 'compile' test and confirm the pro/epilogue is added. Fix regression this revealed.	2017-12-05 11:49:12 -08:00
Tyler McMullen	ced39f5186	Fix up adjust_sp_imm instruction. * Use imm64 rather than offset32 * Add predicate to enforce signed 32-bit limit to imm * Remove AdjustSpImm format * Add encoding tests for adjust_sp_imm * Adjust use of adjust_sp_imm in Intel prologue_epilogue to match	2017-12-05 11:49:12 -08:00
Tyler McMullen	1a11c351b5	Add tests and documentation for x86_(push|pop). Fix up encoding issues revealed by tests.	2017-12-05 11:49:12 -08:00
Tyler McMullen	3b1b33e0ac	Add docs and tests for copy_special instruction. Fixes encoding issue that tests revealed.	2017-12-05 11:49:12 -08:00
Tyler McMullen	6ec4bfc4ca	Fix up the encodings for new instructions, both expected and actual. Make the test more accurate.	2017-12-05 11:49:12 -08:00
Tyler McMullen	fdfe24760a	Add missing newline to prologue epilogue test	2017-12-05 11:49:12 -08:00
Tyler McMullen	d4311d2b1d	Add prologue-epilogue test that exercises new instructions and binary emission.	2017-12-05 11:49:12 -08:00
Pat Hickey	b5601d57c8	filetests: change hex function names to user function numbers	2017-11-23 14:08:47 -08:00
Dan Gohman	5d063eb8bc	Merge reloc_func and reloc_globalsym into reloc_external.	2017-10-31 12:26:33 -07:00
Dan Gohman	9c54c3fff0	Introduce globalsym_addr. This is an instruction used in legalization of GlobalVarData::Sym global variables.	2017-10-30 13:26:56 -07:00
Dan Gohman	cb805f704d	Put BaldrMonkey-specific behavior under a setting. BaldrMonkey will need to enable allones_funcaddrs.	2017-10-30 13:26:56 -07:00
Jakob Stoklund Olesen	1b71285b34	Return bools in GPR registers. Boolean types are returned in %rax, so regclass_for_abi_type() should return GPR. Fixes #179.	2017-10-25 13:34:55 -07:00
Jakob Stoklund Olesen	5d065c4d8f	Add encodings for CPU flags instructions. Branch on flags: brif, brff. Compare integers to flags: ifcmp. Compare floats to flags: ffcmp. Convert flags to b1: trueif, trueff.	2017-10-16 13:07:23 -07:00
Jakob Stoklund Olesen	ba52a38597	Add a t8jccd_long encoding recipe for brz.b1 and brnz.b1 in 32-bit mode. The register allocator can't handle branches with constrained register operands, and the brz.b1/brnz.b1 instructions only have the t8jccd_abcd in 32-bit mode where no REX prefixes are possible. This adds a worst case encoding for those cases where a b1 value lives in a non-ABCD register.	2017-10-11 14:20:43 -07:00
Jakob Stoklund Olesen	73d4bb47c0	Intel encodings for regspill and regfill. These are always SP-based.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	ef048b8899	Allow for call args in incoming stack slots. A value passed as an argument to a function call may live in an incoming stack slot initially. Fix the call legalizer so it copies such an argument into the expected outgoing stack slot for the call.	2017-10-03 11:13:46 -07:00
Jakob Stoklund Olesen	a274cdf275	Fix the Intel encoding of band_not. The andnps instruction inverts its first argument while band_not inverts its second argument. Use a swapped-operands "fax" encoding recipe.	2017-09-27 18:14:13 -07:00
Jakob Stoklund Olesen	84471a8431	Add some very basic support for the Intel32 ABI. In 32-bit mode, all function arguments are passed on the stack, not in registers. This ABI support is not complete or properly tested, but at least it doesn't try to pass arguments in r8.	2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen	b6b474a8c9	Add Intel legalization for fmin and fmax. The native x86_fmin and x86_fmax instructions don't behave correctly for NaN inputs and when comparing +0.0 to -0.0, so we need separate branches for those cases.	2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen	44eab3e158	Add Intel regmove encodings for floating point types.	2017-09-27 12:49:54 -07:00
Jakob Stoklund Olesen	1fe7890700	Add x86_fmin and x86_fmax instructions. These Intel-specific instructions represent the semantics of the minss / maxss Intel instructions which behave more like a C ternary operator than the WebAssembly fmin and fmax instructions. They will be used as building blocks for implementing the WebAssembly semantics.	2017-09-27 09:17:09 -07:00
Jakob Stoklund Olesen	ac69f3bfdf	Add an Intel-specific x86_cvtt2si instruction. This is used to represent the non-trapping semantics of the cvttss2si and cvttsd2si instructions (and their vectorized counterparts). The overflow behavior of this instruction is specific to the Intel ISAs. There is no float-to-i64 instruction on the 32-bit Intel ISA.	2017-09-26 15:44:41 -07:00
Jakob Stoklund Olesen	6ff681a90d	Add general legalization for the select instruction.	2017-09-26 14:16:35 -07:00
Jakob Stoklund Olesen	ce767be703	Intel encodings for floating point copies.	2017-09-26 13:54:38 -07:00
Jakob Stoklund Olesen	7fb6159a85	Add Intel encodings for the fcmp instruction. Not all floating point condition codes are directly supported by the ucomiss/ucomisd instructions. Some inequalities need to be reversed and eq+ne require two separate tests.	2017-09-26 11:17:32 -07:00
Jakob Stoklund Olesen	6bec5f8507	Intel encodings for nearest/floor/ceil/trunc. These floating point rounding operations all use the roundss/roundsd instructions that are available in SSE 4.1.	2017-09-25 15:08:04 -07:00
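
Commit 7054f25abb describes replacing division by a constant with shifts and a high-word multiply. The following is a minimal standalone sketch of that idea in Rust, not the code in preopt.rs. The function names `umulhi`, `udiv3`, `udiv7` and `sdiv_pow2` are hypothetical helpers chosen for illustration, and the magic constants are the well-known 32-bit values for 3 and 7 rather than output of the commit's magic{S,U}{32,64} routines.

```rust
/// High 32 bits of the 64-bit product of two 32-bit values.
/// This models the `umulhi` operation the commit message mentions.
fn umulhi(a: u32, b: u32) -> u32 {
    ((a as u64 * b as u64) >> 32) as u32
}

/// Unsigned division by 3: high-word multiply by a magic constant, then shift.
fn udiv3(n: u32) -> u32 {
    umulhi(n, 0xAAAA_AAAB) >> 1
}

/// Unsigned division by 7: this constant needs the extra "add" fixup,
/// because the exact magic multiplier does not fit in 32 bits.
fn udiv7(n: u32) -> u32 {
    let q = umulhi(n, 0x2492_4925);
    // q <= n always holds here, so the subtraction cannot underflow.
    (((n - q) >> 1) + q) >> 2
}

/// Signed division by 2^k (1 <= k <= 31), rounding toward zero.
/// Negative inputs get a bias of 2^k - 1 before the arithmetic shift.
fn sdiv_pow2(x: i32, k: u32) -> i32 {
    let bias = ((x >> 31) as u32 >> (32 - k)) as i32;
    (x + bias) >> k
}

fn main() {
    for n in [0u32, 1, 2, 3, 6, 7, 13, 14, 1_000_003, u32::MAX] {
        assert_eq!(udiv3(n), n / 3);
        assert_eq!(udiv7(n), n / 7);
    }
    for x in [-9i32, -8, -7, -1, 0, 1, 7, 8, i32::MIN, i32::MAX] {
        assert_eq!(sdiv_pow2(x, 2), x / 4);
    }
    println!("all division identities hold");
}
```

The signed case shows why the commit mentions "fixups": a plain arithmetic shift rounds toward negative infinity, while integer division rounds toward zero.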

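
Commits 1fe7890700 and b6b474a8c9 note that the native minss behavior resembles a C ternary and needs extra handling for NaN inputs and for +0.0 versus -0.0. The sketch below illustrates that difference in plain Rust. It is an assumption-laden model, not the legalization pattern from the commit: `x86_fmin` and `wasm_fmin` are hypothetical function names, and the fixups shown (adding the operands for NaN, OR-ing bit patterns for equal values) are one common way to get the WebAssembly result.

```rust
/// Models minss: behaves like the ternary `x < y ? x : y`, so the second
/// operand is returned whenever the comparison is false (NaN, equal zeros).
fn x86_fmin(x: f32, y: f32) -> f32 {
    if x < y { x } else { y }
}

/// WebAssembly-style fmin built on top of the ternary-like primitive.
fn wasm_fmin(x: f32, y: f32) -> f32 {
    if x.is_nan() || y.is_nan() {
        // Unordered case: adding the operands propagates a NaN.
        x + y
    } else if x == y {
        // Equal case, which includes +0.0 vs -0.0: OR the bit patterns so
        // that a set sign bit survives and the result is -0.0.
        f32::from_bits(x.to_bits() | y.to_bits())
    } else {
        x86_fmin(x, y)
    }
}

fn main() {
    // The primitive is asymmetric for NaN and signed zeros.
    assert_eq!(x86_fmin(f32::NAN, 1.0), 1.0);
    assert!(x86_fmin(1.0, f32::NAN).is_nan());
    assert!(x86_fmin(-0.0, 0.0).is_sign_positive());

    // The wrapped version is symmetric.
    assert!(wasm_fmin(f32::NAN, 1.0).is_nan());
    assert!(wasm_fmin(1.0, f32::NAN).is_nan());
    assert!(wasm_fmin(-0.0, 0.0).is_sign_negative());
    assert!(wasm_fmin(0.0, -0.0).is_sign_negative());
    assert_eq!(wasm_fmin(1.0, 2.0), 1.0);
    println!("fmin semantics check passed");
}
```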

93 Commits