wasmtime

Author	SHA1	Message	Date
Jakob Stoklund Olesen	dbaa919ca9	Make room for SpecialType in the value type numbering. The value types are now classified into three groups: 1. Lane types are scalar types that can also be used to form vectors. 2. Vector types are 2-256 copies of a lane type. 3. Special types. This is where the CPU flag types will go. The special types can't be used to form vectors. Change the numbering scheme for value types to make room for the special types and add `is_lane()` and `is_special()` classification methods. The VOID type still has number 0, but it can no longer appear as a vector lane. It classifies as special now.	2017-10-12 12:48:55 -07:00
Jakob Stoklund Olesen	89a24b2f13	Rename ScalarType to LaneType. The word "scalar" is a bit vague and tends to mean "non-vector". Since we are about to add new CPU flag value types that can't appear as vector lanes, make the distinction clear: LaneType represents value types that can appear as a vector lane. Also replace the Type::is_scalar() method with an is_vector() method.	2017-10-12 10:39:12 -07:00
Jakob Stoklund Olesen	994af598f5	Avoid interference on CFG edges. Track allocatable registers both locally and globally: Add a second AllocatableSet which tracks registers allocated to global values without accounting for register diversions. Since diversions are only local to an EBB, global values must be assigned un-diverted locations that don't interfere. Handle the third "global" interference domain in the constraint solver in addition to the existing "input" and "output" domains. Extend the solver error code to indicate when a global define just can't be allocated because there are not enough available global registers. Resolve this problem by replacing the instruction's global defines with local defines that are copied into their global destinations afterwards.	2017-10-11 15:38:30 -07:00
Jakob Stoklund Olesen	699cb9895e	Enforce a 4-byte minimum spill slot size. This is primarily for the benefit of 32-bit x86 code which can't spill 1-byte types from arbitrary registers. This makes it possible to use 32-bit writes to spill types like b1 and i8. These small types are expected to be very rare since WebAssembly doesn't have them, and we tend to push integer arithmetic to at least i32. The effect on frame sizes should be minimal.	2017-10-11 14:20:43 -07:00
Jakob Stoklund Olesen	1a04c4260f	Remove an unused import to silence a compiler warning.	2017-10-11 14:20:43 -07:00
Dan Gohman	3f30171b79	Actually disable simple_gvn and licm by default. See https://github.com/stoklund/cretonne/pull/164#discussion_r142449999 for details.	2017-10-10 16:28:29 -07:00
Jakob Stoklund Olesen	90ed698e83	Add an unreachable code elimination pass. The register allocator doesn't even try to compile unreachable EBBs, so any values defined in such blocks won't be assigned registers. Since the dominator tree already has determined which EBBs are reachable, we should just eliminate any unreachable blocks instead of trying to do something with the dead code. Note that this is not a "dead code elimination" pass which would also remove individual instructions whose results are not used.	2017-10-09 15:26:27 -07:00
Dan Gohman	6aeeaebbd3	Disallow branching to the entry block. Functions that would otherwise start with a loop should start with a separate ebb which just branches to the header of the loop.	2017-10-09 15:02:17 -07:00
Jakob Stoklund Olesen	893a6716c6	Enforce all instruction constraints in iterate_solution(). During iterate_solution(), live-through values may be converted to solver variables so they can be moved out of the way in order to satisfy all constraints. Make sure that the instruction's operand constraints are also considered for these new variables. Add a program_complete_input_constraints() which turns all the instruction's input operands into variables with the proper constraints. That makes it safe for try_add_var() to re-add these values as variables with looser generic constraints. The solver's add_var() function is split into three functions: add_var for use before inputs_done(), and add_killed_var/add_through_var for use after.	2017-10-09 14:08:37 -07:00
Jakob Stoklund Olesen	4a2bf6d9a6	Use a more compact display of AllocatableSet. Since only Intel uses named registers, we can use a one-char shorthand for the registers.	2017-10-09 14:08:37 -07:00
Jakob Stoklund Olesen	ac8c8a676a	Constrain solver variables as little as possible. When solver variables represent operands on the current instruction, they need to be constrained as required by the instructions, but variables that are simply moved out of the way should only be constrained to their top-level register class. The live range affinity is just a hint, not a requirement.	2017-10-09 14:08:37 -07:00
Jakob Stoklund Olesen	12a8d6cce1	Avoid diverting values that are live on an outgoing CFG edge. When try_add_var is looking for values that can be moved out of the way in order to satisfy constraints for the current instruction, avoid values that are live on a CFG edge originating at the current (branch) instruction. These values must be in their globally assigned location when entering the branch destination EBB. This is covered by the existing regalloc/iterate.cton test case which fails with an upcoming commit.	2017-10-09 14:08:37 -07:00
Dan Gohman	0c4500897f	Clarify FunctionName's role in its comment.	2017-10-09 13:50:03 -07:00
Jakob Stoklund Olesen	b3fa47cacc	Add support for emergency spill slots. - Create a new kind of stack slot: emergency_slot. - Add a get_emergency_slot() method which finds a suitable emergency slot given a list of slots already in use. - Use emergency spill slots when schedule_moves needs them.	2017-10-06 10:45:13 -07:00
Jakob Stoklund Olesen	d0b4c76262	Use a non-allocating sort algorithm. The sort_unstable* functions are available in stable Rust now. These functions never allocate memory.	2017-10-06 09:21:30 -07:00
Jakob Stoklund Olesen	b562fdcd5c	Remove the dfg::resolve_copies() method. This method was important back when result values couldn't be moved between instructions. Now that results can be moved, value aliases do everything we need. Copy instructions are still used to break interferences in the register allocator's coalescing phase, but there isn't really any reason to use a copy instruction over a value alias anywhere else. After and during register allocation, copy instructions are significant, so we never want to "see through" them like the resolve_copies() function did. This is related to #166, but probably doesn't fix the underlying problem.	2017-10-05 14:46:34 -07:00
Jakob Stoklund Olesen	30aeb57083	Add a value location verifier. This is a verification pass that can be run after register allocation. It verifies that value locations are consistent with constraints on their uses, and that the register diversions are consistent. Make it clear that register diversions are local to an EBB only. This affects what branch relaxation is allowed to do. The verify_locations() takes an optional Liveness parameter which is used to check that no diverted values are live across CFG edges.	2017-10-05 13:59:18 -07:00
Jakob Stoklund Olesen	73d4bb47c0	Intel encodings for regspill and regfill. These are always SP-based.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	826d4062fb	Apply register diversions during binemit tests. When "binemit" tests encode instructions, keep track of the current set of register diversions, and use the diverted locations to check operand constraints. This matches how constraints are applied during a real binemit phase.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	dda3efcbdd	Add regspill and regfill instructions. These are parallels to the existing regmove instruction, but they divert the value to and from a stack slot. Like regmove diversions, this is a temporary diversion that must be local to the EBB.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	d4aeec6ece	Generalize RegDiversions to track stack locations too. For emergency spilling, we need to be able to temporarily divert an SSA value to a stack slot if there are no available registers.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	e32aa8ab60	Emergency spilling for the solver's move scheduler. The register constraint solver schedules a set of move instructions to execute before the instruction getting colored. In extreme cases, this is not possible because there are no available registers to break cycles in the register assignments that must be scheduled. When that happens, we spill one register to an emergency slot so it becomes available for implementing the assignment cycle. Then the original register is restored. The coloring pass can't yet understand the spill and fill move types. This will be implemented next.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	ce4d723a73	Give RegClassData a reference to its parent RegInfo. This makes it possible to materialize new RegClass references without requiring a RegInfo reference to be passed around. - Move the RegInfo::toprc() method to RegClassData. - Rename RegClassData::intersect() to intersect_index() and provide a new intersect() which returns a register class. - Remove some &RegInfo parameters that are no longer needed.	2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen	fb0999ce33	Check the top-level register class for available registers. Fixes #165. The constraint solver's schedule_move() function sometimes needs to use an extra available register when the moves to be scheduled contain cycles. The pending moves have associated register classes that come from the constraint programming. Since the moves have hard-coded to and from registers, these register classes are only meant to indicate the register sizes. In particular, we can use the whole top-level register class when scavenging for a spare register to break a cycle.	2017-10-03 14:12:18 -07:00
Jakob Stoklund Olesen	739d414d18	Convert regalloc::coloring to use an EncCursor. No functional change intended, this is just a big fight with the borrow checker.	2017-10-03 13:39:43 -07:00
Jakob Stoklund Olesen	c091a695e6	Fix coalescer bug exposed by the gvn-unremovable-phi test. When we detect interference between the values that have already been merged into the candidate virtual register and an EBB argument, we first try to resolve the conflict by splitting. We also check if the existing interfering value is fundamentally incompatible with the branch instruction so it needs to be removed from the virtual register, restarting the merge operation. However, this existing interfering value is not necessarily the only interference, so the split is not guaranteed to resolve the conflict. If it turns out that splitting didn't resolve the conflict, restart the merge after removing this second conflicting value.	2017-10-03 11:13:46 -07:00
Jakob Stoklund Olesen	ef048b8899	Allow for call args in incoming stack slots. A value passed as an argument to a function call may live in an incoming stack slot initially. Fix the call legalizer so it copies such an argument into the expected outgoing stack slot for the call.	2017-10-03 11:13:46 -07:00
Dan Gohman	374ed3a07b	Fix dominator-tree queries on unreachable nodes.	2017-10-03 11:03:06 -07:00
Dan Gohman	7c7b1651d8	Do a full compile in 'cton-util wasm'. This removes the `optimize` option, as one can do that with `--set`, e.g. `--set opt_level=best`. It also adds an option to print the compilation output, and it enables simple_gvn and licm for opt_level=best.	2017-10-03 09:39:07 -07:00
Jakob Stoklund Olesen	5f56f81251	Resolve all value aliases when computing live ranges. Value aliases are only in the way during register allocation, so make sure they are all dead as we enter the register allocation passes.	2017-09-29 15:54:06 -07:00
Jakob Stoklund Olesen	51a6901a7f	Implement coloring::iterate_solution(). It can happen that the currently live registers are blocking a smaller register class completely, so the only way of solving the allocation problem is to turn some of the live-through registers into solver variables. When the quick_solve attempt fails, try to free up registers in the critical register class by turning live-through values into solver variables.	2017-09-29 14:55:35 -07:00
Jakob Stoklund Olesen	50ccd000a9	Implement branch relaxation. Now that we have the legal_encodings iterator, it is simpler to find an alternative branch encoding with a better range.	2017-09-29 12:42:34 -07:00
Jakob Stoklund Olesen	45888ab84e	Reload for spilled call return values. When the return value from a call has been spilled, the reload pass needs to insert a spill instruction right after the call instruction which returns its results in registers.	2017-09-29 11:25:38 -07:00
Jakob Stoklund Olesen	711e5cd644	Handle srem INT_MIN, -1 correctly. The x86_divmodx traps on integer overflow, but the srem instruction is not supposed to trap with a -1 divisor. Generate a legalization expansion for srem that special-cases the -1 divisor to simply return 0.	2017-09-29 08:53:49 -07:00
Jakob Stoklund Olesen	53404a9387	Check for invalid special type constraints. The extend and reduce instructions have additional type constraints. Stop inserting sextend instructions after ctz, clz, and popcnt when translating from WebAssembly. The Cretonne instructions have the same signature as the WebAssembly equivalents.	2017-09-28 16:30:19 -07:00
Jakob Stoklund Olesen	2888ff5bf3	Fix a corner case in fcvt_to_sint.i32.f64 legalization. An f64 can represent multiple values in the range INT_MIN-1 < x <= INT_MIN which all truncate to INT_MIN, so comparing the input value against INT_MIN is not good enough. Instead, detect overflow on x <= INT_MIN-1 when INT_MIN-1 is an exact floating point value.	2017-09-28 14:24:39 -07:00
Jakob Stoklund Olesen	8abcdac5a1	Legalize fcvt_to_sint and fcvt_to_uint for Intel64. We need to generate traps on NaN and overflow.	2017-09-28 12:00:38 -07:00
Jakob Stoklund Olesen	34146435e5	Legalize unsigned-to-float conversions for Intel 64. Also make sure we generate type checks for the controlling type variable in legalization patterns. This is not needed for encodings since the encoding tables are already keyed on the controlling type variable.	2017-09-28 11:39:19 -07:00
Jakob Stoklund Olesen	979a22f548	Add pow2() and neg() methods for the IEEE immediate types. These are convenient methods for creating common floating point constants.	2017-09-28 11:34:02 -07:00
Jakob Stoklund Olesen	1d481d7897	Use the ThreadId to name cretonne.dbg in unnamed threads. Don't use a single cretonne.dbg log file when there are multiple unnamed threads logging. They will clobber each other.	2017-09-27 16:27:49 -07:00
Jakob Stoklund Olesen	84471a8431	Add some very basic support for the Intel32 ABI. In 32-bit mode, all function arguments are passed on the stack, not in registers. This ABI support is not complete or properly tested, but at least it doesn't try to pass arguments in r8.	2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen	b6b474a8c9	Add Intel legalization for fmin and fmax. The native x86_fmin and x86_fmax instructions don't behave correctly for NaN inputs and when comparing +0.0 to -0.0, so we need separate branches for those cases.	2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen	6ff681a90d	Add general legalization for the select instruction.	2017-09-26 14:16:35 -07:00
Jakob Stoklund Olesen	6bec5f8507	Intel encodings for nearest/floor/ceil/trunc. These floating point rounding operations all use the roundss/roundsd instructions that are available in SSE 4.1.	2017-09-25 15:08:04 -07:00
Jakob Stoklund Olesen	fdb97da21b	Implement a poor man's jump table. We will eventually support real jump tables, but for now just expand br_table into a sequence of conditional branches.	2017-09-25 10:56:14 -07:00
Jakob Stoklund Olesen	29dfcf5dfb	Add spill/fill encodings for Intel ISAs. To begin with, these are catch-all encodings with a SIB byte and a 32-bit displacement, so they can access any stack slot via both the stack pointer and the frame pointer. In the future, we will add encodings for 8-bit displacements as well as EBP-relative references without a SIB byte.	2017-09-22 16:05:26 -07:00
Jakob Stoklund Olesen	76eb7df9f0	Add an isa::StackRef type. This contains encoding details for a stack reference: The base register and offset to use in the specific instruction encoding. Generate StackRef objects called in_stk0 etc for the binemit recipe code. All binemit recipes need to compute base pointer offsets for stack references, so have the automatically generated code do it.	2017-09-22 13:34:33 -07:00
Jakob Stoklund Olesen	2946cc54d3	Add more trap codes. These are codes that come up naturally when translating WebAssembly and legalizing the Cretonne instruction set.	2017-09-22 08:51:55 -07:00
Angus Holder	3b66c0be40	Emit compressed instruction encodings for instructions where constraints allow	2017-09-22 07:54:26 -07:00
Jakob Stoklund Olesen	2d4c860187	Convert legalizer::split and generated legalization code to FuncCursor.	2017-09-21 17:05:51 -07:00


505 Commits