* Use inst_block instead of pp_block where possible
* Remove unused is_block_gap method
* Remove ProgramOrder trait
It only has a single implementation
* Rename Layout::cmp to pp_cmp to distinguish it from Ord::cmp
* Make pp_block non-generic
* Use rpo_cmp_block instead of rpo_cmp in the verifier
* Remove ProgramPoint
* Rename ExpandedProgramPoint to ProgramPoint
* Remove From<ValueDef> for ProgramPoint impl
* Remove trailing whitespace in `lower.isle` files
* Legalize the `band_not` instruction into simpler form
This commit legalizes the `band_not` instruction into `band`-of-`bnot`,
i.e. two instructions. This is intended to assist with egraph-based
optimizations, where the `band_not` instruction then doesn't have to be
specifically included in other bit-operation patterns.
Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.
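As a concrete illustration of the resulting semantics, here is a hedged sketch in plain Rust (not the actual legalization code):

    // `band_not x, y` and the legalized `band x, (bnot y)` compute the same value.
    fn band_not(x: u64, y: u64) -> u64 {
        x & !y
    }

    fn band_of_bnot(x: u64, y: u64) -> u64 {
        let not_y = !y; // bnot y
        x & not_y       // band x, not_y
    }
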
* Legalize `bor_not` into components
Same as prior commit, but for the `bor_not` instruction.
* Legalize bxor_not into bxor-of-bnot
Same as prior commits. I think this also ended up fixing a bug in the
s390x backend where `bxor_not x y` was accidentally translated as
`bnot (bxor x y)`, judging from the test updates.
* Simplify not-fused operands for riscv64
Some delegated-to rules had special cases for "if this feature is
enabled, use the fused instruction", so move the feature test up to the
lowering rule itself to help trigger other rules when the feature isn't
enabled. This should make the riscv64 backend more consistent with how
other backends are implemented.
* Remove B{and,or,xor}Not from the egraph cost metrics
These shouldn't ever reach egraphs now that they're legalized away.
* Add an egraph optimization for `x^-1 => ~x`
This adds a simplification node to translate xor-against-minus-1 to a
`bnot` instruction. This helps trigger various other optimizations in
the egraph implementation and also various backend lowering rules for
instructions. This is chiefly useful as wasm doesn't have a `bnot`
equivalent, so it's encoded as `x^-1`.
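For reference, the identity behind the rewrite, as a small Rust sketch (illustrative only, not the actual egraph rule):

    fn xor_minus_one_is_bnot(x: u32) -> bool {
        // XOR with all-ones (-1 in two's complement) flips every bit, i.e. `bnot`.
        (x ^ u32::MAX) == !x
    }
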
* Add a wasm test for end-to-end bitwise lowerings
Test that end-to-end various optimizations are being applied for input
wasm modules.
* Specifically don't self-update rustup on CI
I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.
* Cleanup some aarch64 lowering rules
Previously a 32/64 split was necessary because the `ALUOp` was different,
but that has since been refactored away, so there's no longer any need
for duplicate rules.
* Narrow a x64 lowering rule
This rule previously made more sense when it matched the rarely used
`band_not`; be more specific in its type filter so that it only applies
to SIMD types with lanes.
* Simplify xor-against-minus-1 rule
No need to have the commutative version since constants are already
canonicalized to the right-hand side for egraphs.
* Optimize band-of-bnot when bnot is on the left
Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.
* Add commutative lowering rules
* Update cranelift/codegen/src/isa/x64/lower.isle
Co-authored-by: Jamey Sharp <jamey@minilop.net>
We have some operations defined on DataFlowGraph purely to work around borrow-checker issues with InstructionData and other data on DataFlowGraph. Part of the problem is that indexing the DFG directly hides the fact that we're only indexing the insts field of the DFG.
This PR makes the insts field of the DFG public, but wraps it in a newtype that only allows indexing. This means that the borrow checker is better able to tell when operations on memory held by the DFG won't conflict, which comes up frequently when mutating ValueLists held by InstructionData.
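A minimal sketch of the newtype idea (simplified stand-in types, not the exact Cranelift definitions):

    use std::ops::{Index, IndexMut};

    #[derive(Clone, Copy)]
    struct Inst(u32);        // stand-in for the instruction reference type
    struct InstructionData;  // stand-in for the per-instruction payload

    // Wrapper that only exposes indexing, so borrows of `insts` don't
    // conflict with borrows of the DFG's other fields.
    struct Insts(Vec<InstructionData>);

    impl Index<Inst> for Insts {
        type Output = InstructionData;
        fn index(&self, inst: Inst) -> &InstructionData {
            &self.0[inst.0 as usize]
        }
    }

    impl IndexMut<Inst> for Insts {
        fn index_mut(&mut self, inst: Inst) -> &mut InstructionData {
            &mut self.0[inst.0 as usize]
        }
    }
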
* cranelift-wasm: translate Wasm loads into lower-level CLIF operations
Rather than using `heap_{load,store,addr}`.
* cranelift: Remove the `heap_{addr,load,store}` instructions
These are now legalized in the `cranelift-wasm` frontend.
* cranelift: Remove the `ir::Heap` entity from CLIF
* Port basic memory operation tests to .wat filetests
* Remove test for verifying CLIF heaps
* Remove `heap_addr` from replace_branching_instructions_and_cfg_predecessors.clif test
* Remove `heap_addr` from readonly.clif test
* Remove `heap_addr` from `table_addr.clif` test
* Remove `heap_addr` from the simd-fvpromote_low.clif test
* Remove `heap_addr` from simd-fvdemote.clif test
* Remove `heap_addr` from the load-op-store.clif test
* Remove the CLIF heap runtest
* Remove `heap_addr` from the global_value.clif test
* Remove `heap_addr` from fpromote.clif runtests
* Remove `heap_addr` from fdemote.clif runtests
* Remove `heap_addr` from memory.clif parser test
* Remove `heap_addr` from reject_load_readonly.clif test
* Remove `heap_addr` from reject_load_notrap.clif test
* Remove `heap_addr` from load_readonly_notrap.clif test
* Remove `static-heap-without-guard-pages.clif` test
Will be subsumed when we port `make-heap-load-store-tests.sh` to generating
`.wat` tests.
* Remove `static-heap-with-guard-pages.clif` test
Will be subsumed when we port `make-heap-load-store-tests.sh` over to `.wat`
tests.
* Remove more heap tests
These will be subsumed by porting `make-heap-load-store-tests.sh` over to `.wat`
tests.
* Remove `heap_addr` from `simple-alias.clif` test
* Remove `heap_addr` from partial-redundancy.clif test
* Remove `heap_addr` from multiple-blocks.clif test
* Remove `heap_addr` from fence.clif test
* Remove `heap_addr` from extends.clif test
* Remove runtests that rely on heaps
Heaps are not a thing in CLIF or the interpreter anymore
* Add generated load/store `.wat` tests
* Enable memory-related wasm features in `.wat` tests
* Remove CLIF heap from fcmp-mem-bug.clif test
* Add a mode for compiling `.wat` all the way to assembly in filetests
* Also generate WAT to assembly tests in `make-load-store-tests.sh`
* cargo fmt
* Reinstate `f{de,pro}mote.clif` tests without the heap bits
* Remove undefined doc link
* Remove outdated SVG and dot file from docs
* Add docs about `None` returns for base address computation helpers
* Factor out `env.heap_access_spectre_mitigation()` to a local
* Expand docs for `FuncEnvironment::heaps` trait method
* Restore f{de,pro}mote+load clif runtests with stack memory
All instructions using the CPU flags types (IFLAGS/FFLAGS) were already
removed. This patch completes the cleanup by removing all remaining
instructions that define values of CPU flags types, as well as the
types themselves.
Specifically, the following features are removed:
- The IFLAGS and FFLAGS types and the SpecialType category.
- Special handling of IFLAGS and FFLAGS in machinst/isle.rs and
machinst/lower.rs.
- The ifcmp, ifcmp_imm, ffcmp, iadd_ifcin, iadd_ifcout, iadd_ifcarry,
isub_ifbin, isub_ifbout, and isub_ifborrow instructions.
- The writes_cpu_flags instruction property.
- The flags verifier pass.
- Flags handling in the interpreter.
All of these features are currently unused; no functional change
intended by this patch.
This addresses https://github.com/bytecodealliance/wasmtime/issues/3249.
* Cranelift: implement `heap_{load,store}` instruction legalization
This does not remove `heap_addr` yet, but it does factor out the common
bounds-check-and-compute-the-native-address functionality that is shared between
all of `heap_{addr,load,store}`.
Finally, this adds a missing optimization for when we can dedupe explicit bounds
checks for static memories and Spectre mitigations.
* Cranelift: Enable `heap_load_store_*` run tests on all targets
* Cranelift: Make `heap_addr` return calculated `base + index + offset`
Rather than returning just the `base + index`.
(Note: I've chosen to use the nomenclature "index" for the dynamic operand and
"offset" for the static immediate.)
This moves the addition of the `offset` into `heap_addr`, instead of leaving it
for the subsequent memory operation, so that we can Spectre-guard the full
address, and not allow speculative execution to read the first 4GiB of memory.
Before this commit, we were effectively doing:

    load(spectre_guard(base + index) + offset)

Now we are effectively doing:

    load(spectre_guard(base + index + offset))

Finally, this also corrects `heap_addr`'s documented semantics to say that it
returns an address that will trap on access if `index + offset + access_size` is
out of bounds for the given heap, rather than saying that the `heap_addr` itself
will trap. This matches the implemented behavior for static memories, and after
https://github.com/bytecodealliance/wasmtime/pull/5190 lands (which is blocked
on this commit) will also match the implemented behavior for dynamic memories.
* Update heap_addr docs
* Factor out `offset + size` to a helper
As discussed in the 2022/10/19 meeting, this PR removes many of the branch and select instructions that used iflags, in favor of using brz/brnz and select in their place. Additionally, it reworks selectif_spectre_guard to take an i8 input instead of an iflags input.
For reference, the removed instructions are: br_icmp, brif, brff, trueif, trueff, and selectif.
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which, unlike their existing
counterparts, have no fixed size. New IR instructions are added to
access these new stack entities.
Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.
The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.
Copyright (c) 2022, Arm Limited.
Currently, we have partial Spectre mitigation: we protect heap accesses
with dynamic bounds checks. Specifically, we guard against errant
accesses on the misspeculated path beyond the bounds-check conditional
branch by adding a conditional move that is also dependent on the
bounds-check condition. This data dependency on the condition is not
speculated and thus will always pick the "safe" value (in the heap case,
a NULL address) on the misspeculated path, until the pipeline flushes
and recovers onto the correct path.
This PR uses the same technique both for table accesses -- used to
implement Wasm tables -- and for jumptables, used to implement Wasm
`br_table` instructions.
In the case of Wasm tables, the cmove picks the table base address on
the misspeculated path. This is equivalent to reading the first table
entry. This prevents loads of arbitrary data addresses on the
misspeculated path.
In the case of `br_table`, the cmove picks index 0 on the misspeculated
path. This is safer than allowing a branch to an address loaded from an
index under misspeculation (i.e., it preserves control-flow integrity
even under misspeculation).
The table mitigation is controlled by a Cranelift setting, on by
default. The br_table mitigation is always on, because it is part of the
single lowering pseudoinstruction. In both cases, the impact should be
minimal: a single extra cmove in a (relatively) rarely-used operation.
The table mitigation is architecture-independent (happens during
legalization); the br_table mitigation has been implemented for both x64
and aarch64. (I don't know enough about s390x to implement this
confidently there, but would happily review a PR to do the same on that
platform.)
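To illustrate the shape of the mitigation, here is a hedged sketch in plain Rust of the pattern in the generated code (not Cranelift source; names are illustrative):

    // The bounds-check condition feeds a conditional select rather than only a
    // branch, so even on a misspeculated path the computed address is harmless
    // (here: the table base, i.e. entry 0).
    fn guarded_table_entry_addr(base: u64, index: u64, bound: u64, elem_size: u64) -> u64 {
        let oob = index >= bound;
        let addr = base + index * elem_size;
        // In generated code this is a cmove/csel on `oob`, not a branch.
        if oob { base } else { addr }
    }
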
This should be faster, since it avoids matching on InstructionData to get
the Opcode, then on the Opcode, and then finally on the InstructionData
again to get the arguments.
This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate.
Interestingly, this PR started out as removing RegInfo from the TargetIsa trait, since the adapter returned a dummy value there. From the fallout, I noticed that all the Display implementations didn't need an ISA anymore (since these were only used to render ISA-specific registers). Also, the whole family of RegInfo / ValueLoc / RegUnit was exclusively used by the old backend, so these could be removed. Notably, some IR instructions needed to be removed because they were using RegUnit too: these were the oddballs regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!
WebAssembly memory operations are by definition little-endian even on
big-endian target platforms. However, other memory accesses will require
native target endianness (e.g. to access parts of the VMContext that is
also accessed by VM native code). This means on big-endian targets,
the code generator will have to handle both little- and big-endian
memory accesses. However, there is currently no way to encode that
distinction into the Cranelift IR that describes memory accesses.
This patch provides such a way by adding an (optional) explicit
endianness marker to an instance of MemFlags. Since each Cranelift IR
instruction that describes memory accesses already has an instance of
MemFlags attached, this can now be used to provide endianness
information.
Note that by default, memory accesses will continue to use the native
target ISA endianness. To override this to specify an explicit
endianness, a MemFlags value that was built using the set_endianness
routine must be used. This patch does so for accesses that implement
WebAssembly memory operations.
This patch addresses issue #2124.
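For example, marking a load as explicitly little-endian might look like this (a sketch using the `set_endianness` routine described above; exact import paths assumed):

    use cranelift_codegen::ir::{Endianness, MemFlags};

    fn wasm_load_flags() -> MemFlags {
        let mut flags = MemFlags::new();
        // WebAssembly loads/stores are little-endian regardless of the target ISA.
        flags.set_endianness(Endianness::Little);
        flags
    }
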
The immediate splitting code contained a bug causing both low and high
to be equal for i128. This is the root cause for
bjorn3/rustc_codegen_cranelift#1097 and likely the only bug preventing
cg_clif from bootstrapping rustc.
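For reference, a correct low/high split looks like this (an illustrative Rust sketch, not the legalizer code itself):

    fn split_imm_i128(imm: i128) -> (u64, u64) {
        let lo = imm as u64;         // low 64 bits
        let hi = (imm >> 64) as u64; // high 64 bits; the bug made this equal to `lo`
        (lo, hi)
    }
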
Before this patch, running the x64 new backend would require both
compiling with --features experimental_x64 and running with
`use_new_backend`.
This patch changes this behavior so that the runtime flag is not
needed anymore: using the feature flag will enforce usage of the new
backend everywhere, making using and testing it much simpler:
cargo run --features experimental_x64 ;; other CLI options/flags
This also gives a hint at what the meta language generation would look
like after switching to the new backend.
Compiling only with the x64 codegen flag gives a nice compile time speedup.
This is useful for allowing resumable_trap to happen in loop
headers, for instance. This is the correct way to implement interrupt
checks in SpiderMonkey, which are effectively resumable traps. The previous
implementation used traps, which is wrong, since traps semantically
can't be resumed afterwards.
These libcalls are useful for 32-bit platforms.
On x86_32 in particular, commit 4ec16fa0 added support for legalizing
64-bit shifts through SIMD operations. However, that legalization
requires SIMD to be enabled and SSE 4.1 to be supported, which is not
acceptable as a hard requirement.
- Add a `simple_legalize()` function that invokes a predetermined set of
legalizations, without depending on the details of the current
backend design. This will be used by the new backend pipeline.
- Separate out `has_side_effect()` from the DCE pass. This will be used
by the new backends' lowering code.
- Add documentation for the `Arm64Call` relocation type.
* Manually rename BasicBlock to BlockPredecessor
BasicBlock is a pair of (Ebb, Inst) that is used to represent the
basic block subcomponent of an Ebb that is a predecessor of another Ebb.
Eventually we will be able to remove this struct, but for now it
makes sense to give it a non-conflicting name so that we can start
to transition Ebb to represent a basic block.
I have not updated any comments that refer to BasicBlock, as
eventually we will remove BlockPredecessor and replace with Block,
which is a basic block, so the comments will become correct.
* Manually rename SSABuilder block types to avoid conflict
SSABuilder has its own Block and BlockData types. These, along with
associated identifiers, will cause conflicts in a later commit, so
they are renamed to be more verbose here.
* Automatically rename 'Ebb' to 'Block' in *.rs
* Automatically rename 'EBB' to 'block' in *.rs
* Automatically rename 'ebb' to 'block' in *.rs
* Automatically rename 'extended basic block' to 'basic block' in *.rs
* Automatically rename 'an basic block' to 'a basic block' in *.rs
* Manually update comment for `Block`
`Block`'s wikipedia article required an update.
* Automatically rename 'an `Block`' to 'a `Block`' in *.rs
* Automatically rename 'extended_basic_block' to 'basic_block' in *.rs
* Automatically rename 'ebb' to 'block' in *.clif
* Manually rename clif constant that contains 'ebb' as substring to avoid conflict
* Automatically rename filecheck uses of 'EBB' to 'BB'
'regex: EBB' -> 'regex: BB'
'$EBB' -> '$BB'
* Automatically rename 'EBB' 'Ebb' to 'block' in *.clif
* Automatically rename 'an block' to 'a block' in *.clif
* Fix broken testcase when function name length increases
Test function names are limited to 16 characters. This causes
the new longer name to be truncated and fail a filecheck test. An
outdated comment was also fixed.
This is a breaking API change: the following settings have been renamed:
- jump_tables_enabled -> enable_jump_tables
- colocated_libcalls -> use_colocated_libcalls
- probestack_enabled -> enable_probestack
- allones_funcaddrs -> emit_all_ones_funcaddrs
* Add x86 encodings for `bint` converting to `i8` and `i16`
* Introduce tests for many multi-value returns
* Support arbitrary numbers of return values
This commit implements support for returning an arbitrary number of return
values from a function. During legalization we transform multi-value signatures
to take a struct-return ("sret") pointer, instead of returning their values
in registers. Callers allocate the sret space in their stack frame and pass a
pointer to it into the callee, and once the callee returns to them, they load
the return values back out of the sret stack slot. The callee's return
operations are legalized to store the return values through the given sret
pointer.
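Conceptually, the transformation is analogous to the following Rust sketch (hypothetical names; the real legalization rewrites CLIF signatures and instructions):

    // Original: fn f() -> (u64, u64, u64)
    // Legalized callee: results are stored through a struct-return pointer.
    fn f_sret(sret: *mut u64) {
        unsafe {
            *sret.add(0) = 1;
            *sret.add(1) = 2;
            *sret.add(2) = 3;
        }
    }

    // Legalized caller: allocate the sret slot, call, then reload the values.
    fn caller() -> (u64, u64, u64) {
        let mut slot = [0u64; 3];
        f_sret(slot.as_mut_ptr());
        (slot[0], slot[1], slot[2])
    }
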
* Keep track of old, pre-legalized signatures
When legalizing a call or return for its new legalized signature, we may need to
look at the old signature in order to figure out how to legalize the call or
return.
* Add test for multi-value returns and `call_indirect`
* Encode bool -> int x86 instructions in a loop
* Rename `Signature::uses_sret` to `Signature::uses_struct_return_param`
* Rename `p` to `param`
* Add a clarifying comment in `num_registers_required`
* Rename `num_registers_required` to `num_return_registers_required`
* Re-add newline
* Handle already-assigned parameters in `num_return_registers_required`
* Document what some debug assertions are checking for
* Make "illegalizing" closure's control flow simpler
* Add unit tests and comments for our rounding-up-to-the-next-multiple-of-a-power-of-2 function (see the sketch after this list)
* Use `append_inst_arg` instead of doing the same thing manually
* Fix grammar in comment
* Add `Signature::uses_special_{param,return}` helper functions
* Inline the definition of `legalize_type_for_sret_load` for readability
* Move sret legalization debug assertions out into their own function
* Add `round_up_to_multiple_of_type_align` helper for readability
* Add a debug assertion that we aren't removing the wrong return value
* Rename `RetPtr` stack slots to `StructReturnSlot`
* Make `legalize_type_for_sret_store` more symmetrical to `legalized_type_for_sret`
* rustfmt
* Remove unnecessary loop labels
* Do not pre-assign offsets to struct return stack slots
Instead, let the existing frame layout algorithm decide where they should go.
* Expand "sret" into explicit "struct return" in doc comment
* typo: "than" -> "then" in comment
* Fold test's debug message into the assertion itself
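For reference, a minimal sketch of the rounding helper mentioned in the list above (illustrative name, not necessarily the one in the code):

    fn round_up_to_power_of_two_multiple(n: u32, align: u32) -> u32 {
        debug_assert!(align.is_power_of_two());
        (n + align - 1) & !(align - 1)
    }

    // e.g. round_up_to_power_of_two_multiple(5, 8) == 8
    //      round_up_to_power_of_two_multiple(16, 8) == 16
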
Add legalizations for icmp and icmp_imm for i64 and i128 operands for
the narrow legalization set, allowing 32-bit ISAs (like x86-32) to
compare 64-bit integers and all ISAs to compare 128-bit integers.
Fixes: https://github.com/bnjbvr/cranelift-x86/issues/2
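As an illustration of the decomposition (a hedged Rust sketch of the semantics, not the legalization code): a 64-bit signed comparison on a 32-bit ISA compares the high halves signed and, only when they are equal, the low halves unsigned.

    fn slt_i64_via_i32_halves(a: i64, b: i64) -> bool {
        let (a_hi, a_lo) = ((a >> 32) as i32, a as u32);
        let (b_hi, b_lo) = ((b >> 32) as i32, b as u32);
        // High halves decide unless equal; then the unsigned low compare decides.
        a_hi < b_hi || (a_hi == b_hi && a_lo < b_lo)
    }
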