This patch adds a third mode for templates: REX inference is requestable
at template instantiation time. This reduces the number of recipes
by removing rex()/nonrex() redundancy for many instructions.
This removes `report!`, `fatal!`, and `nonfatal!` from the verifier code and replaces them with methods on `VerifierErrors`. In order to maintain similar ease-of-use, `VerifierError` is expanded with several `From` implementations that convert a tuple to a verifier error.
* Try and assign directly to return registers; backtrack to use struct-return param
Rather than trying to count number of return registers that would be used by a
given set of return values, optimistically assign the return values to
registers. If we later find that we can't fit them all in registers, then
backtrack and introduce the use of a struct-return pointer parameter.
* Rename `rets2` and wrap it in an option so we avoid the clone for non-multi-value
In some cases, compute_size() is used to choose between various different Encodings
before one has been assigned to an instruction. For x86, the REX.W bit is stored
in the Encoding. To share recipes between REX/non-REX, that bit must be inspected
by compute_size().
This commit fixes the build errors in the unwind info implementation for
the x86 ABI by changing `byteorder` to build `no_std`.
This copies two simple functions from the `WriteBytesExt` trait so that
we can easily write to a `Vec<u8>` with a particular endianness.
Fixes#1203.
* Add x86 encodings for `bint` converting to `i8` and `i16`
* Introduce tests for many multi-value returns
* Support arbitrary numbers of return values
This commit implements support for returning an arbitrary number of return
values from a function. During legalization we transform multi-value signatures
to take a struct return ("sret") return pointer, instead of returning its values
in registers. Callers allocate the sret space in their stack frame and pass a
pointer to it into the caller, and once the caller returns to them, they load
the return values back out of the sret stack slot. The callee's return
operations are legalized to store the return values through the given sret
pointer.
* Keep track of old, pre-legalized signatures
When legalizing a call or return for its new legalized signature, we may need to
look at the old signature in order to figure out how to legalize the call or
return.
* Add test for multi-value returns and `call_indirect`
* Encode bool -> int x86 instructions in a loop
* Rename `Signature::uses_sret` to `Signature::uses_struct_return_param`
* Rename `p` to `param`
* Add a clarifiying comment in `num_registers_required`
* Rename `num_registers_required` to `num_return_registers_required`
* Re-add newline
* Handle already-assigned parameters in `num_return_registers_required`
* Document what some debug assertions are checking for
* Make "illegalizing" closure's control flow simpler
* Add unit tests and comments for our rounding-up-to-the-next-multiple-of-a-power-of-2 function
* Use `append_isnt_arg` instead of doing the same thing manually
* Fix grammar in comment
* Add `Signature::uses_special_{param,return}` helper functions
* Inline the definition of `legalize_type_for_sret_load` for readability
* Move sret legalization debug assertions out into their own function
* Add `round_up_to_multiple_of_type_align` helper for readability
* Add a debug assertion that we aren't removing the wrong return value
* Rename `RetPtr` stack slots to `StructReturnSlot`
* Make `legalize_type_for_sret_store` more symmetrical to `legalized_type_for_sret`
* rustfmt
* Remove unnecessary loop labels
* Do not pre-assign offsets to struct return stack slots
Instead, let the existing frame layout algorithm decide where they should go.
* Expand "sret" into explicit "struct return" in doc comment
* typo: "than" -> "then" in comment
* Fold test's debug message into the assertion itself
* Implement emitting Windows unwind information for fastcall functions.
This commit implements emitting Windows unwind information for x64 fastcall
calling convention functions.
The unwind information can be used to construct a Windows function table at
runtime for JIT'd code, enabling stack walking and unwinding by the operating
system.
* Address code review feedback.
This commit addresses code review feedback:
* Remove unnecessary unsafe code.
* Emit the unwind information always as little endian.
* Fix comments.
A dependency from cranelift-codegen to the byteorder crate was added.
The byteorder crate is a no-dependencies crate with a reasonable
abstraction for writing binary data for a specific endianness.
* Address code review feedback.
* Disable default features for the `byteorder` crate.
* Add a comment regarding the Windows ABI unwind code numerical values.
* Panic if we encounter a Windows function with a prologue greater than 256
bytes in size.
The failure crate invents its own traits that don't use
std::error::Error (because failure predates certain features added to
Error); this prevents using ? on an error from failure in a function
using Error. The thiserror crate integrates with the standard Error
trait instead.
This situation could be triggered that can_add_var would return true
while a variable was already added for the given register.
For instance, when we have a reassignment (because of a fixed register
input requirement) and a fixed input conflict on the same fixed
register, this register will not be available in the regs_in set after
inputs_done (because of the fixed input conflict diversion) but will
have its own variable.
Previously, ConstantData was a type alias for `Vec<u8>` which prevented it from having an implementation; this meant that `V128Imm` and `&[u8; 16]` were used in places that otherwise could have accepted types of different byte lengths.
Add legalizations for icmp and icmp_imm for i64 and i128 operands for
the narrow legalization set, allowing 32-bit ISAs (like x86-32) to
compare 64-bit integers and all ISAs to compare 128-bit integers.
Fixes: https://github.com/bnjbvr/cranelift-x86/issues/2
RedundantReloadRemover participates in the state-recycling machinery
implemented in cranelift-codegen/src/context.rs, whose goal it is to cache
per-pass state so it can be used for compilation of multiple functions without
reallocation. Unfortunately RedundantReloadRemover::run simply ignores the
cached state and reallocates it new for each function. This patch fixes that.
This reduces the number of malloc'd blocks by about 2%.
ReplaceBuilder is available in the public API through
`DataFlowGraph::replace`, however it's documentation is not available
through rustdoc as the type isn't publicly importable.