* Remove reserved_reg functionality.
This wasn't implemented, and if we need it in the future, it seems like
it would be better to extend the concept of global values to cover this.
* Use GlobalValue::reserved_value() for sentinel values.
* Now diagnosing missing vmctx arguments (fixes #376).
* Added filetest for fix of #376.
* Respect formatting rules in verifier/mod.rs.
* Added parameters for each use of vmctx in test files.
* Added comments on additions on vmctx verifications.
This requires splitting X86PCRel4 into two separate relocations, to
distinguish the case where the instruction is a call, as Mach-O uses a
different relocation in that case.
This also makes it explicit that only x86-64 relocations are supported
currently.
In the text format, allow aliases to be defined multiple times, as long
as they're always aliasing the same value.
write.rs is already emitting redundant aliases, because it emits them at
their uses, so this change allows the parser to parse such code.
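
A minimal sketch of the alias rule described above, assuming a hypothetical
`AliasMap` keyed by plain integer value numbers (the real parser uses its own
entity types and error reporting). Redefining an alias is accepted only when
it names the same target as before:

```rust
use std::collections::HashMap;

/// Hypothetical value handle, used here only for illustration.
type Value = u32;

/// Hypothetical alias table: maps an alias to the value it stands for.
#[derive(Default)]
struct AliasMap {
    aliases: HashMap<Value, Value>,
}

impl AliasMap {
    /// Record `alias -> target`. A repeated definition is accepted only if
    /// it names the same target as the earlier one.
    fn define(&mut self, alias: Value, target: Value) -> Result<(), String> {
        match self.aliases.get(&alias).copied() {
            Some(existing) if existing != target => Err(format!(
                "alias v{} already refers to v{}, cannot redefine as v{}",
                alias, existing, target
            )),
            _ => {
                self.aliases.insert(alias, target);
                Ok(())
            }
        }
    }
}
```

With this shape, the redundant aliases that the writer emits at each use are
parsed without error, while a conflicting redefinition is still rejected.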
This switches from a custom list of architectures to use the
target-lexicon crate.
- "set is_64bit=1; isa x86" is replaced with "target x86_64", and
similar for other architectures, and the `is_64bit` flag is removed
entirely.
- The `is_compressed` flag is removed too. It's no longer being used to
control REX prefixes on x86-64; ARM and Thumb are separate
architectures in target-lexicon; and we can figure out how to
select RISC-V compressed encodings when we're ready.
* Optimize 0.0 floating point constants. Rather than using the existing
process of emitting bit patterns and moving them into floating point
registers, use the `xorps` instruction to zero out the register.
* is_zero predicate function will not accept negative zero. Fixed formatting for encoding recipe and filetests.
* Start adding the load_complex and store_complex instructions.
N.b.:
The text format is not correct yet. Requires changes to the lexer and parser.
I'm not sure why I needed to change the RuntimeError to Exception yet. Will fix.
* Get first few encodings of load_complex working. Still needs var args type checking.
* Clean up ModRM helper functions in binemit.
* Implement 32-bit displacement for load_complex
* Use encoding helpers instead of doing them all by hand
* Initial implementation of store_complex
* Parse value list for load/store_complex with + as delimiter. Looks nice.
* Add sign/zero-extension and size variants for load_complex.
* Add size variants of store_complex.
* Add asm helper lines to load/store complex bin tests.
* Example of length-checking the instruction ValueList for an encoding. Extremely questionable implementation.
* Fix Python linting issues
* First draft of postopt pass to fold adds and loads into load_complex. Just simple loads for now.
* Optimization pass now works with all types of loads.
* Add store+add -> store_complex to postopt pass
* Put complex address optimization behind ISA flag.
* Add load/store complex for f32 and f64
* Fixes changes to lexer that broke NaN parsing.
Abstracts away the repeated checks for whether or not the characters
following a + or - are going to be parsed as a number or not.
* Fix formatting issues
* Fix register restrictions for complex addresses.
* Encoding tests for x86-32.
* Add documentation for newly added instructions, recipes, and cdsl changes.
* Fix python formatting again
* Apply value-list length predicates to all LoadComplex and StoreComplex instructions.
* Add predicate types to new encoding helpers for mypy.
* Import FieldPredicate to satisfy mypy.
* Add and fix some "asm" strings in the encoding tests.
* Line-up 'bin' comments in x86/binary64 test
* Test parsing of offset-less store_complex instruction.
* 'sNaN' not 'sNan'
* Bounds check the lookup for polymorphic typevar operand.
* Fix encodings for istore16_complex.
* initial set of work for windows fastcall (x64) call convention
- call conventions: rename `fastcall` to `windows_fastcall`
- add initial set of filetests
- ensure arguments are written after the shadow space/store (offset-wise)
The shadow space available before the arguments (range 0..32)
is not used as spill space yet.
* address review feedback
* x86 recipes: emit StackOverflow trap for all sp-relative loads and stores
* x86 recipes: emit StackOverflow trap for push and pop
* x86 binary filetests: add stk_ovf trap annotations
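
For the Windows fastcall item in the list above, here is a small sketch of
how outgoing stack arguments can be placed after the shadow space. It assumes
integer arguments only, four register arguments, and 8-byte stack slots; the
constant and function names are hypothetical and not taken from the Cretonne
sources:

```rust
/// Size in bytes of the shadow space (the range 0..32 mentioned above).
const SHADOW_SPACE_BYTES: u32 = 32;
/// Assumption: the first four integer arguments travel in registers.
const REGISTER_ARGS: u32 = 4;
/// Assumption: every stack argument occupies one 8-byte slot.
const SLOT_BYTES: u32 = 8;

/// Offset of an outgoing argument relative to the stack pointer at the call
/// site, or `None` if the argument is passed in a register.
fn outgoing_stack_offset(arg_index: u32) -> Option<u32> {
    if arg_index < REGISTER_ARGS {
        None
    } else {
        // Stack arguments start right after the shadow space.
        Some(SHADOW_SPACE_BYTES + (arg_index - REGISTER_ARGS) * SLOT_BYTES)
    }
}
```

Under these assumptions the fifth argument lands at offset 32, so nothing is
written into the 0..32 range.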
regmove, regfill, and regspill have immediates which aren't value
operands, so they aren't in the set of things that can be described by
the existing constraint system. Consequently, constraints saying that
the non-REX encodings only support registers that don't need REX
prefixes don't work. For now, just remove the non-REX encodings, so
that they don't get selected when they aren't valid.
This fixes the last known issue with instruction shrinking, so it can
be re-enabled.
Add a calling-convention setting to the `Flags` used as part of the
`TargetIsa`. This allows Cretonne code that generates calls to use the
correct convention, such as when emitting libcalls during legalization
or when the wasm frontend is decoding functions. This setting can be
overridden per-function.
This also adds "fast", "cold", and "fastcall" conventions, with "fast"
as the new default. Note that "fast" and "cold" are not intended to be
ABI-compatible across Cretonne versions.
This will also ensure Windows users will get an `unimplemented!` rather
than silent calling-convention mismatches, which reflects the fact that
Windows calling conventions are not yet implemented.
This also renames SpiderWASM, which isn't camel-case, to Baldrdash,
which is, and which is also a more relevant name.
When an instruction has multiple valid encodings, such as with and
without a REX prefix on x86-64, Cretonne typically picks the encoding
which gives the register allocator the most flexibility, which is
typically the longest encoding. This patch adds a pass that runs after
register allocation that picks the smallest encoding, working within the
constraints of the register allocator's choices. The result is smaller
and easier to read encodings.
In the future, we may want to merge this pass into the relaxation pass,
or possibly fold it into the final encoding step. For now, a discrete
pass will suffice.
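
The selection rule can be sketched roughly as follows. This is an
illustration only: `Candidate`, its fields, and `shrink` are hypothetical
names, and the register constraint is simplified to a single "highest
addressable register number" per encoding:

```rust
/// Hypothetical description of one candidate encoding for an instruction.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    name: &'static str,
    size_bytes: u8,
    /// Highest register number this encoding can address
    /// (for example 7 for an x86-64 encoding without a REX prefix).
    max_reg: u8,
}

/// Pick the smallest encoding that is still valid for the registers the
/// allocator has already chosen. Returns `None` if no candidate fits.
fn shrink<'a>(candidates: &'a [Candidate], assigned_regs: &[u8]) -> Option<&'a Candidate> {
    candidates
        .iter()
        .filter(|c| assigned_regs.iter().all(|&r| r <= c.max_reg))
        .min_by_key(|c| c.size_bytes)
}
```

Because the filter runs after register allocation, the pass never changes the
allocator's choices; it only drops encodings those choices rule out.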
Choosing smaller instruction encodings on e.g. x86 is an optimization,
rather than a useful discrete setting.
Use "is_compressed" only for ISAs that have an explicit compression feature
that users of the output may need to be aware of, such as RISC-V's RVC or
ARM's Thumb-2.
This makes it a little more consistent; now, "cretonne" is never capitalized
in identifier, path, or URL contexts. It is capitalized in natural
language contexts when referring to the project.
This adds a "colocated" flag to function and symbolic global variables which
indicates that they are defined along with the current function, so they can
use PC-relative addressing.
This also changes the function decl syntax; the name now always precedes the
signature, and the "function" keyword is no longer included.
The regmove and regfill instructions temporarily divert a value's
location, and these temporary diversions are not reflected in
`func.locations`. For now, make an extra scan through the instructions
of the function to find any regmove or regfill instructions in order to
find all used callee-saved registers.
This fixes #296.
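
A rough sketch of the extra scan, assuming a hypothetical, heavily simplified
instruction type that models only the destination register of a diversion
(the real `regmove` and `regfill` instructions carry more operands):

```rust
use std::collections::BTreeSet;

/// Hypothetical register number.
type Reg = u8;

/// Hypothetical instruction model; only diversion destinations are kept.
enum Inst {
    RegMove { dst: Reg },
    RegFill { dst: Reg },
    Other,
}

/// Collect the callee-saved registers a function touches, looking both at
/// the final value locations and at temporary diversions in the code.
fn used_callee_saved(
    final_locations: &[Reg],
    insts: &[Inst],
    callee_saved: &[Reg],
) -> BTreeSet<Reg> {
    let mut used: BTreeSet<Reg> = final_locations.iter().copied().collect();
    // Extra scan: diversions are not reflected in the final locations.
    for inst in insts {
        match inst {
            Inst::RegMove { dst } | Inst::RegFill { dst } => {
                used.insert(*dst);
            }
            Inst::Other => {}
        }
    }
    used.retain(|r| callee_saved.contains(r));
    used
}
```

Without the loop over `insts`, a register that is only ever used as a
diversion target would be missed and left unsaved in the prologue.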
The main use for non-PIC code at present is JIT code, and JIT code can
live anywhere in memory and reference other symbols defined anywhere in
memory, so it needs to use the "large" code model.
func_addr and globalsym_addr instructions were already using `movabs`
to support arbitrary 64-bit addresses, so this just makes calls be
legalized to support arbitrary 64-bit addresses also.
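
To illustrate why a direct call is not enough here: an x86-64 `call rel32`
is 5 bytes long and encodes a signed 32-bit displacement relative to the
address of the next instruction. The helper below is a hypothetical sketch,
not part of Cretonne, that checks whether a target is reachable that way:

```rust
use std::convert::TryFrom;

/// Length in bytes of `call rel32` (opcode E8 plus a 4-byte displacement).
const CALL_REL32_LEN: u64 = 5;

/// Displacement to encode for a direct call at `call_addr` to `target`, or
/// `None` when the target is out of range for a signed 32-bit offset.
fn rel32_displacement(call_addr: u64, target: u64) -> Option<i32> {
    // The displacement is relative to the instruction after the call.
    let next = call_addr.wrapping_add(CALL_REL32_LEN);
    let diff = target.wrapping_sub(next) as i64;
    i32::try_from(diff).ok()
}
```

When JIT code and its callees can be anywhere in a 64-bit address space, the
`None` case is reachable, which is why the address has to be materialized in
full instead.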
* Only save callee-saved registers that are actually being used.
* Rename AllocatableSet to RegisterSet
* Style cleanup and small renames for readability.
* Adjust x86 prologue-epilogue test to account for callee-saved register optimization.
* Add more tests for prologue-epilogue optimizations.
To keep cross-compiling straightforward, Cretonne shouldn't have any
behavior that depends on the host. This renames the "Native" calling
convention to "SystemV", which has a defined meaning for each target,
so that it's clear that the calling convention doesn't change
depending on what host Cretonne is running on.
* Add a pre-opt optimization to change constants into immediates.
This converts 'iadd' + 'iconst' into 'iadd_imm', and so on.
* Optimize away redundant `bint` instructions.
Cretonne has a concept of "Testable" values, which can be either boolean
or integer. When an instruction needing a "Testable" value receives
the result of a `bint`, converting boolean to integer, eliminate the
`bint`, as it's redundant.
* Postopt: Optimize using CPU flags.
This introduces a post-legalization optimization pass which converts
compare+branch sequences to use flags values on CPUs which support it.
* Define a form of x86's `urm` that doesn't clobber FLAGS.
movzbl/movsbl/etc. don't clobber FLAGS; define a form of the `urm`
recipe that represents this.
* Implement a DCE pass.
This pass deletes instructions with no side effects and no results that
are used.
* Clarify ambiguity about "32-bit" and "64-bit" in comments.
* Add x86 encodings for icmp_imm.
* Add a testcase for postopt CPU flags optimization.
This covers the basic functionality of transforming compare+branch
sequences to use CPU flags.
* Pattern-match irsub_imm in preopt.
* First draft of TrapSink implementation.
* Add trap sink calls to 'trapif' and 'trapff' recipes.
* Add SourceLoc to trap sink calls, and add trap sink calls to all loads and stores.
* Add IntegerDivisionByZero trap to div recipe.
* Only emit load/store traps if 'notrap' flag is not set on the instruction.
* Update filetest machinery to add new trap sink functionality.
* Update filetests to include traps in output.
* Add a few more trap outputs to filetests.
* Add trap output to CLI tool.
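
The trap sink items above can be pictured with the following sketch. All
names and signatures here are assumptions made for illustration (including
the `MemoryAccess` trap code); they are not the actual Cretonne `TrapSink`
interface. The point shown is that a load or store reports a trap only when
its `notrap` flag is clear, while a division always reports one:

```rust
/// Hypothetical trap codes; `MemoryAccess` is invented for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TrapCode {
    MemoryAccess,
    IntegerDivisionByZero,
}

/// Hypothetical stand-ins for a code offset and a source location.
type CodeOffset = u32;
type SourceLoc = u32;

/// Hypothetical sink interface receiving one record per trapping site.
trait TrapSink {
    fn trap(&mut self, offset: CodeOffset, srcloc: SourceLoc, code: TrapCode);
}

/// Sink that simply remembers every trap it is told about.
#[derive(Default)]
struct RecordingSink {
    traps: Vec<(CodeOffset, SourceLoc, TrapCode)>,
}

impl TrapSink for RecordingSink {
    fn trap(&mut self, offset: CodeOffset, srcloc: SourceLoc, code: TrapCode) {
        self.traps.push((offset, srcloc, code));
    }
}

/// Emit a load or store: report a trap site unless `notrap` is set.
fn emit_memory_access(offset: CodeOffset, srcloc: SourceLoc, notrap: bool, sink: &mut dyn TrapSink) {
    if !notrap {
        sink.trap(offset, srcloc, TrapCode::MemoryAccess);
    }
    // The instruction bytes themselves would be written here.
}

/// Emit a division: it can always trap on a zero divisor.
fn emit_div(offset: CodeOffset, srcloc: SourceLoc, sink: &mut dyn TrapSink) {
    sink.trap(offset, srcloc, TrapCode::IntegerDivisionByZero);
    // The instruction bytes themselves would be written here.
}
```

A filetest or CLI tool could then print the recorded entries next to the
encoded bytes.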
Value aliases aren't instructions, so they don't have a location in the
CFG, so it's not meaningful to query whether a value alias is defined
within a loop.