Add an explicit "is_ghost" property to selected instructions, and use
that to determine whether reload and coloring should visit instructions.
This allows them to visit fallthrough_return instructions and insert
fills and register moves as needed.
When one value is used multiple times for separate return values, we
need to copy it to produce a new value, so that each value can be
allocated a different register.
* Introduce a `TargetFrontendConfig` type.
`TargetFrontendConfig` is information specific to the target which is
provided to frontends to allow them to produce Cranelift IR for the
target. Currently this includes the pointer size and the default calling
convention.
The default calling convention is now inferred from the target, rather
than being a setting. cranelift-native is now just a provider of target
information, rather than also being a provider of settings, which gives
it a clearer role.
And instead of having cranelift-frontend routines require the whole
`TargetIsa`, just require the `TargetFrontendConfig`, and add a way to
get the `TargetFrontendConfig` from a `Module`.
Fixes#529.
Fixes#555.
* Rename size to base_size and introduce a compute_size function;
* Add infra to inspect in/outs registers when computing the size of an instruction;
* Remove the GPR_SAFE_DEREF and GPR_ZERO_DEREF_SAFE register classes on x86 (fixes#335);
* Move `return_at_end` out of Settings and into the wasm FuncEnvironment.
The `return_at_end` flag supports users that want to append a custom
epilogue to Cranelift-produced functions. It arranges for functions to
always return via a single return statement at the end, and users are
expected to remove this return to append their code.
This patch makes two changes:
- First, introduce a `fallthrough_return` instruction and use that
instead of adding a `return` at the end. That's simpler than having
users remove the `return` themselves.
- Second, move this setting out of the Settings and into the wasm
FuncEnvironment. This flag isn't something the code generator uses,
it's something that the wasm translator uses. The code generator
needs to preserve the property, however we can give the
`fallthrough_return` instruction properties to ensure this as needed,
such as marking it non-cloneable.
WebAssembly doesn't have non-dense jump tables, and higher-level users
are better served by the facilities in lib/frontend/src/switch.rs for
working with non-dense switches.
This eliminates the concept of "absent" jump table entries, which
were represented as "0" in the text format.
Also, jump table contents are now enclosed in `[` and `]`, so that
we can unambiguously display empty jump tables. Previously, empty jump
tables were displayed as if they had a single absent entry.
* Add encodings for i8 and i16 copy, spill, fill, ireduce.i8.i16
Also adds legalization for srem, irsub_imm, {u,s}extend.i16.i8
Fixes#477 cc #466
* Legalize popcnt, clz and ctz for i8 and i16
* Fix bug in call_memset
* Add 'jump_table_entry' and 'indirect_jump' instructions.
* Update CodeSink to keep track of code size. Pretty up clif-util's disassembly output.
* Only disassemble the machine portion of output. Pretty print the read-only data after it.
* Update switch frontend code to use new br_table instruction w/ default.
* Reorganize the global value kinds.
This:
- renames "deref" global values to "load" and gives it a offset that works
like the "load" instructions' does
- adds an explicit "iadd_imm" global value kind, which replaces the
builtin iadd in "vmctx" and "deref" global values.
- also renames "globalsym" to "symbol"
This makes several changes:
- It adds an index_type to heap declarations, allowing heaps to specify the
type for indexing. This also anticipates 64-bit heap support.
- It adds a memory_type to deref global values, allowing deref globals to
have types other than pointers. This is used to allow the bound variable
in dynamic heaps to have type i32, to match the index type in heaps
with i32 index type.
- And, it fixes heap legalization to do the bounds check in the heap's
index type.
* fix error not reported if at least one other error expected.
* Fixed unused extern crate error if wasm feature is not enabled.
* No longer reporting deref cycles multiple times.
* Fix filetest type_check.clif.
* Switched comparison order for perf.
* Fixed isa/riscv/verify-encoding.clif filetest.
* Remove reserved_reg functionality.
This wasn't implemented, and if we need it in the future, it seems like
it would be better to extend the concept of global values to cover this.
* Use GlobalValue::reserved_value() for sentinal values.
* Now diagnosing missing vmctx arguments (fixes#376).
* Added filetest for fix of #376.
* Respect formatting rules in verifier/mod.rs.
* Added parameters for each use of vmctx in test files.
* Added comments on additions on vmctx verifications.
This requires splitting X86PCRel4 into two separate relocations, to
distinguish the case where the instruction is a call, as Mach-O uses a
different relocation in that case.
This also makes it explicit that only x86-64 relocations are supported
currently.
In the text format, allow aliases to be defined multiple times, as long
as they're always aliasing the same value.
write.rs is already emitting redundant aliases, because it emits them at
their uses, so this change allows the parser to be able to parse such
code.
This switches from a custom list of architectures to use the
target-lexicon crate.
- "set is_64bit=1; isa x86" is replaced with "target x86_64", and
similar for other architectures, and the `is_64bit` flag is removed
entirely.
- The `is_compressed` flag is removed too; it's no longer being used to
control REX prefixes on x86-64, ARM and Thumb are separate
architectures in target-lexicon, and we can figure out how to
select RISC-V compressed encodings when we're ready.
* Optimize 0.0 floating point constants. Rather than using the existing
process of emitting bit patterns and moving them into floating point
registers, use the `xorps` instruction to zero out the register.
* is_zero predicate function will not accept negative zero. Fixed formatting for encoding recipe and filetests.
* Start adding the load_complex and store_complex instructions.
N.b.:
The text format is not correct yet. Requires changes to the lexer and parser.
I'm not sure why I needed to change the RuntimeError to Exception yet. Will fix.
* Get first few encodings of load_complex working. Still needs var args type checking.
* Clean up ModRM helper functions in binemit.
* Implement 32-bit displace for load_complex
* Use encoding helpers instead of doing them all by hand
* Initial implementation of store_complex
* Parse value list for load/store_complex with + as delimiter. Looks nice.
* Add sign/zero-extension and size variants for load_complex.
* Add size variants of store_complex.
* Add asm helper lines to load/store complex bin tests.
* Example of length-checking the instruction ValueList for an encoding. Extremely questionable implementation.
* Fix Python linting issues
* First draft of postopt pass to fold adds and loads into load_complex. Just simple loads for now.
* Optimization pass now works with all types of loads.
* Add store+add -> store_complex to postopt pass
* Put complex address optimization behind ISA flag.
* Add load/store complex for f32 and f64
* Fixes changes to lexer that broke NaN parsing.
Abstracts away the repeated checks for whether or not the characters
following a + or - are going to be parsed as a number or not.
* Fix formatting issues
* Fix register restrictions for complex addresses.
* Encoding tests for x86-32.
* Add documentation for newly added instructions, recipes, and cdsl changes.
* Fix python formatting again
* Apply value-list length predicates to all LoadComplex and StoreComplex instructions.
* Add predicate types to new encoding helpers for mypy.
* Import FieldPredicate to satisfy mypy.
* Add and fix some "asm" strings in the encoding tests.
* Line-up 'bin' comments in x86/binary64 test
* Test parsing of offset-less store_complex instruction.
* 'sNaN' not 'sNan'
* Bounds check the lookup for polymorphic typevar operand.
* Fix encodings for istore16_complex.
* initial set of work for windows fastcall (x64) call convention
- call conventions: rename `fastcall` to `windows_fastcall`
- add initial set of filetests
- ensure arguments are written after the shadow space/store (offset-wise)
The shadow space available before the arguments (range 0..32)
is not used as spill space yet.
* address review feedback