This commit changes 128-bit constant parsing in two ways:
- it adds the ability to use underscores to separate digits when writing a 128-bit constant in hexadecimal; e.g. `0x00010203...` can now be written as `0x0001_0203_...`
- it adds a new mechanism for parsing 128-bit constants using integer/float/boolean literals; e.g. `vconst.i32x4 [1 2 3 4]`. Note that currently the controlling type of the instruction dictates how many literals to parse inside the brackets.
Also, as a ridealong fix, removes R32 encodings for x86_64 in `enc_r32_r64`,
since the type `rXX` by definition only exists for targets with word size `XX`
bits.
* Fix segfault due to b64 encoding
Prior to this patch, bconst.b64 encoded its instruction with a 32-bit immediate that caused improper decoding of the MOV instruction; instead, use a REX prefix and rely on zero-extension of the immediate. Fixes#911.
* Add ability to run CLIF IR using `clif-util run [-v] {file}` and add `test run` to cranelift-filetests to allow executing CLIF
This re-factors the compile/execute parts to a FunctionRunner that is shared between cranelift-filetests and clif-util. CLIF can be now be run using `clif-util run` as well as during `clif-util test` for files with a `test run` header. As before, only functions suffixed with a `run` comment are executed. The `run: fn(...) == ...` expression syntax is left for a subsequent change.
In talking to @sunfishcode, he preferred to avoid the confusion of more ISA predicates by eliminating SSE2. SSE2 was released with the Pentium 4 in 2000 so it is unlikely that current CPUs would have SIMD enabled and not have this feature. I tried to note the SSE2-specific instructions with comments in the code.
This makes non-legalized jump table instructions operate on operands with
pointer-sized types. This means we need to extend smaller types into the
pointer-sized operand, when the two don't match.
The result of the emitter is a vector of bytes holding machine code,
jump tables, and (in the future) other read-only data. Some clients,
notably Firefox's Wasm compiler, needs to separate the machine code
from the data in order to insert more code directly after the code
generated by Cranelift.
To make such separation possible, we record more information about the
emitted bytes: the sizes of each of the sections of code, jump tables,
and read-only data, as well as the locations within the code that
reference (PC-relatively) the jump tables and read-only data.
* Use single index for param register allocation for windows callconv (#691)
The used registers depend entirely on the parameter index (1st, 2nd, 3rd, 4th, ... param)
and we cannot shift unused registers to other indexes, if they are not designated for
the use for that parameter index.
This was previously using the following condition to decide that a block
hadn't been visited yet: either dest_offset is non-0 or the block isn't
the entry block. Unfortunately, this didn't work when the first block
would be non-empty but wouldn't generate code at all.
Since the original code would do at least one pass over the entire code,
the first pass that determines initial EBB offsets is done separately,
without considering branch relaxation. This ensures that all EBBs have
been visited and have correct initial offsets, and doesn't require a
special check to know whether an EBB has been visited or not.