This commit is based on the assumption that floats are already stored in XMM registers in x86. When extracting a lane, cranelift was moving the float to a regular register and back to an XMM register; this change avoids this by shuffling the float value to the lowest bits of the XMM register. It also assumes that the upper bits can be left as is (instead of zeroing them out).
`raw_bitcast` matches the intent of this legalization more clearly (it simply changes the CLIF type without changing any bits), and the additional null encodings are necessary for later instructions.
This patch makes `blocktype_to_type` return `Err(..)` only when it really is
an error to continue. The three use points of `blocktype_to_type` are
changed to check for an `Err(..)` rather
than silently ignoring it. There are also cosmetic changes to `type_to_type`
and `tabletype_to_type`.
When compiling wasm_lua_binarytrees, this reduces the number of heap blocks
allocated by Cranelift (CL) by 1.9%. Instruction count falls by 0.1%.
Details:
* `type_to_type` and `tabletype_to_type`:
- Added the function name in the failure message
- No functional change for non-error cases
- Push the `Ok(..)` to expression leaves, where it really applies. This
corrects the misleading impression that, in the case of an unsupported
type, the function returns `Ok` wrapped around whatever
`wasm_unsupported` returns. It doesn't do that, but it certainly reads
like that. This assumes that the LLVM backend will do tail merging, so
the generated code will be unchanged.
* `blocktype_to_type`:
- Change return type from `WasmResult<ir::Type>` to `WasmResult<Option<ir::Type>>`
- Manually inline the call to `type_to_type`, to make this function easier
to read.
- For the non-error case: map `TypeOrFuncType::Type(Type::EmptyBlockType)`
to `Ok(None)` rather than `Err(..)`, since that's what all the call sites
expect
- For the error cases, add the function name in the failure messages
* cranelift-wasm/src/code_translator.rs
- For the three uses of `blocktype_to_type`, use `?` to detect failures and
drop out immediately, meaning that the code will no longer silently ignore
errors.
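A minimal sketch of the new return shape described above. The enums and the error type here are simplified, hypothetical stand-ins rather than the real wasmparser or Cranelift definitions, and `block_result_count` is an invented caller used only to show `?` propagating the error:

```rust
// Hypothetical, simplified stand-ins for the real wasm and CLIF types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WasmType {
    I32,
    I64,
    F32,
    F64,
    EmptyBlockType,
    AnyFunc, // stands in for any type a block cannot return
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ClifType {
    I32,
    I64,
    F32,
    F64,
}

// Hypothetical error type carrying the failure message.
#[derive(Debug, PartialEq)]
struct Unsupported(String);

type WasmResult<T> = Result<T, Unsupported>;

// An empty block type is not an error: it maps to `Ok(None)`.
// Only genuinely unsupported types produce `Err(..)`, and the
// message names the function that failed.
fn blocktype_to_type(ty: WasmType) -> WasmResult<Option<ClifType>> {
    match ty {
        WasmType::I32 => Ok(Some(ClifType::I32)),
        WasmType::I64 => Ok(Some(ClifType::I64)),
        WasmType::F32 => Ok(Some(ClifType::F32)),
        WasmType::F64 => Ok(Some(ClifType::F64)),
        WasmType::EmptyBlockType => Ok(None),
        other => Err(Unsupported(format!(
            "blocktype_to_type: unsupported wasm type {:?}",
            other
        ))),
    }
}

// Hypothetical call site: `?` drops out immediately on failure
// instead of silently ignoring it.
fn block_result_count(ty: WasmType) -> WasmResult<usize> {
    let result = blocktype_to_type(ty)?;
    Ok(if result.is_some() { 1 } else { 0 })
}
```

With this shape a caller can distinguish "no result value" from "cannot translate" without inspecting the error.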
* [codegen] add new recipe "rout"
Add a new recipe "rout" intended for arithmetic operations that output
flags; it is currently used for `iadd_cout` and `isub_bout`.
Fixes: https://github.com/CraneStation/cranelift/issues/1009
This function is responsible for 8.5% of all heap allocation (calls) in CL.
This change avoids almost all of them by using a SmallVec::<[Value; 32]>
instead. Dynamic instruction count falls by 0.25%. The fixed size of 32 was
arrived at after profiling with fixed sizes of 1, 2, 4, 8, 16, 32, 64 and 128.
32 is as high as I can push it without the instruction count starting to creep
up again, and gets almost all the block-reduction win of 64 and 128.
Allocations associated with pushes to EbbHeaderBlockData::predecessors account
for 4.9% of all heap allocation (calls) in CL. This change avoids almost all
of them by changing it to be a SmallVec<[PredBlock; 4]>. Dynamic instruction
count falls by 0.15%.
Pushing on the `val_stack` vector is CL's biggest source of calls to
malloc/realloc/free, by some margin. It accounts for about 27.7% of all heap
blocks allocated when compiling wasm_lua_binarytrees. This change removes
pretty much all dynamic allocation by changing to a SmallVec<[Value; 8]>
instead. A fixed size of 4 gets all the gains to be had, in testing, so 8
gives some safety margin and is harmless from a stack-use perspective: 8
Values will occupy 32 bytes.
As a bonus, this change also reduces the compiler's dynamic instruction count
by about 0.5%.
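To illustrate why inline storage removes the allocations, here is a hand-rolled sketch of the small-vector idea. It is not the `smallvec` crate and not Cranelift's code: `ValStack` is a hypothetical type, and `Value` is assumed to be a 32-bit wrapper, which is what makes 8 of them occupy 32 bytes.

```rust
use std::mem::size_of;

// Assumption: a value reference is a 32-bit index.
#[derive(Clone, Copy, Default, Debug, PartialEq)]
struct Value(u32);

// Hypothetical stand-in for a small vector: up to N elements live
// inline, and the heap is touched only once that capacity is exceeded.
enum ValStack<const N: usize> {
    Inline { buf: [Value; N], len: usize },
    Heap(Vec<Value>),
}

impl<const N: usize> ValStack<N> {
    fn new() -> Self {
        ValStack::Inline { buf: [Value::default(); N], len: 0 }
    }

    fn push(&mut self, v: Value) {
        match self {
            ValStack::Inline { buf, len } => {
                if *len < N {
                    // Common case: no allocation at all.
                    buf[*len] = v;
                    *len += 1;
                } else {
                    // Rare case: spill the inline contents to the heap.
                    let mut spilled = buf[..*len].to_vec();
                    spilled.push(v);
                    *self = ValStack::Heap(spilled);
                }
            }
            ValStack::Heap(vec) => vec.push(v),
        }
    }

    fn len(&self) -> usize {
        match self {
            ValStack::Inline { len, .. } => *len,
            ValStack::Heap(vec) => vec.len(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, ValStack::Inline { .. })
    }
}

// Size in bytes of the inline buffer for a given capacity.
fn inline_bytes<const N: usize>() -> usize {
    size_of::<[Value; N]>()
}
```

Under these assumptions `inline_bytes::<8>()` is 32, and a stack that never grows past 8 entries never allocates.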
This commit adds a hook to the `ModuleEnvironment` trait to learn when a
custom section in a wasm file is read. This hook can in theory be used
to parse and handle custom sections as they appear in the wasm file
without having to re-iterate over the wasm file after cranelift has
already parsed the wasm file.
The `translate_module` function is now less strict in that it doesn't
require sections to be in a particular order; the assumption is that the
wasm file has already been validated elsewhere, including its section order.
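A rough sketch of how such a hook might look. The method name, signature, `Section` enum and `translate_sections` driver are all hypothetical simplifications for illustration, not the actual `ModuleEnvironment` API:

```rust
// Hypothetical, cut-down version of the trait: the hook has a default
// no-op body, so existing environments need no changes.
trait ModuleEnvironment<'data> {
    fn custom_section(
        &mut self,
        _name: &'data str,
        _data: &'data [u8],
    ) -> Result<(), String> {
        Ok(())
    }
}

// An environment that records each custom section's name and size
// as it is encountered.
struct SectionRecorder<'data> {
    seen: Vec<(&'data str, usize)>,
}

impl<'data> ModuleEnvironment<'data> for SectionRecorder<'data> {
    fn custom_section(
        &mut self,
        name: &'data str,
        data: &'data [u8],
    ) -> Result<(), String> {
        self.seen.push((name, data.len()));
        Ok(())
    }
}

// Hypothetical pre-parsed sections; a real translator would read
// these from the wasm binary.
enum Section<'data> {
    Custom { name: &'data str, data: &'data [u8] },
    Other,
}

// Single pass over the sections in whatever order they appear,
// notifying the environment about custom sections along the way.
fn translate_sections<'data, E: ModuleEnvironment<'data>>(
    sections: &[Section<'data>],
    env: &mut E,
) -> Result<(), String> {
    for section in sections {
        match section {
            Section::Custom { name, data } => env.custom_section(*name, *data)?,
            Section::Other => {}
        }
    }
    Ok(())
}
```

Because the environment sees custom sections during the single pass, it does not need to re-iterate over the file afterwards.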