When AVX512VL and AVX512F are available, use a single instruction
(`VCVTUDQ2PS`) instead of a length 9-instruction sequence. This
optimization is a port from the legacy x86 backend.
This change implements `vselect` using SSE4.1's `BLENDVPS`, `BLENDVPD`,
and `PBLENDVB`. `vselect` is a lane-selecting instruction that is used
by
[simple_preopt.rs](fa1faf5d22/cranelift/codegen/src/simple_preopt.rs (L947-L999))
to lower `bitselect` to a single x86 instruction when the condition mask
is known to be boolean (all 1s or 0s, e.g., from a conversion). This is
better than `bitselect` in general, which lowers to 4-5 instructions.
The old backend had the `vselect` lowering; this simply introduces it to
the new backend.
Previously the inclusion of the `criterion` crate had brought in a
transitive dependency to `cast`, which used old versions of several
libraries. Now that https://github.com/japaric/cast.rs/pull/26 is merged
and a new version published, we can update `cast` and remove the
cargo-deny rules for the duplicated, older versions.
Since `uadd_sat`, `sadd_sat`, `usub_sat`, and `ssub_sat` are now only
available to vector types, this removes the lowering code for the
scalar versions of these instructions in the arm32 and aarch64 backends.
This adds the machinery to encode the VPMULLQ instruction which is
available in AVX512VL and AVX512DQ. When these feature sets are
available, we use this instruction instead of a lengthy 12-instruction
sequence.
Since the lowering of `imul` complicated the other ALU operations it was
matched with and since future commits will alter the multiplication
lowering further, this change moves the `imul` lowering to its own match
block.
This adds benchmarks around module instantiation using criterion.
Both the default (i.e. on-demand) and pooling allocators are tested
sequentially and in parallel using a thread pool.
Instantiation is tested with an empty module, a module with a single page
linear memory, a larger linear memory with a data initializer, and a "hello
world" Rust WASI program.
Until https://github.com/japaric/cast.rs/pull/26 is resolved, the `cast`
crate will pull in older versions of the `rustc_version`, `semver`, and
`semver-parser` crates. `cast` is a build dependency of `criterion`
which is used for benchmarking and is itself a dev dependency, not a
normal dependency.
This change adds a criterion-enabled benchmark, x64-evex-encoding, to
compare the performance of the builder pattern used to encode EVEX
instructions in the new x64 backend against the function pattern
used to encode EVEX instructions in the legacy x86 backend. At face
value, the results imply that the builder pattern is faster, but no
efforts were made to analyze and optimize these approaches further.
In order to benchmark the encoding code with criterion, the functions
and structures must be public. Moving this code to its own module
(instead of keeping as a submodule to `inst`), allows `inst` to remain
private. This avoids having to expose and document (or ignore
documenting) the numerous instruction variants in `inst` while allowing
access to the encoding code. This commit changes no functionality.
* Upgrade to the latest versions of gimli, addr2line, object
And adapt to API changes. New gimli supports wasm dwarf, resulting in
some simplifications in the debug crate.
* upgrade gimli usage in linux-specific profiling too
* Add "continue" statement after interpreting a wasm local dwarf opcode
This adds support for the IBM z/Architecture (s390x-ibm-linux).
The status of the s390x backend in its current form is:
- Wasmtime is fully functional and passes all tests on s390x.
- All back-end features supported, with the exception of SIMD.
- There is still a lot of potential for performance improvements.
- Currently the only supported processor type is z15.
On Windows, `metadata` computes only partial metadata results, which don't
include what WASI needs for the `inode` field in `readdir` results. cap-std
has a `full_metadata` function which is able to include this extra
information, however it has more strict access requirements, so it sometimes
fails even when plain `metadata` would succeed.
Make WASI's `readdir` silently skip over files that can't be accessed by
`full_metadata`. These files wouldn't be openable in any other way by
WASI programs, so the only benefit of listing them would be to
let applications know that they exist. This allows it to avoid failing
and avoid returning bogus results.
This is part of a fix for bytecodealliance/cap-std#169.
Transient dependencies depend on two different versions of `cpuid-bool`.
This advisory does not appear to be urgent. We should review this ignore
after a few weeks to see if our deps have switched over.
text of the advisory:
Issued
May 6, 2021
Package
cpuid-bool (crates.io)
Type
Unmaintained
Details
https://github.com/RustCrypto/utils/pull/381
Patched
no patched versions
Description
Please use the `cpufeatures`` crate going forward:
https://github.com/RustCrypto/utils/tree/master/cpufeatures
There will be no further releases of cpuid-bool.