Commit Graph

10838 Commits

Author SHA1 Message Date
Trevor Elliott
116e5a665f Bump regalloc2 to 0.6.0 (#5742)
* Bump regalloc2
* Certify regalloc2 0.6.0
2023-02-07 15:57:49 -08:00
Andrew Brown
121094054b doc: update Wasm threads support status (#5739) 2023-02-07 14:56:26 -08:00
Trevor Elliott
3343cf80e9 Add assertions for matches that used to use analyze_branch (#5733)
Following up from #5730, add debug assertions to ensure that new branch instructions don't slip through matches that used to use analyze_branch.
2023-02-07 14:51:18 -08:00
Nick Fitzgerald
317cc51337 Rename VMCallerCheckedAnyfunc to VMCallerCheckedFuncRef (#5738)
At some point what is now `funcref` was called `anyfunc` and the spec changed,
but we didn't update our internal names. This does that.

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-07 22:09:02 +00:00
Andrew Brown
edfa10d607 wasi-threads: an initial implementation (#5484)
This commit includes a set of changes that add initial support for `wasi-threads` to Wasmtime:

* feat: remove mutability from the WasiCtx Table

This patch adds interior mutability to the WasiCtx Table and the Table elements.

Major pain points:
* `File` only needs `RwLock<cap_std::fs::File>` to implement
  `File::set_fdflags()` on Windows, because of [1]
* Because `File` needs a `RwLock` and `RwLock*Guard` cannot
  be hold across an `.await`, The `async` from
  `async fn num_ready_bytes(&self)` had to be removed
* Because `File` needs a `RwLock` and `RwLock*Guard` cannot
  be dereferenced in `pollable`, the signature of
  `fn pollable(&self) -> Option<rustix::fd::BorrowedFd>`
  changed to `fn pollable(&self) -> Option<Arc<dyn AsFd + '_>>`

[1] da238e324e/src/fs/fd_flags.rs (L210-L217)

* wasi-threads: add an initial implementation

This change is a first step toward implementing `wasi-threads` in
Wasmtime. We may find that it has some missing pieces, but the core
functionality is there: when `wasi::thread_spawn` is called by a running
WebAssembly module, a function named `wasi_thread_start` is found in the
module's exports and called in a new instance. The shared memory of the
original instance is reused in the new instance.

This new WASI proposal is in its early stages and details are still
being hashed out in the [spec] and [wasi-libc] repositories. Due to its
experimental state, the `wasi-threads` functionality is hidden behind
both a compile-time and runtime flag: one must build with `--features
wasi-threads` but also run the Wasmtime CLI with `--wasm-features
threads` and `--wasi-modules experimental-wasi-threads`. One can
experiment with `wasi-threads` by running:

```console
$ cargo run --features wasi-threads -- \
    --wasm-features threads --wasi-modules experimental-wasi-threads \
    <a threads-enabled module>
```

Threads-enabled Wasm modules are not yet easy to build. Hopefully this
is resolved soon, but in the meantime see the use of
`THREAD_MODEL=posix` in the [wasi-libc] repository for some clues on
what is necessary. Wiggle complicates things by requiring the Wasm
memory to be exported with a certain name and `wasi-threads` also
expects that memory to be imported; this build-time obstacle can be
overcome with the `--import-memory --export-memory` flags only available
in the latest Clang tree. Due to all of this, the included tests are
written directly in WAT--run these with:

```console
$ cargo test --features wasi-threads -p wasmtime-cli -- cli_tests
```

[spec]: https://github.com/WebAssembly/wasi-threads
[wasi-libc]: https://github.com/WebAssembly/wasi-libc

This change does not protect the WASI implementations themselves from
concurrent access. This is already complete in previous commits or left
for future commits in certain cases (e.g., wasi-nn).

* wasi-threads: factor out process exit logic

As is being discussed [elsewhere], either calling `proc_exit` or
trapping in any thread should halt execution of all threads. The
Wasmtime CLI already has logic for adapting a WebAssembly error code to
a code expected in each OS. This change factors out this logic to a new
function, `maybe_exit_on_error`, for use within the `wasi-threads`
implementation.

This will work reasonably well for CLI users of Wasmtime +
`wasi-threads`, but embedders will want something better in the future:
when a `wasi-threads` threads fails, they may not want their application
to exit. Handling this is tricky, because it will require cancelling the
threads spawned by the `wasi-threads` implementation, something that is
not trivial to do in Rust. With this change, we defer that work until
later in order to provide a working implementation of `wasi-threads` for
experimentation.

[elsewhere]: https://github.com/WebAssembly/wasi-threads/pull/17

* review: work around `fd_fdstat_set_flags`

In order to make progress with wasi-threads, this change temporarily
works around limitations induced by `wasi-common`'s
`fd_fdstat_set_flags` to allow `&mut self` use in the implementation.
Eventual resolution is tracked in
https://github.com/bytecodealliance/wasmtime/issues/5643. This change
makes several related helper functions (e.g., `set_fdflags`) take `&mut
self` as well.

* test: use `wait`/`notify` to improve `threads.wat` test

Previously, the test simply executed in a loop for some hardcoded number
of iterations. This changes uses `wait` and `notify` and atomic
operations to keep track of when the spawned threads are done and join
on the main thread appropriately.

* various fixes and tweaks due to the PR review

---------

Signed-off-by: Harald Hoyer <harald@profian.com>
Co-authored-by: Harald Hoyer <harald@profian.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2023-02-07 13:43:02 -08:00
Trevor Elliott
2c8425998b Refactor matches that used to consume BranchInfo (#5734)
Explicitly borrow the instruction data, and use a mutable borrow to avoid rematch.
2023-02-07 13:29:42 -08:00
Raekye
fdd4a778fc Fix ABI of jitted function in cranelift-jit example. (#5736) 2023-02-07 21:16:05 +00:00
Alex Crichton
72962c9f08 Add some minor souper-harvested optimizations (#5735)
I was playing around with souper recently on some wasms I had lying
around and these are some optimization opportunities that popped out
which seemed easy-enough to add to the egraph-based optimizations.
2023-02-07 14:06:24 -06:00
Brendan Burns
08403c9915 Update base64 dependency to 0.21.0 (#5702)
* Update base64 to 0.21.0

* Update code for base64 0.21.0
2023-02-07 04:34:01 +00:00
Chris Fallin
673b448cfe Cranelift DFG: make inst clone deep-clone varargs lists. (#5727)
When investigating #5716, I found that rematerialization of a `call`, in
addition to blowing up for other reasons, caused aliasing of the varargs
list (the `EntityList` in the `ListPool`), such that editing the args of
the second copy of the call instruction inadvertently updated the first
as well.

This PR modifies `DataFlowGraph::clone_inst` so that it always clones
the varargs list if present. This shouldn't have any functional impact
on Cranelift today, because we don't rematerialize any instructions with
varargs; but it's important to get it right to avoid a bug later!
2023-02-07 01:21:09 +00:00
Trevor Elliott
c8a6adf825 Remove analyze_branch and BranchInfo (#5730)
We don't have overlap in behavior for branch instructions anymore, so we can remove analyze_branch and instead match on the InstructionData directly.

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-06 17:06:57 -08:00
Chris Fallin
75ae976adc egraphs: fix accidental remat of call. (#5726)
In the provided test case in #5716, the result of a call was then
added to 0. We have a rewrite rule that sets the remat-bit on any add
of a value and a constant, because these frequently appear (e.g. from
address offset calculations) and this can frequently reduce register
pressure (one long-lived base vs. many long-lived base+offset values).
Separately, we have an algebraic rule that `x+0` rewrites to `x`.

The result of this was that we had an eclass with the remat bit set on
the add, but the add was also union'd into the call. We pick the
latter during extraction, because it's cheaper not to do the add at
all; but we still get the remat bit, and try to remat a call (!),
which blows up later.

This PR fixes the logic to look up the "best value" for a value (i.e.,
whatever extraction determined), and look up the remat bit on *that*
node, not the canonical node.

(Why did the canonical node become the iadd and not the call? Because
the former had a lower value-number, as an accident of IR
construction; we don't impose any requirements on the input CLIF's
value-number ordering, and I don't think this breaks any of the
important acyclic properties, even though there is technically a
dependence from a lower-numbered to a higher-numbered node. In essence
one can think of them as having "virtual numbers" in any true
topologically-sorted order, and the only place the actual integer
indices matter should be in choosing the "canonical ID", which is just
used for dedup'ing, modulo this bug.)

Fixes #5716.
2023-02-06 23:36:16 +00:00
bjorn3
16afefdab1 Some refactorings to the ISLE parser (#5693)
* Use is_ascii_digit and is_ascii_hexdigit in the ISLE lexer

* Use range pattern in ISLE lexer

* Use a couple of shorthands in the ISLE parser

* Use parse_ident instead of symbol + str_to_ident

* Introduce token eating api

This is a non-fatal version of the take api

* Rename take to expect and add expect_ prefixes to several methods

* Review comments
2023-02-06 15:11:25 -08:00
Trevor Elliott
e9c05622c0 Keep reachable jump tables (#5721)
Instead of identifying unused branch tables by looking for unused blocks inside of them, track used branch tables while traversing reachable blocks. This introduces an extra allocation of an EntitySet to track the used jump tables, but as those are few and this function runs once per ir::Function, the allocation seems reasonable.
2023-02-06 14:10:47 -08:00
Jamey Sharp
65c1f654f2 Cranelift: Only build iconst for ints <= 64 bits (#5723)
I audited the egraph "algebraic" optimization rules for any which
construct an `iconst` on the right-hand side of the rule. In these cases
we need to constrain the type passed to `iconst` to be both `fits_in_64`
and `ty_int`, because `iconst` is not defined on other types.
2023-02-06 14:10:29 -08:00
Alex Crichton
284fec127a Remove explicit S type from component functions (#5722)
I ended up forgetting this as part of #5275.
2023-02-06 16:07:57 -06:00
Saúl Cabrera
939b6ea933 winch: Fix retrieving function signature for compilation (#5725)
This commit fixes an incorrect usage of `func_type_at` to retrieve a defined
function signature and instead uses `function_at` to retrieve the signature.

Additionally it enhances `winch-tools` `compile` and `test` commands to handle
modules with multiple functions correctly.
2023-02-06 22:02:38 +00:00
Alex Crichton
1390882b56 Update release notes for 6.0.0 (#5719) 2023-02-06 19:59:05 +00:00
Alex Crichton
de0e0bea3f Legalize b{and,or,xor}_not into component instructions (#5709)
* Remove trailing whitespace in `lower.isle` files

* Legalize the `band_not` instruction into simpler form

This commit legalizes the `band_not` instruction into `band`-of-`bnot`,
or two instructions. This is intended to assist with egraph-based
optimizations where the `band_not` instruction doesn't have to be
specifically included in other bit-operation-patterns.

Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.

* Legalize `bor_not` into components

Same as prior commit, but for the `bor_not` instruction.

* Legalize bxor_not into bxor-of-bnot

Same as prior commits. I think this also ended up fixing a bug in the
s390x backend where `bxor_not x y` was actually translated as `bnot
(bxor x y)` by accident given the test update changes.

* Simplify not-fused operands for riscv64

Looks like some delegated-to rules have special-cases for "if this
feature is enabled use the fused instruction" so move the clause for
testing the feature up to the lowering phase to help trigger other rules
if the feature isn't enabled. This should make the riscv64 backend more
consistent with how other backends are implemented.

* Remove B{and,or,xor}Not from cost of egraph metrics

These shouldn't ever reach egraphs now that they're legalized away.

* Add an egraph optimization for `x^-1 => ~x`

This adds a simplification node to translate xor-against-minus-1 to a
`bnot` instruction. This helps trigger various other optimizations in
the egraph implementation and also various backend lowering rules for
instructions. This is chiefly useful as wasm doesn't have a `bnot`
equivalent, so it's encoded as `x^-1`.

* Add a wasm test for end-to-end bitwise lowerings

Test that end-to-end various optimizations are being applied for input
wasm modules.

* Specifically don't self-update rustup on CI

I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.

* Cleanup some aarch64 lowering rules

Previously a 32/64 split was necessary due to the `ALUOp` being different
but that's been refactored away no so there's no longer any need for
duplicate rules.

* Narrow a x64 lowering rule

This previously made more sense when it was `band_not` and rarely used,
but be more specific in the type-filter on this rule that it's only
applicable to SIMD types with lanes.

* Simplify xor-against-minus-1 rule

No need to have the commutative version since constants are already
shuffled right for egraphs

* Optimize band-of-bnot when bnot is on the left

Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.

* Add commutative lowering rules

* Update cranelift/codegen/src/isa/x64/lower.isle

Co-authored-by: Jamey Sharp <jamey@minilop.net>

---------

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-06 13:53:40 -06:00
Afonso Bordado
99c3936616 fuzzgen: Enable rotl for riscv64 (#5715)
It was fixed in #5611
2023-02-06 18:27:31 +00:00
Pat Hickey
743a40a6c4 Cargo update cap-std family, and audit deps (#5710)
* update cap-std family and its deps, and audit them

* audit base64: append a safe-to-deploy entry

I mistakenly marked it safe-to-run not understanding that safe-to-deploy was required.

* update to fd-lock 3.0.10

eliminates duplicate dep on windows-sys
2023-02-06 10:16:19 -08:00
Jamey Sharp
23e1d6b5e3 egraphs/cprop: Don't extend constants to i128 (#5717)
Fixes #5711.
2023-02-06 17:34:21 +00:00
wasmtime-publish
482f541101 Bump Wasmtime to 7.0.0 (#5712)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2023-02-06 09:10:19 -06:00
Chris Fallin
43022c862a Add cargo-vet updates for audit backlog. (#5708) 2023-02-04 01:46:44 +00:00
Jamey Sharp
97381792ac Generalize u/sextend constant folding to all types (#5706)
Also move these optimization rules to cprop.isle; it's where all the
other similar rules are.

Like the other cprop rules, these can subsume any other rules. We can't
do better than reducing an expression to a constant.

The new i64_sextend_imm64 and u64_uextend_imm64 constructors are useful
helpers to clean up other code. I applied them to `imm64_icmp` while I
was here, as well as using the existing `ty_mask` helper to clean up
`imm64_masked`.
2023-02-03 17:29:21 -08:00
Pat Hickey
331bc281a1 cargo-vet: audit base64 0.21.0 (#5707) 2023-02-04 01:17:47 +00:00
Trevor Elliott
6d8f2be9e1 Use andn for band_not when bmi1 is present (#5701)
We can use the andn instruction for the lowering of band_not on x64 when bmi1 is available.
2023-02-03 16:23:18 -08:00
Saúl Cabrera
0ba1448fa4 winch: Add missing conversions between x64 types (#5703)
This commit adds some missing conversions between Winch's x64 `Reg` type and
Cranelift's `Gpr`, `WritableGpr` and `GprMemImm`. This results in less
boilerplate. This is also a bit of groundwork in the assembler to support
the rest of the integer binary instructions.
2023-02-03 15:55:30 -08:00
Nick Fitzgerald
e18d4cb711 Cranelift: Introduce support for return_call in the interpreter (#5697)
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-03 15:53:54 -08:00
Nick Fitzgerald
72c8513411 Cranelift: Correctly wrap shifts in constant propagation (#5695)
Fixes #5690
Fixes #5696

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-03 00:12:57 +00:00
yuyang
fd67ccf9cd Perform I-Cache Maintenance on RISC-V (#5698)
* re modify this.

* fix compile failure.

* fix unused warning.
2023-02-02 16:11:24 -08:00
Saúl Cabrera
426c49b8e3 winch: Use aarch64 backend for code emission. (#5652)
This patch introduces basic aarch64 code generation by using
`cranelift-codegen`'s backend.

This commit *does not*:

* Change the semantics of the code generation
* Adds support for other Wasm instructions

The most notable change in this patch is how addressing modes are handled at the
MacroAssembler layer: instead of having a canonical address representation, this
patch introduces the addressing mode as an associated type in the
MacroAssembler trait. This approach has the advantage that gives each ISA enough
flexiblity to describe the addressing modes and their constraints in isolation
without having to worry on how a particular addressing mode is going to affect
other ISAs. In the case of Aarch64 this becomes useful to describe indexed
addressing modes (particularly from the stack pointer).

This patch uses the concept of a shadow stack pointer (x28) as a workaround to
Aarch64's stack pointer 16-byte alignment. This constraint is enforced by:

* Introducing specialized addressing modes when using the real stack pointer; this
enables auditing when the real stack pointer is used. As of this change, the
real stack pointer is only used in the function's prologue and epilogue.

* Asserting that the real stack pointer is not used as a base for addressing
modes.

* Ensuring that at any point during the code generation process where the stack
pointer changes (e.g. when stack space is allocated / deallocated) the value of
the real stack pointer is copied into the shadow stack pointer.
2023-02-02 14:24:11 -08:00
Alex Crichton
a2a0a9ef5b Update to the latest wit-parser (#5694)
This notably pulls in support in WIT for types-in-worlds.
2023-02-02 19:21:01 +00:00
Alex Crichton
545749b279 Fix some wit-bindgen-related issues with generated bindings (#5692)
* Prefix component-bindgen-generated-functions with `call_`

This fixes clashes between Rust-native methods and the methods
themselves. For example right now `new` is a Rust-generated function for
constructing the wrapper but this can conflict with a world-exported
function called `new`.

Closes #5585

* Fix types being both shared and owned

This refactors some inherited cruft from the original `wit-bindgen`
repository to be more Wasmtime-specific and fixes a codegen case where
a type was used in both a shared and an owned context.

Closes #5688
2023-02-02 11:54:35 -06:00
Alex Crichton
63d80fc509 Remove the need to have a Store for an InstancePre (#5683)
* Remove the need to have a `Store` for an `InstancePre`

This commit relaxes a requirement of the `InstancePre` API, notably its
construction via `Linker::instantiate_pre`. Previously this function
required a `Store<T>` to be present to be able to perform type-checking
on the contents of the linker, and now this requirement has been
removed.

Items stored within a linker are either a `HostFunc`, which has type
information inside of it, or an `Extern`, which doesn't have type
information inside of it. Due to the usage of `Extern` this is why a
`Store` was required during the `InstancePre` construction process, it's
used to extract the type of an `Extern`. This commit implements a
solution where the type information of an `Extern` is stored alongside
the `Extern` itself, meaning that the `InstancePre` construction process
no longer requires a `Store<T>`.

One caveat of this implementation is that some items, such as tables and
memories, technically have a "dynamic type" where during type checking
their current size is consulted to match against the minimum size
required of an import. This no longer works when using
`Linker::instantiate_pre` as the current size used is the one when it
was inserted into the linker rather than the one available at
instantiation time. It's hoped, however, that this is a relatively
esoteric use case that doesn't impact many real-world users.

Additionally note that this is an API-breaking change. Not only is the
`Store` argument removed from `Linker::instantiate_pre`, but some other
methods such as `Linker::define` grew a `Store` argument as the type
needs to be extracted when an item is inserted into a linker.

Closes #5675

* Fix the C API

* Fix benchmark compilation

* Add C API docs

* Update crates/wasmtime/src/linker.rs

Co-authored-by: Andrew Brown <andrew.brown@intel.com>

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
2023-02-02 11:54:20 -06:00
Saúl Cabrera
f5f517e811 winch: Small clean-up for x64 (#5691)
This commit contains a small set of clean up items for x64.

Notably:

* Adds filetests
* Documents why 16 for the arg base offset abi implementation, for clarity.
* Fixes a bug in the spill implementation caught while anlyzing the
filetests results. The fix consists of emitting a load instead of a store into
the scratch register before spiiling its value.
* Remove dead code for pretty printing registers which is not needed anymore
since we now have proper disassembly.
2023-02-02 16:40:31 +00:00
Trevor Elliott
446337c746 Generate an instance_pre wrapper in the component bindgen output (#5685) 2023-02-02 09:26:09 -06:00
Jun Ryung Ju
9cd4146939 Implemented b{and,or,xor}_not bitops for ty_int_ref_scalar_64 type. (#5604)
* Implemented `b{and,or,xor}_not` bitops for ty_int_ref_scalar_64 type.

* Added tests.
2023-02-01 21:57:18 -08:00
Jamey Sharp
ac4d28f4dd Constant-fold icmp instructions (#5666)
We found examples of icmp instructions with both operands constant in
spidermonkey.wasm.
2023-02-01 21:55:36 +00:00
Nick Fitzgerald
bdfb746548 Cranelift: Introduce the return_call and return_call_indirect instructions (#5679)
* Cranelift: Introduce the `tail` calling convention

This is an unstable-ABI calling convention that we will eventually use to
support Wasm tail calls.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>

* Cranelift: Introduce the `return_call` and `return_call_indirect` instructions

These will be used to implement tail calls for Wasm and any other language
targeting CLIF. The `return_call_indirect` instruction differs from the Wasm
instruction of the same name by taking a native address callee rather than a
Wasm function index.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>

* Cranelift: Implement verification rules for `return_call[_indirect]`

They must:

* have the same return types between the caller and callee,
* have the same calling convention between caller and callee,
* and that calling convention must support tail calls.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>

* cargo fmt

---------

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-01 21:20:35 +00:00
Nick Fitzgerald
ffbbfbffce Cranelift: Rewrite or(and(x, y), not(y)) => or(x, not(y)) again (#5684)
This rewrite was introduced in #5676 and then reverted in #5682 due to a footgun
where we accidentally weren't actually checking the `y == !z` precondition. This
commit fixes the precondition check. It also fixes the arithmetic to be
correctly masked to the value type's width.

This reverts commit 268f6bfc1d.
2023-02-01 20:53:22 +00:00
Alex Crichton
91b8a2c527 Always allocate Instance memory with malloc (#5656)
This commit removes the pooling of `Instance` allocations from the
pooling instance allocator. This means that the allocation of `Instance`
(and `VMContext`) memory, now always happens through the system `malloc`
and `free` instead of optionally being part of the pooling instance
allocator. Along the way this refactors the `InstanceAllocator` trait so
the pooling and on-demand allocators can share more structure with this
new property of the implementation.

The main rationale for this commit is to reduce the RSS of long-lived
programs which allocate instances with the pooling instance allocator
and aren't using the "next available" allocation strategy. In this
situation the memory for an instance is never decommitted until the end
of the program, meaning that eventually all instance slots will become
occupied and resident. This has the effect of Wasmtime slowly eating
more and more memory over time as each slot gets an instance allocated.
By switching to the system allocator this should reduce the current RSS
workload from O(used slots) to O(active slots), which is more in line
with expectations.
2023-02-01 19:37:45 +00:00
Alex Crichton
8ffbb9cfd7 Reimplement the pooling instance allocation strategy (#5661)
* Reimplement the pooling instance allocation strategy

This commit is a reimplementation of the strategy by which the pooling
instance allocator selects a slot for a module. Previously there was a
choice amongst three different algorithms: "reuse affinity", "next
available", and "random". The default was "reuse affinity" but some new
data has come to light which shows that this may not always be a good
default.

Notably the pooling allocator will retain some memory per-slot in the
pooling instance allocator, for example instance data or memory data
if-so-configured. This means that a currently unused, but previously
used, slot can contribute to the RSS usage of a program using Wasmtime.
Consequently the RSS impact here is O(max slots) which can be
counter-intuitive for embedders. This particularly affects "reuse
affinity" because the algorithm for picking a slot when there are no
affine slots is "pick a random slot", which means eventually all slots
will get used.

In discussions about possible ways to tackle this, an alternative to
"pick a strategy" arose and is now implemented in this commit.
Concretely the new allocation algorithm for a slot is now:

* First pick the most recently used affine slot, if one exists.
* Otherwise if the number of affine slots to other modules is above some
  threshold N then pick the least-recently used affine slot.
* Otherwise pick a slot that's affine to nothing.

The "N" in this algorithm is configurable and setting it to 0 is the
same as the old "next available" strategy while setting it to infinity
is the same as the "reuse affinity" algorithm. Setting it to something
in the middle provides a knob to allow a modest "cache" of affine slots
while not allowing the total set of slots used to grow too much beyond
the maximal concurrent set of modules. The "random" strategy is now no
longer possible and was removed to help simplify the allocator.

* Resolve rustdoc warnings in `wasmtime-runtime` crate

* Remove `max_cold` as it duplicates the `slot_state.len()`

* More descriptive names

* Add a comment and debug assertion

* Add some list assertions
2023-02-01 11:43:51 -06:00
yuyang
cb3b6c621f fix rotl.i16 with i128 shift value. (#5611)
* fix issue 5523.

* fix.

* add missing issue file.

* fix issue.

* fix duplicate shamt_128.

* issue 5523 add test target,and fix some wrong comment.

* fix output file.

* enable llvm_abi_extensions for regression test file.
2023-02-01 03:44:13 +00:00
Trevor Elliott
268f6bfc1d Revert "Cranelift: Rewrite or(and(x, y), not(y)) => or(x, not(y)) (#5676)" (#5682)
This reverts commit 8c9eb9939b.

Fixes #5680
2023-02-01 02:53:23 +00:00
yuyang
0c66a1bba7 Fix issue 5528 (#5605)
* fix parameter error.

* fix float convert to i8 and i16   should extract sign bit.

* add missing regression test file.

* using tmp register.

* float convert i8 will consume more instructions.

* fix worse inst emit size.

* fix worst_case_size.
2023-01-31 15:37:36 -08:00
Nick Fitzgerald
8c9eb9939b Cranelift: Rewrite or(and(x, y), not(y)) => or(x, not(y)) (#5676)
Co-authored-by: Rainy Sinclair <844493+itsrainy@users.noreply.github.com>
2023-01-31 22:44:45 +00:00
Trevor Elliott
e82995f03c Add a convenience function for displaying a BlockCall (#5677)
Add a display method to BlockCall that returns a std::fmt::Displayable result. Rework the display code in the write module of cranelift-codegen to use this method instead.
2023-01-31 14:26:10 -08:00
Nick Fitzgerald
253e28ca4f Cranelift: Rewrite (x>>k)<<k into masking off the bottom k bits (#5673)
* Cranelift: Rewrite `(x>>k)<<k` into masking off the bottom `k` bits

* Add a runtest for exercising our rewrite of `(x >> k) << k` into masking
2023-01-31 21:11:12 +00:00
Alex Crichton
7f2c8e6344 Fix some warnings on nightly Rust (#5668)
* Fix some warnings on nightly Rust

Cargo is warning about the usage of workspace dependencies where the
workspace declaration does not mention `default-features` but the
dependency mentions `default-features`, so this explicitly turns off
default features for `cranelift-codegen` at the workspace level and
removes the explicit `default-features = false` at the manifest levels.

* Explicitly enable default feature in wasmtime

* Enable another feature
2023-01-31 20:54:58 +00:00