wasmtime

Author	SHA1	Message	Date
Chris Fallin	570dee63f3	Use MemFdSlot in the on-demand allocator as well.	2022-01-31 13:59:51 -08:00
Chris Fallin	3702e81d30	Remove ftruncate-trick for heap growth with memfd backend. Testing so far with recent Wasmtime has not been able to show the need for avoiding the process-wide mmap lock in real-world use-cases. As such, the technique of using an anonymous file and ftruncate() to extend it seems unnecessary; instead, memfd can always use anonymous zeroed memory for heap backing where the CoW image is not present, and mprotect() to extend the heap limit by changing page protections.	2022-01-31 12:53:22 -08:00
Chris Fallin	b73ac83c37	Add a pooling allocator mode based on copy-on-write mappings of memfds. As first suggested by Jan on the Zulip here [1], a cheap and effective way to obtain copy-on-write semantics of a "backing image" for a Wasm memory is to mmap a file with `MAP_PRIVATE`. The `memfd` mechanism provided by the Linux kernel allows us to create anonymous, in-memory-only files that we can use for this mapping, so we can construct the image contents on-the-fly then effectively create a CoW overlay. Furthermore, and importantly, `madvise(MADV_DONTNEED, ...)` will discard the CoW overlay, returning the mapping to its original state. By itself this is almost enough for a very fast instantiation-termination loop of the same image over and over, without changing the address space mapping at all (which is expensive). The only missing bit is how to implement heap growth. But here memfds can help us again: if we create another anonymous file and map it where the extended parts of the heap would go, we can take advantage of the fact that a `mmap()` mapping can be larger than the file itself, with accesses beyond the end generating a `SIGBUS`, and the fact that we can cheaply resize the file with `ftruncate`, even after a mapping exists. So we can map the "heap extension" file once with the maximum memory-slot size and grow the memfd itself as `memory.grow` operations occur. The above CoW technique and heap-growth technique together allow us a fastpath of `madvise()` and `ftruncate()` only when we re-instantiate the same module over and over, as long as we can reuse the same slot. This fastpath avoids all whole-process address-space locks in the Linux kernel, which should mean it is highly scalable. It also avoids the cost of copying data on read, as the `uffd` heap backend does when servicing pagefaults; the kernel's own optimized CoW logic (same as used by all file mmaps) is used instead. 
[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772	2022-01-31 12:53:18 -08:00
Chris Fallin	90e7cef56c	Merge pull request #3699 from cfallin/epoch-interruption Add epoch-based interruption for cooperative async timeslicing.	2022-01-20 14:45:30 -08:00
Chris Fallin	8a55b5c563	Add epoch-based interruption for cooperative async timeslicing. This PR introduces a new way of performing cooperative timeslicing that is intended to replace the "fuel" mechanism. The tradeoff is that this mechanism interrupts with less precision: not at deterministic points where fuel runs out, but rather when the Engine enters a new epoch. The generated code instrumentation is substantially faster, however, because it does not need to do as much work as when tracking fuel; it only loads the global "epoch counter" and does a compare-and-branch at backedges and function prologues. This change has been measured as ~twice as fast as fuel-based timeslicing for some workloads, especially control-flow-intensive workloads such as the SpiderMonkey JS interpreter on Wasm/WASI. The intended interface is that the embedder of the `Engine` performs an `engine.increment_epoch()` call periodically, e.g. once per millisecond. An async invocation of a Wasm guest on a `Store` can specify a number of epoch-ticks that are allowed before an async yield back to the executor's event loop. (The initial amount and automatic "refills" are configured on the `Store`, just as for fuel.) This call does only signal-safe work (it increments an `AtomicU64`) so could be invoked from a periodic signal, or from a thread that wakes up once per period.	2022-01-20 13:58:17 -08:00
Chris Fallin	2615ef967f	Merge pull request #3702 from uweigand/isle-prep-s390x s390x: Codegen fixes and preparation for ISLE migration	2022-01-20 12:02:08 -08:00
Nick Fitzgerald	0670d7beb5	Merge pull request #3703 from uweigand/isle-prep-common ISLE standard prelude: Additional types and helpers	2022-01-20 10:09:51 -08:00
Ulrich Weigand	be60a19623	ISLE standard prelude: Additional types and helpers In preparing to move the s390x back-end to ISLE, I noticed a few missing pieces in the common prelude code. This patch: - Defines the reference types $R32 / $R64. - Provides a trap_code_bad_conversion_to_integer helper. - Provides an avoid_div_traps helper. This requires passing the generic flags in addition to the ISA-specific flags into the ISLE lowering context.	2022-01-20 17:23:31 +01:00
Ulrich Weigand	c08a013b53	s390x: Codegen fixes and preparation for ISLE migration In preparing the back-end to move to ISLE, I detected a number of codegen bugs in the existing code, which are fixed here: - Fix internal compiler error with uload16/icmp corner case. - Fix broken Cls lowering. - Correctly mask shift count for i8/i16 shifts. In addition, I made several changes to operand encodings in various MInst patterns. These should not have any functional effect, but will make the ISLE migration easier: - Encode floating-point constants as u32/u64 in MInst patterns. - Encode shift amounts as u8 and Reg in ShiftOp pattern. - Use MemArg in LoadMultiple64 and StoreMultiple64 patterns.	2022-01-20 16:59:18 +01:00
Chris Fallin	9321a9db88	Add some agenda items to next Cranelift and Wasmtime meetings. (#3700) - Cranelift: items to discuss 2022 roadmap, coordinate who's working on what in ISLE transition for all platforms, and discuss platform tiers and arm32 (wrt above) - Wasmtime: item to discuss new revelations re: memfd/CoW and epoch interruption schemes	2022-01-19 18:18:04 -06:00
Chris Fallin	ae476fde60	Merge pull request #3698 from cfallin/cold-blocks Cranelift: add support for cold blocks.	2022-01-19 12:58:33 -08:00
Chris Fallin	f489b83835	Cranelift: add support for cold blocks. This PR adds a flag to each block that can be set via the frontend/builder interface that indicates that the block will not be frequently executed. As such, the compiler backend should place the block "out of line" in the final machine code, so that the ordinary, more frequent execution path that excludes the block does not have to jump around it. This is useful for adding handlers for exceptional conditions (slow-paths, guard violations) in a way that minimizes performance cost. Fixes #2747.	2022-01-19 12:17:41 -08:00
Chris Fallin	4a331b8981	Merge pull request #3679 from FreddieLiardet/fp_const_fmov Improve code generation for floating-point constants	2022-01-19 09:59:34 -08:00
Benjamin Bouvier	2649d2352c	Support vtune profiling of trampolines too (#3687) * Provide helpers for demangling function names * Profile trampolines in vtune too * get rid of mapping * avoid code duplication with jitdump_linux * maintain previous default display name for wasm functions * no dash, grrr * Remove unused profiling error type	2022-01-19 09:49:23 -06:00
Mrmaxmeier	2afd6900f4	runtime: expose DefaultMemoryCreator (#3670)	2022-01-18 09:17:33 -06:00
Freddie Liardet	b5531580e7	Improve code generation for floating-point constants Copyright (c) 2022, Arm Limited.	2022-01-18 10:39:05 +00:00
Chris Fallin	06a7bfdcbd	Merge pull request #3692 from akirilov-arm/abi_isa_flags Cranelift: Pass the ISA-specific compilation flags to the ABI implementations	2022-01-17 22:46:58 -08:00
Anton Kirilov	89919f4b1f	Pass the ISA-specific compilation flags to the ABI implementations Copyright (c) 2021, Arm Limited.	2022-01-14 14:18:01 +00:00
Nick Fitzgerald	df37074218	Merge pull request #3690 from fitzgen/a-bunch-more-isle cranelift: Port a bunch more lowerings to ISLE on x64	2022-01-13 18:08:31 -08:00
Nick Fitzgerald	a052285340	Fix typo: s/sentinals/sentinels/	2022-01-13 16:50:15 -08:00
Nick Fitzgerald	658c5d33c1	cranelift: Port `trap` and `resumable_trap` lowering to ISLE on x64	2022-01-13 15:57:17 -08:00
Nick Fitzgerald	5bb3645bd4	cranelift: Port `ineg` SIMD lowering to ISLE on x64	2022-01-13 15:57:17 -08:00
Nick Fitzgerald	7d943f68c5	Merge pull request #3688 from fitzgen/ushr-simd-isle cranelift: Port `ushr` SIMD lowerings to ISLE on x64	2022-01-13 15:35:05 -08:00
Nick Fitzgerald	5917f1d2c2	cranelift: Port `ineg` scalar lowering to ISLE on x64	2022-01-13 15:08:01 -08:00
Nick Fitzgerald	b78731839b	cranelift: Use `x64_` prefix to disambiguate with clif in ISLE Instead of using `m_` like we used to, which was short for "mach inst" but not obvious or clear at all.	2022-01-13 14:59:09 -08:00
Nick Fitzgerald	a41fdb0303	cranelift: Port `rotr` lowering to ISLE on x64	2022-01-13 14:59:09 -08:00
Nick Fitzgerald	4120e40318	cranelift: Update assertions to indicate that `rotl` is fully ported to ISLE on x64	2022-01-13 14:59:09 -08:00
Nick Fitzgerald	4e34dd8239	cranelift: Port `ushr` SIMD lowerings to ISLE on x64	2022-01-13 14:39:06 -08:00
Alex Crichton	46ade3dab3	Try to fix CI for Rust 1.58 (#3689) PATH lookup for Windows command execution was tweaked slightly to not search the cwd, so let's see if this fixes things...	2022-01-13 16:38:32 -06:00
Nick Fitzgerald	a7dba81c1d	cranelift: Port `ishl` SIMD lowerings to ISLE (#3686)	2022-01-13 09:34:37 -06:00
Chris Fallin	13f17db297	Merge pull request #3680 from bjorn3/remove_code_sink Remove the CodeSink interface in favor of MachBufferFinalized	2022-01-12 10:47:23 -08:00
Nick Fitzgerald	eeca41d666	Merge pull request #3683 from bytecodealliance/demangle-names-in-profiling Try demangling names before forwarding them to the profiler	2022-01-12 10:31:56 -08:00
Benjamin Bouvier	e53f213ac4	Try demangling names before forwarding them to the profiler Before this PR, each profiler (perf/vtune, at the moment) had to have a demangler for each of the programming languages that could have been compiled to wasm and fed into wasmtime. With this, wasmtime now demangles names before even forwarding them to the underlying profiler, which makes for a unified representation in profilers, and avoids incorrect demangling in profilers.	2022-01-12 19:17:42 +01:00
bjorn3	17021bc77a	Extract helper functions	2022-01-12 17:19:34 +01:00
Nick Fitzgerald	7454f1f3af	cranelift: port `sshr` to ISLE on x64 (#3681)	2022-01-12 09:13:58 -06:00
bjorn3	f0e821b9e0	Remove all Sink traits	2022-01-11 19:03:10 +01:00
bjorn3	b803514d55	Remove sink arguments from compile_and_emit The data can be accessed after the fact using context.mach_compile_result	2022-01-11 18:17:29 +01:00
bjorn3	55d722db05	Remove CodeSink	2022-01-11 17:10:37 +01:00
bjorn3	a48a60f958	Remove reloc_external from CodeSink And introduce MachBufferFinalized::relocs() in the place.	2022-01-11 16:54:27 +01:00
bjorn3	63e2360346	Remove trap from CodeSink And introduce MachBufferFinalized::traps() in the place.	2022-01-11 16:42:52 +01:00
bjorn3	38aaa6e1da	Remove add_call_site from CodeSink and RelocSink And introduce MachBufferFinalized::call_sites() in the place.	2022-01-11 16:32:57 +01:00
bjorn3	379c9c65a3	Inline MemoryCodeSink::write	2022-01-11 15:10:02 +01:00
bjorn3	37598ad170	Remove end_codegen method from CodeSink	2022-01-11 14:52:04 +01:00
bjorn3	354c4f7bf8	Remove unused CodeSink methods	2022-01-11 14:52:04 +01:00
bjorn3	88baac4ca6	Move the TestCodeSink functionality to MachBufferFinalized	2022-01-11 14:40:53 +01:00
Alex Crichton	1ef0abb12c	Update lots of `isa/*/*.clif` tests to `precise-output` (#3677) * Update lots of `isa/*/*.clif` tests to `precise-output` This commit goes through the `aarch64` and `x64` subdirectories and subjectively changes tests from `test compile` to add `precise-output`. This then auto-updates all the test expectations so they can be automatically instead of manually updated in the future. Not all tests were migrated, largely subject to the whims of myself, mainly looking to see if the test was looking for specific instructions or just checking the whole assembly output. * Filter out `;;` comments from test expectations Looks like the cranelift parser picks up all comments, not just those trailing the function, so use a convention where `;;` is used for human-readable-comments in test cases and `;`-prefixed comments are the test expectation.	2022-01-10 13:38:23 -06:00
Alex Crichton	a8ea0ec097	cranelift: Add ability to auto-update test expectations (#3612) * cranelift: Add ability to auto-update test expectations One of the problems of the current `.clif` testing is that the files are difficult to update when widespread changes are made (such as removing modification of the frame pointer). Additionally when changing register allocation or similar it can cause a large number of changes in tests but the tests themselves didn't actually break. For this reason this commit adds the ability to automatically update test expectations. The idea behind this commit is that tests of the form `test compile` can also optionally be flagged with the `precise-output` flag: `test compile precise-output` and when doing so the compiled form of each function is asserted to 100% match the following comments and their test expectations. If a match is not found then a `BLESS=1` environment variable can be used to automatically rewrite the test file itself with the correct assertion. If the environment variable isn't present and the expectation doesn't match then the test fails. It's hoped that, if approved, a follow-up commit can add `precise-output` to all current `test compile` tests (or make it the default) and all tests can be mass-updated. When developing locally test expectations need not be written and instead tests can be run with `BLESS=1` and the output can be manually verified. The environment variable will not be present on CI which means that changes to the output which don't also change the test expectation will cause CI to fail. Furthermore this should still make updates to the test output easily readable in review on CI because the test expectations are intended to look the same as before. Closes #1539 Use raw vcode output in tests * Fix a merge conflict * Review comments	2022-01-10 11:59:45 -06:00
Chris Fallin	c6a62427a4	Merge pull request #3676 from bytecodealliance/bnjbvr-patch-1 Meeting notes for Cranelift meeting on 2022-01-10	2022-01-10 09:55:51 -08:00
Benjamin Bouvier	092be49175	Meeting notes for Cranelift meeting on 2022-01-10	2022-01-10 18:43:48 +01:00
Nick Fitzgerald	2e2620c07f	Merge pull request #3671 from alexcrichton/table-ops-terminate fuzz: Fix infinite loops in table_ops fuzzers	2022-01-10 09:34:07 -08:00

9461 Commits
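
The epoch-based interruption scheme described in commit 8a55b5c563 can be modeled in a few lines. The following is a toy simulation in Python, not Wasmtime's API: the `Engine` class, `run_guest`, and the `on_iteration` callback are hypothetical names, and the callback stands in both for the guest's real work and for the passage of time (a real embedder would bump the epoch from a timer thread or signal). The guest only compares the shared counter against a deadline at each loop backedge and yields when the deadline has passed.

```python
class Engine:
    """Hypothetical stand-in for an engine: it only holds the epoch counter."""

    def __init__(self):
        self.epoch = 0

    def increment_epoch(self):
        # In the scheme described above this is a single atomic increment.
        self.epoch += 1


def run_guest(engine, iterations, ticks_per_slice, on_iteration):
    """Generator modeling a guest loop with an epoch check at each backedge.

    Yields the iteration index each time the guest gives up its timeslice.
    """
    deadline = engine.epoch + ticks_per_slice
    for i in range(iterations):
        on_iteration(i)                  # the guest's work for this iteration
        if engine.epoch >= deadline:     # cheap load + compare at the backedge
            deadline = engine.epoch + ticks_per_slice  # automatic "refill"
            yield i                      # cooperative yield to the executor


if __name__ == "__main__":
    engine = Engine()

    def work(i):
        # Simulated ticker: one epoch per 10 iterations of guest work.
        if (i + 1) % 10 == 0:
            engine.increment_epoch()

    print(list(run_guest(engine, 100, 2, work)))
```

Unlike fuel, the yield points here depend on when the counter happens to advance rather than on how much work has been done, which is the precision tradeoff the commit message describes.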