wasmtime

Author	SHA1	Message	Date
Alex Crichton	08d44e3746	Change how wasm DWARF is inserted into artifacts (#5358 ) This commit fixes a bug with components by changing how DWARF information from a wasm binary is copied over to the final compiled artifact. Note that this is not the Wasmtime-generated DWARF but rather the native wasm DWARF itself used in backtraces. Previously the wasm dwarf was inserted into sections `.<name>.wasm` where `<name>` was `debug_info`, `debug_str`, etc -- one per `gimli::SectionId` as found in the original wasm module. This does not work with components, however, where modules did not correctly separate their debug information into separate sections or otherwise disambiguate. The fix in this commit is to instead smash all the debug information together into one large section and store offsets into that giant section. This is similar to the `name`-section scraping or the trap metadata section where one section contains all the data for all the modules in a component. This simplifies the object file parsing by only looking for one section name and doesn't add all that much complexity to serializing and looking up dwarf information as well.	2022-12-06 14:29:13 -06:00
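The "one large section plus offsets" idea in the commit above can be sketched roughly as follows. This is only an illustration and not Wasmtime's actual code: `CombinedDwarf`, `SectionKind`, and the module-index key are hypothetical names, with `SectionKind` standing in for `gimli::SectionId`.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical stand-in for `gimli::SectionId`; only two kinds are modeled.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum SectionKind {
    DebugInfo,
    DebugStr,
}

/// Hypothetical container: one blob holding every module's DWARF bytes, plus
/// a table recording where each (module, section kind) pair lives in it.
#[derive(Default)]
struct CombinedDwarf {
    data: Vec<u8>,
    ranges: BTreeMap<(usize, SectionKind), Range<usize>>,
}

impl CombinedDwarf {
    /// Append one section's bytes and remember the byte range it occupies.
    fn push(&mut self, module: usize, kind: SectionKind, bytes: &[u8]) {
        let start = self.data.len();
        self.data.extend_from_slice(bytes);
        self.ranges.insert((module, kind), start..self.data.len());
    }

    /// Return a module's section, or `None` if it was never recorded.
    fn get(&self, module: usize, kind: SectionKind) -> Option<&[u8]> {
        let range = self.ranges.get(&(module, kind))?;
        Some(&self.data[range.clone()])
    }
}
```

Because every module's sections are keyed by a module index, two modules in the same component can both carry a `debug_info` section without their names colliding.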
Alex Crichton	03715dda9d	Tidy up some internals of instance allocation (#5346 ) * Simplify the `ModuleRuntimeInfo` trait slightly Fold two functions into one as they're only called from one location anyway. * Remove ModuleRuntimeInfo::signature This is redundant as the array mapping is already stored within the `VMContext` so that can be consulted rather than having a separate trait function for it. This required altering the `Global` creation slightly to work correctly in this situation. * Remove a now-dead constant * Shared `VMOffsets` across instances This commit removes the computation of `VMOffsets` to being per-module instead of per-instance. The `VMOffsets` structure is also quite large so this shaves off 112 bytes per instance which isn't a huge impact but should help lower the cost of instantiating small modules. * Remove `InstanceAllocator::adjust_tunables` This is no longer needed or necessary with the pooling allocator. * Fix compile warning * Fix a vtune warning * Fix pooling tests * Fix another test warning	2022-12-01 22:22:08 +00:00
Alex Crichton	e0b9663e44	Remove some custom error types in Wasmtime (#5347 ) * Remove some custom error types in Wasmtime These types are mostly cumbersome to work with nowadays that `anyhow` is used everywhere else. This commit removes `InstantiationError` and `SetupError` in favor of using `anyhow::Error` throughout. This can eventually culminate in creation of specific errors for embedders to downcast to but for now this should be general enough. * Fix Windows build	2022-12-01 14:47:10 -06:00
Chris Fallin	d59caf39b6	Wasmtime+Cranelift: strip out some dead x86-32 code. (#5226 ) * Wasmtime+Cranelift: strip out some dead x86-32 code. I was recently pointed to fastly/Viceroy#200 where it seems some folks are trying to compile Wasmtime (via Viceroy) for Windows x86-32 and the failures may not be loud enough. I've tried to reproduce this cross-compiling to i686-pc-windows-gnu from Linux and hit build failures (as expected) in several places. Nevertheless, while trying to discern what others may be attempting, I noticed some dead x86-32-specific code in our repo, and figured it would be a good idea to clean this up. Otherwise, it (i) sends some mixed messages -- "hey look, this codebase does support x86-32" -- and (ii) keeps untested code around, which is generally not great. This PR removes x86-32-specific cases in traphandlers and unwind code, and Cranelift's native feature detection. It adds helpful compile-error messages in a few cases. If we ever support x86-32 (contributors welcome! The big missing piece is Cranelift support; see #1980), these compile errors and git history should be enough to recover any knowledge we are now encoding in the source. I left the x86-32 support in `wasmtime-fiber` alone because that seems like a bit of a special case -- foundation library, separate from the rest of Wasmtime, with specific care to provide a (presumably working) full 32-bit version. * Remove some extraneous compile_error!s, already covered by others.	2022-11-08 23:03:17 +00:00
Alex Crichton	cd53bed898	Implement AOT compilation for components (#5160 ) * Pull `Module` out of `ModuleTextBuilder` This commit is the first in what will likely be a number towards preparing for serializing a compiled component to bytes, a precompiled artifact. To that end my rough plan is to merge all of the compiled artifacts for a component into one large object file instead of having lots of separate object files and lots of separate mmaps to manage. To that end I plan on eventually using `ModuleTextBuilder` to build one large text section for all core wasm modules and trampolines, meaning that `ModuleTextBuilder` is no longer specific to one module. I've extracted out functionality such as function name calculation as well as relocation resolving (now a closure passed in) in preparation for this. For now this just keeps tests passing, and the trajectory for this should become more clear over the following commits. * Remove component-specific object emission This commit removes the `ComponentCompiler::emit_obj` function in favor of `Compiler::emit_obj`, now renamed `append_code`. This involved significantly refactoring code emission to take a flat list of functions into `append_code` and the caller is responsible for weaving together various "families" of functions and un-weaving them afterwards. * Consolidate ELF parsing in `CodeMemory` This commit moves the ELF file parsing and section iteration from `CompiledModule` into `CodeMemory` so one location keeps track of section ranges and such. This is in preparation for sharing much of this code with components which needs all the same sections to get tracked but won't be using `CompiledModule`. A small side benefit from this is that the section parsing done in `CodeMemory` and `CompiledModule` is no longer duplicated. 
* Remove separately tracked traps in components Previously components would generate an "always trapping" function and the metadata around which pc was allowed to trap was handled manually for components. With recent refactorings the Wasmtime-standard trap section in object files is now being generated for components as well which means that can be reused instead of custom-tracking this metadata. This commit removes the manual tracking for the `always_trap` functions and plumbs the necessary bits around to make components look more like modules. * Remove a now-unnecessary `Arc` in `Module` Not expected to have any measurable impact on performance, but complexity-wise this should make it a bit easier to understand the internals since there's no longer any need to store this somewhere else than its owner's location. * Merge compilation artifacts of components This commit is a large refactoring of the component compilation process to produce a single artifact instead of multiple binary artifacts. The core wasm compilation process is refactored as well to share as much code as necessary with the component compilation process. This method of representing a compiled component necessitated a few medium-sized changes internally within Wasmtime: * A new data structure was created, `CodeObject`, which represents metadata about a single compiled artifact. This is then stored as an `Arc` within a component and a module. For `Module` this is always uniquely owned and represents a shuffling around of data from one owner to another. For a `Component`, however, this is shared amongst all loaded modules and the top-level component. * The "module registry" which is used for symbolicating backtraces and for trap information has been updated to account for a single region of loaded code holding possibly multiple modules. This involved adding a second-level `BTreeMap` for now. 
This will likely slow down instantiation slightly but if it poses an issue in the future this should be able to be represented with a more clever data structure. This commit additionally solves a number of longstanding issues with components such as compiling only one host-to-wasm trampoline per signature instead of possibly once-per-module. Additionally the `SignatureCollection` registration now happens once-per-component instead of once-per-module-within-a-component. * Fix compile errors from prior commits * Support AOT-compiling components This commit adds support for AOT-compiled components in the same manner as `Module`, specifically adding: * `Engine::precompile_component` * `Component::serialize` * `Component::deserialize` * `Component::deserialize_file` Internally the support for components looks quite similar to `Module`. All the prior commits to this made adding the support here (unsurprisingly) easy. Components are represented as a single object file as are modules, and the functions for each module are all piled into the same object file next to each other (as are areas such as data sections). Support was also added here to quickly differentiate compiled components vs compiled modules via the `e_flags` field in the ELF header. * Prevent serializing exported modules on components The current representation of a module within a component means that the implementation of `Module::serialize` will not work if the module is exported from a component. The reason for this is that `serialize` doesn't actually do anything and simply returns the underlying mmap as a list of bytes. The mmap, however, has `.wasmtime.info` describing component metadata as opposed to this module's metadata. While rewriting this section could be implemented it's not so easy to do so and is otherwise seen as not super important of a feature right now anyway. 
* Fix windows build * Fix an unused function warning * Update crates/environ/src/compilation.rs Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	2022-11-02 15:26:26 +00:00
Alex Crichton	434fbf2b27	Refactor metadata storage in AOT artifacts (#5153 ) * Refactor metadata storage in AOT artifacts This commit is a reorganization of how metadata is stored in Wasmtime's compiled artifacts. Currently Wasmtime's ELF artifacts have data appended after them to contain metadata about the `Engine` as well as type information for the module itself. This extra data at the end of the file is ignored by ELF-related utilities generally and is assembled during the module serialization process. In working on AOT-compiling components, though, I've discovered a number of issues with this: * Primarily it's possible to mistakenly change an artifact if it's deserialized and then serialized again. This issue is probably theoretical but the deserialized artifact records the `Engine` configuration at time of creation but when re-serializing that it serializes the current `Engine` state, not the original `Engine` state. * Additionally the serialization strategy here is tightly coupled to `Module` and its serialization format. While this makes sense it is not conducive for future refactorings to use a similar serialization format for components. The engine metadata, for example, does not necessarily need to be tied up with type information. * The storage for this extra metadata is a bit wonky by shoving it at the end of the ELF file. The original reason for this was to have a compiled artifact be multiple objects concatenated with each other to support serializing module-linking-using modules. Module linking is no longer a thing and I have since decided that for the component model all compilation artifacts will go into one object file to assist debugability. This means that the extra stick-it-at-the-end is no longer necessary. To solve these issues this commit splits up the `module/serialization.rs` file in two, mostly moving the logic to `engine/serialization.rs`. 
The engine serialization logic now handles everything related to `Engine` compatibility such as targets, compiler flags, wasm features, etc. The module serialization logic is now exclusively interested in type information. The engine metadata and serialized type information additionally live in sections of the final file now instead of at the end. This means that there are three primary `bincode`-encoded sections that are parsed on deserializing a file: 1. The `Engine`-specific metadata. This will be the same for both modules and components. 2. The `CompiledModuleInfo` structure. For core wasm there's just one of these but for the component model there will be multiple, one per core wasm module. 3. The type information. For core wasm this is a `ModuleTypes` but for a component this will be a `ComponentTypes`. No true functional change is expected from this commit. Binary artifacts might get inflated by a small handful of bytes due to using ELF sections to represent this now. A related change I made during this commit as well was the plumbing of the `is_branch_protection_enabled` flag. This is technically `Engine`-level metadata but I didn't want to plumb it all over the place as was done now, so instead a new section was added to the final binary just for this bti information. This means that it no longer needs to be a parameter to `CodeMemory::publish` and additionally is more amenable to a `Component`-is-just-one-object world where no single module owns this piece of metadata. * Exclude some functions in a cranelift-less build	2022-10-29 17:13:32 +00:00
Alex Crichton	81f7ef7fbe	Reduce calls to `section_by_name` loading artifacts (#5151 ) * Reduce calls to `section_by_name` loading artifacts Data is stored in binary artifacts as an ELF object and when loading an artifact lots of calls are made to the `object` crate's `section_by_name` method which ends up doing a linear search through the list of sections for a particular name. To avoid doing this linear search every time I've replaced this with one loop over the sections of an object at the beginning when an object is loaded, or at least most of the calls with this loop. This isn't really a pressing issue today but some upcoming work I hope to do for AOT-compiled components will be adding more sections to the artifact so it seems best to keep the number of linear searches small and avoided if possible. * Fix an off-by-one	2022-10-28 22:55:34 +00:00
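A minimal sketch of the single-pass approach described in the commit above, assuming a simplified section type. `Section`, `KnownSections`, and `classify` are hypothetical names; the real code iterates the `object` crate's parsed sections, which this sketch does not use.

```rust
/// Hypothetical stand-in for one parsed object-file section.
struct Section<'a> {
    name: &'a str,
    data: &'a [u8],
}

/// The sections the loader cares about, filled in by a single pass.
#[derive(Default)]
struct KnownSections<'a> {
    text: Option<&'a [u8]>,
    info: Option<&'a [u8]>,
}

/// Visit every section once and record the interesting ones, instead of
/// running a separate by-name linear search for each section.
fn classify<'a>(sections: &[Section<'a>]) -> KnownSections<'a> {
    let mut known = KnownSections::default();
    for section in sections {
        match section.name {
            ".text" => known.text = Some(section.data),
            ".wasmtime.info" => known.info = Some(section.data),
            _ => {} // Sections the loader does not need are skipped.
        }
    }
    known
}
```

With `k` sections of interest and `n` sections in the file, this costs one pass of `n` name comparisons rather than `k` separate searches of up to `n` each.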
Afonso Bordado	4639e85c4e	Flush Icache on AArch64 Windows (#4997 ) * cranelift: Add FlushInstructionCache for AArch64 on Windows This was previously done on #3426 for linux. * wasmtime: Add FlushInstructionCache for AArch64 on Windows This was previously done on #3426 for linux. * cranelift: Add MemoryUse flag to JIT Memory Manager This allows us to keep the icache flushing code self-contained and not leak implementation details. This also changes the windows icache flushing code to only flush pages that were previously unflushed. * Add jit-icache-coherence crate * cranelift: Use `jit-icache-coherence` * wasmtime: Use `jit-icache-coherence` * jit-icache-coherence: Make rustix feature additive Mutually exclusive features cause issues. * wasmtime: Remove rustix from wasmtime-jit We now use it via jit-icache-coherence * Rename wasmtime-jit-icache-coherency crate * Use cfg-if in wasmtime-jit-icache-coherency crate * Use inline instead of inline(always) * Add unsafe marker to clear_cache * Conditionally compile all rustix operations membarrier does not exist on MacOS * Publish `wasmtime-jit-icache-coherence` * Remove explicit windows check This is implied by the target_os = "windows" above * cranelift: Remove len != 0 check This is redundant as it is done in non_protected_allocations_iter * Comment cleanups Thanks @akirilov-arm! * Make clear_cache safe * Rename pipeline_flush to pipeline_flush_mt * Revert "Make clear_cache safe" This reverts commit 21165d81c9030ed9b291a1021a367214d2942c90. * More docs! * Fix pipeline_flush reference on clear_cache * Update more docs! * Move pipeline flush after `mprotect` calls Technically the `clear_cache` operation is a lie in AArch64, so move the pipeline flush after the `mprotect` calls so that it benefits from the implicit cache cleaning done by it. 
* wasmtime: Remove rustix backend from icache crate * wasmtime: Use libc for macos * wasmtime: Flush icache on all arch's for windows * wasmtime: Add flags to membarrier call	2022-10-12 11:15:38 -07:00
Alex Crichton	7bab5c1b28	Consolidate module definition in `wasmtime-jit` (#5000 ) Minor thing I noticed from #4990 but I stylistically prefer to keep the `mod foo;` definitions canonicalized to one location to emphasize how multiple targets can use the same definition.	2022-10-03 11:04:07 -05:00
Yuyi Wang	6bcc430855	Initial work to build for Windows ARM64 (#4990 ) * Make wasmtime build for windows-aarch64 * Add check for win arm64 build. * Fix checks for winarm64 key in workflows. * Add target in windows arm64 build. * Add tracking issue for Windows ARM64 trap handling	2022-10-02 19:45:42 -07:00
yuyang-ok	cdecc858b4	add riscv64 backend for cranelift. (#4271 ) Add a RISC-V 64 (`riscv64`, RV64GC) backend. Co-authored-by: yuyang <756445638@qq.com> Co-authored-by: Chris Fallin <chris@cfallin.org> Co-authored-by: Afonso Bordado <afonsobordado@az8.co>	2022-09-27 17:30:31 -07:00
Anton Kirilov	d8b290898c	Initial forward-edge CFI implementation (#3693 ) * Initial forward-edge CFI implementation Give the user the option to start all basic blocks that are targets of indirect branches with the BTI instruction introduced by the Branch Target Identification extension to the Arm instruction set architecture. Copyright (c) 2022, Arm Limited. * Refactor `from_artifacts` to avoid second `make_executable` (#1) This involves "parsing" twice but this is parsing just the header of an ELF file so it's not a very intensive operation and should be ok to do twice. * Address the code review feedback Copyright (c) 2022, Arm Limited. Co-authored-by: Alex Crichton <alex@alexcrichton.com>	2022-09-08 09:35:58 -05:00
Benjamin Bouvier	f0337c9c76	Upgrade to the high-level `ittapi` v0.3.0 crate (#4003 ) * Upgrade to the high-level ittapi v0.3.0 crate * Add exclusion for windows mingw	2022-07-18 10:13:09 -05:00
Alex Crichton	601e8f3094	Remove dependency on the `region` crate (#4407 ) This commit removes Wasmtime's dependency on the `region` crate. The motivation for this came about when I was updating dependencies and saw that `region` had a new major version at 3.0.0 as opposed to our currently used 2.3 track. In reviewing the use cases of `region` within Wasmtime I found two trends in particular which motivated this commit: * Some unix-specific areas of `wasmtime_runtime` use `rustix::mm::mprotect` instead of `region::protect` already. This means that the usage of `region::protect` for changing virtual memory protections was already inconsistent. * Many uses of `region::protect` were already in unix-specific regions which could make use of `rustix`. Overall I opted to remove the dependency on the `region` crate to avoid chasing its versions over time. Unix-specific changes of protections were easily changed to `rustix::mm::mprotect`. There were two locations where a windows/unix split is now required and I subjectively ruled "that seems ok". Finally removing `region` also meant that the "what is the current page size" query needed to be inlined into `wasmtime_runtime`, which I have also subjectively ruled "that seems fine". Finally one final refactoring here was that the `unix.rs` and `linux.rs` split for the pooling allocator was merged. These two files already only differed in one function so I slapped a `cfg_if!` in there to help reduce the duplication.	2022-07-07 21:28:25 +00:00
Alex Crichton	df1502531d	Migrate from `winapi` to `windows-sys` (#4346 ) * Migrate from `winapi` to `windows-sys` I believe that Microsoft itself is supporting the development of `windows-sys` and it's also used by `cap-std` now so this switches Wasmtime's dependencies on Windows APIs from the `winapi` crate to the `windows-sys` crate. We still have `winapi` in our dependency graph but that may get phased out over time. * Make windows-sys a target-specific dependency	2022-06-28 18:02:41 +00:00
Alex Crichton	66b829b1bf	Change how unwind information is stored on Windows (#4314 ) * Change how unwind information is stored on Windows Unwind information on Windows is stored in two separate locations. The first location is the unwind information itself which corresponds to `UNWIND_INFO`. The second location is a list of `RUNTIME_INFO` structures which point to function bodies and `UNWIND_INFO` structures. Currently in Wasmtime the `UNWIND_INFO` structures are stored just after functions themselves with a somewhat cryptic comment indicating that Windows prefers this (I'm unsure as to the provenance of this comment). The `RUNTIME_INFO` data is then stored in a separate section which has the custom name of `_wasmtime_winx64_unwind`. After my recent foray into trying to debug windows-2022 bad unwind information again I realized though that Windows actually has official sections for these two unwind information items. The `.xdata` section is used to store the `UNWIND_INFO` structures and the `.pdata` section stores the `RUNTIME_INFO` list. To try to be somewhat idiomatic and perhaps one day even hook into standard Windows debugging tools I went ahead and refactored how our unwind information is stored to match this. Perhaps the main benefit of this is that it reduces the size of the read/execute section of the binary. Previously the unwind information was executable since it was stored in the `.text` section, but unnecessarily so. Now it's in a read-only section which is in theory a small amount of hardening. Otherwise though I don't think this will really help all that much to hook into standard debugging tools like `objdump` because it's all still stored in an ELF file rather than a COFF file. * Review comments	2022-06-28 15:40:04 +00:00
Alex Crichton	7fdc616368	Remove the `Paged` memory initialization variant (#4046 ) * Remove the `Paged` memory initialization variant This commit simplifies the `MemoryInitialization` enum by removing the `Paged` variant. The `Paged` variant was originally added for uffd, but that support has now been removed in #4040. This is no longer necessary but is still used as an intermediate step of becoming a `Static` variant of initialized memory (which copy-on-write uses). As a result this commit largely modifies the static initialization of memory steps and folds the two methods together. * Apply suggestions from code review Co-authored-by: Peter Huene <peter@huene.dev> Co-authored-by: Peter Huene <peter@huene.dev>	2022-05-05 09:44:48 -05:00
Alex Crichton	90791a0e32	Reduce contention on the global module rwlock (#4041 ) * Reduce contention on the global module rwlock This commit intends to close #4025 by reducing contention on the global rwlock Wasmtime has for module information during instantiation and dropping a store. Currently registration of a module into this global map happens during instantiation, but this can be a hot path as embeddings may want to, in parallel, instantiate modules. Instead this switches to a strategy of inserting into the global module map when a `Module` is created and then removing it from the map when the `Module` is dropped. Registration in a `Store` now preserves the entire `Module` within the store as opposed to trying to only save it piecemeal. In reality the only piece that wasn't saved within a store was the `TypeTables` which was pretty inconsequential for core wasm modules anyway. This means that instantiation should now clone a singular `Arc` into a `Store` per `Module` (previously it cloned two) with zero management on the global rwlock as that happened at `Module` creation time. Additionally dropping a `Store` again involves zero rwlock management and only a single `Arc` drop per-instantiated module (previously it was two). In the process of doing this I also went ahead and removed the `Module::new_with_name` API. This has been difficult to support historically with various variations on the internals of `ModuleInner` because it involves mutating a `Module` after it's been created. My hope is that this API is pretty rarely used and/or isn't super important, so it's ok to remove. Finally this change removes some internal `Arc` layerings that are no longer necessary, attempting to use either `T` or `&T` where possible without dealing with the overhead of an `Arc`. Closes #4025 * Move back to a `BTreeMap` in `ModuleRegistry`	2022-04-19 15:13:47 -05:00
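The register-on-create, unregister-on-drop strategy from the commit above might look roughly like the sketch below. All names here (`Registry`, `Module::new`, `lookup`) are hypothetical, and the registry is passed explicitly rather than being a true global; the point is only that the write lock is taken when a module is created or dropped, never on the instantiation path.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical registry: code start address -> (code end address, name).
type Registry = Arc<RwLock<BTreeMap<usize, (usize, String)>>>;

/// Hypothetical module handle that owns its registry entry.
struct Module {
    registry: Registry,
    start: usize,
}

impl Module {
    /// Registration takes the write lock once, at creation time.
    fn new(registry: &Registry, start: usize, end: usize, name: &str) -> Module {
        registry
            .write()
            .unwrap()
            .insert(start, (end, name.to_string()));
        Module { registry: registry.clone(), start }
    }
}

impl Drop for Module {
    /// The entry is removed when the module itself goes away.
    fn drop(&mut self) {
        self.registry.write().unwrap().remove(&self.start);
    }
}

/// Find the module whose code range contains `pc`, using only a read lock.
fn lookup(registry: &Registry, pc: usize) -> Option<String> {
    let map = registry.read().unwrap();
    // The candidate is the entry with the greatest start address <= pc.
    let (_, (end, name)) = map.range(..=pc).next_back()?;
    if pc < *end {
        Some(name.clone())
    } else {
        None
    }
}
```

Instantiating such a module any number of times would then need no registry writes at all, which is the contention reduction the commit describes.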
Alex Crichton	453feb6f82	Remove some dead code (#3970 ) This commit removes methods that are never used between crates or trait impls like `Clone` which may have been used one day but are no longer used.	2022-03-30 13:51:34 -05:00
Alex Crichton	d1d10dc8da	Refactor the `TypeTables` type (#3971 ) * Remove duplicate `TypeTables` type This was once needed historically but it is no longer needed. * Make the internals of `TypeTables` private Instead of reaching internally for the `wasm_signatures` map an `Index` implementation now exists to indirect accesses through the type of the index being accessed. For the component model this table of types will grow a number of other tables and this'll assist in consuming sites not having to worry so much about which map they're reaching into.	2022-03-30 13:51:25 -05:00
Alex Crichton	76b82910c9	Remove the module linking implementation in Wasmtime (#3958 ) * Remove the module linking implementation in Wasmtime This commit removes the experimental implementation of the module linking WebAssembly proposal from Wasmtime. The module linking is no longer intended for core WebAssembly but is instead incorporated into the component model now at this point. This means that very large parts of Wasmtime's implementation of module linking are no longer applicable and would change greatly with an implementation of the component model. The main purpose of this is to remove Wasmtime's reliance on the support for module-linking in `wasmparser` and tooling crates. With this reliance removed we can move over to the `component-model` branch of `wasmparser` and use the updated support for the component model. Additionally given the trajectory of the component model proposal the embedding API of Wasmtime will not look like what it looks like today for WebAssembly. For example the core wasm `Instance` will not change and instead a `Component` is likely to be added instead. Some more rationale for this is in #3941, but the basic idea is that I feel that it's not going to be viable to develop support for the component model on a non-`main` branch of Wasmtime. Additionally I don't think it's viable, for the same reasons as `wasm-tools`, to support the old module linking proposal and the new component model at the same time. This commit takes a moment to not only delete the existing module linking implementation but some abstractions are also simplified. For example module serialization is a bit simpler now that there's only one module. Additionally instantiation is much simpler since the only initializers we have to deal with are imports and nothing else. Closes #3941 * Fix doc link * Update comments	2022-03-23 14:57:34 -05:00
Alex Crichton	41594dc5d9	Expose details for mlocking modules externally (#3944 ) This commit exposes some various details and config options for having finer-grain control over mlock-ing the memory of modules. This amounts to three different changes being present in this commit: * A new `Module::image_range` API is added to expose the range in host memory of where the compiled image resides. This enables embedders to make mlock-ing decisions independently of Wasmtime. Otherwise though there's not too much useful that can be done with this range information at this time. * A new `Config::force_memory_init_memfd` option has been added. This option is used to force the usage of `memfd_create` on Linux even when the original module comes from a file on disk. With mlock-ing the main purpose for Wasmtime is likely to be avoiding major page faults that go back to disk, so this is another major source of avoiding page faults by ensuring that the initialization contents of memory are always in RAM. * The `memory_images` field of a `Module` has gone back to being lazily created on the first instantiation, effectively reverting #3914. This enables embedders to defer the creation of the image to as late as possible to allow modules to be created from precompiled images without actually loading all the contents of the data segments from disk immediately. These changes are all somewhat low-level controls which aren't intended to be generally used by embedders. If fine-grained control is desired though it's hoped that these knobs provide what's necessary to be achieved.	2022-03-18 13:51:55 -05:00
Alex Crichton	f21aa98ccb	Fuzz-code-coverage motivated improvements (#3905 ) * fuzz: Fuzz padding between compiled functions This commit hooks up the custom `wasmtime_linkopt_padding_between_functions` configuration option to the cranelift compiler into the fuzz configuration, enabling us to ensure that randomly inserting a moderate amount of padding between functions shouldn't tamper with any results. * fuzz: Fuzz the `Config::generate_address_map` option This commit adds fuzz configuration where `generate_address_map` is either enabled or disabled, unlike how it's always enabled for fuzzing today. * Remove unnecessary handling of relocations This commit removes a number of bits and pieces all related to handling relocations in JIT code generated by Wasmtime. None of this is necessary nowadays that the "old backend" has been removed (quite some time ago) and relocations are no longer expected to be in the JIT code at all. Additionally with the minimum x86_64 features required to run wasm code it should be expected that no libcalls are required either for Wasmtime-based JIT code.	2022-03-09 10:58:27 -08:00
bjorn3	4ed353a7e1	Extract jit_int.rs and most of jitdump_linux.rs for use outside of wasmtime (#2744 ) * Extract gdb jit_int into wasmtime-jit-debug * Move a big chunk of the jitdump code to wasmtime-jit-debug * Fix doc markdown in perf_jitdump.rs	2022-02-22 09:23:44 -08:00
Andrew Brown	c183e93b80	x64: enable VTune support by default (#3821 ) * x64: enable VTune support by default After significant work in the `ittapi-rs` crate, this dependency should build without issue on Wasmtime's supported operating systems: Windows, Linux, and macOS. The difference in the release binary is <20KB, so this change makes `vtune` a default build feature. This change upgrades `ittapi-rs` to v0.2.0 and updates the documentation. * review: add configuration for defaults in more places * review: remove OS conditional compilation, add architecture * review: do not default vtune feature in wasmtime-jit	2022-02-22 08:32:09 -08:00
Alex Crichton	c0c368d151	Use mmap'd `.cwasm` as a source for memory initialization images (#3787 ) Skip memfd creation with precompiled modules This commit updates the memfd support internally to not actually use a memfd if a compiled module originally came from disk via the `wasmtime::Module::deserialize_file` API. In this situation we already have a file descriptor open and there's no need to copy a module's heap image to a new file descriptor. To facilitate a new source of `mmap` the currently-memfd-specific-logic of creating a heap image is generalized to a new form of `MemoryInitialization` which is attempted for all modules at module-compile-time. This means that the serialized artifact to disk will have the memory image in its entirety waiting for us. Furthermore the memory image is ensured to be padded and aligned carefully to the target system's page size, notably meaning that the data section in the final object file is page-aligned and the size of the data section is also page aligned. This means that when a precompiled module is mapped from disk we can reuse the underlying `File` to mmap all initial memory images. This means that the offset-within-the-memory-mapped-file can differ for memfd-vs-not, but that's just another piece of state to track in the memfd implementation. In the limit this waters down the term "memfd" for this technique of quickly initializing memory because we no longer use memfd unconditionally (only when the backing file isn't available). This does however open up an avenue in the future to porting this support to other OSes because while `memfd_create` is Linux-specific both macOS and Windows support mapping a file with copy-on-write. This porting isn't done in this PR and is left for a future refactoring. Closes #3758 * Enable "memfd" support on all unix systems Cordon off the Linux-specific bits and enable the memfd support to compile and run on platforms like macOS which have a Linux-like `mmap`. 
This only works if a module is mapped from a precompiled module file on disk, but that's better than not supporting it at all! * Fix linux compile * Use `Arc<File>` instead of `MmapVecFileBacking` * Use a named struct instead of mysterious tuples * Comment about unsafety in `Module::deserialize_file` * Fix tests * Fix uffd compile * Always align data segments No need to have conditional alignment since their sizes are all aligned anyway * Update comment in build.rs * Use rustix, not `region` * Fix some confusing logic/names around memory indexes These functions all work with memory indexes, not specifically defined memory indexes.	2022-02-10 15:40:40 -06:00
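The page-alignment requirement described in the commit above (both the start and the size of the memory image are padded to the page size) can be sketched as follows. The helper name `align_up` and the 4096-byte page size are assumptions for illustration only; this is not Wasmtime's actual code, and a real implementation would query the page size of the target system.

```rust
/// Round `len` up to the next multiple of `page_size`.
/// Hypothetical helper; `page_size` must be a power of two.
fn align_up(len: usize, page_size: usize) -> usize {
    assert!(page_size.is_power_of_two());
    (len + page_size - 1) & !(page_size - 1)
}

fn main() {
    // Assumed page size, chosen only for this example.
    let page_size = 4096;

    // A pretend heap image of 5000 bytes.
    let mut image = vec![1u8; 5000];

    // Pad the image with zeros so its length is a whole number of pages,
    // which is what allows it to be mapped directly from a file.
    let padded_len = align_up(image.len(), page_size);
    image.resize(padded_len, 0);

    println!("padded image length: {}", image.len());
}
```

An image whose length is already a multiple of the page size is left unchanged by this rounding.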
Alex Crichton	520a7f26d7	Move function names out of `Module` (#3789 ) * Move function names out of `Module` This commit moves function names in a module out of the `wasmtime_environ::Module` type and into separate sections stored in the final compiled artifact. Spurred on by #3787 to look at module load times I noticed that a huge amount of time was spent in deserializing this map. The `spidermonkey.wasm` file, for example, has a 3MB name section which is a lot of unnecessary data to deserialize at module load time. The names of functions are now split out into their own dedicated section of the compiled artifact and metadata about them is stored in a more compact format at runtime by avoiding a `BTreeMap` and instead using a sorted array. Overall this improves deserialize times by up to 80% for modules with large name sections since the name section is no longer deserialized at load time and it's lazily paged in as names are actually referenced. * Fix a typo * Fix compiled module determinism Need to not only sort afterwards but also first to ensure the data of the name section is consistent.	2022-02-10 14:34:48 -06:00
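The idea in the commit above of replacing a `BTreeMap` with a sorted array over a separate blob of name bytes can be sketched roughly as below. The `NameEntry` layout and the `lookup_name` function are hypothetical and are not taken from Wasmtime; they only illustrate a binary search over compact metadata, where name bytes are touched only when a name is actually requested.

```rust
/// One entry of a hypothetical name table: a function index plus the
/// byte range of its name inside a separate blob of name data.
struct NameEntry {
    func_index: u32,
    offset: u32,
    len: u32,
}

/// Look up a function name by index. `table` must be sorted by `func_index`.
fn lookup_name<'a>(table: &[NameEntry], names: &'a [u8], func_index: u32) -> Option<&'a str> {
    // Binary search over the sorted array instead of walking a tree map.
    let i = table
        .binary_search_by_key(&func_index, |e| e.func_index)
        .ok()?;
    let entry = &table[i];
    let start = entry.offset as usize;
    let end = start + entry.len as usize;
    // Only now are the name bytes themselves read.
    let bytes = names.get(start..end)?;
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // All names concatenated into one blob, as a stand-in for a dedicated section.
    let names = b"mainhelper";
    let table = [
        NameEntry { func_index: 0, offset: 0, len: 4 },
        NameEntry { func_index: 3, offset: 4, len: 6 },
    ];
    println!("{:?}", lookup_name(&table, names, 3));
    println!("{:?}", lookup_name(&table, names, 1));
}
```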
Chris Fallin	99ed8cc9be	Merge pull request #3697 from cfallin/memfd-cow memfd/madvise-based CoW pooling allocator	2022-02-02 13:04:26 -08:00
Chris Fallin	b73ac83c37	Add a pooling allocator mode based on copy-on-write mappings of memfds. As first suggested by Jan on the Zulip here [1], a cheap and effective way to obtain copy-on-write semantics of a "backing image" for a Wasm memory is to mmap a file with `MAP_PRIVATE`. The `memfd` mechanism provided by the Linux kernel allows us to create anonymous, in-memory-only files that we can use for this mapping, so we can construct the image contents on-the-fly then effectively create a CoW overlay. Furthermore, and importantly, `madvise(MADV_DONTNEED, ...)` will discard the CoW overlay, returning the mapping to its original state. By itself this is almost enough for a very fast instantiation-termination loop of the same image over and over, without changing the address space mapping at all (which is expensive). The only missing bit is how to implement heap growth. But here memfds can help us again: if we create another anonymous file and map it where the extended parts of the heap would go, we can take advantage of the fact that a `mmap()` mapping can be larger than the file itself, with accesses beyond the end generating a `SIGBUS`, and the fact that we can cheaply resize the file with `ftruncate`, even after a mapping exists. So we can map the "heap extension" file once with the maximum memory-slot size and grow the memfd itself as `memory.grow` operations occur. The above CoW technique and heap-growth technique together allow us a fastpath of `madvise()` and `ftruncate()` only when we re-instantiate the same module over and over, as long as we can reuse the same slot. This fastpath avoids all whole-process address-space locks in the Linux kernel, which should mean it is highly scalable. It also avoids the cost of copying data on read, as the `uffd` heap backend does when servicing pagefaults; the kernel's own optimized CoW logic (same as used by all file mmaps) is used instead. 
[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772	2022-01-31 12:53:18 -08:00
Dan Gohman	881c19473d	Use `ptr::cast` instead of `as` casts in several places. (#3507 ) `ptr::cast` has the advantage of being unable to silently cast `const T` to `mut T`. This turned up several places that were performing such casts, which this PR also fixes.	2022-01-21 13:03:17 -08:00
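The difference between `as` and `ptr::cast` mentioned in the commit above can be shown with a small standalone example. The function `first_byte` is made up for illustration and does not come from the Wasmtime sources.

```rust
/// Read the first byte (in native byte order) of a `u32` through a raw pointer.
/// Illustrative helper only.
fn first_byte(value: &u32) -> u8 {
    let p: *const u32 = value;

    // `cast` changes only the pointee type: `*const u32` becomes `*const u8`.
    // Constness is preserved, so the result cannot be used for writes.
    let bytes: *const u8 = p.cast::<u8>();

    // The following line would be rejected by the compiler, because `cast`
    // on a `*const` pointer never yields a `*mut` pointer:
    // let bad: *mut u8 = p.cast::<u8>();

    // SAFETY: `bytes` points into a live, initialized `u32`.
    unsafe { *bytes }
}

fn main() {
    let value: u32 = 0x1234_5678;
    let p: *const u32 = &value;

    // An `as` cast accepts a change of type and of mutability in one step,
    // with nothing at the call site showing that constness was dropped.
    let _silently_mut = p as *mut u8;

    println!("first byte: {:#x}", first_byte(&value));
}
```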
Benjamin Bouvier	2649d2352c	Support vtune profiling of trampolines too (#3687 ) * Provide helpers for demangling function names * Profile trampolines in vtune too * get rid of mapping * avoid code duplication with jitdump_linux * maintain previous default display name for wasm functions * no dash, grrr * Remove unused profiling error type	2022-01-19 09:49:23 -06:00
Benjamin Bouvier	e53f213ac4	Try demangling names before forwarding them to the profiler Before this PR, each profiler (perf/vtune, at the moment) had to have a demangler for each of the programming languages that could have been compiled to wasm and fed into wasmtime. With this, wasmtime now demangles names before even forwarding them to the underlying profiler, which makes for a unified representation in profilers, and avoids incorrect demangling in profilers.	2022-01-12 19:17:42 +01:00
Andrew Brown	99b00cd973	docs: update VTune documentation (#3604 ) While using VTune, it seemed a good idea to check that the VTune documentation for Wasmtime was still correct. It is and VTune support still works (improvements: click-through to x86 assembly is not available). These changes simply re-organize the documentation and add a section for running VTune from a GUI.	2021-12-17 15:47:09 -08:00
Alex Crichton	f1225dfd93	Add a compilation section to disable address maps (#3598 ) * Add a compilation section to disable address maps This commit adds a new `Config::generate_address_map` compilation setting which is used to disable emission of the `.wasmtime.addrmap` section of compiled artifacts. This section is currently around the size of the entire `.text` section itself unfortunately and for size reasons may wish to be omitted. Functionality-wise all that is lost is knowing the precise wasm module offset address of a faulting instruction or in a backtrace of instructions. This also means that if the module has DWARF debugging information available with it Wasmtime isn't able to produce a filename and line number in the backtrace. This option remains enabled by default. This option may not be needed in the future with #3547 perhaps, but in the meantime it seems reasonable enough to support a configuration mode where the section is entirely omitted if the smallest module possible is desired. * Fix some CI issues * Update tests/all/traps.rs Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> * Do less work in compilation for address maps But only when disabled Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	2021-12-13 13:48:05 -06:00
Alex Crichton	0e90d4b903	Update addr2line and gimli deps (#3580 ) Just a routine update, figured it was good to stay close to their most recent versions	2021-12-01 15:48:36 -06:00
Dan Gohman	ea0cb971fb	Update to rustix 0.26.2. (#3521 ) This pulls in a fix for Android, where Android's seccomp policy on older versions is to make `openat2` irrecoverably crash the process, so we have to do a version check up front rather than relying on `ENOSYS` to determine if `openat2` is supported. And it pulls in the fix for the link errors when multiple versions of rsix/rustix are linked in. And it has updates for two crate renamings: rsix has been renamed to rustix, and unsafe-io has been renamed to io-extras.	2021-11-15 10:21:13 -08:00
Anton Kirilov	e9c4164b94	Call membarrier() after making JIT mappings executable on AArch64 Linux The membarrier() system call ensures that no processor has fetched a stale instruction stream. Copyright (c) 2021, Arm Limited.	2021-10-25 13:25:35 +01:00
Alex Crichton	e2a724ce18	Update the `object` crate to 0.27.0 (#3465 ) Mostly just keeping us up to date with changes there since we somewhat heavily rely on it now.	2021-10-20 10:52:06 -05:00
Anton Kirilov	a986cf2438	Increase the default code section alignment to 64 KB for AArch64 targets (#3424 ) Some platforms such as AArch64 Linux support different memory page sizes, so we need to be conservative when choosing the code section alignment (which is equal to the page size) by using the maximum. Copyright (c) 2021, Arm Limited.	2021-10-07 12:49:40 -05:00
Alex Crichton	38463d11ed	Load generated trampolines into jitdump when profiling (#3344 ) * Load generated trampolines into jitdump when profiling This commit updates the jitdump profiler to generate JIT profiling records for generated trampolines in a wasm module in addition to the functions already in a module. It's also updated to learn about trampolines generated via `Func::new` and friends. These trampolines were all not previously registered meaning that stack traces with these pc values would be confusing to see in the profile output. While the names aren't the best it should at least be more clear than before if a function is hot! * Fix more builds	2021-09-21 13:05:31 -05:00
Dan Gohman	47490b4383	Use rsix to make system calls in Wasmtime. (#3355 ) * Use rsix to make system calls in Wasmtime. `rsix` is a system call wrapper crate that we use in `wasi-common`, which can provide the following advantages in the rest of Wasmtime: - It eliminates some `unsafe` blocks in Wasmtime's code. There's still an `unsafe` block in the library, but this way, the `unsafe` is factored out and clearly scoped. - And, it makes error handling more consistent, factoring out code for checking return values and `io::Error::last_os_error()`, and code that does `errno::set_errno(0)`. This doesn't cover all system calls; `rsix` doesn't implement signal-handling APIs, and this doesn't cover calls made through `std` or crates like `userfaultfd`, `rand`, and `region`.	2021-09-17 15:28:56 -07:00
Nick Fitzgerald	4b256ab968	Place unwind info directly after the text section, even when debug info is enabled When debug info was enabled, we would put the debug info sections in between the text section and the unwind info section. But the unwind info is encoded in a position-independent manner (so that we don't need relocs for it) that relies on it directly following the text section. The result of the misplacement was some crashes inside the unwinder.	2021-09-09 13:39:30 -07:00
Nick Fitzgerald	0499cca2fa	Name unwind info `.eh_frame` in Wasmtime's compiled ELF artifact We were previously using `_wasmtime_eh_frame` but there is no good reason to add the Wasmtime-specific prefix. Using the standard name allows for better inspection with standard tools like `dwarfdump`.	2021-09-09 12:54:49 -07:00
Alex Crichton	1532516a36	Use relative `call` instructions between wasm functions (#3275 ) * Use relative `call` instructions between wasm functions This commit is a relatively major change to the way that Wasmtime generates code for Wasm modules and how functions call each other. Prior to this commit all function calls between functions, even if they were defined in the same module, were done indirectly through a register. To implement this the backend would emit an absolute 8-byte relocation near all function calls, load that address into a register, and then call it. While this technique is simple to implement and easy to get right, it has two primary downsides associated with it: * Function calls are always indirect which means they are more difficult to predict, resulting in worse performance. * Generating a relocation-per-function call requires expensive relocation resolution at module-load time, which can be a large contributing factor to how long it takes to load a precompiled module. To fix these issues, while also somewhat compromising on the previously simple implementation technique, this commit switches wasm calls within a module to using the `colocated` flag enabled in Cranelift-speak, which basically means that a relative call instruction is used with a relocation that's resolved relative to the pc of the call instruction itself. When switching the `colocated` flag to `true` this commit is also then able to move much of the relocation resolution from `wasmtime_jit::link` into `wasmtime_cranelift::obj` during object-construction time. This frontloads all relocation work which means that there's actually no relocations related to function calls in the final image, solving both of our points above. The main gotcha in implementing this technique is that there are hardware limitations to relative function calls which mean we can't simply blindly use them. 
AArch64, for example, can only go +/- 64 MB from the `bl` instruction to the target, which means that if the function we're calling is a greater distance away then we would fail to resolve that relocation. On x86_64 the limits are +/- 2GB which are much larger, but theoretically still feasible to hit. Consequently the main increase in implementation complexity is fixing this issue. This issue is actually already present in Cranelift itself, and is internally one of the invariants handled by the `MachBuffer` type. When generating a function, relative jumps between basic blocks have similar restrictions. This commit adds new methods for the `MachBackend` trait and updates the implementation of `MachBuffer` to account for all these new branches. Specifically the changes to `MachBuffer` are: * For AArch64 the `LabelUse::Branch26` value now supports veneers, and AArch64 calls use this to resolve relocations. * The `emit_island` function has been rewritten internally to handle some cases which didn't come up before, such as: * When emitting an island the deadline is now recalculated, where previously it was always set to infinitely far in the future. This was ok prior since only a `Branch19` supported veneers and once it was promoted no veneers were supported, so without multiple layers of promotion the lack of a new deadline was ok. * When emitting an island all pending fixups had veneers forced if their branch target wasn't known yet. This was generally ok for 19-bit fixups since the only kind getting a veneer was a 19-bit fixup, but with mixed kinds it's a bit odd to force veneers for a 26-bit fixup just because a nearby 19-bit fixup needed a veneer. Instead fixups are now re-enqueued unless they're known to be out-of-bounds. This may run the risk of generating more islands for 19-bit branches but it should also reduce the number of islands for between-function calls. 
* Otherwise the internal logic was tweaked to ideally be a bit more simple, but that's a pretty subjective criterion in compilers... I've added some simple testing of this for now. A synthetic compiler option was created to simply add padded 0s between functions and test cases implement various forms of calls that at least need veneers. A test is also included for x86_64, but it is unfortunately pretty slow because it requires generating 2GB of output. I'm hoping for now it's not too bad, but we can disable the test if it's prohibitive and otherwise just comment the necessary portions to be sure to run the ignored test if these parts of the code have changed. The final end-result of this commit is that for a large module I'm working with the number of relocations dropped to zero, meaning that nothing actually needs to be done to the text section when it's loaded into memory (yay!). I haven't run final benchmarks yet but this is the last remaining source of significant slowdown when loading modules, after I land a number of other PRs both active and ones that I only have locally for now. * Fix arm32 * Review comments	2021-09-01 13:27:38 -05:00
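The range restriction on relative calls discussed in the commit above can be sketched as a check on the signed displacement between the call site and its target. The function `relative_disp` is hypothetical and simplified: real instruction encodings measure the displacement from an instruction-specific base address and may scale the immediate, and Cranelift's `MachBuffer` logic is considerably more involved. The 32-bit width used in `main` corresponds to the +/- 2GB x86_64 limit mentioned in the message.

```rust
/// Return the displacement from `from` to `to` if it fits in a signed
/// immediate of `bits` bits, or `None` if the target is out of range
/// and some fallback (such as a veneer) would be required.
/// Hypothetical helper for illustration only.
fn relative_disp(from: u64, to: u64, bits: u32) -> Option<i64> {
    assert!(bits >= 1 && bits <= 63);
    let disp = (to as i64).wrapping_sub(from as i64);
    let min = -(1i64 << (bits - 1));
    let max = (1i64 << (bits - 1)) - 1;
    if disp >= min && disp <= max {
        Some(disp)
    } else {
        None
    }
}

fn main() {
    // A nearby callee: the displacement fits in a 32-bit immediate.
    println!("{:?}", relative_disp(0x1000, 0x2000, 32));

    // A callee exactly 2 GiB ahead: one past the largest positive
    // 32-bit displacement, so a direct relative call cannot reach it.
    println!("{:?}", relative_disp(0, 1u64 << 31, 32));
}
```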
Alex Crichton	9e0c910023	Add a `Module::deserialize_file` method (#3266 ) * Add a `Module::deserialize_file` method This commit adds a new method to the `wasmtime::Module` type, `deserialize_file`. This is intended to be the same as the `deserialize` method except that the serialized module is present as an on-disk file. This enables Wasmtime to internally use `mmap` to avoid copying bytes around and generally makes loading a module much faster. A C API is also added in this commit so that various bindings can use this accelerated path. Another option perhaps for a Rust-based API is to have an API taking a `File` itself to allow for a custom file descriptor in one way or another, but for now that's left for a possible future refactoring if we find a use case. * Fix compat with main - handle readonly mmap * wip * Try to fix Windows support	2021-08-31 13:05:51 -05:00
Alex Crichton	ef3ec594ce	Don't copy executable code into a `CodeMemory` (#3265 ) * Don't copy executable code into a `CodeMemory` This commit moves a copy from compiled artifacts into a `CodeMemory`. In general this commit drastically changes the meaning of a `CodeMemory`. Previously it was an iteratively-pushed-on structure that would accumulate executable code over time. Afterwards, however, it's a manager for an `MmapVec` which updates the permissions on text section to ensure that the pages are executable. By taking ownership of an `MmapVec` within a `CodeMemory` there's no need to copy any data around, which means that the `.text` section in the ELF image produced by Wasmtime is usable as-is after placement in memory and relocations have been resolved. This moves Wasmtime one step closer to being able to directly use a module after it's `mmap`'d into memory, optimizing when a module is loaded. * Fix windows section alignment * Review comments	2021-08-30 13:38:35 -05:00
Alex Crichton	eb251deca9	Remove `scroll` dependency from `wasmtime-jit` (#3260 ) Similar functionality to `scroll` is provided with the `object` crate and doesn't have a `*_derive` crate to go with it. This commit updates the jitdump linux support to use `object` instead of `scroll` to achieve the needs of writing structs-as-bytes onto disk.	2021-08-30 13:26:07 -05:00
Alex Crichton	a237e73b5a	Remove some allocations in `CodeMemory` (#3253 ) * Remove some allocations in `CodeMemory` This commit removes the `FinishedFunctions` type as well as allocations associated with trampolines when allocating inside of a `CodeMemory`. The main goal of this commit is to improve the time spent in `CodeMemory` where currently today a good portion of time is spent simply parsing symbol names and trying to extract function indices from them. Instead this commit implements a new strategy (different from #3236) where compilation records offset/length information for all functions/trampolines so this doesn't need to be re-learned from the object file later. A consequence of this commit is that this offset information will be decoded/encoded through `bincode` unconditionally, but we can also optimize that later if necessary as well. Internally this involved quite a bit of refactoring since the previous map for `FinishedFunctions` was relatively heavily relied upon. * comments	2021-08-30 10:35:17 -05:00
Alex Crichton	c73be1f13a	Use an mmap-friendly serialization format (#3257 ) * Use an mmap-friendly serialization format This commit reimplements the main serialization format for Wasmtime's precompiled artifacts. Previously they were generally a binary blob of `bincode`-encoded metadata prefixed with some versioning information. The downside of this format, though, is that loading a precompiled artifact required pushing all information through `bincode`. This is inefficient when some data, such as trap/address tables, are rarely accessed. The new format added in this commit is one which is designed to be `mmap`-friendly. This means that the relevant parts of the precompiled artifact are already page-aligned for updating permissions of pieces here and there. Additionally the artifact is optimized so that if data is rarely read then we can delay reading it until necessary. The new artifact format for serialized modules is an ELF file. This is not a public API guarantee, so it cannot be relied upon. In the meantime though this is quite useful for exploring precompiled modules with standard tooling like `objdump`. The ELF file is already constructed as part of module compilation, and this is the main contents of the serialized artifact. There is some extra information, though, not encoded in each module's individual ELF file such as type information. This information continues to be `bincode`-encoded, but it's intended to be much smaller and much faster to deserialize. This extra information is appended to the end of the ELF file. This means that the original ELF file is still a valid ELF file, we just get to have extra bits at the end. More information on the new format can be found in the module docs of the serialization module of Wasmtime. Another refactoring implemented as part of this commit is to deserialize and store object files directly in `mmap`-backed storage. 
This avoids the need to copy bytes after the artifact is loaded into memory for each compiled module, and in a future commit it opens up the door to avoiding copying the text section into a `CodeMemory`. For now, though, the main change is that copies are not necessary when loading from a precompiled compilation artifact once the artifact is itself in mmap-based memory. To assist with managing `mmap`-based memory a new `MmapVec` type was added to `wasmtime_jit` which acts as a form of `Vec<T>` backed by a `wasmtime_runtime::Mmap`. This type notably supports `drain(..N)` to slice the buffer into disjoint regions that are all separately owned, such as having a separately owned window into one artifact for all object files contained within. Finally this commit implements a small refactoring in `wasmtime-cache` to use the standard artifact format for cache entries rather than a bincode-encoded version. This required some more hooks for serializing/deserializing but otherwise the crate still performs as before. * Review comments	2021-08-30 09:19:20 -05:00
Alex Crichton	12515e6646	Move trap information to a section of the compiled image (#3241 ) This commit moves the `traps` field of `FunctionInfo` into a section of the compiled artifact produced by Cranelift. This section is quite large and when previously encoded/decoded with `bincode` this can take quite some time to process. Traps are expected to be relatively rare and it's not necessarily the right tradeoff to spend so much time serializing/deserializing this data, so this commit offloads the section into a custom-encoded binary format located elsewhere in the compiled image. This is similar to #3240 in its goal which is to move very large pieces of metadata to their own sections to avoid decoding anything when we load a precompiled module. This also has a small benefit that it's slightly more efficient storage for the trap information too, but that's a negligible benefit. This is part of #3230 to make loading modules fast.	2021-08-27 01:09:55 -05:00

207 Commits