Commit Graph

12 Commits

Author SHA1 Message Date
Alex Crichton
ff0c45b4a0 Minor changes for components related to wit-bindgen support (#5053)
* Plumb type exports in components around more

This commit adds some more plumbing for type exports to ensure that they
show up in the final compiled representation of a component. For now
they continued to be ignored for all purposes in the embedding API
itself but I found this useful to explore in `wit-bindgen` based tooling
which is leveraging the component parsing in Wasmtime.

* Add a field to `ModuleTranslation` to store the original wasm

This commit adds a field to be able to refer back to the original wasm
binary for a `ModuleTranslation`. This field is used in the upcoming
support for host generation in `wit-component` to "decompile" a
component into core wasm modules to get instantiated. This is used to
extract a core wasm module from the original component.

* FIx a build warning
2022-10-13 12:11:34 -05:00
Peter Huene
42233e8eda components: ignore export aliases to types in translation. (#4604)
* components: ignore export aliases to types in translation.

Currently, translation is ignoring type exports from components during
translation by skipping over them before adding them to the exports map.

If a component instantiates an inner component and aliases a type export of
that instance, it will cause wasmtime to panic with a failure to find the
export in the exports map.

The fix is to add a representation for exported types to the map that is simply
ignored when encountered. This also makes it easier to track places where we
would have to support type exports in translation in the future.

* Keep type information for type exports.

This commit keeps the type information for type exports so that types can be
properly aliased from an instance export and thereby adjusting the type index
space accordingly.

* Add a simple test case for type exports for the component model.
2022-08-04 22:45:11 +00:00
Alex Crichton
b4d7ab36f9 Add a dataflow-based representation of components (#4597)
* Add a dataflow-based representation of components

This commit updates the inlining phase of compiling a component to
creating a dataflow-based representation of a component instead of
creating a final `Component` with a linear list of initializers. This
dataflow graph is then linearized in a final step to create the actual
final `Component`.

The motivation for this commit stems primarily from my work implementing
strings in fused adapters. In doing this my plan is to defer most
low-level transcoding to the host itself rather than implementing that
in the core wasm adapter modules. This means that small
cranelift-generated trampolines will be used for adapter modules to call
which then call "transcoding libcalls". The cranelift-generated
trampolines will get raw pointers into linear memory and pass those to
the libcall which core wasm doesn't have access to when passing
arguments to an import.

Implementing this with the previous representation of a `Component` was
becoming too tricky to bear. The initialization of a transcoder needed
to happen at just the right time: before the adapter module which needed
it was instantiated but after the linear memories referenced had been
extracted into the `VMComponentContext`. The difficulty here is further
compounded by the current adapter module injection pass already being
quite complicated. Adapter modules are already renumbering the index
space of runtime instances and shuffling items around in the
`GlobalInitializer` list. Perhaps the worst part of this was that
memories could already be referenced by host function imports or exports
to the host, and if adapters referenced the same memory it shouldn't be
referenced twice in the component. This meant that `ExtractMemory`
initializers ideally needed to be shuffled around in the initializer
list to happen as early as possible instead of wherever they happened to
show up during translation.

Overall I did my best to implement the transcoders but everything always
came up short. I have decided to throw my hands up in the air and try a
completely different approach to this, namely the dataflow-based
representation in this commit. This makes it much easier to edit the
component after initial translation for injection of adapters, injection
of transcoders, adding dependencies on possibly-already-existing items,
etc. The adapter module partitioning pass in this commit was greatly
simplified to something which I believe is functionally equivalent but
is probably an order of magnitude easier to understand.

The biggest downside of this representation I believe is having a
duplicate representation of a component. The `component::info` was
largely duplicated into the `component::dfg` module in this commit.
Personally though I think this is a more appropriate tradeoff than
before because it's very easy to reason about "convert representation A
to B" code whereas it was very difficult to reason about shuffling
around `GlobalInitializer` items in optimal fashions. This may also have
a cost at compile-time in terms of shuffling data around, but my hope is
that we have lots of other low-hanging fruit to optimize if it ever
comes to that which allows keeping this easier-to-understand
representation.

Finally, to reiterate, the final representation of components is not
changed by this PR. To the runtime internals everything is still the
same.

* Fix compile of factc
2022-08-04 15:42:06 -05:00
Alex Crichton
285bc5ce24 Implement variant translation in fused adapters (#4534)
* Implement variant translation in fused adapters

This commit implements the most general case of variants for fused
adapter trampolines. Additionally a number of other primitive types are
filled out here to assist with testing variants. The implementation
internally was relatively straightforward given the shape of variants,
but there's room for future optimization as necessary especially around
converting locals to various types.

This commit also introduces a "one off" fuzzer for adapters to ensure
that the generated adapter is valid. I hope to extend this fuzz
generator as more types are implemented to assist in various corner
cases that might arise. For now the fuzzer simply tests that the output
wasm module is valid, not that it actually executes correctly. I hope to
integrate with a fuzzer along the lines of #4307 one day to test the
run-time-correctness of the generated adapters as well, at which point
this fuzzer would become obsolete.

Finally this commit also fixes an issue with `u8` translation where
upper bits weren't zero'd out and were passed raw across modules.
Instead smaller-than-32 types now all mask out their upper bits and do
sign-extension as appropriate for unsigned/signed variants.

* Fuzz memory64 in the new trampoline fuzzer

Currently memory64 isn't supported elsewhere in the component model
implementation of Wasmtime but the trampoline compiler seems as good a
place as any to ensure that it at least works in isolation. This plumbs
through fuzz input into a `memory64` boolean which gets fed into
compilation. Some miscellaneous bugs were fixed as a result to ensure
that memory64 trampolines all validate correctly.

* Tweak manifest for doc build
2022-07-27 09:14:43 -05:00
Alex Crichton
97894bc65e Add initial support for fused adapter trampolines (#4501)
* Add initial support for fused adapter trampolines

This commit lands a significant new piece of functionality to Wasmtime's
implementation of the component model in the form of the implementation
of fused adapter trampolines. Internally within a component core wasm
modules can communicate with each other by having their exports
`canon lift`'d to get `canon lower`'d into a different component. This
signifies that two components are communicating through a statically
known interface via the canonical ABI at this time. Previously Wasmtime
was able to identify that this communication was happening but it simply
panicked with `unimplemented!` upon seeing it. This commit is the
beginning of filling out this panic location with an actual
implementation.

The implementation route chosen here for fused adapters is to use a
WebAssembly module itself for the implementation. This means that, at
compile time of a component, Wasmtime is generating core WebAssembly
modules which then get recursively compiled within Wasmtime as well. The
choice to use WebAssembly itself as the implementation of fused adapters
stems from a few motivations:

* This does not represent a significant increase in the "trusted
  compiler base" of Wasmtime. Getting the Wasm -> CLIF translation
  correct once is hard enough much less for an entirely different IR to
  CLIF. By generating WebAssembly no new interactions with Cranelift are
  added which drastically reduces the possibilities for mistakes.

* Using WebAssembly means that component adapters are insulated from
  miscompilations and mistakes. If something goes wrong it's defined
  well within the WebAssembly specification how it goes wrong and what
  happens as a result. This means that the "blast zone" for a wrong
  adapter is the component instance but not the entire host itself.
  Accesses to linear memory are guaranteed to be in-bounds and otherwise
  handled via well-defined traps.

* A fully-finished fused adapter compiler is expected to be a
  significant and quite complex component of Wasmtime. Functionality
  along these lines is expected to be needed for Web-based polyfills of
  the component model and by using core WebAssembly it provides the
  opportunity to share code between Wasmtime and these polyfills for the
  component model.

* Finally the runtime implementation of managing WebAssembly modules is
  already implemented and quite easy to integrate with, so representing
  fused adapters with WebAssembly results in very little extra support
  necessary for the runtime implementation of instantiating and managing
  a component.

The compiler added in this commit is dubbed Wasmtime's Fused Adapter
Compiler of Trampolines (FACT) because who doesn't like deriving a name
from an acronym. Currently the trampoline compiler is limited in its
support for interface types and only supports a few primitives. I plan
on filing future PRs to flesh out the support here for all the variants
of `InterfaceType`. For now this PR is primarily focused on all of the
other infrastructure for the addition of a trampoline compiler.

With the choice to use core WebAssembly to implement fused adapters it
means that adapters need to be inserted into a module. Unfortunately
adapters cannot all go into a single WebAssembly module because adapters
themselves have dependencies which may be provided transitively through
instances that were instantiated with other adapters. This means that a
significant chunk of this PR (`adapt.rs`) is dedicated to determining
precisely which adapters go into precisely which adapter modules. This
partitioning process attempts to make large modules wherever it can to
cut down on core wasm instantiations but is likely not optimal as
it's just a simple heuristic today.

With all of this added together it's now possible to start writing
`*.wast` tests that internally have adapted modules communicating with
one another. A `fused.wast` test suite was added as part of this PR
which is the beginning of tests for the support of the fused adapter
compiler added in this PR. Currently this is primarily testing some
various topologies of adapters along with direct/indirect modes. This
will grow many more tests over time as more types are supported.

Overall I'm not 100% satisfied with the testing story of this PR. When a
test fails it's very difficult to debug since everything is written in
the text format of WebAssembly meaning there's no "conveniences" to
print out the state of the world when things go wrong and easily debug.
I think this will become even more apparent as more tests are written
for more types in subsequent PRs. At this time though I know of no
better alternative other than leaning pretty heavily on fuzz-testing to
ensure this is all exercised.

* Fix an unused field warning

* Fix tests in `wasmtime-runtime`

* Add some more tests for compiled trampolines

* Remap exports when injecting adapters

The exports of a component were accidentally left unmapped which meant
that they indexed the instance indexes pre-adapter module insertion.

* Fix typo

* Rebase conflicts
2022-07-25 23:13:26 +00:00
Alex Crichton
3032e3fcfb Track type information during component translation (#4448)
This commit augments the current translation phase of components with
extra machinery to track the type information of component items such as
instances, components, and functions. The end goal of this commit is to
enable the `Lower` instruction to know the type of the component
function being lowered. Currently during the inlining pass where
component fusion is detected the type of the lifted function is known,
but to implement fusion entirely the type of the lowered function must
be known. Note that these two types are expected to be different to
allow for the subtyping rules specified by the component model.

For now nothing is actually done with this information other than noting
its presence in the face of a lifted-then-lowered function. My hope
though was to split this out for a separate review to avoid making a
future component-adapter-compiler-containing-PR too large.
2022-07-18 09:21:40 -05:00
Alex Crichton
76a2545a7f Implement nested instance exports for components (#4364)
This commit adds support to Wasmtime for components which themselves
export instances. The support here adds new APIs for how instance
exports are accessed in the embedding API. For now this is mostly just a
first-pass where the API is somewhat confusing and has a lot of
lifetimes. I'm hoping that over time we can figure out how to simplify
this but for now it should at least be expressive enough for exploring
the exports of an instance.
2022-07-05 16:04:54 +00:00
Alex Crichton
f0278c5db7 Implement canon lower of a canon lift function in the same component (#4347)
* Implement `canon lower` of a `canon lift` function in the same component

This commit implements the "degenerate" logic for implementing a
function within a component that is lifted and then immediately lowered
again. In this situation the lowered function will immediately generate
a trap and doesn't need to implement anything else.

The implementation in this commit is somewhat heavyweight but I think is
probably justified moreso in future additions to the component model
rather than what exactly is here right now. It's not expected that this
"always trap" functionality will really be used all that often since it
would generally mean a buggy component, but the functionality plumbed
through here is hopefully going to be useful for implementing
component-to-component adapter trampolines.

Specifically this commit implements a strategy where the `canon.lower`'d
function is generated by Cranelift and simply has a single trap
instruction when called, doing nothing else. The main complexity comes
from juggling around all the data associated with these functions,
primarily plumbing through the traps into the `ModuleRegistry` to
ensure that the global `is_wasm_trap_pc` function returns `true` and at
runtime when we lookup information about the trap it's all readily
available (e.g. translating the trapping pc to a `TrapCode`).

* Fix non-component build

* Fix some offset calculations

* Only create one "always trap" per signature

Use an internal map to deduplicate during compilation.
2022-06-29 16:35:37 +00:00
Alex Crichton
eef1758d19 Implement a first-class error for reexported component functions (#4348)
Currently I don't know how we can reasonably implement this. Given all
the signatures of how we call functions and how functions are called on
the host there's no real feasible way that I know of to hook these two
up "seamlessly". This means that a component which reexports an imported
function can't be run in Wasmtime.

One of the main reasons for this is that when calling a component
function Wasmtime wants to lower arguments first and then have them
lifted when the host is called. With a reexport though there's not
actually anything to lower into so we'd sort of need something similar
to a table on the side or maybe a linear memory and that seems like it'd
get quite complicated quite quickly for not really all that much
benefit. As-such for now this simply returns a first-class error (rather
than the current panic) in situations like this.
2022-06-29 09:05:40 -05:00
Alex Crichton
c1b3962f7b Implement lowered-then-lifted functions (#4327)
* Implement lowered-then-lifted functions

This commit is a few features bundled into one, culminating in the
implementation of lowered-then-lifted functions for the component model.
It's probably not going to be used all that often but this is possible
within a valid component so Wasmtime needs to do something relatively
reasonable. The main things implemented in this commit are:

* Component instances are now assigned a `RuntimeComponentInstanceIndex`
  to differentiate each one. This will be used in the future to detect
  fusion (one instance lowering a function from another instance). For
  now it's used to allocate separate `VMComponentFlags` for each
  internal component instance.

* The `CoreExport<FuncIndex>` of lowered functions was changed to a
  `CoreDef` since technically a lowered function can use another lowered
  function as the callee. This ended up being not too difficult to plumb
  through as everything else was already in place.

* A need arose to compile host-to-wasm trampolines which weren't already
  present. Currently wasm in a component is always entered through a
  host-to-wasm trampoline but core wasm modules are the source of all
  the trampolines. In the case of a lowered-then-lifted function there
  may not actually be any core wasm modules, so component objects now
  contain necessary trampolines not otherwise provided by the core wasm
  objects. This feature required splitting a new function into the
  `Compiler` trait for creating a host-to-wasm trampoline. After doing
  this core wasm compilation was also updated to leverage this which
  further enabled compiling trampolines in parallel as opposed to the
  previous synchronous compilation.

* Review comments
2022-06-28 18:50:08 +00:00
Alex Crichton
3339dd1f01 Implement the post-return attribute (#4297)
This commit implements the `post-return` feature of the canonical ABI in
the component model. This attribute is an optionally-specified function
which is to be executed after the return value has been processed by the
caller to optionally clean-up the return value. This enables, for
example, returning an allocated string and the host then knows how to
clean it up to prevent memory leaks in the original module.

The API exposed in this PR changes the prior `TypedFunc::call` API in
behavior but not in its signature. Previously the `TypedFunc::call`
method would set the `may_enter` flag on the way out, but now that
operation is deferred until a new `TypedFunc::post_return` method is
called. This means that once a method on an instance is invoked then
nothing else can be done on the instance until the `post_return` method
is called. Note that the method must be called irrespective of whether
the `post-return` canonical ABI option was specified or not. Internally
wasm will be invoked if necessary.

This is a pretty wonky and unergonomic API to work with. For now I
couldn't think of a better alternative that improved on the ergonomics.
In the theory that the raw Wasmtime bindings for a component may not be
used all that heavily (instead `wit-bindgen` would largely be used) I'm
hoping that this isn't too much of an issue in the future.

cc #4185
2022-06-23 14:36:21 -05:00
Alex Crichton
651f40855f Add support for nested components (#4285)
* Add support for nested components

This commit is an implementation of a number of features of the
component model including:

* Defining nested components
* Outer aliases to components and modules
* Instantiating nested components

The implementation here is intended to be a foundational pillar of
Wasmtime's component model support since recursion and nested components
are the bread-and-butter of the component model. At a high level the
intention for the component model implementation in Wasmtime has long
been that the recursive nature of components is "erased" at compile time
to something that's more optimized and efficient to process. This commit
ended up exemplifying this quite well where the vast majority of the
internal changes here are in the "compilation" phase of a component
rather than the runtime instantiation phase. The support in the
`wasmtime` crate, the runtime instantiation support, only had minor
updates here while the internals of translation have seen heavy updates.

The `translate` module was greatly refactored here in this commit.
Previously it would, as a component is parsed, create a final
`Component` to hand off to trampoline compilation and get persisted at
runtime. Instead now it's a thin layer over `wasmparser` which simply
records a list of `LocalInitializer` entries for how to instantiate the
component and its index spaces are built. This internal representation
of the instantiation of a component is pretty close to the binary format
intentionally.

Instead of performing dataflow legwork the `translate` phase of a
component is now responsible for two primary tasks:

1. All components and modules are discovered within a component. They're
   assigned `Static{Component,Module}Index` depending on where they're
   found and a `{Module,}Translation` is prepared for each one. This
   "flattens" the recursive structure of the binary into an indexed list
   processable later.

2. The lexical scope of components is managed here to implement outer
   module and component aliases. This is a significant design
   implementation because when closing over an outer component or module
   that item may actually be imported or something like the result of a
   previous instantiation. This means that the capture of
   modules and components is both a lexical concern as well as a runtime
   concern. The handling of the "runtime" bits are handled in the next
   phase of compilation.

The next and currently final phase of compilation is a new pass where
much of the historical code in `translate.rs` has been moved to (but
heavily refactored). The goal of compilation is to produce one "flat"
list of initializers for a component (as happens prior to this PR) and
to achieve this an "inliner" phase runs which runs through the
instantiation process at compile time to produce a list of initializers.
This `inline` module is the main addition as part of this PR and is now
the workhorse for dataflow analysis and tracking what's actually
referring to what.

During the `inline` phase the local initializers recorded in the
`translate` phase are processed, in sequence, to instantiate a
component. Definitions of items are tracked to correspond to their root
definition which allows seeing across instantiation argument boundaries
and such. Handling "upvars" for component outer aliases is handled in
the `inline` phase as well by creating state for a component whenever a
component is defined as was recorded during the `translate` phase.
Finally this phase is chiefly responsible for doing all string-based
name resolution at compile time that it can. This means that at runtime
no string maps will need to be consulted for item exports and such.
The final result of inlining is a list of "global initializers" which is
a flat list processed during instantiation time. These are almost
identical to the initializers that were processed prior to this PR.

There are certainly still more gaps of the component model to implement
but this should be a major leg up in terms of functionality that
Wasmtime implements. This commit, however leaves behind a "hole" which
is not intended to be filled in at this time, namely importing and
exporting components at the "root" level from and to the host. This is
tracked and explained in more detail as part of #4283.

cc #4185 as this completes a number of items there

* Tweak code to work on stable without warning

* Review comments
2022-06-21 13:48:56 -05:00