Add support for nested components (#4285)

* Add support for nested components

This commit is an implementation of a number of features of the
component model including:

* Defining nested components
* Outer aliases to components and modules
* Instantiating nested components

The implementation here is intended to be a foundational pillar of
Wasmtime's component model support since recursion and nested components
are the bread-and-butter of the component model. At a high level the
intention for the component model implementation in Wasmtime has long
been that the recursive nature of components is "erased" at compile time
to something that's more optimized and efficient to process. This commit
ended up exemplifying this quite well where the vast majority of the
internal changes here are in the "compilation" phase of a component
rather than the runtime instantiation phase. The support in the
`wasmtime` crate, the runtime instantiation support, only had minor
updates here while the internals of translation have seen heavy updates.

The `translate` module was greatly refactored in this commit.
Previously it would, as a component is parsed, create a final
`Component` to hand off to trampoline compilation and get persisted at
runtime. Instead it is now a thin layer over `wasmparser` which simply
records a list of `LocalInitializer` entries describing how to
instantiate the component and how its index spaces are built. This
internal representation of the instantiation of a component is
intentionally pretty close to the binary format.

Instead of performing dataflow legwork the `translate` phase of a
component is now responsible for two primary tasks:

1. All components and modules are discovered within a component. They're
   assigned `Static{Component,Module}Index` depending on where they're
   found and a `{Module,}Translation` is prepared for each one. This
   "flattens" the recursive structure of the binary into an indexed list
   processable later.

2. The lexical scope of components is managed here to implement outer
   module and component aliases. This is a significant design
   consideration because when closing over an outer component or module
   that item may actually be imported or something like the result of a
   previous instantiation. This means that the capture of
   modules and components is both a lexical concern as well as a runtime
   concern. The "runtime" bits are handled in the next
   phase of compilation.
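
As a rough illustration of the first task above, flattening a recursive
structure into an indexed list might look like the sketch below. The
`Nested`, `StaticIndex`, and `Flat` names are hypothetical stand-ins, not
the types used in `wasmtime-environ`, and assigning indices in discovery
order is an assumption made for this example.

```rust
/// Hypothetical stand-in for a parsed component: a name plus nested children.
#[derive(Debug)]
pub struct Nested {
    pub name: &'static str,
    pub children: Vec<Nested>,
}

/// Hypothetical analogue of a static index into the flattened list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StaticIndex(pub usize);

/// One flattened entry: a name and the static indices of its direct children.
#[derive(Debug, PartialEq, Eq)]
pub struct Flat {
    pub name: &'static str,
    pub children: Vec<StaticIndex>,
}

/// Pushes `node` and everything nested inside it into `out`, returning the
/// index assigned to `node`.
pub fn flatten(node: &Nested, out: &mut Vec<Flat>) -> StaticIndex {
    // Reserve this node's slot first so a parent precedes its children.
    let idx = StaticIndex(out.len());
    out.push(Flat { name: node.name, children: Vec::new() });

    // Recurse into children; each call appends to the same flat list.
    let mut kids = Vec::new();
    for child in &node.children {
        kids.push(flatten(child, out));
    }
    out[idx.0].children = kids;
    idx
}
```

For a root containing `a` (which itself contains `b`) and `c`, this
produces four entries, with the root referring to its children as
indices 1 and 3 rather than by nesting.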

The next and currently final phase of compilation is a new pass into
which much of the historical code in `translate.rs` has been moved
(though heavily refactored). The goal of compilation is to produce one
"flat" list of initializers for a component (as happened prior to this
PR). To achieve this an "inliner" phase runs through the
instantiation process at compile time to produce a list of initializers.
This `inline` module is the main addition as part of this PR and is now
the workhorse for dataflow analysis and tracking what's actually
referring to what.
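
To make "running through the instantiation process at compile time" a
bit more concrete, here is a deliberately tiny sketch. `Local`, `Global`,
and `inline` are hypothetical names for this example only, and the real
`inline` module tracks far more (arguments, exports, aliases) than this
does.

```rust
/// Hypothetical per-component instruction recorded by a translate-like phase.
pub enum Local {
    /// Instantiate the core module with this static index.
    InstantiateModule(usize),
    /// Instantiate the nested component with this static index.
    InstantiateComponent(usize),
}

/// Hypothetical flat instruction that is left over for runtime.
#[derive(Debug, PartialEq, Eq)]
pub enum Global {
    InstantiateModule(usize),
}

/// Processes the local initializers of component `idx` in sequence. Nested
/// component instantiations recurse here, so only module instantiations
/// survive into the flat `out` list.
///
/// Assumes the nesting is a tree (no component instantiates itself).
pub fn inline(components: &[Vec<Local>], idx: usize, out: &mut Vec<Global>) {
    for init in &components[idx] {
        match init {
            Local::InstantiateModule(m) => out.push(Global::InstantiateModule(*m)),
            Local::InstantiateComponent(c) => inline(components, *c, out),
        }
    }
}
```

The recursion exists only in this compile-time walk; the resulting
`Vec<Global>` is processed front to back with no notion of nesting.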

During the `inline` phase the local initializers recorded in the
`translate` phase are processed, in sequence, to instantiate a
component. Definitions of items are tracked to correspond to their root
definition which allows seeing across instantiation argument boundaries
and such. "Upvars" for component outer aliases are handled in
the `inline` phase as well by creating state for a component whenever a
component is defined as was recorded during the `translate` phase.
Finally this phase is chiefly responsible for doing all string-based
name resolution at compile time that it can. This means that at runtime
no string maps will need to be consulted for item exports and such.
The final result of inlining is a list of "global initializers" which is
a flat list processed during instantiation time. These are almost
identical to the initializers that were processed prior to this PR.
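
The compile-time name resolution described above can be sketched roughly
as follows. `Item` loosely mirrors the `ExportItem::Index` /
`ExportItem::Name` split visible in the diff below, but the enum, the
`resolve` function, and the use of a `HashMap` here are assumptions for
illustration only.

```rust
use std::collections::HashMap;

/// Hypothetical result of resolving an export: either an index found at
/// compile time or a name that must still be looked up at runtime.
#[derive(Debug, PartialEq, Eq)]
pub enum Item {
    Index(u32),
    Name(String),
}

/// Resolves `name` against `exports` when the module is statically known
/// (`Some`). When the module is imported from the host (`None`) its shape
/// is unknown at compile time, so the name is kept for a runtime lookup.
///
/// Returns `None` if a statically known module lacks the export.
pub fn resolve(exports: Option<&HashMap<String, u32>>, name: &str) -> Option<Item> {
    match exports {
        Some(map) => map.get(name).copied().map(Item::Index),
        None => Some(Item::Name(name.to_string())),
    }
}
```

Only the `Item::Name` case needs a string map at instantiation time,
which matches the note in the diff that name lookups remain only for
host-supplied modules.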

There are certainly still more gaps of the component model to implement
but this should be a major leg up in terms of functionality that
Wasmtime implements. This commit, however, leaves behind a "hole" which
is not intended to be filled in at this time, namely importing and
exporting components at the "root" level from and to the host. This is
tracked and explained in more detail as part of #4283.

cc #4185 as this completes a number of items there

* Tweak code to work on stable without warning

* Review comments
Alex Crichton
2022-06-21 13:48:56 -05:00
committed by GitHub
parent b306368565
commit 651f40855f
14 changed files with 2303 additions and 858 deletions

@@ -7,8 +7,7 @@ use std::path::Path;
 use std::ptr::NonNull;
 use std::sync::Arc;
 use wasmtime_environ::component::{
-ComponentTypes, Initializer, LoweredIndex, ModuleUpvarIndex, TrampolineInfo, Translation,
-Translator,
+ComponentTypes, GlobalInitializer, LoweredIndex, StaticModuleIndex, TrampolineInfo, Translator,
 };
 use wasmtime_environ::PrimaryMap;
 use wasmtime_jit::CodeMemory;
@@ -28,7 +27,7 @@ struct ComponentInner {
 /// Core wasm modules that the component defined internally, indexed by the
 /// compile-time-assigned `ModuleUpvarIndex`.
-upvars: PrimaryMap<ModuleUpvarIndex, Module>,
+static_modules: PrimaryMap<StaticModuleIndex, Module>,
 /// Registered core wasm signatures of this component, or otherwise the
 /// mapping of the component-local `SignatureIndex` to the engine-local
@@ -111,29 +110,26 @@ impl Component {
 let mut validator =
 wasmparser::Validator::new_with_features(engine.config().features.clone());
 let mut types = Default::default();
-let translation = Translator::new(tunables, &mut validator, &mut types)
+let (component, modules) = Translator::new(tunables, &mut validator, &mut types)
 .translate(binary)
 .context("failed to parse WebAssembly module")?;
 let types = Arc::new(types.finish());
-let Translation {
-component, upvars, ..
-} = translation;
-let (upvars, trampolines) = engine.join_maybe_parallel(
+let (static_modules, trampolines) = engine.join_maybe_parallel(
 // In one (possibly) parallel task all the modules found within this
 // component are compiled. Note that this will further parallelize
 // function compilation internally too.
 || -> Result<_> {
-let upvars = upvars.into_iter().map(|(_, t)| t).collect::<Vec<_>>();
+let upvars = modules.into_iter().map(|(_, t)| t).collect::<Vec<_>>();
 let modules = engine.run_maybe_parallel(upvars, |module| {
 let (mmap, info) =
 Module::compile_functions(engine, module, types.module_types())?;
-// FIXME: the `SignatureCollection` here is re-registering the
-// entire list of wasm types within `types` on each invocation.
-// That's ok semantically but is quite slow to do so. This
-// should build up a mapping from `SignatureIndex` to
-// `VMSharedSignatureIndex` once and then reuse that for each
-// module somehow.
+// FIXME: the `SignatureCollection` here is re-registering
+// the entire list of wasm types within `types` on each
+// invocation. That's ok semantically but is quite slow to
+// do so. This should build up a mapping from
+// `SignatureIndex` to `VMSharedSignatureIndex` once and
+// then reuse that for each module somehow.
 Module::from_parts(engine, mmap, info, types.clone())
 })?;
@@ -146,7 +142,7 @@ impl Component {
 .initializers
 .iter()
 .filter_map(|init| match init {
-Initializer::LowerImport(i) => Some(i),
+GlobalInitializer::LowerImport(i) => Some(i),
 _ => None,
 })
 .collect::<Vec<_>>();
@@ -162,7 +158,7 @@ impl Component {
 Ok((trampolines, wasmtime_jit::mmap_vec_from_obj(obj)?))
 },
 );
-let upvars = upvars?;
+let static_modules = static_modules?;
 let (trampolines, trampoline_obj) = trampolines?;
 let mut trampoline_obj = CodeMemory::new(trampoline_obj);
 let code = trampoline_obj.publish()?;
@@ -180,12 +176,12 @@ impl Component {
 Ok(Component {
 inner: Arc::new(ComponentInner {
 component,
-upvars,
+static_modules,
 types,
-trampolines,
-signatures,
 trampoline_obj,
 text,
+signatures,
+trampolines,
 }),
 })
 }
@@ -194,8 +190,8 @@ impl Component {
 &self.inner.component
 }
-pub(crate) fn upvar(&self, idx: ModuleUpvarIndex) -> &Module {
-&self.inner.upvars[idx]
+pub(crate) fn static_module(&self, idx: StaticModuleIndex) -> &Module {
+&self.inner.static_modules[idx]
 }
 pub(crate) fn types(&self) -> &Arc<ComponentTypes> {

@@ -7,11 +7,11 @@ use anyhow::{anyhow, Context, Result};
 use std::marker;
 use std::sync::Arc;
 use wasmtime_environ::component::{
-ComponentTypes, CoreDef, CoreExport, Export, ExportItem, Initializer, InstantiateModule,
-LowerImport, RuntimeImportIndex, RuntimeInstanceIndex, RuntimeMemoryIndex, RuntimeModuleIndex,
-RuntimeReallocIndex,
+ComponentTypes, CoreDef, CoreExport, Export, ExportItem, ExtractMemory, ExtractRealloc,
+GlobalInitializer, InstantiateModule, LowerImport, RuntimeImportIndex, RuntimeInstanceIndex,
+RuntimeModuleIndex,
 };
-use wasmtime_environ::{EntityIndex, MemoryIndex, PrimaryMap};
+use wasmtime_environ::{EntityIndex, PrimaryMap};
 use wasmtime_runtime::component::{ComponentInstance, OwnedComponentInstance};
 /// An instantiated component.
@@ -167,6 +167,16 @@ impl InstanceData {
 let instance = store.instance_mut(id);
 let idx = match &item.item {
 ExportItem::Index(idx) => (*idx).into(),
+// FIXME: ideally at runtime we don't actually do any name lookups
+// here. This will only happen when the host supplies an imported
+// module so while the structure can't be known at compile time we
+// do know at `InstancePre` time, for example, what all the host
+// imports are. In theory we should be able to, as part of
+// `InstancePre` construction, perform all name=>index mappings
+// during that phase so the actual instantiation of an `InstancePre`
+// skips all string lookups. This should probably only be
+// investigated if this becomes a performance issue though.
+ExportItem::Name(name) => instance.module().exports[name],
 };
 instance.get_export_by_index(idx)
@@ -220,13 +230,13 @@ impl<'a> Instantiator<'a> {
 let env_component = self.component.env_component();
 for initializer in env_component.initializers.iter() {
 match initializer {
-Initializer::InstantiateModule(m) => {
+GlobalInitializer::InstantiateModule(m) => {
 let module;
 let imports = match m {
 // Since upvars are statically know we know that the
 // `args` list is already in the right order.
-InstantiateModule::Upvar(idx, args) => {
-module = self.component.upvar(*idx);
+InstantiateModule::Static(idx, args) => {
+module = self.component.static_module(*idx);
 self.build_imports(store.0, module, args.iter())
 }
@@ -234,6 +244,10 @@ impl<'a> Instantiator<'a> {
 // lookups with strings to determine the order of the
 // imports since it's whatever the actual module
 // requires.
+//
+// FIXME: see the note in `ExportItem::Name` handling
+// above for how we ideally shouldn't do string lookup
+// here.
 InstantiateModule::Import(idx, args) => {
 module = match &self.imports[*idx] {
 RuntimeImport::Module(m) => m,
@@ -255,23 +269,21 @@ impl<'a> Instantiator<'a> {
 self.data.instances.push(i);
 }
-Initializer::LowerImport(import) => self.lower_import(import),
+GlobalInitializer::LowerImport(import) => self.lower_import(import),
-Initializer::ExtractMemory { index, export } => {
-self.extract_memory(store.0, *index, export)
+GlobalInitializer::ExtractMemory(mem) => self.extract_memory(store.0, mem),
+GlobalInitializer::ExtractRealloc(realloc) => {
+self.extract_realloc(store.0, realloc)
 }
-Initializer::ExtractRealloc { index, def } => {
-self.extract_realloc(store.0, *index, def)
-}
-Initializer::SaveModuleUpvar(idx) => {
+GlobalInitializer::SaveStaticModule(idx) => {
 self.data
 .exported_modules
-.push(self.component.upvar(*idx).clone());
+.push(self.component.static_module(*idx).clone());
 }
-Initializer::SaveModuleImport(idx) => {
+GlobalInitializer::SaveModuleImport(idx) => {
 self.data.exported_modules.push(match &self.imports[*idx] {
 RuntimeImport::Module(m) => m.clone(),
 _ => unreachable!(),
@@ -307,30 +319,22 @@ impl<'a> Instantiator<'a> {
 self.data.funcs.push(func.clone());
 }
-fn extract_memory(
-&mut self,
-store: &mut StoreOpaque,
-index: RuntimeMemoryIndex,
-export: &CoreExport<MemoryIndex>,
-) {
-let memory = match self.data.lookup_export(store, export) {
+fn extract_memory(&mut self, store: &mut StoreOpaque, memory: &ExtractMemory) {
+let mem = match self.data.lookup_export(store, &memory.export) {
 wasmtime_runtime::Export::Memory(m) => m,
 _ => unreachable!(),
 };
-self.data.state.set_runtime_memory(index, memory.definition);
+self.data
+.state
+.set_runtime_memory(memory.index, mem.definition);
 }
-fn extract_realloc(
-&mut self,
-store: &mut StoreOpaque,
-index: RuntimeReallocIndex,
-def: &CoreDef,
-) {
-let anyfunc = match self.data.lookup_def(store, def) {
+fn extract_realloc(&mut self, store: &mut StoreOpaque, realloc: &ExtractRealloc) {
+let anyfunc = match self.data.lookup_def(store, &realloc.def) {
 wasmtime_runtime::Export::Function(f) => f.anyfunc,
 _ => unreachable!(),
 };
-self.data.state.set_runtime_realloc(index, anyfunc);
+self.data.state.set_runtime_realloc(realloc.index, anyfunc);
 }
 fn build_imports<'b>(

@@ -101,7 +101,7 @@ impl<T> Linker<T> {
 /// [`Component`] specified with the items defined within this linker.
 ///
 /// This method will perform as much work as possible short of actually
-/// instnatiating an instance. Internally this will use the names defined
+/// instantiating an instance. Internally this will use the names defined
 /// within this linker to satisfy the imports of the [`Component`] provided.
 /// Additionally this will perform type-checks against the component's
 /// imports against all items defined within this linker.
@@ -215,7 +215,7 @@ impl<T> LinkerInstance<'_, T> {
 /// first parameter.
 ///
 /// Note that `func` must be an `Fn` and must also be `Send + Sync +
-/// 'static`. Shared state within a func is typically accesed with the `T`
+/// 'static`. Shared state within a func is typically accessed with the `T`
 /// type parameter from [`Store<T>`](crate::Store) which is accessible
 /// through the leading [`StoreContextMut<'_, T>`](crate::StoreContextMut)
 /// argument which can be provided to the `func` given here.
@@ -248,7 +248,7 @@ impl<T> LinkerInstance<'_, T> {
 self.as_mut().into_instance(name)
 }
-/// Same as [`LinkerInstance::instance`] except with different liftime
+/// Same as [`LinkerInstance::instance`] except with different lifetime
 /// parameters.
 pub fn into_instance(mut self, name: &str) -> Result<Self> {
 let name = self.strings.intern(name);