From 651f40855fc9aaf19f1294682643a267cc50b474 Mon Sep 17 00:00:00 2001
From: Alex Crichton
Date: Tue, 21 Jun 2022 13:48:56 -0500
Subject: [PATCH] Add support for nested components (#4285)

* Add support for nested components

This commit is an implementation of a number of features of the component model including:

* Defining nested components
* Outer aliases to components and modules
* Instantiating nested components

The implementation here is intended to be a foundational pillar of Wasmtime's component model support since recursion and nested components are the bread-and-butter of the component model. At a high level the intention for the component model implementation in Wasmtime has long been that the recursive nature of components is "erased" at compile time to something that's more optimized and efficient to process. This commit ended up exemplifying this quite well where the vast majority of the internal changes here are in the "compilation" phase of a component rather than the runtime instantiation phase. The support in the `wasmtime` crate, the runtime instantiation support, only had minor updates here while the internals of translation have seen heavy updates.

The `translate` module was greatly refactored here in this commit. Previously it would, as a component was parsed, create a final `Component` to hand off to trampoline compilation and get persisted at runtime. Instead now it's a thin layer over `wasmparser` which simply records a list of `LocalInitializer` entries for how to instantiate the component and how its index spaces are built. This internal representation of the instantiation of a component is pretty close to the binary format intentionally.

Instead of performing dataflow legwork the `translate` phase of a component is now responsible for two primary tasks:

1. All components and modules are discovered within a component.
They're assigned `Static{Component,Module}Index` depending on where they're found and a `{Module,}Translation` is prepared for each one. This "flattens" the recursive structure of the binary into an indexed list processable later.

2. The lexical scope of components is managed here to implement outer module and component aliases. This is a significant design consideration because when closing over an outer component or module that item may actually be imported or something like the result of a previous instantiation. This means that the capture of modules and components is both a lexical concern as well as a runtime concern. The "runtime" bits are handled in the next phase of compilation.

The next and currently final phase of compilation is a new pass to which much of the historical code in `translate.rs` has been moved (but heavily refactored). The goal of compilation is to produce one "flat" list of initializers for a component (as happened prior to this PR) and to achieve this an "inliner" phase runs through the instantiation process at compile time to produce a list of initializers. This `inline` module is the main addition as part of this PR and is now the workhorse for dataflow analysis and tracking what's actually referring to what.

During the `inline` phase the local initializers recorded in the `translate` phase are processed, in sequence, to instantiate a component. Definitions of items are tracked to correspond to their root definition which allows seeing across instantiation argument boundaries and such. "Upvars" for component outer aliases are handled in the `inline` phase as well by creating state for a component whenever a component is defined, as was recorded during the `translate` phase. Finally this phase is chiefly responsible for doing all string-based name resolution at compile time that it can. This means that at runtime no string maps will need to be consulted for item exports and such.
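
The split between recording name-based "local" initializers and resolving them into index-based "global" initializers can be sketched in miniature as follows. This is an illustrative sketch only: `LocalInit`, `GlobalInit`, and `inline` are hypothetical, heavily simplified stand-ins, not the types or functions added by this commit, and the sketch assumes every instantiated module has statically known exports.

```rust
use std::collections::HashMap;

// NOTE: hypothetical, simplified stand-ins for illustration only. These are
// not Wasmtime's `LocalInitializer`/`GlobalInitializer` types.

/// What a translate-like phase might record: exports are referred to by name.
enum LocalInit<'a> {
    /// Instantiate a module whose exports are statically known as
    /// `(name, index)` pairs.
    Instantiate(Vec<(&'a str, u32)>),
    /// Alias an export of a previously created instance by name.
    AliasExport { instance: usize, name: &'a str },
}

/// What an inline-like phase might emit: only indices remain.
#[derive(Debug, PartialEq)]
enum GlobalInit {
    Instantiate { instance: usize },
    ExtractExport { instance: usize, index: u32 },
}

/// Walks the local initializers in order, resolving every string lookup now
/// so that nothing processed later needs a string map.
fn inline(locals: &[LocalInit<'_>]) -> Result<Vec<GlobalInit>, String> {
    // Compile-time-only view of each instance's exports.
    let mut instances: Vec<HashMap<&str, u32>> = Vec::new();
    let mut out = Vec::new();
    for init in locals {
        match init {
            LocalInit::Instantiate(exports) => {
                out.push(GlobalInit::Instantiate {
                    instance: instances.len(),
                });
                instances.push(exports.iter().copied().collect());
            }
            LocalInit::AliasExport { instance, name } => {
                let exports = instances
                    .get(*instance)
                    .ok_or_else(|| format!("unknown instance {instance}"))?;
                let index = *exports
                    .get(name)
                    .ok_or_else(|| format!("no export named `{name}`"))?;
                out.push(GlobalInit::ExtractExport {
                    instance: *instance,
                    index,
                });
            }
        }
    }
    Ok(out)
}
```

Calling `inline` on an `Instantiate` followed by an `AliasExport` yields a list in which the alias has already been resolved to an index, while an alias naming a missing export is rejected up front rather than at instantiation time.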
The final result of inlining is a list of "global initializers" which is a flat list processed during instantiation time. These are almost identical to the initializers that were processed prior to this PR.

There are certainly still more gaps of the component model to implement but this should be a major leg up in terms of functionality that Wasmtime implements. This commit, however, leaves behind a "hole" which is not intended to be filled in at this time, namely importing and exporting components at the "root" level from and to the host. This is tracked and explained in more detail as part of #4283.

cc #4185 as this completes a number of items there

* Tweak code to work on stable without warning

* Review comments
---
 crates/environ/src/component/info.rs | 140 +-
 crates/environ/src/component/translate.rs | 1166 ++++++-----------
 .../environ/src/component/translate/inline.rs | 927 +++++++++++++
 crates/environ/src/component/types.rs | 34 +-
 crates/wasmtime/src/component/component.rs | 40 +-
 crates/wasmtime/src/component/instance.rs | 70 +-
 crates/wasmtime/src/component/linker.rs | 6 +-
 crates/wast/src/spectest.rs | 22 +
 crates/wast/src/wast.rs | 4 +-
 tests/all/component_model.rs | 1 +
 tests/all/component_model/nested.rs | 172 +++
 .../component-model/instance.wast | 116 ++
 .../component-model/nested.wast | 451 +++++++
 .../component-model/simple.wast | 12 +
 14 files changed, 2303 insertions(+), 858 deletions(-)
 create mode 100644 crates/environ/src/component/translate/inline.rs
 create mode 100644 tests/all/component_model/nested.rs
 create mode 100644 tests/misc_testsuite/component-model/nested.wast

diff --git a/crates/environ/src/component/info.rs b/crates/environ/src/component/info.rs
index 2fe4cec8d6..1e54ffd7dd 100644
--- a/crates/environ/src/component/info.rs
+++ b/crates/environ/src/component/info.rs
@@ -1,23 +1,50 @@
 // General runtime type-information about a component.
// -// ## Optimizing instantiation +// Compared to the `Module` structure for core wasm this type is pretty +// significantly different. The core wasm `Module` corresponds roughly 1-to-1 +// with the structure of the wasm module itself, but instead a `Component` is +// more of a "compiled" representation where the original structure is thrown +// away in favor of a more optimized representation. The considerations for this +// are: // -// One major consideration for the structure of the types in this module is to -// make instantiation as fast as possible. To facilitate this the representation -// here avoids the need to create a `PrimaryMap` during instantiation of a -// component for each index space like the func, global, table, etc, index -// spaces. Instead a component is simply defined by a list of instantiation -// instructions, and arguments to the instantiation of each instance are a list -// of "pointers" into previously created instances. This means that we only need -// to build up one list of instances during instantiation. +// * This representation of a `Component` avoids the need to create a +// `PrimaryMap` of some form for each of the index spaces within a component. +// This is less so an issue about allocations and moreso that this information +// generally just isn't needed any time after instantiation. Avoiding creating +// these altogether helps components be lighter weight at runtime and +// additionally accelerates instantiation. // -// Additionally we also try to avoid string lookups wherever possible. In the -// component model instantiation and aliasing theoretically deals with lots of -// string lookups here and there. This is slower than indexing lookup, though, -// and not actually necessary when the structure of a module is statically -// known. This means that `ExportItem` below has two variants where we try to -// use the indexing variant as much as possible, which can be done for -// everything except imported core wasm modules. 
+// * Components can have arbitrary nesting and internally do instantiations via +// string-based matching. At instantiation-time, though, we want to do as few +// string-lookups in hash maps as we can since they're significantly +// slower than index-based lookups. Furthermore while the imports of a +// component are not statically known the rest of the structure of the +// component is statically known which enables the ability to track precisely +// what matches up where and do all the string lookups at compile time instead +// of instantiation time. +// +// * Finally by performing this sort of dataflow analysis we are capable of +// identifying what adapters need trampolines for compilation or fusion. For +// example this tracks when host functions are lowered which enables us to +// enumerate what trampolines are required to enter into a component. +// Additionally (eventually) this will track all of the "fused" adapter +// functions where a function from one component instance is lifted and then +// lowered into another component instance. Altogether this enables Wasmtime's +// AOT-compilation where the artifact from compilation is suitable for use in +// running the component without the support of a compiler at runtime. +// +// Note, however, that the current design of `Component` has fundamental +// limitations which it was not designed for. For example there is no feasible +// way to implement either importing or exporting a component itself from the +// root component. Currently we rely on the ability to have static knowledge of +// what's coming from the host which at this point can only be either functions +// or core wasm modules. Additionally one flat list of initializers for a +// component is produced instead of initializers-per-component which would +// otherwise be required to export a component from a component. +// +// For now this tradeoff is made as it aligns well with the intended use case +// for components in an embedding. 
This may need to be revisited though if the +// requirements of embeddings change over time. use crate::component::*; use crate::{EntityIndex, PrimaryMap, SignatureIndex}; @@ -81,10 +108,7 @@ pub struct Component { /// have instantiations, for example, in addition to entries which /// initialize `VMComponentContext` fields with previously instantiated /// instances. - /// - /// NB: at this time recursive components are not supported, and that may - /// change this somewhat significantly. - pub initializers: Vec, + pub initializers: Vec, /// The number of runtime instances (maximum `RuntimeInstanceIndex`) created /// when instantiating this component. @@ -114,22 +138,18 @@ pub struct Component { pub num_runtime_modules: u32, } -/// Initializer instructions to get processed when instantiating a component +/// GlobalInitializer instructions to get processed when instantiating a component /// /// The variants of this enum are processed during the instantiation phase of /// a component in-order from front-to-back. These are otherwise emitted as a /// component is parsed and read and translated. -/// -/// NB: at this time recursive components are not supported, and that may -/// change this somewhat significantly. -/// // // FIXME(#2639) if processing this list is ever a bottleneck we could // theoretically use cranelift to compile an initialization function which // performs all of these duties for us and skips the overhead of interpreting // all of these instructions. #[derive(Debug, Serialize, Deserialize)] -pub enum Initializer { +pub enum GlobalInitializer { /// A core wasm module is being instantiated. 
/// /// This will result in a new core wasm instance being created, which may @@ -143,7 +163,7 @@ pub enum Initializer { /// This initializer entry is intended to be used to fill out the /// `VMComponentContext` and information about this lowering such as the /// cranelift-compiled trampoline function pointer, the host function - /// pointer the trampline calls, and the canonical ABI options. + /// pointer the trampoline calls, and the canonical ABI options. LowerImport(LowerImport), /// A core wasm linear memory is going to be saved into the @@ -154,30 +174,39 @@ pub enum Initializer { /// previously created module instance, and stored into the /// `VMComponentContext` at the `index` specified. This lowering is then /// used in the future by pointers from `CanonicalOptions`. - ExtractMemory { - /// The index of the memory being defined. - index: RuntimeMemoryIndex, - /// Where this memory is being extracted from. - export: CoreExport, - }, + ExtractMemory(ExtractMemory), /// Same as `ExtractMemory`, except it's extracting a function pointer to be /// used as a `realloc` function. - ExtractRealloc { - /// The index of the realloc being defined. - index: RuntimeReallocIndex, - /// Where this realloc is being extracted from. - def: CoreDef, - }, + ExtractRealloc(ExtractRealloc), /// The `module` specified is saved into the runtime state at the next /// `RuntimeModuleIndex`, referred to later by `Export` definitions. - SaveModuleUpvar(ModuleUpvarIndex), + SaveStaticModule(StaticModuleIndex), /// Same as `SaveModuleUpvar`, but for imports. SaveModuleImport(RuntimeImportIndex), } +/// Metadata for extraction of a memory of what's being extracted and where it's +/// going. +#[derive(Debug, Serialize, Deserialize)] +pub struct ExtractMemory { + /// The index of the memory being defined. + pub index: RuntimeMemoryIndex, + /// Where this memory is being extracted from. + pub export: CoreExport, +} + +/// Same as `ExtractMemory` but for the `realloc` canonical option. 
+#[derive(Debug, Serialize, Deserialize)] +pub struct ExtractRealloc { + /// The index of the realloc being defined. + pub index: RuntimeReallocIndex, + /// Where this realloc is being extracted from. + pub def: CoreDef, +} + /// Different methods of instantiating a core wasm module. #[derive(Debug, Serialize, Deserialize)] pub enum InstantiateModule { @@ -187,7 +216,7 @@ pub enum InstantiateModule { /// order of imports required is statically known and can be pre-calculated /// to avoid string lookups related to names at runtime, represented by the /// flat list of arguments here. - Upvar(ModuleUpvarIndex, Box<[CoreDef]>), + Static(StaticModuleIndex, Box<[CoreDef]>), /// An imported module is being instantiated. /// @@ -201,7 +230,7 @@ pub enum InstantiateModule { } /// Description of a lowered import used in conjunction with -/// `Initializer::LowerImport`. +/// `GlobalInitializer::LowerImport`. #[derive(Debug, Serialize, Deserialize)] pub struct LowerImport { /// The index of the lowered function that's being created. @@ -237,13 +266,16 @@ pub enum CoreDef { Export(CoreExport), /// This item is a core wasm function with the index specified here. Note /// that this `LoweredIndex` corresponds to the nth - /// `Initializer::LowerImport` instruction. + /// `GlobalInitializer::LowerImport` instruction. Lowered(LoweredIndex), } -impl From> for CoreDef { - fn from(export: CoreExport) -> CoreDef { - CoreDef::Export(export) +impl From> for CoreDef +where + EntityIndex: From, +{ + fn from(export: CoreExport) -> CoreDef { + CoreDef::Export(export.map_index(|i| i.into())) } } @@ -265,6 +297,20 @@ pub struct CoreExport { pub item: ExportItem, } +impl CoreExport { + /// Maps the index type `T` to another type `U` if this export item indeed + /// refers to an index `T`. 
+ pub fn map_index(self, f: impl FnOnce(T) -> U) -> CoreExport { + CoreExport { + instance: self.instance, + item: match self.item { + ExportItem::Index(i) => ExportItem::Index(f(i)), + ExportItem::Name(s) => ExportItem::Name(s), + }, + } + } +} + /// An index at which to find an item within a runtime instance. #[derive(Debug, Clone, Serialize, Deserialize, Hash, Eq, PartialEq)] pub enum ExportItem { @@ -300,7 +346,7 @@ pub enum Export { /// A module defined within this component is exported. /// /// The module index here indexes a module recorded with - /// `Initializer::SaveModule` above. + /// `GlobalInitializer::SaveModule` above. Module(RuntimeModuleIndex), } diff --git a/crates/environ/src/component/translate.rs b/crates/environ/src/component/translate.rs index 0b81abc39d..33ff141ab4 100644 --- a/crates/environ/src/component/translate.rs +++ b/crates/environ/src/component/translate.rs @@ -1,198 +1,217 @@ use crate::component::*; -use crate::{ - EntityIndex, EntityType, ModuleEnvironment, ModuleTranslation, PrimaryMap, SignatureIndex, - Tunables, -}; +use crate::{EntityIndex, ModuleEnvironment, ModuleTranslation, PrimaryMap, Tunables}; use anyhow::{bail, Result}; -use indexmap::IndexMap; use std::collections::HashMap; use std::mem; use wasmparser::{Chunk, Encoding, Parser, Payload, Validator}; +mod inline; + /// Structure used to translate a component and parse it. pub struct Translator<'a, 'data> { + /// The current component being translated. + /// + /// This will get swapped out as translation traverses the body of a + /// component and a sub-component is entered or left. result: Translation<'data>, - validator: &'a mut Validator, - types: &'a mut ComponentTypesBuilder, - tunables: &'a Tunables, - parsers: Vec, + + /// Current state of parsing a binary component. Note that like `result` + /// this will change as the component is traversed. parser: Parser, + + /// Stack of lexical scopes that are in-progress but not finished yet. 
+ /// + /// This is pushed to whenever a component is entered and popped from + /// whenever a component is left. Each lexical scope also contains + /// information about the variables that it is currently required to close + /// over which is threaded into the current in-progress translation of + /// the sub-component which pushed a scope here. + lexical_scopes: Vec>, + + /// The validator in use to verify that the raw input binary is a valid + /// component. + validator: &'a mut Validator, + + /// Type information shared for the entire component. + /// + /// This builder is also used for all core wasm modules found to intern + /// signatures across all modules. + types: &'a mut ComponentTypesBuilder, + + /// The compiler configuration provided by the embedder. + tunables: &'a Tunables, + + /// Completely translated core wasm modules that have been found so far. + /// + /// Note that this translation only involves learning about type + /// information and functions are not actually compiled here. + static_modules: PrimaryMap>, + + /// Completely translated components that have been found so far. + /// + /// As frames are popped from `lexical_scopes` their completed component + /// will be pushed onto this list. + static_components: PrimaryMap>, } -/// Result of translation of a component to contain all type information and -/// metadata about how to run the component. +/// Representation of the syntactic scope of a component meaning where it is +/// and what its state is at in the binary format. +/// +/// These scopes are pushed and popped when a sub-component starts being +/// parsed and finishes being parsed. The main purpose of this frame is to +/// have a `ClosedOverVars` field which encapsulates data that is inherited +/// from the scope specified into the component being translated just beneath +/// it. +/// +/// This structure exists to implement outer aliases to components and modules. 
+/// When a component or module is closed over then that means it needs to be +/// inherited in a sense to the component which actually had the alias. This is +/// achieved with a deceptively simple scheme where each parent of the +/// component with the alias will inherit the component from the desired +/// location. +/// +/// For example with a component structure that looks like: +/// +/// ```wasm +/// (component $A +/// (core module $M) +/// (component $B +/// (component $C +/// (alias outer $A $M (core module)) +/// ) +/// ) +/// ) +/// ``` +/// +/// here the `C` component is closing over `M` located in the root component +/// `A`. When `C` is being translated the `lexical_scopes` field will look like +/// `[A, B]`. When the alias is encountered (for module index 0) this will +/// place a `ClosedOverModule::Local(0)` entry into the `closure_args` field of +/// `A`'s frame. This will in turn give a `ModuleUpvarIndex` which is then +/// inserted into `closure_args` in `B`'s frame. This produces yet another +/// `ModuleUpvarIndex` which is finally inserted into `C`'s module index space +/// via `LocalInitializer::AliasModuleUpvar` with the last index. +/// +/// All of these upvar indices and such are interpreted in the "inline" phase +/// of compilation and not at runtime. This means that when `A` is being +/// instantiated one of its initializers will be +/// `LocalInitializer::ComponentStatic`. This starts to create `B` and the +/// variables captured for `B` are listed as local module 0, or `M`. This list +/// is then preserved in the definition of the component `B` and later reused +/// by `C` again to finally get access to the closed over component. +/// +/// Effectively the scopes are managed hierarchically where a reference to an +/// outer variable automatically injects references into all parents up to +/// where the reference is. 
These variable scopes are then processed during +/// inlining where a component definition is a reference to the static +/// component information (`Translation`) plus closed over variables +/// (`ComponentClosure` during inlining). +struct LexicalScope<'data> { + /// Current state of translating the `translation` below. + parser: Parser, + /// Current state of the component's translation as found so far. + translation: Translation<'data>, + /// List of captures that `translation` will need to process to create the + /// sub-component which is directly beneath this lexical scope. + closure_args: ClosedOverVars, +} + +/// A "local" translation of a component. +/// +/// This structure is used as a sort of in-progress translation of a component. +/// This is not `Component` which is the final form as consumed by Wasmtime +/// at runtime. Instead this is a fairly simple representation of a component +/// where almost everything is ordered as a list of initializers. The binary +/// format is translated to a list of initializers here which is later processed +/// during "inlining" to produce a final component with the final set of +/// initializers. #[derive(Default)] -pub struct Translation<'data> { - /// Final type of the component, intended to be persisted all the way to - /// runtime. - pub component: Component, - - /// List of "upvars" or closed over modules that `Component` would refer - /// to. This contains the core wasm results of translation and the indices - /// are referred to within types in `Component`. - pub upvars: PrimaryMap>, - - // Index spaces which are built-up during translation but do not persist to - // runtime. These are used to understand the structure of the component and - // where items come from but at this time these index spaces and their - // definitions are not required at runtime as they're effectively "erased" - // at the moment. 
- // - /// Modules and how they're defined (either closed-over or imported) - modules: PrimaryMap, - - /// Instances of components, either direct instantiations or "bundles of - /// exports". - component_instances: PrimaryMap>, - - /// Instances of core wasm modules, either direct instantiations or - /// "bundles of exports". - module_instances: PrimaryMap>, - - /// The core wasm function index space. - funcs: PrimaryMap>, - - /// The component function index space. - component_funcs: PrimaryMap>, - - /// Core wasm globals, always sourced from a previously module instance. - globals: PrimaryMap>, - - /// Core wasm memories, always sourced from a previously module instance. - memories: PrimaryMap>, - - /// Core wasm tables, always sourced from a previously module instance. - tables: PrimaryMap>, - - /// This is a list of pairs where the first element points to an index - /// within `component.initializers` to an `Initializer::LowerImport` entry. - /// After a component has finished translation and we have a - /// `wasmparser::Types` value to lookup type information within the type of - /// `FuncIndex`, within this component, will be used to fill in the - /// `LowerImport::canonical_abi` field. +struct Translation<'data> { + /// Instructions which form this component. /// - /// This avoids wasmtime having to duplicate the - /// interface-types-signature-to-core-wasm-signature lowering logic. - signatures_to_fill: Vec<(usize, FuncIndex)>, + /// There is one initializer for all members of each index space, and all + /// index spaces are incrementally built here as the initializer list is + /// processed. + initializers: Vec>, - /// Intern'd map of imports where `RuntimeImport` represents some - /// (optional) projection of imports from an original import and - /// `RuntimeImportIndex` is an array built at runtime used to instantiate - /// this component. 
- import_map: HashMap, + /// The list of exports from this component, as pairs of names and an + /// index into an index space of what's being exported. + exports: Vec<(&'data str, ComponentItem)>, - /// Intern'd map of exports to the memory index they're referred to by at - /// runtime, used when building `CanonicalOptions` to avoid storing the same - /// memory many times within a `VMComponentContext`. - memory_to_runtime: HashMap, RuntimeMemoryIndex>, - - /// Same as `memory_to_runtime` but an intern'd map for realloc functions - /// instead. - realloc_to_runtime: HashMap, + /// Type information from wasmparser about this component, available after + /// the component has been completely translated. + types: Option, } -/// How a module is defined within a component. -#[derive(Debug, Clone)] -enum ModuleDef { - /// This module is defined as an "upvar" or a closed over variable - /// implicitly available for the component. - /// - /// This means that the module was either defined within the component or a - /// module was aliased into this component which was known defined in the - /// parent component. 
+#[allow(missing_docs)] +enum LocalInitializer<'data> { + // imports + Import(&'data str, TypeDef), + + // canonical function sections + Lower(ComponentFuncIndex, LocalCanonicalOptions), + Lift(TypeFuncIndex, FuncIndex, LocalCanonicalOptions), + + // core wasm modules + ModuleStatic(StaticModuleIndex), + + // core wasm module instances + ModuleInstantiate(ModuleIndex, HashMap<&'data str, ModuleInstanceIndex>), + ModuleSynthetic(HashMap<&'data str, EntityIndex>), + + // components + ComponentStatic(StaticComponentIndex, ClosedOverVars), + + // component instances + ComponentInstantiate(ComponentIndex, HashMap<&'data str, ComponentItem>), + ComponentSynthetic(HashMap<&'data str, ComponentItem>), + + // alias section + AliasExportFunc(ModuleInstanceIndex, &'data str), + AliasExportTable(ModuleInstanceIndex, &'data str), + AliasExportGlobal(ModuleInstanceIndex, &'data str), + AliasExportMemory(ModuleInstanceIndex, &'data str), + AliasComponentExport(ComponentInstanceIndex, &'data str), + AliasModule(ClosedOverModule), + AliasComponent(ClosedOverComponent), +} + +/// The "closure environment" of components themselves. +/// +/// For more information see `LexicalScope`. +#[derive(Default)] +struct ClosedOverVars { + components: PrimaryMap, + modules: PrimaryMap, +} + +/// Description of how a component is closed over when the closure variables for +/// a component are being created. +/// +/// For more information see `LexicalScope`. +enum ClosedOverComponent { + /// A closed over component is coming from the local component's index + /// space, meaning a previously defined component is being captured. + Local(ComponentIndex), + /// A closed over component is coming from our own component's list of + /// upvars. This list was passed to us by our enclosing component, which + /// will eventually have bottomed out in closing over a `Local` component + /// index for some parent component. + Upvar(ComponentUpvarIndex), +} + +/// Same as `ClosedOverComponent`, but for modules. 
+enum ClosedOverModule { + Local(ModuleIndex), Upvar(ModuleUpvarIndex), - - /// This module is defined as an import to the current component, so - /// nothing is known about it except for its type. The `import_index` - /// provided here indexes into the `Component`'s import list. - Import { - ty: TypeModuleIndex, - import: RuntimeImport, - }, } -/// Forms of creation of a core wasm module instance. -#[derive(Debug, Clone)] -enum ModuleInstanceDef<'data> { - /// A module instance created through the instantiation of a previous - /// module. - Instantiated { - /// The runtime index associated with this instance. - /// - /// Not to be confused with `InstanceIndex` which counts "synthetic" - /// instances as well. - instance: RuntimeInstanceIndex, - - /// The module that was instantiated. - module: ModuleIndex, - }, - - /// A "synthetic" module created as a bag of exports from other items - /// already defined within this component. - Synthetic(HashMap<&'data str, EntityIndex>), -} - -/// Forms of creation of a component instance. -#[derive(Debug, Clone)] -enum ComponentInstanceDef<'data> { - /// An instance which was imported from the host. - Import { - /// The type of the imported instance - ty: TypeComponentInstanceIndex, - /// The description of where this import came from. - import: RuntimeImport, - }, - - /// Same as `ModuleInstanceDef::Synthetic` except for component items. - Synthetic(HashMap<&'data str, ComponentItem>), -} - -/// Description of the function index space and how functions are defined. -#[derive(Clone)] -enum Func<'data> { - /// A core wasm function that's extracted from a core wasm instance. - Core(CoreSource<'data>), - /// A core wasm function created by lowering an imported host function. - /// - /// Note that `LoweredIndex` here refers to the nth - /// `Initializer::LowerImport`. - Lowered(LoweredIndex), -} - -/// Description of the function index space and how functions are defined. 
-#[derive(Clone)] -enum ComponentFunc<'data> { - /// A component function that is imported from the host. - Import(RuntimeImport), - - /// A component function that is lifted from core wasm function. - Lifted { - /// The resulting type of the lifted function - ty: TypeFuncIndex, - /// Which core wasm function is lifted, currently required to be an - /// instance export as opposed to a lowered import. - func: CoreSource<'data>, - /// The options specified when the function was lifted. - options: CanonicalOptions, - }, -} - -/// Source of truth for where a core wasm item comes from. -#[derive(Clone)] -enum CoreSource<'data> { - /// This item comes from an indexed entity within an instance. - /// - /// This is only available when the instance is statically known to be - /// defined within the original component itself so we know the exact - /// index. - Index(RuntimeInstanceIndex, EntityIndex), - - /// This item comes from an named entity within an instance. - /// - /// This must be used for instances of imported modules because we - /// otherwise don't know the internal structure of the module and which - /// index is being exported. - Export(RuntimeInstanceIndex, &'data str), +/// Representation of canonical ABI options. +struct LocalCanonicalOptions { + string_encoding: StringEncoding, + memory: Option, + realloc: Option, + post_return: Option, } enum Action { @@ -201,28 +220,6 @@ enum Action { Done, } -/// Pre-intern'd representation of a `RuntimeImportIndex`. -/// -/// When this is actually used within a component it will be committed into the -/// `import_map` to give it a `RuntimeImportIndex` via the -/// `runtime_import_index` function. 
-#[derive(Debug, Clone, Hash, PartialEq, Eq)] -struct RuntimeImport { - source: ImportIndex, - exports: Vec, -} - -impl RuntimeImport { - fn append(&self, name: &str) -> RuntimeImport { - let mut exports = self.exports.clone(); - exports.push(name.to_string()); - RuntimeImport { - source: self.source, - exports, - } - } -} - impl<'a, 'data> Translator<'a, 'data> { /// Creates a new translation state ready to translate a component. pub fn new( @@ -236,7 +233,9 @@ impl<'a, 'data> Translator<'a, 'data> { validator, types, parser: Parser::new(0), - parsers: Vec::new(), + lexical_scopes: Vec::new(), + static_components: Default::default(), + static_modules: Default::default(), } } @@ -246,7 +245,35 @@ impl<'a, 'data> Translator<'a, 'data> { /// `component` and create type information for Wasmtime and such. The /// `component` does not have to be valid and it will be validated during /// compilation. - pub fn translate(mut self, component: &'data [u8]) -> Result> { + /// + /// The result of this function is a tuple of the final component's + /// description plus a list of core wasm modules found within the + /// component. The component's description actually erases internal + /// components, instances, etc, as much as it can. Instead `Component` + /// retains a flat list of initializers (no nesting) which was created + /// as part of compilation from the nested structure of the original + /// component. + /// + /// The list of core wasm modules found is provided to allow compiling + /// modules externally in parallel. Additionally initializers in + /// `Component` may refer to the modules in the map returned by index. + /// + /// # Errors + /// + /// This function will return an error if the `component` provided is + /// invalid. + pub fn translate( + mut self, + component: &'data [u8], + ) -> Result<( + Component, + PrimaryMap>, + )> { + // First up wasmparser is used to actually perform the translation and + // validation of this component. 
This will produce a list of core wasm + // modules in addition to components which are found during the + // translation process. When doing this only a `Translation` is created + // which is a simple representation of a component. let mut remaining = component; loop { let payload = match self.parser.parse(remaining, true)? { @@ -263,7 +290,26 @@ impl<'a, 'data> Translator<'a, 'data> { Action::Done => break, } } - Ok(self.result) + assert!(remaining.is_empty()); + assert!(self.lexical_scopes.is_empty()); + + // ... after translation initially finishes the next pass is performed + // which we're calling "inlining". This will "instantiate" the root + // component, following nested component instantiations, creating a + // global list of initializers along the way. This phase uses the simple + // initializers in each component to track dataflow of host imports and + // internal references to items throughout a component at compile-time. + // The produced initializers in the final `Component` are intended to be + // much simpler than the original component and more efficient for + // Wasmtime to process at runtime as well (e.g. no string lookups as + // most everything is done through indices instead). + let component = inline::run( + &mut self.types, + &self.result, + &self.static_modules, + &self.static_components, + )?; + Ok((component, self.static_modules)) } fn translate_payload( @@ -292,34 +338,33 @@ impl<'a, 'data> Translator<'a, 'data> { } Payload::End(offset) => { - let types = self.validator.end(offset)?; - - // With type information in hand fill in the canonical abi type - // of lowered functions. - for (idx, func) in self.result.signatures_to_fill.drain(..)
{ - let i = match &mut self.result.component.initializers[idx] { - Initializer::LowerImport(i) => i, - _ => unreachable!(), - }; - assert!(i.canonical_abi.as_u32() == 0); - i.canonical_abi = self.types.module_types_builder().wasm_func_type( - types - .function_at(func.as_u32()) - .expect("should be in-bounds") - .clone() - .try_into()?, - ); - } + // Record type information for this component now that we'll + // have it from wasmparser. + self.result.types = Some(self.validator.end(offset)?); // When leaving a module be sure to pop the types scope to // ensure that when we go back to the previous module outer // type alias indices work correctly again. self.types.pop_type_scope(); - match self.parsers.pop() { - Some(p) => self.parser = p, + // Exit the current lexical scope. If there is no parent (no + // frame currently on the stack) then translation is finished. + // Otherwise that means that a nested component has been + // completed and is recorded as such. + let LexicalScope { + parser, + translation, + closure_args, + } = match self.lexical_scopes.pop() { + Some(frame) => frame, None => return Ok(Action::Done), - } + }; + self.parser = parser; + let component = mem::replace(&mut self.result, translation); + let static_idx = self.static_components.push(component); + self.result + .initializers + .push(LocalInitializer::ComponentStatic(static_idx, closure_args)); } // When we see a type section the types are validated and then @@ -345,50 +390,22 @@ impl<'a, 'data> Translator<'a, 'data> { } } + // Processing the import section at this point is relatively simple + // which is to simply record the name of the import and the type + // information associated with it. 
Payload::ComponentImportSection(s) => { self.validator.component_import_section(&s)?; for import in s { let import = import?; let ty = self.types.component_type_ref(&import.ty); - // Record the `ImportIndex` to be associated with this - // import and create the `RuntimeImport` representing the - // "root" where it has no extra `exports` - let source = self - .result - .component - .import_types - .push((import.name.to_string(), ty)); - let import = RuntimeImport { - source, - exports: Vec::new(), - }; - match ty { - TypeDef::Module(ty) => { - self.result.modules.push(ModuleDef::Import { ty, import }); - } - TypeDef::ComponentInstance(ty) => { - self.result - .component_instances - .push(ComponentInstanceDef::Import { ty, import }); - } - TypeDef::ComponentFunc(_ty) => { - self.result - .component_funcs - .push(ComponentFunc::Import(import)); - } - TypeDef::Component(_) => { - unimplemented!("imports of components"); - } - TypeDef::Interface(_) => { - unimplemented!("imports of types"); - } - - // not possible with a valid component - TypeDef::CoreFunc(_ty) => unreachable!(), - } + self.result + .initializers + .push(LocalInitializer::Import(import.name, ty)); } } + // Entries in the canonical section will get initializers recorded + // with the listed options for lifting/lowering. 
Payload::ComponentCanonicalSection(s) => { self.validator.component_canonical_section(&s)?; for func in s { @@ -399,17 +416,26 @@ impl<'a, 'data> Translator<'a, 'data> { options, } => { let ty = ComponentTypeIndex::from_u32(type_index); + let ty = match self.types.component_outer_type(0, ty) { + TypeDef::ComponentFunc(ty) => ty, + // should not be possible after validation + _ => unreachable!(), + }; let func = FuncIndex::from_u32(core_func_index); - let func = self.lift_function(ty, func, &options); - self.result.component_funcs.push(func); + let options = self.canonical_options(&options); + self.result + .initializers + .push(LocalInitializer::Lift(ty, func, options)); } wasmparser::CanonicalFunction::Lower { func_index, options, } => { let func = ComponentFuncIndex::from_u32(func_index); - let func = self.lower_function(func, &options); - self.result.funcs.push(func); + let options = self.canonical_options(&options); + self.result + .initializers + .push(LocalInitializer::Lower(func, options)); } } } @@ -431,56 +457,79 @@ impl<'a, 'data> Translator<'a, 'data> { self.types.module_types_builder(), ) .translate(parser, &component[range.start..range.end])?; - let upvar_idx = self.result.upvars.push(translation); - self.result.modules.push(ModuleDef::Upvar(upvar_idx)); + let static_idx = self.static_modules.push(translation); + self.result + .initializers + .push(LocalInitializer::ModuleStatic(static_idx)); return Ok(Action::Skip(range.end - range.start)); } + // When a sub-component is found then the current translation state + // is pushed onto the `lexical_scopes` stack. This will subsequently + // get popped as part of `Payload::End` processing above. + // + // Note that the set of closure args for this new lexical scope + // starts empty since it will only get populated if translation of + // the nested component ends up aliasing some outer module or + // component. 
Payload::ComponentSection { parser, range } => { self.validator.component_section(&range)?; - let old_parser = mem::replace(&mut self.parser, parser); - self.parsers.push(old_parser); - unimplemented!("component section"); + self.lexical_scopes.push(LexicalScope { + parser: mem::replace(&mut self.parser, parser), + translation: mem::take(&mut self.result), + closure_args: ClosedOverVars::default(), + }); } + // Both core wasm instances and component instances record + // initializers of what form of instantiation is performed which + // largely just records the arguments given from wasmparser into a + // `HashMap` for processing later during inlining. Payload::InstanceSection(s) => { self.validator.instance_section(&s)?; for instance in s { - let instance = match instance? { + let init = match instance? { wasmparser::Instance::Instantiate { module_index, args } => { - self.instantiate_module(ModuleIndex::from_u32(module_index), &args) + let index = ModuleIndex::from_u32(module_index); + self.instantiate_module(index, &args) } wasmparser::Instance::FromExports(exports) => { self.instantiate_module_from_exports(&exports) } }; - self.result.module_instances.push(instance); + self.result.initializers.push(init); } } Payload::ComponentInstanceSection(s) => { self.validator.component_instance_section(&s)?; for instance in s { - let instance = match instance? { + let init = match instance? 
{ wasmparser::ComponentInstance::Instantiate { component_index, args, } => { let index = ComponentIndex::from_u32(component_index); - drop((index, args)); - unimplemented!("instantiating a component"); + self.instantiate_component(index, &args) } wasmparser::ComponentInstance::FromExports(exports) => { self.instantiate_component_from_exports(&exports) } }; - self.result.component_instances.push(instance); + self.result.initializers.push(init); } } + // Exports don't actually fill out the `initializers` array but + // instead fill out the one other field in a `Translation`, the + // `exports` field (as one might imagine). This for now simply + // records the index of what's exported and that's tracked further + // later during inlining. Payload::ComponentExportSection(s) => { self.validator.component_export_section(&s)?; for export in s { - self.export(&export?); + let export = export?; + let item = self.kind_to_item(export.kind, export.index); + self.result.exports.push((export.name, item)); } } @@ -489,38 +538,44 @@ impl<'a, 'data> Translator<'a, 'data> { unimplemented!("component start section"); } + // Aliases of instance exports (either core or component) will be + // recorded as an initializer of the appropriate type with outer + // aliases handled specially via upvars and type processing. Payload::AliasSection(s) => { self.validator.alias_section(&s)?; for alias in s { - match alias? { + let init = match alias? { wasmparser::Alias::InstanceExport { kind, instance_index, name, } => { let instance = ModuleInstanceIndex::from_u32(instance_index); - self.alias_module_instance_export(kind, instance, name); + self.alias_module_instance_export(kind, instance, name) } - } + }; + self.result.initializers.push(init); } } - Payload::ComponentAliasSection(s) => { self.validator.component_alias_section(&s)?; for alias in s { - match alias? { + let init = match alias? 
{ wasmparser::ComponentAlias::InstanceExport { kind, instance_index, name, } => { let instance = ComponentInstanceIndex::from_u32(instance_index); - self.alias_component_instance_export(kind, instance, name); + drop(kind); + LocalInitializer::AliasComponentExport(instance, name) } wasmparser::ComponentAlias::Outer { kind, count, index } => { self.alias_component_outer(kind, count, index); + continue; } - } + }; + self.result.initializers.push(init); } } @@ -547,128 +602,18 @@ impl<'a, 'data> Translator<'a, 'data> { fn instantiate_module( &mut self, module: ModuleIndex, - args: &[wasmparser::InstantiationArg<'data>], - ) -> ModuleInstanceDef<'data> { - // Map the flat list of `args` to instead a name-to-instance index. - let mut instance_by_name = HashMap::new(); - for arg in args { + raw_args: &[wasmparser::InstantiationArg<'data>], + ) -> LocalInitializer<'data> { + let mut args = HashMap::with_capacity(raw_args.len()); + for arg in raw_args { match arg.kind { wasmparser::InstantiationArgKind::Instance => { let idx = ModuleInstanceIndex::from_u32(arg.index); - instance_by_name.insert(arg.name, idx); + args.insert(arg.name, idx); } } } - - let instantiate = match self.result.modules[module].clone() { - // A module defined within this component is being instantiated - // which means we statically know the structure of the module. The - // list of imports required is ordered by the actual list of imports - // listed on the module itself (which wasmtime later requires during - // instantiation). - ModuleDef::Upvar(upvar_idx) => { - let args = self.result.upvars[upvar_idx] - .module - .imports() - .map(|(m, n, _)| (m.to_string(), n.to_string())) - .collect::>() - .iter() - .map(|(module, name)| { - self.lookup_core_def(instance_by_name[module.as_str()], name) - }) - .collect(); - InstantiateModule::Upvar(upvar_idx, args) - } - - // For imported modules the list of arguments is built to match the - // order of the imports listed in the declared type of the module. 
- // Note that this will need to be reshuffled at runtime since the - // actual module being instantiated may originally have required - // imports in a different order. - ModuleDef::Import { ty, import } => { - let import = self.runtime_import_index(import); - let mut args = IndexMap::new(); - let imports = self.types[ty].imports.keys().cloned().collect::>(); - for (module, name) in imports { - let def = self.lookup_core_def(instance_by_name[module.as_str()], &name); - let prev = args - .entry(module) - .or_insert(IndexMap::new()) - .insert(name, def); - assert!(prev.is_none()); - } - InstantiateModule::Import(import, args) - } - }; - self.result - .component - .initializers - .push(Initializer::InstantiateModule(instantiate)); - - let instance = RuntimeInstanceIndex::from_u32(self.result.component.num_runtime_instances); - self.result.component.num_runtime_instances += 1; - ModuleInstanceDef::Instantiated { instance, module } - } - - /// Calculate the `CoreDef`, a definition of a core wasm item, corresponding - /// to the export `name` of the `instance` specified. - /// - /// This classifies the export of the instance as one which we - /// statically know by index within an instantiated module (because - /// we know the module), one that must be referred to by name since the - /// module isn't known, or it's a synthesized lowering or adapter of a - /// component function. 
- fn lookup_core_def(&mut self, instance: ModuleInstanceIndex, name: &str) -> CoreDef { - match &self.result.module_instances[instance] { - ModuleInstanceDef::Instantiated { module, instance } => { - let (src, _ty) = self.lookup_core_source_in_module(*instance, *module, name); - src.to_core_def() - } - - ModuleInstanceDef::Synthetic(defs) => match defs[&name] { - EntityIndex::Function(f) => match self.result.funcs[f].clone() { - Func::Core(c) => c.to_core_def(), - Func::Lowered(i) => CoreDef::Lowered(i), - }, - EntityIndex::Global(g) => self.result.globals[g].to_core_def(), - EntityIndex::Table(t) => self.result.tables[t].to_core_def(), - EntityIndex::Memory(m) => self.result.memories[m].to_core_def(), - }, - } - } - - /// Calculates the `CoreSource` associated with the export `name` as an - /// instance of the instantiated `module` specified. - /// - /// The `instance` index here represents the runtime instance index that - /// we're looking up within. - fn lookup_core_source_in_module<'b>( - &self, - instance: RuntimeInstanceIndex, - module: ModuleIndex, - name: &'b str, - ) -> (CoreSource<'b>, EntityType) { - match self.result.modules[module] { - // The module instantiated is one that we statically know the - // structure of. This means that `name` points to an exact index of - // an item within the module which we lookup here and record. - ModuleDef::Upvar(upvar_idx) => { - let trans = &self.result.upvars[upvar_idx]; - let idx = trans.module.exports[name]; - let src = CoreSource::Index(instance, idx); - let ty = trans.module.type_of(idx); - (src, ty) - } - - // The module instantiated is imported so we don't statically know - // its structure. This means that the export must be identified by - // name. - ModuleDef::Import { ty, .. 
} => { - let src = CoreSource::Export(instance, name); - let ty = self.types[ty].exports[name].clone(); - (src, ty) - } - } + LocalInitializer::ModuleInstantiate(module, args) } /// Creates a synthetic module from the list of items currently in the @@ -676,7 +621,7 @@ impl<'a, 'data> Translator<'a, 'data> { fn instantiate_module_from_exports( &mut self, exports: &[wasmparser::Export<'data>], - ) -> ModuleInstanceDef<'data> { + ) -> LocalInitializer<'data> { let mut map = HashMap::with_capacity(exports.len()); for export in exports { let idx = match export.kind { @@ -702,7 +647,21 @@ impl<'a, 'data> Translator<'a, 'data> { }; map.insert(export.name, idx); } - ModuleInstanceDef::Synthetic(map) + LocalInitializer::ModuleSynthetic(map) + } + + fn instantiate_component( + &mut self, + component: ComponentIndex, + raw_args: &[wasmparser::ComponentInstantiationArg<'data>], + ) -> LocalInitializer<'data> { + let mut args = HashMap::with_capacity(raw_args.len()); + for arg in raw_args { + let idx = self.kind_to_item(arg.kind, arg.index); + args.insert(arg.name, idx); + } + + LocalInitializer::ComponentInstantiate(component, args) } /// Creates a synthetic module from the list of items currently in the @@ -710,110 +669,40 @@ impl<'a, 'data> Translator<'a, 'data> { fn instantiate_component_from_exports( &mut self, exports: &[wasmparser::ComponentExport<'data>], - ) -> ComponentInstanceDef<'data> { + ) -> LocalInitializer<'data> { let mut map = HashMap::with_capacity(exports.len()); for export in exports { - let idx = match &export.kind { - wasmparser::ComponentExternalKind::Func => { - let index = FuncIndex::from_u32(export.index); - ComponentItem::Func(index) - } - wasmparser::ComponentExternalKind::Module => { - let index = ModuleIndex::from_u32(export.index); - ComponentItem::Module(index) - } - wasmparser::ComponentExternalKind::Instance => { - let index = ComponentInstanceIndex::from_u32(export.index); - ComponentItem::ComponentInstance(index) - } - 
wasmparser::ComponentExternalKind::Component => { - let index = ComponentIndex::from_u32(export.index); - ComponentItem::Component(index) - } - wasmparser::ComponentExternalKind::Value => { - unimplemented!("component values"); - } - wasmparser::ComponentExternalKind::Type => { - unimplemented!("component type export"); - } - }; + let idx = self.kind_to_item(export.kind, export.index); map.insert(export.name, idx); } - ComponentInstanceDef::Synthetic(map) + LocalInitializer::ComponentSynthetic(map) } - fn export(&mut self, export: &wasmparser::ComponentExport<'data>) { - let name = export.name; - let export = match export.kind { - wasmparser::ComponentExternalKind::Module => { - let idx = ModuleIndex::from_u32(export.index); - let init = match self.result.modules[idx].clone() { - ModuleDef::Upvar(idx) => Initializer::SaveModuleUpvar(idx), - ModuleDef::Import { import, .. } => { - Initializer::SaveModuleImport(self.runtime_import_index(import)) - } - }; - self.result.component.initializers.push(init); - let runtime_index = - RuntimeModuleIndex::from_u32(self.result.component.num_runtime_modules); - self.result.component.num_runtime_modules += 1; - Export::Module(runtime_index) + fn kind_to_item(&self, kind: wasmparser::ComponentExternalKind, index: u32) -> ComponentItem { + match kind { + wasmparser::ComponentExternalKind::Func => { + let index = ComponentFuncIndex::from_u32(index); + ComponentItem::Func(index) } - wasmparser::ComponentExternalKind::Component => { - let idx = ComponentIndex::from_u32(export.index); - drop(idx); - unimplemented!("exporting a component"); + wasmparser::ComponentExternalKind::Module => { + let index = ModuleIndex::from_u32(index); + ComponentItem::Module(index) } wasmparser::ComponentExternalKind::Instance => { - let idx = ComponentInstanceIndex::from_u32(export.index); - drop(idx); - unimplemented!("exporting an instance"); + let index = ComponentInstanceIndex::from_u32(index); + ComponentItem::ComponentInstance(index) } - 
wasmparser::ComponentExternalKind::Func => { - let idx = ComponentFuncIndex::from_u32(export.index); - match self.result.component_funcs[idx].clone() { - ComponentFunc::Lifted { ty, func, options } => Export::LiftedFunction { - ty, - func: func.to_core_export(|i| match i { - EntityIndex::Function(i) => i, - _ => unreachable!(), - }), - options, - }, - - // TODO: Not 100% clear what to do about this. Given the - // expected implementation of host functions there's not a - // great way to actually invoke a host function after it's - // been wrapped up in a `Func` (or similar). One of the - // major issues here is that the callee expects the - // canonical-abi format but the caller has host-rust format, - // and bridging that gap is expected to be nontrivial. - // - // This may be solvable with like a temporary arena to lower - // into which is discarded after the call finishes? Or... - // something like that? This may not be too important to - // support in terms of perf so if it's not the fastest thing - // in the world that's probably alright. - // - // Nevertheless this shouldn't panic, eventually when the - // component model implementation is finished this should do - // something reasonable. - ComponentFunc::Import { .. 
} => unimplemented!("exporting an import"), - } + wasmparser::ComponentExternalKind::Component => { + let index = ComponentIndex::from_u32(index); + ComponentItem::Component(index) } wasmparser::ComponentExternalKind::Value => { - unimplemented!("exporting a value"); + unimplemented!("component values"); } wasmparser::ComponentExternalKind::Type => { - let idx = TypeIndex::from_u32(export.index); - drop(idx); - unimplemented!("exporting a type"); + unimplemented!("component type export"); } - }; - self.result - .component - .exports - .insert(name.to_string(), export); + } } fn alias_module_instance_export( @@ -821,117 +710,15 @@ impl<'a, 'data> Translator<'a, 'data> { kind: wasmparser::ExternalKind, instance: ModuleInstanceIndex, name: &'data str, - ) { - match &self.result.module_instances[instance] { - // The `instance` points to an instantiated module, meaning we can - // lookup the `CoreSource` associated with it and use the type - // information to insert it into the appropriate namespace. 
- ModuleInstanceDef::Instantiated { instance, module } => { - let (src, ty) = self.lookup_core_source_in_module(*instance, *module, name); - match ty { - EntityType::Function(_) => { - assert_eq!(kind, wasmparser::ExternalKind::Func); - self.result.funcs.push(Func::Core(src)); - } - EntityType::Global(_) => { - assert_eq!(kind, wasmparser::ExternalKind::Global); - self.result.globals.push(src); - } - EntityType::Memory(_) => { - assert_eq!(kind, wasmparser::ExternalKind::Memory); - self.result.memories.push(src); - } - EntityType::Table(_) => { - assert_eq!(kind, wasmparser::ExternalKind::Table); - self.result.tables.push(src); - } - EntityType::Tag(_) => unimplemented!("wasm exceptions"), - } + ) -> LocalInitializer<'data> { + match kind { + wasmparser::ExternalKind::Func => LocalInitializer::AliasExportFunc(instance, name), + wasmparser::ExternalKind::Memory => LocalInitializer::AliasExportMemory(instance, name), + wasmparser::ExternalKind::Table => LocalInitializer::AliasExportTable(instance, name), + wasmparser::ExternalKind::Global => LocalInitializer::AliasExportGlobal(instance, name), + wasmparser::ExternalKind::Tag => { + unimplemented!("wasm exceptions"); } - - // ... and like above for synthetic components aliasing exports from - // synthetic modules is also just copying around the identifying - // information. 
- ModuleInstanceDef::Synthetic(exports) => match exports[&name] { - EntityIndex::Function(i) => { - assert_eq!(kind, wasmparser::ExternalKind::Func); - self.result.funcs.push(self.result.funcs[i].clone()); - } - EntityIndex::Global(i) => { - assert_eq!(kind, wasmparser::ExternalKind::Global); - self.result.globals.push(self.result.globals[i].clone()); - } - EntityIndex::Table(i) => { - assert_eq!(kind, wasmparser::ExternalKind::Table); - self.result.tables.push(self.result.tables[i].clone()); - } - EntityIndex::Memory(i) => { - assert_eq!(kind, wasmparser::ExternalKind::Memory); - self.result.memories.push(self.result.memories[i].clone()); - } - }, - } - } - - fn alias_component_instance_export( - &mut self, - kind: wasmparser::ComponentExternalKind, - instance: ComponentInstanceIndex, - name: &'data str, - ) { - match &self.result.component_instances[instance] { - // The `instance` points to an imported component instance, meaning - // that the item we're pushing into our index spaces is effectively - // another form of import. The `name` is appended to the `import` - // found here and then the appropriate namespace of an import is - // recorded as well. 
- ComponentInstanceDef::Import { import, ty } => { - let import = import.append(name); - match self.types[*ty].exports[name] { - TypeDef::Module(ty) => { - assert_eq!(kind, wasmparser::ComponentExternalKind::Module); - self.result.modules.push(ModuleDef::Import { import, ty }); - } - TypeDef::ComponentInstance(ty) => { - assert_eq!(kind, wasmparser::ComponentExternalKind::Instance); - self.result - .component_instances - .push(ComponentInstanceDef::Import { import, ty }); - } - TypeDef::ComponentFunc(_ty) => { - assert_eq!(kind, wasmparser::ComponentExternalKind::Func); - self.result - .component_funcs - .push(ComponentFunc::Import(import)); - } - TypeDef::Interface(_) => unimplemented!("alias type export"), - TypeDef::Component(_) => unimplemented!("alias component export"), - - // not possible with valid components - TypeDef::CoreFunc(_ty) => unreachable!(), - } - } - - // For synthetic component/module instances we can just copy the - // definition of the original item into a new slot as well to record - // that the index describes the same item. - ComponentInstanceDef::Synthetic(exports) => match exports[&name] { - ComponentItem::Func(i) => { - assert_eq!(kind, wasmparser::ComponentExternalKind::Func); - self.result.funcs.push(self.result.funcs[i].clone()); - } - ComponentItem::Module(i) => { - assert_eq!(kind, wasmparser::ComponentExternalKind::Module); - self.result.modules.push(self.result.modules[i].clone()); - } - ComponentItem::ComponentInstance(i) => { - assert_eq!(kind, wasmparser::ComponentExternalKind::Instance); - self.result - .component_instances - .push(self.result.component_instances[i].clone()); - } - ComponentItem::Component(_) => unimplemented!("aliasing a component export"), - }, } } @@ -961,81 +748,51 @@ impl<'a, 'data> Translator<'a, 'data> { self.types.push_component_typedef(ty); } + // For more information about the implementation of outer aliases + // see the documentation of `LexicalScope`. 
Otherwise though the + // main idea here is that the data to close over starts as `Local` + // and then transitions to `Upvar` as it's inserted into the parents + // in order from the target we're aliasing back to the current + // component. wasmparser::ComponentOuterAliasKind::CoreModule => { - unimplemented!("outer alias to module"); + let index = ModuleIndex::from_u32(index); + let mut module = ClosedOverModule::Local(index); + let depth = self.lexical_scopes.len() - (count as usize); + for frame in self.lexical_scopes[depth..].iter_mut() { + module = ClosedOverModule::Upvar(frame.closure_args.modules.push(module)); + } + + // If the `module` is still `Local` then the `count` was 0 and + // it's an alias into our own space. Otherwise it's switched to + // an upvar and will index into the upvar space. Either way + // it's just plumbed directly into the initializer. + self.result + .initializers + .push(LocalInitializer::AliasModule(module)); } wasmparser::ComponentOuterAliasKind::Component => { - unimplemented!("outer alias to component"); - } - } - } + let index = ComponentIndex::from_u32(index); + let mut component = ClosedOverComponent::Local(index); + let depth = self.lexical_scopes.len() - (count as usize); + for frame in self.lexical_scopes[depth..].iter_mut() { + component = + ClosedOverComponent::Upvar(frame.closure_args.components.push(component)); + } - fn lift_function( - &mut self, - ty: ComponentTypeIndex, - func: FuncIndex, - options: &[wasmparser::CanonicalOption], - ) -> ComponentFunc<'data> { - let ty = match self.types.component_outer_type(0, ty) { - TypeDef::ComponentFunc(ty) => ty, - // should not be possible after validation - _ => unreachable!(), - }; - let func = match &self.result.funcs[func] { - Func::Core(core) => core.clone(), - - // TODO: it's not immediately obvious how to implement this. Once - // lowered imports are fully implemented it may be the case that - // implementing this "just falls out" of the same implementation.
- // This technically is valid and basically just result in leaking - // memory into core wasm (since nothing is around to call - // deallocation/free functions). - Func::Lowered(_) => unimplemented!("lifting a lowered function"), - }; - let options = self.canonical_options(options); - ComponentFunc::Lifted { ty, func, options } - } - - fn lower_function( - &mut self, - func: ComponentFuncIndex, - options: &[wasmparser::CanonicalOption], - ) -> Func<'data> { - let options = self.canonical_options(options); - match self.result.component_funcs[func].clone() { - ComponentFunc::Import(import) => { - let import = self.runtime_import_index(import); - let index = LoweredIndex::from_u32(self.result.component.num_lowerings); - self.result.component.num_lowerings += 1; - let fill_idx = self.result.component.initializers.len(); self.result - .component .initializers - .push(Initializer::LowerImport(LowerImport { - index, - import, - options, - // This is filled after the component is finished when - // we have wasmparser's type information available, so - // leave a dummy for now to get filled in. - canonical_abi: SignatureIndex::from_u32(0), - })); - self.result - .signatures_to_fill - .push((fill_idx, self.result.funcs.next_key())); - Func::Lowered(index) + .push(LocalInitializer::AliasComponent(component)); } - - // TODO: From reading the spec, this technically should create a - // function that lifts the arguments and then afterwards - // unconditionally traps. That would mean that this validates the - // arguments within the context of `options` and then traps. - ComponentFunc::Lifted { .. 
} => unimplemented!("lower a lifted function"), } } - fn canonical_options(&mut self, opts: &[wasmparser::CanonicalOption]) -> CanonicalOptions { - let mut ret = CanonicalOptions::default(); + fn canonical_options(&mut self, opts: &[wasmparser::CanonicalOption]) -> LocalCanonicalOptions { + let mut ret = LocalCanonicalOptions { + string_encoding: StringEncoding::Utf8, + memory: None, + realloc: None, + post_return: None, + }; for opt in opts { match opt { wasmparser::CanonicalOption::UTF8 => { @@ -1049,93 +806,18 @@ impl<'a, 'data> Translator<'a, 'data> { } wasmparser::CanonicalOption::Memory(idx) => { let idx = MemoryIndex::from_u32(*idx); - let memory = self.result.memories[idx].to_core_export(|i| match i { - EntityIndex::Memory(i) => i, - _ => unreachable!(), - }); - let memory = self.runtime_memory(memory); - ret.memory = Some(memory); + ret.memory = Some(idx); } wasmparser::CanonicalOption::Realloc(idx) => { let idx = FuncIndex::from_u32(*idx); - let realloc = self.result.funcs[idx].to_core_def(); - let realloc = self.runtime_realloc(realloc); - ret.realloc = Some(realloc); + ret.realloc = Some(idx); } - wasmparser::CanonicalOption::PostReturn(_) => { - unimplemented!("post-return"); + wasmparser::CanonicalOption::PostReturn(idx) => { + let idx = FuncIndex::from_u32(*idx); + ret.post_return = Some(idx); } } } return ret; } - - fn runtime_import_index(&mut self, import: RuntimeImport) -> RuntimeImportIndex { - if let Some(idx) = self.result.import_map.get(&import) { - return *idx; - } - let idx = self - .result - .component - .imports - .push((import.source, import.exports.clone())); - self.result.import_map.insert(import, idx); - return idx; - } - - fn runtime_memory(&mut self, export: CoreExport) -> RuntimeMemoryIndex { - if let Some(idx) = self.result.memory_to_runtime.get(&export) { - return *idx; - } - let index = RuntimeMemoryIndex::from_u32(self.result.component.num_runtime_memories); - self.result.component.num_runtime_memories += 1; - 
self.result.memory_to_runtime.insert(export.clone(), index); - self.result - .component - .initializers - .push(Initializer::ExtractMemory { index, export }); - index - } - - fn runtime_realloc(&mut self, def: CoreDef) -> RuntimeReallocIndex { - if let Some(idx) = self.result.realloc_to_runtime.get(&def) { - return *idx; - } - let index = RuntimeReallocIndex::from_u32(self.result.component.num_runtime_reallocs); - self.result.component.num_runtime_reallocs += 1; - self.result.realloc_to_runtime.insert(def.clone(), index); - self.result - .component - .initializers - .push(Initializer::ExtractRealloc { index, def }); - index - } -} - -impl CoreSource<'_> { - fn to_core_export(&self, get_index: impl FnOnce(EntityIndex) -> T) -> CoreExport { - match self { - CoreSource::Index(instance, index) => CoreExport { - instance: *instance, - item: ExportItem::Index(get_index(*index)), - }, - CoreSource::Export(instance, name) => CoreExport { - instance: *instance, - item: ExportItem::Name(name.to_string()), - }, - } - } - - fn to_core_def(&self) -> CoreDef { - self.to_core_export(|i| i).into() - } -} - -impl Func<'_> { - fn to_core_def(&self) -> CoreDef { - match self { - Func::Core(src) => src.to_core_def(), - Func::Lowered(idx) => CoreDef::Lowered(*idx), - } - } } diff --git a/crates/environ/src/component/translate/inline.rs b/crates/environ/src/component/translate/inline.rs new file mode 100644 index 0000000000..bac194ea5c --- /dev/null +++ b/crates/environ/src/component/translate/inline.rs @@ -0,0 +1,927 @@ +//! Implementation of "inlining" a component into a flat list of initializers. +//! +//! After the first phase of compiling a component we're left with a single +//! root `Translation` for the original component along with a "static" list of +//! child components. Each `Translation` has a list of `LocalInitializer` items +//! inside of it which is a primitive representation of how the component +//! 
should be constructed with effectively one initializer per item in the +//! index space of a component. This "local initializer" list would be +//! relatively inefficient to process at runtime and more importantly doesn't +//! convey enough information to understand what trampolines need to be +//! compiled or what fused adapters need to be generated. This consequently is +//! the motivation for this file. +//! +//! The second phase of compilation, inlining here, will in a sense interpret +//! the initializers, at compile time, into a new list of `GlobalInitializer` entries +//! which are a sort of "global initializer". The generated `GlobalInitializer` is +//! much more specific than the `LocalInitializer` and additionally far fewer +//! `GlobalInitializer` structures are generated (in theory) than there are local +//! initializers. +//! +//! The "inlining" portion of the name of this module indicates how the +//! instantiation of a component is interpreted as calling a function. The +//! function's arguments are the imports provided to the instantiation of a +//! component, and further nested function calls happen on a stack when a +//! nested component is instantiated. The inlining then refers to how this +//! stack of instantiations is flattened to one list of `GlobalInitializer` +//! entries to represent the process of instantiating a component graph, +//! similar to how function inlining removes call instructions and creates one +//! giant function for a call graph. Here there are no inlining heuristics or +//! anything like that, we simply inline everything into the root component's +//! list of initializers. +//! +//! Another primary task this module performs is a form of dataflow analysis +//! to represent items in each index space with their definition rather than +//! references of relative indices. These definitions (all the `*Def` types in +//! this module) are not local to any one nested component and instead +//! 
represent state available at runtime tracked in the final `Component` +//! produced. +//! +//! With all this pieced together the general idea is relatively +//! straightforward. All of a component's initializers are processed in sequence +//! where instantiating a nested component pushes a "frame" onto a stack to +//! start executing and we resume at the old one when we're done. Items are +//! tracked back to where they come from and at the end, after processing, only +//! the side-effectful initializers are emitted to the `GlobalInitializer` list +//! in the final `Component`. + +use crate::component::translate::*; +use crate::{ModuleTranslation, PrimaryMap}; +use indexmap::IndexMap; + +pub(super) fn run( + types: &mut ComponentTypesBuilder, + result: &Translation<'_>, + nested_modules: &PrimaryMap<StaticModuleIndex, ModuleTranslation<'_>>, + nested_components: &PrimaryMap<StaticComponentIndex, Translation<'_>>, +) -> Result<Component> { + let mut inliner = Inliner { + types, + nested_modules, + nested_components, + result: Component::default(), + import_path_interner: Default::default(), + runtime_realloc_interner: Default::default(), + runtime_memory_interner: Default::default(), + }; + + // The initial arguments to the root component are all host imports. This + // means that each one is defined by the `Import` variant of its + // respective `*Def` type. Here an `ImportIndex` is allocated for each item + // and then the argument is recorded. + // + // Note that this represents the abstract state of a host import of an + // item since we don't know the precise structure of the host import. 
+ let mut args = HashMap::with_capacity(result.exports.len()); + for init in result.initializers.iter() { + let (name, ty) = match *init { + LocalInitializer::Import(name, ty) => (name, ty), + _ => continue, + }; + let index = inliner.result.import_types.push((name.to_string(), ty)); + let path = ImportPath::root(index); + args.insert( + name, + match ty { + TypeDef::Module(ty) => ComponentItemDef::Module(ModuleDef::Import(path, ty)), + TypeDef::ComponentInstance(ty) => { + ComponentItemDef::Instance(ComponentInstanceDef::Import(path, ty)) + } + TypeDef::ComponentFunc(_ty) => { + ComponentItemDef::Func(ComponentFuncDef::Import(path)) + } + // FIXME(#4283) should commit one way or another to how this + // should be treated. + TypeDef::Component(_ty) => bail!("root-level component imports are not supported"), + TypeDef::Interface(_ty) => unimplemented!("import of a type"), + TypeDef::CoreFunc(_ty) => unreachable!(), + }, + ); + } + + // This will run the inliner to completion after being seeded with the + // initial frame. When the inliner finishes it will return the exports of + // the root frame which are then used for recording the exports of the + // component. + let mut frames = vec![InlinerFrame::new(result, ComponentClosure::default(), args)]; + let exports = inliner.run(&mut frames)?; + assert!(frames.is_empty()); + + for (name, def) in exports { + let export = match def { + // Exported modules are currently saved in a `PrimaryMap`, at + // runtime, so an index (`RuntimeModuleIndex`) is assigned here and + // then an initializer is recorded about where the module comes + // from. 
+ ComponentItemDef::Module(module) => { + let index = RuntimeModuleIndex::from_u32(inliner.result.num_runtime_modules); + inliner.result.num_runtime_modules += 1; + let init = match module { + ModuleDef::Static(idx) => GlobalInitializer::SaveStaticModule(idx), + ModuleDef::Import(path, _) => { + GlobalInitializer::SaveModuleImport(inliner.runtime_import(&path)) + } + }; + inliner.result.initializers.push(init); + Export::Module(index) + } + + // Currently only exported functions through liftings are supported + // which simply record the various lifting options here which get + // processed at runtime. + ComponentItemDef::Func(func) => match func { + ComponentFuncDef::Lifted { ty, func, options } => { + Export::LiftedFunction { ty, func, options } + } + ComponentFuncDef::Import(_) => unimplemented!("reexporting a function import"), + }, + + ComponentItemDef::Instance(_) => unimplemented!("exporting an instance to the host"), + + // FIXME(#4283) should make an official decision on whether this is + // the final treatment of this or not. + ComponentItemDef::Component(_) => { + bail!("exporting a component from the root component is not supported") + } + }; + + inliner.result.exports.insert(name.to_string(), export); + } + + Ok(inliner.result) +} + +struct Inliner<'a> { + /// Global type information for the entire component. + /// + /// Note that the mutability is used here to register a `SignatureIndex` for + /// the wasm function signature of lowered imports. + types: &'a mut ComponentTypesBuilder, + + /// The list of static modules that were found during initial translation of + /// the component. + /// + /// This is used during the instantiation of these modules to ahead-of-time + /// order the arguments precisely according to what the module is defined as + /// needing which avoids the need to do string lookups or permute arguments + /// at runtime. 
+ nested_modules: &'a PrimaryMap>, + + /// The list of static components that were found during initial translation of + /// the component. + /// + /// This is used when instantiating nested components to push a new + /// `InlinerFrame` with the `Translation`s here. + nested_components: &'a PrimaryMap>, + + /// The final `Component` that is being constructed and returned from this + /// inliner. + result: Component, + + // Maps used to "intern" various runtime items to only save them once at + // runtime instead of multiple times. + import_path_interner: HashMap, RuntimeImportIndex>, + runtime_realloc_interner: HashMap, + runtime_memory_interner: HashMap, RuntimeMemoryIndex>, +} + +/// A "stack frame" as part of the inlining process, or the progress through +/// instantiating a component. +/// +/// All instantiations of a component will create an `InlinerFrame` and are +/// incrementally processed via the `initializers` list here. Note that the +/// inliner frames are stored on the heap to avoid recursion based on user +/// input. +struct InlinerFrame<'a> { + /// The remaining initializers to process when instantiating this component. + initializers: std::slice::Iter<'a, LocalInitializer<'a>>, + + /// The component being instantiated. + translation: &'a Translation<'a>, + + /// The "closure arguments" to this component, or otherwise the maps indexed + /// by `ModuleUpvarIndex` and `ComponentUpvarIndex`. This is created when + /// a component is created and stored as part of a component's state during + /// inlining. + closure: ComponentClosure<'a>, + + /// The arguments to the creation of this component. + /// + /// At the root level these are all imports from the host and between + /// components this otherwise tracks how all the arguments are defined. 
+ args: HashMap<&'a str, ComponentItemDef<'a>>, + + // core wasm index spaces + funcs: PrimaryMap, + memories: PrimaryMap>, + tables: PrimaryMap>, + globals: PrimaryMap>, + modules: PrimaryMap>, + + // component model index spaces + component_funcs: PrimaryMap>, + module_instances: PrimaryMap>, + component_instances: PrimaryMap>, + components: PrimaryMap>, +} + +/// "Closure state" for a component which is resolved from the `ClosedOverVars` +/// state that was calculated during translation. +// +// FIXME: this is cloned quite a lot and given the internal maps if this is a +// perf issue we may want to `Rc` these fields. Note that this is only a perf +// hit at compile-time though which we in general don't pay too too much +// attention to. +#[derive(Default, Clone)] +struct ComponentClosure<'a> { + modules: PrimaryMap>, + components: PrimaryMap>, +} + +/// Representation of a "path" into an import. +/// +/// Imports from the host at this time are one of three things: +/// +/// * Functions +/// * Core wasm modules +/// * "Instances" of these three items +/// +/// The "base" values are functions and core wasm modules, but the abstraction +/// of an instance allows embedding functions/modules deeply within other +/// instances. This "path" represents optionally walking through a host instance +/// to get to the final desired item. At runtime instances are just maps of +/// values and so this is used to ensure that we primarily only deal with +/// individual functions and modules instead of synthetic instances. +#[derive(Clone, PartialEq, Hash, Eq)] +struct ImportPath<'a> { + index: ImportIndex, + path: Vec<&'a str>, +} + +/// Representation of all items which can be defined within a component. +/// +/// This is the "value" of an item defined within a component and is used to +/// represent both imports and exports. 
+#[derive(Clone)] +enum ComponentItemDef<'a> { + Component(ComponentDef<'a>), + Instance(ComponentInstanceDef<'a>), + Func(ComponentFuncDef<'a>), + Module(ModuleDef<'a>), +} + +#[derive(Clone)] +enum ModuleDef<'a> { + /// A core wasm module statically defined within the original component. + /// + /// The `StaticModuleIndex` indexes into the `nested_modules` map in the + /// `Inliner`. + Static(StaticModuleIndex), + + /// A core wasm module that was imported from the host. + Import(ImportPath<'a>, TypeModuleIndex), +} + +// Note that unlike all other `*Def` types which are not allowed to have local +// indices this type does indeed have local indices. That is represented with +// the lack of a `Clone` here where once this is created it's never moved across +// components because module instances always stick within one component. +enum ModuleInstanceDef<'a> { + /// A core wasm module instance was created through the instantiation of a + /// module. + /// + /// The `RuntimeInstanceIndex` was the index allocated as this was the + /// `n`th instantiation and the `ModuleIndex` points into an + /// `InlinerFrame`'s local index space. + Instantiated(RuntimeInstanceIndex, ModuleIndex), + + /// A "synthetic" core wasm module instance which is just a bag of named + /// indices. + /// + /// Note that this can really only be used for passing as an argument to + /// another module's instantiation and is used to rename arguments locally. + Synthetic(&'a HashMap<&'a str, EntityIndex>), +} + +#[derive(Clone)] +enum ComponentFuncDef<'a> { + /// A host-imported component function. + Import(ImportPath<'a>), + + /// A core wasm function was lifted into a component function. + Lifted { + ty: TypeFuncIndex, + func: CoreExport<FuncIndex>, + options: CanonicalOptions, + }, +} + +#[derive(Clone)] +enum ComponentInstanceDef<'a> { + /// A host-imported instance. + /// + /// This typically means that it's "just" a map of named values. 
It's not + /// actually supported to take a `wasmtime::component::Instance` and pass it + /// to another instance at this time. + Import(ImportPath<'a>, TypeComponentInstanceIndex), + + /// A concrete map of values. + /// + /// This is used for both instantiated components as well as "synthetic" + /// components. This variant can be used for both because both are + /// represented by simply a bag of items within the entire component + /// instantiation process. + // + // FIXME: same as the issue on `ComponentClosure` where this is cloned a lot + // and may need `Rc`. + Items(IndexMap<&'a str, ComponentItemDef<'a>>), +} + +#[derive(Clone)] +struct ComponentDef<'a> { + index: StaticComponentIndex, + closure: ComponentClosure<'a>, +} + +impl<'a> Inliner<'a> { + fn run( + &mut self, + frames: &mut Vec>, + ) -> Result>> { + // This loop represents the execution of the instantiation of a + // component. This is an iterative process which is finished once all + // initializers are processed. Currently this is modeled as an infinite + // loop which drives the top-most iterator of the `frames` stack + // provided as an argument to this function. + loop { + let frame = frames.last_mut().unwrap(); + match frame.initializers.next() { + // Process the initializer and if it started the instantiation + // of another component then we push that frame on the stack to + // continue onwards. + Some(init) => match self.initializer(frame, init)? { + Some(new_frame) => frames.push(new_frame), + None => {} + }, + + // If there are no more initializers for this frame then the + // component it represents has finished instantiation. The + // exports of the component are collected and then the entire + // frame is discarded. The exports are then either pushed in the + // parent frame, if any, as a new component instance or they're + // returned from this function for the root set of exports. 
+ None => { + let exports = frame + .translation + .exports + .iter() + .map(|(name, item)| (*name, frame.item(*item))) + .collect(); + frames.pop(); + match frames.last_mut() { + Some(parent) => { + parent + .component_instances + .push(ComponentInstanceDef::Items(exports)); + } + None => break Ok(exports), + } + } + } + } + } + + fn initializer( + &mut self, + frame: &mut InlinerFrame<'a>, + initializer: &'a LocalInitializer, + ) -> Result>> { + use LocalInitializer::*; + + match initializer { + // When a component imports an item the actual definition of the + // item is looked up here (not at runtime) via its name. The + // arguments provided in our `InlinerFrame` describe how each + // argument was defined, so we simply move it from there into the + // correct index space. + // + // Note that for the root component this will add `*::Import` items + // but for sub-components this will do resolution to connect what + // was provided as an import at the instantiation-site to what was + // needed during the component's instantiation. + Import(name, _ty) => match &frame.args[name] { + ComponentItemDef::Module(i) => { + frame.modules.push(i.clone()); + } + ComponentItemDef::Component(i) => { + frame.components.push(i.clone()); + } + ComponentItemDef::Instance(i) => { + frame.component_instances.push(i.clone()); + } + ComponentItemDef::Func(i) => { + frame.component_funcs.push(i.clone()); + } + }, + + // Lowering a component function to a core wasm function is + // generally what "triggers compilation". Here various metadata is + // recorded and then the final component gets an initializer + // recording the lowering. + // + // NB: at this time only lowered imported functions are supported. + Lower(func, options) => { + // Assign this lowering a unique index and determine the core + // wasm function index we're defining. 
+ let index = LoweredIndex::from_u32(self.result.num_lowerings); + self.result.num_lowerings += 1; + let func_index = frame.funcs.push(CoreDef::Lowered(index)); + + // Use the type information from `wasmparser` to lookup the core + // wasm function signature of the lowered function. This avoids + // us having to reimplement the + // translate-interface-types-to-the-canonical-abi logic. The + // type of the function is then intern'd to get a + // `SignatureIndex` which is later used at runtime for a + // `VMSharedSignatureIndex`. + let lowered_function_type = frame + .translation + .types + .as_ref() + .unwrap() + .function_at(func_index.as_u32()) + .expect("should be in-bounds"); + let canonical_abi = self + .types + .module_types_builder() + .wasm_func_type(lowered_function_type.clone().try_into()?); + + let options = self.canonical_options(frame, options); + match &frame.component_funcs[*func] { + // If this component function was originally a host import + // then this is a lowered host function which needs a + // trampoline to enter WebAssembly. That's recorded here + // with all relevant information. + ComponentFuncDef::Import(path) => { + let import = self.runtime_import(path); + self.result + .initializers + .push(GlobalInitializer::LowerImport(LowerImport { + canonical_abi, + import, + index, + options, + })); + } + + // TODO: Lowering a lift function could mean one of two + // things: + // + // * This could mean that a "fused adapter" was just + // identified. If the lifted function here comes from a + // different component than we're lowering into then we + // have identified the fusion location of two components + // talking to each other. Metadata needs to be recorded + // here about the fusion to get something generated by + // Cranelift later on. + // + // * Otherwise if the lifted function is in the same + // component that we're lowering into then that means + // something "funky" is happening. 
This needs to be + // carefully implemented with respect to the + // may_{enter,leave} flags as specified by the canonical + // ABI. The careful consideration for how to do this has + // not yet happened. + // + // In general this is almost certainly going to require some + // new variant of `GlobalInitializer` in one form or another. + ComponentFuncDef::Lifted { .. } => { + unimplemented!("lowering a lifted function") + } + } + } + + // Lifting a core wasm function is relatively easy for now in that + // some metadata about the lifting is simply recorded. This'll get + // plumbed through to exports or a fused adapter later on. + Lift(ty, func, options) => { + let options = self.canonical_options(frame, options); + frame.component_funcs.push(ComponentFuncDef::Lifted { + ty: *ty, + func: match frame.funcs[*func].clone() { + CoreDef::Export(e) => e.map_index(|i| match i { + EntityIndex::Function(i) => i, + _ => unreachable!("not possible in valid components"), + }), + + // TODO: lifting a lowered function only happens within + // one component so this runs afoul of "someone needs to + // really closely interpret the may_{enter,leave} flags" + // in the component model spec. That has not currently + // been done so this is left to panic. + CoreDef::Lowered(_) => unimplemented!("lifting a lowered function"), + }, + options, + }); + } + + ModuleStatic(idx) => { + frame.modules.push(ModuleDef::Static(*idx)); + } + + // Instantiation of a module is one of the meatier initializers that + // we'll generate. The main magic here is that for a statically + // known module we can order the imports as a list to exactly what + // the static module needs to be instantiated. For imported modules, + // however, string-based name resolution must happen at runtime, so + // that is deferred here by organizing the arguments as a two-layer + // `IndexMap` of what we're providing. 
+ // + // In both cases though a new `RuntimeInstanceIndex` is allocated + // and an initializer is recorded to indicate that it's being + // instantiated. + ModuleInstantiate(module, args) => { + let init = match &frame.modules[*module] { + ModuleDef::Static(idx) => { + let mut defs = Vec::new(); + for (module, name, _ty) in self.nested_modules[*idx].module.imports() { + let instance = args[module]; + defs.push( + self.core_def_of_module_instance_export(frame, instance, name), + ); + } + InstantiateModule::Static(*idx, defs.into()) + } + ModuleDef::Import(path, ty) => { + let mut defs = IndexMap::new(); + for ((module, name), _) in self.types[*ty].imports.iter() { + let instance = args[module.as_str()]; + let def = + self.core_def_of_module_instance_export(frame, instance, name); + defs.entry(module.to_string()) + .or_insert(IndexMap::new()) + .insert(name.to_string(), def); + } + let index = self.runtime_import(path); + InstantiateModule::Import(index, defs) + } + }; + + let idx = RuntimeInstanceIndex::from_u32(self.result.num_runtime_instances); + self.result.num_runtime_instances += 1; + self.result + .initializers + .push(GlobalInitializer::InstantiateModule(init)); + frame + .module_instances + .push(ModuleInstanceDef::Instantiated(idx, *module)); + } + + ModuleSynthetic(map) => { + frame + .module_instances + .push(ModuleInstanceDef::Synthetic(map)); + } + + // This is one of the stages of the "magic" of implementing outer + // aliases to components and modules. For more information on this + // see the documentation on `LexicalScope`. This stage of the + // implementation of outer aliases is where the `ClosedOverVars` is + // transformed into a `ComponentClosure` state using the current + // `InlinerFrame`'s state. This will capture the "runtime" state of + // outer components and upvars and such naturally as part of the + // inlining process. 
+ ComponentStatic(index, vars) => { + frame.components.push(ComponentDef { + index: *index, + closure: ComponentClosure { + modules: vars + .modules + .iter() + .map(|(_, m)| frame.closed_over_module(m)) + .collect(), + components: vars + .components + .iter() + .map(|(_, m)| frame.closed_over_component(m)) + .collect(), + }, + }); + } + + // Like module instantiation this is a "meaty" part, so don't be + // fooled by the relative simplicity of this case. This is + // implemented primarily by the `Inliner` structure and the design + // of this entire module, so the "easy" step here is to simply + // create a new inliner frame and return it to get pushed onto the + // stack. + ComponentInstantiate(component, args) => { + let component: &ComponentDef<'a> = &frame.components[*component]; + let frame = InlinerFrame::new( + &self.nested_components[component.index], + component.closure.clone(), + args.iter() + .map(|(name, item)| (*name, frame.item(*item))) + .collect(), + ); + return Ok(Some(frame)); + } + + ComponentSynthetic(map) => { + let items = map + .iter() + .map(|(name, index)| (*name, frame.item(*index))) + .collect(); + frame + .component_instances + .push(ComponentInstanceDef::Items(items)); + } + + // Core wasm aliases, this and the cases below, are creating + // `CoreExport` items primarily to insert into the index space so we + // can create a unique identifier pointing to each core wasm export + // with the instance and relevant index/name as necessary. 
+ AliasExportFunc(instance, name) => { + frame + .funcs + .push(self.core_def_of_module_instance_export(frame, *instance, *name)); + } + + AliasExportTable(instance, name) => { + frame.tables.push( + match self.core_def_of_module_instance_export(frame, *instance, *name) { + CoreDef::Export(e) => e, + CoreDef::Lowered(_) => unreachable!(), + }, + ); + } + + AliasExportGlobal(instance, name) => { + frame.globals.push( + match self.core_def_of_module_instance_export(frame, *instance, *name) { + CoreDef::Export(e) => e, + CoreDef::Lowered(_) => unreachable!(), + }, + ); + } + + AliasExportMemory(instance, name) => { + frame.memories.push( + match self.core_def_of_module_instance_export(frame, *instance, *name) { + CoreDef::Export(e) => e, + CoreDef::Lowered(_) => unreachable!(), + }, + ); + } + + AliasComponentExport(instance, name) => { + match &frame.component_instances[*instance] { + // Aliasing an export from an imported instance means that + // we're extending the `ImportPath` by one name, represented + // with the clone + push here. Afterwards an appropriate + // item is then pushed in the relevant index space. 
+ ComponentInstanceDef::Import(path, ty) => { + let mut path = path.clone(); + path.path.push(name); + match self.types[*ty].exports[*name] { + TypeDef::ComponentFunc(_) => { + frame.component_funcs.push(ComponentFuncDef::Import(path)); + } + TypeDef::ComponentInstance(ty) => { + frame + .component_instances + .push(ComponentInstanceDef::Import(path, ty)); + } + TypeDef::Module(ty) => { + frame.modules.push(ModuleDef::Import(path, ty)); + } + TypeDef::Component(_) => { + unimplemented!("aliasing component export of component import") + } + TypeDef::Interface(_) => { + unimplemented!("aliasing type export of component import") + } + + // not possible with valid components + TypeDef::CoreFunc(_) => unreachable!(), + } + } + + // Given a component instance which was either created + // through instantiation of a component or through a + // synthetic renaming of items we just schlep around the + // definitions of various items here. + ComponentInstanceDef::Items(map) => match &map[*name] { + ComponentItemDef::Func(i) => { + frame.component_funcs.push(i.clone()); + } + ComponentItemDef::Module(i) => { + frame.modules.push(i.clone()); + } + ComponentItemDef::Component(i) => { + frame.components.push(i.clone()); + } + ComponentItemDef::Instance(i) => { + let instance = i.clone(); + frame.component_instances.push(instance); + } + }, + } + } + + // For more information on these see `LexicalScope` but otherwise + // this is just taking a closed over variable and inserting the + // actual definition into the local index space since this + // represents an outer alias to a module/component + AliasModule(idx) => { + frame.modules.push(frame.closed_over_module(idx)); + } + AliasComponent(idx) => { + frame.components.push(frame.closed_over_component(idx)); + } + } + + Ok(None) + } + + /// "Commits" a path of an import to an actual index which is something that + /// will be calculated at runtime. 
+ /// + /// Note that the cost of calculating an item for a `RuntimeImportIndex` at + /// runtime is amortized with an `InstancePre` which represents "all the + /// runtime imports are lined up" and after that no more name resolution is + /// necessary. + fn runtime_import(&mut self, path: &ImportPath<'a>) -> RuntimeImportIndex { + *self + .import_path_interner + .entry(path.clone()) + .or_insert_with(|| { + self.result.imports.push(( + path.index, + path.path.iter().map(|s| s.to_string()).collect(), + )) + }) + } + + /// Returns the `CoreDef`, the canonical definition for a core wasm item, + /// for the export `name` of `instance` within `frame`. + fn core_def_of_module_instance_export( + &self, + frame: &InlinerFrame<'a>, + instance: ModuleInstanceIndex, + name: &'a str, + ) -> CoreDef { + match &frame.module_instances[instance] { + // Instantiations of a statically known module means that we can + // refer to the exported item by a precise index, skipping name + // lookups at runtime. + // + // Instantiations of an imported module, however, must do name + // lookups at runtime since we don't know the structure ahead of + // time here. + ModuleInstanceDef::Instantiated(instance, module) => { + let item = match frame.modules[*module] { + ModuleDef::Static(idx) => { + let entity = self.nested_modules[idx].module.exports[name]; + ExportItem::Index(entity) + } + ModuleDef::Import(..) => ExportItem::Name(name.to_string()), + }; + CoreExport { + instance: *instance, + item, + } + .into() + } + + // This is a synthetic instance so the canonical definition of the + // original item is returned. 
+ ModuleInstanceDef::Synthetic(instance) => match instance[name] { + EntityIndex::Function(i) => frame.funcs[i].clone(), + EntityIndex::Table(i) => frame.tables[i].clone().into(), + EntityIndex::Global(i) => frame.globals[i].clone().into(), + EntityIndex::Memory(i) => frame.memories[i].clone().into(), + }, + } + } + + /// Translates a `LocalCanonicalOptions` which indexes into the `frame` + /// specified into a runtime representation. + /// + /// This will "intern" repeatedly reused memories or functions to avoid + /// storing them in multiple locations at runtime. + fn canonical_options( + &mut self, + frame: &InlinerFrame<'a>, + options: &LocalCanonicalOptions, + ) -> CanonicalOptions { + let memory = options.memory.map(|i| { + let export = frame.memories[i].clone().map_index(|i| match i { + EntityIndex::Memory(i) => i, + _ => unreachable!(), + }); + *self + .runtime_memory_interner + .entry(export.clone()) + .or_insert_with(|| { + let index = RuntimeMemoryIndex::from_u32(self.result.num_runtime_memories); + self.result.num_runtime_memories += 1; + self.result + .initializers + .push(GlobalInitializer::ExtractMemory(ExtractMemory { + index, + export, + })); + index + }) + }); + let realloc = options.realloc.map(|i| { + let def = frame.funcs[i].clone(); + *self + .runtime_realloc_interner + .entry(def.clone()) + .or_insert_with(|| { + let index = RuntimeReallocIndex::from_u32(self.result.num_runtime_reallocs); + self.result.num_runtime_reallocs += 1; + self.result + .initializers + .push(GlobalInitializer::ExtractRealloc(ExtractRealloc { + index, + def, + })); + index + }) + }); + if options.post_return.is_some() { + unimplemented!("post-return handling"); + } + CanonicalOptions { + string_encoding: options.string_encoding, + memory, + realloc, + } + } +} + +impl<'a> InlinerFrame<'a> { + fn new( + translation: &'a Translation<'a>, + closure: ComponentClosure<'a>, + args: HashMap<&'a str, ComponentItemDef<'a>>, + ) -> Self { + // FIXME: should iterate over the 
initializers of `translation` and + // calculate the size of each index space to use `with_capacity` for + // all the maps below. Given that doing such would be wordy and compile + // time is otherwise not super crucial it's not done at this time. + InlinerFrame { + translation, + closure, + args, + initializers: translation.initializers.iter(), + + funcs: Default::default(), + memories: Default::default(), + tables: Default::default(), + globals: Default::default(), + + component_instances: Default::default(), + component_funcs: Default::default(), + module_instances: Default::default(), + components: Default::default(), + modules: Default::default(), + } + } + + fn item(&self, index: ComponentItem) -> ComponentItemDef<'a> { + match index { + ComponentItem::Func(i) => ComponentItemDef::Func(self.component_funcs[i].clone()), + ComponentItem::Component(i) => ComponentItemDef::Component(self.components[i].clone()), + ComponentItem::ComponentInstance(i) => { + ComponentItemDef::Instance(self.component_instances[i].clone()) + } + ComponentItem::Module(i) => ComponentItemDef::Module(self.modules[i].clone()), + } + } + + fn closed_over_module(&self, index: &ClosedOverModule) -> ModuleDef<'a> { + match *index { + ClosedOverModule::Local(i) => self.modules[i].clone(), + ClosedOverModule::Upvar(i) => self.closure.modules[i].clone(), + } + } + + fn closed_over_component(&self, index: &ClosedOverComponent) -> ComponentDef<'a> { + match *index { + ClosedOverComponent::Local(i) => self.components[i].clone(), + ClosedOverComponent::Upvar(i) => self.closure.components[i].clone(), + } + } +} + +impl<'a> ImportPath<'a> { + fn root(index: ImportIndex) -> ImportPath<'a> { + ImportPath { + index, + path: Vec::new(), + } + } +} diff --git a/crates/environ/src/component/types.rs b/crates/environ/src/component/types.rs index d447984337..26fcc1f160 100644 --- a/crates/environ/src/component/types.rs +++ b/crates/environ/src/component/types.rs @@ -93,6 +93,25 @@ indices! 
{ /// `Result`) pub struct TypeExpectedIndex(u32); + // ======================================================================== + // Index types used to identify modules and components during compilation. + + /// Index into a "closed over variables" list for components used to + /// implement outer aliases. For more information on this see the + /// documentation for the `LexicalScope` structure. + pub struct ModuleUpvarIndex(u32); + + /// Same as `ModuleUpvarIndex` but for components. + pub struct ComponentUpvarIndex(u32); + + /// Index into the global list of modules found within an entire component. + /// Module translations are saved on the side to get fully compiled after + /// the original component has finished being translated. + pub struct StaticModuleIndex(u32); + + /// Same as `StaticModuleIndex` but for components. + pub struct StaticComponentIndex(u32); + + // ======================================================================== // These indices are actually used at runtime when managing a component at // this time. @@ -103,14 +122,6 @@ indices! { /// refer back to previously created instances for exports and such. pub struct RuntimeInstanceIndex(u32); - /// Index that represents a closed-over-module for a component. - /// - /// Components which embed modules or otherwise refer to module (such as - /// through `alias` annotations) pull in items in to the list of closed over - /// modules, and this index indexes, at runtime, which of the upvars is - /// referenced. - pub struct ModuleUpvarIndex(u32); - /// Used to index imports into a `Component` /// /// This does not correspond to anything in the binary format for the @@ -143,6 +154,9 @@ indices! { /// Same as `RuntimeMemoryIndex` except for the `post-return` function.
+ pub struct RuntimePostReturnIndex(u32); + /// Index that represents an exported module from a component since that's /// currently the only use for saving the entire module state at runtime. pub struct RuntimeModuleIndex(u32); @@ -154,10 +168,10 @@ pub use crate::{FuncIndex, GlobalIndex, MemoryIndex, TableIndex, TypeIndex}; /// Equivalent of `EntityIndex` but for the component model instead of core /// wasm. -#[derive(Debug, Clone, Copy)] +#[derive(Debug, Clone, Copy, Deserialize, Serialize)] #[allow(missing_docs)] pub enum ComponentItem { - Func(FuncIndex), + Func(ComponentFuncIndex), Module(ModuleIndex), Component(ComponentIndex), ComponentInstance(ComponentInstanceIndex), diff --git a/crates/wasmtime/src/component/component.rs b/crates/wasmtime/src/component/component.rs index 350ace22f2..c2f57c7694 100644 --- a/crates/wasmtime/src/component/component.rs +++ b/crates/wasmtime/src/component/component.rs @@ -7,8 +7,7 @@ use std::path::Path; use std::ptr::NonNull; use std::sync::Arc; use wasmtime_environ::component::{ - ComponentTypes, Initializer, LoweredIndex, ModuleUpvarIndex, TrampolineInfo, Translation, - Translator, + ComponentTypes, GlobalInitializer, LoweredIndex, StaticModuleIndex, TrampolineInfo, Translator, }; use wasmtime_environ::PrimaryMap; use wasmtime_jit::CodeMemory; @@ -28,7 +27,7 @@ struct ComponentInner { /// Core wasm modules that the component defined internally, indexed by the /// compile-time-assigned `ModuleUpvarIndex`. 
- upvars: PrimaryMap, + static_modules: PrimaryMap, /// Registered core wasm signatures of this component, or otherwise the /// mapping of the component-local `SignatureIndex` to the engine-local @@ -111,29 +110,26 @@ impl Component { let mut validator = wasmparser::Validator::new_with_features(engine.config().features.clone()); let mut types = Default::default(); - let translation = Translator::new(tunables, &mut validator, &mut types) + let (component, modules) = Translator::new(tunables, &mut validator, &mut types) .translate(binary) .context("failed to parse WebAssembly module")?; let types = Arc::new(types.finish()); - let Translation { - component, upvars, .. - } = translation; - let (upvars, trampolines) = engine.join_maybe_parallel( + let (static_modules, trampolines) = engine.join_maybe_parallel( // In one (possibly) parallel task all the modules found within this // component are compiled. Note that this will further parallelize // function compilation internally too. || -> Result<_> { - let upvars = upvars.into_iter().map(|(_, t)| t).collect::>(); + let upvars = modules.into_iter().map(|(_, t)| t).collect::>(); let modules = engine.run_maybe_parallel(upvars, |module| { let (mmap, info) = Module::compile_functions(engine, module, types.module_types())?; - // FIXME: the `SignatureCollection` here is re-registering the - // entire list of wasm types within `types` on each invocation. - // That's ok semantically but is quite slow to do so. This - // should build up a mapping from `SignatureIndex` to - // `VMSharedSignatureIndex` once and then reuse that for each - // module somehow. + // FIXME: the `SignatureCollection` here is re-registering + // the entire list of wasm types within `types` on each + // invocation. That's ok semantically but is quite slow to + // do so. This should build up a mapping from + // `SignatureIndex` to `VMSharedSignatureIndex` once and + // then reuse that for each module somehow. 
Module::from_parts(engine, mmap, info, types.clone()) })?; @@ -146,7 +142,7 @@ impl Component { .initializers .iter() .filter_map(|init| match init { - Initializer::LowerImport(i) => Some(i), + GlobalInitializer::LowerImport(i) => Some(i), _ => None, }) .collect::>(); @@ -162,7 +158,7 @@ impl Component { Ok((trampolines, wasmtime_jit::mmap_vec_from_obj(obj)?)) }, ); - let upvars = upvars?; + let static_modules = static_modules?; let (trampolines, trampoline_obj) = trampolines?; let mut trampoline_obj = CodeMemory::new(trampoline_obj); let code = trampoline_obj.publish()?; @@ -180,12 +176,12 @@ impl Component { Ok(Component { inner: Arc::new(ComponentInner { component, - upvars, + static_modules, types, - trampolines, + signatures, trampoline_obj, text, - signatures, + trampolines, }), }) } @@ -194,8 +190,8 @@ impl Component { &self.inner.component } - pub(crate) fn upvar(&self, idx: ModuleUpvarIndex) -> &Module { - &self.inner.upvars[idx] + pub(crate) fn static_module(&self, idx: StaticModuleIndex) -> &Module { + &self.inner.static_modules[idx] } pub(crate) fn types(&self) -> &Arc { diff --git a/crates/wasmtime/src/component/instance.rs b/crates/wasmtime/src/component/instance.rs index c18d0714e4..b0fe5b836e 100644 --- a/crates/wasmtime/src/component/instance.rs +++ b/crates/wasmtime/src/component/instance.rs @@ -7,11 +7,11 @@ use anyhow::{anyhow, Context, Result}; use std::marker; use std::sync::Arc; use wasmtime_environ::component::{ - ComponentTypes, CoreDef, CoreExport, Export, ExportItem, Initializer, InstantiateModule, - LowerImport, RuntimeImportIndex, RuntimeInstanceIndex, RuntimeMemoryIndex, RuntimeModuleIndex, - RuntimeReallocIndex, + ComponentTypes, CoreDef, CoreExport, Export, ExportItem, ExtractMemory, ExtractRealloc, + GlobalInitializer, InstantiateModule, LowerImport, RuntimeImportIndex, RuntimeInstanceIndex, + RuntimeModuleIndex, }; -use wasmtime_environ::{EntityIndex, MemoryIndex, PrimaryMap}; +use wasmtime_environ::{EntityIndex, PrimaryMap}; use 
wasmtime_runtime::component::{ComponentInstance, OwnedComponentInstance}; /// An instantiated component. @@ -167,6 +167,16 @@ impl InstanceData { let instance = store.instance_mut(id); let idx = match &item.item { ExportItem::Index(idx) => (*idx).into(), + + // FIXME: ideally at runtime we don't actually do any name lookups + // here. This will only happen when the host supplies an imported + // module so while the structure can't be known at compile time we + // do know at `InstancePre` time, for example, what all the host + // imports are. In theory we should be able to, as part of + // `InstancePre` construction, perform all name=>index mappings + // during that phase so the actual instantiation of an `InstancePre` + // skips all string lookups. This should probably only be + // investigated if this becomes a performance issue though. ExportItem::Name(name) => instance.module().exports[name], }; instance.get_export_by_index(idx) @@ -220,13 +230,13 @@ impl<'a> Instantiator<'a> { let env_component = self.component.env_component(); for initializer in env_component.initializers.iter() { match initializer { - Initializer::InstantiateModule(m) => { + GlobalInitializer::InstantiateModule(m) => { let module; let imports = match m { // Since upvars are statically know we know that the // `args` list is already in the right order. - InstantiateModule::Upvar(idx, args) => { - module = self.component.upvar(*idx); + InstantiateModule::Static(idx, args) => { + module = self.component.static_module(*idx); self.build_imports(store.0, module, args.iter()) } @@ -234,6 +244,10 @@ impl<'a> Instantiator<'a> { // lookups with strings to determine the order of the // imports since it's whatever the actual module // requires. + // + // FIXME: see the note in `ExportItem::Name` handling + // above for how we ideally shouldn't do string lookup + // here. 
InstantiateModule::Import(idx, args) => { module = match &self.imports[*idx] { RuntimeImport::Module(m) => m, @@ -255,23 +269,21 @@ impl<'a> Instantiator<'a> { self.data.instances.push(i); } - Initializer::LowerImport(import) => self.lower_import(import), + GlobalInitializer::LowerImport(import) => self.lower_import(import), - Initializer::ExtractMemory { index, export } => { - self.extract_memory(store.0, *index, export) + GlobalInitializer::ExtractMemory(mem) => self.extract_memory(store.0, mem), + + GlobalInitializer::ExtractRealloc(realloc) => { + self.extract_realloc(store.0, realloc) } - Initializer::ExtractRealloc { index, def } => { - self.extract_realloc(store.0, *index, def) - } - - Initializer::SaveModuleUpvar(idx) => { + GlobalInitializer::SaveStaticModule(idx) => { self.data .exported_modules - .push(self.component.upvar(*idx).clone()); + .push(self.component.static_module(*idx).clone()); } - Initializer::SaveModuleImport(idx) => { + GlobalInitializer::SaveModuleImport(idx) => { self.data.exported_modules.push(match &self.imports[*idx] { RuntimeImport::Module(m) => m.clone(), _ => unreachable!(), @@ -307,30 +319,22 @@ impl<'a> Instantiator<'a> { self.data.funcs.push(func.clone()); } - fn extract_memory( - &mut self, - store: &mut StoreOpaque, - index: RuntimeMemoryIndex, - export: &CoreExport, - ) { - let memory = match self.data.lookup_export(store, export) { + fn extract_memory(&mut self, store: &mut StoreOpaque, memory: &ExtractMemory) { + let mem = match self.data.lookup_export(store, &memory.export) { wasmtime_runtime::Export::Memory(m) => m, _ => unreachable!(), }; - self.data.state.set_runtime_memory(index, memory.definition); + self.data + .state + .set_runtime_memory(memory.index, mem.definition); } - fn extract_realloc( - &mut self, - store: &mut StoreOpaque, - index: RuntimeReallocIndex, - def: &CoreDef, - ) { - let anyfunc = match self.data.lookup_def(store, def) { + fn extract_realloc(&mut self, store: &mut StoreOpaque, realloc: 
&ExtractRealloc) { + let anyfunc = match self.data.lookup_def(store, &realloc.def) { wasmtime_runtime::Export::Function(f) => f.anyfunc, _ => unreachable!(), }; - self.data.state.set_runtime_realloc(index, anyfunc); + self.data.state.set_runtime_realloc(realloc.index, anyfunc); } fn build_imports<'b>( diff --git a/crates/wasmtime/src/component/linker.rs b/crates/wasmtime/src/component/linker.rs index feb87a791d..fd5397661f 100644 --- a/crates/wasmtime/src/component/linker.rs +++ b/crates/wasmtime/src/component/linker.rs @@ -101,7 +101,7 @@ impl Linker { /// [`Component`] specified with the items defined within this linker. /// /// This method will perform as much work as possible short of actually - /// instnatiating an instance. Internally this will use the names defined + /// instantiating an instance. Internally this will use the names defined /// within this linker to satisfy the imports of the [`Component`] provided. /// Additionally this will perform type-checks against the component's /// imports against all items defined within this linker. @@ -215,7 +215,7 @@ impl LinkerInstance<'_, T> { /// first parameter. /// /// Note that `func` must be an `Fn` and must also be `Send + Sync + - /// 'static`. Shared state within a func is typically accesed with the `T` + /// 'static`. Shared state within a func is typically accessed with the `T` /// type parameter from [`Store`](crate::Store) which is accessible /// through the leading [`StoreContextMut<'_, T>`](crate::StoreContextMut) /// argument which can be provided to the `func` given here. @@ -248,7 +248,7 @@ impl LinkerInstance<'_, T> { self.as_mut().into_instance(name) } - /// Same as [`LinkerInstance::instance`] except with different liftime + /// Same as [`LinkerInstance::instance`] except with different lifetime /// parameters. 
pub fn into_instance(mut self, name: &str) -> Result { let name = self.strings.intern(name); diff --git a/crates/wast/src/spectest.rs b/crates/wast/src/spectest.rs index 1b672ee626..74905d02dc 100644 --- a/crates/wast/src/spectest.rs +++ b/crates/wast/src/spectest.rs @@ -44,3 +44,25 @@ pub fn link_spectest(linker: &mut Linker, store: &mut Store) -> Result< Ok(()) } + +#[cfg(feature = "component-model")] +pub fn link_component_spectest(linker: &mut component::Linker) -> Result<()> { + let engine = linker.engine().clone(); + linker.root().func_wrap("host-return-two", || Ok(2u32))?; + let mut i = linker.instance("host")?; + i.func_wrap("return-three", || Ok(3u32))?; + i.instance("nested")? + .func_wrap("return-four", || Ok(4u32))?; + + let module = Module::new( + &engine, + r#" + (module + (global (export "g") i32 i32.const 100) + (func (export "f") (result i32) i32.const 101) + ) + "#, + )?; + i.module("simple-module", &module)?; + Ok(()) +} diff --git a/crates/wast/src/wast.rs b/crates/wast/src/wast.rs index 0f1203d37d..e318f87bbb 100644 --- a/crates/wast/src/wast.rs +++ b/crates/wast/src/wast.rs @@ -1,4 +1,4 @@ -use crate::spectest::link_spectest; +use crate::spectest::*; use anyhow::{anyhow, bail, Context as _, Result}; use std::fmt::{Display, LowerHex}; use std::path::Path; @@ -125,6 +125,8 @@ impl WastContext { /// Register "spectest" which is used by the spec testsuite. 
pub fn register_spectest(&mut self) -> Result<()> { link_spectest(&mut self.core_linker, &mut self.store)?; + #[cfg(feature = "component-model")] + link_component_spectest(&mut self.component_linker)?; Ok(()) } diff --git a/tests/all/component_model.rs b/tests/all/component_model.rs index 87c39ca80e..3ea3021483 100644 --- a/tests/all/component_model.rs +++ b/tests/all/component_model.rs @@ -4,6 +4,7 @@ use wasmtime::{Config, Engine}; mod func; mod import; +mod nested; // A simple bump allocator which can be used with modules const REALLOC_AND_FREE: &str = r#" diff --git a/tests/all/component_model/nested.rs b/tests/all/component_model/nested.rs new file mode 100644 index 0000000000..acc04a2048 --- /dev/null +++ b/tests/all/component_model/nested.rs @@ -0,0 +1,172 @@ +use super::REALLOC_AND_FREE; +use anyhow::Result; +use wasmtime::component::*; +use wasmtime::{Module, Store, StoreContextMut}; + +#[test] +fn top_level_instance_two_level() -> Result<()> { + let component = r#" +(component + (import "c" (instance $i + (export "c" (instance + (export "m" (core module + (export "g" (global i32)) + )) + )) + )) + (component $c1 + (import "c" (instance $i + (export "c" (instance + (export "m" (core module + (export "g" (global i32)) + )) + )) + )) + (core module $verify + (import "" "g" (global i32)) + (func $start + global.get 0 + i32.const 101 + i32.ne + if unreachable end + ) + + (start $start) + ) + (core instance $m (instantiate (module $i "c" "m"))) + (core instance (instantiate $verify (with "" (instance $m)))) + ) + (instance (instantiate $c1 (with "c" (instance $i)))) +) + "#; + let module = r#" +(module + (global (export "g") i32 i32.const 101) +) + "#; + + let engine = super::engine(); + let module = Module::new(&engine, module)?; + let component = Component::new(&engine, component)?; + let mut store = Store::new(&engine, ()); + let mut linker = Linker::new(&engine); + linker.instance("c")?.instance("c")?.module("m", &module)?; + linker.instantiate(&mut store, 
&component)?; + Ok(()) +} + +#[test] +fn nested_many_instantiations() -> Result<()> { + let component = r#" +(component + (import "count" (func $count)) + (component $c1 + (import "count" (func $count)) + (core func $count_lower (canon lower (func $count))) + (core module $m + (import "" "" (func $count)) + (start $count) + ) + (core instance (instantiate $m (with "" (instance (export "" (func $count_lower)))))) + (core instance (instantiate $m (with "" (instance (export "" (func $count_lower)))))) + ) + (component $c2 + (import "count" (func $count)) + (instance (instantiate $c1 (with "count" (func $count)))) + (instance (instantiate $c1 (with "count" (func $count)))) + ) + (component $c3 + (import "count" (func $count)) + (instance (instantiate $c2 (with "count" (func $count)))) + (instance (instantiate $c2 (with "count" (func $count)))) + ) + (component $c4 + (import "count" (func $count)) + (instance (instantiate $c3 (with "count" (func $count)))) + (instance (instantiate $c3 (with "count" (func $count)))) + ) + + (instance (instantiate $c4 (with "count" (func $count)))) +) + "#; + let engine = super::engine(); + let component = Component::new(&engine, component)?; + let mut store = Store::new(&engine, 0); + let mut linker = Linker::new(&engine); + linker + .root() + .func_wrap("count", |mut store: StoreContextMut<'_, u32>| { + *store.data_mut() += 1; + Ok(()) + })?; + linker.instantiate(&mut store, &component)?; + assert_eq!(*store.data(), 16); + Ok(()) +} + +#[test] +fn thread_options_through_inner() -> Result<()> { + let component = format!( + r#" +(component + (import "hostfn" (func $host (param u32) (result string))) + + (component $c + (import "hostfn" (func $host (param u32) (result string))) + + (core module $libc + (memory (export "memory") 1) + {REALLOC_AND_FREE} + ) + (core instance $libc (instantiate $libc)) + + (core func $host_lower + (canon lower + (func $host) + (memory $libc "memory") + (realloc (func $libc "realloc")) + ) + ) + + (core module 
$m + (import "" "host" (func $host (param i32 i32))) + (import "libc" "memory" (memory 1)) + (func (export "run") (param i32) (result i32) + i32.const 42 + i32.const 100 + call $host + i32.const 100 + ) + (export "memory" (memory 0)) + ) + (core instance $m (instantiate $m + (with "" (instance (export "host" (func $host_lower)))) + (with "libc" (instance $libc)) + )) + + (func (export "run") (param u32) (result string) + (canon lift + (core func $m "run") + (memory $m "memory") + ) + ) + ) + (instance $c (instantiate $c (with "hostfn" (func $host)))) + (export "run" (func $c "run")) +) + "# + ); + let engine = super::engine(); + let component = Component::new(&engine, component)?; + let mut store = Store::new(&engine, 0); + let mut linker = Linker::new(&engine); + linker + .root() + .func_wrap("hostfn", |param: u32| Ok(param.to_string()))?; + let instance = linker.instantiate(&mut store, &component)?; + let result = instance + .get_typed_func::<(u32,), WasmStr, _>(&mut store, "run")? + .call(&mut store, (43,))?; + assert_eq!(result.to_str(&store)?, "42"); + Ok(()) +} diff --git a/tests/misc_testsuite/component-model/instance.wast b/tests/misc_testsuite/component-model/instance.wast index efd239dd24..071523f0ac 100644 --- a/tests/misc_testsuite/component-model/instance.wast +++ b/tests/misc_testsuite/component-model/instance.wast @@ -119,3 +119,119 @@ )) )) ) + +;; indirect references through a synthetic instance +(component + (core module $m + (func (export "a")) + (table (export "b") 1 funcref) + (memory (export "c") 1) + (global (export "d") i32 i32.const 1) + ) + (core instance $i (instantiate $m)) + (core instance $i2 + (export "a1" (func $i "a")) + (export "a2" (table $i "b")) + (export "a3" (memory $i "c")) + (export "a4" (global $i "d")) + ) + + (core module $m2 + (import "" "1" (func $f)) + (import "" "2" (table 1 funcref)) + (import "" "3" (memory 1)) + (import "" "4" (global $g i32)) + ) + (core instance (instantiate $m2 + (with "" (instance + (export "1" 
(func $i2 "a1")) + (export "2" (table $i2 "a2")) + (export "3" (memory $i2 "a3")) + (export "4" (global $i2 "a4")) + )) + )) +) + +(component + (import "host" (instance $i (export "return-three" (func (result u32))))) + + (core module $m + (import "host" "return-three" (func $three (result i32))) + (func $start + call $three + i32.const 3 + i32.ne + if unreachable end + ) + (start $start) + ) + (core func $three_lower + (canon lower (func $i "return-three")) + ) + (core instance (instantiate $m + (with "host" (instance (export "return-three" (func $three_lower)))) + )) +) + +(component + (import "host" (instance $i + (export "nested" (instance + (export "return-four" (func (result u32))) + )) + )) + + (core module $m + (import "host" "return-three" (func $three (result i32))) + (func $start + call $three + i32.const 4 + i32.ne + if unreachable end + ) + (start $start) + ) + (core func $three_lower + (canon lower (func $i "nested" "return-four")) + ) + (core instance (instantiate $m + (with "host" (instance (export "return-three" (func $three_lower)))) + )) +) + +(component + (import "host" (instance $i + (export "simple-module" (core module)) + )) + + (core instance (instantiate (module $i "simple-module"))) +) + +(component + (import "host" (instance $i + (export "simple-module" (core module + (export "f" (func (result i32))) + (export "g" (global i32)) + )) + )) + + (core instance $i (instantiate (module $i "simple-module"))) + (core module $verify + (import "host" "f" (func $f (result i32))) + (import "host" "g" (global $g i32)) + + (func $start + call $f + i32.const 101 + i32.ne + if unreachable end + + global.get $g + i32.const 100 + i32.ne + if unreachable end + ) + (start $start) + ) + + (core instance (instantiate $verify (with "host" (instance $i)))) +) diff --git a/tests/misc_testsuite/component-model/nested.wast b/tests/misc_testsuite/component-model/nested.wast new file mode 100644 index 0000000000..6373c287e3 --- /dev/null +++ 
b/tests/misc_testsuite/component-model/nested.wast @@ -0,0 +1,451 @@ +;; simple nested component +(component + (component) +) + +;; simple nested component with a nested module +(component + (component + (core module) + ) +) + +;; simple instantiation of a nested component +(component + (component $c) + (instance (instantiate $c)) + (instance (instantiate $c + (with "x" (component $c)) + )) +) + +;; instantiate a module during a nested component, and also instantiate it +;; as an export of the nested component +(component + (component $c + (core module $m) + (core instance (instantiate $m)) + (export "m" (core module $m)) + ) + (instance $i (instantiate $c)) + (core instance $i (instantiate (module $i "m"))) +) + +;; instantiate an inner exported module with two different modules and +;; verify imports match +(component + (component $c + (core module $m + (import "" "g" (global $g i32)) + (import "" "f" (func $f (result i32))) + + (func $start + call $f + global.get $g + i32.ne + if unreachable end) + + (start $start) + ) + + (core module $m2 + (global (export "g") i32 i32.const 1) + (func (export "f") (result i32) i32.const 1) + ) + (core instance $i2 (instantiate $m2)) + (core instance (instantiate $m (with "" (instance $i2)))) + + (export "m" (core module $m)) + ) + (instance $i (instantiate $c)) + (core module $m2 + (global (export "g") i32 i32.const 5) + (func (export "f") (result i32) i32.const 5) + ) + (core instance $i2 (instantiate $m2)) + (core instance (instantiate (module $i "m") (with "" (instance $i2)))) +) + +;; instantiate an inner component with a module import +(component + (component $c + (import "m" (core module $m + (export "g" (global i32)) + )) + + (core instance $i (instantiate $m)) + + (core module $verify + (import "" "g" (global $g i32)) + + (func $start + global.get $g + i32.const 2 + i32.ne + if unreachable end + ) + + (start $start) + ) + (core instance (instantiate $verify (with "" (instance $i)))) + ) + + (core module $m + (global 
(export "g") i32 (i32.const 2)) + ) + (instance (instantiate $c (with "m" (core module $m)))) +) + +;; instantiate an inner component with a module import that itself has imports +(component + (component $c + (import "m" (core module $m + (import "" "g" (global i32)) + )) + (core module $m2 + (global (export "g") i32 i32.const 2100) + ) + (core instance $m2 (instantiate $m2)) + (core instance (instantiate $m (with "" (instance $m2)))) + ) + + (core module $verify + (import "" "g" (global $g i32)) + + (func $start + global.get $g + i32.const 2100 + i32.ne + if unreachable end + ) + + (start $start) + ) + (instance (instantiate $c (with "m" (core module $verify)))) +) + +;; instantiate an inner component with an export from the outer component +(component $c + (core module (export "m") + (import "" "g1" (global $g1 i32)) + (import "" "g2" (global $g2 i32)) + + (func $start + global.get $g1 + i32.const 10000 + i32.ne + if unreachable end + + global.get $g2 + i32.const 20000 + i32.ne + if unreachable end + ) + + (start $start) + ) +) + +(component + (import "c" (instance $i + (export "m" (core module + (import "" "g2" (global i32)) + (import "" "g1" (global i32)) + )) + )) + + (component $c + (import "m" (core module $verify + (import "" "g2" (global i32)) + (import "" "g1" (global i32)) + )) + + (core module $m + (global (export "g1") i32 i32.const 10000) + (global (export "g2") i32 i32.const 20000) + ) + (core instance $m (instantiate $m)) + (core instance (instantiate $verify (with "" (instance $m)))) + ) + + (instance (instantiate $c (with "m" (core module $i "m")))) +) + +;; instantiate a reexported module +(component + (core module $m + (global (export "g") i32 i32.const 7) + ) + (component $c + (import "i" (instance $i + (export "m" (core module + (import "" "" (func)) + (export "g" (global i32)) + )) + )) + + (export "m" (core module $i "m")) + ) + + (instance $c (instantiate $c (with "i" (instance (export "m" (core module $m)))))) + (core module $dummy + (func 
(export "")) + ) + (core instance $dummy (instantiate $dummy)) + + (core instance $m (instantiate (module $c "m") (with "" (instance $dummy)))) + + (core module $verify + (import "" "g" (global i32)) + (func $start + global.get 0 + i32.const 7 + i32.ne + if unreachable end + ) + + (start $start) + ) + (core instance (instantiate $verify (with "" (instance $m)))) +) + +;; module must be found through a few layers of imports +(component $c + (core module (export "m") + (global (export "g") i32 i32.const 101) + ) +) + +(component + (import "c" (instance $i + (export "m" (core module + (export "g" (global i32)) + )) + )) + (component $c1 + (import "c" (instance $i + (export "m" (core module + (export "g" (global i32)) + )) + )) + (core module $verify + (import "" "g" (global i32)) + (func $start + global.get 0 + i32.const 101 + i32.ne + if unreachable end + ) + + (start $start) + ) + (core instance $m (instantiate (module $i "m"))) + (core instance (instantiate $verify (with "" (instance $m)))) + ) + (instance (instantiate $c1 (with "c" (instance $i)))) +) + +;; instantiate outer alias to self +(component $C + (core module $m) + (alias outer $C $m (core module $other_m)) + (core instance (instantiate $other_m)) +) + +(component $C + (component $m) + (alias outer $C $m (component $other_m)) + (instance (instantiate $other_m)) +) + + +;; closing over an outer alias which is actually an argument to some +;; instantiation +(component + (component $c + (import "c" (core module $c + (export "a" (global i32)) + )) + + (component (export "c") + (export "m" (core module $c)) + ) + ) + + (core module $m1 (global (export "a") i32 i32.const 1)) + (core module $m2 (global (export "a") i32 i32.const 2)) + + (instance $c1 (instantiate $c (with "c" (core module $m1)))) + (instance $c2 (instantiate $c (with "c" (core module $m2)))) + + (instance $m1_container (instantiate (component $c1 "c"))) + (instance $m2_container (instantiate (component $c2 "c"))) + + (core instance $core1 
(instantiate (module $m1_container "m"))) + (core instance $core2 (instantiate (module $m2_container "m"))) + + (core module $verify + (import "core1" "a" (global $a i32)) + (import "core2" "a" (global $b i32)) + + (func $start + global.get $a + i32.const 1 + i32.ne + if unreachable end + + global.get $b + i32.const 2 + i32.ne + if unreachable end + ) + + (start $start) + ) + (core instance (instantiate $verify + (with "core1" (instance $core1)) + (with "core2" (instance $core2)) + )) +) + +;; simple importing of a component +(component + (component $C) + (component $other + (import "x" (component $c)) + (instance (instantiate $c)) + ) + (instance (instantiate $other (with "x" (component $C)))) +) + +;; deep nesting +(component $C + (core module $m + (global (export "g") i32 (i32.const 1)) + ) + (component $c + (core module (export "m") + (global (export "g") i32 (i32.const 2)) + ) + ) + + (component $c1 + (component $c2 (export "") + (component $c3 (export "") + (alias outer $C $m (core module $my_module)) + (alias outer $C $c (component $my_component)) + + (export "m" (core module $my_module)) + (export "c" (component $my_component)) + ) + ) + ) + + (instance $i1 (instantiate $c1)) + (instance $i2 (instantiate (component $i1 ""))) + (instance $i3 (instantiate (component $i2 ""))) + + (core instance $m1 (instantiate (module $i3 "m"))) + (instance $c (instantiate (component $i3 "c"))) + (core instance $m2 (instantiate (module $c "m"))) + + (core module $verify + (import "m1" "g" (global $m1 i32)) + (import "m2" "g" (global $m2 i32)) + + (func $start + global.get $m1 + i32.const 1 + i32.ne + if unreachable end + + global.get $m2 + i32.const 2 + i32.ne + if unreachable end + ) + (start $start) + ) + (core instance (instantiate $verify (with "m1" (instance $m1)) (with "m2" (instance $m2)))) +) + +;; Try threading through component instantiation arguments as various forms of +;; instances. 
+(component + (component $c + (core module $m (export "m")) + (component $c (export "c") + (core module (export "m")) + ) + (instance $i (instantiate $c)) + (instance $i2 + (export "m" (core module $m)) + (export "c" (component $c)) + (export "i" (instance $i)) + ) + (export "i" (instance $i)) + (export "i2" (instance $i2)) + ) + (instance $i (instantiate $c)) + + (component $another + (import "host" (instance + (export "m" (core module)) + (export "c" (component)) + (export "i" (instance)) + )) + ) + (instance (instantiate $another (with "host" (instance $i)))) + (instance (instantiate $another (with "host" (instance $i "i2")))) + + (instance $reexport + (export "c" (component $i "c")) + (export "m" (core module $i "m")) + (export "i" (instance $i "i")) + ) + (instance (instantiate $another (with "host" (instance $reexport)))) +) + +;; thread host functions around +(component + (import "host-return-two" (func $import (result u32))) + + ;; thread the host function through an instance + (component $c + (import "" (func $f (result u32))) + (export "f" (func $f)) + ) + (instance $c (instantiate $c (with "" (func $import)))) + (alias export $c "f" (func $import2)) + + ;; thread the host function into a nested component + (component $c2 + (import "host" (instance $i (export "return-two" (func (result u32))))) + + (core module $m + (import "host" "return-two" (func $host (result i32))) + (func $start + call $host + i32.const 2 + i32.ne + if unreachable end + ) + (start $start) + ) + + (core func $return_two + (canon lower (func $i "return-two")) + ) + (core instance (instantiate $m + (with "host" (instance + (export "return-two" (func $return_two)) + )) + )) + ) + + (instance (instantiate $c2 + (with "host" (instance + (export "return-two" (func $import2)) + )) + )) +) diff --git a/tests/misc_testsuite/component-model/simple.wast b/tests/misc_testsuite/component-model/simple.wast index af65e7b1d1..7901fb87f4 100644 --- a/tests/misc_testsuite/component-model/simple.wast 
+++ b/tests/misc_testsuite/component-model/simple.wast @@ -20,3 +20,15 @@ (func (export "d") (result f64) f64.const 0) ) ) + +(assert_invalid + (component + (import "" (component)) + ) + "root-level component imports are not supported") + +(assert_invalid + (component + (component (export "")) + ) + "exporting a component from the root component is not supported")