Add dataflow processing to component translation for imports (#4205)

This commit enhances the processing of components to track all the
dataflow for the processing of `canon.lower`'d functions. At the same
time this fills out a few other missing details to component processing
such as aliasing from some kinds of component instances and similar.

The major changes contained within this are the updates to the `info`
submodule which has the AST of component type information. This has been
significantly refactored to prepare for representing lowered functions
and implementing those. The major change is from an `Instantiation` list
to an `Initializer` list which abstractly represents a few other
initialization actions.
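
As a rough illustration of the `Initializer` list idea, here is a minimal
sketch using simplified stand-in types. The plain `u32` indices, the reduced
set of variants, the `State` struct, and the `run` function are all
assumptions made for this example and are not part of this commit. It only
shows in-order processing and the "index `n` is the `n`th instruction of its
kind" invariant, not real instantiation:

```rust
// Simplified stand-ins for the types in the `info` submodule. The real
// definitions use typed indices and carry more data.
#[derive(Debug, Clone, PartialEq)]
enum CoreDef {
    // An export of a previously instantiated core wasm instance.
    Export { instance: u32 },
    // The nth lowered host function.
    Lowered(u32),
}

#[derive(Debug)]
enum Initializer {
    InstantiateModule { instance: u32, args: Vec<CoreDef> },
    LowerImport { index: u32 },
    ExtractMemory { index: u32, from_instance: u32 },
}

// Hypothetical bookkeeping standing in for real runtime state.
#[derive(Default, Debug)]
struct State {
    instances: u32,
    lowerings: u32,
    // For each extracted memory, the instance it came from.
    memories: Vec<u32>,
}

// Walk the initializers front-to-back, checking that every reference
// points at something created by an earlier initializer.
fn run(initializers: &[Initializer]) -> State {
    let mut state = State::default();
    for init in initializers {
        match init {
            Initializer::InstantiateModule { instance, args } => {
                for arg in args {
                    match arg {
                        CoreDef::Export { instance: i } => assert!(*i < state.instances),
                        CoreDef::Lowered(i) => assert!(*i < state.lowerings),
                    }
                }
                // The nth `InstantiateModule` creates instance n.
                assert_eq!(*instance, state.instances);
                state.instances += 1;
            }
            Initializer::LowerImport { index } => {
                assert_eq!(*index, state.lowerings);
                state.lowerings += 1;
            }
            Initializer::ExtractMemory { index, from_instance } => {
                assert!(*from_instance < state.instances);
                assert_eq!(*index as usize, state.memories.len());
                state.memories.push(*from_instance);
            }
        }
    }
    state
}
```

In this sketch a lowering has to appear in the list before any
instantiation that takes it as an argument, which is the kind of ordering
the flattened list is meant to make explicit.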

This work is split off from my main work to implement component imports
of host functions. This is incomplete in the sense that it doesn't
actually finish everything necessary to define host functions and import
them into components. Instead this is only the changes necessary at the
translation layer (so far). Consequently this commit does not have tests
and notably doesn't include the `VMComponentContext`
initialization and usage. The full body of work is still a bit too messy
to PR just yet so I'm hoping that this is a slimmed-down-enough piece to
adequately be reviewed.
This commit is contained in:
Alex Crichton
2022-06-01 16:27:49 -05:00
committed by GitHub
parent f638b390b6
commit 0cf0230432
5 changed files with 952 additions and 324 deletions

@@ -20,7 +20,7 @@
// everything except imported core wasm modules.
use crate::component::*;
use crate::{EntityIndex, PrimaryMap};
use crate::{EntityIndex, PrimaryMap, SignatureIndex};
use indexmap::IndexMap;
use serde::{Deserialize, Serialize};
@@ -35,46 +35,228 @@ use serde::{Deserialize, Serialize};
/// this is going to undergo a lot of churn.
#[derive(Default, Debug, Serialize, Deserialize)]
pub struct Component {
/// A list of typed values that this component imports, indexed by either
/// the import's position or the name of the import.
pub imports: IndexMap<String, TypeDef>,
/// A list of typed values that this component imports.
///
/// Note that each name is given an `ImportIndex` here for the next map to
/// refer back to.
pub import_types: PrimaryMap<ImportIndex, (String, TypeDef)>,
/// A list of "flattened" imports that are used by this instance.
///
/// This import map represents extracting imports, as necessary, from the
/// general imported types by this component. The flattening here refers to
/// extracting items from instances. Currently the flat imports are either a
/// host function or a core wasm module.
///
/// For example if `ImportIndex(0)` pointed to an instance then this import
/// map represents extracting names from that map, for example extracting an
/// exported module or an exported function.
///
/// Each import item is keyed by a `RuntimeImportIndex` which is referred to
/// by types below whenever something refers to an import. The value for
/// each `RuntimeImportIndex` in this map is the `ImportIndex` for where
/// this item comes from (which can be associated with a name above in the
/// `import_types` array) as well as the list of export names if
/// `ImportIndex` refers to an instance. The export names array represents
/// recursively fetching names within an instance.
//
// TODO: this is probably a lot of `String` storage and may be something
// that needs optimization in the future. For example instead of lots of
// different `String` allocations this could instead be a pointer/length
// into one large string allocation for the entire component. Alternatively
// strings could otherwise be globally intern'd via some other mechanism to
// avoid `Linker`-specific intern-ing plus intern-ing here. Unsure what the
// best route is or whether such an optimization is even necessary here.
pub imports: PrimaryMap<RuntimeImportIndex, (ImportIndex, Vec<String>)>,
/// A list of this component's exports, indexed by either position or name.
pub exports: IndexMap<String, Export>,
/// The list of instances that this component creates during instantiation.
/// Initializers that must be processed when instantiating this component.
///
/// Note that this is flattened/resolved from the original component to
/// the point where alias annotations and such are not required. Instead
/// the list of arguments to instantiate each module is provided as exports
/// of prior instantiations.
pub instances: PrimaryMap<RuntimeInstanceIndex, Instantiation>,
/// This list of initializers does not correspond directly to the component
/// itself. The general goal with this is that the recursive nature of
/// components is "flattened" with an array like this which is a linear
/// sequence of instructions of how to instantiate a component. This will
/// have instantiations, for example, in addition to entries which
/// initialize `VMComponentContext` fields with previously instantiated
/// instances.
///
/// NB: at this time recursive components are not supported, and that may
/// change this somewhat significantly.
pub initializers: Vec<Initializer>,
/// The number of runtime instances (maximum `RuntimeInstanceIndex`) created
/// when instantiating this component.
pub num_runtime_instances: u32,
/// The number of runtime memories (maximum `RuntimeMemoryIndex`) needed to
/// instantiate this component.
///
/// Note that this many memories will be stored in the `VMComponentContext`
/// and each memory is intended to be unique (e.g. the same memory isn't
/// stored in two different locations).
pub num_runtime_memories: u32,
/// The number of runtime reallocs (maximum `RuntimeReallocIndex`) needed to
/// instantiate this component.
///
/// Note that this many function pointers will be stored in the
/// `VMComponentContext`.
pub num_runtime_reallocs: u32,
/// The number of lowered host functions (maximum `LoweredIndex`) needed to
/// instantiate this component.
pub num_lowerings: u32,
}
/// Different ways to instantiate a module at runtime.
/// Initializer instructions to get processed when instantiating a component
///
/// The variants of this enum are processed during the instantiation phase of
/// a component in-order from front-to-back. These are otherwise emitted as a
/// component is parsed and read and translated.
///
/// NB: at this time recursive components are not supported, and that may
/// change this somewhat significantly.
///
//
// FIXME(#2639) if processing this list is ever a bottleneck we could
// theoretically use cranelift to compile an initialization function which
// performs all of these duties for us and skips the overhead of interpreting
// all of these instructions.
#[derive(Debug, Serialize, Deserialize)]
pub enum Instantiation {
/// A module "upvar" is being instantiated which is a closed-over module
/// that is known at runtime by index.
ModuleUpvar {
/// The module index which is being instantiated.
module: ModuleUpvarIndex,
/// The flat list of arguments to the module's instantiation.
args: Box<[CoreExport<EntityIndex>]>,
pub enum Initializer {
/// A core wasm module is being instantiated.
///
/// This will result in a new core wasm instance being created, which may
/// involve running the `start` function of the instance as well if it's
/// specified. This largely delegates to the same standard instantiation
/// process as the rest of the core wasm machinery already uses.
InstantiateModule {
/// The index of the instance that's being created.
///
/// This is guaranteed to be the `n`th `InstantiateModule` instruction
/// if the index is `n`.
instance: RuntimeInstanceIndex,
/// The module that's being instantiated, either an "upvar" or an
/// imported module.
module: ModuleToInstantiate,
/// The arguments to instantiation and where they're loaded from.
///
/// Note that this is a flat list. For "upvars" this list is sorted by
/// the actual concrete imports needed by the upvar so the items can be
/// passed directly to instantiation. For imports this list is sorted
/// by the order of the import names on the type of the module
/// declaration in this component.
///
/// Each argument is a `CoreDef` which represents that it's either, at
/// this time, a lowered imported function or a core wasm item from
/// another previously instantiated instance.
args: Box<[CoreDef]>,
},
/// A module import is being instantiated.
/// A host function is being lowered, creating a core wasm function.
///
/// NB: this is not implemented in the runtime yet so this is a little less
/// fleshed out than the above case. For example it's not entirely clear how
/// the import will be referred to here (here a `usize` is used but unsure
/// if that will work out).
ModuleImport {
/// Which module import is being instantiated.
import_index: usize,
/// The flat list of arguments to the module's instantiation.
args: Box<[CoreExport<EntityIndex>]>,
/// This initializer entry is intended to be used to fill out the
/// `VMComponentContext` and information about this lowering such as the
/// cranelift-compiled trampoline function pointer, the host function
/// pointer the trampoline calls, and the canonical ABI options.
LowerImport(LowerImport),
/// A core wasm linear memory is going to be saved into the
/// `VMComponentContext`.
///
/// This instruction indicates that the `index`th core wasm linear memory
/// needs to be extracted from the `export` specified, a pointer to a
/// previously created module instance, and stored into the
/// `VMComponentContext` at the `index` specified. This lowering is then
/// used in the future by pointers from `CanonicalOptions`.
ExtractMemory {
/// The index of the memory we're storing.
///
/// This is guaranteed to be the `n`th `ExtractMemory` instruction
/// if the index is `n`.
index: RuntimeMemoryIndex,
/// The source of the memory that is stored.
export: CoreExport<MemoryIndex>,
},
/// Same as `ExtractMemory`, except it's extracting a function pointer to be
/// used as a `realloc` function.
ExtractRealloc {
/// The index of the realloc function we're storing.
///
/// This is guaranteed to be the `n`th `ExtractRealloc` instruction
/// if the index is `n`.
index: RuntimeReallocIndex,
/// The source of the function pointer that is stored.
def: CoreDef,
},
}
/// Indicator used to refer to what module is being instantiated when
/// `Initializer::InstantiateModule` is used.
#[derive(Debug, Serialize, Deserialize)]
pub enum ModuleToInstantiate {
/// An "upvar", or a module defined within a component, is being used.
///
/// The index here is correlated with the `Translation::upvars` map that's
/// created during translation of a component.
Upvar(ModuleUpvarIndex),
/// An imported core wasm module is being instantiated.
///
/// It's guaranteed that this `RuntimeImportIndex` points to a module.
Import(RuntimeImportIndex),
}
/// Description of a lowered import used in conjunction with
/// `Initializer::LowerImport`.
#[derive(Debug, Serialize, Deserialize)]
pub struct LowerImport {
/// The index of the lowered function that's being created.
///
/// This is guaranteed to be the `n`th `LowerImport` instruction
/// if the index is `n`.
pub index: LoweredIndex,
/// The index of the imported host function that is being lowered.
///
/// It's guaranteed that this `RuntimeImportIndex` points to a function.
pub import: RuntimeImportIndex,
/// The core wasm signature of the function that's being created.
pub canonical_abi: SignatureIndex,
/// The canonical ABI options used when lowering this function specified in
/// the original component.
pub options: CanonicalOptions,
}
/// Definition of a core wasm item and where it can come from within a
/// component.
///
/// Note that this is sort of a result of data-flow-like analysis on a component
/// during compile time of the component itself. References to core wasm items
/// are "compiled" to either referring to a previous instance or to some sort of
/// lowered host import.
#[derive(Debug, Clone, Serialize, Deserialize, Hash, Eq, PartialEq)]
pub enum CoreDef {
/// This item refers to an export of a previously instantiated core wasm
/// instance.
Export(CoreExport<EntityIndex>),
/// This item is a core wasm function with the index specified here. Note
/// that this `LoweredIndex` corresponds to the nth
/// `Initializer::LowerImport` instruction.
Lowered(LoweredIndex),
}
impl From<CoreExport<EntityIndex>> for CoreDef {
fn from(export: CoreExport<EntityIndex>) -> CoreDef {
CoreDef::Export(export)
}
}
/// Identifier of an exported item from a core WebAssembly module instance.
@@ -82,7 +264,7 @@ pub enum Instantiation {
/// Note that the `T` here is the index type for exports which can be
/// identified by index. The `T` is monomorphized with types like
/// [`EntityIndex`] or [`FuncIndex`].
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, Hash, Eq, PartialEq)]
pub struct CoreExport<T> {
/// The instance that this item is located within.
///
@@ -96,7 +278,7 @@ pub struct CoreExport<T> {
}
/// An index at which to find an item within a runtime instance.
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, Hash, Eq, PartialEq)]
pub enum ExportItem<T> {
/// An exact index that the target can be found at.
///
@@ -119,65 +301,50 @@ pub enum ExportItem<T> {
pub enum Export {
/// A lifted function being exported which is an adaptation of a core wasm
/// function.
LiftedFunction(LiftedFunction),
LiftedFunction {
/// The component function type of the function being created.
ty: FuncTypeIndex,
/// Which core WebAssembly export is being lifted.
func: CoreExport<FuncIndex>,
/// Any options, if present, associated with this lifting.
options: CanonicalOptions,
},
}
/// Description of a lifted function.
///
/// This represents how a function was lifted, what options were used to lift
/// it, and how it's all processed at runtime.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LiftedFunction {
/// The component function type of the function being created.
pub ty: FuncTypeIndex,
/// Which core WebAssembly export is being lifted.
pub func: CoreExport<FuncIndex>,
/// Any options, if present, associated with this lifting.
pub options: CanonicalOptions,
}
/// Canonical ABI options associated with a lifted function.
/// Canonical ABI options associated with a lifted or lowered function.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CanonicalOptions {
/// The encoding used for strings.
pub string_encoding: StringEncoding,
/// Representation of the `into` option where intrinsics are peeled out and
/// identified from an instance.
pub intrinsics: Option<Intrinsics>,
/// The memory used by these options, if specified.
pub memory: Option<RuntimeMemoryIndex>,
/// The realloc function used by these options, if specified.
pub realloc: Option<RuntimeReallocIndex>,
// TODO: need to represent post-return here as well
}
impl Default for CanonicalOptions {
fn default() -> CanonicalOptions {
CanonicalOptions {
string_encoding: StringEncoding::Utf8,
intrinsics: None,
memory: None,
realloc: None,
}
}
}
/// Possible encodings of strings within the component model.
//
// Note that the `repr(u8)` is load-bearing here since this is used in an
// `extern "C" fn()` function argument which is called from cranelift-compiled
// code so we must know the representation of this.
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
#[allow(missing_docs)]
#[repr(u8)]
pub enum StringEncoding {
Utf8,
Utf16,
CompactUtf16,
}
/// Intrinsics required with the `(into $instance)` option specified in
/// `canon.lift`.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Intrinsics {
/// The linear memory that the module exports which we're reading/writing
/// from.
pub memory: CoreExport<MemoryIndex>,
/// A memory allocation, and reallocation, function.
pub canonical_abi_realloc: CoreExport<FuncIndex>,
/// A memory deallocation function.
///
/// NB: this will probably be replaced with a per-export-destructor rather
/// than a general memory deallocation function.
pub canonical_abi_free: CoreExport<FuncIndex>,
}
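
The flattened `imports` map documented in `Component` above pairs each
`RuntimeImportIndex` with an `ImportIndex` and a list of export names to
fetch recursively. A minimal sketch of how such an entry might be resolved
is shown here. The `HostItem` enum, the `resolve` function, and the use of
`usize` in place of the typed indices are assumptions for this example only
and do not reflect how the runtime implements it:

```rust
use std::collections::HashMap;

// Hypothetical host-side value supplied for a component import.
#[derive(Debug, PartialEq)]
enum HostItem {
    Func(&'static str),
    Module(&'static str),
    Instance(HashMap<String, HostItem>),
}

// Resolve one flattened import: start at the top-level import named by
// the index and follow the export-name path through nested instances.
// Returns `None` if the index is out of range, a name is missing, or the
// path tries to look inside something that is not an instance.
fn resolve<'a>(
    top_level: &'a [(String, HostItem)],
    import: &(usize, Vec<String>),
) -> Option<&'a HostItem> {
    let (index, path) = import;
    let mut cur = &top_level.get(*index)?.1;
    for name in path {
        match cur {
            HostItem::Instance(exports) => cur = exports.get(name)?,
            _ => return None,
        }
    }
    Some(cur)
}
```

An empty path refers to the top-level import itself, which matches the
description that the export names only matter when the `ImportIndex` refers
to an instance.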