Implement strings in adapter modules (#4623)

* Implement strings in adapter modules

This commit is a hefty addition to Wasmtime's support for the component
model. This implements the final remaining type (in the current type
hierarchy) unimplemented in adapter module trampolines: strings. Strings
are the most complicated type to implement in adapter trampolines
because they are highly structured chunks of data in memory (according
to specific encodings). Additionally, each lift/lower operation can
choose its own string encoding, meaning that Wasmtime, the host, may
have to convert between any ordered pair of string encodings.

The `CanonicalABI.md` in the component-model repo in general specifies
all the fiddly bits of string encoding so there's not a ton of wiggle
room for Wasmtime to get creative. This PR largely "just" implements
that. The high-level architecture of this implementation is:

* Fused adapters are first identified to determine src/dst string
  encodings. This statically fixes what transcoding operation is being
  performed.

* The generated adapter will be responsible for managing calls to
  `realloc` and performing bounds checks. The adapter itself does not
  perform memory copies or validation of string contents, however.
  Instead each transcoding operation is modeled as an imported function
  into the adapter module. This means that the set of string
  transcoders an adapter module needs is determined while the module is
  generated at compile time. Note that an imported transcoder is
  parameterized not only over the transcoding operation but also over
  which memory is the source and which is the destination.

* The imported core wasm functions are modeled as a new
  `CoreDef::Transcoder` structure. These transcoders end up being small
  Cranelift-compiled trampolines. The Cranelift-compiled trampoline will
  load the actual base pointer of memory and add it to the relative
  pointers passed as function arguments. This trampoline then calls a
  transcoder "libcall" which enters Rust-defined functions for actual
  transcoding operations.

* Each possible transcoding operation is implemented in Rust with a
  unique name and a unique signature depending on the needs of the
  transcoder. I've tried to document inline what each transcoder does.
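
The layering above can be sketched in plain Rust. This is only an
illustration under stated assumptions, not the code in this PR: `Op`,
`TranscoderImport`, `AdapterImports`, `copy_libcall` and
`copy_trampoline` are hypothetical names, linear memories are modeled as
byte slices, and "adding the base pointer" is modeled as slicing.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the statically chosen transcoding operation.
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
enum Op {
    Copy,
    Utf8ToUtf16,
}

/// Hypothetical import key: the operation plus the source and destination
/// memories, since a transcoder is parameterized over all three.
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
struct TranscoderImport {
    op: Op,
    from_memory: u32,
    to_memory: u32,
}

/// Deduplicating list of transcoder imports for one adapter module.
#[derive(Default)]
struct AdapterImports {
    indices: HashMap<TranscoderImport, usize>,
    imports: Vec<TranscoderImport>,
}

impl AdapterImports {
    /// Returns the import index for `key`, adding a new import only the
    /// first time a given (op, from, to) combination is requested.
    fn import_transcoder(&mut self, key: TranscoderImport) -> usize {
        if let Some(&index) = self.indices.get(&key) {
            return index;
        }
        let index = self.imports.len();
        self.imports.push(key);
        self.indices.insert(key, index);
        index
    }
}

/// Stand-in for a host "libcall": it sees only resolved buffers and
/// assumes the caller has already bounds-checked them.
fn copy_libcall(src: &[u8], dst: &mut [u8]) {
    dst.copy_from_slice(src);
}

/// Stand-in for the trampoline: it turns memory-relative offsets into
/// concrete buffers and then defers to the libcall.
fn copy_trampoline(from: &[u8], to: &mut [u8], src: usize, len: usize, dst: usize) {
    copy_libcall(&from[src..src + len], &mut to[dst..dst + len]);
}
```

Requesting the same operation between the same two memories twice yields
the same import index, while swapping source and destination yields a
new one.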

This means that `Module::translate_string` in adapter modules is by
far the largest translation method. The main reason is the management
required to call the imported transcoder functions while validating
string pointers/lengths and performing the dance of
`realloc`-vs-transcode at the right time. I've tried to ensure that each
individual case in transcoding is documented well enough to understand
what's going on as well.
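
As a rough illustration of the pointer/length validation an adapter has
to perform before calling a transcoder, here is a minimal sketch. The
function `validate_string` and its parameters are hypothetical, and the
alignment check reflects my reading of the canonical ABI rather than the
code in this PR.

```rust
/// Checks that a string of `len` code units, each `unit_size` bytes wide
/// (assumed to be 1 or 2), starting at `ptr`, lies entirely within a
/// memory of `memory_size` bytes. Returns the byte length on success.
fn validate_string(ptr: u64, len: u64, unit_size: u64, memory_size: u64) -> Option<u64> {
    // Assumption: the pointer must be aligned to the code unit size.
    if ptr % unit_size != 0 {
        return None;
    }
    // Both the multiplication and the addition can overflow, so use
    // checked arithmetic instead of wrapping.
    let byte_len = len.checked_mul(unit_size)?;
    let end = ptr.checked_add(byte_len)?;
    // A zero-length string is still rejected when `ptr` itself is past
    // the end of memory.
    if end > memory_size {
        return None;
    }
    Some(byte_len)
}
```

With a 4-byte memory, `validate_string(4, 0, 1, 4)` is accepted (an
empty string at the very end) while `validate_string(5, 0, 1, 4)` is
rejected even though no bytes would be read.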

Additionally in this PR is a full implementation in the host for the
`latin1+utf16` encoding which means that both lifting and lowering host
strings now works with this encoding.
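
For readers unfamiliar with `latin1+utf16`, the idea can be sketched as
follows. This is a simplified model, not the host implementation from
this PR: the function names are hypothetical, and the use of the high
bit of the length as a UTF-16 tag is an assumption based on my reading
of `CanonicalABI.md`.

```rust
/// Assumed tag bit: when set on the length, the payload is UTF-16 and the
/// remaining bits count 16-bit code units; otherwise the payload is
/// Latin-1 and the length counts bytes.
const UTF16_TAG: u32 = 1 << 31;

/// Encodes `s` as Latin-1 when every character fits in one byte, and as
/// little-endian UTF-16 otherwise. Returns the bytes and the tagged length.
fn lower_latin1_utf16(s: &str) -> (Vec<u8>, u32) {
    if s.chars().all(|c| (c as u32) <= 0xFF) {
        let bytes: Vec<u8> = s.chars().map(|c| c as u8).collect();
        let len = bytes.len() as u32;
        (bytes, len)
    } else {
        let units: Vec<u16> = s.encode_utf16().collect();
        let len = units.len() as u32 | UTF16_TAG;
        let bytes = units.iter().flat_map(|u| u.to_le_bytes()).collect();
        (bytes, len)
    }
}

/// Decodes a buffer produced by `lower_latin1_utf16`. Returns `None` if
/// the buffer is too short or the UTF-16 payload is invalid.
fn lift_latin1_utf16(bytes: &[u8], tagged_len: u32) -> Option<String> {
    if tagged_len & UTF16_TAG == 0 {
        let len = tagged_len as usize;
        // Every Latin-1 byte maps directly to the same Unicode scalar.
        Some(bytes.get(..len)?.iter().map(|&b| b as char).collect())
    } else {
        let len = (tagged_len & !UTF16_TAG) as usize;
        let units: Vec<u16> = bytes
            .get(..len * 2)?
            .chunks_exact(2)
            .map(|c| u16::from_le_bytes([c[0], c[1]]))
            .collect();
        String::from_utf16(&units).ok()
    }
}
```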

Currently the implementation of each transcoder function is likely far
from optimal. Where possible I've leaned on the standard library itself
and for latin1-related things I'm leaning on the `encoding_rs` crate. I
initially tried to implement everything with `encoding_rs` but was
unable to uniformly do so easily. For now I settled on trying to get a
known-correct (even in the face of endianness) implementation for all of
these transcoders. If and when performance becomes an issue it should be
possible to implement more optimized versions of each of these
transcoding operations.
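
To give a flavor of a correctness-first transcoder that leans on the
standard library, here is a minimal sketch of UTF-8 to UTF-16. The
function name `utf8_to_utf16_le` is hypothetical and this is not the
PR's implementation. It validates the input and writes each code unit
explicitly as little-endian bytes so the result does not depend on the
host's endianness.

```rust
use std::str::Utf8Error;

/// Validates `src` as UTF-8 and returns its UTF-16 encoding as
/// little-endian bytes, regardless of host endianness.
fn utf8_to_utf16_le(src: &[u8]) -> Result<Vec<u8>, Utf8Error> {
    let s = std::str::from_utf8(src)?;
    // UTF-16 never needs more than two bytes per byte of UTF-8 input.
    let mut out = Vec::with_capacity(src.len() * 2);
    for unit in s.encode_utf16() {
        out.extend_from_slice(&unit.to_le_bytes());
    }
    Ok(out)
}
```

A faster version could avoid the intermediate validation pass, but this
form is easy to check against the specification.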

Testing this commit has been somewhat difficult and my general plan,
like with the `(list T)` type, is to rely heavily on fuzzing to cover
the various cases here. In this PR though I've added a simple test that
pushes some statically known strings through all the pairs of encodings
between source and destination. I've attempted to pick "interesting"
strings that one way or another stress the various paths in each
transcoding operation to ideally get full branch coverage there.
Additionally, a suite of "negative" tests has been added to ensure
that the validity of each encoding is actually checked.
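
The shape of these tests can be sketched roughly as below. This is a
hypothetical outline, not the test added in this PR: `StringEncoding`
mirrors the variant names that appear in the diff, while `all_pairs` and
`utf16_le_to_string` are made-up helpers for illustration.

```rust
/// Local copy of the encoding names used in the diff, for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StringEncoding {
    Utf8,
    Utf16,
    CompactUtf16,
}

const ALL: [StringEncoding; 3] = [
    StringEncoding::Utf8,
    StringEncoding::Utf16,
    StringEncoding::CompactUtf16,
];

/// Every ordered (source, destination) pair, including same-encoding pairs.
fn all_pairs() -> Vec<(StringEncoding, StringEncoding)> {
    let mut pairs = Vec::new();
    for &src in ALL.iter() {
        for &dst in ALL.iter() {
            pairs.push((src, dst));
        }
    }
    pairs
}

/// A "negative" check: decoding little-endian UTF-16 fails on an odd byte
/// count or on an unpaired surrogate.
fn utf16_le_to_string(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units = bytes
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]));
    std::char::decode_utf16(units)
        .collect::<Result<String, _>>()
        .ok()
}
```

A positive test would push each sample string through every pair from
`all_pairs`, and a negative test would feed inputs such as a lone high
surrogate and assert that decoding is rejected.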

* Fix a temporarily commented out case

* Fix wasmtime-runtime tests

* Update deny.toml configuration

* Add `BSD-3-Clause` for the `encoding_rs` crate
* Remove some unused licenses

* Add an exemption for `encoding_rs` for now

* Split up the `translate_string` method

Move out all the closures and package up captured state into smaller
lists of arguments.

* Test out-of-bounds for zero-length strings
This commit is contained in:
Alex Crichton
2022-08-08 11:01:57 -05:00
committed by GitHub
parent e6d339b6ac
commit 650979ae40
33 changed files with 3239 additions and 190 deletions

View File

@@ -1,5 +1,6 @@
use crate::component::{
Component, ComponentTypes, LowerImport, LoweredIndex, RuntimeAlwaysTrapIndex,
RuntimeTranscoderIndex, Transcoder,
};
use crate::{PrimaryMap, SignatureIndex, Trampoline, WasmFuncType};
use anyhow::Result;
@@ -61,6 +62,24 @@ pub trait ComponentCompiler: Send + Sync {
/// `canon lift`'d function immediately being `canon lower`'d.
fn compile_always_trap(&self, ty: &WasmFuncType) -> Result<Box<dyn Any + Send>>;
/// Compiles a trampoline to implement string transcoding from adapter
/// modules.
///
/// The generated trampoline will invoke the `transcoder.op` libcall with
/// the various memory configuration provided in `transcoder`. This is used
/// to pass raw pointers to host functions to avoid the host having to deal
/// with base pointers, offsets, memory32-vs-64, etc.
///
/// Note that all bounds checks for memories are present in adapters
/// themselves, and the host libcalls simply assume that the pointers are
/// valid.
fn compile_transcoder(
&self,
component: &Component,
transcoder: &Transcoder,
types: &ComponentTypes,
) -> Result<Box<dyn Any + Send>>;
/// Emits the `lowerings` and `trampolines` specified into the in-progress
/// ELF object specified by `obj`.
///
@@ -73,11 +92,13 @@ pub trait ComponentCompiler: Send + Sync {
&self,
lowerings: PrimaryMap<LoweredIndex, Box<dyn Any + Send>>,
always_trap: PrimaryMap<RuntimeAlwaysTrapIndex, Box<dyn Any + Send>>,
transcoders: PrimaryMap<RuntimeTranscoderIndex, Box<dyn Any + Send>>,
tramplines: Vec<(SignatureIndex, Box<dyn Any + Send>)>,
obj: &mut Object<'static>,
) -> Result<(
PrimaryMap<LoweredIndex, FunctionInfo>,
PrimaryMap<RuntimeAlwaysTrapIndex, AlwaysTrapInfo>,
PrimaryMap<RuntimeTranscoderIndex, FunctionInfo>,
Vec<Trampoline>,
)>;
}

View File

@@ -71,6 +71,9 @@ pub struct ComponentDfg {
/// out of the inlining pass of translation.
pub adapters: Intern<AdapterId, Adapter>,
/// Metadata about string transcoders needed by adapter modules.
pub transcoders: Intern<TranscoderId, Transcoder>,
/// Metadata about all known core wasm instances created.
///
/// This is mostly an ordered list and is not deduplicated based on contents
@@ -125,6 +128,7 @@ id! {
pub struct PostReturnId(u32);
pub struct AlwaysTrapId(u32);
pub struct AdapterModuleId(u32);
pub struct TranscoderId(u32);
}
/// Same as `info::InstantiateModule`
@@ -158,6 +162,7 @@ pub enum CoreDef {
Lowered(LowerImportId),
AlwaysTrap(AlwaysTrapId),
InstanceFlags(RuntimeComponentInstanceIndex),
Transcoder(TranscoderId),
/// This is a special variant not present in `info::CoreDef` which
/// represents that this definition refers to a fused adapter function. This
@@ -220,6 +225,18 @@ pub struct CanonicalOptions {
pub post_return: Option<PostReturnId>,
}
/// Same as `info::Transcoder`
#[derive(Clone, Hash, Eq, PartialEq)]
#[allow(missing_docs)]
pub struct Transcoder {
pub op: Transcode,
pub from: MemoryId,
pub from64: bool,
pub to: MemoryId,
pub to64: bool,
pub signature: SignatureIndex,
}
/// A helper structure to "intern" and deduplicate values of type `V` with an
/// identifying key `K`.
///
@@ -292,6 +309,7 @@ impl ComponentDfg {
runtime_instances: Default::default(),
runtime_always_trap: Default::default(),
runtime_lowerings: Default::default(),
runtime_transcoders: Default::default(),
};
// First the instances are all processed for instantiation. This will,
@@ -324,6 +342,7 @@ impl ComponentDfg {
num_runtime_instances: linearize.runtime_instances.len() as u32,
num_always_trap: linearize.runtime_always_trap.len() as u32,
num_lowerings: linearize.runtime_lowerings.len() as u32,
num_transcoders: linearize.runtime_transcoders.len() as u32,
imports: self.imports,
import_types: self.import_types,
@@ -342,6 +361,7 @@ struct LinearizeDfg<'a> {
runtime_instances: HashMap<RuntimeInstance, RuntimeInstanceIndex>,
runtime_always_trap: HashMap<AlwaysTrapId, RuntimeAlwaysTrapIndex>,
runtime_lowerings: HashMap<LowerImportId, LoweredIndex>,
runtime_transcoders: HashMap<TranscoderId, RuntimeTranscoderIndex>,
}
#[derive(Copy, Clone, Hash, Eq, PartialEq)]
@@ -460,6 +480,7 @@ impl LinearizeDfg<'_> {
CoreDef::Lowered(id) => info::CoreDef::Lowered(self.runtime_lowering(*id)),
CoreDef::InstanceFlags(i) => info::CoreDef::InstanceFlags(*i),
CoreDef::Adapter(id) => info::CoreDef::Export(self.adapter(*id)),
CoreDef::Transcoder(id) => info::CoreDef::Transcoder(self.runtime_transcoder(*id)),
}
}
@@ -497,6 +518,35 @@ impl LinearizeDfg<'_> {
)
}
fn runtime_transcoder(&mut self, id: TranscoderId) -> RuntimeTranscoderIndex {
self.intern(
id,
|me| &mut me.runtime_transcoders,
|me, id| {
let info = &me.dfg.transcoders[id];
(
info.op,
me.runtime_memory(info.from),
info.from64,
me.runtime_memory(info.to),
info.to64,
info.signature,
)
},
|index, (op, from, from64, to, to64, signature)| {
GlobalInitializer::Transcoder(info::Transcoder {
index,
op,
from,
from64,
to,
to64,
signature,
})
},
)
}
fn core_export<T>(&mut self, export: &CoreExport<T>) -> info::CoreExport<T>
where
T: Clone,

View File

@@ -147,6 +147,10 @@ pub struct Component {
/// The number of functions which "always trap" used to implement
/// `canon.lower` of `canon.lift`'d functions within the same component.
pub num_always_trap: u32,
/// The number of host transcoder functions needed for strings in adapter
/// modules.
pub num_transcoders: u32,
}
/// GlobalInitializer instructions to get processed when instantiating a component
@@ -207,6 +211,11 @@ pub enum GlobalInitializer {
/// Same as `SaveModuleUpvar`, but for imports.
SaveModuleImport(RuntimeImportIndex),
/// Similar to `ExtractMemory` and friends and indicates that a
/// `VMCallerCheckedAnyfunc` needs to be initialized for a transcoder
/// function and this will later be used to instantiate an adapter module.
Transcoder(Transcoder),
}
/// Metadata for extraction of a memory of what's being extracted and where it's
@@ -316,6 +325,9 @@ pub enum CoreDef {
/// This is a reference to a wasm global which represents the
/// runtime-managed flags for a wasm instance.
InstanceFlags(RuntimeComponentInstanceIndex),
/// This refers to a cranelift-generated trampoline which calls to a
/// host-defined transcoding function.
Transcoder(RuntimeTranscoderIndex),
}
impl<T> From<CoreExport<T>> for CoreDef
@@ -433,3 +445,42 @@ pub enum StringEncoding {
Utf16,
CompactUtf16,
}
/// Information about a string transcoding function required by an adapter
/// module.
///
/// A transcoder is used when strings are passed between adapter modules,
/// optionally changing string encodings at the same time. The transcoder is
/// implemented in a few different layers:
///
/// * Each generated adapter module has some glue around invoking the transcoder
/// represented by this item. This involves bounds-checks and handling
/// `realloc` for example.
/// * Each transcoder gets a cranelift-generated trampoline which has the
/// appropriate signature for the adapter module in question. Existence of
/// this initializer indicates that this should be compiled by Cranelift.
/// * The cranelift-generated trampoline will invoke a "transcoder libcall"
/// which is implemented natively in Rust that has a signature independent of
/// memory64 configuration options for example.
#[derive(Debug, Clone, Serialize, Deserialize, Hash, Eq, PartialEq)]
pub struct Transcoder {
/// The index of the transcoder being defined and initialized.
///
/// This indicates which `VMCallerCheckedAnyfunc` slot is written to in a
/// `VMComponentContext`.
pub index: RuntimeTranscoderIndex,
/// The transcoding operation being performed.
pub op: Transcode,
/// The linear memory that the string is being read from.
pub from: RuntimeMemoryIndex,
/// Whether the source linear memory is 64-bit.
pub from64: bool,
/// The linear memory that the string is being written to.
pub to: RuntimeMemoryIndex,
/// Whether the destination linear memory is 64-bit.
pub to64: bool,
/// The wasm signature of the cranelift-generated trampoline.
pub signature: SignatureIndex,
}
pub use crate::fact::{FixedEncoding, Transcode};

View File

@@ -116,7 +116,8 @@
//! created.
use crate::component::translate::*;
-use crate::fact::Module;
+use crate::fact;
use crate::EntityType;
use std::collections::HashSet;
use wasmparser::WasmFeatures;
@@ -183,7 +184,7 @@ impl<'data> Translator<'_, 'data> {
// the module using standard core wasm translation, and then fills out
// the dfg metadata for each adapter.
for (module_id, adapter_module) in state.adapter_modules.iter() {
-let mut module = Module::new(
+let mut module = fact::Module::new(
self.types.component_types(),
self.tunables.debug_adapter_modules,
);
@@ -194,7 +195,7 @@ impl<'data> Translator<'_, 'data> {
names.push(name);
}
let wasm = module.encode();
-let args = module.imports().to_vec();
+let imports = module.imports().to_vec();
// Extend the lifetime of the owned `wasm: Vec<u8>` on the stack to
// a higher scope defined by our original caller. That allows to
@@ -240,6 +241,12 @@ impl<'data> Translator<'_, 'data> {
// module is also recorded in the dfg. This metadata will be used
// to generate `GlobalInitializer` entries during the linearization
// final phase.
assert_eq!(imports.len(), translation.module.imports().len());
let args = imports
.iter()
.zip(translation.module.imports())
.map(|(arg, (_, _, ty))| fact_import_to_core_def(component, arg, ty))
.collect::<Vec<_>>();
let static_index = self.static_modules.push(translation);
let id = component.adapter_modules.push((static_index, args.into()));
assert_eq!(id, module_id);
@@ -247,6 +254,47 @@ impl<'data> Translator<'_, 'data> {
}
}
fn fact_import_to_core_def(
dfg: &mut dfg::ComponentDfg,
import: &fact::Import,
ty: EntityType,
) -> dfg::CoreDef {
match import {
fact::Import::CoreDef(def) => def.clone(),
fact::Import::Transcode {
op,
from,
from64,
to,
to64,
} => {
fn unwrap_memory(def: &dfg::CoreDef) -> dfg::CoreExport<MemoryIndex> {
match def {
dfg::CoreDef::Export(e) => e.clone().map_index(|i| match i {
EntityIndex::Memory(i) => i,
_ => unreachable!(),
}),
_ => unreachable!(),
}
}
let from = dfg.memories.push_uniq(unwrap_memory(from));
let to = dfg.memories.push_uniq(unwrap_memory(to));
dfg::CoreDef::Transcoder(dfg.transcoders.push_uniq(dfg::Transcoder {
op: *op,
from,
from64: *from64,
to,
to64: *to64,
signature: match ty {
EntityType::Function(signature) => signature,
_ => unreachable!(),
},
}))
}
}
}
#[derive(Default)]
struct PartitionAdapterModules {
/// The next adapter module that's being created. This may be empty.
@@ -336,6 +384,9 @@ impl PartitionAdapterModules {
dfg::CoreDef::Lowered(_)
| dfg::CoreDef::AlwaysTrap(_)
| dfg::CoreDef::InstanceFlags(_) => {}
// should not be in the dfg yet
dfg::CoreDef::Transcoder(_) => unreachable!(),
}
}

View File

@@ -166,6 +166,13 @@ indices! {
/// Index that represents an exported module from a component since that's
/// currently the only use for saving the entire module state at runtime.
pub struct RuntimeModuleIndex(u32);
/// Index into the list of transcoders identified during compilation.
///
/// This is used to index the `VMCallerCheckedAnyfunc` slots reserved for
/// string encoders which reference linear memories defined within a
/// component.
pub struct RuntimeTranscoderIndex(u32);
}
// Reexport for convenience some core-wasm indices which are also used in the

View File

@@ -2,11 +2,13 @@
//
// struct VMComponentContext {
// magic: u32,
// transcode_libcalls: &'static VMBuiltinTranscodeArray,
// store: *mut dyn Store,
// limits: *const VMRuntimeLimits,
// flags: [VMGlobalDefinition; component.num_runtime_component_instances],
// lowering_anyfuncs: [VMCallerCheckedAnyfunc; component.num_lowerings],
// always_trap_anyfuncs: [VMCallerCheckedAnyfunc; component.num_always_trap],
// transcoder_anyfuncs: [VMCallerCheckedAnyfunc; component.num_transcoders],
// lowerings: [VMLowering; component.num_lowerings],
// memories: [*mut VMMemoryDefinition; component.num_memories],
// reallocs: [*mut VMCallerCheckedAnyfunc; component.num_reallocs],
@@ -15,7 +17,7 @@
use crate::component::{
Component, LoweredIndex, RuntimeAlwaysTrapIndex, RuntimeComponentInstanceIndex,
-RuntimeMemoryIndex, RuntimePostReturnIndex, RuntimeReallocIndex,
+RuntimeMemoryIndex, RuntimePostReturnIndex, RuntimeReallocIndex, RuntimeTranscoderIndex,
};
use crate::PtrSize;
@@ -57,14 +59,18 @@ pub struct VMComponentOffsets<P> {
/// Number of "always trap" functions which have their
/// `VMCallerCheckedAnyfunc` stored inline in the `VMComponentContext`.
pub num_always_trap: u32,
/// Number of transcoders needed for string conversion.
pub num_transcoders: u32,
// precalculated offsets of various member fields
magic: u32,
transcode_libcalls: u32,
store: u32,
limits: u32,
flags: u32,
lowering_anyfuncs: u32,
always_trap_anyfuncs: u32,
transcoder_anyfuncs: u32,
lowerings: u32,
memories: u32,
reallocs: u32,
@@ -93,12 +99,15 @@ impl<P: PtrSize> VMComponentOffsets<P> {
.try_into()
.unwrap(),
num_always_trap: component.num_always_trap,
num_transcoders: component.num_transcoders,
magic: 0,
transcode_libcalls: 0,
store: 0,
limits: 0,
flags: 0,
lowering_anyfuncs: 0,
always_trap_anyfuncs: 0,
transcoder_anyfuncs: 0,
lowerings: 0,
memories: 0,
reallocs: 0,
@@ -133,6 +142,7 @@ impl<P: PtrSize> VMComponentOffsets<P> {
fields! {
size(magic) = 4u32,
align(u32::from(ret.ptr.size())),
size(transcode_libcalls) = ret.ptr.size(),
size(store) = cmul(2, ret.ptr.size()),
size(limits) = ret.ptr.size(),
align(16),
@@ -140,6 +150,7 @@ impl<P: PtrSize> VMComponentOffsets<P> {
align(u32::from(ret.ptr.size())),
size(lowering_anyfuncs) = cmul(ret.num_lowerings, ret.ptr.size_of_vmcaller_checked_anyfunc()),
size(always_trap_anyfuncs) = cmul(ret.num_always_trap, ret.ptr.size_of_vmcaller_checked_anyfunc()),
size(transcoder_anyfuncs) = cmul(ret.num_transcoders, ret.ptr.size_of_vmcaller_checked_anyfunc()),
size(lowerings) = cmul(ret.num_lowerings, ret.ptr.size() * 2),
size(memories) = cmul(ret.num_runtime_memories, ret.ptr.size()),
size(reallocs) = cmul(ret.num_runtime_reallocs, ret.ptr.size()),
@@ -168,6 +179,12 @@ impl<P: PtrSize> VMComponentOffsets<P> {
self.magic
}
/// The offset of the `transcode_libcalls` field.
#[inline]
pub fn transcode_libcalls(&self) -> u32 {
self.transcode_libcalls
}
/// The offset of the `flags` field.
#[inline]
pub fn instance_flags(&self, index: RuntimeComponentInstanceIndex) -> u32 {
@@ -215,6 +232,20 @@ impl<P: PtrSize> VMComponentOffsets<P> {
+ index.as_u32() * u32::from(self.ptr.size_of_vmcaller_checked_anyfunc())
}
/// The offset of the `transcoder_anyfuncs` field.
#[inline]
pub fn transcoder_anyfuncs(&self) -> u32 {
self.transcoder_anyfuncs
}
/// The offset of `VMCallerCheckedAnyfunc` for the `index` specified.
#[inline]
pub fn transcoder_anyfunc(&self, index: RuntimeTranscoderIndex) -> u32 {
assert!(index.as_u32() < self.num_transcoders);
self.transcoder_anyfuncs()
+ index.as_u32() * u32::from(self.ptr.size_of_vmcaller_checked_anyfunc())
}
/// The offset of the `lowerings` field.
#[inline]
pub fn lowerings(&self) -> u32 {