Further minor optimizations to instantiation (#3791)

* Shrink the size of `FuncData`

Before this commit on a 64-bit system the `FuncData` type had a size of
88 bytes and after this commit it has a size of 32 bytes. A `FuncData`
is required for all host functions in a store, including those inserted
from a `Linker` into a store used during linking. This means that
instantiation ends up creating a nontrivial number of these types and
pushing them into the store. Looking at some profiles there were some
surprisingly expensive movements of `FuncData` from the stack to a
vector for moves-by-value generated by Rust. Shrinking this type enables
more efficient code to be generated and additionally means less storage
is needed in a store's function array.

For instantiating the spidermonkey and rustpython modules this improves
instantiation by 10% since they each import a fair number of host
functions and the speedup here is relative to the number of items
imported.

* Use `ptr::copy_nonoverlapping` during initialization

Prevoiusly `ptr::copy` was used for copying imports into place which
translates to `memmove`, but `ptr::copy_nonoverlapping` can be used here
since it's statically known these areas don't overlap. While this
doesn't end up having a performance difference it's something I kept
noticing while looking at the disassembly of `initialize_vmcontext` so I
figured I'd go ahead and implement.

* Indirect shared signature ids in the VMContext

This commit is a small improvement for the instantiation time of modules
by avoiding copying a list of `VMSharedSignatureIndex` entries into each
`VMContext`, instead building one inside of a module and sharing that
amongst all instances. This involves less lookups at instantiation time
and less movement of data during instantiation. The downside is that
type-checks on `call_indirect` now involve an additionally load, but I'm
assuming that these are somewhat pessimized enough as-is that the
runtime impact won't be much there.

For instantiation performance this is a 5-10% win with
rustpyhon/spidermonky instantiation. This should also reduce the size of
each `VMContext` for an instantiation since signatures are no longer
stored inline but shared amongst all instances with one module.

Note that one subtle change here is that the array of
`VMSharedSignatureIndex` was previously indexed by `TypeIndex`, and now
it's indexed by `SignaturedIndex` which is a deduplicated form of
`TypeIndex`. This is done because we already had a list of those lying
around in `Module`, so it was easier to reuse that than to build a
separate array and store it somewhere.

* Reserve space in `Store<T>` with `InstancePre`

This commit updates the instantiation process to reserve space in a
`Store<T>` for the functions that an `InstancePre<T>`, as part of
instantiation, will insert into it. Using an `InstancePre<T>` to
instantiate allows pre-computing the number of host functions that will
be inserted into a store, and by pre-reserving space we can avoid costly
reallocations during instantiation by ensuring the function vector has
enough space to fit everything during the instantiation process.

Overall this makes instantiation of rustpython/spidermonkey about 8%
faster locally.

* Fix tests

* Use checked arithmetic
This commit is contained in:
Alex Crichton
2022-02-11 09:55:08 -06:00
committed by GitHub
parent c0c368d151
commit b438617e12
10 changed files with 85 additions and 56 deletions

View File

@@ -188,7 +188,10 @@ pub(crate) struct FuncData {
// optimized use cases (e.g. `TypedFunc`) it's not actually needed or it's
// only needed rarely. To handle that this is an optionally-contained field
// which is lazily loaded into as part of `Func::call`.
ty: Option<FuncType>,
//
// Also note that this is intentionally placed behind a poiner to keep it
// small as `FuncData` instances are often inserted into a `Store`.
ty: Option<Box<FuncType>>,
}
/// The three ways that a function can be created and referenced from within a
@@ -216,7 +219,12 @@ enum FuncKind {
/// `Func::new` or similar APIs. The `HostFunc` internally owns the
/// `InstanceHandle` and that will get dropped when this `HostFunc` itself
/// is dropped.
Host(HostFunc),
///
/// Note that this is intentionally placed behind a `Box` to minimize the
/// size of this enum since the most common variant for high-peformance
/// situations is `SharedHost` and `StoreOwned`, so this ideally isn't
/// larger than those two.
Host(Box<HostFunc>),
}
macro_rules! for_each_function_signature {
@@ -711,7 +719,7 @@ impl Func {
// this time.
if store.store_data()[self.0].ty.is_none() {
let ty = self.load_ty(store);
store.store_data_mut()[self.0].ty = Some(ty);
store.store_data_mut()[self.0].ty = Some(Box::new(ty));
}
(store.store_data()[self.0].ty.as_ref().unwrap(), store)
@@ -2091,7 +2099,7 @@ impl HostFunc {
/// Same as [`HostFunc::to_func`], different ownership.
unsafe fn into_func(self, store: &mut StoreOpaque) -> Func {
self.validate_store(store);
Func::from_func_kind(FuncKind::Host(self), store)
Func::from_func_kind(FuncKind::Host(Box::new(self)), store)
}
fn validate_store(&self, store: &mut StoreOpaque) {

View File

@@ -902,6 +902,9 @@ impl<'a> ImportsBuilder<'a> {
pub struct InstancePre<T> {
module: Module,
items: Vec<Definition>,
/// A count of `Definition::HostFunc` entries in `items` above to
/// preallocate space in a `Store` up front for all entries to be inserted.
host_funcs: usize,
_marker: std::marker::PhantomData<fn() -> T>,
}
@@ -911,6 +914,7 @@ impl<T> Clone for InstancePre<T> {
Self {
module: self.module.clone(),
items: self.items.clone(),
host_funcs: self.host_funcs,
_marker: self._marker,
}
}
@@ -923,9 +927,17 @@ impl<T> InstancePre<T> {
items: Vec<Definition>,
) -> Result<InstancePre<T>> {
typecheck_defs(store, module, &items)?;
let host_funcs = items
.iter()
.filter(|i| match i {
Definition::HostFunc(_) => true,
_ => false,
})
.count();
Ok(InstancePre {
module: module.clone(),
items,
host_funcs,
_marker: std::marker::PhantomData,
})
}
@@ -952,7 +964,7 @@ impl<T> InstancePre<T> {
// passed in.
let mut store = store.as_context_mut();
let mut instantiator = unsafe {
self.ensure_comes_from_same_store(&store.0)?;
self.verify_store_and_reserve_space(&mut store.0)?;
Instantiator::new(
store.0,
&self.module,
@@ -984,7 +996,7 @@ impl<T> InstancePre<T> {
// For the unsafety here see above
let mut store = store.as_context_mut();
let mut i = unsafe {
self.ensure_comes_from_same_store(&store.0)?;
self.verify_store_and_reserve_space(&mut store.0)?;
Instantiator::new(
store.0,
&self.module,
@@ -1002,12 +1014,17 @@ impl<T> InstancePre<T> {
.await?
}
fn ensure_comes_from_same_store(&self, store: &StoreOpaque) -> Result<()> {
fn verify_store_and_reserve_space(&self, store: &mut StoreOpaque) -> Result<()> {
for import in self.items.iter() {
if !import.comes_from_same_store(store) {
bail!("cross-`Store` instantiation is not currently supported");
}
}
// Any linker-defined function of the `Definition::HostFunc` variant
// will insert a function into the store automatically as part of
// instantiation, so reserve space here to make insertion more efficient
// as it won't have to realloc during the instantiation.
store.store_data_mut().reserve_funcs(self.host_funcs);
Ok(())
}
}

View File

@@ -1055,6 +1055,10 @@ impl wasmtime_runtime::ModuleRuntimeInfo for ModuleInner {
fn wasm_data(&self) -> &[u8] {
self.module.wasm_data()
}
fn signature_ids(&self) -> &[VMSharedSignatureIndex] {
self.signatures.as_module_map().values().as_slice()
}
}
/// A barebones implementation of ModuleRuntimeInfo that is useful for
@@ -1144,4 +1148,11 @@ impl wasmtime_runtime::ModuleRuntimeInfo for BareModuleInfo {
fn wasm_data(&self) -> &[u8] {
&[]
}
fn signature_ids(&self) -> &[VMSharedSignatureIndex] {
match &self.one_signature {
Some((_, id)) => std::slice::from_ref(id),
None => &[],
}
}
}

View File

@@ -103,6 +103,10 @@ impl StoreData {
pub(crate) fn funcs(&self) -> impl Iterator<Item = &crate::func::FuncData> {
self.funcs.iter()
}
pub(crate) fn reserve_funcs(&mut self, count: usize) {
self.funcs.reserve(count);
}
}
impl<T> Index<Stored<T>> for StoreData