Further minor optimizations to instantiation (#3791)
* Shrink the size of `FuncData`

Before this commit on a 64-bit system the `FuncData` type had a size of 88 bytes and after this commit it has a size of 32 bytes. A `FuncData` is required for all host functions in a store, including those inserted from a `Linker` into a store used during linking. This means that instantiation ends up creating a nontrivial number of these types and pushing them into the store. Looking at some profiles there were some surprisingly expensive movements of `FuncData` from the stack to a vector for moves-by-value generated by Rust. Shrinking this type enables more efficient code to be generated and additionally means less storage is needed in a store's function array.

For instantiating the spidermonkey and rustpython modules this improves instantiation by 10% since they each import a fair number of host functions and the speedup here is relative to the number of items imported.

* Use `ptr::copy_nonoverlapping` during initialization

Previously `ptr::copy` was used for copying imports into place, which translates to `memmove`, but `ptr::copy_nonoverlapping` can be used here since it's statically known these areas don't overlap. While this doesn't end up having a performance difference it's something I kept noticing while looking at the disassembly of `initialize_vmcontext` so I figured I'd go ahead and implement.

* Indirect shared signature ids in the VMContext

This commit is a small improvement for the instantiation time of modules by avoiding copying a list of `VMSharedSignatureIndex` entries into each `VMContext`, instead building one inside of a module and sharing that amongst all instances. This involves fewer lookups at instantiation time and less movement of data during instantiation. The downside is that type-checks on `call_indirect` now involve an additional load, but I'm assuming that these are somewhat pessimized enough as-is that the runtime impact won't be much there.

For instantiation performance this is a 5-10% win with rustpython/spidermonkey instantiation. This should also reduce the size of each `VMContext` for an instantiation since signatures are no longer stored inline but shared amongst all instances of one module.

Note that one subtle change here is that the array of `VMSharedSignatureIndex` was previously indexed by `TypeIndex`, and now it's indexed by `SignatureIndex` which is a deduplicated form of `TypeIndex`. This is done because we already had a list of those lying around in `Module`, so it was easier to reuse that than to build a separate array and store it somewhere.

* Reserve space in `Store<T>` with `InstancePre`

This commit updates the instantiation process to reserve space in a `Store<T>` for the functions that an `InstancePre<T>`, as part of instantiation, will insert into it. Using an `InstancePre<T>` to instantiate allows pre-computing the number of host functions that will be inserted into a store, and by pre-reserving space we can avoid costly reallocations during instantiation by ensuring the function vector has enough space to fit everything during the instantiation process.

Overall this makes instantiation of rustpython/spidermonkey about 8% faster locally.

* Fix tests

* Use checked arithmetic
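The "indirect shared signature ids" change can be pictured with a small sketch. This is not Wasmtime's code: `SharedSigId`, `ModuleSigs`, `InlineInstance`, and `IndirectInstance` are hypothetical stand-ins for `VMSharedSignatureIndex`, the per-module table, and the old and new `VMContext` layouts. It only illustrates the trade-off described above: instantiation stores one pointer instead of copying a table, and each lookup pays one extra load.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `VMSharedSignatureIndex`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SharedSigId(u32);

/// Hypothetical per-module table, built once and shared by all instances.
struct ModuleSigs {
    ids: Vec<SharedSigId>,
}

/// Old layout (sketch): every instance owns an inline copy of the table.
struct InlineInstance {
    signature_ids: Vec<SharedSigId>,
}

impl InlineInstance {
    fn new(module: &ModuleSigs) -> Self {
        // Instantiation cost grows with the number of signatures.
        InlineInstance { signature_ids: module.ids.clone() }
    }
}

/// New layout (sketch): every instance stores a single pointer.
struct IndirectInstance {
    /// Keeps the shared table alive for as long as the instance exists.
    module: Arc<ModuleSigs>,
    signature_ids: *const SharedSigId,
}

impl IndirectInstance {
    fn new(module: Arc<ModuleSigs>) -> Self {
        // Instantiation cost is constant: one pointer is recorded.
        let signature_ids = module.ids.as_ptr();
        IndirectInstance { module, signature_ids }
    }

    fn signature_id(&self, index: usize) -> SharedSigId {
        assert!(index < self.module.ids.len());
        // The extra load: read the pointer first, then the entry behind it.
        // Safe because `index` is in bounds and `module` keeps the table alive.
        unsafe { *self.signature_ids.add(index) }
    }
}
```

Two `IndirectInstance` values created from the same `Arc<ModuleSigs>` hold the same pointer, whereas each `InlineInstance` holds its own copy of the table.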
@@ -7,7 +7,7 @@
 // interrupts: *const VMInterrupts,
 // externref_activations_table: *mut VMExternRefActivationsTable,
 // store: *mut dyn Store,
-// signature_ids: [VMSharedSignatureIndex; module.num_signature_ids],
+// signature_ids: *const VMSharedSignatureIndex,
 // imported_functions: [VMFunctionImport; module.num_imported_functions],
 // imported_tables: [VMTableImport; module.num_imported_tables],
 // imported_memories: [VMMemoryImport; module.num_imported_memories],
@@ -21,7 +21,7 @@
 
 use crate::{
     DefinedGlobalIndex, DefinedMemoryIndex, DefinedTableIndex, FuncIndex, GlobalIndex, MemoryIndex,
-    Module, TableIndex, TypeIndex,
+    Module, TableIndex,
 };
 use more_asserts::assert_lt;
 use std::convert::TryFrom;
@@ -52,8 +52,6 @@ fn align(offset: u32, width: u32) -> u32 {
 pub struct VMOffsets<P> {
     /// The size in bytes of a pointer on the target.
     pub ptr: P,
-    /// The number of signature declarations in the module.
-    pub num_signature_ids: u32,
     /// The number of imported functions in the module.
     pub num_imported_functions: u32,
     /// The number of imported tables in the module.
@@ -117,8 +115,6 @@ impl PtrSize for u8 {
 pub struct VMOffsetsFields<P> {
     /// The size in bytes of a pointer on the target.
     pub ptr: P,
-    /// The number of signature declarations in the module.
-    pub num_signature_ids: u32,
     /// The number of imported functions in the module.
     pub num_imported_functions: u32,
     /// The number of imported tables in the module.
@@ -142,7 +138,6 @@ impl<P: PtrSize> VMOffsets<P> {
     pub fn new(ptr: P, module: &Module) -> Self {
         VMOffsets::from(VMOffsetsFields {
             ptr,
-            num_signature_ids: cast_to_u32(module.types.len()),
             num_imported_functions: cast_to_u32(module.num_imported_funcs),
             num_imported_tables: cast_to_u32(module.num_imported_tables),
             num_imported_memories: cast_to_u32(module.num_imported_memories),
@@ -165,7 +160,6 @@ impl<P: PtrSize> From<VMOffsetsFields<P>> for VMOffsets<P> {
     fn from(fields: VMOffsetsFields<P>) -> VMOffsets<P> {
         let mut ret = Self {
             ptr: fields.ptr,
-            num_signature_ids: fields.num_signature_ids,
             num_imported_functions: fields.num_imported_functions,
             num_imported_tables: fields.num_imported_tables,
             num_imported_memories: fields.num_imported_memories,
@@ -210,12 +204,7 @@ impl<P: PtrSize> From<VMOffsetsFields<P>> for VMOffsets<P> {
             .unwrap();
         ret.imported_functions = ret
             .signature_ids
-            .checked_add(
-                fields
-                    .num_signature_ids
-                    .checked_mul(u32::from(ret.size_of_vmshared_signature_index()))
-                    .unwrap(),
-            )
+            .checked_add(u32::from(ret.ptr.size()))
             .unwrap();
         ret.imported_tables = ret
             .imported_functions
@@ -535,9 +524,9 @@ impl<P: PtrSize> VMOffsets<P> {
         self.store
     }
 
-    /// The offset of the `signature_ids` array.
+    /// The offset of the `signature_ids` array pointer.
     #[inline]
-    pub fn vmctx_signature_ids_begin(&self) -> u32 {
+    pub fn vmctx_signature_ids_array(&self) -> u32 {
         self.signature_ids
     }
 
@@ -603,14 +592,6 @@ impl<P: PtrSize> VMOffsets<P> {
         self.size
     }
 
-    /// Return the offset to `VMSharedSignatureId` index `index`.
-    #[inline]
-    pub fn vmctx_vmshared_signature_id(&self, index: TypeIndex) -> u32 {
-        assert_lt!(index.as_u32(), self.num_signature_ids);
-        self.vmctx_signature_ids_begin()
-            + index.as_u32() * u32::from(self.size_of_vmshared_signature_index())
-    }
-
     /// Return the offset to `VMFunctionImport` index `index`.
     #[inline]
     pub fn vmctx_vmfunction_import(&self, index: FuncIndex) -> u32 {
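
To make the layout change in the diff concrete, here is a minimal sketch of how the offset of `imported_functions` is derived before and after the commit. Both helper functions are hypothetical simplifications written for illustration and are not part of `VMOffsets`. They return `Option<u32>` so overflow is visible to the caller, whereas the code in the diff calls `.unwrap()` on the checked results and panics on overflow instead.

```rust
/// Hypothetical helper: old layout, where `signature_ids` is an inline array
/// of `num_signature_ids` entries and `imported_functions` follows it.
fn imported_functions_offset_old(
    signature_ids: u32,
    num_signature_ids: u32,
    sig_index_size: u8,
) -> Option<u32> {
    // Size of the inline array, checked against u32 overflow.
    let array_bytes = num_signature_ids.checked_mul(u32::from(sig_index_size))?;
    signature_ids.checked_add(array_bytes)
}

/// Hypothetical helper: new layout, where `signature_ids` is a single pointer,
/// so the next field starts exactly one pointer width later.
fn imported_functions_offset_new(signature_ids: u32, ptr_size: u8) -> Option<u32> {
    signature_ids.checked_add(u32::from(ptr_size))
}
```

With example values of a 4-byte signature index, 100 signatures, and 8-byte pointers, the old layout spends 400 bytes per `VMContext` on this field and the new layout spends 8, independent of how many signatures the module declares.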