Don't copy VMBuiltinFunctionsArray into each VMContext (#3741)

* Don't copy `VMBuiltinFunctionsArray` into each `VMContext`

This is another PR along the lines of "let's squeeze all possible
performance we can out of instantiation". Before this PR we would copy,
by value, the contents of `VMBuiltinFunctionsArray` into each
`VMContext` allocated. This array of function pointers is modestly-sized
but growing over time as we add various intrinsics. Additionally it's
the exact same for all `VMContext` allocations.

This PR attempts to speed up instantiation slightly by instead storing
a pointer to the function array in each `VMContext`. This means that
calling a builtin intrinsic is slightly slower since it requires two
loads instead of one (one to get the base pointer, another to get the
actual address). In exchange, `VMContext` initialization now simply sets
one pointer instead of doing a `memcpy` from one location to another.
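
The trade-off described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not the code in this PR: `BuiltinArray`, `ContextByValue`, `ContextByPointer`, and the two toy builtins are hypothetical stand-ins for `VMBuiltinFunctionsArray` and `VMContext`.

```rust
use std::mem::size_of;

// Hypothetical builtin signature; real builtins take varying arguments.
type Builtin = fn(u32) -> u32;

fn memory_grow(pages: u32) -> u32 {
    pages + 1
}

fn table_size(n: u32) -> u32 {
    n * 2
}

/// Stand-in for `VMBuiltinFunctionsArray`: a table of function pointers.
#[derive(Clone, Copy)]
struct BuiltinArray {
    funcs: [Builtin; 2],
}

/// One table shared by every context.
static BUILTINS: BuiltinArray = BuiltinArray {
    funcs: [memory_grow, table_size],
};

/// Before: each context carries its own copy of the table.
struct ContextByValue {
    builtins: BuiltinArray,
}

/// After: each context carries a single pointer to the shared table.
struct ContextByPointer {
    builtins: &'static BuiltinArray,
}

impl ContextByValue {
    fn new() -> Self {
        // Copies every entry of the table into the new context.
        ContextByValue { builtins: BUILTINS }
    }

    fn call(&self, index: usize, arg: u32) -> u32 {
        // One load: the function address lives inside the context.
        (self.builtins.funcs[index])(arg)
    }
}

impl ContextByPointer {
    fn new() -> Self {
        // Stores one pointer, regardless of how many builtins exist.
        ContextByPointer { builtins: &BUILTINS }
    }

    fn call(&self, index: usize, arg: u32) -> u32 {
        // Two loads: the table pointer first, then the function address.
        (self.builtins.funcs[index])(arg)
    }
}
```

In this sketch `size_of::<ContextByPointer>()` stays at one pointer as builtins are added, while `size_of::<ContextByValue>()` grows with the table.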

With some macro-magic this commit also replaces the previous
implementation with one that's more `const`-friendly which also gets us
compile-time type-checks of libcalls as well as compile-time
verification that all libcalls are defined.
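
One way such a `const`-friendly table could be generated is sketched below. The macro name `define_builtins!`, the struct `BuiltinArray`, the `INIT` constant, and both toy libcalls are assumptions made for illustration; they are not taken from this commit.

```rust
// Hypothetical libcalls; names and signatures are illustrative only.
fn memory_size(index: u32) -> u32 {
    index + 64
}

fn add_pages(current: u32, delta: u32) -> u32 {
    current + delta
}

macro_rules! define_builtins {
    ($( $name:ident ( $( $arg:ty ),* ) -> $ret:ty; )*) => {
        /// One typed function-pointer field per listed builtin.
        #[repr(C)]
        pub struct BuiltinArray {
            $( pub $name: fn($( $arg ),*) -> $ret, )*
        }

        impl BuiltinArray {
            /// Evaluated at compile time. If a listed function is missing
            /// or its signature differs from the list, the build fails.
            pub const INIT: BuiltinArray = BuiltinArray {
                $( $name: $name, )*
            };
        }
    };
}

// The single list of builtins drives both the struct and its initializer.
define_builtins! {
    memory_size(u32) -> u32;
    add_pages(u32, u32) -> u32;
}

/// The table can live in a `static` because `INIT` is a constant.
static SHARED: BuiltinArray = BuiltinArray::INIT;
```

Because every field has a concrete `fn` type, changing the signature of `add_pages` without updating the list would be rejected by the compiler rather than discovered at runtime.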

Overall, as with #3739, the win is very modest here. Locally I measured
a drop from 1.9us to 1.7us in the time taken to instantiate an empty
module with one function. While small at these scales it's still roughly
a 10% improvement!

* Review comments
Alex Crichton
2022-01-28 16:24:34 -06:00
committed by GitHub
parent 2f494240f8
commit a25f7bdba5
5 changed files with 87 additions and 111 deletions

@@ -16,12 +16,12 @@
 // memories: [VMMemoryDefinition; module.num_defined_memories],
 // globals: [VMGlobalDefinition; module.num_defined_globals],
 // anyfuncs: [VMCallerCheckedAnyfunc; module.num_imported_functions + module.num_defined_functions],
-// builtins: VMBuiltinFunctionsArray,
+// builtins: *mut VMBuiltinFunctionsArray,
 // }
 use crate::{
-    BuiltinFunctionIndex, DefinedGlobalIndex, DefinedMemoryIndex, DefinedTableIndex, FuncIndex,
-    GlobalIndex, MemoryIndex, Module, TableIndex, TypeIndex,
+    DefinedGlobalIndex, DefinedMemoryIndex, DefinedTableIndex, FuncIndex, GlobalIndex, MemoryIndex,
+    Module, TableIndex, TypeIndex,
 };
 use more_asserts::assert_lt;
 use std::convert::TryFrom;
@@ -287,11 +287,7 @@ impl<P: PtrSize> From<VMOffsetsFields<P>> for VMOffsets<P> {
             .unwrap();
         ret.size = ret
             .builtin_functions
-            .checked_add(
-                BuiltinFunctionIndex::builtin_functions_total_number()
-                    .checked_mul(u32::from(ret.pointer_size()))
-                    .unwrap(),
-            )
+            .checked_add(u32::from(ret.pointer_size()))
             .unwrap();
         return ret;
@@ -597,7 +593,7 @@ impl<P: PtrSize> VMOffsets<P> {
     /// The offset of the builtin functions array.
     #[inline]
-    pub fn vmctx_builtin_functions_begin(&self) -> u32 {
+    pub fn vmctx_builtin_functions(&self) -> u32 {
         self.builtin_functions
     }
@@ -739,12 +735,6 @@ impl<P: PtrSize> VMOffsets<P> {
     pub fn vmctx_vmglobal_import_from(&self, index: GlobalIndex) -> u32 {
         self.vmctx_vmglobal_import(index) + u32::from(self.vmglobal_import_from())
     }
-    /// Return the offset to builtin function in `VMBuiltinFunctionsArray` index `index`.
-    #[inline]
-    pub fn vmctx_builtin_function(&self, index: BuiltinFunctionIndex) -> u32 {
-        self.vmctx_builtin_functions_begin() + index.index() * u32::from(self.pointer_size())
-    }
 }
 /// Offsets for `VMExternData`.