Don't copy VMBuiltinFunctionsArray into each VMContext (#3741)

* Don't copy `VMBuiltinFunctionsArray` into each `VMContext`

This is another PR along the lines of "let's squeeze all possible
performance we can out of instantiation". Before this PR we would copy,
by value, the contents of `VMBuiltinFunctionsArray` into each
`VMContext` allocated. This array of function pointers is modestly-sized
but growing over time as we add various intrinsics. Additionally it's
the exact same for all `VMContext` allocations.

This PR attempts to speed up instantiation slightly by instead storing
an indirection to the function array. This means that calling a builtin
intrinsic is a tad bit slower since it requires two loads instead of one
(one to get the base pointer, another to get the actual address).
Otherwise though `VMContext` initialization is now simply setting one
pointer instead of doing a `memcpy` from one location to another.

With some macro-magic this commit also replaces the previous
implementation with one that's more `const`-friendly which also gets us
compile-time type-checks of libcalls as well as compile-time
verification that all libcalls are defined.

Overall, as with #3739, the win is very modest here. Locally I measured
a speedup from 1.9us to 1.7us taken to instantiate an empty module with
one function. While small at these scales it's still a 10% improvement!

* Review comments
This commit is contained in:
Alex Crichton
2022-01-28 16:24:34 -06:00
committed by GitHub
parent 2f494240f8
commit a25f7bdba5
5 changed files with 87 additions and 111 deletions

View File

@@ -281,10 +281,15 @@ impl<'module_environment> FuncEnvironment<'module_environment> {
let mut mem_flags = ir::MemFlags::trusted();
mem_flags.set_readonly();
// Load the base of the array of builtin functions
let array_offset = i32::try_from(self.offsets.vmctx_builtin_functions()).unwrap();
let array_addr = pos.ins().load(pointer_type, mem_flags, base, array_offset);
// Load the callee address.
let body_offset =
i32::try_from(self.offsets.vmctx_builtin_function(callee_func_idx)).unwrap();
let func_addr = pos.ins().load(pointer_type, mem_flags, base, body_offset);
let body_offset = i32::try_from(callee_func_idx.index() * pointer_type.bytes()).unwrap();
let func_addr = pos
.ins()
.load(pointer_type, mem_flags, array_addr, body_offset);
(base, func_addr)
}