Further minor optimizations to instantiation (#3791)
* Shrink the size of `FuncData`

  Before this commit, on a 64-bit system the `FuncData` type had a size of 88 bytes; after this commit it has a size of 32 bytes. A `FuncData` is required for all host functions in a store, including those inserted from a `Linker` into a store used during linking. This means that instantiation ends up creating a nontrivial number of these types and pushing them into the store.

  Looking at some profiles, there were some surprisingly expensive movements of `FuncData` from the stack to a vector for moves-by-value generated by Rust. Shrinking this type enables more efficient code to be generated and additionally means less storage is needed in a store's function array.

  For instantiating the spidermonkey and rustpython modules this improves instantiation by 10%, since they each import a fair number of host functions and the speedup here is relative to the number of items imported.

* Use `ptr::copy_nonoverlapping` during initialization

  Previously `ptr::copy` was used for copying imports into place, which translates to `memmove`, but `ptr::copy_nonoverlapping` can be used here since it's statically known these areas don't overlap. While this doesn't end up having a performance difference, it's something I kept noticing while looking at the disassembly of `initialize_vmcontext`, so I figured I'd go ahead and implement it.

* Indirect shared signature ids in the VMContext

  This commit is a small improvement for the instantiation time of modules by avoiding copying a list of `VMSharedSignatureIndex` entries into each `VMContext`, instead building one inside of a module and sharing that amongst all instances. This involves fewer lookups at instantiation time and less movement of data during instantiation. The downside is that type-checks on `call_indirect` now involve an additional load, but I'm assuming that these are somewhat pessimized enough as-is that the runtime impact won't be much there.

  For instantiation performance this is a 5-10% win with rustpython/spidermonkey instantiation. This should also reduce the size of each `VMContext` for an instantiation, since signatures are no longer stored inline but shared amongst all instances of one module.

  Note that one subtle change here is that the array of `VMSharedSignatureIndex` was previously indexed by `TypeIndex`, and now it's indexed by `SignatureIndex`, which is a deduplicated form of `TypeIndex`. This is done because we already had a list of those lying around in `Module`, so it was easier to reuse that than to build a separate array and store it somewhere.

* Reserve space in `Store<T>` with `InstancePre`

  This commit updates the instantiation process to reserve space in a `Store<T>` for the functions that an `InstancePre<T>`, as part of instantiation, will insert into it. Using an `InstancePre<T>` to instantiate allows pre-computing the number of host functions that will be inserted into a store, and by pre-reserving space we can avoid costly reallocations during instantiation by ensuring the function vector has enough space to fit everything during the instantiation process.

  Overall this makes instantiation of rustpython/spidermonkey about 8% faster locally.

* Fix tests

* Use checked arithmetic
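The space-reservation idea, together with the checked arithmetic mentioned in the last bullet, can be sketched on a plain `Vec`. This is only an illustration under stated assumptions: `push_with_reserve` is a hypothetical helper, and the `Vec<u32>` stands in for a store's function array; none of this is the actual `Store<T>` or `InstancePre<T>` code.

```rust
/// Hypothetical helper (not part of the real store): reserves room for
/// `items` up front, pushes them, and returns how many times the vector's
/// capacity changed while pushing.
fn push_with_reserve(funcs: &mut Vec<u32>, items: &[u32]) -> usize {
    // Checked arithmetic: fail loudly instead of silently wrapping if the
    // combined length would overflow `usize`.
    let total = funcs
        .len()
        .checked_add(items.len())
        .expect("function count overflow");

    // After this call, capacity is at least `total`, so the pushes below
    // should not need to reallocate.
    funcs.reserve(total - funcs.len());

    let mut reallocations = 0;
    for &item in items {
        let before = funcs.capacity();
        funcs.push(item);
        if funcs.capacity() != before {
            reallocations += 1;
        }
    }
    reallocations
}
```

Because `Vec::reserve` guarantees capacity for at least the requested number of additional elements, the counter returned here stays at zero; without the reservation the vector would typically grow several times while being filled.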
@@ -1556,13 +1556,26 @@ impl<'module_environment> cranelift_wasm::FuncEnvironment for FuncEnvironment<'m
     let sig_id_type = Type::int(u16::from(sig_id_size) * 8).unwrap();
     let vmctx = self.vmctx(builder.func);
     let base = builder.ins().global_value(pointer_type, vmctx);
-    let offset =
-        i32::try_from(self.offsets.vmctx_vmshared_signature_id(ty_index)).unwrap();
 
-    // Load the caller ID.
+    // Load the caller ID. This requires loading the
+    // `*mut VMCallerCheckedAnyfunc` base pointer from `VMContext`
+    // and then loading, based on `SignatureIndex`, the
+    // corresponding entry.
     let mut mem_flags = ir::MemFlags::trusted();
     mem_flags.set_readonly();
-    let caller_sig_id = builder.ins().load(sig_id_type, mem_flags, base, offset);
+    let signatures = builder.ins().load(
+        pointer_type,
+        mem_flags,
+        base,
+        i32::try_from(self.offsets.vmctx_signature_ids_array()).unwrap(),
+    );
+    let sig_index = self.module.types[ty_index].unwrap_function();
+    let offset =
+        i32::try_from(sig_index.as_u32().checked_mul(sig_id_type.bytes()).unwrap())
+            .unwrap();
+    let caller_sig_id = builder
+        .ins()
+        .load(sig_id_type, mem_flags, signatures, offset);
 
     // Load the callee ID.
     let mem_flags = ir::MemFlags::trusted();
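The two-step lookup in this hunk (load the shared table pointer, then load the entry at `sig_index * entry size`) can be modelled in plain Rust. This is a rough sketch, not the Cranelift code: `ModuleModel`, `InstanceModel`, `entry_offset`, and `caller_sig_id` are hypothetical names invented for illustration, and `u32` stands in for `VMSharedSignatureIndex`.

```rust
use std::convert::TryFrom;

/// Hypothetical model of a module owning one table of signature ids.
struct ModuleModel {
    signature_ids: Vec<u32>,
}

/// Hypothetical model of an instance: it borrows the module's table
/// instead of holding its own inline copy.
struct InstanceModel<'a> {
    signature_ids: &'a [u32],
}

/// Byte offset of entry `sig_index`, mirroring the checked multiply and
/// `i32::try_from` in the diff, but returning `None` instead of panicking.
fn entry_offset(sig_index: u32, entry_bytes: u32) -> Option<i32> {
    sig_index
        .checked_mul(entry_bytes)
        .and_then(|bytes| i32::try_from(bytes).ok())
}

/// First "load": follow the instance's reference to the shared table.
/// Second "load": read the entry for `sig_index`.
fn caller_sig_id(instance: &InstanceModel, sig_index: u32) -> Option<u32> {
    instance.signature_ids.get(sig_index as usize).copied()
}
```

In this model, creating an instance copies only a reference rather than the whole table, which is the instantiation-time saving the commit message describes; the cost is the extra indirection on every lookup.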