Improve robustness of cache loading/storing (#974)

* Improve robustness of cache loading/storing

  Today wasmtime incorrectly loads compiled modules from the global cache
  when toggling settings such as optimizations. For example, if you execute
  `wasmtime foo.wasm`, that will globally cache an unoptimized version of the
  wasm module. If you then execute `wasmtime -O foo.wasm`, it would reload
  the unoptimized version from the cache, not realizing the compilation
  settings were different, and use that instead. This can naturally lead to
  very surprising behavior!

  This commit updates how the cache is managed in an attempt to make it much
  more robust against these sorts of issues. This takes a leaf out of rustc's
  playbook and models the cache with a function that looks like:

      fn load<T: Hash>(
          &self,
          data: T,
          compute: fn(T) -> CacheEntry,
      ) -> CacheEntry;

  The goal here is to guarantee that all the `data` necessary to `compute`
  the result of the cache entry is hashable and stored into the hash key
  entry. This was previously open-coded and manually managed, with items
  hashed explicitly, but this construction guarantees that everything
  `compute` could reasonably use to compile the module is stored in `data`,
  which is itself hashable.

  This refactoring then resulted in a few workarounds and a few fixes,
  including the original issue:

  * The `Module` type was split into `Module` and `ModuleLocal`, where only
    the latter is hashed. The previous hash function for a `Module` left out
    items like the `start_func` and didn't hash items like the imports of the
    module. Omitting the `start_func` was fine since compilation didn't
    actually use it, but omitting imports seemed uncomfortable because, while
    compilation didn't use the import values, it did use the *number* of
    imports, which seems like it should then be put into the cache key. The
    `ModuleLocal` type now derives `Hash` to guarantee that all of its
    contents affect the hash key.

  * The `ModuleTranslationState` from `cranelift-wasm` doesn't implement
    `Hash`, which means that we have a manual wrapper to work around that.
    This will be fixed with an upstream implementation, since this state
    affects the generated wasm code. Currently this is just a map of
    signatures, which is present in `Module` anyway, so we should be good for
    the time being.

  * Hashing `dyn TargetIsa` was also added, where previously it was not fully
    hashed. Previously only the target name was used as part of the cache
    key, but crucially the flags of compilation were omitted (for example the
    optimization flags). Unfortunately the trait object itself is not
    hashable, so we still have to manually write a wrapper to hash it, but we
    likely want to add some utilities upstream, in cranelift itself, to hash
    isa objects. For now though we can continue to add hashed fields as
    necessary.

  Overall the goal here was to use the compiler to expose what we're not
  hashing, and then make sure we organize data and write the right code to
  ensure everything is hashed, and nothing more.

* Update crates/environ/src/module.rs

  Co-Authored-By: Peter Huene <peterhuene@protonmail.com>

* Fix lightbeam

* Fix compilation of tests

* Update the expected structure of the cache

* Revert "Update the expected structure of the cache"

  This reverts commit 2b53fee426a4e411c313d8c1e424841ba304a9cd.

* Separate the cache dir a bit

* Add a test the cache is busted with opt levels

* rustfmt

Co-authored-by: Peter Huene <peterhuene@protonmail.com>
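
The `load` shape described in the commit message can be illustrated with a minimal, self-contained sketch. This is not wasmtime's implementation: the in-memory `HashMap`, the `Inputs` struct, the `misses` counter, and the `compile` function are hypothetical stand-ins, and `&mut self` is used instead of `&self` only to keep the example free of interior mutability. What it shows is that when `compute` can see nothing but `data`, and the key is derived from hashing all of `data`, changing any input (such as the optimization level) necessarily changes the key.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a compiled artifact.
#[derive(Clone, Debug, PartialEq)]
struct CacheEntry(String);

/// Hypothetical compilation inputs. Everything `compile` reads lives here,
/// and `#[derive(Hash)]` makes every field contribute to the cache key.
#[derive(Hash)]
struct Inputs<'a> {
    wasm: &'a [u8],
    opt_level: u8,
}

/// Assumption: an in-memory map keyed by a 64-bit hash. A real cache would
/// persist entries and use a stable hash; this only models the key logic.
#[derive(Default)]
struct Cache {
    entries: HashMap<u64, CacheEntry>,
    misses: usize,
}

impl Cache {
    /// Hash all of `data` to form the key, and call `compute` only on a miss.
    fn load<T: Hash>(&mut self, data: T, compute: fn(T) -> CacheEntry) -> CacheEntry {
        let mut hasher = DefaultHasher::new();
        data.hash(&mut hasher);
        let key = hasher.finish();

        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }

        self.misses += 1;
        let entry = compute(data);
        self.entries.insert(key, entry.clone());
        entry
    }
}

/// Hypothetical "compiler": its output depends only on the hashed inputs.
fn compile(inputs: Inputs<'_>) -> CacheEntry {
    CacheEntry(format!("{} bytes at O{}", inputs.wasm.len(), inputs.opt_level))
}
```

Calling `cache.load(Inputs { wasm: &bytes, opt_level: 0 }, compile)` and then the same call with `opt_level: 2` produces two separate entries, while repeating the first call is served from the map without invoking `compile` again. Because `compute` is a plain `fn` pointer rather than a closure, it cannot capture settings that bypass the hash, which is the property the commit relies on.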
2020-02-26 16:18:02 -06:00
parent 2d268f49c9
commit c8ab1e293e
30 changed files with 550 additions and 643 deletions
--- a/crates/obj/src/context.rs
+++ b/crates/obj/src/context.rs
@@ -18,14 +18,14 @@ pub fn layout_vmcontext(
    module: &Module,
    target_config: &TargetFrontendConfig,
 ) -> (Box<[u8]>, Box<[TableRelocation]>) {
-    let ofs = VMOffsets::new(target_config.pointer_bytes(), &module);
+    let ofs = VMOffsets::new(target_config.pointer_bytes(), &module.local);
    let out_len = ofs.size_of_vmctx() as usize;
    let mut out = vec![0; out_len];

    // Assign unique indicies to unique signatures.
    let mut signature_registry = HashMap::new();
    let mut signature_registry_len = signature_registry.len();
-    for (index, sig) in module.signatures.iter() {
+    for (index, sig) in module.local.signatures.iter() {
        let offset = ofs.vmctx_vmshared_signature_id(index) as usize;
        let target_index = match signature_registry.entry(sig) {
            Entry::Occupied(o) => *o.get(),
@@ -43,9 +43,9 @@ pub fn layout_vmcontext(
    }

    let num_tables_imports = module.imported_tables.len();
-    let mut table_relocs = Vec::with_capacity(module.table_plans.len() - num_tables_imports);
-    for (index, table) in module.table_plans.iter().skip(num_tables_imports) {
-        let def_index = module.defined_table_index(index).unwrap();
+    let mut table_relocs = Vec::with_capacity(module.local.table_plans.len() - num_tables_imports);
+    for (index, table) in module.local.table_plans.iter().skip(num_tables_imports) {
+        let def_index = module.local.defined_table_index(index).unwrap();
        let offset = ofs.vmctx_vmtable_definition(def_index) as usize;
        let current_elements = table.table.minimum;
        unsafe {
@@ -67,8 +67,8 @@ pub fn layout_vmcontext(
    }

    let num_globals_imports = module.imported_globals.len();
-    for (index, global) in module.globals.iter().skip(num_globals_imports) {
-        let def_index = module.defined_global_index(index).unwrap();
+    for (index, global) in module.local.globals.iter().skip(num_globals_imports) {
+        let def_index = module.local.defined_global_index(index).unwrap();
        let offset = ofs.vmctx_vmglobal_definition(def_index) as usize;
        let to = unsafe { out.as_mut_ptr().add(offset) };
        match global.initializer {