Improve robustness of cache loading/storing (#974)
* Improve robustness of cache loading/storing
Today wasmtime incorrectly loads compiled modules from the
global cache when toggling settings such as optimizations. For example
if you execute `wasmtime foo.wasm` that will cache globally an
unoptimized version of the wasm module. If you then execute `wasmtime -O
foo.wasm` it would then reload the unoptimized version from cache, not
realizing the compilation settings were different, and use that instead.
Naturally, this can lead to very surprising behavior!
This commit updates how the cache is managed in an attempt to make it
much more robust against these sorts of issues. This takes a leaf out of
rustc's playbook and models the cache with a function that looks like:
    fn load<T: Hash>(
        &self,
        data: T,
        compute: fn(T) -> CacheEntry,
    ) -> CacheEntry;
The goal here is that it guarantees that all the `data` necessary to
`compute` the result of the cache entry is hashable and stored into the
hash key entry. This was previously open-coded and manually managed
where items were hashed explicitly, but this construction guarantees
that everything `compute` could reasonably use to compile the module is
stored in `data`, which is itself hashable.
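As a rough illustration of that shape, here is a minimal in-memory sketch. It is not wasmtime's implementation: the `Cache` struct, the `Inputs` type, the `compile` function, the string-based `CacheEntry`, and the use of `&mut self` with a `HashMap` are all assumptions made for the example (the real cache is on disk and takes `&self`). Because `compute` is a plain `fn` pointer it cannot capture anything, so whatever it needs has to arrive through `data` and therefore ends up in the key.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a compiled artifact.
#[derive(Clone, Debug, PartialEq)]
struct CacheEntry(String);

/// Toy in-memory cache keyed by the hash of the compilation inputs.
#[derive(Default)]
struct Cache {
    entries: HashMap<u64, CacheEntry>,
}

impl Cache {
    /// Returns the cached entry for `data`, computing and storing it on a miss.
    fn load<T: Hash>(&mut self, data: T, compute: fn(T) -> CacheEntry) -> CacheEntry {
        // The key is derived from *all* of `data`, never from a hand-picked subset.
        let mut hasher = DefaultHasher::new();
        data.hash(&mut hasher);
        let key = hasher.finish();

        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }
        let entry = compute(data);
        self.entries.insert(key, entry.clone());
        entry
    }
}

/// Hypothetical compilation inputs: every field feeds the cache key.
#[derive(Hash)]
struct Inputs {
    wasm: Vec<u8>,
    opt_level: u8,
}

/// Hypothetical "compiler" that only sees what is in `Inputs`.
fn compile(inputs: Inputs) -> CacheEntry {
    CacheEntry(format!("{} bytes at -O{}", inputs.wasm.len(), inputs.opt_level))
}
```

With this shape, calling `cache.load(Inputs { wasm, opt_level: 0 }, compile)` and then the same call with `opt_level: 2` produces two distinct entries, which is the behavior the original bug was missing.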
This refactoring then resulted in a few workarounds and a few fixes,
including the original issue:
* The `Module` type was split into `Module` and `ModuleLocal` where only
the latter is hashed. The previous hash function for a `Module` left
out items like the `start_func` and didn't hash items like the imports
of the module. Omitting the `start_func` was fine since compilation
didn't actually use it, but omitting imports seemed uncomfortable
because while compilation didn't use the import values it did use the
*number* of imports, which seems like it should then be put into the
cache key. The `ModuleLocal` type now derives `Hash` to guarantee that
all of its contents affect the hash key.
* The `ModuleTranslationState` from `cranelift-wasm` doesn't implement
`Hash` which means that we have a manual wrapper to work around that.
This will be fixed with an upstream implementation, since this state
affects the generated wasm code. Currently this is just a map of
signatures, which is present in `Module` anyway, so we should be good
for the time being.
* Hashing `dyn TargetIsa` was also added, where previously it was not
fully hashed. Previously only the target name was used as part of the
cache key, but crucially the flags of compilation were omitted (for
example the optimization flags). Unfortunately the trait object itself
is not hashable so we still have to manually write a wrapper to hash
it, but we likely want to add upstream some utilities to hash isa
objects into cranelift itself. For now though we can continue to add
hashed fields as necessary.
Overall the goal here was to use the compiler to expose what we're not
hashing, and then make sure we organize data and write the right code to
ensure everything is hashed, and nothing more.
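The manual-wrapper workaround mentioned for `dyn TargetIsa` can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Isa` trait, its `name` and `flags` methods, the `HashedIsa` wrapper, the `Fake` implementation, and the `key_of` helper are hypothetical names invented here and are not cranelift's API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a trait like `TargetIsa`. A trait object
/// cannot derive `Hash`, so hashing has to go through a wrapper.
trait Isa {
    fn name(&self) -> &str;
    fn flags(&self) -> Vec<(String, String)>;
}

/// Wrapper that hashes the parts of the trait object affecting codegen.
struct HashedIsa<'a>(&'a dyn Isa);

impl Hash for HashedIsa<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hashing only the name would repeat the original bug,
        // so the compilation flags are hashed as well.
        self.0.name().hash(state);
        self.0.flags().hash(state);
    }
}

/// Hypothetical ISA whose only setting is an optimization level.
struct Fake {
    opt_level: &'static str,
}

impl Isa for Fake {
    fn name(&self) -> &str {
        "x86_64"
    }
    fn flags(&self) -> Vec<(String, String)> {
        vec![("opt_level".to_string(), self.opt_level.to_string())]
    }
}

/// Helper that turns any hashable value into a 64-bit key.
fn key_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

The drawback of this approach, as the message notes, is that the wrapper must be kept in sync by hand: any new setting that affects code generation has to be added to the `hash` body explicitly.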
* Update crates/environ/src/module.rs
Co-Authored-By: Peter Huene <peterhuene@protonmail.com>
* Fix lightbeam
* Fix compilation of tests
* Update the expected structure of the cache
* Revert "Update the expected structure of the cache"
This reverts commit 2b53fee426a4e411c313d8c1e424841ba304a9cd.
* Separate the cache dir a bit
* Add a test that the cache is busted with opt levels
* rustfmt
Co-authored-by: Peter Huene <peterhuene@protonmail.com>
@@ -1,6 +1,5 @@
 //! Data structures for representing decoded wasm modules.

-use crate::module_environ::FunctionBodyData;
 use crate::tunables::Tunables;
 use crate::WASM_MAX_PAGES;
 use cranelift_codegen::ir;
@@ -12,7 +11,6 @@ use cranelift_wasm::{
 use indexmap::IndexMap;
 use more_asserts::assert_ge;
 use std::collections::HashMap;
-use std::hash::{Hash, Hasher};
 use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};

 /// A WebAssembly table initializer.
@@ -134,14 +132,16 @@ impl TablePlan {

 /// A translated WebAssembly module, excluding the function bodies and
 /// memory initializers.
-// WARNING: when modifying, make sure that `hash_for_cache` is still valid!
 #[derive(Debug)]
 pub struct Module {
     /// A unique identifier (within this process) for this module.
     pub id: usize,

-    /// Unprocessed signatures exactly as provided by `declare_signature()`.
-    pub signatures: PrimaryMap<SignatureIndex, ir::Signature>,
+    /// Local information about a module which is the bare minimum necessary to
+    /// translate a function body. This is derived as `Hash` whereas this module
+    /// isn't, since it contains too much information needed to translate a
+    /// function.
+    pub local: ModuleLocal,

     /// Names of imported functions, as well as the index of the import that
     /// performed this import.
@@ -156,18 +156,6 @@ pub struct Module {
     /// Names of imported globals.
     pub imported_globals: PrimaryMap<GlobalIndex, (String, String, u32)>,

-    /// Types of functions, imported and local.
-    pub functions: PrimaryMap<FuncIndex, SignatureIndex>,
-
-    /// WebAssembly tables.
-    pub table_plans: PrimaryMap<TableIndex, TablePlan>,
-
-    /// WebAssembly linear memory plans.
-    pub memory_plans: PrimaryMap<MemoryIndex, MemoryPlan>,
-
-    /// WebAssembly global variables.
-    pub globals: PrimaryMap<GlobalIndex, Global>,
-
     /// Exported entities.
     pub exports: IndexMap<String, Export>,

@@ -181,6 +169,42 @@ pub struct Module {
     pub func_names: HashMap<FuncIndex, String>,
 }

+/// Local information known about a wasm module, the bare minimum necessary to
+/// translate function bodies.
+///
+/// This is stored within a `Module` and it implements `Hash`, unlike `Module`,
+/// and is used as part of the cache key when we load compiled modules from the
+/// global cache.
+#[derive(Debug, Hash)]
+pub struct ModuleLocal {
+    /// Unprocessed signatures exactly as provided by `declare_signature()`.
+    pub signatures: PrimaryMap<SignatureIndex, ir::Signature>,
+
+    /// Number of imported functions in the module.
+    pub num_imported_funcs: usize,
+
+    /// Number of imported tables in the module.
+    pub num_imported_tables: usize,
+
+    /// Number of imported memories in the module.
+    pub num_imported_memories: usize,
+
+    /// Number of imported globals in the module.
+    pub num_imported_globals: usize,
+
+    /// Types of functions, imported and local.
+    pub functions: PrimaryMap<FuncIndex, SignatureIndex>,
+
+    /// WebAssembly tables.
+    pub table_plans: PrimaryMap<TableIndex, TablePlan>,
+
+    /// WebAssembly linear memory plans.
+    pub memory_plans: PrimaryMap<MemoryIndex, MemoryPlan>,
+
+    /// WebAssembly global variables.
+    pub globals: PrimaryMap<GlobalIndex, Global>,
+}
+
 impl Module {
     /// Allocates the module data structures.
     pub fn new() -> Self {
@@ -188,132 +212,115 @@ impl Module {

         Self {
             id: NEXT_ID.fetch_add(1, SeqCst),
-            signatures: PrimaryMap::new(),
             imported_funcs: PrimaryMap::new(),
             imported_tables: PrimaryMap::new(),
             imported_memories: PrimaryMap::new(),
             imported_globals: PrimaryMap::new(),
-            functions: PrimaryMap::new(),
-            table_plans: PrimaryMap::new(),
-            memory_plans: PrimaryMap::new(),
-            globals: PrimaryMap::new(),
             exports: IndexMap::new(),
             start_func: None,
             table_elements: Vec::new(),
             func_names: HashMap::new(),
+            local: ModuleLocal {
+                num_imported_funcs: 0,
+                num_imported_tables: 0,
+                num_imported_memories: 0,
+                num_imported_globals: 0,
+                signatures: PrimaryMap::new(),
+                functions: PrimaryMap::new(),
+                table_plans: PrimaryMap::new(),
+                memory_plans: PrimaryMap::new(),
+                globals: PrimaryMap::new(),
+            },
         }
     }
+}

+impl ModuleLocal {
     /// Convert a `DefinedFuncIndex` into a `FuncIndex`.
     pub fn func_index(&self, defined_func: DefinedFuncIndex) -> FuncIndex {
-        FuncIndex::new(self.imported_funcs.len() + defined_func.index())
+        FuncIndex::new(self.num_imported_funcs + defined_func.index())
     }

     /// Convert a `FuncIndex` into a `DefinedFuncIndex`. Returns None if the
     /// index is an imported function.
     pub fn defined_func_index(&self, func: FuncIndex) -> Option<DefinedFuncIndex> {
-        if func.index() < self.imported_funcs.len() {
+        if func.index() < self.num_imported_funcs {
             None
         } else {
             Some(DefinedFuncIndex::new(
-                func.index() - self.imported_funcs.len(),
+                func.index() - self.num_imported_funcs,
             ))
         }
     }

     /// Test whether the given function index is for an imported function.
     pub fn is_imported_function(&self, index: FuncIndex) -> bool {
-        index.index() < self.imported_funcs.len()
+        index.index() < self.num_imported_funcs
     }

     /// Convert a `DefinedTableIndex` into a `TableIndex`.
     pub fn table_index(&self, defined_table: DefinedTableIndex) -> TableIndex {
-        TableIndex::new(self.imported_tables.len() + defined_table.index())
+        TableIndex::new(self.num_imported_tables + defined_table.index())
     }

     /// Convert a `TableIndex` into a `DefinedTableIndex`. Returns None if the
     /// index is an imported table.
     pub fn defined_table_index(&self, table: TableIndex) -> Option<DefinedTableIndex> {
-        if table.index() < self.imported_tables.len() {
+        if table.index() < self.num_imported_tables {
             None
         } else {
             Some(DefinedTableIndex::new(
-                table.index() - self.imported_tables.len(),
+                table.index() - self.num_imported_tables,
             ))
         }
     }

     /// Test whether the given table index is for an imported table.
     pub fn is_imported_table(&self, index: TableIndex) -> bool {
-        index.index() < self.imported_tables.len()
+        index.index() < self.num_imported_tables
     }

     /// Convert a `DefinedMemoryIndex` into a `MemoryIndex`.
     pub fn memory_index(&self, defined_memory: DefinedMemoryIndex) -> MemoryIndex {
-        MemoryIndex::new(self.imported_memories.len() + defined_memory.index())
+        MemoryIndex::new(self.num_imported_memories + defined_memory.index())
     }

     /// Convert a `MemoryIndex` into a `DefinedMemoryIndex`. Returns None if the
     /// index is an imported memory.
     pub fn defined_memory_index(&self, memory: MemoryIndex) -> Option<DefinedMemoryIndex> {
-        if memory.index() < self.imported_memories.len() {
+        if memory.index() < self.num_imported_memories {
             None
         } else {
             Some(DefinedMemoryIndex::new(
-                memory.index() - self.imported_memories.len(),
+                memory.index() - self.num_imported_memories,
             ))
         }
     }

     /// Test whether the given memory index is for an imported memory.
     pub fn is_imported_memory(&self, index: MemoryIndex) -> bool {
-        index.index() < self.imported_memories.len()
+        index.index() < self.num_imported_memories
     }

     /// Convert a `DefinedGlobalIndex` into a `GlobalIndex`.
     pub fn global_index(&self, defined_global: DefinedGlobalIndex) -> GlobalIndex {
-        GlobalIndex::new(self.imported_globals.len() + defined_global.index())
+        GlobalIndex::new(self.num_imported_globals + defined_global.index())
     }

     /// Convert a `GlobalIndex` into a `DefinedGlobalIndex`. Returns None if the
     /// index is an imported global.
     pub fn defined_global_index(&self, global: GlobalIndex) -> Option<DefinedGlobalIndex> {
-        if global.index() < self.imported_globals.len() {
+        if global.index() < self.num_imported_globals {
             None
         } else {
             Some(DefinedGlobalIndex::new(
-                global.index() - self.imported_globals.len(),
+                global.index() - self.num_imported_globals,
             ))
         }
     }

     /// Test whether the given global index is for an imported global.
     pub fn is_imported_global(&self, index: GlobalIndex) -> bool {
-        index.index() < self.imported_globals.len()
-    }
-
-    /// Computes hash of the module for the purpose of caching.
-    pub fn hash_for_cache<'data, H>(
-        &self,
-        function_body_inputs: &PrimaryMap<DefinedFuncIndex, FunctionBodyData<'data>>,
-        state: &mut H,
-    ) where
-        H: Hasher,
-    {
-        // There's no need to cache names (strings), start function
-        // and data initializers (for both memory and tables)
-        self.signatures.hash(state);
-        self.functions.hash(state);
-        self.table_plans.hash(state);
-        self.memory_plans.hash(state);
-        self.globals.hash(state);
-        // IndexMap (self.export) iterates over values in order of item inserts
-        // Let's actually sort the values.
-        let mut exports = self.exports.values().collect::<Vec<_>>();
-        exports.sort();
-        for val in exports {
-            val.hash(state);
-        }
-        function_body_inputs.hash(state);
+        index.index() < self.num_imported_globals
     }
 }