This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and a trait object to provide a backend for an all-included experience in Wasmtime.

Following Chris's suggestion, `Function` has been split into two main parts:

- On the one hand, `FunctionStencil` contains all the fields required during compilation, and acts as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`.
- On the other hand, `FunctionParameters` contains the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on.

Most changes are here to accommodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache:

- Most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location; it's automatically detected the first time a non-default `SourceLoc` is set.
- User-defined external names in the `FunctionStencil` (before this patch, `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- Some refactorings have been made for function names:
  - `ExternalName` was used as the type for a `Function`'s name; while this allowed `ExternalName::Libcall` in this place, using it there would have been quite confusing. Instead, a new enum `UserFuncName` is introduced for this name, which is either a user-defined function name (the above `UserExternalName`) or a test case name.
  - The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with.

The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions.

A basic fuzz target has been introduced that tries to do the bare minimum:

- check that a function successfully compiled and cached will also be successfully reloaded from the cache, and returns the exact same function.
- check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache.
  - This last check is less efficient and less likely to happen, so probably should be rethought a bit.

Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip.

Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement.

Fixes #4155.
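To illustrate the stencil/parameters split and the hash-keyed cache described in this message, here is a minimal, self-contained sketch. Every type in it (`Inst`, `Stencil`, `Params`, `CompiledStencil`, `Compiled`, `Cache`) and the toy `compile` function are hypothetical stand-ins rather than Cranelift's API, and `DefaultHasher` stands in for sha256 so that the example only needs the standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy instruction set. A call names its callee only by an index into the
/// per-function name table, never by the name itself.
#[derive(Clone, Hash)]
enum Inst {
    Const(u8),
    Call { name_ref: usize },
}

/// Toy stand-in for `FunctionStencil`: everything the compiler looks at.
#[derive(Clone, Hash)]
struct Stencil {
    insts: Vec<Inst>,
}

/// Toy stand-in for `FunctionParameters`: the table that name refs point
/// into, stored as (namespace, index) pairs.
struct Params {
    user_names: Vec<(u32, u32)>,
}

/// Toy stand-in for `CompiledCodeBase<Stencil>`: relocations still hold
/// name refs, so the value is valid for any function sharing the stencil.
#[derive(Clone)]
struct CompiledStencil {
    bytes: Vec<u8>,
    relocs: Vec<(usize, usize)>, // (code offset, name ref)
}

/// Toy stand-in for `CompiledCode`: relocations hold the resolved names.
struct Compiled {
    bytes: Vec<u8>,
    relocs: Vec<(usize, (u32, u32))>, // (code offset, (namespace, index))
}

impl CompiledStencil {
    /// Finalize stencil output for one particular function by resolving
    /// each name ref through that function's parameters.
    fn apply_params(self, params: &Params) -> Compiled {
        let relocs = self
            .relocs
            .into_iter()
            .map(|(offset, name_ref)| (offset, params.user_names[name_ref]))
            .collect();
        Compiled { bytes: self.bytes, relocs }
    }
}

/// Stand-in "compiler": emits one byte per instruction and records a
/// relocation at each call site.
fn compile(stencil: &Stencil) -> CompiledStencil {
    let mut out = CompiledStencil { bytes: Vec::new(), relocs: Vec::new() };
    for inst in &stencil.insts {
        match inst {
            Inst::Const(value) => out.bytes.push(*value),
            Inst::Call { name_ref } => {
                let offset = out.bytes.len();
                out.relocs.push((offset, *name_ref));
                out.bytes.push(0xE8); // arbitrary placeholder byte
            }
        }
    }
    out
}

/// Cache keyed by a hash of the stencil only; the parameters never take
/// part in the key.
#[derive(Default)]
struct Cache {
    entries: HashMap<u64, CompiledStencil>,
    hits: usize,
}

impl Cache {
    fn get_or_compile(&mut self, stencil: &Stencil, params: &Params) -> Compiled {
        // Assumption: DefaultHasher replaces the sha256 used by the patch.
        let mut hasher = DefaultHasher::new();
        stencil.hash(&mut hasher);
        let key = hasher.finish();

        if let Some(hit) = self.entries.get(&key).cloned() {
            self.hits += 1;
            return hit.apply_params(params);
        }
        let fresh = compile(stencil);
        self.entries.insert(key, fresh.clone());
        fresh.apply_params(params)
    }
}
```

In this sketch, calling `get_or_compile` twice with the same `Stencil` but different `Params` compiles only once: the second call is a cache hit, both results share the same bytes, and only the resolved relocation targets differ. As in the patch, the sketch trusts the hash alone and performs no equality check on a hit.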
359 lines
12 KiB
Rust
//! Cranelift compilation context and main entry point.
//!
//! When compiling many small functions, it is important to avoid repeatedly allocating and
//! deallocating the data structures needed for compilation. The `Context` struct is used to hold
//! on to memory allocations between function compilations.
//!
//! The context does not hold a `TargetIsa` instance which has to be provided as an argument
//! instead. This is because an ISA instance is immutable and can be used by multiple compilation
//! contexts concurrently. Typically, you would have one context per compilation thread and only a
//! single ISA instance.

use crate::alias_analysis::AliasAnalysis;
use crate::dce::do_dce;
use crate::dominator_tree::DominatorTree;
use crate::flowgraph::ControlFlowGraph;
use crate::ir::Function;
use crate::isa::TargetIsa;
use crate::legalizer::simple_legalize;
use crate::licm::do_licm;
use crate::loop_analysis::LoopAnalysis;
use crate::machinst::{CompiledCode, CompiledCodeStencil};
use crate::nan_canonicalization::do_nan_canonicalization;
use crate::remove_constant_phis::do_remove_constant_phis;
use crate::result::{CodegenResult, CompileResult};
use crate::settings::{FlagsOrIsa, OptLevel};
use crate::simple_gvn::do_simple_gvn;
use crate::simple_preopt::do_preopt;
use crate::unreachable_code::eliminate_unreachable_code;
use crate::verifier::{verify_context, VerifierErrors, VerifierResult};
use crate::{timing, CompileError};
#[cfg(feature = "souper-harvest")]
use alloc::string::String;
use alloc::vec::Vec;

#[cfg(feature = "souper-harvest")]
use crate::souper_harvest::do_souper_harvest;

/// Persistent data structures and compilation pipeline.
pub struct Context {
    /// The function we're compiling.
    pub func: Function,

    /// The control flow graph of `func`.
    pub cfg: ControlFlowGraph,

    /// Dominator tree for `func`.
    pub domtree: DominatorTree,

    /// Loop analysis of `func`.
    pub loop_analysis: LoopAnalysis,

    /// Result of MachBackend compilation, if computed.
    pub(crate) compiled_code: Option<CompiledCode>,

    /// Flag: do we want a disassembly with the CompiledCode?
    pub want_disasm: bool,
}

impl Context {
    /// Allocate a new compilation context.
    ///
    /// The returned instance should be reused for compiling multiple functions in order to avoid
    /// needless allocator thrashing.
    pub fn new() -> Self {
        Self::for_function(Function::new())
    }

    /// Allocate a new compilation context with an existing Function.
    ///
    /// The returned instance should be reused for compiling multiple functions in order to avoid
    /// needless allocator thrashing.
    pub fn for_function(func: Function) -> Self {
        Self {
            func,
            cfg: ControlFlowGraph::new(),
            domtree: DominatorTree::new(),
            loop_analysis: LoopAnalysis::new(),
            compiled_code: None,
            want_disasm: false,
        }
    }

    /// Clear all data structures in this context.
    pub fn clear(&mut self) {
        self.func.clear();
        self.cfg.clear();
        self.domtree.clear();
        self.loop_analysis.clear();
        self.compiled_code = None;
        self.want_disasm = false;
    }

    /// Returns the compilation result for this function, available after any `compile` function
    /// has been called.
    pub fn compiled_code(&self) -> Option<&CompiledCode> {
        self.compiled_code.as_ref()
    }

    /// Set the flag to request a disassembly when compiling with a
    /// `MachBackend` backend.
    pub fn set_disasm(&mut self, val: bool) {
        self.want_disasm = val;
    }

    /// Compile the function, and emit machine code into a `Vec<u8>`.
    ///
    /// Run the function through all the passes necessary to generate code for the target ISA
    /// represented by `isa`, as well as the final step of emitting machine code into a
    /// `Vec<u8>`. The machine code is not relocated. Instead, any relocations can be obtained
    /// from `compiled_code()`.
    ///
    /// This function calls `compile`, taking care to resize `mem` as
    /// needed, so it provides a safe interface.
    ///
    /// Returns information about the function's code and read-only data.
    pub fn compile_and_emit(
        &mut self,
        isa: &dyn TargetIsa,
        mem: &mut Vec<u8>,
    ) -> CompileResult<&CompiledCode> {
        let compiled_code = self.compile(isa)?;
        let code_info = compiled_code.code_info();
        let old_len = mem.len();
        mem.resize(old_len + code_info.total_size as usize, 0);
        mem[old_len..].copy_from_slice(compiled_code.code_buffer());
        Ok(compiled_code)
    }

    /// Internally compiles the function into a stencil.
    ///
    /// Public only for testing and fuzzing purposes.
    pub fn compile_stencil(&mut self, isa: &dyn TargetIsa) -> CodegenResult<CompiledCodeStencil> {
        let _tt = timing::compile();

        self.verify_if(isa)?;

        let opt_level = isa.flags().opt_level();
        log::trace!(
            "Compiling (opt level {:?}):\n{}",
            opt_level,
            self.func.display()
        );

        self.compute_cfg();
        if opt_level != OptLevel::None {
            self.preopt(isa)?;
        }
        if isa.flags().enable_nan_canonicalization() {
            self.canonicalize_nans(isa)?;
        }

        self.legalize(isa)?;
        if opt_level != OptLevel::None {
            self.compute_domtree();
            self.compute_loop_analysis();
            self.licm(isa)?;
            self.simple_gvn(isa)?;
        }

        self.compute_domtree();
        self.eliminate_unreachable_code(isa)?;
        if opt_level != OptLevel::None {
            self.dce(isa)?;
        }

        self.remove_constant_phis(isa)?;

        if opt_level != OptLevel::None && isa.flags().enable_alias_analysis() {
            self.replace_redundant_loads()?;
            self.simple_gvn(isa)?;
        }

        isa.compile_function(&self.func, self.want_disasm)
    }

    /// Compile the function.
    ///
    /// Run the function through all the passes necessary to generate code for the target ISA
    /// represented by `isa`. This does not include the final step of emitting machine code into a
    /// code sink.
    ///
    /// Returns information about the function's code and read-only data.
    pub fn compile(&mut self, isa: &dyn TargetIsa) -> CompileResult<&CompiledCode> {
        let _tt = timing::compile();
        let stencil = self.compile_stencil(isa).map_err(|error| CompileError {
            inner: error,
            func: &self.func,
        })?;
        Ok(self
            .compiled_code
            .insert(stencil.apply_params(&self.func.params)))
    }

    /// If available, return information about the code layout in the
    /// final machine code: the offsets (in bytes) of each basic-block
    /// start, and all basic-block edges.
    pub fn get_code_bb_layout(&self) -> Option<(Vec<usize>, Vec<(usize, usize)>)> {
        if let Some(result) = self.compiled_code.as_ref() {
            Some((
                result.bb_starts.iter().map(|&off| off as usize).collect(),
                result
                    .bb_edges
                    .iter()
                    .map(|&(from, to)| (from as usize, to as usize))
                    .collect(),
            ))
        } else {
            None
        }
    }

    /// Creates unwind information for the function.
    ///
    /// Returns `None` if the function has no unwind information.
    ///
    /// # Panics
    ///
    /// Panics if the function has not been compiled yet, since there is no compiled code to
    /// describe.
    #[cfg(feature = "unwind")]
    pub fn create_unwind_info(
        &self,
        isa: &dyn TargetIsa,
    ) -> CodegenResult<Option<crate::isa::unwind::UnwindInfo>> {
        let unwind_info_kind = isa.unwind_info_kind();
        let result = self.compiled_code.as_ref().unwrap();
        isa.emit_unwind_info(result, unwind_info_kind)
    }

    /// Run the verifier on the function.
    ///
    /// Also check that the dominator tree and control flow graph are consistent with the function.
    pub fn verify<'a, FOI: Into<FlagsOrIsa<'a>>>(&self, fisa: FOI) -> VerifierResult<()> {
        let mut errors = VerifierErrors::default();
        let _ = verify_context(&self.func, &self.cfg, &self.domtree, fisa, &mut errors);

        if errors.is_empty() {
            Ok(())
        } else {
            Err(errors)
        }
    }

    /// Run the verifier only if the `enable_verifier` setting is true.
    pub fn verify_if<'a, FOI: Into<FlagsOrIsa<'a>>>(&self, fisa: FOI) -> CodegenResult<()> {
        let fisa = fisa.into();
        if fisa.flags.enable_verifier() {
            self.verify(fisa)?;
        }
        Ok(())
    }

    /// Perform dead-code elimination on the function.
    pub fn dce<'a, FOI: Into<FlagsOrIsa<'a>>>(&mut self, fisa: FOI) -> CodegenResult<()> {
        do_dce(&mut self.func, &mut self.domtree);
        self.verify_if(fisa)?;
        Ok(())
    }

    /// Perform constant-phi removal on the function.
    pub fn remove_constant_phis<'a, FOI: Into<FlagsOrIsa<'a>>>(
        &mut self,
        fisa: FOI,
    ) -> CodegenResult<()> {
        do_remove_constant_phis(&mut self.func, &mut self.domtree);
        self.verify_if(fisa)?;
        Ok(())
    }

    /// Perform pre-legalization rewrites on the function.
    pub fn preopt(&mut self, isa: &dyn TargetIsa) -> CodegenResult<()> {
        do_preopt(&mut self.func, &mut self.cfg, isa);
        self.verify_if(isa)?;
        Ok(())
    }

    /// Perform NaN canonicalizing rewrites on the function.
    pub fn canonicalize_nans(&mut self, isa: &dyn TargetIsa) -> CodegenResult<()> {
        do_nan_canonicalization(&mut self.func);
        self.verify_if(isa)
    }

    /// Run the legalizer for `isa` on the function.
    pub fn legalize(&mut self, isa: &dyn TargetIsa) -> CodegenResult<()> {
        // Legalization invalidates the domtree and loop_analysis by mutating the CFG.
        // TODO: Avoid doing this when legalization doesn't actually mutate the CFG.
        self.domtree.clear();
        self.loop_analysis.clear();

        // Run some specific legalizations only.
        simple_legalize(&mut self.func, &mut self.cfg, isa);
        self.verify_if(isa)
    }

    /// Compute the control flow graph.
    pub fn compute_cfg(&mut self) {
        self.cfg.compute(&self.func)
    }

    /// Compute dominator tree.
    pub fn compute_domtree(&mut self) {
        self.domtree.compute(&self.func, &self.cfg)
    }

    /// Compute the loop analysis.
    pub fn compute_loop_analysis(&mut self) {
        self.loop_analysis
            .compute(&self.func, &self.cfg, &self.domtree)
    }

    /// Compute the control flow graph and dominator tree.
    pub fn flowgraph(&mut self) {
        self.compute_cfg();
        self.compute_domtree()
    }

    /// Perform simple GVN on the function.
    pub fn simple_gvn<'a, FOI: Into<FlagsOrIsa<'a>>>(&mut self, fisa: FOI) -> CodegenResult<()> {
        do_simple_gvn(&mut self.func, &mut self.domtree);
        self.verify_if(fisa)
    }

    /// Perform LICM on the function.
    pub fn licm(&mut self, isa: &dyn TargetIsa) -> CodegenResult<()> {
        do_licm(
            &mut self.func,
            &mut self.cfg,
            &mut self.domtree,
            &mut self.loop_analysis,
        );
        self.verify_if(isa)
    }

    /// Perform unreachable code elimination.
    pub fn eliminate_unreachable_code<'a, FOI>(&mut self, fisa: FOI) -> CodegenResult<()>
    where
        FOI: Into<FlagsOrIsa<'a>>,
    {
        eliminate_unreachable_code(&mut self.func, &mut self.cfg, &self.domtree);
        self.verify_if(fisa)
    }

    /// Replace all redundant loads with the known values in
    /// memory. These are loads whose values were already loaded by
    /// other loads earlier, as well as loads whose values were stored
    /// by a store instruction to the same address (so-called
    /// "store-to-load forwarding").
    pub fn replace_redundant_loads(&mut self) -> CodegenResult<()> {
        let mut analysis = AliasAnalysis::new(&mut self.func, &self.domtree);
        analysis.compute_and_update_aliases();
        Ok(())
    }

    /// Harvest candidate left-hand sides for superoptimization with Souper.
    #[cfg(feature = "souper-harvest")]
    pub fn souper_harvest(
        &mut self,
        out: &mut std::sync::mpsc::Sender<String>,
    ) -> CodegenResult<()> {
        do_souper_harvest(&self.func, out);
        Ok(())
    }
}