Cranelift: MachBuffer: apply branch peephole opts one last time at buffer tail. (#4652)
The `MachBuffer` applies a set of peephole-optimization rules to do branch threading, leverage fallthrough paths, eliminate empty blocks, and flip conditional branches where needed. Together these make branches more efficient, starting from naive always-branch-at-end-of-BB code.

This works by applying the rules at every label-bind, which is equivalent to applying them at the end of every basic block, where branches are usually inserted.

However, this misses one case: the end of the buffer! No label is bound after the last block, so currently we don't optimize any redundant or foldable branches at the very end of the machine code. This usually doesn't matter, because the function typically ends in an epilogue with `ret` as the last instruction. However, when cold blocks exist, they are laid out after the hot code, so the buffer can end in a branch instead, and then it does matter.

Thanks to @mchesser for pointing out this issue in #4636.
@@ -1214,6 +1214,10 @@ impl<I: VCodeInst> MachBuffer<I> {
     pub fn finish(mut self) -> MachBufferFinalized {
         let _tt = timing::vcode_emit_finish();

+        // Do any optimizations on branches at tail of buffer, as if we
+        // had bound one last label.
+        self.optimize_branches();
+
         self.finish_emission_maybe_forcing_veneers(false);

         let mut srclocs = self.srclocs;
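
To illustrate why the tail of the buffer is special, here is a minimal toy model, not Cranelift's actual implementation. `ToyBuffer`, `Inst`, `build`, and the `tail_opt` flag are hypothetical names invented for this sketch, and only one simplified rule is modeled: a block that consists solely of an unconditional jump, and that cannot be reached by fallthrough, is removed and its labels are redirected to the jump's target. As in the description above, the rule runs at every label-bind; the flag stands in for the extra call that this commit adds to `finish`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Inst {
    Op(&'static str), // any non-branch instruction
    Ret,
    Jump(u32), // unconditional jump to a label
}

#[derive(Default)]
struct ToyBuffer {
    insts: Vec<Inst>,
    bound: HashMap<u32, usize>, // label -> index of the instruction it is bound at
    alias: HashMap<u32, u32>,   // label -> label it was redirected to
}

impl ToyBuffer {
    fn emit(&mut self, inst: Inst) {
        self.insts.push(inst);
    }

    // Follow redirections until we reach a label that has none.
    fn resolve(&self, mut label: u32) -> u32 {
        while let Some(&next) = self.alias.get(&label) {
            label = next;
        }
        label
    }

    // Binding a label is the only point where the toy rule normally runs.
    fn bind_label(&mut self, label: u32) {
        self.optimize_branches();
        self.bound.insert(label, self.insts.len());
    }

    // Toy rule: drop a trailing jump-only block that has no fallthrough
    // predecessor, redirecting the labels bound at it to the jump target.
    fn optimize_branches(&mut self) {
        loop {
            let n = self.insts.len();
            if n < 2 {
                return;
            }
            let target = match self.insts[n - 1] {
                Inst::Jump(t) => self.resolve(t),
                _ => return,
            };
            // The previous instruction must not fall through into the jump.
            if !matches!(self.insts[n - 2], Inst::Ret | Inst::Jump(_)) {
                return;
            }
            let here: Vec<u32> = self
                .bound
                .iter()
                .filter(|&(_, &idx)| idx == n - 1)
                .map(|(&l, _)| l)
                .collect();
            // A jump to itself is an infinite loop; leave it alone.
            if here.contains(&target) {
                return;
            }
            for l in here {
                self.bound.remove(&l);
                self.alias.insert(l, target);
            }
            self.insts.pop();
        }
    }

    // `tail_opt` models the change in this commit: run the rule once more,
    // as if one last label had been bound at the end of the buffer.
    fn finish(mut self, tail_opt: bool) -> Vec<Inst> {
        if tail_opt {
            self.optimize_branches();
        }
        self.insts
            .iter()
            .map(|i| match i {
                Inst::Jump(l) => Inst::Jump(self.resolve(*l)),
                other => other.clone(),
            })
            .collect()
    }
}

// A function body followed by an empty cold block placed last.
fn build(tail_opt: bool) -> Vec<Inst> {
    let mut buf = ToyBuffer::default();
    buf.bind_label(0);
    buf.emit(Inst::Op("work"));
    buf.emit(Inst::Ret);
    buf.bind_label(1); // cold block: contains only a jump
    buf.emit(Inst::Jump(0));
    buf.finish(tail_opt)
}
```

In this sketch, `build(false)` leaves the trailing `Jump(0)` in the output because no label is ever bound after it, while `build(true)` removes it. The real `MachBuffer` tracks byte offsets, fixups, and several more rules, so this only shows the shape of the problem.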