Use maximum inline capacity available for SmallVec<VRegIndex> in SpillSet (#100)

* Use maximum inline capacity available for `SmallVec<VRegIndex>` in `SpillSet`

We were using 2, which is the maximum for 32-bit architectures, but on 64-bit
architectures we can get 4 inline elements without growing the size of the
`SmallVec`.
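
As a rough illustration of the arithmetic behind the 2 versus 4 figures, here is a minimal sketch. It assumes `VRegIndex` is a 4-byte wrapper around `u32` (the stand-in type below is hypothetical, not the real definition), and `capacity_for` is a helper invented here so that both word sizes can be checked on any host.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the real `VRegIndex`; assumed to wrap a `u32`.
#[derive(Clone, Copy, Debug)]
#[allow(dead_code)]
struct VRegIndex(u32);

/// How many `T`s fit in the two machine words (pointer + capacity) that a
/// `Vec<T>` spends on heap bookkeeping. Assumes `T` is not zero-sized.
const fn no_bloat_capacity<T>() -> usize {
    size_of::<usize>() * 2 / size_of::<T>()
}

/// Same arithmetic with the word and element sizes passed explicitly
/// (illustrative helper, not part of the commit).
const fn capacity_for(word_bytes: usize, elem_bytes: usize) -> usize {
    word_bytes * 2 / elem_bytes
}

/// A `const fn` result can be used wherever a constant is required,
/// such as an array length.
const INLINE: usize = no_bloat_capacity::<VRegIndex>();

fn main() {
    let inline_storage = [VRegIndex(0); INLINE];
    println!("inline capacity on this target: {}", inline_storage.len());
    println!("32-bit words: {}", capacity_for(4, size_of::<VRegIndex>()));
    println!("64-bit words: {}", capacity_for(8, size_of::<VRegIndex>()));
}
```

Under that 4-byte assumption, two 4-byte words hold two elements and two 8-byte words hold four, which matches the numbers quoted above.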

This is a statistically significant speedup, but it is so small (roughly 0.1%
to 0.4% fewer instructions retired in the results below) that our formatting
of floats truncates it to 1.00x.

```
compilation :: instructions-retired :: benchmarks/bz2/benchmark.wasm

  Δ = 3360297.85 ± 40136.18 (confidence = 99%)

  more-inline-capacity.so is 1.00x to 1.00x faster than main.so!

  [945563401 945906690.73 946043245] main.so
  [942192473 942546392.88 942729104] more-inline-capacity.so

compilation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 1780540.13 ± 39362.84 (confidence = 99%)

  more-inline-capacity.so is 1.00x to 1.00x faster than main.so!

  [1544990595 1545359408.41 1545626251] main.so
  [1543269057 1543578868.28 1543851201] more-inline-capacity.so

compilation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 36577153.54 ± 243753.54 (confidence = 99%)

  more-inline-capacity.so is 1.00x to 1.00x faster than main.so!

  [33956158997 33957780594.50 33959538220] main.so
  [33919762415 33921203440.96 33923023358] more-inline-capacity.so
```
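
Since the report rounds the ratio to 1.00x, the relative change can be recovered from the raw numbers. The sketch below assumes the middle value in each bracketed triple is the mean for `main.so` and that Δ is the difference of means; `percent_reduction` is a name made up for this example.

```rust
/// Reduction in instructions retired, as a percentage of the baseline mean.
fn percent_reduction(baseline_mean: f64, delta: f64) -> f64 {
    delta / baseline_mean * 100.0
}

fn main() {
    // (benchmark, assumed mean for main.so, reported delta), copied from the
    // output above.
    let results = [
        ("bz2", 945906690.73, 3360297.85),
        ("pulldown-cmark", 1545359408.41, 1780540.13),
        ("spidermonkey", 33957780594.50, 36577153.54),
    ];

    for (name, mean, delta) in results.iter() {
        println!(
            "{}: {:.2}% fewer instructions retired",
            name,
            percent_reduction(*mean, *delta)
        );
    }
}
```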

* Use a `const fn` to calculate number of inline elements
Nick Fitzgerald
2022-11-02 12:16:22 -07:00
committed by GitHub
parent eb0a8fd22f
commit b41b1f9a3c


@@ -264,9 +264,26 @@ pub struct BundleProperties {
     pub fixed: bool,
 }
 
+/// Calculate the maximum `N` inline capacity for a `SmallVec<[T; N]>` we can
+/// have without bloating its size to be larger than a `Vec<T>`.
+const fn no_bloat_capacity<T>() -> usize {
+    // `Vec<T>` is three words: `(pointer, capacity, length)`.
+    //
+    // A `SmallVec<[T; N]>` replaces the first two members with the following:
+    //
+    //     union {
+    //         Inline([T; N]),
+    //         Heap(pointer, capacity),
+    //     }
+    //
+    // So if `size_of([T; N]) == size_of(pointer) + size_of(capacity)` then we
+    // get the maximum inline capacity without bloat.
+    std::mem::size_of::<usize>() * 2 / std::mem::size_of::<T>()
+}
+
 #[derive(Clone, Debug)]
 pub struct SpillSet {
-    pub vregs: SmallVec<[VRegIndex; 2]>,
+    pub vregs: SmallVec<[VRegIndex; no_bloat_capacity::<VRegIndex>()]>,
     pub slot: SpillSlotIndex,
     pub reg_hint: PReg,
     pub class: RegClass,