Cranelift: support 14-bit Type index with some bitpacking. (#4269)
* Cranelift: make `ir::Type` a `u16`.
* Cranelift: pack ValueData back into 64 bits.
After extending `Type` to a `u16`, `ValueData` became 12 bytes rather
than 8. This packs it back down to 8 bytes (64 bits) by stealing two
bits from the `Type` for the enum discriminant (leaving 14 bits for the
type itself).
Performance comparison (3-way between original (`ty-u8`), 16-bit `Type`
(`ty-u16`), and this PR (`ty-packed`)):
```
~/work/sightglass% target/release/sightglass-cli benchmark \
-e ~/ty-u8.so -e ~/ty-u16.so -e ~/ty-packed.so \
--iterations-per-process 10 --processes 2 \
benchmarks-next/spidermonkey/benchmark.wasm
compilation
benchmarks-next/spidermonkey/benchmark.wasm
cycles
[20654406874 21749213920.50 22958520306] /home/cfallin/ty-packed.so
[22227738316 22584704883.90 22916433748] /home/cfallin/ty-u16.so
[20659150490 21598675968.60 22588108428] /home/cfallin/ty-u8.so
nanoseconds
[5435333269 5723139427.25 6041072883] /home/cfallin/ty-packed.so
[5848788229 5942729637.85 6030030341] /home/cfallin/ty-u16.so
[5436002390 5683248226.10 5943626225] /home/cfallin/ty-u8.so
```
So, when compiling SpiderMonkey.wasm, making `Type` 16 bits regresses
performance by about 4.6% (5.683s -> 5.943s), while this PR gets 14 bits for
about a 0.7% cost (5.683s -> 5.723s). That's still not great, and we can likely
do better, but it's a start.
* Fix test failure: entities to/from u32 via `{from,as}_bits`, not `{from,as}_u32`.
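The packing described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the variant names, the field names, the tag values, and the exact layout (`tag:2 | type:14 | num:16 | index:32`) are assumptions chosen to show how two bits taken from the 16-bit type field can carry the enum discriminant.

```rust
/// Hypothetical unpacked form of a value definition (names are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ValueData {
    Inst { ty: u16, num: u16, inst: u32 },
    Param { ty: u16, num: u16, block: u32 },
    Alias { ty: u16, original: u32 },
}

/// Packed form: a single 64-bit word.
/// Assumed layout, high bits to low bits: tag:2 | type:14 | num:16 | index:32.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ValueDataPacked(u64);

const TAG_SHIFT: u32 = 62;
const TYPE_SHIFT: u32 = 48;
const NUM_SHIFT: u32 = 32;
const TYPE_MASK: u64 = (1 << 14) - 1;

impl ValueDataPacked {
    fn make(tag: u64, ty: u16, num: u16, index: u32) -> Self {
        // The top two bits of the 16-bit type field are reserved for the tag,
        // so only 14 bits remain for the type index itself.
        assert!(u64::from(ty) <= TYPE_MASK, "type index must fit in 14 bits");
        ValueDataPacked(
            (tag << TAG_SHIFT)
                | (u64::from(ty) << TYPE_SHIFT)
                | (u64::from(num) << NUM_SHIFT)
                | u64::from(index),
        )
    }
}

impl From<ValueData> for ValueDataPacked {
    fn from(data: ValueData) -> Self {
        match data {
            ValueData::Inst { ty, num, inst } => Self::make(0, ty, num, inst),
            ValueData::Param { ty, num, block } => Self::make(1, ty, num, block),
            ValueData::Alias { ty, original } => Self::make(2, ty, 0, original),
        }
    }
}

impl From<ValueDataPacked> for ValueData {
    fn from(packed: ValueDataPacked) -> Self {
        let bits = packed.0;
        let tag = bits >> TAG_SHIFT;
        let ty = ((bits >> TYPE_SHIFT) & TYPE_MASK) as u16;
        let num = (bits >> NUM_SHIFT) as u16; // truncation keeps the low 16 bits
        let index = bits as u32; // truncation keeps the low 32 bits
        match tag {
            0 => ValueData::Inst { ty, num, inst: index },
            1 => ValueData::Param { ty, num, block: index },
            2 => ValueData::Alias { ty, original: index },
            _ => unreachable!("tag 3 is never produced by the encoder"),
        }
    }
}
```

In this sketch the packed word is always 8 bytes, and converting to the packed form and back returns the original value as long as the type index stays below 2^14.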
@@ -109,6 +109,20 @@ macro_rules! entity_impl {
             pub fn as_u32(self) -> u32 {
                 self.0
             }
+
+            /// Return the raw bit encoding for this instance.
+            #[allow(dead_code)]
+            #[inline]
+            pub fn as_bits(self) -> u32 {
+                self.0
+            }
+
+            /// Create a new instance from the raw bit encoding.
+            #[allow(dead_code)]
+            #[inline]
+            pub fn from_bits(x: u32) -> Self {
+                $entity(x)
+            }
         }
     };
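As a rough illustration of how the raw-bit accessors in this hunk might be used, the sketch below hand-writes a newtype with the same two methods and round-trips it through the low 32 bits of a 64-bit word. The `Block` type here is a stand-in written out by hand rather than generated by `entity_impl!`, and `store_low32` / `load_low32` are hypothetical helpers that do not appear in the commit.

```rust
/// Hand-written stand-in for an entity reference (hypothetical, for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Block(u32);

impl Block {
    /// Return the raw bit encoding for this instance.
    #[inline]
    pub fn as_bits(self) -> u32 {
        self.0
    }

    /// Create a new instance from the raw bit encoding.
    #[inline]
    pub fn from_bits(x: u32) -> Self {
        Block(x)
    }
}

/// Hypothetical helper: overwrite the low 32 bits of `word` with the entity's bits.
pub fn store_low32(word: u64, b: Block) -> u64 {
    (word & !0xFFFF_FFFF_u64) | u64::from(b.as_bits())
}

/// Hypothetical helper: read an entity back from the low 32 bits of `word`.
pub fn load_low32(word: u64) -> Block {
    Block::from_bits(word as u32)
}
```

Because both accessors pass the `u32` through unchanged, every bit pattern survives the round trip, including `u32::MAX`.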