Commit Graph

4 Commits

Author SHA1 Message Date
Sam Parker
9c43749dfe [RFC] Dynamic Vector Support (#4200)
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which have no set size like their
existing counterparts. New IR instructions are added to access these
new stack entities.

Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.

The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.

Copyright (c) 2022, Arm Limited.
2022-07-07 12:54:39 -07:00
Chris Fallin
00f357c028 Cranelift: support 14-bit Type index with some bitpacking. (#4269)
* Cranelift: make `ir::Type` a `u16`.

* Cranelift: pack ValueData back into 64 bits.

After extending `Type` to a `u16`, `ValueData` became 12 bytes rather
than 8. This packs it back down to 8 bytes (64 bits) by stealing two
bits from the `Type` for the enum discriminant (leaving 14 bits for the
type itself).

Performance comparison (3-way between original (`ty-u8`), 16-bit `Type`
(`ty-u16`), and this PR (`ty-packed`)):

```
~/work/sightglass% target/release/sightglass-cli benchmark \
    -e ~/ty-u8.so -e ~/ty-u16.so -e ~/ty-packed.so \
    --iterations-per-process 10 --processes 2 \
    benchmarks-next/spidermonkey/benchmark.wasm

compilation
  benchmarks-next/spidermonkey/benchmark.wasm
    cycles
      [20654406874 21749213920.50 22958520306] /home/cfallin/ty-packed.so
      [22227738316 22584704883.90 22916433748] /home/cfallin/ty-u16.so
      [20659150490 21598675968.60 22588108428] /home/cfallin/ty-u8.so
    nanoseconds
      [5435333269 5723139427.25 6041072883] /home/cfallin/ty-packed.so
      [5848788229 5942729637.85 6030030341] /home/cfallin/ty-u16.so
      [5436002390 5683248226.10 5943626225] /home/cfallin/ty-u8.so
```

So, when compiling SpiderMonkey.wasm, making `Type` 16 bits regresses
performance by 4.5% (5.683s -> 5.723s), while this PR gets 14 bits for a 1.0%
cost (5.683s -> 5.723s). That's still not great, and we can likely do better,
but it's a start.

* Fix test failure: entities to/from u32 via `{from,to}_bits`, not `{from,to}_u32`.
2022-07-05 14:51:02 -07:00
bjorn3
4c75616a7c Remove unused constants from cranelift-codegen-shared (#3479) 2021-11-18 18:51:35 +01:00
Benjamin Bouvier
f668869508 Share constants between codegen and the meta crate; 2019-10-10 16:45:48 +02:00