aarch64: Add specialized shuffle lowerings (#5977)
* aarch64: Add `shuffle` lowerings for the `uzp{1,2}` instructions
This commit uses the same style of patterns as the x64 backend to start
adding specific lowerings of the Cranelift `shuffle` instruction to
particular AArch64 instructions.
* aarch64: Add `shuffle` lowerings to the `zip{1,2}` instructions
These instructions match the `punpck*` family of instructions on x64 and
should help provide more efficient lowerings than the current `shuffle`
fallback.
* aarch64: Add `shuffle` lowerings for `trn{1,2}`
Along the lines of prior commits, this adds specific lowering patterns for
the individual AArch64 instructions available.
* aarch64: Add a `shuffle` lowering for the `ext` instruction
This instruction more-or-less concatenates two 128-bit vector
registers to create a 256-bit value, shifts it right by a whole number of
bytes, and then takes the lower 128 bits into the destination. This can be
modeled with a `shuffle` of consecutive bytes, so this adds a lowering rule
to generate this instruction.
* aarch64: Add `shuffle` special case for `dup`
This commit adds special cases for Cranelift's `shuffle` on AArch64 when
the lowering can be represented with a `dup` instruction which
broadcasts one vector's lane into all lanes of the destination.
* aarch64: Add `shuffle` specializations for `rev` instructions
This commit adds shuffle mask specializations for the `rev{16,32,64}`
family of instructions on AArch64 which can be used to reverse bytes,
16-bit values, or 32-bit values within larger values.
* Fix tests
* Add doc-comments in ISLE
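
To make the `ext` case above concrete, here is a minimal sketch in plain Rust. It is not the code from this commit: `shuffle_model`, `ext_model`, and `ext_offset` are hypothetical names introduced for illustration. It assumes a `shuffle` mask is 16 bytes, each in the range 0..=31, indexing into the concatenation of the two inputs. Under that assumption, a mask of 16 consecutive bytes starting at `n` (with `n < 16`) selects the same bytes as `ext` with byte offset `n`.

```rust
/// Software model of a byte shuffle: each mask byte in 0..=31 selects a byte
/// from the 32-byte concatenation of `a` (indices 0..=15) and `b` (16..=31).
fn shuffle_model(a: &[u8; 16], b: &[u8; 16], mask: &[u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (dst, &idx) in out.iter_mut().zip(mask.iter()) {
        let idx = usize::from(idx);
        assert!(idx < 32, "mask index out of range");
        *dst = if idx < 16 { a[idx] } else { b[idx - 16] };
    }
    out
}

/// Software model of `ext` with byte offset `n`: concatenate `a` (low half)
/// and `b` (high half), then take 16 bytes starting at byte `n`.
fn ext_model(a: &[u8; 16], b: &[u8; 16], n: usize) -> [u8; 16] {
    assert!(n < 16, "ext offset must be below 16");
    let mut wide = [0u8; 32];
    wide[..16].copy_from_slice(a);
    wide[16..].copy_from_slice(b);
    let mut out = [0u8; 16];
    out.copy_from_slice(&wide[n..n + 16]);
    out
}

/// Hypothetical matcher: returns `Some(n)` when `mask` is exactly
/// `n, n + 1, ..., n + 15` with `n < 16`, i.e. a window of consecutive bytes.
fn ext_offset(mask: &[u8; 16]) -> Option<u8> {
    let n = mask[0];
    if n >= 16 {
        return None;
    }
    // `n <= 15` and `i <= 15`, so `n + i` cannot overflow a `u8`.
    let consecutive = mask.iter().enumerate().all(|(i, &m)| m == n + i as u8);
    consecutive.then_some(n)
}
```

For example, the mask `[3, 4, ..., 18]` yields `ext_offset(..) == Some(3)`, and `shuffle_model` with that mask agrees with `ext_model(a, b, 3)`. An offset of 0 degenerates to returning the first input unchanged, which a real lowering would likely handle separately.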
@@ -586,6 +586,17 @@ macro_rules! isle_lower_prelude_methods {
             self.lower_ctx.gen_return(rets);
         }
 
+        /// Same as `shuffle32_from_imm`, but for 64-bit lane shuffles.
+        fn shuffle64_from_imm(&mut self, imm: Immediate) -> Option<(u8, u8)> {
+            use crate::machinst::isle::shuffle_imm_as_le_lane_idx;
+
+            let bytes = self.lower_ctx.get_immediate_data(imm).as_slice();
+            Some((
+                shuffle_imm_as_le_lane_idx(8, &bytes[0..8])?,
+                shuffle_imm_as_le_lane_idx(8, &bytes[8..16])?,
+            ))
+        }
+
         /// Attempts to interpret the shuffle immediate `imm` as a shuffle of
         /// 32-bit lanes, returning four integers, each of which is less than 8,
         /// which represents a permutation of 32-bit lanes as specified by
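
The hunk above calls `shuffle_imm_as_le_lane_idx` without showing its body. The sketch below is an assumption about what such a helper plausibly does, not the Cranelift implementation: `lane_idx` and `shuffle64_lanes` are hypothetical names. It treats a group of `size` mask bytes as one little-endian lane when the first byte is aligned to `size` and the remaining bytes count up by one, and returns the lane index `bytes[0] / size`.

```rust
/// Hypothetical stand-in for `shuffle_imm_as_le_lane_idx`: interprets `bytes`
/// (exactly `size` mask bytes) as a single little-endian lane selection.
/// Returns the lane index, or `None` if the bytes do not form a whole lane.
fn lane_idx(size: u8, bytes: &[u8]) -> Option<u8> {
    assert!(size > 0, "lane size must be nonzero");
    assert_eq!(bytes.len(), usize::from(size));
    // The first byte index must sit on a lane boundary.
    if bytes[0] % size != 0 {
        return None;
    }
    // Every following byte must be exactly one larger than its predecessor.
    let consecutive = bytes
        .windows(2)
        .all(|w| w[0].checked_add(1) == Some(w[1]));
    consecutive.then(|| bytes[0] / size)
}

/// Mirrors the shape of `shuffle64_from_imm`, but takes the 16 mask bytes
/// directly instead of looking them up through a lowering context.
fn shuffle64_lanes(mask: &[u8; 16]) -> Option<(u8, u8)> {
    Some((lane_idx(8, &mask[0..8])?, lane_idx(8, &mask[8..16])?))
}
```

With this sketch, a mask of bytes `0..8` followed by `16..24` gives `Some((0, 2))`: lane 0 of the first input and lane 0 of the second, counting 64-bit lanes across both inputs. A mask whose first group starts at byte 1 gives `None`, since it does not line up with a 64-bit lane.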