wasmtime/tests/misc_testsuite/relaxed-simd/i32x4_relaxed_trunc.wast
Alex Crichton 8bb183f16e Implement the relaxed SIMD proposal (#5892)
* Initial support for the Relaxed SIMD proposal

This commit adds initial scaffolding and support for the Relaxed SIMD
proposal for WebAssembly. Codegen is supported on the x64 and AArch64
backends at this time.

The purpose of this commit is to get all the boilerplate out of the way
in terms of plumbing through a new feature, adding tests, etc. The tests
are copied from the upstream repository at this time while the
WebAssembly/testsuite repository hasn't been updated.

A summary of the changes made in this commit:

* Lowerings for all relaxed simd opcodes have been added, currently all
  exhibiting deterministic behavior. This means that few lowerings are
  optimal on the x86 backend, but on the AArch64 backend, for example,
  all lowerings should be optimal.

* Support is added to codegen to, eventually, conditionally generate
  different code based on input codegen flags. This is intended to
  enable codegen to use more efficient instructions on x86 by default,
  for example, while still allowing embedders to force
  architecture-independent semantics and behavior. One good example of
  this is the `f32x4.relaxed_madd` instruction which when deterministic
  forces the `fma` instruction, but otherwise if the backend doesn't
  have support for `fma` then intermediate operations are performed
  instead.

* Lowerings of `iadd_pairwise` for `i16x8` and `i32x4` were added to the
  x86 backend as they're now exercised by the deterministic lowerings of
  relaxed simd instructions.

* Sample codegen tests were added for x86 and aarch64 for some relaxed
  simd instructions.

* Wasmtime embedder support for the relaxed-simd proposal and for
  forcing determinism has been added to `Config` and the CLI.

* Support has been added to the `*.wast` runtime execution for the
  `(either ...)` matcher used in the relaxed-simd proposal.

* Tests for relaxed-simd are run both with a default `Engine` as well as
  a "force deterministic" `Engine` to test both configurations.

* All tests from the upstream repository were copied into Wasmtime.
  These tests should be deleted when WebAssembly/testsuite is updated.
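
As a rough illustration of the `(either ...)` matcher mentioned in the list
above, the following Rust sketch accepts a result if it equals any one of the
listed alternatives. The name `matches_either` and the `[u32; 4]` lane
representation are assumptions made for this example; this is not the code
used by Wasmtime's `*.wast` runner.

```rust
/// Hypothetical representation: a v128 result viewed as four 32-bit lanes.
type I32x4 = [u32; 4];

/// Returns true if `actual` equals at least one of the expected
/// alternatives, mirroring the idea behind `(either a b ...)` in a
/// `*.wast` assertion.
fn matches_either(actual: I32x4, alternatives: &[I32x4]) -> bool {
    alternatives.iter().any(|expected| *expected == actual)
}
```

With the alternatives `[0; 4]` and `[0x8000_0000; 4]`, both an all-zero result
and an all-`INT32_MIN` result would be accepted, while any other lane pattern
would be rejected.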

* x64: Add x86-specific lowerings for relaxed simd

This commit builds on the prior commit and adds an array of `x86_*`
instructions to Cranelift which have semantics that match their
corresponding x86 equivalents. Translation for relaxed simd is then
additionally updated to conditionally generate different CLIF for
relaxed simd instructions depending on whether the target is x86 or not.
This means that for AArch64 no changes are made but for x86 most relaxed
instructions now lower to some x86-equivalent with slightly different
semantics than the "deterministic" lowering.
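
To make the difference in semantics concrete, here is a minimal Rust sketch of
one relaxed instruction, `i32x4.relaxed_trunc_f32x4_s`, under two behaviors.
The function names are hypothetical, and the x86-style variant is an assumed
model of a truncating conversion that returns `INT32_MIN` for NaN and
out-of-range inputs; it is not taken from the Cranelift lowering itself.

```rust
/// Deterministic semantics: saturate out-of-range values and map NaN to 0.
/// Rust's `as` cast from f32 to i32 behaves this way.
fn trunc_f32_to_i32_deterministic(x: f32) -> i32 {
    x as i32
}

/// x86-style semantics (an assumption for illustration): NaN and any
/// out-of-range input yield i32::MIN.
fn trunc_f32_to_i32_x86_style(x: f32) -> i32 {
    // 2^31 is exactly representable in f32; in-range inputs lie in
    // [-2^31, 2^31).
    const LIMIT: f32 = 2147483648.0;
    if x.is_nan() || x < -LIMIT || x >= LIMIT {
        i32::MIN
    } else {
        x as i32
    }
}

/// Apply a scalar conversion to each of the four lanes.
fn map_lanes(v: [f32; 4], f: fn(f32) -> i32) -> [i32; 4] {
    [f(v[0]), f(v[1]), f(v[2]), f(v[3])]
}
```

Both variants agree on in-range inputs and differ only on NaN and on values
above the `i32` range, which is why tests for relaxed instructions list more
than one acceptable result.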

* Add libcall support for fma to Wasmtime

This will be required to implement the `f32x4.relaxed_madd` instruction
(and others) when an x86 host doesn't specify the `has_fma` feature.
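
The sketch below shows, for a single `f32` lane, why a fused fallback matters:
a fused multiply-add rounds once, while separate multiply and add round twice
and can give a different answer. The function names and the `has_fma` and
`deterministic` parameters are hypothetical and only model the selection
described in this commit message. `f32::mul_add` stands in for the libcall
here, on the assumption that it typically resolves to a software routine on
targets without a hardware FMA instruction.

```rust
/// Fused multiply-add: computes a * b + c with a single rounding.
fn madd_fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Unfused variant: the product is rounded before the addition.
fn madd_unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

/// Hypothetical selection logic: use the fused form when the host has FMA
/// or when deterministic behavior is forced; otherwise fall back to the
/// separate multiply and add.
fn relaxed_madd(a: f32, b: f32, c: f32, has_fma: bool, deterministic: bool) -> f32 {
    if has_fma || deterministic {
        madd_fused(a, b, c)
    } else {
        madd_unfused(a, b, c)
    }
}
```

For example, with `a = 1.0 + f32::EPSILON`, `b = 1.0 - f32::EPSILON` and
`c = -1.0`, the unfused product rounds to exactly `1.0` and the sum is `0.0`,
whereas the fused form keeps the small negative remainder.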

* Ignore relaxed-simd tests on s390x and riscv64

* Enable relaxed-simd tests on s390x

* Update cranelift/codegen/meta/src/shared/instructions.rs

Co-authored-by: Andrew Brown <andrew.brown@intel.com>

* Add a FIXME from review

* Add notes about deterministic semantics

* Don't default `has_native_fma` to `true`

* Review comments and rebase fixes

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
2023-03-07 15:52:41 +00:00


;; Tests for i32x4.relaxed_trunc_f32x4_s, i32x4.relaxed_trunc_f32x4_u, i32x4.relaxed_trunc_f64x2_s_zero, and i32x4.relaxed_trunc_f64x2_u_zero.
(module
  (func (export "i32x4.relaxed_trunc_f32x4_s") (param v128) (result v128) (i32x4.relaxed_trunc_f32x4_s (local.get 0)))
  (func (export "i32x4.relaxed_trunc_f32x4_u") (param v128) (result v128) (i32x4.relaxed_trunc_f32x4_u (local.get 0)))
  (func (export "i32x4.relaxed_trunc_f64x2_s_zero") (param v128) (result v128) (i32x4.relaxed_trunc_f64x2_s_zero (local.get 0)))
  (func (export "i32x4.relaxed_trunc_f64x2_u_zero") (param v128) (result v128) (i32x4.relaxed_trunc_f64x2_u_zero (local.get 0)))
  (func (export "i32x4.relaxed_trunc_f32x4_s_cmp") (param v128) (result v128)
    (i32x4.eq
      (i32x4.relaxed_trunc_f32x4_s (local.get 0))
      (i32x4.relaxed_trunc_f32x4_s (local.get 0))))
  (func (export "i32x4.relaxed_trunc_f32x4_u_cmp") (param v128) (result v128)
    (i32x4.eq
      (i32x4.relaxed_trunc_f32x4_u (local.get 0))
      (i32x4.relaxed_trunc_f32x4_u (local.get 0))))
  (func (export "i32x4.relaxed_trunc_f64x2_s_zero_cmp") (param v128) (result v128)
    (i32x4.eq
      (i32x4.relaxed_trunc_f64x2_s_zero (local.get 0))
      (i32x4.relaxed_trunc_f64x2_s_zero (local.get 0))))
  (func (export "i32x4.relaxed_trunc_f64x2_u_zero_cmp") (param v128) (result v128)
    (i32x4.eq
      (i32x4.relaxed_trunc_f64x2_u_zero (local.get 0))
      (i32x4.relaxed_trunc_f64x2_u_zero (local.get 0))))
)
;; Test some edge cases around min/max to ensure that the instruction either
;; saturates correctly or returns INT_MIN.
;;
;; Note, though, that INT_MAX itself is not tested. The value of INT_MAX is
;; 2147483647, which is not representable in an `f32`: it needs 31 bits of
;; precision but an f32 significand has only 24. The closest representable
;; values are 2147483520 and 2147483648, so an exact INT_MAX input cannot be
;; expressed.
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_s"
                       ;; INT32_MIN <INT32_MIN >INT32_MAX
                       (v128.const f32x4 -2147483648.0 -2147483904.0 2.0 2147483904.0))
               ;; out of range -> saturate or INT32_MIN
               (either (v128.const i32x4 -2147483648 -2147483648 2 2147483647)
                       (v128.const i32x4 -2147483648 -2147483648 2 -2147483648)))
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_s"
                       (v128.const f32x4 nan -nan nan:0x444444 -nan:0x444444))
               ;; nans -> 0 or INT32_MIN
               (either (v128.const i32x4 0 0 0 0)
                       (v128.const i32x4 0x80000000 0x80000000 0x80000000 0x80000000)))
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_u"
                       ;; UINT32_MIN UINT32_MIN-1 <UINT32_MAX UINT32_MAX+1
                       (v128.const f32x4 0 -1.0 4294967040.0 4294967296.0))
               ;; out of range -> saturate or UINT32_MAX
               (either (v128.const i32x4 0 0 4294967040 0xffffffff)
                       (v128.const i32x4 0 0xffffffff 4294967040 0xffffffff)))
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_u"
                       (v128.const f32x4 nan -nan nan:0x444444 -nan:0x444444))
               ;; nans -> 0 or UINT32_MAX
               (either (v128.const i32x4 0 0 0 0)
                       (v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffffffff)))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_s_zero"
                       (v128.const f64x2 -2147483904.0 2147483904.0))
               ;; out of range -> saturate or INT32_MIN
               (either (v128.const i32x4 -2147483648 2147483647 0 0)
                       (v128.const i32x4 -2147483648 -2147483648 0 0)))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_s_zero"
                       (v128.const f64x2 nan -nan))
               (either (v128.const i32x4 0 0 0 0)
                       (v128.const i32x4 0x80000000 0x80000000 0 0)))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_u_zero"
                       (v128.const f64x2 -1.0 4294967296.0))
               ;; out of range -> saturate or UINT32_MAX
               (either (v128.const i32x4 0 0xffffffff 0 0)
                       (v128.const i32x4 0xffffffff 0xffffffff 0 0)))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_u_zero"
                       (v128.const f64x2 nan -nan))
               (either (v128.const i32x4 0 0 0 0)
                       (v128.const i32x4 0xffffffff 0xffffffff 0 0)))
;; Check that multiple calls to the relaxed instruction with the same inputs return the same results.
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_s_cmp"
                       ;; INT32_MIN <INT32_MIN INT32_MAX >INT32_MAX
                       (v128.const f32x4 -2147483648.0 -2147483904.0 2147483647.0 2147483904.0))
               ;; out of range -> saturate or INT32_MIN
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_s_cmp"
                       (v128.const f32x4 nan -nan nan:0x444444 -nan:0x444444))
               ;; nans -> 0 or INT32_MIN
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_u_cmp"
                       ;; UINT32_MIN UINT32_MIN-1 <UINT32_MAX UINT32_MAX+1
                       (v128.const f32x4 0 -1.0 4294967040.0 4294967296.0))
               ;; out of range -> saturate or UINT32_MAX
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f32x4_u_cmp"
                       (v128.const f32x4 nan -nan nan:0x444444 -nan:0x444444))
               ;; nans -> 0 or UINT32_MAX
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_s_zero_cmp"
                       (v128.const f64x2 -2147483904.0 2147483904.0))
               ;; out of range -> saturate or INT32_MIN
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_s_zero_cmp"
                       (v128.const f64x2 nan -nan))
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_u_zero_cmp"
                       (v128.const f64x2 -1.0 4294967296.0))
               ;; out of range -> saturate or UINT32_MAX
               (v128.const i32x4 -1 -1 -1 -1))
(assert_return (invoke "i32x4.relaxed_trunc_f64x2_u_zero_cmp"
                       (v128.const f64x2 nan -nan))
               (v128.const i32x4 -1 -1 -1 -1))