Ternary operator for clang's extended vectors

Question

I've tried playing with clang's extended vectors. The ternary operator is supposed to work, but it is not working for me. Example:

int main()
{
  using int4 = int __attribute__((ext_vector_type(4)));

  int4 a{0, 1, 3, 4};
  int4 b{2, 1, 4, 5};

  auto const r(a - b ? a : b);

  return 0;
}

Please provide examples of how I might make it work, the way it works under OpenCL (http://www.informit.com/articles/article.aspx?p=1732873&seqNum=10). I am using clang-3.4.2.

Error:

t.cpp:8:16: error: value of type 'int __attribute__((ext_vector_type(4)))' is not contextually convertible to 'bool'
  auto const r(a - b ? a : b);
               ^~~~~
1 error generated.


Recommended answer

You can loop over the elements directly in Clang. Here is a solution for GCC and Clang.

#include <inttypes.h>
#include <x86intrin.h>

#if defined(__clang__)
typedef float float4 __attribute__ ((ext_vector_type(4)));
typedef   int   int4 __attribute__ ((ext_vector_type(4)));
#else
typedef float float4 __attribute__ ((vector_size (sizeof(float)*4)));
typedef   int   int4 __attribute__ ((vector_size (sizeof(int)*4)));
#endif

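// Per-lane select: c[i] = s[i] ? a[i] : b[i]
float4 select(int4 s, float4 a, float4 b) {
  float4 c;
  #if defined(__GNUC__) && !defined(__INTEL_COMPILER) && !defined(__clang__)
  // GCC (g++) accepts the ternary operator on vector types directly.
  c = s ? a : b;
  #else
  // Other compilers (including Clang): select element by element.
  for(int i=0; i<4; i++) c[i] = s[i] ? a[i] : b[i];
  #endif
  return c;
}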

Both produce:

select(int __vector(4), float __vector(4), float __vector(4)):
  pxor xmm3, xmm3
  pcmpeqd xmm0, xmm3
  blendvps xmm1, xmm2, xmm0
  movaps xmm0, xmm1
  ret




  • Nehalem: https://godbolt.org/g/cVWYym
  • Skylake: https://godbolt.org/g/LhEpnN
  • KNL: https://godbolt.org/g/NFrFKg

But with AVX512 it's better to use masks (e.g. __mmask16).
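For the integer case in the question, a compare-and-mask select is another way to avoid both the vector ternary and the explicit loop. The following is only a sketch and not part of the original answer: `select_int4` is a hypothetical helper name, and it assumes the compiler's vector comparison yields -1 (all bits set) for true lanes and 0 for false lanes, as GCC's and Clang's vector extensions do for integer vectors. It has not been checked against clang-3.4.2 specifically.

```cpp
#include <cassert>

#if defined(__clang__)
typedef int int4 __attribute__((ext_vector_type(4)));
#else
typedef int int4 __attribute__((vector_size(sizeof(int) * 4)));
#endif

// Hypothetical helper (not from the original answer).
// Per-lane equivalent of s[i] ? a[i] : b[i], built from a vector
// comparison and bitwise operations instead of the ternary operator.
inline int4 select_int4(int4 s, int4 a, int4 b) {
  const int4 zero = {0, 0, 0, 0};
  // Assumption: each lane of the comparison result is -1 where
  // s[i] != 0 and 0 where s[i] == 0.
  const int4 mask = (s != zero);
  // Keep a's bits where the mask is set, b's bits where it is clear.
  return (a & mask) | (b & ~mask);
}
```

With the vectors from the question, `select_int4(a - b, a, b)` would stand in for `a - b ? a : b`. For float vectors the same idea needs a reinterpretation of the bits (or the loop from the answer above), since bitwise operators are not defined on floating-point vectors.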
