What is the availability of 'vector long long'?

Question

I'm testing on an old PowerMac G5, which is a Power4 machine. The build is failing:

$ make
...
g++ -DNDEBUG -g2 -O3 -mcpu=power4 -maltivec -c ppc-simd.cpp
ppc-crypto.h:36: error: use of 'long long' in AltiVec types is invalid
make: *** [ppc-simd.o] Error 1

The failure is due to:

typedef __vector unsigned long long uint64x2_p8;

I'm having trouble determining when I should make the typedef available. With -mcpu=power4 -maltivec the machine reports 64-bit availability:

$ gcc -mcpu=power4 -maltivec -dM -E - </dev/null | sort | egrep -i -E 'power|ARCH'
#define _ARCH_PPC 1
#define _ARCH_PPC64 1
#define __POWERPC__ 1

The OpenPOWER | 6.1. Vector Data Types manual has good information on vector data types, but it does not discuss when the vector long long types are available.

What is the availability of __vector unsigned long long?

Recommended answer

TL;DR: it looks like POWER7 is the minimum requirement for 64-bit element size with AltiVec. This is part of VSX (Vector Scalar Extension), which Wikipedia confirms first appeared in POWER7.

It's very likely that gcc knows what it's doing, and enables 64-bit element-size vector intrinsics with the lowest necessary -mcpu= requirement.

#include <altivec.h>

auto vec32(void) {       // compiles with your options: Power4
    return vec_splats((int) 1);
}

// gcc error: use of 'long long' in AltiVec types is invalid without -mvsx
vector long long vec64(void) {
    return vec_splats((long long) 1);
}

(With auto instead of vector long long as the return type, the 2nd function compiles to code that returns the value in two 64-bit integer registers.)

Adding -mvsx lets the 2nd function compile. Using -mcpu=power7 also works, but power6 doesn't.
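
One way to act on this in a header is to guard the typedef on the compiler's VSX feature macro. The following is a minimal sketch, not the project's actual code: it assumes GCC (or a compatible compiler) defines __VSX__ whenever VSX is enabled, whether through -mvsx or through -mcpu=power7 and later. PPC_HAVE_VEC64 and has_vec64 are hypothetical names introduced only for illustration.

```cpp
// Sketch: expose the 64-bit-element typedef only when VSX is enabled.
#if defined(__ALTIVEC__)
#  include <altivec.h>
#endif

// PPC_HAVE_VEC64 is a hypothetical macro name used only in this sketch.
#if defined(__VSX__)
#  define PPC_HAVE_VEC64 1
typedef __vector unsigned long long uint64x2_p8;
#else
#  define PPC_HAVE_VEC64 0
#endif

// Hypothetical helper: lets ordinary code branch on availability
// without repeating the preprocessor test.
constexpr bool has_vec64() { return PPC_HAVE_VEC64 != 0; }
```

Testing __VSX__ rather than a CPU macro such as _ARCH_PWR7 tracks the feature itself, so the guard should also behave sensibly if VSX is switched off explicitly on a newer -mcpu setting.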

Source + asm on Godbolt (PowerPC64 gcc6.3)

# with auto without VSX:
vec64():     # -O3 -mcpu=power4 -maltivec -mregnames
    li %r4,1
    li %r3,1
    blr

vec64():  # -O3 -mcpu=power7 -maltivec -mregnames
.LCF2:
0:  addis 2,12,.TOC.-.LCF2@ha
    addi 2,2,.TOC.-.LCF2@l
    addis %r9,%r2,.LC0@toc@ha
    addi %r9,%r9,.LC0@toc@l       # TOC-relative addressing for the static constant
    lxvd2x %vs34,0,%r9            # VSX load of two 64-bit doublewords
    xxpermdi %vs34,%vs34,%vs34,2  # swap the two doublewords
    blr


 .LC0:    # in .rodata
    .quad   1
    .quad   1






And BTW, vec_splats (splat scalar) with a constant compiles to a single instruction. But with a runtime variable (e.g. a function arg), it compiles to an integer store / vector load / vector-splat (like the vec_splat intrinsic). Apparently there isn't a single instruction for int->vec.

The vec_splat_s32 and related intrinsics only accept a small (5-bit) constant, so they only compile in cases where the compiler can use the corresponding splat-immediate instruction.
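
The difference between the two intrinsics can be sketched as follows. This is only an illustration: fits_splat_immediate, splat_one and splat_runtime are hypothetical names, and the AltiVec part is compiled only when the compiler defines __ALTIVEC__. The range check reflects the 5-bit signed immediate, which spans -16 to 15.

```cpp
#if defined(__ALTIVEC__)
#  include <altivec.h>
#endif

// Hypothetical helper: true if v fits a 5-bit signed immediate (-16..15),
// the range that vec_splat_s32 and its relatives accept.
constexpr bool fits_splat_immediate(int v) { return v >= -16 && v <= 15; }

#if defined(__ALTIVEC__)
// Small literal: the compiler can use a splat-immediate instruction.
inline __vector signed int splat_one() { return vec_splat_s32(1); }

// Runtime value: vec_splats accepts any scalar, at the cost of the
// store / load / splat sequence described above.
inline __vector signed int splat_runtime(int x) { return vec_splats(x); }
#endif
```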

This Intel SSE to PowerPC AltiVec migration guide (https://www.ibm.com/developerworks/community/wikis/home?lang=zh-CN#!/wiki/W51a7ffcf4dfd_4b40_9d82_446ebc23c550/page/Intel%20SSE%20to%20PowerPC%20AltiVec%20migration) looks mostly good, but gets that wrong (it claims that vec_splats splats a signed byte).
