Pairwise addition in NEON
Question
I want to add the values at indices 0 and 1 of an int64x2_t vector in NEON.
I am not able to find any pairwise-add instruction that provides this functionality.
int64x2_t sum_64_2;
// The result I expect is:
//int64_t result = sum_64_2[0] + sum_64_2[1];
Is there any instruction in NEON for this logic?
Recommended answer
You can write it in two ways. This one explicitly uses the NEON VADD.I64 instruction:
int64x1_t f(int64x2_t v)
{
    return vadd_s64(vget_high_s64(v), vget_low_s64(v));
}
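The question asks for a scalar int64_t, while f returns an int64x1_t. A minimal sketch of reading the single lane back out with vget_lane_s64 could look like this. The helper name sum_lanes_scalar and the plain-C fallback for non-NEON targets are illustrative assumptions, not part of the original answer.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: add the high and low halves with vadd_s64,
 * then extract the only lane of the result as a scalar. */
static int64_t sum_lanes_scalar(const int64_t in[2])
{
    int64x2_t v = vld1q_s64(in);  /* load two 64-bit lanes */
    int64x1_t s = vadd_s64(vget_high_s64(v), vget_low_s64(v));
    return vget_lane_s64(s, 0);   /* lane 0 is the only lane */
}
#else
/* Assumed fallback so the sketch also builds where NEON is unavailable. */
static int64_t sum_lanes_scalar(const int64_t in[2])
{
    return in[0] + in[1];
}
#endif
```

Called with an array holding 40 and 2, the helper returns 42 on either path.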
The second version relies on the compiler to choose correctly between the NEON and general-purpose integer instruction sets. GCC 4.9 does the right thing in this case, but other compilers may not.
int64x1_t g(int64x2_t v)
{
    /* Initialize r so that vset_lane_s64 does not read an indeterminate value. */
    int64x1_t r = vdup_n_s64(0);
    r = vset_lane_s64(vgetq_lane_s64(v, 0) + vgetq_lane_s64(v, 1), r, 0);
    return r;
}
When targeting 32-bit ARM, the generated code is efficient. For AArch64, extra instructions are emitted, although the compiler could do better.
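On AArch64 specifically, arm_neon.h also provides the across-vector reduction intrinsic vaddvq_s64, which returns the sum of both lanes as a scalar and may avoid the extra instructions mentioned above. The following is a sketch under that assumption; the helper name sum_lanes_a64 and the scalar fallback for other targets are hypothetical and not taken from the original answer.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical helper: reduce both 64-bit lanes to one scalar.
 * vaddvq_s64 is available on AArch64 only. */
static int64_t sum_lanes_a64(const int64_t in[2])
{
    int64x2_t v = vld1q_s64(in);  /* load two 64-bit lanes */
    return vaddvq_s64(v);         /* in[0] + in[1] */
}
#else
/* Assumed fallback for targets without the AArch64 intrinsic. */
static int64_t sum_lanes_a64(const int64_t in[2])
{
    return in[0] + in[1];
}
#endif
```

Whether this is faster than the vadd_s64 version depends on the compiler and core, so it is worth checking the generated assembly before relying on it.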