SSE Instructions: Byte+Short

Question

I have very long byte arrays that need to be added to a destination array of type short (or int). Is there an SSE instruction for this, or a combination of instructions?

Recommended answer

You need to unpack each vector of 8-bit values into two vectors of 16-bit values and then add those.

__m128i v = _mm_set_epi8(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
__m128i vl = _mm_unpacklo_epi8(v, _mm_set1_epi8(0)); // vl = { 7, 6, 5, 4, 3, 2, 1, 0 }
__m128i vh = _mm_unpackhi_epi8(v, _mm_set1_epi8(0)); // vh = { 15, 14, 13, 12, 11, 10, 9, 8 }

where v is a vector of 16 x 8-bit values and vl, vh are the two unpacked vectors of 8 x 16-bit values.
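
To connect the unpacking step back to the original question, here is a minimal sketch of adding a byte array into a 16-bit destination array. The function name `add_u8_to_u16` and its parameters are hypothetical, not part of the original answer. It assumes the bytes are unsigned, uses unaligned loads and stores so it makes no alignment assumption, and handles any leftover elements with a scalar loop. Note that `_mm_add_epi16` wraps around modulo 2^16 on overflow.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst[i] += src[i] for n elements, bytes treated as unsigned. */
void add_u8_to_u16(uint16_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    const __m128i zero = _mm_setzero_si128();
    size_t i = 0;

    for (; i + 16 <= n; i += 16)
    {
        /* Load 16 source bytes and widen them to two vectors of 8 x 16 bits. */
        __m128i v  = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i vl = _mm_unpacklo_epi8(v, zero);   /* src[i+0 .. i+7]  */
        __m128i vh = _mm_unpackhi_epi8(v, zero);   /* src[i+8 .. i+15] */

        /* Load the matching 16 destination values, add, and store back. */
        __m128i dl = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i dh = _mm_loadu_si128((const __m128i *)(dst + i + 8));
        _mm_storeu_si128((__m128i *)(dst + i),     _mm_add_epi16(dl, vl));
        _mm_storeu_si128((__m128i *)(dst + i + 8), _mm_add_epi16(dh, vh));
    }

    /* Scalar tail for the remaining n % 16 elements. */
    for (; i < n; ++i)
        dst[i] = (uint16_t)(dst[i] + src[i]);
}
```

If both arrays are known to be 16-byte aligned, the unaligned loads and stores could be swapped for `_mm_load_si128` and `_mm_store_si128`.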

Note that I'm assuming the 8-bit values are unsigned, so when unpacking to 16 bits the high byte is set to 0 (i.e. no sign extension).
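
If the bytes were signed instead, interleaving with zero would give wrong results for negative values. One possible SSE2-only workaround, shown here as a sketch rather than as part of the original answer, is to interleave with a per-byte sign mask produced by `_mm_cmpgt_epi8`. The name `add_s8_to_s16` is hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst[i] += src[i] for n elements, bytes treated as signed. */
void add_s8_to_s16(int16_t *dst, const int8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;

    for (; i + 16 <= n; i += 16)
    {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));

        /* 0xFF in every byte lane where the source byte is negative, else 0x00. */
        __m128i sign = _mm_cmpgt_epi8(_mm_setzero_si128(), v);

        /* Interleaving with the mask fills the high byte with the sign bits. */
        __m128i vl = _mm_unpacklo_epi8(v, sign);
        __m128i vh = _mm_unpackhi_epi8(v, sign);

        __m128i dl = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i dh = _mm_loadu_si128((const __m128i *)(dst + i + 8));
        _mm_storeu_si128((__m128i *)(dst + i),     _mm_add_epi16(dl, vl));
        _mm_storeu_si128((__m128i *)(dst + i + 8), _mm_add_epi16(dh, vh));
    }

    /* Scalar tail for the remaining n % 16 elements. */
    for (; i < n; ++i)
        dst[i] = (int16_t)(dst[i] + src[i]);
}
```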

If you want to sum a lot of these vectors and get a 32-bit result, a useful trick is to use _mm_madd_epi16 with a multiplier of 1, which adds each pair of adjacent 16-bit values into a 32-bit lane, e.g.

__m128i vsuml = _mm_set1_epi32(0);
__m128i vsumh = _mm_set1_epi32(0);
__m128i vsum;
int sum;

for (int i = 0; i < N; i += 16)
{
    __m128i v = _mm_load_si128((const __m128i *)&x[i]); // x: 16-byte aligned byte array, N a multiple of 16
    __m128i vl = _mm_unpacklo_epi8(v, _mm_set1_epi8(0));
    __m128i vh = _mm_unpackhi_epi8(v, _mm_set1_epi8(0));
    vsuml = _mm_add_epi32(vsuml, _mm_madd_epi16(vl, _mm_set1_epi16(1)));
    vsumh = _mm_add_epi32(vsumh, _mm_madd_epi16(vh, _mm_set1_epi16(1)));
}
// do horizontal sum of 4 partial sums and store in scalar int
vsum = _mm_add_epi32(vsuml, vsumh);
vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 8));
vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 4));
sum = _mm_cvtsi128_si32(vsum);
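
The loop above assumes an aligned array whose length is a multiple of 16. As a sketch of how the same technique might be packaged for arbitrary input, the function below uses unaligned loads and a scalar tail. The name `sum_u8` is hypothetical, the bytes are assumed to be unsigned, and the total is assumed to fit in a signed 32-bit integer.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the sum of n unsigned bytes as a 32-bit value. */
int32_t sum_u8(const uint8_t *x, size_t n)
{
    assert(x != NULL);
    const __m128i zero = _mm_setzero_si128();
    const __m128i ones = _mm_set1_epi16(1);
    __m128i vsum = _mm_setzero_si128();   /* 4 x 32-bit partial sums */
    size_t i = 0;

    for (; i + 16 <= n; i += 16)
    {
        __m128i v  = _mm_loadu_si128((const __m128i *)(x + i));
        __m128i vl = _mm_unpacklo_epi8(v, zero);
        __m128i vh = _mm_unpackhi_epi8(v, zero);

        /* madd by 1: each 32-bit lane becomes the sum of two adjacent 16-bit lanes. */
        vsum = _mm_add_epi32(vsum, _mm_madd_epi16(vl, ones));
        vsum = _mm_add_epi32(vsum, _mm_madd_epi16(vh, ones));
    }

    /* Horizontal sum of the 4 partial sums. */
    vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 8));
    vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 4));
    int32_t sum = _mm_cvtsi128_si32(vsum);

    /* Scalar tail for the remaining n % 16 bytes. */
    for (; i < n; ++i)
        sum += x[i];

    return sum;
}
```

This version keeps a single accumulator instead of separate low and high ones. That gives the same result, although two accumulators, as in the loop above, may allow more instruction-level parallelism.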
