"扩展和QUOT;上证寄存器的数据类型的大小 [英] "Extend" data type size in SSE register
Question
I'm using VS2005 (at work) and need an SSE intrinsic that does the following:
I have a pre-existing __m128i n filled with 16-bit integers a_1, a_2, ..., a_8.
Since some calculations that I now want to do require 32 instead of 16 bits, I want to extract the two sets of four 16-bit integers from n and put them into two separate __m128i values, which contain a_1, ..., a_4 and a_5, ..., a_8 respectively.
I could do this manually using the various _mm_set intrinsics, but those would result in eight movs in assembly, and I'd hoped that there would be a faster way to do this.
Recommended answer
Assuming that I understand correctly what it is that you want to achieve (unpack 8 x 16 bits in one vector into two vectors of 4 x 32-bit ints), I typically do it like this in SSE2 and later:
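#include <emmintrin.h> // SSE2 intrinsics
__m128i v = _mm_set_epi16(7, 6, 5, 4, 3, 2, 1, 0); // v = { 7, 6, 5, 4, 3, 2, 1, 0 }
// Interleave v with itself, then arithmetic-shift each 32-bit lane right by 16 to sign-extend
__m128i v_lo = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16); // v_lo = { 3, 2, 1, 0 }
__m128i v_hi = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16); // v_hi = { 7, 6, 5, 4 }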
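To make the snippet above easier to try out, here is a minimal sketch that wraps the same unpack-and-shift idea in small functions. The function names (widen_s16, widen_u16 and the two array wrappers) are hypothetical and not part of any library. The zero-extending variant is an addition not covered in the answer; it may be useful if the 16-bit values are unsigned. The sketch assumes an x86 target where short is 16 bits and int is 32 bits.

```c
#include <emmintrin.h> /* SSE2 intrinsics */

/* Hypothetical helper: sign-extend eight 16-bit lanes into two vectors
   of four 32-bit lanes each (same technique as the answer above). */
static void widen_s16(__m128i v, __m128i *lo, __m128i *hi)
{
    /* Interleaving v with itself places each 16-bit element in both halves
       of a 32-bit lane; the arithmetic right shift by 16 keeps the upper
       copy and fills the top bits with its sign. */
    *lo = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16);
    *hi = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16);
}

/* Hypothetical helper: zero-extend instead, for unsigned 16-bit data. */
static void widen_u16(__m128i v, __m128i *lo, __m128i *hi)
{
    const __m128i zero = _mm_setzero_si128();
    /* Interleaving with zero leaves the upper 16 bits of each lane clear. */
    *lo = _mm_unpacklo_epi16(v, zero);
    *hi = _mm_unpackhi_epi16(v, zero);
}

/* Array wrappers (assumption: short is 16 bits and int is 32 bits). */
void widen_s16_array(const short in[8], int lo[4], int hi[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i vlo, vhi;
    widen_s16(v, &vlo, &vhi);
    _mm_storeu_si128((__m128i *)lo, vlo);
    _mm_storeu_si128((__m128i *)hi, vhi);
}

void widen_u16_array(const unsigned short in[8], unsigned int lo[4], unsigned int hi[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i vlo, vhi;
    widen_u16(v, &vlo, &vhi);
    _mm_storeu_si128((__m128i *)lo, vlo);
    _mm_storeu_si128((__m128i *)hi, vhi);
}
```

Each widening costs two instructions per output vector in the signed case (unpack plus shift) and one in the unsigned case, compared with the element-by-element moves that the _mm_set approach would generate.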