How to perform uint32/float conversion with SSE?
Question
In SSE there is a function _mm_cvtepi32_ps(__m128i input) which takes an input vector of 32-bit signed integers (int32_t) and converts them to floats.
Now I want to interpret the input integers as unsigned. But there is no function _mm_cvtepu32_ps, and I could not find an implementation of one. Do you know where I can find such a function, or can you at least give a hint on the implementation?
To illustrate the difference in results:
unsigned int a = 2480160505; // 10010011 11010100 00111110 11111001
float a1 = a;                // 01001111 00010011 11010100 00111111
float a2 = (signed int)a;    // 11001110 11011000 01010111 10000010
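The bit patterns in those comments can be checked with a small helper. This is only a sketch: the names float_bits, bits_as_unsigned and bits_as_signed are made up for illustration, and it assumes a 32-bit IEEE-754 float, round-to-nearest, and a two's-complement target.

```cpp
#include <cstdint>
#include <cstring>

// Return the raw bit pattern of a float (assumes a 32-bit IEEE-754 float).
static inline std::uint32_t float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to inspect the bits
    return bits;
}

// Bits of the float obtained when `a` is treated as an unsigned value.
static inline std::uint32_t bits_as_unsigned(std::uint32_t a)
{
    return float_bits(static_cast<float>(a));
}

// Bits of the float obtained when `a` is first reinterpreted as signed.
// Assumption: the cast wraps modulo 2^32 (two's complement), which mirrors
// what _mm_cvtepi32_ps sees in each lane.
static inline std::uint32_t bits_as_signed(std::uint32_t a)
{
    return float_bits(static_cast<float>(static_cast<std::int32_t>(a)));
}
```

For a = 2480160505, bits_as_unsigned returns 0x4F13D43F and bits_as_signed returns 0xCED85782, which are the two patterns listed in the comments above.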
Recommended answer
This functionality exists in AVX-512, but if you can't wait until then, the only thing I can suggest is to split the unsigned int input values into pairs of smaller values, convert these, and then add them together again, e.g.
#include <emmintrin.h> // SSE2 intrinsics

inline __m128 _mm_cvtepu32_ps(const __m128i v)
{
    __m128i v2 = _mm_srli_epi32(v, 1);  // v2 = v / 2 (logical shift)
    __m128i v1 = _mm_sub_epi32(v, v2);  // v1 = v - (v / 2)
    __m128 v2f = _mm_cvtepi32_ps(v2);   // convert each half as signed int32
    __m128 v1f = _mm_cvtepi32_ps(v1);
    return _mm_add_ps(v2f, v1f);        // recombine: v2 + v1
}
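A scalar model of one lane may make the split easier to follow. It is only an illustration: the helper names split_small, split_large and fits_int32 are hypothetical and not part of any library. The two halves always sum back to v, and the larger half is ceil(v / 2), so it stays within the signed int32 range for every input except 0xFFFFFFFF, where it becomes 0x80000000.

```cpp
#include <cstdint>

// Scalar model of one lane: v2 = v / 2 (the smaller half).
static inline std::uint32_t split_small(std::uint32_t v) { return v >> 1; }

// Scalar model of one lane: v1 = v - v / 2 (the larger half, i.e. ceil(v / 2)).
static inline std::uint32_t split_large(std::uint32_t v) { return v - (v >> 1); }

// True if a half can go through a signed int32 -> float conversion
// without being read as a negative number.
static inline bool fits_int32(std::uint32_t half) { return half <= 0x7FFFFFFFu; }
```

For example, v = 7 splits into 3 and 4, and fits_int32(split_large(v)) is false only for v = 0xFFFFFFFF.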
Update
As noted by @wim in his answer, the above solution fails for an input value of UINT_MAX. Here is a more robust, but slightly less efficient solution, which should work for the full uint32_t input range:
#include <emmintrin.h> // SSE2 intrinsics

inline __m128 _mm_cvtepu32_ps(const __m128i v)
{
    __m128i v2 = _mm_srli_epi32(v, 1);                 // v2 = v / 2 (logical shift, always fits in int32)
    __m128i v1 = _mm_and_si128(v, _mm_set1_epi32(1));  // v1 = v & 1
    __m128 v2f = _mm_cvtepi32_ps(v2);
    __m128 v1f = _mm_cvtepi32_ps(v1);
    return _mm_add_ps(_mm_add_ps(v2f, v2f), v1f);      // return 2 * v2 + v1
}
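To spot-check the robust version on a few edge values, something like the following could be used. It is a sketch under assumptions: cvtepu32_ps_sketch repeats the body above under a hypothetical name (newer compilers already declare an AVX-512VL intrinsic called _mm_cvtepu32_ps in <immintrin.h>, so reusing that name can clash), and convert_one is a made-up helper that runs a single value through all four lanes.

```cpp
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Same logic as the robust version above, under a hypothetical name.
static inline __m128 cvtepu32_ps_sketch(__m128i v)
{
    __m128i v2 = _mm_srli_epi32(v, 1);                 // v2 = v / 2
    __m128i v1 = _mm_and_si128(v, _mm_set1_epi32(1));  // v1 = v & 1
    __m128 v2f = _mm_cvtepi32_ps(v2);
    __m128 v1f = _mm_cvtepi32_ps(v1);
    return _mm_add_ps(_mm_add_ps(v2f, v2f), v1f);      // 2 * v2 + v1
}

// Hypothetical helper: broadcast x to all four lanes, convert, return lane 0.
static inline float convert_one(std::uint32_t x)
{
    const std::uint32_t in[4] = {x, x, x, x};
    float out[4];
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    _mm_storeu_ps(out, cvtepu32_ps_sketch(v));
    return out[0];
}
```

With this, convert_one(0xFFFFFFFF) gives 4294967296.0f and convert_one(0x80000000) gives 2147483648.0f. These are spot checks only. Because the conversion rounds twice (once when converting v2 and once in the final add), they do not show that every input is rounded exactly as a scalar cast would round it.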