How can I convert an XMM register of single-precision floats to integers?
Question
I have a bunch of packed floats inside an XMM register (using SSE intrinsics):
__m128 xmm = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
I'd like to convert all of these to integers in one go. I found an intrinsic that does what I want (_mm_cvtps_pi16()), but it yields four 16-bit short values instead of full-blown int. An intrinsic called _mm_cvtps_pi32() yields int, but only for the two lower values in xmm. I can use it, extract the values, move things around and use it again, but is there a simpler way? Why wouldn't there be a straightforward 32-bit packed float -> 32-bit integer instruction? Surely both fit in the same space of an XMM register?
Okay, I see now that _mm_cvtps_pi32() returns __m64 instead of __m128, which means it operates on an MMX-style MM... register. That would explain why it returns just two ints, but now I'm wondering:
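- Will I run into trouble when compiling for x64? Supposedly __m64 is not supported there...
- Why didn't they extend this instruction when SSE was introduced?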
Thanks!
Recommended answer
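According to this documentation, __m128i _mm_cvtps_epi32(__m128 a) generates a cvtps2dq instruction, which does what you want: it is an SSE2 intrinsic that converts all four packed floats to 32-bit integers and returns them in an XMM register (__m128i), so no __m64/MMX register is involved.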
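As a rough illustration, the conversion could be wrapped as in the sketch below. The helper names floats_to_ints_rounded and floats_to_ints_truncated are hypothetical and exist only for this example. The sketch assumes an x86 target with SSE2 enabled and the default MXCSR rounding mode (round to nearest, ties to even). _mm_cvtps_epi32 follows the current rounding mode, while its sibling _mm_cvttps_epi32 (cvttps2dq) always truncates toward zero, as a C cast does.

```c
#include <emmintrin.h>  /* SSE2: _mm_cvtps_epi32, _mm_cvttps_epi32, _mm_storeu_si128 */
#include <stdint.h>

/* Hypothetical helper: convert four floats to four 32-bit ints using the
 * current MXCSR rounding mode (round-to-nearest-even unless changed). */
static void floats_to_ints_rounded(const float in[4], int32_t out[4])
{
    __m128  v = _mm_loadu_ps(in);        /* load 4 packed floats (unaligned) */
    __m128i i = _mm_cvtps_epi32(v);      /* cvtps2dq: 4 floats -> 4 int32 */
    _mm_storeu_si128((__m128i *)out, i); /* store 4 packed int32 (unaligned) */
}

/* Hypothetical helper: same conversion, but truncating toward zero. */
static void floats_to_ints_truncated(const float in[4], int32_t out[4])
{
    __m128  v = _mm_loadu_ps(in);
    __m128i i = _mm_cvttps_epi32(v);     /* cvttps2dq: truncating variant */
    _mm_storeu_si128((__m128i *)out, i);
}
```

With the input {1.5f, 2.5f, -1.5f, 3.7f}, the rounded variant should give {2, 2, -2, 4} under the default rounding mode and the truncating variant {1, 2, -1, 3}. Both work entirely on XMM registers, so they should compile for x64 as well.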