How can I convert an XMM register of single-precision floats to integers?


Question

I have a bunch of packed floats inside an XMM register (using SSE intrinsics):

__m128 xmm = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);

I'd like to convert all of these to integers in one go. I found an intrinsic that does roughly what I want (_mm_cvtps_pi16()), but it yields four 16-bit shorts instead of full-blown ints. An intrinsic called _mm_cvtps_pi32() yields ints, but only for the two lower values in xmm. I can use it, extract the values, move things around and use it again, but is there a simpler way? Why wouldn't there be a straightforward 32-bit packed float -> 32-bit integer instruction? Surely both fit in the same space of an XMM register?

Okay, I see now that _mm_cvtps_pi32() returns __m64 instead of __m128i, which means it operates on an MMX-style MM... register. That would explain why it returns just two ints, but now I'm wondering:

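  • Will I run into trouble when compiling for x64? Supposedly __m64 isn't supported there...
  • Why didn't they extend this instruction when SSE was introduced?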

Thanks!

Recommended answer

According to the documentation, __m128i _mm_cvtps_epi32(__m128 a) generates a cvtps2dq instruction, which does what you want: it converts all four packed floats to four 32-bit integers in a single XMM register.
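
As a minimal sketch of how this could look in practice, the snippet below wraps _mm_cvtps_epi32() and its truncating sibling _mm_cvttps_epi32() (cvttps2dq) in two small helpers. The helper names floats_to_ints_rounded and floats_to_ints_truncated are made up for illustration and are not part of any library. Both intrinsics are SSE2 and live in <emmintrin.h>; they work purely on XMM registers, so the __m64 concern for x64 builds should not apply to them. The sketch assumes the default MXCSR rounding mode (round to nearest, ties to even).

```c
#include <emmintrin.h>  /* SSE2: _mm_cvtps_epi32, _mm_cvttps_epi32 */

/* Hypothetical helper: convert four floats to four 32-bit ints using
   cvtps2dq, which rounds according to the current MXCSR rounding mode
   (round to nearest, ties to even, unless the mode has been changed). */
void floats_to_ints_rounded(const float in[4], int out[4])
{
    __m128 f = _mm_loadu_ps(in);           /* unaligned load of 4 floats */
    __m128i i = _mm_cvtps_epi32(f);        /* 4 x float -> 4 x int32 */
    _mm_storeu_si128((__m128i *)out, i);   /* unaligned store of 4 ints */
}

/* Hypothetical helper: same conversion with cvttps2dq, which always
   truncates toward zero, like a C cast from float to int. */
void floats_to_ints_truncated(const float in[4], int out[4])
{
    __m128 f = _mm_loadu_ps(in);
    __m128i i = _mm_cvttps_epi32(f);       /* truncating variant */
    _mm_storeu_si128((__m128i *)out, i);
}
```

With the input {1.0f, 2.5f, 3.5f, -4.7f}, the rounding version should give 1, 2, 4, -5 under the default rounding mode, while the truncating version gives 1, 2, 3, -4. Which one you want depends on whether you need C-cast semantics or nearest-integer rounding.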
