Most efficient way to convert vector of uint32 to vector of float?


Problem description

x86 does not have an SSE instruction to convert from unsigned int32 to floating point. What would be the most efficient instruction sequence for achieving this?

EDIT: To clarify, I want the vectorized equivalent of the following scalar operation:

unsigned int x = ...
float res = (float)x;

EDIT2: Here is a naive algorithm for doing a scalar conversion.

unsigned int x = ...
float bias = 0.f;
if (x > 0x7fffffff) {
    bias = (float)0x80000000;
    x -= 0x80000000;
}
float res = (float)(int)x + bias;  // signed_convert: signed int32 -> float, then add the bias

Solution

Your naive scalar algorithm doesn't deliver a correctly-rounded conversion -- it will suffer from double rounding on certain inputs. As an example: if x is 0x88000081, then the correctly-rounded result of conversion to float is 2281701632.0f, but your scalar algorithm will return 2281701376.0f instead.
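To make the double rounding concrete, here is a minimal C sketch of both approaches. The function names `naive_u32_to_float` and `split_u32_to_float` are invented for illustration, the question's signed conversion step is assumed to be a plain cast through `int32_t`, and round-to-nearest-even (the default mode) is assumed. The `volatile` is only there to force the intermediate to be rounded to `float` on compilers that evaluate in wider precision.

```c
#include <stdint.h>

/* The question's scheme: convert the low 31 bits, then add 2^31 if needed. */
float naive_u32_to_float(uint32_t x) {
    float bias = 0.0f;
    if (x > 0x7fffffffu) {
        bias = 2147483648.0f;                /* 2^31 */
        x -= 0x80000000u;
    }
    volatile float low = (float)(int32_t)x;  /* first rounding */
    return low + bias;                       /* possible second rounding */
}

/* Split into 16-bit halves: both conversions are exact, one rounding total. */
float split_u32_to_float(uint32_t x) {
    float hi = (float)(int32_t)(x >> 16) * 65536.0f;  /* exact: 16 bits * 2^16 */
    float lo = (float)(int32_t)(x & 0xffffu);         /* exact: 16 bits */
    return hi + lo;                                   /* the only rounding */
}
```

Under those assumptions, calling both functions with `0x88000081u` should reproduce the two values quoted above.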

Off the top of my head, you can do a correct conversion as follows (it's likely possible to save an instruction somewhere):

movdqa   xmm1,  xmm0    // make a copy of x
psrld    xmm0,  16      // high 16 bits of x
pand     xmm1, [mask]   // low 16 bits of x
orps     xmm0, [onep39] // float(2^39 + 2^16 * (high 16 bits of x))
cvtdq2ps xmm1, xmm1     // float(low 16 bits of x)
subps    xmm0, [onep39] // float(2^16 * (high 16 bits of x)), exact
addps    xmm0,  xmm1    // float(x)

where the constants have the following values:

mask:   0000ffff 0000ffff 0000ffff 0000ffff
onep39: 53000000 53000000 53000000 53000000

What this does is separately convert the high- and low-halves of each lane to floating-point, then add these converted values together. Because each half is only 16 bits wide, the conversion to float does not incur any rounding. Rounding only occurs when the two halves are added; because addition is a correctly-rounded operation, the entire conversion is correctly rounded.
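For readers who prefer intrinsics, the same sequence might look roughly like the sketch below. The function name `u32x4_to_f32x4` and the use of unaligned loads and stores are assumptions made for illustration, and the integer `por` (`_mm_or_si128`) stands in for `orps`, since both produce the same bits. It assumes an x86 target with SSE2 and the default rounding mode.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Convert four uint32 values to four floats, correctly rounded. */
void u32x4_to_f32x4(const uint32_t *src, float *dst) {
    const __m128i mask   = _mm_set1_epi32(0x0000ffff);
    const __m128i onep39 = _mm_set1_epi32(0x53000000);  /* bit pattern of 2^39f */

    __m128i x  = _mm_loadu_si128((const __m128i *)src);
    __m128i hi = _mm_srli_epi32(x, 16);                 /* high 16 bits of x */
    __m128i lo = _mm_and_si128(x, mask);                /* low 16 bits of x */

    /* OR the high half into the mantissa of 2^39: value is 2^39 + hi * 2^16. */
    __m128 fhi = _mm_castsi128_ps(_mm_or_si128(hi, onep39));
    __m128 flo = _mm_cvtepi32_ps(lo);                   /* exact, fits in 16 bits */

    fhi = _mm_sub_ps(fhi, _mm_castsi128_ps(onep39));    /* hi * 2^16, exact */
    _mm_storeu_ps(dst, _mm_add_ps(fhi, flo));           /* the only rounding */
}
```

The OR trick works because a float at 2^39 has a unit in the last place of 2^16, so the 16 high bits dropped into the bottom of the mantissa are automatically scaled by 2^16.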

By contrast, your naive implementation first converts the low 31 bits to float, which incurs a rounding, then conditionally adds 2^31 to that result, which may cause a second rounding. Any time you have two separate rounding points in a conversion, unless you are exceedingly careful about how they occur, you should not expect the result to be correctly rounded.
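As a worked illustration of the two rounding points, take x = 0x88000081 and write fl(.) for rounding to the nearest float with ties to even (the notation is introduced here, not taken from the original answer). Floats are spaced 16 apart in [2^27, 2^28) and 256 apart in [2^31, 2^32):

```latex
\begin{align*}
x &= \mathtt{0x88000081} = 2^{31} + 134217857 = 2281701376 + 129 \\[4pt]
\text{naive, step 1:}\quad
  \mathrm{fl}(134217857) &= 134217856
  && \text{(spacing 16, the trailing 1 is lost)} \\
\text{naive, step 2:}\quad
  \mathrm{fl}(2^{31} + 134217856) &= \mathrm{fl}(2281701376 + 128) = 2281701376
  && \text{(exact tie, resolved to even)} \\[4pt]
\text{single rounding:}\quad
  \mathrm{fl}(2281701376 + 129) &= 2281701632
  && \text{(129 > 128, so round up)}
\end{align*}
```

The first rounding discards exactly the bit that would have broken the tie in the second.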
