Forms of constants for high performance addition and multiplication for double


Problem description

I need to efficiently add a constant to, or multiply a constant into, a result of type double inside a loop, in order to prevent underflow. With an int, for example, multiplying by a power of 2 is fast because the compiler can use a bit shift. Is there a similar form of constant that makes double addition and multiplication efficient?

Edit: It seems that not many people understand my question; apologies for my sloppiness. I will add some code. If a is an int, this (multiplying by a power of 2) will be more efficient

int a = 1;
for(...)
    for(...)
        a *= somefunction() * 1024;

than when 1024 is replaced with, say, 1023. I am not sure what is best if we want to add to an int, but that is not what I am interested in. I am interested in the case where a is a double. What forms of constants (e.g. powers of 2) can we efficiently add to and multiply into a double? The constant is arbitrary; it just needs to be large enough to prevent underflow.

This is probably not restricted to C and C++, but I do not know of a more appropriate tag.

Recommended answer

On most modern processors, simply multiplying by a power of two (e.g., x *= 0x1p10; to multiply by 2^10 or x *= 0x1p-10; to divide by 2^10) will be fast and error-free (unless the result is large enough to overflow or small enough to underflow).

There are some processors with "early outs" for some floating-point operations. That is, they complete the instruction more quickly when certain bits are zero or meet other criteria. However, floating-point addition, subtraction, and multiplication commonly execute in about four CPU cycles, so they are fairly fast even without early outs. Additionally, most modern processors execute several instructions at a time, so other work proceeds while a multiplication is in progress, and they are pipelined, so commonly one multiplication can start (and one can finish) in each CPU cycle, sometimes more.

Multiplying by a power of two has no rounding error because the significand (the fraction portion of the value) does not change, so the new significand is exactly representable. (The exception is that, when multiplying by a value less than 1, bits of the significand can be pushed below the limit of the floating-point type, causing underflow. For the common IEEE 754 double format, this does not occur until the result is less than 0x1p-1022.)

Do not use division for scaling (or for reversing the effects of prior scaling). Instead, multiply by the inverse. (To remove a previous scaling of 0x1p57, multiply by 0x1p-57.) This is because division instructions are slow on most modern processors; 30 cycles, for example, is not unusual.
