Forms of constants for high-performance addition and multiplication for double
Problem description
I need to efficiently add or multiply some constants to a result of type double in a loop to prevent underflow. For example, with an int, multiplying by a power of 2 is fast because the compiler will use a bit shift. Is there a form of constant for efficient double addition and multiplication?
Edit: It seems that not many understand my question; apologies for my sloppiness. I will add some code.
If a is an int, this (multiplying by a power of 2) will be more efficient
int a = 1;
for(...)
for(...)
a *= somefunction() * 1024;
than when 1024 is replaced with, say, 1023. I am not sure what is best if we want to add to an int, but that is not what I am interested in. I am interested in the case where a is a double. What forms of constants (e.g., powers of 2) can we efficiently add to and multiply with a double? The constant is arbitrary; it just needs to be large enough to prevent underflow.
This is probably not restricted to C and C++, but I do not know of a more appropriate tag.
Recommended answer
On most modern processors, simply multiplying by a power of two (e.g., x *= 0x1p10; to multiply by 2^10 or x *= 0x1p-10; to divide by 2^10) will be fast and error-free (unless the result is large enough to overflow or small enough to underflow).
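As a minimal sketch of the scaling described above, the C code below multiplies a running product by a hexadecimal floating-point constant on every iteration. The function names scaled_product, unscaled_product, and small_factor are hypothetical; small_factor merely stands in for the question's somefunction() and is assumed to return 2^-12 so the arithmetic is easy to follow. The sketch also assumes double is the IEEE 754 binary64 format.

```c
/* Hypothetical stand-in for the question's somefunction():
   assumed to return a small factor, here exactly 2^-12. */
static double small_factor(void) { return 0x1p-12; }

/* Running product with a power-of-two scale applied on each step.
   0x1p10 is the C99 hexadecimal floating-point literal for 2^10 = 1024.0. */
double scaled_product(int n) {
    double a = 1.0;
    for (int i = 0; i < n; ++i)
        a *= small_factor() * 0x1p10;   /* each step multiplies by 2^-2 */
    return a;
}

/* Same loop without the scale factor, for comparison. */
double unscaled_product(int n) {
    double a = 1.0;
    for (int i = 0; i < n; ++i)
        a *= small_factor();            /* each step multiplies by 2^-12 */
    return a;
}
```

With n = 200, the scaled version returns exactly 2^-400, while the unscaled version would need 2^-2400, which is far below the range of double and underflows to zero. Because every factor is a power of two, the scaling can later be removed exactly by multiplying by the inverse power.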
There are some processors with "early outs" for some floating-point operations. That is, they complete the instruction more quickly when certain bits are zero or meet other criteria. However, floating-point addition, subtraction, and multiplication commonly execute in about four CPU cycles, so they are fairly fast even without early outs. Additionally, most modern processors execute several instructions at a time, so other work proceeds while a multiplication is occurring, and they are pipelined, so, commonly, one multiplication can be started (and one finished) in each CPU cycle. (Sometimes more.)
Multiplying by powers of two has no rounding error because the significand (fraction portion of the value) does not change, so the new significand is exactly representable. (The exception: when multiplying by a value less than 1, bits of the significand can be pushed below the limit of the floating-point type, causing underflow. For the common IEEE 754 double format, this does not occur until the result is less than 0x1p-1022.)
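The claim about the significand can be checked directly. The sketch below assumes double is IEEE 754 binary64 (1 sign bit, 11 exponent bits, 52 stored fraction bits) and that subnormal results are not flushed to zero; the helper names fraction_bits and round_trips_exactly are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return the 52 stored fraction bits of x.
   Assumption: double is IEEE 754 binary64. */
uint64_t fraction_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);           /* well-defined way to read the bits */
    return bits & ((UINT64_C(1) << 52) - 1);  /* keep the low 52 bits */
}

/* True if scaling x by `scale` and then by `inverse` gives back x exactly. */
bool round_trips_exactly(double x, double scale, double inverse) {
    return (x * scale) * inverse == x;
}
```

For a normal value such as 0.1, fraction_bits(0.1) equals fraction_bits(0.1 * 0x1p10), and scaling by 0x1p57 followed by 0x1p-57 returns 0.1 exactly. The exception described above shows up with x = 0x1.0000000000001p-1022: multiplying by 0x1p-1 pushes the result below 0x1p-1022, the lowest significand bit is rounded away, and multiplying by 0x1p1 afterwards does not restore x.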
Do not use division for scaling (or for reversing the effects of prior scaling). Instead, multiply by the inverse. (To remove a previous scaling of 0x1p57, multiply by 0x1p-57.) This is because division instructions are slow on most modern processors; a latency of around 30 cycles is not unusual.
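A small sketch of the two ways to undo a 0x1p57 scaling follows; the function names are hypothetical. For a power-of-two constant the two forms give bit-identical results whenever no overflow or underflow occurs, so the multiplication can replace the division without changing the answer.

```c
/* Undo a previous scaling by 2^57 using a multiplication (preferred). */
double unscale_by_multiply(double x) {
    return x * 0x1p-57;
}

/* Undo the same scaling using a division (typically slower). */
double unscale_by_divide(double x) {
    return x / 0x1p57;
}
```

For example, unscale_by_multiply(0.1 * 0x1p57) returns exactly 0.1, the same value as unscale_by_divide(0.1 * 0x1p57). An optimizing compiler may already perform this substitution for power-of-two constants, but writing the multiplication explicitly does not rely on that. For constants that are not powers of two, multiplying by a rounded reciprocal is not guaranteed to match the division bit for bit.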