GCC/ARM 上的快速除法 [英] Fast Division on GCC/ARM

查看:22
本文介绍了GCC/ARM 上的快速除法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

据我所知,大多数编译器会通过乘法然后向右移位来进行快速除法.例如,如果您检查 this SO thread 它说当您询问Microsoft 编译器除以 10 时,会将被除数乘以 0x1999999A(即 2^32/10),然后将结果除以 2^32(使用向右移动 32 次).

As far as I know most compilers will do fast division by multiplying and then bit shifting to the right. For instance, if you check this SO thread it says that when you ask the Microsoft compiler to do division by 10 it will multiply the dividend by 0x1999999A (which is 2^32/10) and then divide the result by 2^32 (using 32 shifts to the right).

(编者注:链接的答案直到错误为止.编译器不会这样做,因为它对所有输入都不准确.编译器确实会进行乘法和移位,但使用更复杂的方法来确定魔术常数和移位计数:为什么 GCC 使用乘法实现整数除法时的奇怪数字?)

(Editor's note: that linked answer was wrong until. Compilers don't do that because it's not exact for all inputs. Compilers do multiply and shift, but with a more involved way of determining the magic constant and shift counts: Why does GCC use multiplication by a strange number in implementing integer division?)

到目前为止一切顺利.

不过,一旦我使用 GCC 在 ARM 上测试了相同的除法 10,编译器做了一些稍微不同的事情.首先它将被除数乘以 0x66666667 (2^34/10),然后将结果除以 2^34.到目前为止,它与 Microsoft 相同,只是使用了更高的乘数.然而,在那之后,它从结果中减去 (dividend/2^31).

Once I tested the same division by 10 on ARM using GCC, though, the compiler did something slightly different. First it multiplied the dividend by 0x66666667 (2^34/10), then it divided the result by 2^34. Thus far it's the same as Microsoft, except using a higher multiplier. After that, however, it subtracted (dividend/2^31) from the result.

我的问题:为什么在 ARM 版本上有额外的减法?你能给我一个数字例子,如果没有那个减法,结果会是错误的吗?

My question: why on the ARM version there's that extra subtraction? Can you give me an numeric example where without that subtraction the result will be wrong?

如果你想检查生成的代码,它在下面(有我的评论):

If you want to check the generated code, it's below (with my comments):

        ldr     r2, [r7, #4] @--this loads the dividend from memory into r2
        movw    r3, #:lower16:1717986919 @--moves the lower 16 bits of the constant 
        movt    r3, #:upper16:1717986919 @--moves the upper 16 bits of the constant
        smull   r1, r3, r3, r2 @--multiply long, put lower 32 bits in r1, higher 32 in r3
        asr     r1, r3, #2 @--r3>>2, then store in r1 (effectively >>34, since r3 was higher 32 bits of multiplication)
        asr     r3, r2, #31 @--dividend>>31, then store in r3
        rsb     r3, r3, r1 @--r1 - r3, store in r3
        str     r3, [r7, #0] @--this stores the result in memory (from r3) 

推荐答案

然而,在那之后,它从结果中减去了 (dividend/2^31).

After that, however, it subtracted (dividend/2^31) from the result.

其实就是减去dividend>>31,对于负的dividend-1,对于非负的被除数为0,当负整数右移是算术右移时(和int 是 32 位宽).

Actually, it subtracts dividend >> 31, which is -1 for negative dividend, and 0 for non-negative dividend, when right-shifting negative integers is arithmetic right-shift (and int is 32 bits wide).

0x6666667 = (2^34 + 6)/10

所以对于 x <0,我们有,用 -10 < 写 x = 10*k + rr <= 0,

So for x < 0, we have, writing x = 10*k + r with -10 < r <= 0,

0x66666667 * (10*k+r) = (2^34+6)*k + (2^34 + 6)*r/10 = 2^34*k + 6*k + (2^34+6)*r/10

现在,负整数的算术右移产生v/2^n的下限,所以

Now, arithmetic right shift of negative integers yields the floor of v / 2^n, so

(0x66666667 * x) >> 34

结果

k + floor((6*k + (2^34+6)*r/10) / 2^34)

所以我们需要看到

-2^34 < 6*k + (2^34+6)*r/10 < 0

正确的不等式很容易,kr 都是非正的,而且不是都为 0.

The right inequality is easy, both k and r are non-positive, and not both are 0.

对于左不等式,还需要更多的分析.

For the left inequality, a bit more analysis is needed.

r >= -9

所以(2^34+6)*r/10的绝对值最多为2^34+6 - (2^34+6)/10.

|k| <= 2^31/10,

所以 <代码>|6*k|<= 3*2^31/5.

还有待验证

6 + 3*2^31/5 < (2^34+6)/10
1288490194   < 1717986919

是的,没错.

这篇关于GCC/ARM 上的快速除法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆