Faster integer division when denominator is known?
Question
I am working on a GPU device with very high integer division latency, several hundred cycles. I am looking to optimize the divisions.
All divisions are by a denominator from the set {1, 3, 6, 10}, while the numerator is a positive runtime value of roughly 32000 or less. Due to memory constraints, a lookup table may not be a good option.
Can you think of alternatives? I have thought of computing floating-point inverses and multiplying the numerator by them.
Thanks.
PS. Thank you, everyone. The bit-shift hack is really cool. To recover from the roundoff, I use the following C segment:
// q = estimate of m/n from the multiply-and-shift; it can be one too small
q += (n*(q + 1) - 1) < m;
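As a rough sketch of how that correction might be combined with the multiply-and-shift estimate from the answer below: the function name `div_corrected` is hypothetical, and the code assumes unsigned 32-bit arithmetic, numerators below 32768, a denominator from {1, 3, 6, 10}, and a shift of 17. On a real GPU the reciprocal would presumably be a precomputed constant per denominator rather than computed with a division as it is here.

```c
#include <stdint.h>

/* Hypothetical helper: m / n for 0 <= m < 32768 and n in {1, 3, 6, 10}. */
static uint32_t div_corrected(uint32_t m, uint32_t n)
{
    /* Reciprocal scaled by 2^17, rounded down (assumed precomputed in practice). */
    uint32_t recip = (1u << 17) / n;

    /* Estimate of m / n; because recip was rounded down it can be one too small. */
    uint32_t q = (m * recip) >> 17;

    /* Correction step from the question: add 1 when n * (q + 1) <= m. */
    q += (n * (q + 1) - 1) < m;
    return q;
}
```

Under these assumptions the estimate is never more than one below the true quotient, so a single correction step is enough; the comparison costs one extra multiply but no division.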
Recommended answer
a/b=a*(1/b)
x=(1<<16)/b
a/b=(a*x)>>16
Can you build a lookup table for the denominators? Since you said the numerators fit in 15 bits, you could use 17 for the shift if everything is unsigned 32-bit, because the product a*x then stays below 2^32:
a/b=a*((1<<17)/b)>>17
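A minimal sketch of what such a table could look like for the four denominators in the question. The names `RECIP17` and `div_by_table` are made up for illustration, and selecting the denominator by index is an assumption about how the caller identifies it.

```c
#include <stdint.h>

/* floor(2^17 / b) for b = 1, 3, 6, 10, in that order. */
static const uint32_t RECIP17[4] = { 131072u, 43690u, 21845u, 13107u };

/* Hypothetical helper: idx selects the denominator
 * (0 -> 1, 1 -> 3, 2 -> 6, 3 -> 10); a is assumed to be below 32768. */
static uint32_t div_by_table(uint32_t a, unsigned idx)
{
    /* a < 2^15 and the table entries are at most 2^17, so the product fits in 32 bits. */
    return (a * RECIP17[idx]) >> 17;
}
```

For example, `div_by_table(100, 1)` yields 33, which matches 100/3. Because the table entries are rounded down, some inputs come out one too small: `div_by_table(32000, 3)` yields 3199 rather than 3200.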
The larger the shift, the smaller the rounding error. You can do a brute-force check to see how many times, if any, this is actually wrong.
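The brute-force check could be sketched along these lines. The names `approx_div` and `count_wrong` are hypothetical, and the 15-bit numerator range and the shift of 17 are assumptions taken from the discussion above.

```c
#include <stdint.h>

#define SHIFT   17u
#define MAX_NUM 32767u  /* assumption: numerators fit in 15 bits */

/* a / b approximated as (a * floor(2^SHIFT / b)) >> SHIFT. */
static uint32_t approx_div(uint32_t a, uint32_t b)
{
    uint32_t recip = (1u << SHIFT) / b;  /* would be precomputed per denominator */
    return (a * recip) >> SHIFT;
}

/* Count the numerators in [0, MAX_NUM] for which approx_div differs from a / b,
 * and report the largest absolute difference through max_err. */
static uint32_t count_wrong(uint32_t b, uint32_t *max_err)
{
    uint32_t wrong = 0;
    *max_err = 0;
    for (uint32_t a = 0; a <= MAX_NUM; ++a) {
        uint32_t exact = a / b;
        uint32_t approx = approx_div(a, b);
        uint32_t err = exact > approx ? exact - approx : approx - exact;
        if (err != 0)
            ++wrong;
        if (err > *max_err)
            *max_err = err;
    }
    return wrong;
}
```

Calling `count_wrong` once for each of 1, 3, 6 and 10 shows how often the plain multiply-and-shift misses and by how much. With the reciprocal rounded down, a denominator of 3 already fails at a = 3, so some correction step is needed if exact results matter.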