需要帮助理解Rabin-Karp实现的常数时间的Rolling Hash计算 [英] Need help in understanding Rolling Hash computation in constant time for Rabin-Karp Implementation
问题描述
我一直在尝试用Java实现Rabin-Karp算法。我很难在恒定时间内计算滚动哈希值。我在 http://algs4.cs.princeton.edu/找到了一个实现。 53substring / RabinKarp.java.html 。我仍然无法理解这两条线是如何工作的。
I've been trying to implement Rabin-Karp algorithm in Java. I have hard time computing the rolling hash value in constant time. I've found one implementation at http://algs4.cs.princeton.edu/53substring/RabinKarp.java.html. Still I could not get how these two lines work.
txtHash = (txtHash + Q - RM*txt.charAt(i-M) % Q) % Q;
txtHash = (txtHash*R + txt.charAt(i)) % Q;
我查看了几篇关于模运算的文章,但没有文章可以穿透我厚厚的头骨。请给出一些指示来理解这一点。
I looked at couple of articles on modular arithmetic but no article could able to penetrate my thick skull. Please give some pointers to understand this.
推荐答案
这是哈希的滚动方面。它消除了最旧字符( txt.charAt(iM)
)的贡献,并且包含了最新字符的贡献( txt.charAt(i )
)。
This is the "rolling" aspect of the hash. It's eliminating the contribution of the oldest character (txt.charAt(i-M)
), and incorporating the contribution of the newest character(txt.charAt(i)
).
哈希函数定义为:
M-1
hash[i] = ( SUM { input[i-j] * R^j } ) % Q
j=0
(我使用 ^
来表示强大的力量。)
(where I'm using ^
to denote "to the power of".)
但这可以写成一个有效的递归实现:
But this can be written as an efficient recursive implementation as:
hash[i] = (txtHash*R - input[i-M]*(R^M) + input[i]) % Q
您的参考代码正在执行此操作,但它使用各种技术来确保始终正确计算结果(并且有效)。
Your reference code is doing this, but it's using various techniques to ensure that the result is always computed correctly (and efficiently).
因此,对于例如,第一个表达式中的 + Q
没有数学效果,但它确保总和的结果总是正数(如果它变为负数,%Q
d没有预期的效果)。它也将计算分为几个阶段,大概是为了防止数值溢出。
So, for instance, the + Q
in the first expression has no mathematical effect, but it ensures that the result of the sum is always positive (if it goes negative, % Q
doesn't have the desired effect). It's also breaking the calculation into stages, presumably to prevent numerical overflow.
这篇关于需要帮助理解Rabin-Karp实现的常数时间的Rolling Hash计算的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!