哪些是在一个Int32或UInt32的散列位好方法呢? [英] What are good methods for hashing bits in an Int32 or UInt32?

查看:134
本文介绍了哪些是在一个Int32或UInt32的散列位好方法呢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个伪随机数发生器的实施方案,具体的乔治马尔萨利亚的异或移RNG。我的实现是在这里:

I have an implementation of a pseudo random number generator, specifically of George Marsaglia's XOR-Shift RNG. My implementation is here:

<一个href=\"http://sharpneat.svn.sourceforge.net/svnroot/sharpneat/trunk/SharpNeatV2/src/SharpNeatLib/Utility/FastRandom.cs\"相对=nofollow> FastRandom.cs

原来,第一随机样本是非常密切的种子,如果你看一看的重新初始化(INT种子)方法,该方法是相当明显的相关性。这不好。我提出的解决方案是混合了种子位如下:

It turns out that the first random sample is very closely correlated with the seed, which is fairly obvious if you take a look at the Reinitialise(int seed) method. This is bad. My proposed solution is to mix up the bits of the seed as follows:

_x = (uint)(  (seed * 2147483647) 
           ^ ((seed << 16 | seed >> 48) * 28111) 
           ^ ((seed << 32 | seed >> 32) * 69001)
           ^ ((seed << 48 | seed >> 16) * 45083));

所以我显著由种子的位有四个质数相乘和异或回形成_x减弱任何相关性。我乘法之前也旋转种子的位,以确保幅度不一位混合起来在整个范围值的32位值。

So I have significantly weakened any correlation by multiplying the seed's bits with four primes and XORing back to form _x. I also rotate the seed's bits before multiplication to ensure that bits of varying magnitudes get mixed up across the full range of values for a 32 bit value.

四向旋转只是似乎很喜欢无所事事和每一个可能的转角(32)之间的一个很好的平衡。该素数是手指在空中 - 足够的幅度和位结构,在整个32位混杂起来的点点的$ P $垫'他们不管出发种子

The four-way rotation just seemed liked a nice balance between doing nothing and every possible rotation (32). The primes are 'finger in the air' - enough magnitude and bit structure to jumble up the bits and 'spread' them over the full 32 bits regardless of the starting seed.

我应该用更大的质数?有没有一个标准的办法解决这个问题,也许更正式的依据?我想以最小的CPU开销做到这一点。

Should I use bigger primes? Is there a standard approach to this problem, perhaps with a more formal basis? I am trying to do this with minimal CPU overhead.

感谢

===更新===

我决定使用一些素数,世界各地的32位分布较好的位。其结果是,我可以省略转变为乘法达到同样的效果(散列在整个范围内的32位位),所以我然后只添加四个产品,以得到最终的种子......

I decided to use some primes with set bits better distributed across all 32 bits. The result is that I can omit the shifts as the multiplications achieve the same effect (hashing bits across the full range of 32 bits), so I then just add the four products to give the final seed...

_x = (uint)(  (seed * 1431655781) 
            + (seed * 1183186591) 
            + (seed * 622729787)
            + (seed * 338294347));

我可能逃脱较少的素数/ multiplciations。两个似乎太少了(我仍然可以看到的第一个样本模式),三看着好了,安全余量我做到了四人。

I could possibly get away with fewer primes/multiplciations. Two seemed too few (I could still see patterns in the first samples), three looked OK, so for a safety margin I made it four.

===更新2 ===

=== UPDATE 2 ===

供参考以上减少到功能上等同的:

FYI the above reduces to the functionally equivalent:

_x = seed * 3575866506U;

我起初并没有发现这一点,当我做我在想,如果在计算中不同阶段的溢出会导致不同的结果。我认为答案是否定的 - 两种计算总是给相同的答案

I didn't spot this initially and when I did I was wondering if overflowing at different stages in the calculation would cause a different result. I believe the answer is no - the two calculations always give the same answer.

推荐答案

据一些研究人员, CrapWow Crap8 并的 Murmur3 是是最好的非加密散列算法可今天既快速,简便,统计学不错。

According to some researchers, CrapWow, Crap8 and Murmur3 are the best non-cryptographic hash algorithms available today that are both fast, simple and statistically good.

更多信息,请访问非加密散列函数动物园

这篇关于哪些是在一个Int32或UInt32的散列位好方法呢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆