用连续的数字播种java.util.Random [英] Seeding java.util.Random with consecutive numbers

查看:99
本文介绍了用连续的数字播种java.util.Random的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已将我遇到的错误简化为以下几行代码:

I've simplified a bug I'm experiencing down to the following lines of code:

    int[] vals = new int[8];
    for (int i = 0; i < 1500; i++)
        vals[new Random(i).nextInt(8)]++;
    System.out.println(Arrays.toString(vals));

输出为:[0,0,0,0,0,1310,190,0]

The output is: [0, 0, 0, 0, 0, 1310, 190, 0]

这只是选择连续数字来种子随机然后使用功率为2的nextInt的工件吗?如果是这样,是否还有其他类似的陷阱我应该知道,如果没有,我做错了什么? (我不是在寻找上述问题的解决方案,只是了解其他可能出错的地方)

Is this just an artifact of choosing consecutive numbers to seed Random and then using nextInt with a power of 2? If so, are there other pitfalls like this I should be aware of, and if not, what am I doing wrong? (I'm not looking for a solution to the above problem, just some understanding about what else could go wrong)

丹,写得很好的分析。由于javadoc非常明确地计算了数字的计算方式,因此为什么会发生这种情况并不是一个谜,就像有其他类似的异常需要注意一样 - 我没有看到任何关于连续种子的文档,而我我希望有一些java.util.Random经验的人可以指出其他常见的陷阱。

Dan, well-written analysis. As the javadoc is pretty explicit about how numbers are calculated, it's not a mystery as to why this happened as much as if there are other anomalies like this to watch out for-- I didn't see any documentation about consecutive seeds, and I'm hoping someone with some experience with java.util.Random can point out other common pitfalls.

至于代码,需要多个并行代理可重复使用偶然的行为,他们恰好从enum 8元素中选择,只要他们的第一步。一旦我发现了这种行为,种子都来自一个从已知种子创建的主随机对象。在程序的前一个(顺序播种)版本中,所有行为在第一次调用nextInt后迅速分散,所以我花了很长时间才将程序的行为缩小到RNG库,我想避免未来的情况。

As for the code, the need is for several parallel agents to have repeatably random behavior who happen to choose from an enum 8 elements long as their first step. Once I discovered this behavior, the seeds all come from a master Random object created from a known seed. In the former (sequentially-seeded) version of the program, all behavior quickly diverged after that first call to nextInt, so it took quite a while for me to narrow the program's behavior down to the RNG library, and I'd like to avoid that situation in the future.

推荐答案

RNG的种子本身应该是随机的。您正在使用的种子只会有一两位不同。

As much as possible, the seed for an RNG should itself be random. The seeds that you are using are only going to differ in one or two bits.

在一个程序中创建两个单独的RNG的理由非常少。你的代码不是那些有意义的情况之一。

There's very rarely a good reason to create two separate RNGs in the one program. Your code is not one of those situations where it makes sense.

只需创建一个RNG并重复使用它,那么你就不会遇到这个问题。

Just create one RNG and reuse it, then you won't have this problem.

回应 mmyers 的评论:

In response to comment from mmyers:


你碰巧知道java.util.Random
足以解释为什么它选择5
和6 case?

Do you happen to know java.util.Random well enough to explain why it picks 5 and 6 in this case?

答案在java.util.Random的源代码中,这是一个线性同余RNG 。当您在构造函数中指定种子时,它将按如下方式进行操作。

The answer is in the source code for java.util.Random, which is a linear congruential RNG. When you specify a seed in the constructor, it is manipulated as follows.

seed = (seed ^ 0x5DEECE66DL) & mask;

掩码只保留较低的48位并丢弃其他位。

Where the mask simply retains the lower 48 bits and discards the others.

当生成实际随机位时,此种子操作如下:

When generating the actual random bits, this seed is manipulated as follows:

randomBits = (seed * 0x5DEECE66DL + 0xBL) & mask;

现在,如果您考虑 Parker 是顺序的(0-1499),它们被使用一次然后丢弃,前四个种子产生以下四组随机位:

Now if you consider that the seeds used by Parker were sequential (0 -1499), and they were used once then discarded, the first four seeds generated the following four sets of random bits:

101110110010000010110100011000000000101001110100
101110110001101011010101011100110010010000000111
101110110010110001110010001110011101011101001110
101110110010011010010011010011001111000011100001

请注意,前10位在每种情况下都是缩进的。这是一个问题,因为他只想生成0-7范围内的值(只需要几位),而RNG实现通过将高位向右移位并丢弃低位来实现。这样做是因为在一般情况下,高位比低位更随机。在这种情况下,它们不是因为种子数据很差。

Note that the top 10 bits are indentical in each case. This is a problem because he only wants to generate values in the range 0-7 (which only requires a few bits) and the RNG implementation does this by shifting the higher bits to the right and discarding the low bits. It does this because in the general case the high bits are more random than the low bits. In this case they are not because the seed data was poor.

最后,要了解这些位如何转换为我们得到的十进制值,您需要知道java当n是2的幂时,.util.Random会出现一个特殊情况。它请求31个随机位(上面48位的前31位),将该值乘以n,然后将它向右移31位。

Finally, to see how these bits convert into the decimal values that we get, you need to know that java.util.Random makes a special case when n is a power of 2. It requests 31 random bits (the top 31 bits from the above 48), multiplies that value by n and then shifts it 31 bits to the right.

乘以8(本例中的n值)与左移3位相同。因此,此过程的净效果是将31位28位置向右移动。在上面的4个例子的每一个中,这留下了位模式101(或十进制的5)。

Multiplying by 8 (the value of n in this example) is the same as shifting left 3 places. So the net effect of this procedure is to shift the 31 bits 28 places to the right. In each of the 4 examples above, this leaves the bit pattern 101 (or 5 in decimal).

如果我们在一个值之后没有丢弃RNG,我们会看到序列分歧。虽然上面的四个序列都以5开头,但每个的第二个值分别为6,0,2和4。初始种子的微小差异开始产生影响。

If we didn't discard the RNGs after just one value, we would see the sequences diverge. While the four sequences above all start with 5, the second values of each are 6, 0, 2 and 4 respectively. The small differences in the initial seeds start to have an influence.

响应更新的问题: java.util.Random 是线程安全的,你可以共享一个跨多个线程的实例,因此仍然不需要有多个实例。如果确实必须有多个RNG实例,请确保它们彼此完全独立,否则您不能相信输出是独立的。

In response to the updated question: java.util.Random is thread-safe, you can share one instance across multiple threads, so there is still no need to have multiple instances. If you really have to have multiple RNG instances, make sure that they are seeded completely independently of each other, otherwise you can't trust the outputs to be independent.

至于为什么会得到这些效果,java.util.Random不是最好的RNG。它很简单,非常快,如果你看起来不太近,那就相当随机。但是,如果您对其输出运行严重测试,您会发现它存在缺陷。您可以在视觉上看到此处

As to why you get these kind of effects, java.util.Random is not the best RNG. It's simple, pretty fast and, if you don't look too closely, reasonably random. However, if you run some serious tests on its output, you'll see that it's flawed. You can see that visually here.

如果您需要更随机的RNG,可以使用 java.security.SecureRandom 。这有点慢,但它运作正常。对你来说可能有问题的一件事是它不可重复。具有相同种子的两个SecureRandom实例将不会提供相同的输出。这是设计的。

If you need a more random RNG, you can use java.security.SecureRandom. It's a fair bit slower, but it works properly. One thing that might be a problem for you though is that it is not repeatable. Two SecureRandom instances with the same seed won't give the same output. This is by design.

那么还有其他选择吗?这是我插入我自己的图书馆的地方。它包括3个可重复的伪RNG,比SecureRandom更快,比java.util.Random更随机。我没有发明它们,我只是从最初的C版本移植它们。它们都是线程安全的。

So what other options are there? This is where I plug my own library. It includes 3 repeatable pseudo-RNGs that are faster than SecureRandom and more random than java.util.Random. I didn't invent them, I just ported them from the original C versions. They are all thread-safe.

我实现了这些RNG,因为我需要更好的东西我的进化计算代码。根据我原来的简要回答,这段代码是多线程的,但它只使用一个RNG实例,在所有线程之间共享。

I implemented these RNGs because I needed something better for my evolutionary computation code. In line with my original brief answer, this code is multi-threaded but it only uses a single RNG instance, shared between all threads.

这篇关于用连续的数字播种java.util.Random的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆