什么是随机(Java 7)中的181783497276652981和8682522807148012? [英] What's with 181783497276652981 and 8682522807148012 in Random (Java 7)?

查看:645
本文介绍了什么是随机(Java 7)中的181783497276652981和8682522807148012?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

为什么在 Random.java 181783497276652981 8682522807148012 c>?



以下是来自Java SE JDK 1.7的相关源代码:

  / ** 
*创建一个新的随机数生成器。此构造函数将
*随机数生成器的种子设置为很可能
*的值,以区别于此构造函数的任何其他调用。
* /
public Random(){
this(seedUniquifier()^ System.nanoTime());
}

private static long seedUniquifier(){
// L'Ecuyer,
的线性同余生成器表//不同大小和良好的格子结构 ,1999
for(;;){
long current = seedUniquifier.get();
long next = current * 181783497276652981L;
if(seedUniquifier.compareAndSet(current,next))
返回下一个;
}
}

私有静态最终AtomicLong seedUniquifier
= new AtomicLong(8682522807148012L);

因此,无需调用新的Random()任何种子参数都使用当前的种子无统一者并将其与 System.nanoTime()进行异或。然后它使用 181783497276652981 创建另一个种子unquifier,以便下次存储调用新的Random()。 / p>

文字 181783497276652981L 8682522807148012L 未放置在常量中,但它们没有出现在其他任何地方。



首先,评论让我轻松领先。在线搜索该文章会产生实际文章 8682522807148012 未出现在论文中,但 181783497276652981 确实出现 - 作为另一个数字的子字符串, 1181783497276652981 181783497276652981 ,前缀 1



该论文声称 1181783497276652981 是一个能为线性同余生成器带来良好优点的数字。这个数字是否被错误地复制到Java中? 181783497276652981 是否有可接受的优点?



为什么 8682522807148012 选择?



在线搜索任何一个数字都不会产生任何解释,只有此页还注意到 181783497276652981 1 >。



是否可以选择其他数字与这两个数字相同?为什么或为什么不呢?

解决方案



  1. 这是多少?只是错误地复制到Java?


    是的,似乎是一个错字。



  2. 181783497276652981是否具有可接受的优点?


    这可以通过评估来确定本文提出的算法。但原始数字的优点可能更高。



  3. 为什么选择了8682522807148012?


    似乎是随机的。它可能是编写代码时System.nanoTime()的结果。



  4. 是否可以选择其他数字和这两个数字一样好吗?


    并非每个数字都同样好。所以,没有。




播种策略



不同版本和JRE实现之间的默认播种模式的差异。

  public Random (){this(System.currentTimeMillis()); } 





  public Random( ){this(++ seedUniquifier + System.nanoTime()); } 





  public Random( ){this(seedUniquifier()^ System.nanoTime()); } 

如果您连续创建多个RNG,则不接受第一个。如果它们的创建时间落在相同的毫秒范围内,它们将给出完全相同的序列。 (相同的种子=>相同的序列)



第二个不是线程安全的。在同时初始化时,多个线程可以获得相同的RNG。另外,后续初始化的种子倾向于相关。根据系统的实际定时器分辨率,种子序列可以线性增加(n,n + 1,n + 2,...)。如随机种子需要与众不同?和引用的论文伪随机数生成器初始化的常见缺陷,相关种子可以在多个RNG的实际序列之间生成相关性。



第三种方法创建随机分布且因此不相关的种子,甚至跨线程和随后的初始化。
所以当前的java文档:


这个构造函数将随机数生成器的种子设置为
值非常可能与此
构造函数的任何其他调用都不同。


可以通过跨线程和不相关进行扩展



种子序列质量



但播种序列的随机性仅与基础RNG一样好。
此java实现中用于种子序列的RNG使用乘法线性同余生成器(MLCG),其中c = 0且m = 2 ^ 64。 (模数2 ^ 64由64位长整数的溢出隐式给出)
由于零c和2模幂,质量(周期长度,位相关,... 。) 是有限的。正如文章所说,除了整个周期长度之外,每个比特都有一个自己的周期长度,对于较低有效位,它会呈指数级下降。因此,较低位具有较小的重复模式。 (seedUniquifier()的结果在实际RNG中被截断为48位之前应该是位反转的)



但它很快!为了避免不必要的比较和设置循环,循环体应该很快。这可能解释了这个特定MLCG的使用,没有添加,没有xoring,只有一个乘法。



并且上面提到的论文列出了c =的良好乘数列表0和m = 2 ^ 64,如1181783497276652981。



总而言之:A为努力@JRE-开发人员;)但是有一个错字。
(但谁知道,除非有人对其进行评估,否则缺失的前导1实际上可能会改善播种RNG。)



但是一些乘数肯定是更糟糕的是:
1导致一个恒定的序列。
2导致单位移动序列(以某种方式相关)
...



RNG的序列间相关实际上与(蒙特卡罗)模拟相关,其中多个随机序列被实例化甚至并行化。因此,需要一个良好的播种策略来进行独立模拟运行。因此,C ++ 11标准引入了种子序列的概念。用于生成不相关的种子。


Why were 181783497276652981 and 8682522807148012 chosen in Random.java?

Here's the relevant source code from Java SE JDK 1.7:

/**
 * Creates a new random number generator. This constructor sets
 * the seed of the random number generator to a value very likely
 * to be distinct from any other invocation of this constructor.
 */
public Random() {
    this(seedUniquifier() ^ System.nanoTime());
}

private static long seedUniquifier() {
    // L'Ecuyer, "Tables of Linear Congruential Generators of
    // Different Sizes and Good Lattice Structure", 1999
    for (;;) {
        long current = seedUniquifier.get();
        long next = current * 181783497276652981L;
        if (seedUniquifier.compareAndSet(current, next))
            return next;
    }
}

private static final AtomicLong seedUniquifier
    = new AtomicLong(8682522807148012L);

So, invoking new Random() without any seed parameter takes the current "seed uniquifier" and XORs it with System.nanoTime(). Then it uses 181783497276652981 to create another seed uniquifier to be stored for the next time new Random() is called.

The literals 181783497276652981L and 8682522807148012L are not placed in constants, but they don't appear anywhere else.

At first the comment gives me an easy lead. Searching online for that article yields the actual article. 8682522807148012 doesn't appear in the paper, but 181783497276652981 does appear -- as a substring of another number, 1181783497276652981, which is 181783497276652981 with a 1 prepended.

The paper claims that 1181783497276652981 is a number that yields good "merit" for a linear congruential generator. Was this number simply mis-copied into Java? Does 181783497276652981 have an acceptable merit?

And why was 8682522807148012 chosen?

Searching online for either number yields no explanation, only this page that also notices the dropped 1 in front of 181783497276652981.

Could other numbers have been chosen that would have worked as well as these two numbers? Why or why not?

解决方案

  1. Was this number simply mis-copied into Java?

    Yes, seems to be a typo.

  2. Does 181783497276652981 have an acceptable merit?

    This could be determined using the evaluation algorithm presented in the paper. But the merit of the "original" number is probably higher.

  3. And why was 8682522807148012 chosen?

    Seems to be random. It could be the result of System.nanoTime() when the code was written.

  4. Could other numbers have been chosen that would have worked as well as these two numbers?

    Not every number would be equally "good". So, no.

Seeding Strategies

There are differences in the default-seeding schema between different versions and implementation of the JRE.

public Random() { this(System.currentTimeMillis()); }

public Random() { this(++seedUniquifier + System.nanoTime()); }

public Random() { this(seedUniquifier() ^ System.nanoTime()); }

The first one is not acceptable if you create multiple RNGs in a row. If their creation times fall in the same millisecond range, they will give completely identical sequences. (same seed => same sequence)

The second one is not thread safe. Multiple threads can get identical RNGs when initializing at the same time. Additionally, seeds of subsequent initializations tend to be correlated. Depending on the actual timer resolution of the system, the seed sequence could be linearly increasing (n, n+1, n+2, ...). As stated in How different do random seeds need to be? and the referenced paper Common defects in initialization of pseudorandom number generators, correlated seeds can generate correlation among the actual sequences of multiple RNGs.

The third approach creates randomly distributed and thus uncorrelated seeds, even across threads and subsequent initializations. So the current java docs:

This constructor sets the seed of the random number generator to a value very likely to be distinct from any other invocation of this constructor.

could be extended by "across threads" and "uncorrelated"

Seed Sequence Quality

But the randomness of the seeding sequence is only as good as the underlying RNG. The RNG used for the seed sequence in this java implementation uses a multiplicative linear congruential generator (MLCG) with c=0 and m=2^64. (The modulus 2^64 is implicitly given by the overflow of 64bit long integers) Because of the zero c and the power-of-2-modulus, the "quality" (cycle length, bit-correlation, ...) is limited. As the paper says, besides the overall cycle length, every single bit has an own cycle length, which decreases exponentially for less significant bits. Thus, lower bits have a smaller repetition pattern. (The result of seedUniquifier() should be bit-reversed, before it is truncated to 48-bits in the actual RNG)

But it is fast! And to avoid unnecessary compare-and-set-loops, the loop body should be fast. This probably explains the usage of this specific MLCG, without addition, without xoring, just one multiplication.

And the mentioned paper presents a list of good "multipliers" for c=0 and m=2^64, as 1181783497276652981.

All in all: A for effort @ JRE-developers ;) But there is a typo. (But who knows, unless someone evaluates it, there is the possibility that the missing leading 1 actually improves the seeding RNG.)

But some multipliers are definitely worse: "1" leads to a constant sequence. "2" leads to a single-bit-moving sequence (somehow correlated) ...

The inter-sequence-correlation for RNGs is actually relevant for (Monte Carlo) Simulations, where multiple random sequences are instantiated and even parallelized. Thus a good seeding strategy is necessary to get "independent" simulation runs. Therefore the C++11 standard introduces the concept of a Seed Sequence for generating uncorrelated seeds.

这篇关于什么是随机(Java 7)中的181783497276652981和8682522807148012?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆