Best way to seed mt19937_64 for Monte Carlo simulations


Question

I'm working on a program that runs Monte Carlo simulations; specifically, I'm using a Metropolis algorithm. The program may need to generate billions of "random" numbers. I know that the Mersenne Twister is very popular for Monte Carlo simulation, but I would like to make sure that I am seeding the generator in the best way possible.

Currently I'm computing a 32-bit seed using the following method:

#include <ctime>   // std::time, std::clock
#include <random>  // std::mt19937_64

std::mt19937_64 prng;     // pseudo-random number generator
unsigned long seed;       // stored so that a run can be repeated with the same sequence
unsigned char seed_count; // helps keep seeds generated close together in time from repeating

unsigned long genSeed() {
    return (  static_cast<unsigned long>(std::time(nullptr))   << 16 )
         | ( (static_cast<unsigned long>(std::clock()) & 0xFF) << 8  )
         | ( (static_cast<unsigned long>(seed_count++) & 0xFF) );
}

//...

seed = genSeed();
prng.seed(seed);

I have a feeling there are much better ways to ensure that new seeds do not repeat, and I'm quite sure mt19937_64 can be seeded with more than 32 bits. Does anyone have any suggestions?

Recommended answer

To recap (including the comments): we want to generate different seeds, and so get independent sequences of random numbers, in each of the following situations:

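  1. the program is relaunched on the same machine later,
  2. two threads are launched on the same machine at the same time,
  3. the program is launched on two different machines at the same time.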

Case 1 is solved by using the time since epoch, case 2 with a global atomic counter, and case 3 with a platform-dependent id (see How to obtain (almost) unique system identifier in a cross platform way?).
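
As a rough illustration of the three ingredients, here is a minimal C++ sketch. The names SeedParts, collect_seed_parts and the machine_id string parameter are hypothetical and not part of the original answer; how to actually obtain a platform-dependent id is left to the caller, and hashing it with std::hash is just one assumed way to turn it into a number.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <functional>
#include <string>

// Ingredient 1: time since epoch, in the system clock's native ticks.
inline std::uint64_t time_component() {
    const auto ticks = std::chrono::system_clock::now().time_since_epoch().count();
    return static_cast<std::uint64_t>(ticks);
}

// Ingredient 2: a process-wide atomic counter. Every call returns a new
// value, so two threads asking at the same instant still get different numbers.
inline std::uint64_t counter_component() {
    static std::atomic<std::uint64_t> counter{0};
    return counter.fetch_add(1, std::memory_order_relaxed);
}

// Ingredient 3: a platform-dependent id. Assumption: the caller supplies some
// machine-specific string (hostname, MAC address, ...), reduced to a number here.
inline std::uint64_t machine_component(const std::string& machine_id) {
    return static_cast<std::uint64_t>(std::hash<std::string>{}(machine_id));
}

// Hypothetical helper type bundling the three raw ingredients.
struct SeedParts {
    std::uint64_t time;
    std::uint64_t counter;
    std::uint64_t machine;
};

inline SeedParts collect_seed_parts(const std::string& machine_id) {
    return SeedParts{time_component(), counter_component(), machine_component(machine_id)};
}
```

Calling collect_seed_parts twice in the same process yields the same machine value but consecutive counter values, which is what keeps two simultaneous threads apart.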

The question now is how best to combine them into a uint_fast64_t (the seed type of std::mt19937_64). I assume here that we do not know the range of each parameter a priori, or that the ranges are too large, so we cannot simply pack the values together with bit shifts to get a unique seed.

A std::seed_seq would be the easy way to go; however, it produces 32-bit values (its result_type is uint_least32_t), which is not the best fit for our purpose.
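
For reference, the easy route might look like the sketch below. The function name make_engine_from_seed_seq and its three parameters are hypothetical. Because std::seed_seq keeps only the low 32 bits of each input value, the sketch splits every 64-bit ingredient into two 32-bit words first.

```cpp
#include <cstdint>
#include <random>

// Builds an engine from three 64-bit ingredients via std::seed_seq.
// Each ingredient is passed as a (high word, low word) pair so that no
// bits are dropped when seed_seq truncates its inputs to 32 bits.
inline std::mt19937_64 make_engine_from_seed_seq(std::uint64_t time_ticks,
                                                 std::uint64_t counter,
                                                 std::uint64_t machine_id) {
    std::seed_seq seq{
        static_cast<std::uint32_t>(time_ticks >> 32),
        static_cast<std::uint32_t>(time_ticks),
        static_cast<std::uint32_t>(counter >> 32),
        static_cast<std::uint32_t>(counter),
        static_cast<std::uint32_t>(machine_id >> 32),
        static_cast<std::uint32_t>(machine_id)};
    std::mt19937_64 engine(seq);  // the engine fills its state from seq
    return engine;
}
```

The same three inputs always reproduce the same engine, so a run can be repeated by storing the inputs rather than a single seed value.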

A good 64-bit hasher is a much better choice. The standard library offers std::hash in the <functional> header; one possibility is to concatenate the three numbers above into a string and pass it to the hasher. The return type is size_t, which on 64-bit machines is very likely to match our requirements.
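
A minimal sketch of the string-hashing idea follows. The function name hash_seed, its parameters, and the '|' separator are assumptions made for illustration. Note that std::hash is only required to give consistent results within a single execution of a program, so it is safer to store the resulting seed than to rely on recomputing it later.

```cpp
#include <cstdint>
#include <functional>
#include <random>
#include <string>

// Concatenates the three ingredients into one string and hashes it.
// The separator keeps inputs such as (1, 23) and (12, 3) from producing
// the same string.
inline std::uint64_t hash_seed(std::uint64_t time_ticks,
                               std::uint64_t counter,
                               const std::string& machine_id) {
    const std::string key = std::to_string(time_ticks) + '|' +
                            std::to_string(counter) + '|' + machine_id;
    // std::hash returns size_t, which is 64 bits wide on typical 64-bit platforms.
    return static_cast<std::uint64_t>(std::hash<std::string>{}(key));
}

// Convenience wrapper: build the engine directly from the hashed seed.
inline std::mt19937_64 make_engine_from_hash(std::uint64_t time_ticks,
                                             std::uint64_t counter,
                                             const std::string& machine_id) {
    return std::mt19937_64(hash_seed(time_ticks, counter, machine_id));
}
```

The quality of the seed depends on the library's std::hash implementation, which the standard does not specify; a dedicated 64-bit hash function could be swapped in at the same place.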

Collisions are unlikely but of course possible. If you want to be sure that your statistics never include the same sequence more than once, the only option is to store the seeds and discard duplicated runs.
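
Storing the seeds and rejecting repeats can be sketched as follows. The class name SeedRegistry is hypothetical, and this version only tracks seeds in memory within one process; sharing the record across machines or runs would need some persistent storage, which is outside this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Remembers every seed that has been used so far.
class SeedRegistry {
public:
    // Returns true if the seed is new and has now been recorded,
    // false if it was already used (the run should then be discarded).
    bool register_seed(std::uint64_t seed) {
        return used_.insert(seed).second;
    }

    // Number of distinct seeds recorded so far.
    std::size_t size() const { return used_.size(); }

private:
    std::unordered_set<std::uint64_t> used_;
};
```

A run would call register_seed before starting and skip the simulation, or draw a new seed, whenever it returns false.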

A std::random_device could also be used to generate the seeds (collisions may still happen; it is hard to say whether more or less often). However, since the implementation is library dependent and may fall back to a pseudo-random generator, it is mandatory to check the entropy of the device and to avoid using a zero-entropy device for this purpose, as it would probably break the points above (especially point 3). Unfortunately, you can discover the entropy only by taking the program to the specific machine and testing it with the installed library.
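
The entropy check could be sketched like this. The helper names combine_words and seed_from_random_device are hypothetical. One caveat, stated as a hedge: some standard library implementations report an entropy of 0 even when the device is backed by a real source, so a false result here means "cannot be confirmed", not necessarily "deterministic".

```cpp
#include <cstdint>
#include <random>

// Joins two 32-bit words into one 64-bit value (high word first).
inline std::uint64_t combine_words(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Writes a 64-bit seed to seed_out and returns true only if the device
// reports non-zero entropy; otherwise leaves seed_out untouched and
// returns false so the caller can fall back to another scheme.
inline bool seed_from_random_device(std::uint64_t& seed_out) {
    std::random_device device;
    if (device.entropy() == 0.0) {
        return false;
    }
    // Two draws are taken, keeping 32 bits of each.
    const auto high = static_cast<std::uint32_t>(device());
    const auto low = static_cast<std::uint32_t>(device());
    seed_out = combine_words(high, low);
    return true;
}
```

When seed_from_random_device returns false, falling back to the time, counter and machine id combination described earlier keeps the three requirements covered.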
