Possible sources for random number seeds

Question

Two points up front. First, the example is in Fortran, but I think it should hold for any language. Second, the built-in random number generators are not truly random and other generators exist, but we're not interested in using them for what we're doing.

Most discussions of random seeds acknowledge that if the program doesn't seed the generator at run time, the seed is fixed at compile time. The same sequence of numbers is then generated every time the program is run, which is not good for random numbers. One way to overcome this is to seed the random number generator from the system clock.

However, when running in parallel with MPI on a multi-core machine, the system clock approach gave us the same kind of problem. While the sequences changed from run to run, all ranks read the same system clock value and thus got the same random seed and the same sequence.

So consider the following example code:

PROGRAM clock_test
   IMPLICIT NONE
   INCLUDE "mpif.h"
   INTEGER :: ierr, rank, clock, i, n, method
   INTEGER, DIMENSION(:), ALLOCATABLE :: seed
   REAL(KIND=8) :: random
   INTEGER, PARAMETER :: OLD_METHOD = 0, &
                         NEW_METHOD = 1

   CALL MPI_INIT(ierr)

   CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

   CALL RANDOM_SEED(SIZE=n)
   ALLOCATE(seed(n))

   DO method = 0, 1
      SELECT CASE (method)
      CASE (OLD_METHOD)
         ! Seed from the system clock: ranks started together can read the same count
         CALL SYSTEM_CLOCK(COUNT=clock)
         seed = clock + 37 * (/ (i - 1, i = 1, n) /)
         CALL RANDOM_SEED(PUT=seed)
         CALL RANDOM_NUMBER(random)

         WRITE(*,*) "OLD Rank, dev = ", rank, random
      CASE (NEW_METHOD)
         ! Seed from /dev/urandom: each rank reads its own random bytes
         OPEN(89,FILE='/dev/urandom',ACCESS='stream',FORM='UNFORMATTED')
         READ(89) seed
         CLOSE(89)
         CALL RANDOM_SEED(PUT=seed)
         CALL RANDOM_NUMBER(random)

         WRITE(*,*) "NEW Rank, dev = ", rank, random
      END SELECT
      CALL MPI_BARRIER(MPI_COMM_WORLD, ierr)
   END DO

   CALL MPI_FINALIZE(ierr)
END PROGRAM clock_test

When run on my workstation with 2 cores, this gives:

OLD Rank, dev =            0  0.330676306089146     
OLD Rank, dev =            1  0.330676306089146     
NEW Rank, dev =            0  0.531503215980609     
NEW Rank, dev =            1  0.747413828750221     

So we overcame the clock issue by reading the seed from /dev/urandom instead. This way each rank gets its own seed and therefore its own random sequence.

What other seeding approaches are there that will work on a multi-core MPI system and still give a seed that is unique on each core and changes from run to run?

Recommended answer

If you take a look at Random Numbers in Scientific Computing: An Introduction by Katzgraber (an excellent, lucid discussion of the ins and outs of using PRNGs for technical computing), for parallel runs the author suggests using a hash function of the time and the PID to generate a seed. From section 7.1:

#include <stdlib.h>   /* labs */
#include <time.h>     /* time */
#include <unistd.h>   /* getpid */

long seedgen(void)  {
    long s, seed, pid;

    pid = getpid();
    s = time(NULL); /* seconds since 01/01/1970 (Unix epoch) */

    seed = labs(((s*181)*((pid-83)*359))%104729);
    return seed;
}

Of course, in Fortran this would be something like

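function seedgen(pid)
    use iso_fortran_env, only: int64
    implicit none
    integer(kind=int64) :: seedgen
    integer, intent(IN) :: pid
    integer(kind=int64), parameter :: m = 104729_int64
    integer(kind=int64) :: s

    ! 64-bit arithmetic; s is reduced mod m first so the product cannot overflow
    call system_clock(s)
    seedgen = abs( mod((mod(s, m)*181_int64)*((int(pid, int64)-83_int64)*359_int64), m) )
end function seedgen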

It's also sometimes handy to be able to pass in the time, rather than reading it inside seedgen, so that when you are testing you can give it fixed values that then generate a reproducible (== testable) sequence.
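
As a rough sketch of that idea in C, the hash can be split out into a pure function that takes both the time and the PID as arguments. The name seedgen_from is hypothetical (it is not from the paper), and the sketch reduces the time modulo 104729 before multiplying, an added assumption that keeps the intermediate product inside a 64-bit integer without changing the result.

```c
#include <stdlib.h>   /* llabs */

/* Hypothetical helper, not from the paper: the same hash as seedgen, but
   the time `s` and process ID `pid` are supplied by the caller, so a test
   can pass fixed values and get a fixed seed back. */
long long seedgen_from(long long s, long long pid)
{
    const long long m = 104729LL;

    /* Reducing s modulo m first gives the same value as the unreduced
       formula, while keeping a * b well inside the 64-bit range. */
    long long a = (s % m) * 181LL;
    long long b = (pid - 83LL) * 359LL;

    return llabs((a * b) % m);
}
```

In production code the caller would pass time(NULL) and getpid() (or the MPI rank), while a test passes constants. Two properties of the hash are worth keeping in mind: a PID of exactly 83 always yields a seed of 0, and the result is always below 104729, so with many processes two of them can end up with the same seed.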
