Using parSapply to generate random numbers

Question

I am trying to run a function that contains a random number generator. The results are not what I expected, so I ran the following test:

# Case 1
set.seed(100)
A1 = matrix(NA,20,10)
for (i in 1:10) {
  A1[,i] = sample(1:100,20)
}

# Case 2
set.seed(100)
A2 = sapply(seq_len(10),function(x) sample(1:100,20))

# Case 3
require(parallel)
set.seed(100)
cl <- makeCluster(detectCores() - 1)
A3 = parSapply(cl,seq_len(10), function(x) sample(1:100,20))
stopCluster(cl)

# Check: Case 1 result equals Case 2 result
identical(A1,A2)
# [1] TRUE

# Check: Case 1 result does NOT equal Case 3 result
identical(A1,A3)
# [1] FALSE

# Check2: Would like to check if it's a matter of ordering
range(rowSums(A1))
# [1] 319 704

range(rowSums(A3))
# [1] 288 612

In the code above, parSapply generates a different set of random numbers than A1 and A2. I added Check2 because I suspected that parSapply might merely alter the order, but that does not seem to be the case, since the minimum and maximum row sums differ.

I would appreciate it if someone could shed some light on why parSapply gives a different result from sapply. What am I missing here?

Thanks in advance!

Recommended answer

Have a look at vignette("parallel"), in particular Section 6, "Random-number generation". Among other things, it states the following:

Some care is needed with parallel computation using (pseudo-)random numbers: the processes/threads which run separate parts of the computation need to run independent (and preferably reproducible) random-number streams.

When an R process is started up it takes the random-number seed from the object .Random.seed in a saved workspace or constructs one from the clock time and process ID when random-number generation is first used (see the help on RNG). Thus worker processes might get the same seed because a workspace containing .Random.seed was restored or the random number generator has been used before forking: otherwise these get a non-reproducible seed (but with very high probability a different seed for each worker).
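To see what that passage means for the question, here is a minimal sketch. It assumes a socket cluster as created by makeCluster in Case 3; the helper name draw_on_fresh_cluster and the choice of 2 workers are only for illustration. Calling set.seed(100) on the master seeds the master process only, so each freshly started set of workers builds its own seeds and the two runs should almost certainly differ.

```r
library(parallel)

# Hypothetical helper: start a fresh cluster, draw the samples on the
# workers, and shut the cluster down again when the function exits.
draw_on_fresh_cluster <- function() {
  cl <- makeCluster(2)          # 2 workers is an arbitrary choice
  on.exit(stopCluster(cl))
  parSapply(cl, seq_len(10), function(x) sample(1:100, 20))
}

set.seed(100)                   # seeds the master process only
run1 <- draw_on_fresh_cluster()

set.seed(100)                   # same master seed again
run2 <- draw_on_fresh_cluster()

# The workers never receive the master's seed, so the two runs are
# expected to differ even though the master was seeded identically.
differs <- !identical(run1, run2)
differs
```

This is the same effect seen in Case 3: the call to sample runs inside the worker processes, which do not use the master's random-number state.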

You should also have a look at ?clusterSetRNGStream.
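As a rough sketch of how clusterSetRNGStream could be applied to Case 3 (the helper name run_seeded and the fixed worker count of 2 are assumptions made for this example), seeding the workers through the cluster makes the parallel draw repeatable:

```r
library(parallel)

# Hypothetical helper: seed the workers' RNG streams from `seed`,
# then draw the samples in parallel.
run_seeded <- function(seed) {
  cl <- makeCluster(2)                    # keep the worker count fixed
  on.exit(stopCluster(cl))
  clusterSetRNGStream(cl, iseed = seed)   # one L'Ecuyer-CMRG stream per worker
  parSapply(cl, seq_len(10), function(x) sample(1:100, 20))
}

B1 <- run_seeded(100)
B2 <- run_seeded(100)

# Same seed and same number of workers: the two runs should match.
identical(B1, B2)
```

Note that this makes the parallel result reproducible from run to run, but it is still not expected to match A1 or A2, because the workers use their own streams rather than the master's. The result also depends on the number of workers, since that determines which stream handles which element.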
