R doesn't reset the seed when "L'Ecuyer-CMRG" RNG is used?


Question

I was running some parallel simulations in R and noticed that the seed does not change when the "L'Ecuyer-CMRG" RNG is used. I was reading the book "Parallel R", and according to it the option mc.set.seed = TRUE should give each worker a new seed each time mclapply() is called.

Here is my code:

library(parallel)
RNGkind("L'Ecuyer-CMRG")

mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037

[[2]]
[1] -0.9013552  0.3445190

mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037

[[2]]
[1] -0.9013552  0.3445190

The same thing happens on both my desktop and my laptop (both running Ubuntu 12.04 LTS).

Recommended answer

It appears to me that if you want to guarantee that subsequent calls to mclapply in an R session get different random numbers, you need to do one of three things before calling mclapply again: call set.seed with a different value, remove the global variable ".Random.seed", or generate at least one random number in that R session.

The reason for this behavior is that mclapply (unlike mcparallel, for example) calls mc.reset.stream internally. This resets the seed that is stashed in the "parallel" package to the value of ".Random.seed". So if ".Random.seed" hasn't changed when mclapply is called again, the workers started by mclapply will get the same random numbers as they did previously.
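The effect described above can be checked directly. The following is a minimal sketch, assuming a Unix-like system (mclapply relies on forking) and an arbitrary seed of 42 chosen only for illustration; draw is a hypothetical helper name, not part of the "parallel" package.

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")
set.seed(42)  # arbitrary seed, for illustration only

# Hypothetical helper: run the same small job on two forked workers.
draw <- function() mclapply(1:2, function(i) rnorm(2), mc.cores = 2)

first  <- draw()
second <- draw()     # .Random.seed on the master is unchanged between these calls

invisible(rnorm(1))  # drawing on the master changes .Random.seed
third  <- draw()

identical(first, second)  # expected TRUE if the explanation above holds
identical(first, third)   # expected FALSE, since the master seed changed
```

If the first comparison is TRUE and the second is FALSE, that is consistent with the workers' streams being derived from the master's ".Random.seed" at the start of each mclapply call.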

Note that this is not the case with functions such as clusterApply and parLapply, since they use persistent workers, which continue to draw random numbers from their RNG streams across calls. New workers are forked every time mclapply is called, however, which presumably makes that type of behavior much harder to provide.
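For comparison, here is a small sketch of the persistent-worker case. It assumes a local two-worker cluster can be started on your machine, and the seed 123 is arbitrary; the variable names are only illustrative.

```r
library(parallel)

cl <- makeCluster(2)                  # two persistent workers
clusterSetRNGStream(cl, iseed = 123)  # arbitrary seed, for illustration only

r1 <- parLapply(cl, 1:2, function(i) rnorm(2))
r2 <- parLapply(cl, 1:2, function(i) rnorm(2))  # same workers, streams continue

stopCluster(cl)

identical(r1, r2)  # expected FALSE: the workers kept drawing from their streams
```

Because the workers live for the lifetime of the cluster, the second call picks up where the first left off rather than restarting from the same seed.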

Here's an example of setting the seed to different values in order to get different random numbers using mclapply:

library(parallel)
RNGkind("L'Ecuyer-CMRG")
set.seed(100)  # first seed
mclapply(1:2, function(i) rnorm(2))
set.seed(101)  # a different seed, so the workers get different numbers
mclapply(1:2, function(i) rnorm(2))

Here's an example of removing ".Random.seed":

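library(parallel)
RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
# Remove the global seed (envir given explicitly so this also works inside a function)
rm(.Random.seed, envir = globalenv())
mclapply(1:2, function(i) rnorm(2))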

And here's an example of generating random numbers on the master:

library(parallel)
RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rnorm(1)  # drawing on the master changes .Random.seed
mclapply(1:2, function(i) rnorm(2))

I'm not sure which approach is best; that may depend on what you're trying to do.

Although it appears that simply calling mclapply multiple times without changing ".Random.seed" gives reproducible results, I don't know if that is guaranteed. To guarantee reproducible results, I think you need to call set.seed:

library(parallel)
RNGkind("L'Ecuyer-CMRG")
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))
set.seed(1234)  # same seed again, so the results should repeat
mclapply(1:2, function(i) rnorm(2))
