Erratic seed behavior with rbinom(prob=0.5) in R


Question

I have found what I would consider erratic behavior (though I hope there is a simple explanation) in how R uses seeds with rbinom() when prob=0.5. The general idea: if I set the seed and run rbinom() once (i.e. conduct a single random process), then regardless of the value of prob, the random seed should change by one increment. If I then set the seed to the same value again and run another random process (such as rbinom() again, perhaps with a different value of prob), the seed should end up at the same value as it did after the previous single random process.

I have found that R does exactly this as long as I use rbinom() with any prob != 0.5. Here is an example:

Compare the seed vector, .Random.seed, for two probabilities other than 0.5:

set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.4)
temp1 <- .Random.seed

set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.3)
temp2 <- .Random.seed

any(temp1!=temp2)
> [1] FALSE

Compare the seed vector, .Random.seed, for prob=0.5 vs. prob != 0.5:

set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.5)
temp1 <- .Random.seed

set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.3)
temp2 <- .Random.seed
any(temp1!=temp2)
> [1] TRUE

temp1==temp2
> [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
> [8]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
...

I have found this for all comparisons of prob=0.5 against every other probability in the set {0.1, 0.2, ..., 0.9}. Similarly, if I compare any two values of prob from {0.1, 0.2, ..., 0.9} other than 0.5, the .Random.seed vectors are always element-by-element equal. These facts hold for both odd and even size in rbinom().
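The comparison described above can be automated. The following is a minimal sketch, not part of the original post: `seed_after` is a hypothetical helper that resets the seed, draws one binomial variate, and returns the resulting `.Random.seed`. The seed value and `size = 60` are taken from the examples above.

```r
# Hypothetical helper: state of the RNG after a single rbinom() draw.
seed_after <- function(prob, size = 60, seed = 234908) {
  set.seed(seed)
  rbinom(n = 1, size = size, prob = prob)
  # .Random.seed lives in the global environment
  get(".Random.seed", envir = globalenv())
}

probs <- (1:9) / 10
reference <- seed_after(0.3)

# TRUE where the RNG ends in the same state as it does for prob = 0.3
same_state <- vapply(probs,
                     function(p) identical(seed_after(p), reference),
                     logical(1))
print(data.frame(prob = probs, same_state = same_state))
```

Using `identical()` on the whole vector avoids having to read through hundreds of element-wise comparisons.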

To make it even stranger (I apologize that this is a little convoluted; it is relevant to the way my function is written), when I use probabilities stored as elements of a vector, I get the same problem if 0.5 is the first element, but not if it is the second. Here is an example of this case:

Case 1: 0.5 is the first probability referenced in the vector

set.seed(234908)
MNAR <- c(0.5,0.3)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp1 <- .Random.seed

set.seed(234908)
MNAR <- c(0.1,0.3)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp2 <- .Random.seed

any(temp1!=temp2)
> [1] TRUE

temp1==temp2
> [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
> [8]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Case 2: 0.5 is the second probability referenced in the vector

set.seed(234908)
MNAR <- c(0.3,0.5)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp1 <- .Random.seed

set.seed(234908)
MNAR <- c(0.1,0.3)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp2 <- .Random.seed

any(temp1!=temp2)
> [1] FALSE

Again, this pattern holds whatever values I use for prob and size. Can anyone explain this mystery to me? It is causing quite a problem, because results that should be the same come out different: the seed is for some reason used/calculated differently when prob=0.5, but in no other instance.

Recommended answer

So let's turn our comments into an answer. Thanks to Ben Bolker for putting us on the right track with a link to the code, https://svn.r-project.org/R/trunk/src/nmath/rbinom.c, and the suggestion to track down where unif_rand() is called.

A quick scan suggests that the code is broken into two sections, delimited by these comments:

/*-------------------------- np = n*p >= 30 : ------------------- */

/*---------------------- np = n*p < 30 : ------------------------- */

Inside each of these sections, the number of calls to unif_rand() is not the same (two versus one).
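One way to check the call counts without reading the C code is to compare the RNG state after rbinom() with the state after runif(k) for increasing k. This is only a sketch: `state_after` and `uniforms_consumed` are hypothetical helpers, and the approach assumes that runif(k) consumes exactly k uniforms under the generator in use.

```r
# Hypothetical helper: RNG state after running f() from a fixed seed.
state_after <- function(f, seed = 234908) {
  set.seed(seed)
  f()
  get(".Random.seed", envir = globalenv())
}

# Find k such that f() leaves the RNG where runif(k) would leave it.
# Returns NA if no k from 0 to max_k matches.
uniforms_consumed <- function(f, max_k = 20L) {
  target <- state_after(f)
  for (k in 0:max_k) {
    if (identical(state_after(function() runif(k)), target)) return(k)
  }
  NA_integer_
}

uniforms_consumed(function() rbinom(n = 1, size = 60, prob = 0.3))
uniforms_consumed(function() rbinom(n = 1, size = 60, prob = 0.5))
```

The count can depend on the seed, because a sampling loop may run more than once before it accepts a value, so treat the numbers as per-seed observations rather than fixed properties of each branch.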

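So for a given size (n), your random seed may end up in a different state depending on the value of prob (p): whether size * prob >= 30 or not. As far as I can tell from the source, the code first replaces p with min(p, 1 - p), which would explain why prob values above 0.5 (0.6 through 0.9) behaved like the ones below 0.5 in your tests with size=60.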
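To see which section a given call falls into, the threshold can be written out directly. This is a sketch only: `rbinom_branch` is a hypothetical helper, and the use of min(prob, 1 - prob) is an assumption based on my reading of rbinom.c, not something stated in the section comments themselves.

```r
# Hypothetical helper mirroring the threshold in the section comments of
# rbinom.c. Assumption: the C code works with p = min(prob, 1 - prob).
rbinom_branch <- function(size, prob) {
  np <- size * pmin(prob, 1 - prob)
  ifelse(np >= 30, "np >= 30", "np < 30")
}

rbinom_branch(size = 60, prob = c(0.3, 0.4, 0.5))
rbinom_branch(size = 50, prob = 0.5)
```

With size=60 only prob=0.5 reaches the threshold (60 * 0.5 = 30), while size=50 with prob=0.5 gives 25 and stays below it.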

With that in mind, all the results from your examples should now make sense:

# these end up in the same state
rbinom(n=1,size=60,prob=0.4) # => np <  30
rbinom(n=1,size=60,prob=0.3) # => np <  30

# these don't
rbinom(n=1,size=60,prob=0.5) # => np >= 30
rbinom(n=1,size=60,prob=0.3) # => np <  30

# these don't
{rbinom(n=1,size=60,prob=0.5)  # np >= 30
 rbinom(n=1,size=50,prob=0.3)} # np <  30
{rbinom(n=1,size=60,prob=0.1)  # np <  30
 rbinom(n=1,size=50,prob=0.3)} # np <  30

# these do
{rbinom(n=1,size=60,prob=0.3)  # np <  30
 rbinom(n=1,size=50,prob=0.5)} # np <  30
{rbinom(n=1,size=60,prob=0.1)  # np <  30
 rbinom(n=1,size=50,prob=0.3)} # np <  30
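If the goal is for the RNG to advance by the same amount whatever prob is, one option is inverse-transform sampling with qbinom(runif(n), ...). This is not from the original answer; it is a sketch under the assumption that drawing different values from rbinom() for the same seed is acceptable. `rbinom_fixed` is a hypothetical name.

```r
# Hypothetical replacement for rbinom(): inverse-transform sampling.
# Each variate is built from one uniform draw, whatever size and prob are.
rbinom_fixed <- function(n, size, prob) {
  qbinom(runif(n), size = size, prob = prob)
}

set.seed(234908)
x <- rbinom_fixed(n = 1, size = 60, prob = 0.5)
temp1 <- .Random.seed

set.seed(234908)
x <- rbinom_fixed(n = 1, size = 60, prob = 0.3)
temp2 <- .Random.seed

any(temp1 != temp2)
```

Another option that keeps rbinom() itself is to call set.seed() again before each step whose results must be comparable, so that no step depends on how many uniforms an earlier one consumed.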
