Erratic seed behavior with rbinom(prob=0.5) in R
Question
I have found what I would consider erratic behavior (but for which I hope there is a simple explanation) in R's use of seeds in conjunction with rbinom() when prob=0.5 is used. General idea: if I set the seed and run rbinom() once (i.e. conduct a single random process), then whatever value prob is set to, the random seed should change by one increment. Then, if I set the seed to the same value again and run another random process (such as rbinom() again, but maybe with a different value of prob), the seed should again change to the same value as it did for the previous single random process.
I have found that R does exactly this as long as I'm using rbinom() with any prob!=0.5. Here is an example:
Compare the seed vector, .Random.seed, for two probabilities other than 0.5:
set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.4)
temp1 <- .Random.seed
set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.3)
temp2 <- .Random.seed
any(temp1!=temp2)
> [1] FALSE
Compare the seed vector, .Random.seed, for prob=0.5 vs. prob!=0.5:
set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.5)
temp1 <- .Random.seed
set.seed(234908)
x <- rbinom(n=1,size=60,prob=0.3)
temp2 <- .Random.seed
any(temp1!=temp2)
> [1] TRUE
temp1==temp2
> [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE
> [8] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
...
I have found this for all comparisons of prob=0.5 against all other probabilities in the set {0.1, 0.2, ..., 0.9}. Similarly, if I compare any two values of prob from {0.1, 0.2, ..., 0.9} other than 0.5, the .Random.seed vectors are always element-by-element equal. These facts also hold for both odd and even size within rbinom().
To make it even stranger (I apologize that this is a little convoluted; it's relevant to the way my function is written), when I use probabilities saved as elements of a vector, I have the same problem if 0.5 is the first element, but not if it is the second. Here is an example of this:
First case: 0.5 is the first probability referenced in the vector
set.seed(234908)
MNAR <- c(0.5,0.3)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp1 <- .Random.seed
set.seed(234908)
MNAR <- c(0.1,0.3)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp2 <- .Random.seed
any(temp1!=temp2)
> [1] TRUE
temp1==temp2
> [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE
> [8] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
Second case: 0.5 is the second probability referenced in the vector
set.seed(234908)
MNAR <- c(0.3,0.5)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp1 <- .Random.seed
set.seed(234908)
MNAR <- c(0.1,0.3)
x <- rbinom(n=1,size=60,prob=MNAR[1])
y <- rbinom(n=1,size=50,prob=MNAR[2])
temp2 <- .Random.seed
any(temp1!=temp2)
> [1] FALSE
Again, I find that this pattern holds regardless of the values used for prob and size. Can anyone explain this mystery to me? It's causing quite a problem, because results that should be the same are coming out different: the seed is for some reason used/calculated differently when prob=0.5 but in no other instance.
Recommended answer
So let's turn our comments into an answer. Thanks to Ben Bolker for putting us on the right track with a link to the code, https://svn.r-project.org/R/trunk/src/nmath/rbinom.c, and the suggestion to track down where unif_rand() is called.
A quick scan suggests that the code is broken into two sections, delimited by the comments:
/*-------------------------- np = n*p >= 30 : ------------------- */
and
/*---------------------- np = n*p < 30 : ------------------------- */
Inside each of these, the number of calls to unif_rand is not the same (two versus one).
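The difference in draw counts can be observed from R itself. The following is a minimal sketch, assuming the default Mersenne-Twister generator, where the second element of .Random.seed holds the generator's position within its 624-word state (624 right after seeding). The helper name draws_used is hypothetical and not part of R. This is also why only the second element of the seed vector differs in the question's temp1==temp2 output.

```r
# Hypothetical helper: count the uniform draws consumed by f() after a fresh
# set.seed(). Assumes Mersenne-Twister, where .Random.seed[2] is the position
# in the 624-word state (624 immediately after seeding = nothing consumed).
draws_used <- function(f, seed = 234908) {
  set.seed(seed, kind = "Mersenne-Twister")
  f()
  pos <- get(".Random.seed", envir = globalenv())[2]
  if (pos == 624L) 0L else pos   # only meaningful for fewer than 624 draws
}

draws_used(function() runif(3))                               # 3
draws_used(function() rbinom(n = 1, size = 60, prob = 0.3))   # np = 18 < 30
draws_used(function() rbinom(n = 1, size = 60, prob = 0.5))   # np = 30 >= 30
```

Because both branches of rbinom.c can loop, the counts for the rbinom() calls may depend on the seed, so no expected values are given for them.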
So for a given size (n), your random seed may end up in a different state depending on the value of prob (p): whether size * prob >= 30 or not.
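To check which branch a given call would take, a small helper can apply the same threshold. This is a sketch under one assumption drawn from my reading of the linked rbinom.c: the C code computes np from the smaller of prob and 1 - prob, which would explain why, in the question, prob values of 0.6 to 0.9 with size = 60 behave like 0.1 to 0.4. The name rbinom_branch is hypothetical.

```r
# Hypothetical helper: report which section of rbinom.c a call would use.
# Assumption: np is computed as size * min(prob, 1 - prob), as in the C source.
rbinom_branch <- function(size, prob) {
  np <- size * pmin(prob, 1 - prob)
  ifelse(np >= 30, "np >= 30", "np < 30")
}

probs <- (1:9) / 10
# With size = 60, only prob = 0.5 reaches the np >= 30 section
setNames(rbinom_branch(60, probs), probs)

# With size = 50, prob = 0.5 gives np = 25, so it stays in the np < 30 section
rbinom_branch(50, 0.5)
```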
With that in mind, all the results you got with your examples should now make sense:
# these end up in the same state
rbinom(n=1,size=60,prob=0.4) # => np < 30
rbinom(n=1,size=60,prob=0.3) # => np < 30
# these don't
rbinom(n=1,size=60,prob=0.5) # => np >= 30
rbinom(n=1,size=60,prob=0.3) # => np < 30
# these don't
{rbinom(n=1,size=60,prob=0.5) # np >= 30
rbinom(n=1,size=50,prob=0.3)} # np < 30
{rbinom(n=1,size=60,prob=0.1) # np < 30
rbinom(n=1,size=50,prob=0.3)} # np < 30
# these do
{rbinom(n=1,size=60,prob=0.3) # np < 30
rbinom(n=1,size=50,prob=0.5)} # np < 30
{rbinom(n=1,size=60,prob=0.1) # np < 30
rbinom(n=1,size=50,prob=0.3)} # np < 30
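If the generator state needs to advance identically whatever the value of prob, one possible workaround (not part of the original answer, only a sketch) is to draw by inversion with qbinom(runif(n), ...), which consumes exactly one uniform per draw. The name rbinom_inv is hypothetical, and the values it returns will generally differ from what rbinom() returns for the same seed.

```r
# Hypothetical replacement for rbinom(): inverse-transform sampling.
# Each draw uses exactly one runif() value, independent of size and prob.
rbinom_inv <- function(n, size, prob) {
  as.integer(qbinom(runif(n), size = size, prob = prob))
}

set.seed(234908)
x <- rbinom_inv(1, size = 60, prob = 0.5)
state_a <- .Random.seed

set.seed(234908)
x <- rbinom_inv(1, size = 60, prob = 0.3)
state_b <- .Random.seed

identical(state_a, state_b)   # TRUE: same number of uniforms consumed
```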