Algorithm for sampling without replacement?

Question

I am trying to test the likelihood that a particular clustering of data has occurred by chance. A robust way to do this is Monte Carlo simulation, in which the associations between data and groups are randomly reassigned a large number of times (e.g. 10,000), and a metric of clustering is used to compare the actual data with the simulations to determine a p value.

I've got most of this working, with pointers mapping the grouping to the data elements, so I plan to randomly reassign pointers to data. THE QUESTION: what is a fast way to sample without replacement, so that every pointer is randomly reassigned in the replicate data sets?

For example (these data are just a simplified example):

Data (n=12 values) - Group A: 0.1, 0.2, 0.4 / Group B: 0.5, 0.6, 0.8 / Group C: 0.4, 0.5 / Group D: 0.2, 0.2, 0.3, 0.5

For each replicate data set, I would have the same cluster sizes (A=3, B=3, C=2, D=4) and data values, but would reassign the values to the clusters.

To do this, I could generate a random number in the range 1-12 and assign the selected value as the first element of group A, then generate a random number in the range 1-11 to select the second element of group A, and so on. The pointer reassignment is fast, and I will have pre-allocated all data structures, but sampling without replacement seems like a problem that might have been solved many times before.
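The procedure described above can be sketched roughly as follows. This is only an illustration in Python, not code from the original post; the function name `draw_without_replacement` is hypothetical, and indices run from 0 rather than 1.

```python
import random

def draw_without_replacement(values, rng=random):
    """Draw every value exactly once, in random order (hypothetical helper)."""
    pool = list(values)          # copy so the caller's data is untouched
    drawn = []
    while pool:
        # Pick an index in the shrinking range 0..len(pool)-1,
        # mirroring "1-12, then 1-11, ..." from the question.
        i = rng.randrange(len(pool))
        drawn.append(pool.pop(i))   # remove the value so it cannot be drawn again
    return drawn

data = [0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.4, 0.5, 0.2, 0.2, 0.3, 0.5]
order = draw_without_replacement(data)
```

Note that `pool.pop(i)` shifts the remaining elements, so this literal version takes time proportional to the pool size per draw.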

Logic or pseudocode preferred.

Thanks!

Recommended answer

See my answer to this question http://stackoverflow.com/questions/196017/unique-random-numbers-in-o1#196065. The same logic should accomplish what you are looking to do.
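The linked answer is not reproduced here. One widely used technique for drawing unique random items is the Fisher-Yates shuffle, which swaps each chosen item to the end of a shrinking range. The sketch below assumes that approach and applies it to the example data from the question; the names `shuffle_in_place` and `reassign` are hypothetical.

```python
import random

def shuffle_in_place(items, rng=random):
    """Fisher-Yates shuffle: one swap per element, no extra allocation."""
    for last in range(len(items) - 1, 0, -1):
        r = rng.randint(0, last)            # inclusive on both ends
        items[r], items[last] = items[last], items[r]

def reassign(values, group_sizes, rng=random):
    """Return one replicate: same group sizes, values randomly reassigned."""
    shuffled = list(values)                 # keep the original data intact
    shuffle_in_place(shuffled, rng)
    groups, start = {}, 0
    for name, size in group_sizes.items():
        groups[name] = shuffled[start:start + size]
        start += size
    return groups

values = [0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.4, 0.5, 0.2, 0.2, 0.3, 0.5]
sizes = {"A": 3, "B": 3, "C": 2, "D": 4}
replicate = reassign(values, sizes)
```

Calling `reassign` once per Monte Carlo iteration (e.g. 10,000 times) gives the replicate data sets. Shuffling an array of indices instead of the values would correspond to reassigning pointers, as in the question.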
