Weighted sampling without replacement
Question
I have a population p of indices and corresponding weights in a vector w. I want to draw k samples from this population without replacement, where each selection is random with probability proportional to the weights.
I know that randsample can be used for selection with replacement by calling
J = randsample(p,k,true,w)
but when I call it with the parameter false instead of true, I get
??? Error using ==> randsample at 184
Weighted sampling without replacement is not supported.
I wrote my own function along the lines discussed here:
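p = 1:n;                            % population indices (n is the population size)
J = zeros(1,k);                     % preallocate the k selected indices
for i = 1:k
    J(i) = randsample(p,1,true,w);  % draw one index with probability proportional to w
    w(p == J(i)) = 0;               % zero its weight so it cannot be drawn again
end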
But since the loop runs for k iterations, I am looking for a shorter/faster way to do this. Do you have any suggestions?
EDIT: I want to randomly select k unique columns of a matrix, with probability proportional to some weighting criterion. That is why I use sampling without replacement.
Recommended answer
I don't think it is possible to avoid some sort of loop, because sampling without replacement means that the draws are no longer independent: each draw changes which items, and therefore which weights, remain for the next one. Besides, what does the weighting actually mean when sampling without replacement?
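One common reading of the weights, and the one the loop in the question implements, is sequential: each draw is proportional to the weights of the items still available. As a sketch, with S_{t-1} denoting the set of indices already chosen before draw t (notation introduced here for illustration, not taken from the question):

```latex
P(J_t = i \mid S_{t-1}) = \frac{w_i}{\sum_{j \notin S_{t-1}} w_j},
\qquad i \notin S_{t-1}, \quad t = 1, \dots, k
```

Under this reading, the overall probability that item i ends up in the sample is generally not proportional to w_i, which is why the meaning of the weights needs to be pinned down.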
In any case, for relatively small sample sizes I don't think you will notice any performance problem. All the solutions I can think of do essentially what you have already done, possibly expanding out what randsample does internally.
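As a rough illustration of what "expanding out" the draw might look like, the sketch below replaces the call to randsample with a cumulative-sum lookup. It is one possible way to do a single weighted draw, not necessarily how randsample is implemented internally, and the values of w and k are made up for the example.

```matlab
% Sketch: weighted sampling without replacement, one draw per iteration.
% Example inputs (hypothetical values, for illustration only).
w = [0.1 0.4 0.2 0.3];   % nonnegative weights; at least k must be positive
k = 2;                   % number of samples to draw
n = numel(w);
p = 1:n;                 % population indices

J = zeros(1,k);          % preallocate the selected indices
for i = 1:k
    c = cumsum(w);                   % running totals of the remaining weights
    idx = find(c > rand*c(end), 1);  % first position whose running total exceeds the uniform draw
    J(i) = p(idx);                   % record the chosen index
    w(idx) = 0;                      % zero its weight so it cannot be drawn again
end
```

For the column-selection use case in the question, J could then index the matrix directly, for example A(:,J) for some matrix A with n columns (A is a placeholder name). The loop still runs k times; it only avoids the overhead of calling randsample on each iteration.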