选择单个随机值组合的算法? [英] Algorithm to select a single, random combination of values?

查看:21
本文介绍了选择单个随机值组合的算法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我有 y 个不同的值,我想随机选择其中的 x 个.这样做的有效算法是什么?我可以只调用 rand() x 次,但是如果 x, y 很大,性能会很差.

Say I have y distinct values and I want to select x of them at random. What's an efficient algorithm for doing this? I could just call rand() x times, but the performance would be poor if x, y were large.

注意这里需要组合:每个值应该有相同的被选中概率,但它们在结果中的顺序并不重要.当然,任何生成 的算法都是合格的,但我想知道是否有可能在没有随机顺序要求的情况下更有效地做到这一点.

Note that combinations are needed here: each value should have the same probability to be selected but their order in the result is not important. Sure, any algorithm generating permutations would qualify, but I wonder if it's possible to do this more efficiently without the random order requirement.

如何有效地生成 K 个介于 0 和上限 N 之间的非重复整数的列表 涵盖了这种排列情况.

How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N covers this case for permutations.

推荐答案

Robert Floyd 发明了一种用于此类情况的采样算法.它通常优于混洗然后抓取第一个 x 元素,因为它不需要 O(y) 存储.正如最初编写的那样,它假定值来自 1..N,但是通过简单地将它产生的值作为下标处理到向量/数组/任何东西中,产生 0..N 和/或使用非连续值是微不足道的.

Robert Floyd invented a sampling algorithm for just such situations. It's generally superior to shuffling then grabbing the first x elements since it doesn't require O(y) storage. As originally written it assumes values from 1..N, but it's trivial to produce 0..N and/or use non-contiguous values by simply treating the values it produces as subscripts into a vector/array/whatever.

在伪代码中,算法是这样运行的(从 Jon Bentley 的 Programming Pearls 专栏A sample of Brilliance"中窃取).

In pseuocode, the algorithm runs like this (stealing from Jon Bentley's Programming Pearls column "A sample of Brilliance").

initialize set S to empty
for J := N-M + 1 to N do
    T := RandInt(1, J)
    if T is not in S then
        insert T in S
    else
        insert J in S

最后一点(如果 T 已经在 S 中,则插入 J)是棘手的部分.底线是 它保证了插入J的正确数学概率,从而产生无偏的结果.

That last bit (inserting J if T is already in S) is the tricky part. The bottom line is that it assures the correct mathematical probability of inserting J so that it produces unbiased results.

对于 yO(x)1O(1)O(x) 存储.

It's O(x)1 and O(1) with regard to y, O(x) storage.

注意,按照问题中的标记,算法只保证结果中每个元素出现的概率相等,而不保证它们在结果中的相对顺序.

Note that, in accordance with the combinations tag in the question, the algorithm only guarantees equal probability of each element occuring in the result, not of their relative order in it.

1O(x2) 在最坏的情况下所涉及的哈希映射可以忽略,因为它实际上是一个所有值都具有相同哈希值的不存在的病理情况

这篇关于选择单个随机值组合的算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆