Why is this shuffling algorithm wrong?

Question

Before I read about Fisher-Yates, this is the algorithm I came up with:

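import random as rand  # assumption: `rand` refers to Python's random module
def swap(arr, i, j):  # assumed helper: exchange two elements in place
    arr[i], arr[j] = arr[j], arr[i]
def sort(arr):
    for i in range(len(arr)):
        swap(arr, i, rand.randint(0, len(arr) - 1))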

From my understanding, the only difference between this and Fisher-Yates is that instead of:

swap(arr, i, rand.randint(0, len(arr) - 1))

I should write:

swap(arr, i, rand.randint(i, len(arr) - 1))

Could someone explain how the first algorithm is incorrect (i.e. why it does not produce a random shuffle)?

Recommended answer

From Wikipedia:

Similarly, always selecting j from the entire range of valid array indices on every iteration also produces a result which is biased, albeit less obviously so. This can be seen from the fact that doing so yields n^n distinct possible sequences of swaps, whereas there are only n! possible permutations of an n-element array. Since n^n can never be evenly divisible by n! when n > 2 (as the latter is divisible by n−1, which shares no prime factors with n), some permutations must be produced by more of the n^n sequences of swaps than others. As a concrete example of this bias, observe the distribution of possible outcomes of shuffling a three-element array [1, 2, 3]. There are 6 possible permutations of this array (3! = 6), but the algorithm produces 27 possible shuffles (3^3 = 27). In this case, [1, 2, 3], [3, 1, 2], and [3, 2, 1] each result from 4 of the 27 shuffles, while each of the remaining 3 permutations occurs in 5 of the 27 shuffles.
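The 4/27 versus 5/27 split can be checked by brute force. The following is a minimal sketch, not part of the quoted text: the function name `naive_outcomes` is made up for illustration, and it replaces the random choice with an exhaustive walk over every possible sequence of indices j that the first algorithm could draw.

```python
from collections import Counter
from itertools import product

def naive_outcomes(items):
    """Count how often each permutation arises over all n^n swap sequences."""
    n = len(items)
    counts = Counter()
    # Each tuple `choices` fixes the index j picked at every step i,
    # where j ranges over the whole array (the first algorithm's rule).
    for choices in product(range(n), repeat=n):
        arr = list(items)
        for i, j in enumerate(choices):
            arr[i], arr[j] = arr[j], arr[i]
        counts[tuple(arr)] += 1
    return counts

counts = naive_outcomes([1, 2, 3])
for perm, hits in sorted(counts.items()):
    print(perm, f"{hits}/27")
```

Because every one of the 27 sequences is equally likely, these counts are exact probabilities rather than estimates from random sampling.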

Essentially, you are introducing a subtle bias into the shuffle, which will cause some permutations to crop up a bit more often than others. It's often not very noticeable, but it could make some sensitive applications (e.g. Monte Carlo simulations on permutations) fail to produce accurate answers.
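To see how large the bias is next to Fisher-Yates, both variants can be compared exhaustively for small arrays. This is only an illustrative sketch: the helper `outcome_counts` and its `fisher_yates` flag are hypothetical names, and the only difference between the two modes is whether step i draws j from range(i, n) or from range(0, n), as described in the question.

```python
from collections import Counter
from itertools import product
from math import factorial

def outcome_counts(n, fisher_yates):
    """Enumerate every sequence of index choices and tally the final permutations."""
    # Step i may pick j from range(i, n) (Fisher-Yates) or range(0, n) (naive).
    ranges = [range(i if fisher_yates else 0, n) for i in range(n)]
    counts = Counter()
    for choices in product(*ranges):
        arr = list(range(n))
        for i, j in enumerate(choices):
            arr[i], arr[j] = arr[j], arr[i]
        counts[tuple(arr)] += 1
    return counts

for n in range(2, 6):
    naive = outcome_counts(n, fisher_yates=False)
    fy = outcome_counts(n, fisher_yates=True)
    print(
        f"n={n}: n^n mod n! = {n**n % factorial(n)}, "
        f"naive counts range {min(naive.values())}..{max(naive.values())}, "
        f"Fisher-Yates counts {set(fy.values())}"
    )
```

In the Fisher-Yates mode there are exactly n! choice sequences and each permutation is reached by exactly one of them, which is why it is unbiased. In the naive mode the n^n sequences cannot be split evenly across n! permutations once n > 2, so the minimum and maximum counts differ.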
