A variant of Knuth shuffle

Problem description

This is a very hard but interesting probability question related to the Knuth/Fisher-Yates shuffle.

Suppose that, when looping over the elements, the current element is swapped with a random element chosen from the whole array (rather than only from the elements not yet visited). What is the probability that the original ith element ends up at the jth position?

Recommended answer

The Knuth shuffle is as follows (in Python, but it could be pseudocode):

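from random import randrange
for i in range(len(v)):  # v is the list being shuffled
    j = randrange(i, len(v))  # pick only from positions i..len(v)-1
    v[i], v[j] = v[j], v[i]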

The naïve shuffle is very similar, but it is not the Knuth shuffle:

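from random import randrange
for i in range(len(v)):  # v is the list being shuffled
    j = randrange(0, len(v))  # pick from the whole list every time
    v[i], v[j] = v[j], v[i]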

The Knuth shuffle produces uniformly distributed permutations. The naïve shuffle can be proven not to (for n > 2), because there are $n^n$ possible sequences of random numbers, each of which has an equal probability, and $n!$ possible permutations, and $n!$ is not a factor of $n^n$. (The Knuth shuffle, on the other hand, involves precisely $n!$ possible random number sequences, each of which can be proven to produce a unique permutation.)
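The counting argument can be checked by brute force for small n. The following is a minimal sketch, not part of the original answer: the helper names `naive_shuffle_with` and `permutation_counts` are made up for illustration, and the random picks are replaced by an exhaustive walk over all $n^n$ pick sequences.

```python
from collections import Counter
from itertools import product


def naive_shuffle_with(picks):
    """Run the naive shuffle with a fixed sequence of picks instead of random ones."""
    v = list(range(len(picks)))
    for i, j in enumerate(picks):
        v[i], v[j] = v[j], v[i]
    return tuple(v)


def permutation_counts(n):
    """Count how many of the n**n equally likely pick sequences give each permutation."""
    return Counter(naive_shuffle_with(p) for p in product(range(n), repeat=n))


counts = permutation_counts(3)
for perm, count in sorted(counts.items()):
    print(perm, count)
```

For n=3 there are 27 pick sequences and 6 permutations, so the counts cannot all be equal; the enumeration gives three permutations with 4 sequences each and three with 5.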

The above proof does not indicate whether or not the individual permutation positions in the naïve shuffle are uniformly distributed. It is possible for a non-uniform distribution of permutations to nonetheless produce a uniform distribution of permutation elements: for example, if all permutations other than rotations have probability 0, and rotations have equal probability, then the permutation elements will be uniformly distributed. (That would be an even worse shuffle than the naïve shuffle.)
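The rotation example can be made concrete with a short sketch. The function names `rotations` and `position_distribution` are illustrative assumptions; the code gives equal probability to the n rotations and tabulates where each element lands.

```python
from collections import Counter
from fractions import Fraction


def rotations(n):
    """All n rotations of the arrangement 0..n-1."""
    base = list(range(n))
    return [tuple(base[r:] + base[:r]) for r in range(n)]


def position_distribution(perms, element):
    """Final-position distribution of `element` when every permutation
    in `perms` is chosen with equal probability."""
    counts = Counter(p.index(element) for p in perms)
    return {pos: Fraction(c, len(perms)) for pos, c in sorted(counts.items())}


n = 4
for element in range(n):
    print(element, position_distribution(rotations(n), element))
```

Every element lands in each position with probability 1/n, even though only n of the n! permutations can ever occur.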

As it turns out, the naïve shuffle does not produce uniformly distributed element positions. There are two points of regularity: the final position of the first element in the vector is uniformly distributed, as is the element that ends up in the last position.

The most likely final position of element i (i≠0) is position i−1.

Here's a table of transition probabilities for n=8, computed as a product of transition matrices:

from/to     0       1       2       3       4       5       6       7
  0      .125    .125    .125    .125    .125    .125    .125    .125
  1      .158    .116    .117    .118    .119    .121    .123    .125
  2      .144    .151    .110    .112    .115    .118    .121    .125
  3      .132    .139    .147    .107    .111    .115    .119    .125
  4      .122    .129    .137    .146    .107    .112    .118    .125
  5      .113    .120    .128    .137    .147    .110    .117    .125
  6      .105    .112    .120    .129    .139    .151    .116    .125
  7      .098    .105    .113    .122    .132    .144    .158    .125
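The answer does not show how the table was computed. One way to build such a product of transition matrices is sketched below, as an assumption about the method rather than the author's code. `step_matrix(n, i)` describes what iteration i does to the position of a single tracked element: an element at position i moves to a uniformly random position, and any other element moves to position i with probability 1/n and otherwise stays put. The names `step_matrix`, `matmul` and `naive_transition_matrix` are illustrative.

```python
from fractions import Fraction


def step_matrix(n, i):
    """t[p][q] = probability that an element at position p before
    iteration i of the naive shuffle is at position q afterwards."""
    one_nth = Fraction(1, n)
    t = [[Fraction(0)] * n for _ in range(n)]
    for p in range(n):
        if p == i:
            t[p] = [one_nth] * n      # swapped to a uniformly random slot
        else:
            t[p][p] = 1 - one_nth     # not picked: stays where it is
            t[p][i] = one_nth         # picked: moves to position i
    return t


def matmul(a, b):
    """Plain square matrix product over Fractions."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def naive_transition_matrix(n):
    """Product of the step matrices for iterations 0..n-1."""
    result = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    for i in range(n):
        result = matmul(result, step_matrix(n, i))
    return result


table = naive_transition_matrix(8)
for row in table:
    print("  ".join(f"{float(x):.3f}" for x in row))
```

Using `Fraction` keeps the entries exact, so rounding happens only when printing.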

It's possible to derive a closed form for $P_n(i, j)$, the probability that element i in a vector of n elements will be shuffled to position j.

In the algorithm, the swap at iteration i involves the element $v_i$ and some other element $v_j$. (It's possible that i=j.) Although swap is a symmetric operation, it's useful to distinguish the two elements; we'll call this an out swap of the element $v_i$ and an in swap of $v_j$.

Note that after an in swap of element k, k can no longer be out swapped, because all the following out swaps are at positions following the new position of k. So if k is ever out swapped, it must be out swapped at iteration k; in other words, only the first swap involving an element can be an out swap.

Now, at any iteration of the shuffle, the final destination of the element about to be out swapped is uniformly distributed. (The probability that the final destination is j is the sum, over all i, of the probability that the next position is i times the probability that an element at position i ends up at j. Since the next position is uniformly distributed, the multiplier 1/n can be factored out, and the remaining sum is 1 since the element that ends up at j must come from some i.)

Also, for an element which is never out swapped, its final destination is the position given by the iteration at which its last in swap occurred. (It's not possible for an element to be neither out swapped nor in swapped: if no in swap occurs before the iteration reaches the element's position, it will be out swapped.)

With all that, we can derive the formula for the transition function.

First, the probability that element k will be out swapped is precisely the probability that it is not in swapped in any of the k iterations prior to iteration k, which is $(n-1)^k/n^k$. Among the shuffles in which k is out swapped, the final destination is uniformly distributed, so this contributes $(n-1)^k/n^{k+1}$ to every transition probability $P_n(k, j)$.
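Both claims in this step can be checked exhaustively for small n. The sketch below is an illustration under stated assumptions, not the author's code: the function name `out_swap_stats` is made up, and "element k is out swapped" is taken to mean that k still sits at position k when iteration k begins.

```python
from fractions import Fraction
from itertools import product


def out_swap_stats(n, k):
    """Walk all n**n pick sequences of the naive shuffle. Return the
    probability that element k is out swapped, plus the counts of its
    final position over exactly those sequences."""
    out_swapped = 0
    final_counts = [0] * n
    for picks in product(range(n), repeat=n):
        v = list(range(n))
        was_out = False
        for i, j in enumerate(picks):
            if i == k and v[k] == k:   # k is still at home: this is its out swap
                was_out = True
            v[i], v[j] = v[j], v[i]
        if was_out:
            out_swapped += 1
            final_counts[v.index(k)] += 1
    return Fraction(out_swapped, n ** n), final_counts


probability, final_counts = out_swap_stats(4, 2)
print(probability, final_counts)
```

For n=4 and k=2 the probability is $(3/4)^2 = 9/16$, and the final positions of the out swapped element are spread evenly over the four slots.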

Now let's consider the cases in which the last in swap is at iteration j (and hence to position j). At every iteration, a given element will be in swapped with probability 1/n. Consequently, the probability that the last in swap is at iteration j is the probability that no in swap occurs after iteration j times the probability that an in swap occurs at iteration j, which is $(n-1)^{n-j-1}/n^{n-j}$.

If j<k, then k cannot have been out swapped, so every such case counts. But if j≥k, we need to count only the cases where there was an in swap prior to iteration k, since the remaining cases are already covered by the out swap term. This leads to the following definition:

$P_n(k, j) = \frac{(n-1)^k}{n^{k+1}} + \left(1 - O_n(k, j)\right) \times \frac{(n-1)^{n-j-1}}{n^{n-j}}$

where

$O_n(k, j) = 0$ if $j < k$, and otherwise $(n-1)^k/n^k$
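The closed form translates directly into code. This is a minimal sketch with 0-based k and j, matching the table above; the function name `naive_shuffle_probability` and the local variable names are illustrative, not from the original answer.

```python
from fractions import Fraction


def naive_shuffle_probability(n, k, j):
    """Closed form P_n(k, j): probability that element k ends at position j."""
    stay = Fraction(n - 1, n)                  # chance of not being picked in one iteration
    out_swap_term = stay ** k / n              # (n-1)^k / n^(k+1)
    excluded = stay ** k if j >= k else Fraction(0)   # O_n(k, j)
    last_in_swap = stay ** (n - j - 1) / n     # (n-1)^(n-j-1) / n^(n-j)
    return out_swap_term + (1 - excluded) * last_in_swap


n = 8
for k in range(n):
    print("  ".join(f"{float(naive_shuffle_probability(n, k, j)):.3f}"
                    for j in range(n)))
```

For example, $P_8(1, 0) = 7/64 + 7^7/8^8 \approx 0.158$, which agrees with the corresponding table entry.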
