Create a random permutation of 1..N in constant space

Question

I am looking to enumerate a random permutation of the numbers 1..N in fixed space. This means that I cannot store all the numbers in a list, because N can be very large, larger than available memory. I still want to be able to walk through such a permutation one number at a time, visiting each number exactly once.

I know this can be done for certain N: many random number generators visit their entire state space in a pseudorandom order before repeating. A good random number generator with a 32-bit state will emit a permutation of the numbers 0..(2^32)-1, every number exactly once.

I want to be able to pick any N at all, without being constrained to powers of 2, for example. Is there an algorithm for this?

Solution

The easiest way is probably to just create a full-range PRNG for a larger range than you care about, and when it generates a number larger than you want, just throw it away and get the next one.
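As a rough illustration of that idea, here is a minimal Python sketch. It assumes a linear congruential generator (LCG) as the full-range PRNG, which is my choice for the example and not something the answer prescribes. The function name `lcg_permutation` and the multiplier and increment constants are illustrative; the constants only need to satisfy the Hull-Dobell full-period conditions for a power-of-two modulus (multiplier congruent to 1 mod 4, odd increment).

```python
def lcg_permutation(n, seed=0):
    """Yield each of 1..n exactly once, using O(1) state.

    Sketch only: a full-period LCG over the smallest power of two >= n,
    discarding any value that falls outside the wanted range.
    """
    if n < 1:
        return
    m = 1 << (n - 1).bit_length()      # smallest power of two >= n
    # Illustrative constants (assumption). Reduced mod m they stay
    # congruent to 1 mod 4 (multiplier) and odd (increment), which is
    # what gives the LCG a full period of m.
    a = 6364136223846793005 % m
    c = 1442695040888963407 % m
    x = seed % m
    for _ in range(m):                 # one full cycle of the generator
        if x < n:                      # keep in-range values, reject the rest
            yield x + 1                # shift 0..n-1 to 1..n
        x = (a * x + c) % m


for value in lcg_permutation(10, seed=3):
    print(value)
```

Because the modulus is less than twice N, fewer than half of the generated values are thrown away. Note that the low-order bits of a power-of-two LCG are weak, and a generator with this little state can only reach a small fraction of all possible orderings, so treat the result as scrambled rather than uniformly random.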

Another possibility that's pretty much a variation of the same would be to use a linear feedback shift register (LFSR) to generate the numbers in the first place. This has a couple of advantages: first of all, an LFSR is probably a bit faster than most PRNGs. Second, it is (I believe) a bit easier to engineer an LFSR that produces numbers close to the range you want, and still be sure it cycles through the numbers in its range in (pseudo)random order, without any repetitions.

Without spending a lot of time on the details, the math behind LFSRs has been studied quite thoroughly. Producing one that runs through all the nonzero numbers in its range without repetition requires choosing a set of "taps" that correspond to a primitive polynomial (irreducibility alone is not quite enough to guarantee the maximal period). If you don't want to search for that yourself, it's pretty easy to find tables of known ones for almost any reasonable size (e.g., doing a quick look, the Wikipedia article lists them for sizes up to 19 bits).

If memory serves, there's at least one primitive polynomial for every possible bit size. That means that in the worst case you can create a generator that has roughly twice the range you need, so on average you're throwing away (roughly) every other number you generate. Given the speed of an LFSR, I'd guess you can do that and still maintain quite acceptable speed.
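The LFSR variant could be sketched as follows. This is a minimal Python example under stated assumptions: it uses a Galois-form LFSR, the names `MASKS` and `lfsr_permutation` are hypothetical, and the feedback masks are the maximal-length taps as commonly published in tables for widths 2 to 16 (worth checking against a table before relying on them). For larger N the table would need more entries.

```python
# Feedback masks for maximal-length Galois LFSRs, keyed by register
# width in bits. Taken from commonly published tap tables (assumption:
# verify against a reference table before production use).
MASKS = {
    2: 0x3, 3: 0x6, 4: 0xC, 5: 0x14, 6: 0x30, 7: 0x60, 8: 0xB8,
    9: 0x110, 10: 0x240, 11: 0x500, 12: 0xE08, 13: 0x1C80,
    14: 0x3802, 15: 0x6000, 16: 0xD008,
}


def lfsr_permutation(n, seed=0):
    """Yield each of 1..n exactly once, using O(1) state.

    Sketch only: steps a maximal-length LFSR through all of its
    2**width - 1 nonzero states and skips states larger than n.
    """
    if n < 1:
        return
    width = max(2, n.bit_length())     # narrowest register that covers n
    mask = MASKS[width]                # KeyError if n needs more than 16 bits
    period = (1 << width) - 1          # number of nonzero states
    state = seed % period + 1          # any nonzero starting state works
    for _ in range(period):
        if state <= n:                 # reject out-of-range states
            yield state
        lsb = state & 1                # Galois step: shift right, then
        state >>= 1                    # apply the mask if a 1 fell off
        if lsb:
            state ^= mask


for value in lfsr_permutation(10, seed=4):
    print(value)
```

An LFSR never enters the all-zero state, so its states are exactly 1..2^width-1, which lines up conveniently with the 1..N range asked for. Since the width is chosen as the bit length of N, fewer than half of the states are rejected. The seed only changes where in the fixed cycle the walk starts, so different seeds give rotations of the same ordering.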
