随机数生成器,每次仅返回一个数字 [英] Random number generator that returns only one number each time

查看:167
本文介绍了随机数生成器,每次仅返回一个数字的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Python是否有一个随机数生成器,每次调用next()函数时,该生成器仅返回一个随机整数?数字不应重复,并且生成器应在区间[1, 1 000 000]中返回唯一的随机整数.

Does Python have a random number generator that returns only one random integer number each time when next() function is called? Numbers should not repeat and the generator should return random integers in the interval [1, 1 000 000] that are unique.

我需要生成超过一百万个不同的数字,如果所有的数字都同时生成并存储在列表中,这听起来好像要占用大量内存.

I need to generate more than million different numbers and that sounds as if it is very memory consuming in case all the number are generated at same time and stored in a list.

推荐答案

您正在寻找线性同余生成器.这样,您就可以在目标数字范围内获得非重复数字的伪随机序列.

You are looking for a linear congruential generator with a full period. This will allow you to get a pseudo-random sequence of non-repeating numbers in your target number range.

实施LCG实际上非常简单,看起来像这样:

Implementing a LCG is actually very simple, and looks like this:

def lcg(a, c, m, seed = None):
    num = seed or 0
    while True:
        num = (a * num + c) % m
        yield num

然后,仅需为acm选择正确的值,以确保LCG会生成一个完整的周期(这是唯一获得非重复数字的保证) ).如Wikipedia文章所述,以下三个条件必须为真:

Then, it just comes down to choosing the correct values for a, c, and m to guarantee that the LCG will generate a full period (which is the only guarantee that you get non-repeating numbers). As the Wikipedia article explains, the following three conditions need to be true:

  1. mc需要相对素数.
  2. a - 1可被m
  3. 的所有主要因子整除
  4. a - 1可被4整除,如果m也可被4整除.
  1. m and c need to be relatively prime.
  2. a - 1 is divisible by all prime factors of m
  3. a - 1 is divisible by 4, if m is also divisible by 4.

只需为c选择一个质数就可以很容易地保证第一个.另外,这是最后可以选择的值,最终将使我们能够稍微混淆一下序列.

The first one is very easily guaranteed by simply choosing a prime for c. Also, this is the value that can be chosen last, and this will ultimately allow us to mix up the sequence a bit.

a - 1m之间的关系更加复杂.在整个周期的LCG中,m是周期的长度.换句话说,这就是您的数字来源的数字范围.因此,这通常是您首先选择的.在您的情况下,您希望m1000000附近.确切地选择最大数字可能很困难,因为这会限制您很多(在选择ac时),因此您也可以选择大于该数字的数字,然后稍后跳过所有超出范围的数字.

The relationship between a - 1 and m is more complicated though. In a full period LCG, m is the length of the period. Or in other words, it is the number range your numbers come from. So this is what you are usually choosing first. In your case, you want m to be around 1000000. Choosing exactly your maximum number might be difficult since that restricts you a lot (in both your choice of a and also c), so you can also choose numbers larger than that and simply skip all numbers outside of your range later.

现在让我们选择m = 1000000. m的主要因素是25.而且它显然也可以被4整除.因此,对于a - 1,我们需要一个为2 * 2 * 5的倍数的数字,以满足条件2和3.让我们选择a - 1 = 160,所以选择a = 161.

Let’s choose m = 1000000 now though. The prime factors of m are 2 and 5. And it’s also obviously divisible by 4. So for a - 1, we need a number that is a multiple of 2 * 2 * 5 to satisfy the conditions 2 and 3. Let’s choose a - 1 = 160, so a = 161.

对于c,我们使用的随机质数介于我们的范围之间:c = 506903

For c, we are using a random prime that’s somewhere in between of our range: c = 506903

将其放入我们的LCG中可以得到我们所需的序列.我们可以从范围(0 <= seed <= m)中选择任意种子值作为序列的起点.

Putting that into our LCG gives us our desired sequence. We can choose any seed value from the range (0 <= seed <= m) as the starting point of our sequence.

因此,请尝试一下并验证我们认为的内容是否确实有效.为此,我们只是从生成器中收集所有数字,直到我们找到一个重复的数字为止.此时,我们应该在集合中包含m = 1000000个数字:

So let’s try it out and verify that what we thought of actually works. For this purpose, we are just collecting all numbers from the generator in a set until we hit a duplicate. At that point, we should have m = 1000000 numbers in the set:

>>> g = lcg(161, 506903, 1000000)
>>> numbers = set()
>>> for n in g:
        if n in numbers:
            raise Exception('Number {} already encountered before!'.format(n))
        numbers.add(n)

Traceback (most recent call last):
  File "<pyshell#5>", line 3, in <module>
    raise Exception('Number {} already encountered before!'.format(n))
Exception: Number 506903 already encountered before!
>>> len(numbers)
1000000

这是正确的!因此,我们确实创建了一个伪随机数序列,该序列使我们能够从范围m中获得非重复数.当然,根据设计,此序列将始终相同,因此,当您选择这些数字时,它仅是随机的.但是,只要保持上述属性,就可以切换ac的值以获取不同的序列.

And it’s correct! So we did create a pseudo-random sequence of numbers that allowed us to get non-repeating numbers from our range m. Of course, by design, this sequence will be always the same, so it is only random once when you choose those numbers. You can switch up the values for a and c to get different sequences though, as long as you maintain the properties mentioned above.

这种方法的最大好处当然是您不需要存储所有先前生成的数字.这是一种恒定空间算法,因为它只需要记住初始配置和先前生成的值即可.

The big benefit of this approach is of course that you do not need to store all the previously generated numbers. It is a constant space algorithm as it only needs to remember the initial configuration and the previously generated value.

随着您深入到序列中,它也不会恶化.这是解决方案的一个普遍问题,这些解决方案只是不断生成一个随机数,直到找到一个从未遇到过的新数字为止.这是因为生成的数字列表越长,使用均匀分布的随机算法击中不在该列表中的数字的可能性就越小.因此,获得第1000000个数字可能需要很长时间才能使用基于内存的随机生成器进行生成.

It will also not deteriorate as you get further into the sequence. This is a general problem with solutions that just keep generating a random number until a new one is found that hasn’t been encountered before. This is because the longer the list of generated numbers gets, the less likely you are going to hit a numbers that’s not in that list with an evenly distributed random algorithm. So getting the 1000000th number will likely take you a long time to generate with memory based random generators.

但是,当然,拥有这种仅执行一些乘法和一些加法的简单算法似乎并不是很随机.但是您必须记住,这实际上是大多数伪随机数生成器的基础.所以random.random()在内部使用类似这样的东西.只是m大了 ,所以您在那儿没有注意到它.

But of course, having this simply algorithm which just performs some multiplication and some addition does not appear very random. But you have to keep in mind that this is actually the basis for most pseudo-random number generators out there. So random.random() uses something like this internally. It’s just that the m is a lot larger, so you don’t notice it there.

这篇关于随机数生成器,每次仅返回一个数字的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆