How can I pick a random value between 0 and a bigint?


Problem description

I have a combinatorics problem for which I want to be able to pick an integer at random between 0 and a big integer.


Inadequacies of my current approach

Now for regular integers I would usually write something like int rand 500; and be done with it.

But for big integers, it looks like rand isn't meant for this.

Using the following code, I ran a simulation of 2 million calls to rand $bigint:

$ perl -Mbigint -E 'say int rand 1230138339199329632554990773929330319360000000 for 1 .. 2e6' > rand.txt

The distribution of the resultant set is far from desirable:

  • 0 (56 counts)
  • magnitude 1e+040 (112 counts)
  • magnitude 1e+041 (1411 counts)
  • magnitude 1e+042 (14496 counts)
  • magnitude 1e+043 (146324 counts)
  • magnitude 1e+044 (1463824 counts)
  • magnitude 1e+045 (373777 counts)

So the process was never able to choose a number like 999, or 5e+020, which makes this approach unsuitable for what I want to do.

It looks like this has something to do with the limited precision of rand, which never went beyond 15 digits in the course of my testing:

$ perl -E 'printf "%.66g", rand'
0.307037353515625
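
One possible reading of that output, offered as an assumption rather than something the post confirms: the printed value equals exactly 10061/2^15, which is what a generator delivering only 15 random bits per call would produce. The Python sketch below checks that arithmetic and shows what such a model would imply for int rand $bigint. RAND_BITS and the scaling model are illustrative assumptions, not a statement about Perl internals.

```python
from fractions import Fraction

# Upper bound used in the simulation above.
N = 1230138339199329632554990773929330319360000000

# Assumption: rand yields k / 2**RAND_BITS for an integer k, with RAND_BITS = 15.
RAND_BITS = 15

# The sample printed above is exactly representable with 15 bits.
sample = Fraction("0.307037353515625")
assert sample == Fraction(10061, 2**RAND_BITS)

# Under this model, int(rand * N) can take at most 2**RAND_BITS distinct
# values, and the smallest nonzero one is floor(N / 2**RAND_BITS).
reachable_values = 2**RAND_BITS
smallest_nonzero = N // 2**RAND_BITS

print(reachable_values)
print(smallest_nonzero)
```

Under that assumption nothing between 0 and roughly 3.75e40 is reachable, which would be consistent with the histogram above. About 2e6 / 32768 ≈ 61 zeros would be expected, close to the 56 observed.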


How can I overcome this limitation?

My initial thought was that there might be a way to influence the precision of rand, but that feels like a band-aid over a much bigger problem (i.e. the inability of rand to handle big integers).

In any case, I'm hoping someone has walked down this path before and knows how to remedy the situation.

Solution

(Converted from my comment)

A more theory-driven approach is to call the PRNG multiple times to gather enough random bits to cover the number you want to sample. Care is needed when the number of bits the PRNG produces does not match the number of bits needed, as outlined below.

Pseudocode

  • Calculate the number of bits needed to represent your number: n_needed_bits
  • Check how many bits your PRNG returns per call: n_bits_prng
  • Calculate the number of samples needed: needed_prng_samples = ceil(n_needed_bits / n_bits_prng)
  • While true:
    • Call the PRNG needed_prng_samples times and concatenate all the bits obtained
    • Check whether the resulting number is within your range
    • Yes: return the number (finished)
    • No: do nothing (the loop continues and resamples all components)
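
The steps above can be sketched in Python, whose integers are arbitrary precision. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name rand_below, the half-open range [0, upper), the default of 15 bits per PRNG call, and the use of random.getrandbits as a stand-in PRNG are all illustrative choices. It also adds one step that is not in the pseudocode: surplus bits are shifted away before the range check, so that more than half of all candidates are accepted.

```python
import random


def rand_below(upper, n_bits_prng=15, next_bits=None):
    """Return a uniformly distributed integer in [0, upper) by rejection sampling."""
    if upper <= 0:
        raise ValueError("upper must be positive")
    if next_bits is None:
        # Stand-in PRNG (assumption): each call yields n_bits_prng random bits.
        def next_bits():
            return random.getrandbits(n_bits_prng)

    # Bits needed to represent the largest admissible value, upper - 1.
    n_needed_bits = (upper - 1).bit_length()
    # Ceiling division; at least one call so that upper == 1 also works.
    needed_prng_samples = max(1, -(-n_needed_bits // n_bits_prng))
    surplus_bits = needed_prng_samples * n_bits_prng - n_needed_bits

    while True:
        candidate = 0
        for _ in range(needed_prng_samples):
            # Concatenate: shift the bits gathered so far, append the new chunk.
            candidate = (candidate << n_bits_prng) | next_bits()
        # Not in the pseudocode: keep only n_needed_bits of the concatenation.
        candidate >>= surplus_bits
        if candidate < upper:
            return candidate
        # Out of range: discard everything and resample all chunks.


if __name__ == "__main__":
    N = 1230138339199329632554990773929330319360000000
    print(rand_below(N))
```

Passing a different next_bits callable lets the same loop run on top of another bit source, as long as it returns exactly n_bits_prng uniformly distributed bits per call.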

Remarks

  • This is a form of acceptance sampling / rejection sampling
  • The approach is a Las Vegas-type algorithm: the runtime is not bounded in theory
    • The number of loops needed is, on average: n_possible-sample-numbers-of-full-concatenation / n_possible-sample-numbers-within-range
  • Resampling everything when the result is out of range, as the rejection method prescribes, is what makes a formal analysis of unbiasedness / uniformity possible, and it is a very important aspect of this approach
  • Of course, the classic assumptions about PRNG output are needed to make this work
    • If the PRNG, for example, has some non-uniformity in its low bits / high bits (as is often mentioned), this will affect the output above
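
To put a number on the average loop count from the remarks: each pass is accepted with probability range / 2^total_bits, so the mean number of passes is 2^total_bits / range. The sketch below evaluates this for the bound from the question. The helper name expected_loops and the PRNG widths of 15 and 32 bits are hypothetical values chosen for illustration.

```python
from fractions import Fraction


def expected_loops(upper, n_bits_prng):
    """Mean number of passes when every concatenated bit is kept."""
    n_needed_bits = (upper - 1).bit_length()
    # Ceiling division, with at least one PRNG call.
    samples = max(1, -(-n_needed_bits // n_bits_prng))
    total_bits = samples * n_bits_prng
    # Possible concatenated values divided by values inside the range.
    return Fraction(2**total_bits, upper)


N = 1230138339199329632554990773929330319360000000
for width in (15, 32):  # hypothetical PRNG output widths
    print(width, float(expected_loops(N, width)))
```

With 15-bit chunks the 150 bits gathered match the 150 bits this bound needs, and the mean is about 1.16 passes. With 32-bit chunks, 160 bits are gathered and the mean rises to roughly 1,200 passes. Keeping only n_needed_bits of the concatenation before the range check keeps the mean below 2 in either case.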
