Sampling from a discrete probability distribution from first principles

Question

I have a set S = {a1, a2, a3, ..., an}. The elements are selected with probabilities {p1, p2, p3, ..., pn} respectively (where, of course, p1 + p2 + p3 + ... + pn = 1).

I want to simulate an experiment that makes this selection. However, I wish to do it without any libraries (i.e. from first principles).

I'm using the following method:

1) I map the elements onto the real number line as follows: X(a1)=1; X(a2)=2; X(a3)=3; X(a4)=4; X(a5)=5; ...; X(an)=n

2) Then I calculate the cumulative distribution function at each coordinate (i.e. P(X <= x)) as follows: cdf(x) = P(a1) + P(a2) + ... + P(ai), where i is such that X(ai) <= x < X(a(i+1))

(Thus the cdf is a step function.)

3) I randomly select a real number q in (0,1) and calculate the x-coordinate where the line y = q intersects the cdf. Since the cdf is a step function with jumps at 1, 2, ..., n, that point has an integer x-coordinate between 1 and n. Let this x-coordinate be m.

4) I select the ai for which X(ai) = m.

My question is: does this method simulate the experiment without any bias?

I'm not getting the required results, which is why I'm a bit skeptical.

Any help will be greatly appreciated! Thanks!

Recommended answer

The logic sounds OK. In general, to sample from an arbitrary cumulative distribution function Y(x) using a uniform distribution U(0,1), draw a uniform random value u, look it up in the Y vector, and return the least value of x for which Y(x) is greater than or equal to u, i.e. min{x : Y(x) >= u}.
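To see why this lookup reproduces the target probabilities, here is a short derivation. It assumes x takes the integer values 1, ..., n, that u is drawn from U(0,1), and that Y(0) = 0 by convention:

```latex
\[
\Pr\bigl(\min\{x : Y(x) \ge u\} = k\bigr)
  = \Pr\bigl(Y(k-1) < u \le Y(k)\bigr)
  = Y(k) - Y(k-1)
  = P(k), \qquad Y(0) = 0 .
\]
```

The middle step uses the fact that a uniform u lands in an interval contained in [0,1] with probability equal to that interval's length.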

You may want to add an x=0 observation for the base probability, as in the example below.

x      P(x)    Y(x)
0      0       0
1      0.1     0.1
2      0.3     0.4
3      0.4     0.8
4      0.2     1

For example, u = 0.3 -> x = 2, and u = 0.81 -> x = 4.
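The lookup can be sketched in Python roughly as follows. This is a minimal illustration, not the asker's code: the names `build_cdf` and `sample_index` are made up here, and the standard-library `random.random()` is assumed as the only source of uniform numbers.

```python
import random

def build_cdf(probs):
    """Return the running totals Y(1), ..., Y(n) of the probabilities."""
    cdf = []
    total = 0.0
    for p in probs:
        total += p
        cdf.append(total)
    return cdf

def sample_index(cdf, u):
    """Return the smallest 1-based x with Y(x) >= u, using a linear scan."""
    for x, y in enumerate(cdf, start=1):
        if u <= y:
            return x
    # Fallback in case rounding leaves the last total slightly below 1.
    return len(cdf)

probs = [0.1, 0.3, 0.4, 0.2]      # P(x) column of the table
cdf = build_cdf(probs)            # Y(x) column of the table

print(sample_index(cdf, 0.3))     # 2, matching the worked example
print(sample_index(cdf, 0.81))    # 4, matching the worked example
print(sample_index(cdf, random.random()))  # one random draw
```

The comparison is `u <= y` rather than `u < y` so that it matches min{x : Y(x) >= u}. For a continuous u the two differ only on a set of probability zero.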

Clearly, calculating relative frequencies over many trials will give unbiased estimates of P(x).
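One way to check this empirically is to run many trials and compare the observed frequencies with P(x). The sketch below is illustrative only: the function names, the seed, and the trial count are arbitrary choices, and Python's standard `random.Random` is assumed as the uniform generator.

```python
import random

def sample(probs, rng):
    """Draw one 1-based index by inverse-CDF lookup."""
    u = rng.random()
    running = 0.0
    for x, p in enumerate(probs, start=1):
        running += p          # running total is Y(x)
        if u <= running:
            return x
    # Fallback in case rounding leaves the final total slightly below 1.
    return len(probs)

def relative_frequencies(probs, trials, seed=0):
    """Return the fraction of trials in which each index was drawn."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * len(probs)
    for _ in range(trials):
        counts[sample(probs, rng) - 1] += 1
    return [c / trials for c in counts]

probs = [0.1, 0.3, 0.4, 0.2]
freqs = relative_frequencies(probs, trials=100_000)
for x, (p, f) in enumerate(zip(probs, freqs), start=1):
    print(x, p, round(f, 3))
```

If the observed frequencies stay far from P(x) as the number of trials grows, the usual suspects are the CDF construction or the comparison in the lookup rather than the method itself.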
