Calculating Probability of a Random Variable in a Distribution in Python

Question

Given a mean and standard-deviation defining a normal distribution, how would you calculate the following probabilities in pure-Python (i.e. no Numpy/Scipy or other packages not in the standard library)?

  1. The probability of a random variable r where r < x or r <= x.
  2. The probability of a random variable r where r > x or r >= x.
  3. The probability of a random variable r where x > r > y.

I've found some libraries, like Pgnumerics, that provide functions for calculating these, but the underlying math is unclear to me.

To show this isn't homework, posted below is my working code for Python<=2.6, albeit I'm not sure if it handles the boundary conditions correctly.

from math import exp, pi, sqrt

def erfcc(x):
    """
    Complementary error function.
    """
    z = abs(x)
    t = 1. / (1. + 0.5*z)
    r = t * exp(-z*z-1.26551223+t*(1.00002368+t*(.37409196+
        t*(.09678418+t*(-.18628806+t*(.27886807+
        t*(-1.13520398+t*(1.48851587+t*(-.82215223+
        t*.17087277)))))))))
    if (x >= 0.):
        return r
    else:
        return 2. - r

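def normcdf(x, mu, sigma):
    """
    Cumulative distribution function of the normal distribution.
    """
    t = x - mu
    # 0.5 * erfc(-z) equals 0.5 * (1 + erf(z)), which is the normal CDF
    y = 0.5 * erfcc(-t / (sigma * sqrt(2.0)))
    # erfcc is an approximation, so clamp any overshoot above 1
    if y > 1.0:
        y = 1.0
    return y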

def normpdf(x, mu, sigma):
    u = (x-mu)/abs(sigma)
    y = (1/(sqrt(2*pi)*abs(sigma)))*exp(-u*u/2)
    return y

def normdist(x, mu, sigma, f):
    if f:
        y = normcdf(x,mu,sigma)
    else:
        y = normpdf(x,mu,sigma)
    return y

def normrange(x1, x2, mu, sigma, f=True):
    """
    Calculates probability of random variable falling between two points.
    """
    p1 = normdist(x1, mu, sigma, f)
    p2 = normdist(x2, mu, sigma, f)
    return abs(p1-p2)

Recommended answer

All these are very similar: If you can compute #1 using a function cdf(x), then the solution to #2 is simply 1 - cdf(x), and for #3 it's cdf(x) - cdf(y).
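
As a rough sketch of those three relationships: the helper names below (prob_below, prob_above, prob_between) are made up for illustration, and the example assumes Python 3.8 or later, where statistics.NormalDist is part of the standard library and supplies the cdf. The mean of 100 and standard deviation of 15 are arbitrary example values.

```python
from statistics import NormalDist  # standard library, Python 3.8+ (assumption)


def prob_below(dist, x):
    """#1: P(r < x), which equals P(r <= x) for a continuous distribution."""
    return dist.cdf(x)


def prob_above(dist, x):
    """#2: P(r > x) = 1 - cdf(x)."""
    return 1.0 - dist.cdf(x)


def prob_between(dist, y, x):
    """#3: P(x > r > y) = cdf(x) - cdf(y), assuming y <= x."""
    return dist.cdf(x) - dist.cdf(y)


# Arbitrary example distribution: mean 100, standard deviation 15.
dist = NormalDist(mu=100.0, sigma=15.0)

print(prob_below(dist, 115.0))          # one standard deviation above the mean
print(prob_above(dist, 115.0))
print(prob_between(dist, 85.0, 115.0))  # within one standard deviation
```

With these values the three results should come out near 0.84, 0.16 and 0.68, the familiar one-standard-deviation figures.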

Since Python has included the (Gauss) error function in the math module since version 2.7, you can do this by calculating the CDF of the normal distribution using the equation from the article you linked to:

import math
print(0.5 * (1 + math.erf((x - mean) / math.sqrt(2 * standard_dev**2))))

where mean is the mean and standard_dev is the standard deviation.
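
For something you can paste and run, the same formula can be wrapped in a small function. This is only a sketch: the name normal_cdf and the check that standard_dev is positive are additions for illustration, not part of the original answer. For a positive standard_dev, math.sqrt(2 * standard_dev**2) is the same as standard_dev * math.sqrt(2).

```python
import math


def normal_cdf(x, mean, standard_dev):
    """P(X <= x) for a normal distribution with the given mean and standard deviation."""
    if standard_dev <= 0:
        # Added guard (assumption): the formula is only meaningful for sigma > 0.
        raise ValueError("standard_dev must be positive")
    z = (x - mean) / (standard_dev * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Standard normal distribution as an example.
print(normal_cdf(0.0, 0.0, 1.0))   # 0.5: half the mass lies below the mean
print(normal_cdf(1.96, 0.0, 1.0))  # close to 0.975
```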

Some notes since what you asked seemed relatively straightforward given the information in the article:

  • The CDF of a random variable (say X) is the probability that X lies between -infinity and some limit, say x (lower case). For continuous distributions, the CDF is the integral of the PDF. The CDF is exactly what you described for #1: you want a normally distributed random variable to be between -infinity and x (<= x).
  • < and <=, as well as > and >=, are the same for continuous random variables, because the probability that the random variable equals any single point is 0. So whether or not x itself is included doesn't matter when calculating probabilities for continuous distributions.
  • The probabilities sum to 1: if the value is not < x, then it is >= x. So if you have cdf(x), then 1 - cdf(x) is the probability that the random variable X >= x. Since >= is equivalent to > for continuous random variables, this is also the probability that X > x.
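
One practical footnote to the last point, offered as a sketch rather than as part of the original answer: 1 - cdf(x) is mathematically right, but far out in the upper tail the subtraction can round to zero in floating point. The math module also provides erfc (added alongside erf), so the upper tail can be computed directly. The function names normal_cdf and normal_sf below are hypothetical.

```python
import math


def normal_cdf(x, mean, standard_dev):
    """P(X <= x), written with erfc: 0.5 * erfc(-z) equals 0.5 * (1 + erf(z))."""
    z = (x - mean) / (standard_dev * math.sqrt(2.0))
    return 0.5 * math.erfc(-z)


def normal_sf(x, mean, standard_dev):
    """P(X > x), computed directly instead of as 1 - cdf(x)."""
    z = (x - mean) / (standard_dev * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# Near the centre, the two tails add up to 1 as the notes above say.
lower = normal_cdf(1.0, 0.0, 1.0)
upper = normal_sf(1.0, 0.0, 1.0)
print(lower + upper)

# Ten standard deviations out, the subtraction loses the answer to rounding,
# while the direct computation still returns a tiny positive probability.
print(1.0 - normal_cdf(10.0, 0.0, 1.0))
print(normal_sf(10.0, 0.0, 1.0))
```

For everyday values of x the two approaches agree to many decimal places, so the plain 1 - cdf(x) form is usually fine.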
