在Python中拟合负二项式 [英] Fitting negative binomial in python

查看:184
本文介绍了在Python中拟合负二项式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在scipy中,不支持使用数据
拟合负二项式分布(可能是由于scipy中的负二项式只是离散的事实)。



对于正态分布,我只想做:

$ s $$$$$$$$$$$$$$$$$$$$$$来自scipy.stats import norm
param = norm。 fit(samp)

在任何其他库中是否有类似的准备使用功能?

解决方案

不仅因为它是离散的,而且还因为涉及负二项式的最大似然拟合可能会涉及很多,尤其是使用附加的位置参数时。这就是为什么未为其提供 .fit()方法(以及 Scipy 中的其他离散分布)的原因。 ,例如:

 在[163]中:

import scipy.stats as ss
import scipy.optimize as as
在[164]中:
#定义似然函数
def可能性_f(P,x,neg = 1):
n = np。 round(P [0])#根据定义,它应该是整数
p = P [1]
loc = np.round(P [2])
return neg *(np。 log(ss.nbinom.pmf(x,n,p,loc)))。sum()
在[165]中:
#生成一个随机变量
X = ss.nbinom。 rvs(n = 100,p = 0.4,loc = 0,size = 1000)
在[166]中:
#可能性
可能性_f([100,0.4,0],X)
出局[166]:
-4400.3696690513316
在[167]:
#简单拟合,拟合度不佳,参数估计值偏离
结果= so.fmin(likelihood_f,[50,1,1],args = [X,-1),full_output = True,disp = False)
P1 =结果[0]
(结果[1 ],结果[0])
出[167]:
(4418.59949588647 4,数组([59.61196161,0.28650831,1.15141838]))
在[168]中:
#尝试使用另一组起始参数,拟合度仍然不佳,参数估计值仍较远b $ b结果= so.fmin(likelihood_f,[50,0.5,0],args =(X,-1),full_output = True,disp = False)
P1 =结果[0]
(result [1],result [0])
Out [168]:
(4417.1495981801972,
array([6.24809397e + 01,2.91877405e-01,6.63343536e-04]) )
在[169]中:
#在这种情况下,我们需要一个循环来使其正确处理
result = []
for i in range(40,120):#in事实(80,120)应该足够了
_ = so.fmin(likelihood_f,[i,0.5,0],args =(X,-1),full_output = True,disp = False)
result.append(((_ [1],_ [0]))
在[170]中:
#获取MLE
P2 = sorted(result,key = lambda x:x [0])[0] [1]
已排序(结果,键= lambda x:x [0])[0]
出局[170]:
(4399.780263084549,
array([9.37289361e + 01,3.84587087e-01,3.36856705e-04]))
在[171]中:
#哪个在视觉上更好?
plt.hist(X,bins = 20,normed = True)
plt.plot(range(260),ss.nbinom.pmf(range(260),np.round(P1 [0] ),P1 [1],np.round(P1 [2]),'g-')
plt.plot(range(260),ss.nbinom.pmf(range(260),np.round) (P2 [0]),P2 [1],np.round(P2 [2])), r-)
Out [171]:
[< matplotlib.lines.Line2D at 0x109776c10>]


In scipy there is no support for fitting a negative binomial distribution using data (maybe due to the fact that the negative binomial in scipy is only discrete).

For a normal distribution I would just do:

from scipy.stats import norm
param = norm.fit(samp)

Is there something similar 'ready to use' function in any other library?

解决方案

Not only because it is discrete, also because maximum likelihood fit to negative binomial can be quite involving, especially with an additional location parameter. That would be the reason why .fit() method is not provided for it (and other discrete distributions in Scipy), here is an example:

In [163]:

import scipy.stats as ss
import scipy.optimize as so
In [164]:
#define a likelihood function
def likelihood_f(P, x, neg=1):
    n=np.round(P[0]) #by definition, it should be an integer 
    p=P[1]
    loc=np.round(P[2])
    return neg*(np.log(ss.nbinom.pmf(x, n, p, loc))).sum()
In [165]:
#generate a random variable
X=ss.nbinom.rvs(n=100, p=0.4, loc=0, size=1000)
In [166]:
#The likelihood
likelihood_f([100,0.4,0], X)
Out[166]:
-4400.3696690513316
In [167]:
#A simple fit, the fit is not good and the parameter estimate is way off
result=so.fmin(likelihood_f, [50, 1, 1], args=(X,-1), full_output=True, disp=False)
P1=result[0]
(result[1], result[0])
Out[167]:
(4418.599495886474, array([ 59.61196161,   0.28650831,   1.15141838]))
In [168]:
#Try a different set of start paramters, the fit is still not good and the parameter estimate is still way off
result=so.fmin(likelihood_f, [50, 0.5, 0], args=(X,-1), full_output=True, disp=False)
P1=result[0]
(result[1], result[0])
Out[168]:
(4417.1495981801972,
 array([  6.24809397e+01,   2.91877405e-01,   6.63343536e-04]))
In [169]:
#In this case we need a loop to get it right
result=[]
for i in range(40, 120): #in fact (80, 120) should probably be enough
    _=so.fmin(likelihood_f, [i, 0.5, 0], args=(X,-1), full_output=True, disp=False)
    result.append((_[1], _[0]))
In [170]:
#get the MLE
P2=sorted(result, key=lambda x: x[0])[0][1]
sorted(result, key=lambda x: x[0])[0]
Out[170]:
(4399.780263084549,
 array([  9.37289361e+01,   3.84587087e-01,   3.36856705e-04]))
In [171]:
#Which one is visually better?
plt.hist(X, bins=20, normed=True)
plt.plot(range(260), ss.nbinom.pmf(range(260), np.round(P1[0]), P1[1], np.round(P1[2])), 'g-')
plt.plot(range(260), ss.nbinom.pmf(range(260), np.round(P2[0]), P2[1], np.round(P2[2])), 'r-')
Out[171]:
[<matplotlib.lines.Line2D at 0x109776c10>]

这篇关于在Python中拟合负二项式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆