Fitting a Weibull distribution using Scipy
Problem description
I am trying to recreate maximum likelihood distribution fitting. I can already do this in Matlab and R, but now I want to use scipy. In particular, I would like to estimate the Weibull distribution parameters for my data set.
I have tried this:
import scipy.stats as s
import numpy as np
import matplotlib.pyplot as plt

def weib(x, n, a):
    # Two-parameter Weibull pdf with scale n and shape a
    return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)

data = np.loadtxt("stack_data.csv")
(loc, scale) = s.exponweib.fit_loc_scale(data, 1, 1)
print(loc, scale)

x = np.linspace(data.min(), data.max(), 1000)
plt.plot(x, weib(x, loc, scale))
# The bin count must be an integer
plt.hist(data, int(data.max()), density=True)
plt.show()
And get this:
(2.5827280639441961, 3.4955032285727947)
And a distribution that looks like this:
I have been using exponweib after reading this: http://www.johndcook.com/distributions_scipy.html. I have also tried the other Weibull functions in scipy (just in case!).
In Matlab (using the Distribution Fitting Tool - see screenshot) and in R (using both the MASS library function fitdistr and the GAMLSS package) I get a (loc) and b (scale) parameters more like 1.58463497 and 5.93030013. I believe all three methods use the maximum likelihood method for distribution fitting.
I have posted my data here if you would like to have a go! And for completeness I am using Python 2.7.5, Scipy 0.12.0, R 2.15.2 and Matlab 2012b.
Why am I getting a different result!?
My guess is that you want to estimate the shape parameter and the scale of the Weibull distribution while keeping the location fixed. Fixing loc assumes that the values of your data and of the distribution are positive, with a lower bound at zero. floc=0 keeps the location fixed at zero, and f0=1 keeps the first shape parameter of the exponentiated Weibull (exponweib) fixed at one.
>>> stats.exponweib.fit(data, floc=0, f0=1)
[1, 1.8553346917584836, 0, 6.8820748596850905]
>>> stats.weibull_min.fit(data, floc=0)
[1.8553346917584836, 0, 6.8820748596850549]
The fit looks OK against the histogram, but not very good. The parameter estimates are a bit higher than the ones you mention from R and Matlab.
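To make the comparison reproducible without the original data file, here is a minimal sketch on synthetic data. The sample, its seed, and the "true" shape 1.5 and scale 6.0 are assumptions chosen for illustration, not the question's data. It runs the two maximum likelihood fits shown above and, for contrast, the fit_loc_scale call from the question. According to the SciPy documentation, fit_loc_scale estimates loc and scale from the first and second moments for the given shape values, so it is not a maximum likelihood fit, which is probably why its numbers differ from R and Matlab.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the question's data (assumed shape 1.5, scale 6.0)
rng = np.random.default_rng(0)
sample = 6.0 * rng.weibull(1.5, size=5000)

# Maximum likelihood fit with the location pinned at zero
shape, loc, scale = stats.weibull_min.fit(sample, floc=0)

# Same model via exponweib with its first shape parameter pinned at 1
a, c, loc_e, scale_e = stats.exponweib.fit(sample, floc=0, f0=1)

# Moment matching with both shapes held at 1 (no likelihood involved)
loc_m, scale_m = stats.exponweib.fit_loc_scale(sample, 1, 1)

print("weibull_min MLE:", shape, scale)
print("exponweib MLE:  ", c, scale_e)
print("fit_loc_scale:  ", loc_m, scale_m)
```

With both shapes held at 1 the reference distribution is a standard exponential, so fit_loc_scale should return roughly the sample standard deviation as scale and the mean minus the standard deviation as loc. Neither number is a Weibull shape, so passing them to a Weibull pdf as scale and shape, as the question's plot does, will not reproduce the MLE curve.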
Update
The closest I can get to the plot that is now available is with an unrestricted fit that uses starting values. The fitted curve is still less peaked. Note that values passed to fit without an f prefix are used as starting values, not as fixed parameters.
>>> import numpy as np
>>> from scipy import stats
>>> import matplotlib.pyplot as plt
>>> plt.plot(data, stats.exponweib.pdf(data, *stats.exponweib.fit(data, 1, 1, scale=2, loc=0)))
>>> _ = plt.hist(data, bins=np.linspace(0, 16, 33), density=True, alpha=0.5)
>>> plt.show()
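The difference between a starting value and a fixed parameter can be sketched as follows. The synthetic sample and its parameters are again assumptions for illustration. The location is held at zero in both calls so that only the treatment of the first shape parameter differs.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (assumed shape 1.5, scale 6.0)
rng = np.random.default_rng(1)
sample = 6.0 * rng.weibull(1.5, size=2000)

# Positional shape values and scale= are only initial guesses,
# so the optimizer is free to move both shape parameters
start_only = stats.exponweib.fit(sample, 1, 1, floc=0, scale=2)

# f0=1 pins the first shape parameter; it comes back unchanged
pinned = stats.exponweib.fit(sample, f0=1, floc=0, scale=2)

print("start values only:", start_only)
print("first shape fixed:", pinned)
```

Both results are ordered as (a, c, loc, scale). In the pinned fit a is returned exactly as 1 and the model reduces to a plain Weibull, while in the other fit a is whatever the optimizer settles on.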