Fitting a Weibull distribution using Scipy

Problem description

I am trying to recreate maximum likelihood distribution fitting. I can already do this in Matlab and R, but now I want to use scipy. In particular, I would like to estimate the Weibull distribution parameters for my data set.

I have tried this:

import scipy.stats as s
import numpy as np
import matplotlib.pyplot as plt

def weib(x, n, a):
    # Two-parameter Weibull pdf with scale n and shape a
    return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)

data = np.loadtxt("stack_data.csv")

# Estimate loc and scale with both exponweib shape parameters set to 1
(loc, scale) = s.exponweib.fit_loc_scale(data, 1, 1)
print(loc, scale)

x = np.linspace(data.min(), data.max(), 1000)
plt.plot(x, weib(x, loc, scale))
# hist() needs an integer bin count
plt.hist(data, int(data.max()), density=True)
plt.show()

And get this:

(2.5827280639441961, 3.4955032285727947)

And a distribution that looks like this:

I have been using exponweib after reading http://www.johndcook.com/distributions_scipy.html, and I have also tried the other Weibull functions in scipy (just in case!).

In Matlab (using the Distribution Fitting Tool, see screenshot) and in R (using both the MASS library function fitdistr and the GAMLSS package) I get a (loc) and b (scale) parameters closer to 1.58463497 and 5.93030013. I believe all three methods use maximum likelihood for distribution fitting.

I have posted my data here if you would like to have a go! And for completeness I am using Python 2.7.5, Scipy 0.12.0, R 2.15.2 and Matlab 2012b.

Why am I getting a different result!?

Solution

My guess is that you want to estimate the shape parameter and the scale of the Weibull distribution while keeping the location fixed. Fixing loc assumes that the values of your data and of the distribution are positive with lower bound at zero.

floc=0 keeps the location fixed at zero, and f0=1 keeps the first shape parameter of the exponentiated Weibull (exponweib) fixed at one.

>>> from scipy import stats
>>> stats.exponweib.fit(data, floc=0, f0=1)
[1, 1.8553346917584836, 0, 6.8820748596850905]
>>> stats.weibull_min.fit(data, floc=0)
[1.8553346917584836, 0, 6.8820748596850549]
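
The two calls above agree because, with its first shape parameter equal to one, exponweib reduces to the ordinary two-parameter Weibull. The following is a minimal sketch of that equivalence, reusing the weib function from the question and the estimates quoted above; the grid x is an arbitrary choice made for illustration. It also shows the argument order that weib expects: scale first, then shape.

```python
import numpy as np
from scipy import stats

def weib(x, n, a):
    """Two-parameter Weibull pdf with scale n and shape a (as in the question)."""
    return (a / n) * (x / n) ** (a - 1) * np.exp(-(x / n) ** a)

# Arbitrary evaluation grid, chosen only for this illustration
x = np.linspace(0.1, 15.0, 200)

# Shape and scale estimates quoted in the answer
shape, scale = 1.8553346917584836, 6.8820748596850905

pdf_manual = weib(x, scale, shape)
pdf_weibull_min = stats.weibull_min.pdf(x, shape, loc=0, scale=scale)
# First shape parameter fixed at 1, as with f0=1 in the fit call
pdf_exponweib = stats.exponweib.pdf(x, 1, shape, loc=0, scale=scale)

print(np.allclose(pdf_manual, pdf_weibull_min))
print(np.allclose(pdf_manual, pdf_exponweib))
```

Note that the question's code calls weib(x, loc, scale), so the location estimate is used as the scale and the scale estimate as the shape.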

The fit looks OK against the histogram, but not very good. The parameter estimates are a bit higher than the ones you report from R and Matlab.
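
One way to check whether a discrepancy like this comes from the fitting call or from the data is to fit a sample whose parameters are known. The sketch below does this with synthetic data, since the original data set is not reproduced here; true_shape, true_scale, the sample size and the seed are all assumptions picked for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical "true" parameters, not taken from the original data
true_shape, true_scale = 1.5, 6.0

# Draw a reproducible synthetic sample from a two-parameter Weibull
sample = stats.weibull_min.rvs(
    true_shape, loc=0, scale=true_scale, size=5000, random_state=0
)

# Maximum likelihood fit with the location held fixed at zero
shape_hat, loc_hat, scale_hat = stats.weibull_min.fit(sample, floc=0)

print("shape:", shape_hat)
print("loc:  ", loc_hat)
print("scale:", scale_hat)
```

If the estimates land close to the values used to generate the sample, the floc=0 fit itself is behaving as a maximum likelihood fit should, and any remaining difference from R or Matlab is more likely down to the data or the parameterisation.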

Update

The closest I can get to the plot that is now available is with an unrestricted fit that uses starting values. The plot is still less peaked. Note that values passed to fit without an f prefix are used as starting values rather than held fixed.

>>> import numpy as np
>>> from scipy import stats
>>> import matplotlib.pyplot as plt
>>> plt.plot(data, stats.exponweib.pdf(data, *stats.exponweib.fit(data, 1, 1, scale=2, loc=0)))
>>> _ = plt.hist(data, bins=np.linspace(0, 16, 33), density=True, alpha=0.5)
>>> plt.show()
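
The difference between a starting value and a fixed value can be seen in what fit returns. This is a rough sketch on synthetic data (the sample and its parameters are assumptions, not the original data set): scale=2 is only an initial guess for the optimiser, while f0=1 and floc=0 are returned unchanged.

```python
import numpy as np
from scipy import stats

# Synthetic sample with hypothetical parameters (shape 1.5, scale 6.0)
sample = stats.weibull_min.rvs(1.5, loc=0, scale=6.0, size=2000, random_state=1)

# Positional shape values and scale=2 are starting guesses only;
# floc=0 holds the location fixed, so both shapes and the scale are estimated
start_only = stats.exponweib.fit(sample, 1, 1, scale=2, floc=0)

# f0=1 additionally holds the first shape parameter fixed at one;
# scale=2 is still just a starting guess
fixed = stats.exponweib.fit(sample, f0=1, floc=0, scale=2)

print("start values only:", start_only)
print("first shape fixed:", fixed)
```

In the second result the first and third entries stay at exactly 1 and 0, while the scale moves away from its starting value of 2.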
