Weibull distribution parameter estimation error


Problem description

I used the following function to estimate the three-parameter Weibull distribution.

library(bbmle)
library(FAdist)
set.seed(16)
xl=rweibull3(50, shape = 1,scale=1, thres = 0)
dweib3l <- function(shape, scale, thres) { 
  -sum(dweibull3(xl , shape, scale, thres, log=TRUE))
}
ml <- mle2(dweib3l, start= list(shape = 1, scale = 1, thres=0), data=list(xl))

However, when I run the above function I am getting the following error.

Error in optim(par = c(shape = 1, scale = 1, thres = 0), fn = function (p)  : 
  non-finite finite-difference value [3]
In addition: There were 16 warnings (use warnings() to see them)

Is there any way to overcome this issue? Thank you!

Solution

The problem is that the threshold parameter is special: it defines a sharp boundary to the distribution, so any value of thres above the minimum value of the data gives zero likelihood (a log-likelihood of -Inf, and therefore an infinite negative log-likelihood). If a given value of xl is less than the specified threshold, then it is impossible according to the statistical model you have defined. Furthermore, we already know that the maximum likelihood estimate of the threshold is equal to the minimum value in the data set (analogous results hold for maximum likelihood estimation of the bounds of a uniform distribution ...).
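To see the boundary effect concretely, here is a minimal base-R sketch. It assumes the three-parameter density is simply dweibull() evaluated at x - thres (the usual definition of a shifted Weibull); nll3 and the sample x are names introduced here for illustration and are not part of the question's code.

```r
# Illustrative sketch: negative log-likelihood of a shifted (three-parameter)
# Weibull written with base R only. nll3 is a hypothetical helper.
nll3 <- function(shape, scale, thres, x) {
  -sum(dweibull(x - thres, shape = shape, scale = scale, log = TRUE))
}

set.seed(16)
x <- rweibull(50, shape = 1, scale = 1)  # stand-in sample; true threshold is 0

# Threshold just below, then just above, the smallest observation
below <- nll3(1, 1, thres = min(x) - 1e-3, x = x)
above <- nll3(1, 1, thres = min(x) + 1e-3, x = x)

is.finite(below)  # TRUE: every observation lies above the threshold
above             # Inf: the smallest observation has zero density
```

An optimizer that steps from the first situation into the second sees the objective jump from a finite value to Inf, which is the kind of non-finite value optim() complains about.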

I don't know why the other questions on SO that address this question don't encounter this particular problem - it may be because they use a starting value of the threshold that's far enough below the minimum value in the data set ...

Below, I use a fixed value of min(xl)-1e-5 for the threshold (shifting the value downward avoids numerical problems when the value is exactly on the boundary). I also use the formula interface so we can call the dweibull3() function directly, and put lower bounds on the shape and scale parameters (as a result I need to use method="L-BFGS-B", which allows for constraints).

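# Fix the threshold just below the smallest observation and estimate
# only shape and scale; L-BFGS-B is used because it supports bounds.
ml <- mle2(xl ~ dweibull3(shape = shape, scale = scale,
                          thres = min(xl) - 1e-5),
           start = list(shape = 1, scale = 1),
           lower = c(shape = 0, scale = 0),
           method = "L-BFGS-B",
           data = data.frame(xl))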

(The formula interface is convenient for simple examples: if you want to do something very much more complicated you may want to go back to defining your own log-likelihood function explicitly.)
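As a rough sketch of what an explicitly defined log-likelihood could look like, the following uses base R's optim() rather than mle2(). It is not the answer's own code: nll_fixed and thres_fixed are illustrative names, the density is assumed to be dweibull() applied to x - thres, and the lower bounds of 1e-6 (instead of 0) are a choice made here to keep both parameters strictly positive.

```r
set.seed(16)
x <- rweibull(50, shape = 1, scale = 1)   # stand-in for xl; true threshold is 0
thres_fixed <- min(x) - 1e-5              # threshold held just below the smallest point

# Negative log-likelihood in shape (par[1]) and scale (par[2]) only.
# nll_fixed is a hypothetical helper, not a bbmle or FAdist function.
nll_fixed <- function(par) {
  -sum(dweibull(x - thres_fixed, shape = par[1], scale = par[2], log = TRUE))
}

fit <- optim(par = c(shape = 1, scale = 1), fn = nll_fixed,
             method = "L-BFGS-B",
             lower = c(1e-6, 1e-6))       # keep both parameters strictly positive

fit$par          # estimated shape and scale
fit$convergence  # 0 means optim reports successful convergence
```

Writing the function by hand makes it easy to add further terms or constraints later, at the cost of losing the conveniences of the formula interface.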


If you insist on fitting the threshold parameter, you can do it by setting an upper bound that is (nearly) equal to the minimum value that occurs in the data [any larger value will give NA values and thus break the optimization]. However, you will find that the estimate of the threshold parameter always converges to this upper bound ... so this approach is really getting to the previous answer the hard way (you'll also get warnings about parameters being on the boundary, and about not being able to invert the Hessian).

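# Estimate all three parameters, but cap the threshold just below
# the smallest observation so the likelihood stays finite.
eps <- 1e-8
ml3 <- mle2(xl ~ dweibull3(shape = shape, scale = scale, thres = thres),
            start = list(shape = 1, scale = 1, thres = -5),
            lower = c(shape = 0, scale = 0, thres = -Inf),
            upper = c(shape = Inf, scale = Inf, thres = min(xl) - eps),
            method = "L-BFGS-B",
            data = data.frame(xl))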


For what it's worth, it does seem to be possible to fit the model without fixing the threshold parameter if you start with a small value for the threshold and use Nelder-Mead optimization; however, this seems to give unreliable results.
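A minimal sketch of such a Nelder-Mead attempt, using base R's optim(), might look like this. The helper nll_free, the starting threshold of -5, and the large finite penalty returned for invalid parameter values are all assumptions made for illustration, not something taken from the answer; treat the estimates with the same caution expressed above.

```r
set.seed(16)
x <- rweibull(50, shape = 1, scale = 1)   # stand-in for xl; true threshold is 0

# Negative log-likelihood in all three parameters (hypothetical helper).
# Invalid parameter values get a large finite penalty so the simplex
# search can move away from them instead of failing.
nll_free <- function(par) {
  shape <- par[1]; scale <- par[2]; thres <- par[3]
  if (shape <= 0 || scale <= 0 || thres >= min(x)) return(1e10)
  val <- -sum(dweibull(x - thres, shape = shape, scale = scale, log = TRUE))
  if (is.finite(val)) val else 1e10
}

fit_nm <- optim(par = c(shape = 1, scale = 1, thres = -5), fn = nll_free,
                method = "Nelder-Mead")

fit_nm$par   # compare the threshold estimate with min(x)
min(x)
```

Rerunning this from a few different starting thresholds is a quick way to check how stable the estimates are.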
