Why does optimx in R not give the correct solution to this simple nonparametric likelihood maximization?

Problem description

I am trying to maximize a very simple likelihood. It is non-parametric in the sense that the distribution F is not specified parametrically. Rather, for each observed x_i, f(x_i) = p_i, and thus log(likelihood) = sum(log(f(x_i))) = sum(log(p_i)).

The function I am trying to maximize is sum(log(p_i)) + lambda*(sum(p_i) - 1), where sum(p_i) = 1 (i.e. this is a constrained maximization problem, which can be solved using a Lagrange multiplier).

The answer to this problem is p_i = 1/n, where n is the number of data points. However, optimx does not seem to give this solution. Does anybody have any idea? For n = 2, the function I am maximizing is log(p1) + log(p2) + lambda*(p1 + p2 - 1).
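
(For completeness, the derivation: setting the partial derivative of sum(log(p_i)) + lambda*(sum(p_i) - 1) with respect to p_i to zero gives 1/p_i + lambda = 0, so p_i = -1/lambda for every i; the constraint sum(p_i) = 1 then forces n*(-1/lambda) = 1, i.e. lambda = -n and p_i = 1/n. For n = 2 this is p1 = p2 = 1/2 and lambda = -2.)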

Here is my code and output from R:

library(optimx)

n <- 2

# Negated Lagrangian, accumulated per observation:
# log(p[i]) + lamda*p[i] - lamda/n summed over i equals
# sum(log(p)) + lamda*(sum(p) - 1).
log.like <- function(p) {
  lamda <- p[n + 1]
  ll <- 0
  for (i in 1:n) {
    ll <- ll + log(p[i]) + lamda*p[i] - lamda/n
  }
  return(-ll)  # optimx minimizes, so return the negative
}

mle = optimx(c(0.48,.52,-1.5),
             log.like,
             lower=c(rep(0.1,2),-3),
             upper=c(rep(.9,2),-1),
             method = "L-BFGS-B")

> mle
             par  fvalues   method fns grs itns conv  KKT1 KKT2 xtimes
1 0.9, 0.9, -1.0 1.010721 L-BFGS-B   8   8 NULL    0 FALSE   NA      0

The solution to the equation when n = 2 is p1 = p2 = 1/2 and lambda = -2. However, I do not get this when using optimx. Any idea?

Answer

Nothing is wrong with optimx. Take a step back and look at the function you want to maximize: log(p1) + log(p2) + lambda*(p1 + p2 - 1). It is quite intuitive that the optimal solution is to make all variables as large as possible, no? Note that optimx rightfully returned the upper bounds you specified.
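
As a quick numeric illustration (a sketch added here, not part of the original answer), evaluating the objective at the interior critical point and at the upper bounds shows that the boundary value is larger; its negative, 1.010721, is exactly the fvalues reported above, since the code minimizes the negated objective:

# the (un-negated) objective from the question
f <- function(p1, p2, lambda) log(p1) + log(p2) + lambda*(p1 + p2 - 1)

f(0.5, 0.5, -2)  # interior critical point: -1.386294
f(0.9, 0.9, -1)  # upper bounds of the box:  -1.010721 (larger)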

So what is wrong with your approach? When using Lagrange multipliers, the critical points are saddle points of the function above, not local minima of the kind optimx can find for you. You therefore need to modify your problem in such a way that these saddle points become local minima. This can be done by minimizing the norm of the gradient, which is easy to compute analytically for your problem. There is a great example (with pictures) here:

http://en.wikipedia.org/wiki/Lagrange_multiplier#Example:_numerical_optimization .

Applied to your problem:

grad.norm <- function(x) {
  lambda <- tail(x, 1)
  p <- head(x, -1)
  # squared norm of the Lagrangian's gradient:
  # d/dp_i = 1/p_i + lambda,  d/dlambda = sum(p) - 1
  sum((1/p + lambda)^2) + (sum(p) - 1)^2
}

optimx(c(.48, .52, -1.5),
       grad.norm,
       lower = c(rep(.1, 2), -3),
       upper = c(rep(.9, 2), -1),
       method = "L-BFGS-B")

#                               par      fvalues   method  fns grs [...]
# 1 0.5000161, 0.5000161, -1.9999356 1.038786e-09 L-BFGS-B  13  13 [...]
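
As a sanity check (added here, not in the original answer), the returned par satisfies the first-order conditions 1/p_i + lambda = 0 and sum(p_i) = 1 up to numerical tolerance:

sol <- c(0.5000161, 0.5000161, -1.9999356)  # par from the run above
1/head(sol, -1) + tail(sol, 1)              # stationarity: ~ 0
sum(head(sol, -1)) - 1                      # feasibility:  ~ 3.2e-05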

Follow up: If you do not want to or cannot compute the gradient yourself, you can let R compute a numerical approximation, for example:

library(numDeriv)  # provides grad() for numerical differentiation

# the (un-negated) Lagrangian whose critical points we seek
log.like <- function(x) {
  lambda <- tail(x, 1)
  p <- head(x, -1)
  return(sum(log(p)) + lambda*(sum(p) - 1))
}

# squared norm of a numerical approximation of the gradient
grad.norm <- function(x) {
  return(sum(grad(log.like, x)^2))
}

optimx(c(.48, .52, -1.5),
       grad.norm,
       lower = c(rep(.1, 2), -3),
       upper = c(rep(.9, 2), -1),
       method = "L-BFGS-B")

#                                par      fvalues   method fns grs [...]
# 1 0.5000161, 0.5000161, -1.9999356 1.038784e-09 L-BFGS-B  13  13 [...]
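
The same machinery extends to any n; here is a quick sketch for n = 4 (the starting values and the widened lower bound on lambda are my own choices, since the stationary point is now lambda = -n = -4, p_i = 1/4):

n <- 4
optimx(c(rep(0.3, n), -3),
       grad.norm,
       lower = c(rep(0.1, n), -6),
       upper = c(rep(0.9, n), -1),
       method = "L-BFGS-B")
# expected: p_i ~ 0.25 and lambda ~ -4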
