Moving beyond R's optim function

Question

I am trying to use R to estimate a multinomial logit model with a manual specification. I have found a few packages that allow you to estimate MNL models here or here.

I've found some other writings on "rolling" your own MLE function here. However, from my digging around, all of these functions and packages rely on the internal optim function.

In my benchmark tests, optim is the bottleneck. Using a simulated dataset with ~16000 observations and 7 parameters, R takes around 90 seconds on my machine. The equivalent model in Biogeme takes ~10 seconds. A colleague who writes his own code in Ox reports around 4 seconds for this same model.

Does anyone have experience with writing their own MLE function or can point me in the direction of something that is optimized beyond the default optim function (no pun intended)?

If anyone wants the R code to recreate the model, let me know and I'll gladly provide it. I haven't included it here, both to save space and because it isn't directly relevant to the problem of optimizing the optim function.

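Thanks to everyone for your thoughts. Based on the many comments below, we were able to get R into the same ballpark as Biogeme for the more complicated models, and R was actually faster for several smaller, simpler models that we ran. I think the long-term solution to this problem will involve writing a separate maximization function that relies on a Fortran or C library, but I'm certainly open to other approaches.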

Recommended answer

Have you tried the nlm() function already? I don't know if it's much faster, but it does improve speed. Also check the options: optim uses a slow algorithm (Nelder-Mead) as its default. You can gain a more than 5-fold speedup by using the quasi-Newton algorithm (method="BFGS") instead. If you're not too concerned about the last digits, you can also loosen the tolerance levels of nlm() (steptol and gradtol) to gain extra speed.

f <- function(x) sum((x-1:length(x))^2)

a <- 1:5

system.time(replicate(500,
     optim(a,f)
))
   user  system elapsed 
   0.78    0.00    0.79 

system.time(replicate(500,
     optim(a,f,method="BFGS")
))
   user  system elapsed 
   0.11    0.00    0.11 

system.time(replicate(500,
     nlm(f,a)
))
   user  system elapsed 
   0.10    0.00    0.09 

system.time(replicate(500,
      nlm(f,a,steptol=1e-4,gradtol=1e-4)
))
   user  system elapsed 
   0.03    0.00    0.03 
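
One further option worth checking, not covered in the timings above, is optim's gr argument. When no gradient is supplied, method="BFGS" approximates it by finite differences, which costs extra evaluations of the objective. The sketch below reuses the toy objective from the timings and adds an analytic gradient. The names f_grad, fit_numeric and fit_analytic and the starting values are illustrative assumptions, and any speedup on a real likelihood would need to be measured rather than assumed.

```r
# Same toy objective as in the timings above: minimised at x = 1, 2, ..., length(x)
f <- function(x) sum((x - seq_along(x))^2)

# Analytic gradient of f (added for illustration; not part of the original answer)
f_grad <- function(x) 2 * (x - seq_along(x))

# Hypothetical starting values, chosen away from the optimum
a <- c(5, 4, 3, 2, 1)

# BFGS with the default finite-difference gradient
fit_numeric <- optim(a, f, method = "BFGS")

# BFGS with the user-supplied analytic gradient
fit_analytic <- optim(a, f, gr = f_grad, method = "BFGS")

# Both fits should land on the same minimiser; counts reports how many
# function and gradient evaluations optim made
print(fit_numeric$par)
print(fit_analytic$par)
print(fit_analytic$counts)
```

For a hand-written likelihood such as the multinomial logit in the question, the same pattern would apply: pass the negative log-likelihood as fn and its analytic derivative as gr, then time both variants with system.time() on the real data.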
