How to speed up an R for loop?

Problem description

I am running the following for loop over the gwr.basic function from the GWmodel package in R. What I need to do is collect the mean of an estimated parameter for each given bandwidth.

The code looks like this:

library(GWmodel)
data("DubVoter")
#Dub.voter


LARentMean = list()
for (i in 20:21)
{
gwr.res <- gwr.basic(GenEl2004 ~ DiffAdd + LARent + SC1 + Unempl + LowEduc + Age18_24 + Age25_44 + Age45_64, data = Dub.voter, bw = i,  kernel = "bisquare", adaptive = TRUE, F123.test = TRUE)
a <- mean(gwr.res$SDF$LARent)
LARentMean[i] <- a
}
outcome = unlist(LARentMean)

> outcome
[1] -0.1117668 -0.1099969

However, it is terribly slow at returning the result. I need a much wider range, such as 20:200. Is there a way to speed the process up? If not, how can I use a stepped range, say 20 to 200 in steps of 5, to reduce the number of operations?

I am a Python user new to R. I read on SO that R is well known for being slow at for loops and that there are more efficient alternatives. More clarity on this point would be welcome.

Solution

I got the same impression as @musically_ut. The for loop and the traditional for-vs.-apply debate are unlikely to help you here. Try parallelization if you have more than one core. There are several packages for this, such as parallel or snowfall. Which package is ultimately the best and fastest depends on your machine and operating system.

Best does not always equal fastest here. Code that works cross-platform can be worth more than a bit of extra performance. Transparency and ease of use can also outweigh maximum speed. That being said, I like the standard solution a lot and would recommend parallel, which ships with R and works on Windows, OSX and Linux.
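To illustrate the for-vs.-apply point, here is a minimal sketch. It assumes a hypothetical stand-in function, fit_mean, in place of the real gwr.basic call, so it runs without GWmodel. Both forms call the expensive function once per bandwidth and return the same values, so switching between them should not change the run time much when the model fit dominates.

```r
# Hypothetical stand-in for one expensive model fit (NOT part of GWmodel).
# In the real analysis this would be mean(gwr.basic(...)$SDF$LARent).
fit_mean <- function(bw) {
  Sys.sleep(0.01)  # pretend the fit takes a while
  -1 / bw          # dummy value in place of the real mean
}

bws <- 20:24

# for loop with a preallocated numeric vector
loop_res <- numeric(length(bws))
for (k in seq_along(bws)) {
  loop_res[k] <- fit_mean(bws[k])
}

# the same computation written with vapply
apply_res <- vapply(bws, fit_mean, numeric(1))

# both approaches produce the same vector
identical(loop_res, apply_res)
```

Indexing the result by position (k) rather than by the bandwidth value also avoids the empty leading slots that LARentMean[i] creates when i starts at 20.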

EDIT: here's a fully reproducible example based on the OP's code.

library(GWmodel)
data("DubVoter")

library(parallel)

bwlist <- list(bw1 = 20, bw2 = 21)


cl <- makeCluster(detectCores())

# load 'GWmodel' for each node
clusterEvalQ(cl, library(GWmodel))

# export data to each node
clusterExport(cl, varlist = c("bwlist","Dub.voter"))

# fit one model per bandwidth, in parallel; try() keeps one failed fit
# from aborting the whole run
out <- parLapply(cl, bwlist, function(e) {
  try(gwr.basic(GenEl2004 ~ DiffAdd + LARent + SC1 +
                  Unempl + LowEduc + Age18_24 + Age25_44 + Age45_64,
                data = Dub.voter, bw = e, kernel = "bisquare",
                adaptive = TRUE, F123.test = TRUE))
})


LArent_l <- lapply(lapply(out,"[[","SDF"),"[[","LARent")
unlist(lapply(LArent_l,"mean"))

# finally, stop the cluster
stopCluster(cl)
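For the stepped range the question asks about (20 to 200 in steps of 5), seq() from base R builds the bandwidth vector, and it can be passed to parLapply in the same way as bwlist. The sketch below is only an illustration: fit_mean is a hypothetical stand-in for the gwr.basic call, and the cluster size of 2 is an arbitrary choice rather than a recommendation.

```r
library(parallel)

# Stepped bandwidths: 20, 25, ..., 200 (37 values instead of the 181 in 20:200)
bws <- seq(20, 200, by = 5)

# Hypothetical stand-in for the expensive fit (NOT part of GWmodel).
# In the real analysis this would wrap gwr.basic(..., bw = bw) and
# return mean(res$SDF$LARent).
fit_mean <- function(bw) -1 / bw

# Small cluster for illustration; detectCores() could be used instead
cl <- makeCluster(2)
res <- parLapply(cl, bws, fit_mean)
stopCluster(cl)

# Collapse to a numeric vector and label each value with its bandwidth
outcome <- unlist(res)
names(outcome) <- bws
```

With the real model, the worker setup from the example above (clusterEvalQ to load GWmodel and clusterExport for Dub.voter) would still be needed before calling parLapply.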
