How to speed up an R for loop?
Problem description
I am running the following for loop for the gwr.basic function in the GWmodel package in R. What I need to do is to collect the mean of the estimated parameter for any given bandwidth.
The code looks like:
library(GWmodel)
data("DubVoter")
#Dub.voter
LARentMean = list()
for (i in 20:21)
{
  # fit a GWR model at bandwidth i and keep the mean LARent estimate
  gwr.res <- gwr.basic(GenEl2004 ~ DiffAdd + LARent + SC1 + Unempl + LowEduc + Age18_24 + Age25_44 + Age45_64, data = Dub.voter, bw = i, kernel = "bisquare", adaptive = TRUE, F123.test = TRUE)
  a <- mean(gwr.res$SDF$LARent)
  LARentMean[i] <- a
}
outcome = unlist(LARentMean)
> outcome
[1] -0.1117668 -0.1099969
However, it is terribly slow at returning the result. I need a much wider range, such as 20:200. Is there a way to speed the process up? If not, how can I use a stepped range, say 20 to 200 in steps of 5, to reduce the number of operations?
I am a Python user new to R. I read on SO that R is well known for being slow at for loops and that there are more efficient alternatives. More clarity on this point would be welcomed.
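As a side note on the stepped range: R can express it directly with seq(20, 200, by = 5), and the loop above can equally be written apply-style over that sequence. A minimal sketch using the question's own model follows; on its own it is unlikely to be much faster, because each gwr.basic fit still dominates the runtime (as the answer below points out).

# Sketch: the same computation, written with sapply over a stepped bandwidth range
library(GWmodel)
data("DubVoter")

bws <- seq(20, 200, by = 5)   # 20, 25, ..., 200
LARentMean <- sapply(bws, function(bw) {
  gwr.res <- gwr.basic(GenEl2004 ~ DiffAdd + LARent + SC1 + Unempl + LowEduc +
                         Age18_24 + Age25_44 + Age45_64,
                       data = Dub.voter, bw = bw, kernel = "bisquare",
                       adaptive = TRUE, F123.test = TRUE)
  mean(gwr.res$SDF$LARent)
})
names(LARentMean) <- bws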
Solution
I got the same impression as @musically_ut. The for loop and the traditional for-vs.-apply debate are unlikely to help you here. Try to go for parallelization if you have more than one core. There are several packages for this, such as parallel or snowfall. Which package is ultimately the best and fastest depends on your machine and operating system.
Best does not always equal fastest here. Code that works cross-platform can be worth more than a bit of extra performance, and transparency and ease of use can also outweigh maximum speed. That being said, I like the standard solution a lot and would recommend parallel, which ships with R and works on Windows, OSX and Linux.
EDIT: here's a fully reproducible example using the OP's example.
library(GWmodel)
data("DubVoter")
library(parallel)

bwlist <- list(bw1 = 20, bw2 = 21)
cl <- makeCluster(detectCores())
# load 'GWmodel' for each node
clusterEvalQ(cl, library(GWmodel))
# export data to each node
clusterExport(cl, varlist = c("bwlist", "Dub.voter"))

out <- parLapply(cl, bwlist, function(e) {
  try(gwr.basic(GenEl2004 ~ DiffAdd + LARent + SC1 +
                  Unempl + LowEduc + Age18_24 + Age25_44 +
                  Age45_64, data = Dub.voter,
                bw = e, kernel = "bisquare",
                adaptive = TRUE, F123.test = TRUE))
})

# pull the LARent estimates out of each fitted model and average them
LArent_l <- lapply(lapply(out, "[[", "SDF"), "[[", "LARent")
unlist(lapply(LArent_l, mean))

# finally, stop the cluster
stopCluster(cl)