Run a for loop in parallel in R
I have a for loop that is something like this:
for (i in 1:150000) {
  tempMatrix = {}
  tempMatrix = functionThatDoesSomething() # calling a function
  finalMatrix = cbind(finalMatrix, tempMatrix)
}
Could you tell me how to make this parallel ?
I tried this based on an example online, but am not sure if the syntax is correct. It also didn't increase the speed much.
finalMatrix = foreach(i=1:150000, .combine=cbind) %dopar% {
  tempMatrix = {}
  tempMatrix = functionThatDoesSomething() # calling a function
  cbind(finalMatrix, tempMatrix)
}
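One thing worth checking with an attempt like this is whether a parallel backend is registered at all. Without one, foreach runs %dopar% sequentially and only prints a warning, so no speed-up is possible. A quick check, assuming the foreach and doParallel packages are installed (the 2-worker cluster is an arbitrary choice for the demo):

```r
library(foreach)
library(doParallel)

# Has any backend been registered for %dopar% yet?
registeredBefore <- getDoParRegistered()  # FALSE in a fresh session

cl <- makeCluster(2)  # assumption: 2 workers, just for the demo
registerDoParallel(cl)

registeredAfter <- getDoParRegistered()   # TRUE once a backend is registered
workers <- getDoParWorkers()              # number of workers %dopar% will use

stopCluster(cl)
```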
Thanks for your feedback. I did look up parallel after I posted this question.
Finally, after a few tries, I got it running. I have added the code below in case it is useful to others:
library(foreach)
library(doParallel)

# set up the parallel backend to use multiple cores
cores <- detectCores()
cl <- makeCluster(cores[1] - 1) # leave one core free so as not to overload your computer
registerDoParallel(cl)

finalMatrix <- foreach(i = 1:150000, .combine = cbind) %dopar% {
  tempMatrix <- functionThatDoesSomething() # calling a function
  # do other things if you want
  tempMatrix # equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
}

# stop the cluster
stopCluster(cl)
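For anyone who wants to try the pattern end to end, here is a minimal self-contained sketch. The body of functionThatDoesSomething is a made-up stand-in (it returns a 3 x 1 matrix filled with i), the loop is shortened to 8 iterations, and the cluster is fixed at 2 workers. All three are assumptions for illustration, not part of the code above.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in for the real function: returns a 3 x 1 matrix.
functionThatDoesSomething <- function(i) {
  matrix(i, nrow = 3, ncol = 1)
}

cl <- makeCluster(2)  # assumption: 2 workers, just for the demo
registerDoParallel(cl)

finalMatrix <- foreach(i = 1:8, .combine = cbind) %dopar% {
  functionThatDoesSomething(i)  # the last value of the block is what gets combined
}

stopCluster(cl)

dim(finalMatrix)  # 3 rows, 8 columns
```

The loop body never touches finalMatrix itself: each iteration just returns its piece, and .combine does the binding.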
Note - if you allocate too many processes, you may get this error: Error in serialize(data, node$con) : error writing to connection
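If the error does come from starting too many processes, one option is to cap the worker count explicitly instead of taking every core but one. A rough sketch using only the parallel package; the cap of 4 is an arbitrary number picked for illustration, and the right value will depend on the machine and on how much data each worker receives.

```r
library(parallel)

available <- detectCores()  # may be NA if the core count cannot be determined
if (is.na(available)) available <- 1L

max_workers <- 4L  # assumption: arbitrary cap for illustration
n_workers <- max(1L, min(available - 1L, max_workers))  # always at least 1

cl <- makeCluster(n_workers)
# registerDoParallel(cl) and the foreach loop would go here
stopCluster(cl)
```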
Note - if .combine in the foreach statement is rbind, the final object is built by appending the output of each iteration row-wise instead of column-wise.
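To make the difference concrete, here is a small sketch that runs the same loop body with both settings. It uses %do% (sequential) so it runs without a cluster, and the 1 x 2 matrix returned per iteration is made up for illustration.

```r
library(foreach)

# Each iteration returns a made-up 1 x 2 matrix: (i, i^2).
byRow <- foreach(i = 1:3, .combine = rbind) %do% {
  matrix(c(i, i^2), nrow = 1)
}

byCol <- foreach(i = 1:3, .combine = cbind) %do% {
  matrix(c(i, i^2), nrow = 1)
}

dim(byRow)  # 3 2: one row per iteration
dim(byCol)  # 1 6: outputs placed side by side
```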
Hope this is useful for folks trying out parallel processing in R for the first time like me.
References: http://www.r-bloggers.com/parallel-r-loops-for-windows-and-linux/ https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/