Problems using foreach parallelization
I'm trying to compare parallelization options. Specifically, I'm comparing the standard SNOW
and multicore
implementations to those using doSNOW
or doMC
and foreach
. As a sample problem, I'm illustrating the central limit theorem by computing the means of samples drawn from a standard normal distribution many times. Here's the standard code:
CltSim <- function(nSims=1000, size=100, mu=0, sigma=1){
sapply(1:nSims, function(x){
mean(rnorm(n=size, mean=mu, sd=sigma))
})
}
Here's the SNOW
implementation:
library(snow)
cl <- makeCluster(2)
ParCltSim <- function(cluster, nSims=1000, size=100, mu=0, sigma=1){
parSapply(cluster, 1:nSims, function(x){
mean(rnorm(n=size, mean=mu, sd=sigma))
})
}
Next, the doSNOW
method:
library(foreach)
library(doSNOW)
registerDoSNOW(cl)
FECltSim <- function(nSims=1000, size=100, mu=0, sigma=1) {
x <- numeric(nSims)
foreach(i=1:nSims, .combine=cbind) %dopar% {
x[i] <- mean(rnorm(n=size, mean=mu, sd=sigma))
}
}
I get the following results:
> system.time(CltSim(nSims=10000, size=100))
user system elapsed
0.476 0.008 0.484
> system.time(ParCltSim(cluster=cl, nSims=10000, size=100))
user system elapsed
0.028 0.004 0.375
> system.time(FECltSim(nSims=10000, size=100))
user system elapsed
8.865 0.408 11.309
The SNOW
implementation shaves off about 23% of computing time relative to an unparallelized run (the time savings get bigger as the number of simulations increases, as we would expect). The foreach
attempt actually increases run time by a factor of 20. Additionally, if I change %dopar%
to %do%
and check the unparallelized version of the loop, it takes over 7 seconds.
Additionally, we can consider the multicore
package. The simulation written for multicore
is
library(multicore)
MCCltSim <- function(nSims=1000, size=100, mu=0, sigma=1){
unlist(mclapply(1:nSims, function(x){
mean(rnorm(n=size, mean=mu, sd=sigma))
}))
}
We get an even better speed improvement than SNOW
:
> system.time(MCCltSim(nSims=10000, size=100))
user system elapsed
0.924 0.032 0.307
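As an aside not in the original post: the multicore package has since been retired, and its mclapply() now lives in base R's parallel package with the same interface. The worker count can be set explicitly via mc.cores; the function name MCCltSim2 below is mine, for illustration.

```r
library(parallel)  # mclapply moved here from the retired multicore package

MCCltSim2 <- function(nSims = 1000, size = 100, mu = 0, sigma = 1, cores = 2) {
  # same simulation as above, with an explicit worker count
  unlist(mclapply(1:nSims, function(x) {
    mean(rnorm(n = size, mean = mu, sd = sigma))
  }, mc.cores = cores))
}
```

Note that mclapply relies on forking, so on Windows mc.cores must be 1.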
Starting a new R session, we can attempt the foreach
implementation using doMC
instead of doSNOW
, calling
library(doMC)
registerDoMC()
then running FECltSim()
as above, still finding
> system.time(FECltSim(nSims=10000, size=100))
user system elapsed
6.800 0.024 6.887
This is "only" a 14-fold increase over the non-parallelized runtime.
Conclusion: My foreach
code is not running efficiently under either doSNOW
or doMC
. Any idea why?
Thanks, Charlie
To start with, you could write your foreach code a bit more concisely:
FECltSim <- function(nSims=1000, size=100, mu=0, sigma=1) {
foreach(i=1:nSims, .combine=c) %dopar% {
mean(rnorm(n=size, mean=mu, sd=sigma))
}
}
This gives you a vector; there is no need to build it explicitly within the loop. There is also no need to use cbind, as your result is just a single number each time, so .combine=c
will do.
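As a minimal illustration of the combiner's effect (not from the original answer; run with %do% so no parallel backend is needed):

```r
library(foreach)

# .combine = c concatenates the per-iteration results into a plain vector,
# while .combine = cbind binds them column-wise into a 1-row matrix.
v <- foreach(i = 1:3, .combine = c) %do% i      # a plain vector
m <- foreach(i = 1:3, .combine = cbind) %do% i  # a 1x3 matrix
```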
The thing with foreach is that it creates quite a lot of overhead to communicate between the cores and to fit the results from the different cores together. A quick look at the profile shows this pretty clearly:
$by.self
self.time self.pct total.time total.pct
$ 5.46 41.30 5.46 41.30
$<- 0.76 5.75 0.76 5.75
.Call 0.76 5.75 0.76 5.75
...
More than 40% of the time it is busy selecting things. It also uses a lot of other functions for the whole operation. Actually, foreach
is only advisable if you have relatively few rounds through very time-consuming functions.
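One common remedy, sketched here as an illustration (the helper name and the hand-rolled chunking scheme are assumptions, not from the original answer), is to hand each parallel task a whole block of simulations, so the per-iteration overhead is paid once per chunk rather than once per simulation:

```r
library(foreach)

ChunkedFECltSim <- function(nSims = 1000, size = 100, mu = 0, sigma = 1,
                            nChunks = 4) {
  # split nSims into roughly equal chunk sizes
  chunkSizes <- rep(nSims %/% nChunks, nChunks)
  chunkSizes[1] <- chunkSizes[1] + nSims %% nChunks  # absorb the remainder
  foreach(m = chunkSizes, .combine = c) %dopar% {
    # each task computes a whole block of sample means sequentially
    sapply(seq_len(m), function(x) mean(rnorm(n = size, mean = mu, sd = sigma)))
  }
}
```

If no backend is registered, %dopar% emits a warning and falls back to running sequentially, so the sketch returns the same result either way.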
The other two solutions are built on a different technology, and do far less in R. On a side note, snow
was actually initially developed to work on clusters rather than on single workstations, which is what multicore
is designed for.