Parallelization in R: %dopar% vs %do%. Why does using a single core yield better performance?
Problem description
I'm experiencing weird behaviour on my computer when distributing work across its cores using doMC and foreach. Does anyone know why I get better performance with a single core than with 2 cores? As you can see, running the same code without registering any cores (which supposedly uses only 1 core) is much more time-efficient. While %do% seems to perform better than %dopar%, registering 2 of the 4 cores makes the run far more time-consuming.
require(foreach)
require(doMC)

# 1-core
> system.time(m <- foreach(i=1:100) %dopar%
+   matrix(rnorm(1000*1000), ncol=5000))
   user  system elapsed
  9.285   1.895  11.083
> system.time(m <- foreach(i=1:100) %do%
+   matrix(rnorm(1000*1000), ncol=5000))
   user  system elapsed
  9.139   1.879  10.979

# 2-core
> registerDoMC(cores=2)
> system.time(m <- foreach(i=1:100) %dopar%
+   matrix(rnorm(1000*1000), ncol=5000))
   user  system elapsed
  3.322   3.737 132.027
> system.time(m <- foreach(i=1:100) %do%
+   matrix(rnorm(1000*1000), ncol=5000))
   user  system elapsed
  9.744   2.054  11.740
Using 4 cores, a few trials yield very different outcomes:
> registerDoMC(cores=4)
> system.time(m <- foreach(i=1:100) %dopar%
+   { matrix(rnorm(1000*1000), ncol=5000) })
   user  system elapsed
 11.522   4.082  24.444
> system.time(m <- foreach(i=1:100) %dopar%
+   { matrix(rnorm(1000*1000), ncol=5000) })
   user  system elapsed
 21.388   6.299  25.437
> system.time(m <- foreach(i=1:100) %dopar%
+   { matrix(rnorm(1000*1000), ncol=5000) })
   user  system elapsed
 17.439   5.250   9.300
> system.time(m <- foreach(i=1:100) %dopar%
+   { matrix(rnorm(1000*1000), ncol=5000) })
   user  system elapsed
 17.480   5.264   9.170
Solution

It's the combining of the results that eats all the processing time. These are the timings on my machine for the cores=2 scenario if no results are returned. It's essentially the same code, only the created matrices are discarded instead of being returned:

> system.time(m <- foreach(i=1:100) %do%
+   { matrix(rnorm(1000*1000), ncol=5000); NULL })
   user  system elapsed
 13.793   0.376  14.197
> system.time(m <- foreach(i=1:100) %dopar%
+   { matrix(rnorm(1000*1000), ncol=5000); NULL })
   user  system elapsed
  8.057   5.236   9.970
Still not optimal, but at least the parallel version is now faster.
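To get a feel for how much data the parent process has to collect in the original version, consider the size of a single iteration's result. A quick sketch in plain base R (the ~7.6 MB figure assumes the usual 8 bytes per double):

```r
# Each iteration of the benchmark returns a matrix of 1000*1000 doubles.
# At 8 bytes per double that is roughly 7.6 MB per iteration, so 100
# iterations ship on the order of 760 MB from the workers to the parent.
m <- matrix(rnorm(1000 * 1000), ncol = 5000)
print(object.size(m), units = "MB")
```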
This is from the documentation of doMC:

    The doMC package provides a parallel backend for the foreach/%dopar% function using the multicore functionality of the parallel package.

Now, parallel uses a fork mechanism to spawn identical copies of the R process. Collecting results from separate processes is an expensive task, and this is what you see in your time measurements.
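One way to keep the parallel speedup is therefore to shrink what each worker sends back, for example by returning a small summary instead of the full matrix. Here is a minimal sketch using parallel::mclapply, the fork-based machinery that doMC builds on; the choice of colMeans as the summary is just an illustration, and since forking is unavailable on Windows the sketch falls back to one core there:

```r
library(parallel)

# Fork-based workers, as used by doMC; forking is unavailable on
# Windows, so fall back to a single core there.
n.cores <- if (.Platform$OS.type == "unix") 2L else 1L

# Each worker returns only 1000 column means instead of the full
# 1000x1000 matrix, so very little data crosses the process boundary.
res <- mclapply(1:10, function(i) {
  colMeans(matrix(rnorm(1000 * 1000), ncol = 1000))
}, mc.cores = n.cores)

length(res)       # 10 results collected
length(res[[1]])  # 1000 numbers each, instead of 1,000,000
```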