R mclapply vs foreach


Question

I use mclapply for all my "embarrassingly parallel" computations. I find it clean and easy to use, and with the arguments mc.cores = 1 and mc.preschedule = TRUE I can insert browser() in the function passed to mclapply and debug line by line, just like in regular R. This is a huge help in getting code to production quicker.
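The debugging workflow described above can be sketched as follows. The names slow_square and debug_mode are hypothetical, introduced only for illustration. As far as I know, with mc.cores = 1 and mc.preschedule = TRUE, mclapply falls back to a plain lapply in the current session, which would explain why browser() behaves normally there; treat that as an assumption and check the documentation for your R version.

```r
library(parallel)

# Hypothetical worker function; browser() only fires when debug = TRUE
slow_square <- function(x, debug = FALSE) {
  if (debug) browser()  # pauses here in an interactive, serial run
  x^2
}

# Flip to TRUE in an interactive session to step through slow_square()
debug_mode <- FALSE

# One core while debugging (and on Windows, where forking is unavailable),
# otherwise two forked workers
n_cores <- if (debug_mode || .Platform$OS.type == "windows") 1L else 2L

res <- mclapply(1:4, slow_square, debug = debug_mode,
                mc.cores = n_cores, mc.preschedule = TRUE)
```

Switching between the serial debug run and the parallel run only changes debug_mode, so the function body stays the same in both cases.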

What does foreach offer that mclapply does not? Is there a reason I should consider writing foreach code going forward?

If I understand correctly, both can use the multicore approach to parallel computation (which permits forking), and I like to use it for performance reasons.

I have seen foreach used in various packages and have read the basics of it, but frankly I don't find it as easy to use. I also cannot figure out how to get browser() to work inside foreach calls. (Yes, I have read the thread "browser mode with foreach %dopar%", but it didn't help me get the browser to work properly.)

Recommended answer

The problem is almost the same as the one described in "Understanding the differences between mclapply and parLapply in R".

mclapply forks the master process: each worker process is a clone of the master as it was at the point mclapply was called, so every worker sees the same loaded packages and objects, and reproducibility is guaranteed. Unfortunately, forking is not possible on Windows, where foreach and parLapply always use multisession parallelism instead of multicore.

When using parLapply or foreach with %dopar%, you generally have to perform the following additional steps: create a PSOCK cluster; register the cluster if desired; load the necessary packages on the cluster workers; and export the necessary data and functions to the global environment of the cluster workers.
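The steps above can be sketched with the base parallel package. This is a minimal illustration, not a recommended template: scale_factor and rescale are hypothetical names, and tools merely stands in for whatever package the workers actually need.

```r
library(parallel)

# 1. Create a PSOCK cluster: two fresh R sessions, not forks of this one
cl <- makeCluster(2)

# 2. Registration is only needed for foreach, e.g. with the doParallel
#    package: doParallel::registerDoParallel(cl). parLapply takes cl directly.

# 3. Load the necessary packages on every worker
invisible(clusterEvalQ(cl, library(tools)))

# 4. Export data and functions to the workers' global environments
scale_factor <- 3
rescale <- function(x) x * scale_factor
clusterExport(cl, c("scale_factor", "rescale"), envir = environment())

res <- parLapply(cl, 1:4, function(i) rescale(i))

# Always shut the worker sessions down when finished
stopCluster(cl)
```

With mclapply on a forking platform, steps 1 to 4 would be unnecessary, because the forked workers already contain scale_factor, rescale, and every attached package.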

That is why foreach has parameters such as .packages and .export: they let us distribute everything the workers need across sessions.
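A minimal sketch of these two parameters, assuming the foreach and doParallel packages are installed. The names suffix, tag_name, and files are hypothetical. The loop body only mentions tag_name, so suffix (used inside that function) and the tools package (which provides file_path_sans_ext) are passed to the workers explicitly.

```r
library(foreach)
library(doParallel)

cl <- parallel::makeCluster(2)  # PSOCK cluster of fresh R sessions
registerDoParallel(cl)          # tell %dopar% to use this cluster

suffix <- "_done"               # referenced only inside tag_name()
tag_name <- function(f) paste0(file_path_sans_ext(f), suffix)
files <- c("a.csv", "b.txt", "c.rds")

res <- foreach(f = files, .combine = c,
               .packages = "tools",   # attached on each worker
               .export = "suffix"     # not visible in the loop body itself
               ) %dopar% {
  tag_name(f)
}

parallel::stopCluster(cl)
```

Variables that appear directly in the loop body are normally exported automatically, so .export tends to matter for objects that are only reached indirectly, as suffix is here.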

The future package's overview vignette gives details on the differences between multicore and multisession processing: https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html
