R: how does a foreach loop find a function that should be invoked?

Problem description

I have problems when I use a foreach loop (using %dopar%) which invokes a self-defined function. There is not really a problem when I work with Linux, but when I use Windows the self-defined function cannot be found. It is hard to explain the problem in words, so I composed a small example to show it. Assume I have a collection of three simple functions, where FUN2 (using %do%) and FUN3 (using %dopar%) invoke the first one (FUN):

FUN <- function(x,y,z) { x + y + z }
FUN2 <- function(a, b) {
  foreach(i=1:3) %do% FUN(i, a, b)
}
FUN3 <- function(a, b) {
  foreach(i=1:3) %dopar% FUN(i, a, b)
}

The functions are stored in a script called foreach_testfunctions.R. In another script (foreach.test) I source these functions, use library(doParallel) and try to use the functions. First I do it with Linux and all works fine:

source("foreach_testfunctions.R")
a <- 2
b <- 3
library(doParallel)
registerDoParallel()

foreach(i=1:3) %do% FUN(i, a, b)    ## works fine
FUN2(a, b)                          ## works fine
foreach(i=1:3) %dopar% FUN(i, a, b) ## works fine
FUN3(a, b)                          ## works fine 

Then I do it in Windows:

source("foreach_testfunctions.R")
a <- 2
b <- 3
library(doParallel)
cl <- makeCluster(3)
registerDoParallel(cl)

foreach(i=1:3) %do% FUN(i, a, b)    ## works fine
FUN2(a, b)                          ## works fine
foreach(i=1:3) %dopar% FUN(i, a, b) ## works fine
FUN3(a, b)                          ## does not work
Error in FUN(i, a, b) : task 1 failed - "Could not find function "FUN""

Conclusion: (1) No problems with %do%. (2) Problems with %dopar% when using Windows. I tried inserting the line clusterExport(cl, varlist=c("FUN", "a", "b"), env=environment()) before the line that invokes FUN3 to make sure that the function FUN and the variables a and b are found in the proper environment, but the error remains.

My questions: Why does Windows behave differently from Linux although the code is identical (apart from the different registerDoParallel syntax)? How can I make sure that Windows finds the function FUN when it is invoked via FUN3?

Solution

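They behave differently because registerDoParallel(), called without a cluster object, registers an mclapply backend on Linux, while registerDoParallel(cl) registers a clusterApplyLB backend, which is what you get on Windows. When using an mclapply backend, there are essentially no data exporting issues, because the workers are forked copies of the master session, so it works on Linux. But with clusterApplyLB, you can run into problems if foreach doesn't auto-export the functions and data that are needed. As far as I know, that is what happens inside FUN3: foreach only auto-exports variables defined in the environment that calls it, and there that environment is FUN3's own frame, where FUN is not defined.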
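
To see which backend is actually in use, you can ask foreach after registering. This is a minimal sketch, assuming the doParallel package is installed; the worker count of 2 is an arbitrary choice for illustration. An explicit cluster is used so that it should behave the same way on Linux and Windows.

```r
library(doParallel)  # also attaches foreach and parallel

# An explicit socket cluster gives the clusterApplyLB-style backend
# on every platform, not only on Windows.
cl <- makeCluster(2)
registerDoParallel(cl)

backend <- getDoParName()     # name of the currently registered backend
workers <- getDoParWorkers()  # number of workers foreach will use
print(backend)
print(workers)

# Clean up: stop the workers and fall back to the sequential backend.
stopCluster(cl)
registerDoSEQ()
```

Running the same check after a bare registerDoParallel() on Linux should report a different backend name, which is a quick way to confirm the platform difference described above.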

You can solve this problem by modifying FUN3 to export FUN via the .export option:

FUN3 <- function(a, b) {
  foreach(i=1:3, .export='FUN') %dopar% FUN(i, a, b)
}

This solution works on both Linux and Windows, since .export is ignored by the mclapply backend.
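
For reference, here is a self-contained sketch of the fix that can be run as a single script. It assumes doParallel is installed and uses an explicit two-worker cluster (an arbitrary size) so that the cluster backend is exercised on Linux as well; the values 2 and 3 for a and b are taken from the question.

```r
library(doParallel)  # also attaches foreach and parallel

FUN <- function(x, y, z) { x + y + z }

# FUN lives in the global environment, not in FUN3's own frame,
# so it is named explicitly in .export. The arguments a and b are
# local to FUN3 and are picked up by foreach on its own.
FUN3 <- function(a, b) {
  foreach(i = 1:3, .export = "FUN") %dopar% FUN(i, a, b)
}

cl <- makeCluster(2)
registerDoParallel(cl)

result <- FUN3(2, 3)  # list of FUN(1, 2, 3), FUN(2, 2, 3), FUN(3, 2, 3)
str(result)

# Clean up: stop the workers and fall back to the sequential backend.
stopCluster(cl)
registerDoSEQ()
```

If FUN called further helper functions of its own, those names would presumably need to be listed in .export too.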

As pointed out by Hong Ooi, you have an error in your use of clusterExport, but I wouldn't use clusterExport to solve the problem since it is backend-specific.
