Is there any way to break out of a foreach loop?


Problem description

    I am using the R package foreach() with %dopar% to do long (~days) calculations in parallel. I would like the ability to stop the entire set of calculations in the event that one of them produces an error. However, I have not found a way to achieve this, and from the documentation and various forums I have found no indication that this is possible. In particular, break() does not work and stop() only stops the current calculation, not the whole foreach loop.

    Note that I cannot use a simple for loop, because ultimately I want to parallelize this using the doRNG package.

    Here is a simplified, reproducible version of what I am attempting (shown here in serial with %do%, but I have the same problem when using doRNG and %dopar%). Note that in reality I want to run all of the elements of this loop (here 10) in parallel.

    library(foreach)
    myfunc <- function() {
      x <- foreach(k = 1:10, .combine="cbind", .errorhandling="stop") %do% {
        cat("Element ", k, "\n")
        Sys.sleep(0.5) # just to show that stop does not cause exit from foreach
        if(is.element(k, 2:6)) {
          cat("Should stop\n")
          stop("Has stopped")
        }
        k
      }
      return(x)
    }
    x <- myfunc()
    # stop() halts the processing of k=2:6, but it does not stop the foreach loop itself.
    # x is not returned. The execution produces the error message
    # Error in { : task 2 failed - "Has stopped"
    

    What I would like to achieve is that the entire foreach loop can be exited immediately upon some condition (here, when the stop() is encountered).

    I have found no way to achieve this with foreach. It seems that I would need a way to send a message to all the other processes to make them stop too.

    If not possible with foreach, does anyone know of alternatives? I have also tried to achieve this with parallel::mclapply, but that does not work either.

    > sessionInfo()
    R version 3.0.0 (2013-04-03)
    Platform: x86_64-apple-darwin10.8.0 (64-bit)
    
    locale:
    [1] C/UTF-8/C/C/C/C
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods base
    
    other attached packages:
    [1] foreach_1.4.0
    
    loaded via a namespace (and not attached):
    [1] codetools_0.2-8 compiler_3.0.0  iterators_1.0.6
    

    Solution

    It sounds like you want an impatient version of the "stop" error handling. You could implement that by writing a custom combine function, and arranging for foreach to call it as soon as each result is returned. To do that you need to:

    • Use a backend that supports calling combine on-the-fly, like doMPI or doRedis
    • Don't enable .multicombine
    • Set .inorder to FALSE
    • Set .init to something (like NULL)

    Here's an example that does that:

    library(foreach)
    parfun <- function(errval, n) {
      abortable <- function(errfun) {
        comb <- function(x, y) {
          if (inherits(y, 'error')) {
            warning('This will leave your parallel backend in an inconsistent state')
            errfun(y)
          }
          c(x, y)
        }
        foreach(i=seq_len(n), .errorhandling='pass', .export='errval',
                .combine='comb', .inorder=FALSE, .init=NULL) %dopar% {
          if (i == errval)
            stop('testing abort')
          Sys.sleep(10)
          i
        }
      }
      callCC(abortable)
    }
    

    Note that I also set the error handling to "pass" so that foreach will call the combine function with an error object. The callCC function is used to return from the foreach loop regardless of the error handling used within foreach and the backend. In this case, callCC will call the abortable function, passing it a function object that is used to force callCC to return immediately. By calling that function from the combine function, we can escape from the foreach loop when we detect an error object, and have callCC return that object. See ?callCC for more information.
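    The escape mechanism can be tried in isolation, without foreach or any backend. The following is only a minimal base-R sketch under stated assumptions: Reduce stands in for foreach calling the combine function one result at a time, tryCatch stands in for the "pass" error handling, and the names task, run_task and abort_on_error are hypothetical helpers invented for this illustration.

```r
# Stand-in for a foreach task: fails for one index, like the loop body above.
task <- function(i, errval) {
  if (i == errval) stop("testing abort")
  i
}

# Assumption: mimic the "pass" error handling by returning the error object
# instead of signalling it.
run_task <- function(i, errval) {
  tryCatch(task(i, errval), error = function(e) e)
}

abort_on_error <- function(errval, n) {
  callCC(function(errfun) {
    comb <- function(x, i) {
      y <- run_task(i, errval)
      # Calling errfun makes callCC return y immediately, abandoning Reduce.
      if (inherits(y, "error")) errfun(y)
      c(x, y)
    }
    # Reduce calls comb once per element, starting from NULL (like .init).
    Reduce(comb, seq_len(n), NULL)
  })
}

r <- abort_on_error(3, 5)   # the third task fails, so the rest never run
ok <- abort_on_error(0, 5)  # no task fails, so all results are combined
```

    Here r is the error object raised by the third task, while ok holds all five results. Unlike the parallel case, nothing is left running after the escape, so no worker cleanup is needed in this sketch.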

    You can actually use parfun without a parallel backend registered and verify that the foreach loop "breaks" as soon as it executes a task that throws an error, but that could take a while since the tasks are executed sequentially. For example, this takes 20 seconds to execute if no backend is registered:

    print(system.time(parfun(3, 4)))
    

    When executing parfun in parallel, we need to do more than simply break out of the foreach loop: we also need to stop the workers, otherwise they will continue to compute their assigned tasks. With doMPI, the workers can be stopped using mpi.abort:

    library(doMPI)
    cl <- startMPIcluster()
    registerDoMPI(cl)
    r <- parfun(getDoParWorkers(), getDoParWorkers())
    if (inherits(r, 'error')) {
      cat(sprintf('Caught error: %s\n', conditionMessage(r)))
      mpi.abort(cl$comm)
    }
    

    Note that the cluster object can't be used after the loop aborts, because things weren't properly cleaned up, which is why the normal "stop" error handling doesn't work this way.
