TryCatch with parLapply (Parallel package) in R


Question

I am trying to run something on a very large dataset. Basically, I want to loop through all the files in a folder and run the function fromJSON on each one, skipping any file that produces an error. I have built a function using tryCatch, but it only works when I use lapply and not parLapply.

Here is the code for my exception-handling function:

readJson <- function(file) {
  require(jsonlite)
  dat <- tryCatch(
    {
      fromJSON(file, flatten = TRUE)
    },
    error = function(cond) {
      message(cond)
      return(NA)
    },
    warning = function(cond) {
      message(cond)
      return(NULL)
    }
  )
  return(dat)
}

I then call parLapply on a character vector files, which contains the full paths to the JSON files:

dat <- parLapply(cl, files, readJson)

This produces an error when it reaches a file that does not end properly, and the list dat is never created. Skipping the problematic file is exactly what the readJson function was supposed to take care of.

With regular lapply, however, it works perfectly fine: the errors are reported, but the list is still created and the erroneous files are skipped.

Any ideas on how I could use exception handling with parLapply so that it skips the problematic files and still generates the list?

Recommended answer

In your error handler function, cond is an error condition. message(cond) signals this condition again, and on the workers it is caught and transmitted to the master as an error. Either remove the message calls or replace them with something like message(conditionMessage(cond)). You won't see anything on the master though, so removing them is probably best.
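The advice above could be sketched as follows. This is a minimal illustration rather than a definitive fix: it assumes the jsonlite package is installed, the name readJsonSafe and the two temporary files are made up for the example, and attaching the error text as an attribute is just one possible way to get the message back to the master, not something the answer prescribes.

```r
library(parallel)

# Hypothetical corrected reader: the handler never calls message(cond),
# so the error condition is not signalled a second time on the worker.
readJsonSafe <- function(file) {
  tryCatch(
    jsonlite::fromJSON(file, flatten = TRUE),
    error = function(cond) {
      # Carry the error text back as an attribute (an assumption of this
      # sketch) instead of printing it where the master cannot see it.
      structure(NA, error = conditionMessage(cond))
    }
  )
}

# Throwaway test data: one valid file and one truncated file.
good <- tempfile(fileext = ".json")
bad  <- tempfile(fileext = ".json")
writeLines('{"a": 1, "b": [1, 2, 3]}', good)
writeLines('{"a": 1, "b": [1, 2', bad)
files <- c(good, bad)

cl <- makeCluster(2)
dat <- parLapply(cl, files, readJsonSafe)
stopCluster(cl)

# Flag the entries that failed to parse, then keep only the good ones.
failed <- vapply(
  dat,
  function(x) !is.null(attr(x, "error", exact = TRUE)),
  logical(1)
)
parsed <- dat[!failed]
```

Here parLapply completes and dat has one entry per file; the entries flagged in failed keep the parser's message in their error attribute, so the problematic files can be reported on the master after the parallel run.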
