Function not found in R doParallel 'foreach' - Error in { : task 1 failed - "could not find function "raster""

Problem description

I am trying to use a high performance cluster at my institution for the first time and I have hit a problem that I can't resolve.

The following code returns an error:

ptime<-system.time({
  r <- foreach(z = 1:length(files),.combine=cbind) %dopar% {
    raster <- raster(paste(folder,files[1],sep=""))
    data<-getValues(raster)
    clp <- na.omit(data)
    for(i in 1:length(classes)){
      results[i,z]<-length(clp[clp==classes[i]])/length(clp)
      print(z)
    }
  }
})

Error in { : task 1 failed - "could not find function "raster""

I also tried a different foreach loop for another task I have:

r <- foreach (i=1:length(poly)) %dopar% {
  clip<-gIntersection(paths,poly[i,])
  lgth<-gLength(clip)
  vid<-poly@data[i,3]
  path.lgth[i,] <- c(vid,lgth)
  print(i)
}

and this time the gIntersection function isn't found. The packages are all installed and loaded in my session. After reading some forum posts, it seems to have something to do with the environment that the functions execute in.

Can someone please help? I'm not a programmer!

Thanks!

Update:

I have adjusted my code for the solution provided:

results<-matrix(nrow=length(classes),ncol=length(files))
dimnames(results)[[1]]<-classes
dimnames(results)[[2]]<-files

ptime<-system.time({
    foreach(z = 1:length(files),.packages="raster") %dopar% {
    raster <- raster(paste(folder,files[z],sep=""))
    data<-getValues(raster)
    clp <- na.omit(data)
    for(i in 1:length(classes)){
      results[i,z]<-length(clp[clp==classes[i]])/length(clp)
      print(z)
    }
  }
})

But what I get is an output (my results matrix) filled with NAs. As you can see, I create a matrix object called results to fill with results (which works with for loops), but after reading the documentation for foreach it seems that results are saved differently with this function.

Any advice on what I should choose for the .combine argument?

Recommended answer

Both the foreach vignette and the foreach help page point out that the .packages argument must be supplied when the parallel code uses functions from packages that are not loaded by default. So your code in the first example should be:

ptime<-system.time({
  r <- foreach(z = 1:length(files),
               .combine=cbind, 
               .packages='raster') %dopar% {
      # some code
      # and more code
  }
})


More explanation

The foreach package does a lot of setting up behind the scenes. What happens is the following (in principle, technical details are a tad more complicated):

  • foreach sets up a system of "workers" that you can see as separate R sessions, each committed to a different core in a cluster.

  • The function that needs to be carried out is loaded into each "worker" session, together with the objects needed to carry out the function.

  • Each worker calculates the result for a subset of the data.

  • The results of the calculations on the different workers are put together and reported in the "master" R session.
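One consequence of these steps is that assigning into an object such as a results matrix from inside the loop body only changes a worker's copy; the master session receives whatever each task returns. The following is a minimal sketch of that behaviour, assuming a cluster of two workers created with makeCluster and made-up vectors standing in for the raster values (classes, cells and props are illustrative names, not from the original code):

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)               # two workers; the count is arbitrary
registerDoParallel(cl)

classes <- c(1, 2, 3)              # hypothetical class codes
cells <- list(a = c(1, 1, 2, NA),  # stand-ins for the values of two rasters
              b = c(3, 3, 3, 2))

# Assigning inside the body modifies the worker's copy of 'results' only.
results <- matrix(NA_real_, nrow = length(classes), ncol = length(cells))
ignored <- foreach(z = seq_along(cells)) %dopar% {
  results[1, z] <- 1
  NULL
}
# 'results' in the master session is still all NA at this point.

# Instead, return one column per task and let .combine = cbind assemble them.
props <- foreach(z = seq_along(cells), .combine = cbind) %dopar% {
  clp <- na.omit(cells[[z]])
  sapply(classes, function(k) sum(clp == k) / length(clp))
}
dimnames(props) <- list(as.character(classes), names(cells))

stopCluster(cl)
```

With cbind as the combine function, each task contributes one column, so props ends up with one row per class and one column per input, which is the shape the results matrix in the question was meant to have.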

As the workers can be seen as separate R sessions, packages from the "master" session are not automatically loaded. You have to specify which packages should be loaded in those worker sessions, and that's what the .packages argument of foreach is used for.
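To see the effect without needing raster or any data files, here is a small sketch that uses the tools package (shipped with R but not attached by default) in place of raster. It assumes a two-worker cluster created with makeCluster; the file names are hypothetical:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

files <- c("a.tif", "b.asc")   # hypothetical file names

# Without .packages, 'tools' is not attached on the workers, so the task fails.
without <- tryCatch(
  foreach(f = files, .combine = c) %dopar% {
    file_ext(f)
  },
  error = function(e) conditionMessage(e)
)

# With .packages, each worker loads 'tools' before evaluating the body.
with_pkg <- foreach(f = files, .combine = c, .packages = "tools") %dopar% {
  file_ext(f)
}

stopCluster(cl)
```

Here without should hold an error message of the same kind as the one in the question, while with_pkg holds the file extensions.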

Note that when you use other packages (e.g. parallel or snowfall), you'll have to set up these workers explicitly, and also take care of passing objects and loading packages on the worker sessions.
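For comparison, a rough sketch of the same bookkeeping done by hand with the parallel package: clusterEvalQ loads a package on every worker and clusterExport copies an object to them. The file names and the suffix object are made up for illustration:

```r
library(parallel)

cl <- makeCluster(2)                 # set up the workers explicitly

files <- c("a.tif", "b.asc")         # hypothetical file names
suffix <- "_done"                    # an object the workers will need

clusterEvalQ(cl, library(tools))     # load the package on every worker
clusterExport(cl, "suffix", envir = environment())  # copy the object to every worker

out <- parLapply(cl, files, function(f) {
  paste0(file_path_sans_ext(f), suffix)
})

stopCluster(cl)
```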
