Saving output from parallel jobs in R into one file


Question

I am running a rather lengthy job that I need to replicate 100 times, so I turned to the foreach package in R, which I run on an 8-core cluster through a shell script. I am trying to write the results from every run to the same file. Here is a simplified version of my code.

cl<-makeCluster(core-1)
registerDoParallel(cl,cores=core)
SigEpsilonSq<-list()
SigLSq<-list()
RatioMat<-list()
foreach(p=1:100) %dopar%{

functions defining my variables{...}

  for(i in 1:fMaxInd){
   rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
     sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
     rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
     biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
     Qcbar[,i]<-Qflbar-biasCorrV[,i]
     sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
     ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2

   }   

   sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
   SigEpsilonSq[[p]]<-sigmaEpsSqV
   SigLSq[[p]]<-sigmaExtSq
   RatioMat[[p]]<-ratioMatr 

} #End of the dopar loop

stopCluster(cl)

write.csv(SigEpsilonSq,file="Sigma_Epsilon_Sq.csv")
write.csv(SigLSq,file="Sigma_L_Sq.csv")
write.csv(RatioMat,file="Ratio_Matrix.csv")

When the job completes, my .csv files are empty. I believe I don't quite understand how foreach saves results and how I can access them. I would like to avoid having to merge files manually. Also, do I need to call stopCluster(cl) at the end of my foreach loop, or do I wait until the very end? Any help would be much appreciated.

Recommended answer

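This is not how foreach works; you should look at some examples. Each iteration runs in a separate worker process, so assignments made to SigEpsilonSq, SigLSq and RatioMat inside the loop body never reach your main session. What comes back is the value of the last expression of each iteration: foreach collects these values into a list by default, and .combine lets you merge them in another way. So, instead of this: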

sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
SigEpsilonSq[[p]]<-sigmaEpsSqV
SigLSq[[p]]<-sigmaExtSq
RatioMat[[p]]<-ratioMatr 

you have to end the loop body with the value you want returned, like this:

list(as.matrix(sigmaEpsSqV),sigmaEpsSqV,sigmaExtSq,ratioMatr)
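To make the difference concrete, here is a minimal sketch with toy values (the names side_effect and results, the 2 workers and the 4 iterations are illustrative assumptions, not part of the original job). It shows that a list filled inside the loop stays empty in the main session, while the last expression of each iteration is returned:

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)        # 2 workers: an arbitrary choice for this sketch
registerDoParallel(cl)

side_effect <- list()       # lives in the main session only

results <- foreach(p = 1:4) %dopar% {
  side_effect[[p]] <- p^2   # changes a copy on the worker; lost afterwards
  list(p = p, square = p^2) # last expression: this is what foreach returns
}

stopCluster(cl)             # stop the workers once, after the loop

length(side_effect)         # still 0 in the main session
length(results)             # one element per iteration
results[[3]]$square         # value computed for p = 3
```

This also suggests where stopCluster(cl) belongs: once, after the foreach call has returned, not inside the loop body.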

You can also pass rbind, cbind, c, ... as .combine to aggregate the results into one final output. You can even supply your own combine function, for example (rbindlist comes from the data.table package):

.combine=function(x,y)rbindlist(list(x,y))
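As a rough illustration of the built-in combiners, the sketch below uses .combine = rbind so that every run contributes one row to a single data frame, which is then written with one write.csv call. The toy computation, the column names and the output path under tempdir() are assumptions made for the example only:

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)       # 2 workers: an arbitrary choice for this sketch
registerDoParallel(cl)

# Each iteration returns a one-row data frame; rbind stacks the rows.
summary_tbl <- foreach(p = 1:5, .combine = rbind) %dopar% {
  x <- (1:10) * p          # toy data standing in for the real computation
  data.frame(run = p, run_mean = mean(x), run_max = max(x))
}

stopCluster(cl)

# Hypothetical output location; replace with the real file name.
out_file <- file.path(tempdir(), "summary.csv")
write.csv(summary_tbl, out_file, row.names = FALSE)
```

Because all rows end up in one object in the main session, nothing has to be merged by hand afterwards.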


The solution below should work. The output should be a list of lists. However, it might be painful to retrieve the results and save them in the correct format. If so, you should design your own .combine function.

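library(doParallel)      # also loads foreach and parallel
cl<-makeCluster(core-1)
registerDoParallel(cl)   # the cores argument is not used when a cluster object is passed
# No .combine here: by default foreach returns a list with one element per iteration.
# With .combine=list the results would be combined pairwise, which nests the lists.
results <- foreach(p=1:100) %dopar%{

  # functions defining my variables{...}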

  for(i in 1:fMaxInd){
   rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
     sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
     rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
     biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
     Qcbar[,i]<-Qflbar-biasCorrV[,i]
     sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
     ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2

   }   

   list(as.matrix(sigmaEpsSqV),sigmaEpsSqV,sigmaExtSq,ratioMatr)

} #End of the dopar loop

stopCluster(cl)

#Then you extract and save results
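One possible way to do that last step is sketched below. It assumes results is a list with one element per run, each element holding the four objects in the order returned by the loop body; a small toy list stands in for the real output, and out_dir is a hypothetical output directory. The file names are the ones from the question:

```r
# Toy stand-in for `results`: 3 runs, each a list of four objects in the
# same order as the list(...) returned at the end of the loop body.
results <- lapply(1:3, function(p) {
  v <- c(1, 2) * p
  list(as.matrix(v), v, matrix(p, 2, 2), matrix(p / 10, 2, 2))
})

# Pull the k-th object out of every run and bind the pieces column-wise.
SigEpsilonSq <- do.call(cbind, lapply(results, `[[`, 1))
SigLSq       <- do.call(cbind, lapply(results, `[[`, 3))
RatioMat     <- do.call(cbind, lapply(results, `[[`, 4))

out_dir <- tempdir()  # hypothetical output directory for this sketch
write.csv(SigEpsilonSq, file = file.path(out_dir, "Sigma_Epsilon_Sq.csv"))
write.csv(SigLSq,       file = file.path(out_dir, "Sigma_L_Sq.csv"))
write.csv(RatioMat,     file = file.path(out_dir, "Ratio_Matrix.csv"))
```

Binding column-wise is only one layout choice; rbind, or a long format with a run column, may suit the downstream analysis better.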
