Saving output from parallel jobs in R into one file
Question
I am running a rather lengthy job that I need to replicate 100 times, so I have turned to the foreach capability in R, which I run on an 8-core cluster through a shell script. I am trying to write the results from every run to the same file. I have included a simplified version of my code.
cl<-makeCluster(core-1)
registerDoParallel(cl,cores=core)
SigEpsilonSq<-list()
SigLSq<-list()
RatioMat<-list()
foreach(p=1:100) %dopar%{
functions defining my variables{...}
for(i in 1:fMaxInd){
rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
Qcbar[,i]<-Qflbar-biasCorrV[,i]
sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2
}
sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
SigEpsilonSq[[p]]<-sigmaEpsSqV
SigLSq[[p]]<-sigmaExtSq
RatioMat[[p]]<-ratioMatr
} #End of the dopar loop
stopCluster(cl)
write.csv(SigEpsilonSq,file="Sigma_Epsilon_Sq.csv")
write.csv(SigLSq,file="Sigma_L_Sq.csv")
write.csv(RatioMat,file="Ratio_Matrix.csv")
When the job completes, my .csv files are empty. I believe I'm not quite understanding how the foreach saves results and how I can access them. I would like to avoid having to merge files manually. Also, do I need to write stopCluster(cl) at the end of my foreach loop or do I wait until the very end? Any help would be much appreciated.
Recommended answer
This is not how foreach works; you should look at some examples. Assignments made inside a %dopar% body happen in the worker processes and are not visible once the loop has finished. Instead, foreach returns the value of the last expression of each iteration, and .combine controls how those values are merged into one result. So instead of this:
sigmaEpsSqV<-as.matrix(sigmaEpsSqV)
SigEpsilonSq[[p]]<-sigmaEpsSqV
SigLSq[[p]]<-sigmaExtSq
RatioMat[[p]]<-ratioMatr
you have to rewrite it like this:
list(SigEpsilonSq=as.matrix(sigmaEpsSqV),SigLSq=sigmaExtSq,RatioMat=ratioMatr)
You can also use rbind, cbind, c, ... to aggregate the results into one final output. You can even use your own combine function, for example:
.combine=function(x,y)rbindlist(list(x,y))
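To make the effect of .combine concrete, here is a minimal sketch, assuming the doParallel backend from the question and two local workers (an arbitrary choice for the demo). Each iteration returns a one-row data frame and rbind stacks them; the column names run and value are made up for illustration.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)       # two workers, chosen arbitrarily for this demo
registerDoParallel(cl)

# Each iteration returns a one-row data frame (its last expression);
# .combine = rbind stacks those rows into a single data frame.
res <- foreach(p = 1:4, .combine = rbind) %dopar% {
  data.frame(run = p, value = p^2)
}

stopCluster(cl)            # stop the cluster once, after the loop
print(res)
```

Because the combined result lives in the master session, a single write.csv(res, ...) after the loop is enough to get everything into one file.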
The solution below should work. The output should be a list of lists. However, it might be painful to retrieve the results and save them in the correct format. If so, you should design your own .combine function.
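library(doParallel)  # also attaches foreach and parallel
cl<-makeCluster(core-1)  # core = number of available cores, as in the question
registerDoParallel(cl)
# No .combine here: by default foreach returns a list with one element per iteration,
# whereas .combine=list would nest the results pairwise.
results <- foreach(p=1:100) %dopar%{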
functions defining my variables{...}
for(i in 1:fMaxInd){
rhoSqjMatr[,i]<-1/(1+Bb[i])*(CbAdj+AbAdj*XjBarAdj+BbAdj[i]*XjSqBarAdj)/(dataZ*dataZ)
sigmaEpsSqV[i]<-mean(rhoSqjMatr[,i])
rhoSqjMatr[,i]<-rhoSqjMatr[,i]/sigmaEpsSqV[i]
biasCorrV[,i]<-sigmaEpsSqV[i]/L*gammaQl(rhoSqjMatr[,i])
Qcbar[,i]<-Qflbar-biasCorrV[,i]
sigmaExtSq[,i]<-sigmaSqExt(sigmaEpsSqV[i], rhoSqjMatr[,i])
ratioMatr[,i]<-sigmaExtSq[,i]/(sigmaL*sigmaL)#ratio (sigma_l^e)^2/(sigmaL)^2
}
list(SigEpsilonSq=as.matrix(sigmaEpsSqV),SigLSq=sigmaExtSq,RatioMat=ratioMatr) # last expression = value returned for run p
} #End of the dopar loop
stopCluster(cl)
#Then you extract and save results
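The extract-and-save step could look roughly like the sketch below. It is only an illustration under assumptions: results is replaced by a small toy list so the snippet runs on its own, the component names SigEpsilonSq, SigLSq and RatioMat are assumed to be the names of each run's list elements, and stack_component and out_dir are hypothetical helpers, not part of foreach.

```r
# Toy stand-in for the list returned by foreach: one named list per run.
results <- lapply(1:3, function(p) {
  list(SigEpsilonSq = matrix(p,      nrow = 2, ncol = 1),
       SigLSq       = matrix(p * 10, nrow = 2, ncol = 2),
       RatioMat     = matrix(p / 10, nrow = 2, ncol = 2))
})

# Stack one component across all runs, tagging every row with its run number.
stack_component <- function(results, name) {
  do.call(rbind, lapply(seq_along(results), function(p) {
    cbind(run = p, as.data.frame(results[[p]][[name]]))
  }))
}

out_dir <- tempdir()  # hypothetical output location; use your own directory
for (name in c("SigEpsilonSq", "SigLSq", "RatioMat")) {
  write.csv(stack_component(results, name),
            file = file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}
```

This gives one CSV per quantity, with a run column identifying which of the replications each row came from, so nothing has to be merged by hand.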