Writing to global variables when using doSNOW for parallelization in R?


Question

Is there a problem with accessing or writing to a global variable when using the doSNOW package on multiple cores?

In the program below, each call to MyCalculations(ii) writes to the ii-th column of the matrix "globalVariable"...

Do you think the result will be correct? Are there any hidden catches?

Thanks a lot!

p.s. I have to write to the global variable because this is a simplified example. In fact I have lots of outputs that need to be transported out of the parallel loops, so writing to global variables is probably the only way...

library(doSNOW)

MaxSearchSpace <- 44 * 5
globalVariable <- matrix(0, 10000, MaxSearchSpace)

cl <- makeCluster(7)
registerDoSNOW(cl)

# MyCalculations(ii) is meant to write to column ii of globalVariable
foreach(ii = 2:MaxSearchSpace, .combine = cbind, .verbose = FALSE) %dopar% {
  MyCalculations(ii)
}

stopCluster(cl)

p.s. To be clear, I am asking whether, within the doSNOW framework, there is any danger in accessing or writing to global variables. Thanks.

Recommended answer

Since this question is a couple of months old, I hope you've found an answer by now. However, in case you're still interested in feedback, here's something to consider:

When using foreach with a parallel backend, you won't be able to assign to variables in R's global environment in the way you're attempting (you probably noticed this). With a sequential backend the assignment will work, but not with a parallel one such as doSNOW: each worker is a separate R process with its own global environment, so an assignment made there never reaches the session that started the loop.
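A minimal sketch of that behaviour, assuming a two-worker local cluster and a tiny 3 x 4 matrix (both sizes are chosen only for illustration and are not from the question). Each iteration updates its own copy of the matrix and assigns it into the global environment of whichever worker runs it; the matrix in the calling session stays untouched.

```r
library(doSNOW)

cl <- makeCluster(2)            # two local worker processes (illustrative)
registerDoSNOW(cl)

globalVariable <- matrix(0, nrow = 3, ncol = 4)

ignored <- foreach(ii = 1:4) %dopar% {
  updated <- globalVariable     # the worker's private copy, exported by foreach
  updated[, ii] <- ii
  # This writes to the *worker's* global environment, not the caller's
  assign("globalVariable", updated, envir = globalenv())
  NULL
}

stopCluster(cl)

# The matrix in the calling session still contains only zeros
unchanged <- all(globalVariable == 0)
print(unchanged)
```

The only values that reliably travel back to the calling session are the ones each iteration returns.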

Instead, have each iteration return all of its results as a list, and assign the output of foreach to an object, so that you can extract the appropriate results after all calculations have completed.

My suggestion starts out much like your example:

library(doSNOW)

MaxSearchSpace <- 44 * 5
cl <- makeCluster(parallel::detectCores())

# do not create the globalVariable object

registerDoSNOW(cl)

# Save the results of the `foreach` iterations as
# a list of lists in an object (`theRes`)
theRes <- foreach(ii = 2:MaxSearchSpace, .verbose = FALSE) %dopar% {
  # do some calculations
  theNorms <- rnorm(10000)
  thePois <- rpois(10000, 2)
  # return the results of this iteration as a list
  list(theNorms, thePois)
}

# release the worker processes once the loop has finished
stopCluster(cl)

After all iterations have completed, extract the results from theRes and store them as objects (e.g., globalVariable1, globalVariable2, etc.):

globalVariable1 <- do.call(cbind, lapply(theRes, "[[", 1))
globalVariable2 <- do.call(cbind, lapply(theRes, "[[", 2))
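If the outputs need to land in specific columns of a preallocated matrix, as in the question, the same idea can be sketched as follows. The sizes and the element names `norms` and `pois` are illustrative assumptions, not part of the original answer. The sketch relies on foreach returning results in iteration order by default, so the k-th element of `theRes` belongs to `ii = k + 1`.

```r
library(doSNOW)

cl <- makeCluster(2)            # small local cluster, for illustration only
registerDoSNOW(cl)

nRows <- 5                      # illustrative sizes, far smaller than in the question
MaxSearchSpace <- 4

# Each iteration returns a named list, so results can be pulled out by name
theRes <- foreach(ii = 2:MaxSearchSpace) %dopar% {
  list(norms = rnorm(nRows), pois = rpois(nRows, 2))
}

stopCluster(cl)

# Fill columns 2..MaxSearchSpace in the calling session, after the loop
globalVariable <- matrix(0, nRows, MaxSearchSpace)
globalVariable[, 2:MaxSearchSpace] <- do.call(cbind, lapply(theRes, `[[`, "norms"))

globalVariable2 <- matrix(0, nRows, MaxSearchSpace)
globalVariable2[, 2:MaxSearchSpace] <- do.call(cbind, lapply(theRes, `[[`, "pois"))
```

Naming the list elements keeps the extraction step readable when an iteration returns many outputs.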

Keep in mind, though, that if the calculations in each iteration depend on the results of previous iterations, then this type of parallel computing is not the approach to take.
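As a small illustration of such a dependency, consider a made-up recurrence (not taken from the question) in which every value needs the one before it. Step `i` cannot start until step `i - 1` has finished, so an ordinary sequential loop is the natural fit:

```r
# Hypothetical recurrence: each value depends on the previous one
nSteps <- 5
x <- numeric(nSteps)
x[1] <- 1

for (i in 2:nSteps) {
  x[i] <- 2 * x[i - 1] + 1      # needs x[i - 1], so the steps must run in order
}

print(x)                        # 1 3 7 15 31
```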
