Assignment of a value from a foreach loop


Problem description

I would like to parallelize a loop like

td        <- data.frame(cbind(c(rep(1,4),2,rep(1,5)),rep(1:10,2)))
names(td) <- c("val","id")

res <- rep(NA,NROW(td))
for(i in levels(interaction(td$id))){
res[td$id==i] <- mean(td$val[td$id!=i])
}  

with the help of foreach() (loaded via library(doParallel)) in order to speed up the computation. Unfortunately, foreach doesn't seem to support direct assignment; at least

library(doParallel)
registerDoParallel(4)
res <- rep(NA,NROW(td))
foreach(i=levels(interaction(td$id))) %dopar%{
res[td$id==i] <- mean(td$val[td$id!=i])}

doesn't do what I want, which is to give the same result as the normal loop above. Any ideas about what I am doing wrong, or how I could somehow "hack" the .combine option of foreach to get that result? Please note that the order of the id variable is not always the same in the original data set. Any hint would be very much appreciated!

Solution

You will gain far more performance, by orders of magnitude, if you use data.table for this instead of parallelizing the loop:

library(data.table)
DT <- data.table(td)

DT[, means := mean(DT[-.I, val]), by = id]

identical(DT$means, res)
#[1] TRUE
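
To see what the data.table line computes: within each `by = id` group, `.I` holds the row numbers of that group, so `DT[-.I, val]` is every `val` outside the group. The same leave-one-group-out mean can be sketched in base R from group sums and counts. This is only an illustration under that reading; the names `grp_sum`, `grp_n` and `loo_mean` are made up here and are not part of the original answer.

```r
# Same example data as in the question
td        <- data.frame(cbind(c(rep(1, 4), 2, rep(1, 5)), rep(1:10, 2)))
names(td) <- c("val", "id")

# Per-row sum and count of the group the row belongs to
grp_sum <- ave(td$val, td$id, FUN = sum)
grp_n   <- ave(td$val, td$id, FUN = length)

# Mean of all rows outside the row's own group:
# (total sum - group sum) / (total count - group count)
loo_mean <- (sum(td$val) - grp_sum) / (nrow(td) - grp_n)
```

Because everything here is vectorized, no loop over the ids is needed, which is roughly why a grouped approach tends to beat parallelizing the loop on data like this.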

If you want to use foreach, you'll need to combine it with a merge:

library(foreach)
# %do% runs sequentially; each iteration returns one row and .combine = rbind stacks them
res2 <- foreach(i = levels(interaction(td$id)), .combine = rbind) %do% {
  data.frame(level = i, means = mean(td$val[td$id != i]))
}

res2 <- merge(res2, td, by.x = "level", by.y = "id", sort = FALSE)

#    level    means val
# 1      1 1.111111   1
# 2      1 1.111111   1
# 3      2 1.111111   1
# 4      2 1.111111   1
# 5      3 1.111111   1
# 6      3 1.111111   1
# 7      4 1.111111   1
# 8      4 1.111111   1
# 9      5 1.000000   2
# 10     5 1.000000   2
# 11     6 1.111111   1
# 12     6 1.111111   1
# 13     7 1.111111   1
# 14     7 1.111111   1
# 15     8 1.111111   1
# 16     8 1.111111   1
# 17     9 1.111111   1
# 18     9 1.111111   1
# 19    10 1.111111   1
# 20    10 1.111111   1
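
As for why the assignment in the question has no effect: with `%dopar%` each iteration is evaluated in a worker, so `res[...] <- ...` changes a copy of `res` there and never reaches the calling session. foreach is designed to return a value from each iteration instead. A minimal parallel sketch along those lines is shown below; the explicit two-worker cluster and the names `ids`, `group_means` and `res_par` are assumptions made for illustration, not part of the original answer.

```r
library(doParallel)  # also attaches foreach and parallel

# Same example data as in the question
td        <- data.frame(cbind(c(rep(1, 4), 2, rep(1, 5)), rep(1:10, 2)))
names(td) <- c("val", "id")

cl <- makeCluster(2)   # assumption: two local workers
registerDoParallel(cl)

ids <- unique(td$id)

# Each iteration returns one number; .combine = c collects them
# into a vector in the same order as ids
group_means <- foreach(i = ids, .combine = c) %dopar% {
  mean(td$val[td$id != i])
}

stopCluster(cl)

# Map each group's mean back onto its rows, whatever order the ids appear in
res_par <- group_means[match(td$id, ids)]
```

The `match()` step looks up each row's id in `ids`, so the result lines up with the rows of `td` even when the ids are not sorted, which is the situation the question mentions. For a computation this cheap, the cost of starting workers and moving data will probably outweigh any gain from running in parallel.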
