How do I make doSMP play nicely with plyr?


Question

This code works:

library(plyr)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=FALSE) 

This code fails:

library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE) 
stopWorkers(workers)

>Error in do.ply(i) : task 3 failed - "subscript out of bounds"
In addition: Warning messages:
1: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’

2: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’

I am using R 2.1.12, plyr 1.4 and doSMP 1.0-1. Has anyone figured out a way around this?

edit: In response to Andrie, here is a further illustration:

system.time(ddply(x, .(V), function(df) Sys.sleep(1), .parallel=FALSE)) #1
system.time(ddply(x, .(V), function(df) Sys.sleep(1), .parallel=TRUE)) #2
library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
system.time(ddply(x, .(V), function(df) Sys.sleep(1), .parallel=FALSE)) #3
system.time(ddply(x, .(V), function(df) Sys.sleep(1), .parallel=TRUE)) #4
stopWorkers(workers)

The first three calls work, but each takes about 3 seconds. Call #2 warns that no parallel backend is registered and therefore runs sequentially. Call #4 fails with the same error I referenced in my original post.
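
The warning from call #2 suggests checking for a registered backend before asking plyr to run in parallel. Below is a minimal sketch, assuming the foreach package is installed, that uses its getDoParRegistered() and getDoParWorkers() to decide. The name use_parallel is only illustrative, and this does not address the doSMP error itself; it only avoids requesting parallel execution when no backend is available.

```r
library(plyr)
library(foreach)

x <- data.frame(V = c("X", "Y", "X", "Y", "Z"), Z = 1:5)

# TRUE only if a foreach backend is registered and has more than one worker.
use_parallel <- getDoParRegistered() && getDoParWorkers() > 1

# Runs sequentially when no suitable backend is registered, so plyr
# is never asked for parallel execution it cannot provide.
res <- ddply(x, .(V), function(df) sum(df$Z), .parallel = use_parallel)
print(res)
```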

/edit: Curiouser and curiouser: on my Mac, the following works:

library(plyr)
library(doMC)
registerDoMC()
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)

But this fails:

library(plyr)
library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE) 
stopWorkers(workers)

This also fails:

library(plyr)
library(snow)
library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE) 
stopCluster(cl)

So I suppose the various parallel back ends for foreach are not interchangeable.
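
One way to narrow this down might be to run the same per-group computation through foreach directly, bypassing plyr. If a sketch like the one below succeeds while the ddply call fails, the problem would appear to lie in how plyr hands work to the backend rather than in the backend itself. It reuses the doSNOW setup from above; the variable names piece and sums are illustrative.

```r
library(foreach)
library(snow)
library(doSNOW)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

x <- data.frame(V = c("X", "Y", "X", "Y", "Z"), Z = 1:5)

# Split by group and sum Z on the workers, without going through plyr.
sums <- foreach(piece = split(x, x$V), .combine = c) %dopar% sum(piece$Z)
print(sums)

stopCluster(cl)
```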

Recommended answer

While the question has been answered well by @hadley, I want to add that I think plyr now works with other foreach parallel back-ends. Here is a link to a blog entry containing an example where plyr is used in conjunction with doSNOW:
