Error in unserialize(socklist[[n]]) : error reading from connection on Unix


Problem description

I tried running the following code on a Unix machine with 20 CPUs, using the R packages foreach, parallel, doParallel, and party (my objective is to get the party varimp function working on several CPUs in parallel):

parallel_compute_varimp <- function (object, mincriterion = 0, conditional = FALSE, threshold = 0.2, 
    nperm = 1, OOB = TRUE, pre1.0_0 = conditional) 
{
    response <- object@responses
    input <- object@data@get("input")
    xnames <- colnames(input)
    inp <- initVariableFrame(input, trafo = NULL)
    y <- object@responses@variables[[1]]
    error <- function(x, oob) mean((levels(y)[sapply(x, which.max)] != y)[oob])

    w <- object@initweights
    perror <- matrix(0, nrow = nperm * length(object@ensemble), ncol = length(xnames))
    colnames(perror) <- xnames

    data = foreach(b = 1:length(object@ensemble), .packages = c("party","stats"), .combine = rbind) %dopar%
    {
        try({
            tree <- object@ensemble[[b]]
            oob <- object@weights[[b]] == 0

            p <- .Call("R_predict", tree, inp, mincriterion, -1L, PACKAGE = "party")

            eoob <- error(p, oob)

            for (j in unique(varIDs(tree))) {
                for (per in 1:nperm) {
                    if (conditional || pre1.0_0) {
                      tmp <- inp
                      ccl <- create_cond_list(conditional, threshold, xnames[j], input)
                      if (is.null(ccl)) {
                        perm <- sample(which(oob))
                      }
                      else {
                        perm <- conditional_perm(ccl, xnames, input, tree, oob)
                      }
                      tmp@variables[[j]][which(oob)] <- tmp@variables[[j]][perm]
                      p <- .Call("R_predict", tree, tmp, mincriterion, -1L, PACKAGE = "party")
                    }
                    else {
                      p <- .Call("R_predict", tree, inp, mincriterion, as.integer(j), PACKAGE = "party")
                    }
                    perror[b, j] <- (error(p, oob) - eoob)
                }
            }

            ########
            # return data to the %dopar% loop data variable
            perror[b, ]
            ########

        }) # END OF TRY
    } # END OF LOOP WITH PARALLEL COMPUTING

    perror = data
    perror <- as.data.frame(perror)
    return(MeanDecreaseAccuracy = colMeans(perror))
}

environment(parallel_compute_varimp) <- asNamespace('party')


cl <- makeCluster(detectCores())
registerDoParallel(cl, cores = detectCores())
<...>
system.time(data.cforest.varimp <- parallel_compute_varimp(data.cforest, conditional = TRUE))

but I am getting an error:

> system.time(data.cforest.varimp <- parallel_compute_varimp(data.cforest, conditional = TRUE))
Error in unserialize(socklist[[n]]) : error reading from connection
Timing stopped at: 58.302 13.197 709.307

The code was working with a smaller dataset on 4 CPUs.

I am running out of ideas. Can someone suggest a way to reach my objective of running party package varimp function on parallel CPUs?

Solution

The error:

Error in unserialize(socklist[[n]]) : error reading from connection

means that the master process got an error when calling unserialize to read from the socket connection to one of the workers. That probably means that the corresponding worker died, thus dropping its end of the socket connection. Unfortunately, it may have died for any number of reasons, many of which are very system specific.

You can usually figure out why the worker died by using the makeCluster "outfile" option so that the error message generated by the worker isn't thrown away. I usually recommend using outfile="" as described in this answer. Note that the "outfile" option works the same in both the snow and parallel packages.
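As a minimal sketch of the outfile suggestion, assuming a terminal R session on Unix: the worker count of 2 and the squaring task are placeholders chosen for illustration, not part of the original code. With outfile = "", anything a worker writes, including the error message it prints before dying, should show up in the master's console.

```r
library(parallel)

# Two workers is an arbitrary choice for this sketch.
# outfile = "" forwards worker stdout/stderr to the master's console
# instead of discarding it.
cl <- makeCluster(2, outfile = "")

# Hypothetical task standing in for the real computation:
# each worker logs a line, then returns a value.
results <- parLapply(cl, 1:4, function(i) {
  message("worker ", Sys.getpid(), " processing task ", i)
  i^2
})

stopCluster(cl)
```

Passing a file name instead of "" (for example outfile = "workers.log", a hypothetical path) appends the worker output to that file, which may be easier to inspect after a long run.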

You could also verify that your foreach loop works correctly when executed sequentially by registering the sequential backend:

registerDoSEQ()

If you're lucky, the foreach loop will also fail when executed sequentially, where it's usually easier to figure out what is going wrong.
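A small sketch of the sequential check, assuming the foreach package is installed: the loop body here is a hypothetical stand-in for the per-tree computation. After registerDoSEQ(), the same %dopar% loop runs in the current R process, so errors and warnings appear directly in the session.

```r
library(foreach)

# Register the sequential backend: %dopar% loops now run
# one iteration at a time in the current R process.
registerDoSEQ()

# Hypothetical loop body standing in for the per-tree work.
res <- foreach(b = 1:3, .combine = rbind) %dopar% {
  c(tree = b, value = b * 2)
}
```

Because no sockets or worker processes are involved, a failure here points at the loop body itself rather than at the cluster setup. To return to parallel execution, register the cluster again with registerDoParallel(cl).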
