How to stop R from leaving zombie processes behind


Problem description



Here is a little reproducible example:

library(doMC)
library(doParallel)
registerDoMC(4)
timing <- system.time(fitall <- foreach(i = 1:1000, .combine = "c") %dopar% {
    print(i)
})

I start up R and look at the process table:

> system("ps -efl")
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S chbr         1     0  5  80   0 - 21399 wait   10:58 ?        00:00:00 /usr/local/lib/R/bin/exec/R --no-save --no-restore
0 S chbr         9     1  0  80   0 -  1113 wait   10:58 ?        00:00:00 sh -c ps -efl
0 R chbr        10     9  0  80   0 -  4294 -      10:58 ?        00:00:00 ps -efl

If I run the aforementioned simple loop, doMC or doParallel leaves a zombie process behind. Output of ps -efl after running the loop:

> system("ps -efl")
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S chbr         1     0  4  80   0 - 25256 wait   11:00 ?        00:00:00 /usr/local/lib/R/b
1 Z chbr        10     1  0  80   0 -     0 exit   11:00 ?        00:00:00 [R] <defunct>
0 S chbr        12     1  0  80   0 -  1113 wait   11:00 ?        00:00:00 sh -c ps -efl
0 R chbr        13    12  0  80   0 -  4294 -      11:00 ?        00:00:00 ps -efl

If I repeat the loop without issuing registerDoMC(4) again, no additional zombie process gets created. However, if I issue registerDoMC(4) again before repeating the loop, an additional zombie process gets created:

> system("ps -efl")
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S chbr         1     0  0  80   0 - 25554 wait   11:00 ?        00:00:01 /usr/local/lib/R/b
1 Z chbr        21     1  0  80   0 -     0 exit   11:02 ?        00:00:00 [R] <defunct>
1 Z chbr        22     1  0  80   0 -     0 exit   11:02 ?        00:00:00 [R] <defunct>
0 S chbr        26     1  0  80   0 -  1113 wait   11:03 ?        00:00:00 sh -c ps -efl
0 R chbr        27    26  0  80   0 -  4294 -      11:03 ?        00:00:00 ps -efl

That's why I suspect that doMC is doing something it should not. If doMC is causing this, is there a way to stop it from leaving zombie processes behind? (stopCluster() does not work, as no cluster gets created in the first place.)

> sessionInfo()
R Under development (unstable) (2014-08-16 r66404)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_IE.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_IE.UTF-8        LC_COLLATE=en_IE.UTF-8    
 [5] LC_MONETARY=en_IE.UTF-8    LC_MESSAGES=en_IE.UTF-8   
 [7] LC_PAPER=en_IE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] doParallel_1.0.8 doMC_1.3.3       iterators_1.0.7  foreach_1.4.2   

loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.2.0

Solution

This really has nothing to do with foreach or doMC; as Steve Weston has pointed out in answer to other StackOverflow queries, doMC is essentially just a wrapper for mclapply, and you can see zombie processes created with a simple call to mclapply:

library(parallel)
mclapply(rep(5,4), rnorm)

On my system, this leaves two zombie processes:

[richcalaway@richcalaway-pc ~]$ ps -efl | grep defunct
1 Z 1660945517 28701 28624  0 77  0 -     0 exit   12:00 pts/1    00:00:00 [R] <defunct>
1 Z 1660945517 28702 28624  0 78  0 -     0 exit   12:00 pts/1    00:00:00 [R] <defunct>
0 S 1660945517 28704 28308  0 78  0 - 15306 pipe_w 12:00 pts/2    00:00:00 grep defunct
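The same check can be made from inside the R session instead of a second terminal. The following is only a sketch: the helper name list_zombie_children is made up for illustration, and it assumes a ps that accepts the -eo pid=,ppid=,stat= form (as on Linux and macOS).

```r
# Hypothetical helper (not part of any package): list defunct children of
# the current R session. Assumes `ps -eo pid=,ppid=,stat=` is supported.
list_zombie_children <- function() {
    ps_lines <- system("ps -eo pid=,ppid=,stat=", intern = TRUE)
    procs <- read.table(text = ps_lines,
                        col.names = c("pid", "ppid", "stat"),
                        colClasses = c("integer", "integer", "character"))
    # Keep processes whose parent is this R session and whose state is Z
    procs[procs$ppid == Sys.getpid() & startsWith(procs$stat, "Z"), ]
}

zombies <- list_zombie_children()
print(zombies)
```

Calling it before and after the mclapply() line shows whether that call added any defunct children.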

Under normal circumstances, these zombie processes won't cause any trouble, and they do disappear when the R session ends. You can avoid them by using doParallel and a fork cluster instead of using doMC.
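A minimal sketch of the fork-cluster approach, assuming a Unix-alike system (fork clusters are not available on Windows). The worker count of 2 and the loop body are arbitrary choices for illustration:

```r
library(parallel)
library(doParallel)   # also attaches foreach and iterators

# Create the workers by forking, then register them as the %dopar% backend
cl <- makeForkCluster(2)
registerDoParallel(cl)

result <- foreach(i = 1:10, .combine = c) %dopar% {
    i^2
}

# Unlike the doMC setup, there is a cluster object to shut down explicitly
stopCluster(cl)

print(result)
```

Because a cluster object exists here, stopCluster() has something to act on, which was not the case with registerDoMC().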

Cheers,

Rich Calaway

Principal Program Manager

Revolution Analytics
