Terminating zombie child processes forked from socket server


Problem description



Disclaimer

I am well aware that PHP might not have been the best choice in this case for a socket server. Please refrain from suggesting different languages/platforms - believe me - I've heard it from all directions.

Working in a Unix environment and using PHP 5.2.17, my situation is as follows - I have constructed a socket server in PHP that communicates with Flash clients. My first hurdle was that each incoming connection blocked subsequent connections until it had finished being processed. I solved this by using PHP's pcntl_fork(). I was able to spawn numerous child processes (saving their PIDs in the parent) that took care of broadcasting messages to the other clients, thereby "releasing" the parent process and allowing it to continue processing the next connection[s].

My main issue right now is collecting these dead/zombie child processes and terminating them. I have read (over and over) the relevant PHP manual pages for pcntl_fork() and realize that the parent process is in charge of cleaning up its children. The parent process receives a signal (SIGCHLD) from its child when the child executes an exit(0). I am able to "catch" that signal by using the pcntl_signal() function to set up a signal handler.

My signal_handler looks like this :

declare(ticks = 1); 
function sig_handler($signo){ 
  global $forks; // this is an array that holds all the child PID's
  foreach($forks AS $key=>$childPid){
    echo "has my child {$childPid} gone away?".PHP_EOL;
    if (posix_kill($childPid, 9)){
      echo "Child {$childPid} has tragically died!".PHP_EOL;
      unset($forks[$key]);
    }
  }
}

I am indeed seeing both echoes, including the relevant and correct child PID that needs to be removed, but it seems that

posix_kill($childPid, 9)

which I understand to be synonymous with kill -9 $childPid, is returning TRUE although it is in fact NOT removing the process...

Taken from the PHP manual page for posix_kill:

Returns TRUE on success or FALSE on failure.


I am monitoring the child processes with the ps command. They appear like this on the system :

web5      5296  5234  0 14:51 ?        00:00:00 [php] <defunct>
web5      5321  5234  0 14:51 ?        00:00:00 [php] <defunct>
web5      5466  5234  0 14:52 ?        00:00:00 [php] <defunct>

As you can see, all these processes are children of the parent process, which has the PID 5234.

Am I missing something in my understanding? I seem to have managed to get everything to work (and it does) but I am left with countless zombie processes on the system!

My plans for a zombie apocalypse are rock solid -
but what on earth can I do when even sudo kill -9 does not kill the zombie child processes?


Update 10 Days later

I've answered this question myself after some additional research. If you are still able to stand my ramblings, proceed at will.

Solution

I promise there is a solution at the end :P

Alright... so here we are, 10 days later and I believe that I have solved this issue. I didn't want to add onto an already longish post so I'll include in this answer some of the things that I tried.

Taking @sym's advice, and reading more into the documentation and the comments on the documentation, the pcntl_waitpid() description states :

If a child as requested by pid has already exited by the time of the call (a so-called
"zombie" process), the function returns immediately. Any system resources used by the child
are freed...
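
To make that description concrete, here is a minimal sketch (not taken from the original post) of why kill -9 appeared to "succeed" without removing anything. A zombie has already terminated, so there is nothing left to kill; only its process-table entry remains until the parent waits for it. The helper name demoZombieReaping is hypothetical, and the sketch assumes a CLI build of PHP 7.1 or later with the pcntl and posix extensions on a Unix-like system.

```php
<?php
// Requires the pcntl and posix extensions (CLI, Unix-like systems).

// Make sure SIGCHLD has its default disposition, so that exited children
// really do linger as zombies until the parent waits for them.
pcntl_signal(SIGCHLD, SIG_DFL);

/**
 * Hypothetical helper: fork a child that exits at once, then report whether
 * its PID still exists before and after the parent waits for it.
 */
function demoZombieReaping(): array
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        exit(0); // child: terminate right away
    }

    // Parent: give the child a moment to exit. Until we wait, it is a zombie.
    usleep(200000);

    // Signal 0 delivers nothing; it only checks that the PID still exists.
    $existsBeforeWait = posix_kill($pid, 0);

    // Waiting on the child collects its exit status and frees the
    // process-table entry.
    $waited = pcntl_waitpid($pid, $status);

    $existsAfterWait = posix_kill($pid, 0);

    return [
        'waited_for_child' => $waited === $pid,
        'exists_before'    => $existsBeforeWait,
        'exists_after'     => $existsAfterWait,
        'exit_code'        => pcntl_wexitstatus($status),
    ];
}

$result = demoZombieReaping();
```

The point of the sketch is that posix_kill() returning TRUE only means the signal could be sent to an existing PID. The entry disappears after pcntl_waitpid(), not after the kill.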

So I set up my pcntl_signal() handler like this -

function sig_handler($signo){ 
    global $childProcesses;
    $pid = pcntl_waitpid(-1, $status, WNOHANG);
    echo "Sound the alarm! ";
    if ($pid != 0){
        if (posix_kill($pid, 9)){
            echo "Child {$pid} has tragically died!".PHP_EOL;
            unset($childProcesses[$pid]);
        }
    }
}
// These define the signal handling
// pcntl_signal(SIGTERM, "sig_handler");
// pcntl_signal(SIGHUP,  "sig_handler");
// pcntl_signal(SIGINT, "sig_handler");
pcntl_signal(SIGCHLD, "sig_handler");
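
For comparison, a reaping handler could look roughly like the sketch below. It is an illustration under assumptions, not the code from this post: it assumes PHP 7.1 or later (for pcntl_async_signals(); on PHP 5.x declare(ticks = 1) fills that role), and the $reaped and $forked arrays are hypothetical bookkeeping. It differs from the handler above in two ways. It loops, because several children can exit while only one SIGCHLD is delivered, and it does not call posix_kill(), because a child returned by pcntl_waitpid() is already dead and has just been reaped. Note also that the handler above unsets $childProcesses[$pid] while the fork code appends with $childProcesses[], so the keys would not match; the sketch keys its array by PID.

```php
<?php
// Requires the pcntl extension. Assumes PHP 7.1+ for pcntl_async_signals().
pcntl_async_signals(true);

$reaped = []; // pid => exit code, filled in by the handler (hypothetical)

pcntl_signal(SIGCHLD, function (int $signo) use (&$reaped): void {
    // One SIGCHLD may stand for several exited children, so keep calling
    // waitpid until it reports nothing ready (0) or no children left (-1).
    while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        $reaped[$pid] = pcntl_wexitstatus($status);
    }
});

// Usage: fork three short-lived children.
$forked = [];
for ($i = 0; $i < 3; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        exit($i); // child: exit with a distinct status code
    }
    $forked[] = $pid;
}

// Parent: wait (at most five seconds) until the handler has seen them all.
$deadline = microtime(true) + 5.0;
while (count($reaped) < count($forked) && microtime(true) < $deadline) {
    usleep(50000);
}
```

A handler like this keeps the exit codes available to the parent, which matters only if the server needs to know how each child ended.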

For completeness, I'll include the actual code I'm using for forking a child process -

function broadcastData($socketArray, $data){
        global $db,$childProcesses;
        $pid = pcntl_fork();
        if($pid == -1) {
                // Something went wrong (handle errors here)
                // Log error, email the admin, pull emergency stop, etc...
                echo "Could not fork()!!";
        } elseif($pid == 0) {
                // This part is only executed in the child
                foreach($socketArray AS $socket) {
                        // There's more happening here but the essence is this
                        socket_write($socket,$msg,strlen($msg));

                        // TODO : Consider additional forking here for each client. 
                }
                // This is where the signal is fired
                exit(0);
        }

        // If the child process did not exit above, then this code would be
        // executed by both parent and child. In my case, the child will 
        // never reach these commands. 
        $childProcesses[] = $pid;
        // The child process is now occupying the same database 
        // connection as its parent (in my case mysql). We have to
        // reinitialize the parent's DB connection in order to continue using it. 
        $db = dbEngine::factory(_dbEngine); 
}

Yea... That's a ratio of 1:1 comments to code :P

So this was looking great and I saw the echo of :

Sound the alarm! Child 12345 has tragically died!

However, when the socket server loop did its next iteration, the socket_select() function failed with this warning:

PHP Warning: socket_select(): unable to select [4]: Interrupted system call...

The server would now go into a vegetative state, totally oblivious to the world around it, not responding to any requests other than manual kill commands from a root terminal.
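
A likely explanation, which the post itself does not confirm, is that the arriving SIGCHLD interrupted the select() system call, which then failed with EINTR (error 4), and the server loop treated that as fatal. One common way to cope is to retry the call when the error is an interruption. The wrapper below is a minimal sketch under that assumption. The name selectRetryingOnInterrupt is hypothetical, it only handles the read set, and it assumes PHP 7.1 or later with the sockets extension.

```php
<?php
// Requires the sockets extension.

/**
 * Hypothetical wrapper: call socket_select() on a read set and retry when a
 * signal (for example SIGCHLD) interrupts the underlying system call.
 * Returns the number of ready sockets, or false on a genuine failure.
 * Note that each retry starts the timeout again from the full value.
 */
function selectRetryingOnInterrupt(array &$read, ?int $timeoutSec = null)
{
    $wanted = $read;
    while (true) {
        $read = $wanted; // socket_select() modifies its arrays, so restore them
        $write = null;
        $except = null;
        $ready = @socket_select($read, $write, $except, $timeoutSec);
        if ($ready !== false) {
            return $ready;
        }
        if (socket_last_error() !== SOCKET_EINTR) {
            return false; // not an interruption: report the failure
        }
        socket_clear_error(); // interrupted by a signal: try again
    }
}

// Usage: a connected socket pair stands in for a client connection.
socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair);
socket_write($pair[0], "ping");
$read = [$pair[1]];
$ready = selectRetryingOnInterrupt($read, 1);
```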


I'm not going to get into why this was happening or what I did after that to debug it... let's just say it was a frustrating week...

Much coffee, sore eyes and 10 days later...

Drum roll please

TL;DR - The Solution :

As mentioned in a comment from 2007 in the PHP sockets documentation and in a tutorial on stuporglue (search for "good parenting"), one can simply "ignore" signals coming in from the child processes (SIGCHLD) by passing SIG_IGN to the pcntl_signal() function -

pcntl_signal(SIGCHLD, SIG_IGN);

Quoting from that linked blog post :

If we are ignoring SIGCHLD, the child processes will be reaped automatically upon completion.

Believe it or not - I included that pcntl_signal() line, deleted all the other handlers and things dealing with the children and it worked! There were no more <defunct> processes left hanging around!

In my case, I really had no interest in knowing exactly when a child process died, or which one it was. I only needed them not to hang around and crash my entire server :P
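
The effect of SIG_IGN can be checked with a small sketch like the one below. It is an illustration, not code from the post, and assumes a Linux-like system where ignoring SIGCHLD makes the kernel discard exited children automatically, plus the pcntl and posix extensions. The helper name forkShortLivedChild is hypothetical.

```php
<?php
// Requires the pcntl and posix extensions (CLI, Unix-like systems).

// Ignore SIGCHLD: exited children are then discarded by the kernel
// instead of waiting as zombies for the parent to collect them.
pcntl_signal(SIGCHLD, SIG_IGN);

/** Hypothetical helper: fork a child that exits at once, return its PID. */
function forkShortLivedChild(): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        exit(0); // child: terminate right away
    }
    return $pid;
}

$pid = forkShortLivedChild();

// Poll for up to two seconds. Signal 0 sends nothing; posix_kill() returns
// false once the PID no longer exists, meaning no zombie was left behind.
$gone = false;
for ($i = 0; $i < 100; $i++) {
    if (!posix_kill($pid, 0)) {
        $gone = true;
        break;
    }
    usleep(20000);
}
```

The trade-off is the one described above: with SIGCHLD ignored, the parent can no longer learn a child's exit status, which is fine when it does not care how the children ended.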
