Python subprocess deadlock in daemonized server

Problem description

I'm trying to set up a remote backup server for dar, along these lines. I'd really like to do all the piping with python if possible, but I've asked a separate question about that.

Using netcat in subprocess.Popen(cmd, shell=True), I succeeded in making a differential backup, as in the examples on the dar site. The only two problems are:

  1. I don't know how to assign port numbers dynamically this way
  2. If I execute the server in the background, it fails. Why?

Update: This doesn't seem to be related to netcat; it hangs even without netcat in the mix.

Here's my code:

from socket import socket, AF_INET, SOCK_STREAM
import os, sys
import SocketServer
import subprocess

class DarHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        print('entering handler')
        data = self.request.recv(1024).strip()
        print('got: ' + data)
        if data == 'xform':
            cmd1 = 'nc -dl 41201 | dar_slave archives/remotehost | nc -l 41202'
            print(cmd1)
            cmd2 = 'nc -dl 41200 | dar_xform -s 10k - archives/diffbackup'
            print(cmd2)
            proc1 = subprocess.Popen(cmd1, shell=True)
            proc2 = subprocess.Popen(cmd2, shell=True)
            print('sending port number')
            self.request.send('41200')
            print('waiting')
            result = str(proc1.wait())
            print('nc-dar_slave-nc returned ' + result)
            result = str(proc2.wait())
            print('nc-dar_xform returned ' + result)
        else:
            result = 'bad request'
        self.request.send(result)
        print('send result, exiting handler')

myaddress = ('localhost', 18010)
def server():
    server = SocketServer.TCPServer(myaddress, DarHandler)
    print('listening')
    server.serve_forever()

def client():
    sock = socket(AF_INET, SOCK_STREAM)
    print('connecting')
    sock.connect(('localhost', 18010))
    print('connected, sending request')
    sock.send('xform')
    print('waiting for response')
    port = sock.recv(1024)
    print('got: ' + port)
    try:
        os.unlink('toslave')
    except:
        pass
    os.mkfifo('toslave')
    cmd1 = 'nc -w3 localhost 41201 < toslave'
    cmd2 = 'nc -w3 localhost 41202 | dar -B config/test.dcf -A - -o toslave -c - | nc -w3 localhost ' + port
    print(cmd2)
    proc1 = subprocess.Popen(cmd1, shell=True)
    proc2 = subprocess.Popen(cmd2, shell=True)
    print('waiting')
    result2 = proc2.wait()
    result1 = proc1.wait()
    print('nc<fifo returned: ' + str(result1))
    print('nc-dar-nc returned: ' + str(result2))
    result = sock.recv(1024)
    print('received: ' + result)
    sock.close()
    print('socket closed, exiting')

if __name__ == "__main__":
    if sys.argv[1].startswith('serv'):
        server()
    else:
        client()

Here's what happens on the server:

$ python clientserver.py serve &
[1] 4651
$ listening
entering handler
got: xform
nc -dl 41201 | dar_slave archives/remotehost | nc -l 41202
nc -dl 41200 | dar_xform -s 10k - archives/diffbackup
sending port number
waiting

[1]+  Stopped                 python clientserver.py serve

Here's what happens on the client:

$ python clientserver.py client
connecting
connected, sending request
waiting for response
got: 41200
nc -w3 localhost 41202 | dar -B config/test.dcf -A - -o toslave -c - | nc -w3 localhost 41200
waiting
FATAL error, aborting operation
Corrupted data read on pipe
nc<fifo returned: 1
nc-dar-nc returned: 1

The client also hangs, and I have to kill it with a keyboard interrupt.

Solution

  1. Use Popen.communicate() instead of Popen.wait().

    The Python documentation for wait() states:

    Warning: This will deadlock if the child process generates enough output to a stdout or stderr pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

  2. dar and its related executables should be given -Q when they are not running interactively.

  3. When synchronizing multiple processes, make sure to call communicate() on the 'weakest link' first: dar_slave before dar_xform, and dar before cat. This was done correctly in the question, but it's worth noting.

  4. Clean up shared resources. The client process is holding open a socket from which dar_xform is still reading. Attempting to send or receive data on the initial socket after dar and friends have finished, without first closing that socket, will therefore cause a deadlock.
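To illustrate point 1 in isolation, here is a minimal sketch (not the answer's own code) of collecting a child's output with communicate(). The child command and the name run_and_collect are made up for the illustration: the child is simply a Python one-liner that writes about 1 MB to stdout, which is assumed to be more than the OS pipe buffer holds. The sketch is written for Python 3, whereas the question's code targets Python 2.

```python
import subprocess
import sys

# Hypothetical child: writes roughly 1 MB to stdout, assumed to exceed
# the OS pipe buffer, so nobody reading the pipe would stall the child.
CHILD_CODE = "import sys; sys.stdout.write('x' * 1000000)"


def run_and_collect():
    """Start the child with piped stdout/stderr and drain both pipes."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() keeps reading both pipes until EOF and then waits for
    # the child, so the child cannot block forever on a full pipe buffer.
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    returncode, out, err = run_and_collect()
    print("exit status:", returncode)
    print("bytes captured:", len(out))
```

Replacing the communicate() call with proc.wait() in this sketch would be expected to hang, because the parent would be waiting for the child to exit while the child waits for someone to read its output.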

Here is a working example which doesn't use shell=True or netcat. An advantage of this is I can have the secondary ports assigned dynamically and therefore could conceivably serve multiple backup clients simultaneously.
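The answer's listing is not reproduced on this page. As an independent sketch of only the two socket ideas mentioned above (dynamic port assignment, and closing a socket so the reader sees end-of-file), the following assumes a plain TCP listener on 127.0.0.1. The helper names open_listener, read_until_eof and demo are hypothetical, and no dar processes are involved. It is written for Python 3.

```python
import socket


def open_listener(host="127.0.0.1"):
    """Bind to port 0 so the OS picks a free port; return (socket, port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))
    listener.listen(1)
    return listener, listener.getsockname()[1]


def read_until_eof(conn):
    """Read until the peer closes its end of the connection."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # empty bytes: the peer closed, so this is EOF
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo():
    listener, port = open_listener()
    # In a real server, `port` would be sent to the client over the
    # control connection instead of being hard-coded like 41200.
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"backup data")
    client.close()  # without this close, read_until_eof() would never return
    conn, _ = listener.accept()
    data = read_until_eof(conn)
    conn.close()
    listener.close()
    return port, data


if __name__ == "__main__":
    port, data = demo()
    print("OS-assigned port:", port)
    print("received:", data)
```

Because each listener gets its own OS-assigned port, several such listeners could in principle coexist, which is the property the answer relies on for serving multiple clients.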
