Use a generator as subprocess input; got "I/O operation on closed file" exception

Question

I have a large file that needs to be processed before being fed to another command. I could save the processed data to a temporary file, but I would like to avoid that. I wrote a generator that processes one line at a time, and then the following script to feed its output to the external command as input. However, I get an "I/O operation on closed file" exception on the second iteration of the loop:

import subprocess

# Python 2 syntax, as in the original question
cmd = ['intersectBed', '-a', 'stdin', '-b', bedfile]
p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for entry in my_entry_generator: # <- this is my generator
    # the exception is raised here on the second iteration
    output = p.communicate(input='\t'.join(entry) + '\n')[0]
    print output

I read another similar question that uses p.stdin.write, but I still had the same problem.

What did I do wrong?

[edit] I replaced the last two statements with the following (thanks SpliFF):

    output = p.communicate(input='\t'.join(entry) + '\n')
    if output[1]: print "error:", output[1]
    else: print output[0]

I did this to see whether the external program reported any error, but it did not. I still get the same exception at the p.communicate line.

Recommended answer

The communicate method of subprocess.Popen objects can only be called once. It sends the input you give it to the process while reading all of the stdout and stderr output. By "all", I mean that it waits for the process to exit so that it knows it has all the output. Once communicate returns, the process has exited and its pipes are closed, which is why a second call fails.
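
The one-shot behaviour can be reproduced with a small sketch. It is written for Python 3 (the question's code is Python 2), and it uses a stand-in child process that simply echoes its stdin, since intersectBed may not be installed; `ECHO_CMD` and `communicate_twice` are names made up for this illustration.

```python
import subprocess
import sys

# Stand-in for the external command (an assumption for illustration):
# a child Python process that echoes stdin to stdout.
ECHO_CMD = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]


def communicate_twice():
    """Call communicate() twice on one Popen object.

    Returns the output of the first call and the exception raised by the second.
    """
    p = subprocess.Popen(ECHO_CMD, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)
    # First call: sends the input, closes stdin, reads everything, waits for exit.
    first_output, _ = p.communicate(input="a\tb\n")
    try:
        # Second call: the process is gone, so no more input can be sent.
        p.communicate(input="c\td\n")
    except ValueError as exc:
        return first_output, exc
    return first_output, None


first, err = communicate_twice()
print(repr(first))
print(type(err).__name__, err)
```

The exact error message differs between Python versions, but in both Python 2 and Python 3 the second call fails with a ValueError.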

If you want to use communicate, you have to either restart the process inside the loop or give it a single string that contains all of the input from the generator. If you want streaming communication, sending the data bit by bit, then you cannot use communicate. Instead, you need to write to p.stdin while reading from p.stdout and p.stderr. Doing this is tricky, because you cannot tell which output was caused by which input, and because you can easily run into deadlocks. There are third-party libraries that can help with this, such as Twisted.
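
Both alternatives can be sketched in Python 3 using only the standard library. This is a minimal illustration under stated assumptions, not a definitive implementation: `CMD` is a stand-in child process that upper-cases its input (used here in place of intersectBed), and `entries`, `run_joined`, and `run_streaming` are hypothetical names. The streaming variant writes from a separate thread so the parent can keep reading stdout, and it leaves stderr unpiped so that an unread stderr pipe cannot fill up and block the child.

```python
import subprocess
import sys
import threading

# Stand-in for the external command (an assumption for illustration):
# a child Python process that upper-cases each input line.
CMD = [sys.executable, "-c",
       "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]


def entries():
    """Hypothetical generator yielding tab-separable fields."""
    yield ("chr1", "10", "20")
    yield ("chr2", "30", "40")


def run_joined(cmd, entry_iter):
    """Build one string from the generator and call communicate() once."""
    data = "".join("\t".join(e) + "\n" for e in entry_iter)
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, text=True)
    return p.communicate(input=data)  # (stdout, stderr)


def run_streaming(cmd, entry_iter):
    """Feed stdin from a thread while the main thread reads stdout."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         text=True)

    def feed():
        # Leaving the with-block closes stdin, which signals end of input.
        with p.stdin:
            for e in entry_iter:
                p.stdin.write("\t".join(e) + "\n")

    writer = threading.Thread(target=feed)
    writer.start()
    lines = [line for line in p.stdout]  # read until the child closes stdout
    writer.join()
    p.stdout.close()
    p.wait()
    return lines


joined_out, joined_err = run_joined(CMD, entries())
streamed = run_streaming(CMD, entries())
print(joined_out, end="")
print(streamed)
```

The joined variant holds all of the input in memory at once, which may matter for a large file. The streaming variant avoids that, but it still cannot tell which output line belongs to which input entry.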

If you want to do this interactively, sending some data and then waiting for and processing the result before sending more, things get even harder. You should probably use a third-party library such as pexpect for that.

Of course, if you can get away with simply starting the process inside the loop, that is a lot easier:

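# Python 2 syntax, as in the original question
cmd = ['intersectBed', '-a', 'stdin', '-b', bedfile]
for entry in my_entry_generator:
    # start a fresh process for every entry, so communicate is called once per process
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = p.communicate(input='\t'.join(entry) + '\n')[0]
    print output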
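
For readers on Python 3, the same process-per-entry approach can be sketched with subprocess.run, which starts the process, sends the input, and collects the output in a single call. As before, `CMD` is a stand-in child process rather than intersectBed, and `run_per_entry` is a hypothetical helper name.

```python
import subprocess
import sys

# Stand-in for the external command (an assumption for illustration):
# a child Python process that upper-cases its input.
CMD = [sys.executable, "-c",
       "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def run_per_entry(cmd, entry_iter):
    """Start one process per entry and collect each process's stdout."""
    outputs = []
    for entry in entry_iter:
        result = subprocess.run(cmd, input="\t".join(entry) + "\n",
                                capture_output=True, text=True)
        if result.stderr:
            print("error:", result.stderr, file=sys.stderr)
        outputs.append(result.stdout)
    return outputs


outputs = run_per_entry(CMD, [("chr1", "10", "20"), ("chr2", "30", "40")])
print(outputs)
```

Starting a process for every line is simple, but it can be slow for a large file, because the cost of launching the external command is paid once per entry.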
