Writing a large amount of data to stdin
Question
I am writing a large amount of data to a subprocess's stdin.
How do I ensure that the write does not block?
import subprocess

p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
p.stdin.write('A very very very large amount of data')
p.stdin.flush()
output = p.stdout.readline()
It seems to hang at p.stdin.write() after I read a large string and write it to the pipe.
I have a large corpus of files (>1k files) whose contents will be written to stdin sequentially.
So what happens is that I am running a loop:
# this loop is repeated for all the files
for stri in lines:
    p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    p.stdin.write(stri)
    output = p.stdout.readline()
    # do some processing
It somehow hangs at file no. 400, which is a large file with long strings. I suspect a blocking issue: the hang only happens if I iterate from 0 to 1000. However, if I start from file 400, the error does not happen.
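The symptoms are consistent with the OS pipe buffer filling up: once the parent has written a buffer's worth of data that the child has not yet consumed, the next write() blocks, and if the child is meanwhile blocked writing its own output, the two deadlock. A quick sketch of the limit (buffer size varies by OS; 64 KiB is typical on Linux):

```python
import os

# Fill a pipe with far more data than a typical pipe buffer holds.
r, w = os.pipe()
os.set_blocking(w, False)   # make the write return instead of hanging
data = b'x' * (1 << 20)     # 1 MiB payload
n = os.write(w, data)       # only the buffer's worth is accepted
print(n < len(data))        # True: writing the rest would have blocked
os.close(r)
os.close(w)
```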
Answer
To avoid the deadlock in a portable way, write to the child in a separate thread:
#!/usr/bin/env python3
from subprocess import Popen, PIPE
from threading import Thread

def pump_input(pipe, lines):
    # The writer thread keeps feeding stdin while the main
    # thread drains stdout, so neither pipe can fill up and deadlock.
    with pipe:
        for line in lines:
            pipe.write(line)

p = Popen(path, stdin=PIPE, stdout=PIPE)
Thread(target=pump_input, args=[p.stdin, lines]).start()
with p.stdout:
    for line in iter(p.stdout.readline, b''):  # read output until EOF
        print(line.decode(), end='')
p.wait()
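Since the question's loop starts a fresh process for each file and performs a single write/read exchange, Popen.communicate() is an even simpler fix: it pumps stdin and drains stdout concurrently for you. A minimal sketch, assuming a POSIX system and using cat as a stand-in for the real program at path:

```python
from subprocess import Popen, PIPE

# 'cat' echoes stdin back to stdout, standing in for the real child program
p = Popen(['cat'], stdin=PIPE, stdout=PIPE)
out, _ = p.communicate(b'A' * (1 << 20))  # 1 MiB: would deadlock with a plain write-then-read
print(len(out) == 1 << 20)                # True: all data made the round trip
```

Note that communicate() buffers the entire output in memory and can only be called once per process, so the threaded-writer pattern above remains the right tool when output must be streamed line by line.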
See also Python: read streaming input from subprocess.communicate().