Python: Strange hanging behavior when piping large stdout of a subprocess


Question


I am currently calling ffmpeg to extract a binary data stream from a video file and then putting that binary data into a list. There is a lot of data in this stream, about 4,000 KB. Here is the code:

import subprocess

# build the ffmpeg command and pipe its stdout
cmd = "ffmpeg -i video.mpg -map 0:1 -c copy -f data -"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)

# read from stdout, byte by byte (the stream is binary, so the
# EOF sentinel is b"", not "")
li = []
for char in iter(lambda: proc.stdout.read(1), b""):
    li.append(char)


This works fine. However, if I take out the part where I am reading from stdout, it starts working but then hangs:

import subprocess
import time

cmd = "ffmpeg -i video.mpg -map 0:1 -c copy -f data -"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
time.sleep(10)


I had to add time.sleep(10) at the end or else the process would end before the subprocess, causing this error:

av_interleaved_write_frame(): Invalid argument
Error writing trailer of pipe:: Invalid argument
size=       0kB time=00:00:00.00 bitrate=N/A speed=N/A
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
Conversion failed!


Calling either subprocess.call(cmd, stdout=subprocess.PIPE) or subprocess.call(cmd) also causes hanging (the latter just displays the stdout in the console while the former doesn't).


Is there something about reading from stdout that prevents this hanging (like perhaps the buffer getting cleared), or am I unknowingly introducing a bug elsewhere? I'm worried that such a small change causes the program to break; it doesn't inspire very much confidence.


The other issue with this code is that I need to read from the list from another thread. This might mean I need to use a Queue. But when I execute the below code, it takes 11 seconds as opposed to 3 seconds with the list equivalent:

import subprocess
from queue import Queue

cmd = "ffmpeg -i video.mpg -loglevel panic -hide_banner -map 0:1 -c copy -f data -"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)

q = Queue()

# enqueue the stream byte by byte
for char in iter(lambda: proc.stdout.read(1), b""):
    q.put(char)


Should I be using another data structure?

Answer


  1. Reading data from the pipe one byte at a time is really inefficient. You should read bigger chunks.
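As a minimal sketch of chunked reading (a small Python one-liner stands in for the ffmpeg command here, since the actual video file may not be available; the 64 KiB chunk size is an assumption, not a requirement):

```python
import subprocess
import sys

# Stand-in writer process (assumption): emits 100,000 bytes to stdout,
# the same way ffmpeg would write the extracted data stream.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(b'x' * 100_000)"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

chunks = []
# Read up to 64 KiB at a time; read() returns b"" at EOF, ending the loop.
for chunk in iter(lambda: proc.stdout.read(65536), b""):
    chunks.append(chunk)
proc.wait()

data = b"".join(chunks)
print(len(data))  # 100000
```

One 64 KiB read replaces up to 65,536 one-byte reads, so the Python-level loop overhead drops by roughly that factor.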

  2. Executing the subprocess and then terminating the parent without waiting for the child to finish causes a broken-pipe error, and the subprocess fails, as you noticed.
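A sketch of waiting for the child instead of sleeping: communicate() both drains the pipe and waits for the process to exit (again with a stand-in command in place of ffmpeg):

```python
import subprocess
import sys

# Stand-in child process (assumption): writes a short payload and exits.
cmd = [sys.executable, "-c", "import sys; sys.stdout.write('done')"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

# communicate() reads everything the child writes AND waits for it to
# finish, so the parent cannot race ahead of the child -- no sleep needed.
out, _ = proc.communicate()
print(out, proc.returncode)  # b'done' 0
```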

  3. Calling subprocess.call(cmd, stdout=subprocess.PIPE) will block/stall the writer if the OS pipe buffer fills up (i.e., if nothing reads from the pipe, as in your case).
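If the output is not actually needed, one way to avoid that stall (a sketch, assuming Python 3.3+ for subprocess.DEVNULL) is to discard it rather than pipe it:

```python
import subprocess
import sys

# Stand-in writer (assumption): produces far more data than a pipe
# buffer holds; with DEVNULL the writer can never block on a full buffer.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(b'y' * 1_000_000)"]
rc = subprocess.call(cmd, stdout=subprocess.DEVNULL)
print(rc)  # 0
```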

  4. A Queue is fine for handing the data to another thread.
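A minimal sketch of the Queue approach that keeps it fast: enqueue chunks rather than single bytes, with a reader thread feeding a consumer (the stand-in command replaces ffmpeg; the chunk size and None sentinel are assumptions):

```python
import subprocess
import sys
import threading
from queue import Queue

# Stand-in writer process (assumption): emits 50,000 bytes to stdout.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(b'z' * 50_000)"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

q = Queue()

def reader():
    # Put whole chunks on the queue; a None sentinel marks EOF.
    for chunk in iter(lambda: proc.stdout.read(65536), b""):
        q.put(chunk)
    q.put(None)

threading.Thread(target=reader, daemon=True).start()

# Consumer (here, the main thread) drains the queue until the sentinel.
received = []
while True:
    chunk = q.get()
    if chunk is None:
        break
    received.append(chunk)
proc.wait()
print(len(b"".join(received)))  # 50000
```

Putting one item per chunk instead of one per byte cuts the number of Queue operations by orders of magnitude, which is likely where the extra seconds went.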
