How to decode a video (memory file / byte string) and step through it frame by frame in python?

Problem description

I am using python to do some basic image processing, and want to extend it to process a video frame by frame.

I get the video as a blob from a server - .webm encoded - and have it in python as a byte string (b'\x1aE\xdf\xa3\xa3B\x86\x81\x01B\xf7\x81\x01B\xf2\x81\x04B\xf3\x81\x08B\x82\x88matroskaB\x87\x81\x04B\x85\x81\x02\x18S\x80g\x01\xff\xff\xff\xff\xff\xff\xff\x15I\xa9f\x99*\xd7\xb1\x83\x0fB@M\x80\x86ChromeWA\x86Chrome\x16T\xaek\xad\xae\xab\xd7\x81\x01s\xc5\x87\x04\xe8\xfc\x16\t^\x8c\x83\x81\x01\x86\x8fV_MPEG4/ISO/AVC\xe0\x88\xb0\x82\x02\x80\xba\x82\x01\xe0\x1fC\xb6u\x01\xff\xff\xff\xff\xff\xff ...).

I know that there is cv.VideoCapture, which can do almost what I need. The problem is that I would have to first write the file to disk, and then load it again. It seems much cleaner to wrap the string, e.g., into an IOStream, and feed it to some function that does the decoding.

Is there a clean way to do this in python, or is writing to disk and loading it again the way to go?

Solution

According to this post, you can't use cv.VideoCapture to decode an in-memory stream.
You may decode the stream by "piping" to FFmpeg.

The solution is a bit complicated; writing to disk is much simpler, and probably a cleaner solution.
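
For completeness, here is a minimal sketch of the simpler write-to-disk approach (in_bytes and the temporary file name are illustrative assumptions, not part of the original answer):

import cv2
import tempfile
import os

# in_bytes is assumed to hold the WebM blob received from the server.
# Write the bytes to a temporary file so cv2.VideoCapture can open it.
with tempfile.NamedTemporaryFile(suffix='.webm', delete=False) as f:
    f.write(in_bytes)
    temp_name = f.name

cap = cv2.VideoCapture(temp_name)

while True:
    ret, frame = cap.read()  # frame is a BGR NumPy array
    if not ret:
        break  # No more frames (or the file could not be decoded).
    # Process the frame here...

cap.release()
os.remove(temp_name)  # Clean up the temporary file.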

I am posting a solution using FFmpeg (and FFprobe).
There are Python bindings for FFmpeg, but this solution executes FFmpeg as an external application using the subprocess module.
(The Python bindings work well with FFmpeg, but piping to FFprobe does not).
I am using Windows 10, and I put ffmpeg.exe and ffprobe.exe in the execution folder (you may set the execution path as well).
For Windows, download the latest (statically linked) stable version.
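
For reference, a minimal sketch of the decoding step with the ffmpeg-python bindings (assuming the package is installed and the frame resolution is already known); the rest of this answer pipes to FFmpeg through subprocess instead:

import ffmpeg
import numpy as np

# in_bytes holds the WebM data; width and height are assumed to be known in advance.
out, _ = (
    ffmpeg
    .input('pipe:')
    .output('pipe:', format='rawvideo', pix_fmt='bgr24')
    .run(input=in_bytes, capture_stdout=True)
)

# All decoded frames as one NumPy array of shape (n_frames, height, width, 3).
frames = np.frombuffer(out, np.uint8).reshape([-1, height, width, 3])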

I created a standalone example that performs the following:

  • Generate a synthetic video, and save it to a WebM file (used as input for testing).
  • Read the file into memory as binary data (replace it with your blob from the server).
  • Pipe the binary stream to FFprobe, for finding the video resolution.
    In case the resolution is known in advance, you may skip this part.
    Piping to FFprobe makes the solution more complicated than it needs to be.
  • Pipe the binary stream to FFmpeg stdin for decoding, and read the decoded raw frames from the stdout pipe.
    Writing to stdin is done in chunks using a Python thread.
    (The reason for using stdin and stdout instead of named pipes is Windows compatibility).

Piping architecture:

 --------------------  Encoded      ---------  Decoded      ------------
| Input WebM encoded | data        | ffmpeg  | raw frames  | reshape to |
| stream (VP9 codec) | ----------> | process | ----------> | NumPy array|
 --------------------  stdin PIPE   ---------  stdout PIPE  -------------

Here is the code:

import numpy as np
import cv2
import io
import subprocess as sp
import threading
import json
from functools import partial
import shlex

# Build synthetic video and read binary data into memory (for testing):
#########################################################################
width, height = 640, 480
sp.run(shlex.split('ffmpeg -y -f lavfi -i testsrc=size={}x{}:rate=1 -vcodec vp9 -crf 23 -t 50 test.webm'.format(width, height)))

with open('test.webm', 'rb') as binary_file:
    in_bytes = binary_file.read()
#########################################################################


# https://stackoverflow.com/questions/5911362/pipe-large-amount-of-data-to-stdin-while-using-subprocess-popen/14026178
# https://stackoverflow.com/questions/15599639/what-is-the-perfect-counterpart-in-python-for-while-not-eof
# Write to stdin in chunks of 1024 bytes.
def writer():
    for chunk in iter(partial(stream.read, 1024), b''):
        process.stdin.write(chunk)
    try:
        process.stdin.close()
    except BrokenPipeError:
        pass  # For unknown reason there is a Broken Pipe Error when executing FFprobe.


# Get resolution of video frames using FFprobe
# (in case the resolution is known, skip this part):
################################################################################
# Open In-memory binary streams
stream = io.BytesIO(in_bytes)

process = sp.Popen(shlex.split('ffprobe -v error -i pipe: -select_streams v -print_format json -show_streams'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)

pthread = threading.Thread(target=writer)
pthread.start()

pthread.join()

in_bytes = process.stdout.read()

process.wait()

p = json.loads(in_bytes)

width = (p['streams'][0])['width']
height = (p['streams'][0])['height']
################################################################################


# Decoding the video using FFmpeg:
################################################################################
stream.seek(0)

# FFmpeg input PIPE: WebM encoded data as stream of bytes.
# FFmpeg output PIPE: decoded video frames in BGR format.
process = sp.Popen(shlex.split('ffmpeg -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)

thread = threading.Thread(target=writer)
thread.start()


# Read decoded video (frame by frame), and display each frame (using cv2.imshow)
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = process.stdout.read(width * height * 3)

    if not in_bytes:
        break  # Break loop if no more bytes.

    # Transform the byte read into a NumPy array
    in_frame = (np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3]))

    # Display the frame (for testing)
    cv2.imshow('in_frame', in_frame)

    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

if not in_bytes:
    # Wait for the writer thread to end, only if the loop was not exited by pressing 'q'.
    thread.join()

try:
    process.wait(1)
except sp.TimeoutExpired:
    process.kill()  # In case 'q' is pressed.
################################################################################

cv2.destroyAllWindows()

Remarks:

  • In case you are getting an error like "file not found: ffmpeg...", try using the full path (or resolve it at runtime, as sketched below).
    For example (in Linux): '/usr/bin/ffmpeg -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'
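
A minimal sketch of resolving the executable path at runtime with shutil.which from the Python standard library (it returns None when ffmpeg is not on the PATH); the command string below mirrors the one used in the answer:

import shutil
import shlex
import subprocess as sp

# Resolve the full path of the ffmpeg executable from the PATH environment variable.
ffmpeg_path = shutil.which('ffmpeg')
if ffmpeg_path is None:
    raise FileNotFoundError('ffmpeg was not found on the PATH')

# Use the resolved path instead of the bare 'ffmpeg' command.
process = sp.Popen(shlex.split('{} -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'.format(ffmpeg_path)),
                   stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)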
