Mangled output when printing strings from FFProbe STDERR

Question

I'm trying to make a simple function to wrap around FFProbe, and most of the data can be retrieved correctly.

The problem is that when I actually print the strings to the command line, in both Windows Command Prompt and Git Bash for Windows, the output appears mangled and out of order.

Some songs (specifically the file Imagine Dragons - Hit Parade_ Best of the Dance Music Charts\80 - Beazz - Lime (Extended Mix).flac) are missing metadata. I don't know why, but the dictionary the function below returns is empty.

FFProbe outputs its results to stderr, which can be piped to subprocess.PIPE, decoded, and parsed. I chose regex for the parsing bit.

Below is a slimmed-down version of my code; for the output, take a look at the GitHub gist.

#! /usr/bin/env python3
# -*- coding: utf-8 -*-

import os

from glob import glob
from re import findall, MULTILINE
from subprocess import Popen, PIPE


def glob_from(path, ext):
    """Return glob from a directory."""
    working_dir = os.getcwd()
    os.chdir(path)

    file_paths = glob("**/*." + ext)

    os.chdir(working_dir)

    return file_paths


def media_metadata(file_path):
    """Use FFPROBE to get information about a media file."""
    stderr = Popen(("ffprobe", file_path), shell=True, stderr=PIPE).communicate()[1].decode()

    metadata = {}

    for match in findall(r"(\w+)\s+:\s(.+)$", stderr, MULTILINE):
        metadata[match[0].lower()] = match[1]

    return metadata


if __name__ == "__main__":
    base = "C:/Users/spike/Music/Deezloader"

    for file in glob_from(base, "flac"):
        meta = media_metadata(os.path.join(base, file))
        title_length = meta.get("title", file) + " - " + meta.get("length", "000")

        print(title_length)

Output gist / Output raw

I don't understand why the output appears disordered only when printing to the console with Python's print function. The regex pattern retrieves the strings correctly, yet they come out strangely formatted when printed. It doesn't matter how I build the string to print: concatenation, comma-delimited arguments, whatever.

I end up with the length of the song first and the song name second, with no space between the two. The dash is left hanging off the end for some reason. Based on the print statement in the code above, the format should be Title - 000 ({title} - {length}), but the output looks more like 000Title -. Why?

Recommended answer

I solved this with the accepted answer to my related question.

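I had forgotten about the carriage return at the end of each line. The lines captured from ffprobe ended in \r\n, and because . in the regex matches \r, every captured value kept a trailing \r. When printed to a terminal, \r moves the cursor back to the start of the line, so whatever follows overwrites what was already printed. The solutions given are as follows: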

  1. Use universal_newlines=True in the subprocess call.
    • stderr = Popen(("ffprobe", file_path), shell=True, stderr=PIPE, universal_newlines=True).communicate()[1]
  2. Strip the whitespace from each line of stderr (called on the whole decoded output, these only trim its very end).
    • *.communicate()[1].decode().rstrip() to strip all whitespace at the end.
    • *.communicate()[1].decode().strip() to strip all whitespace on both sides.
    • *.communicate()[1].decode()[:-2] to remove the last two characters.
  3. Swallow \r in the regex pattern.
    • findall(r"(\w+)\s+:\s(.+)\r$", stderr, MULTILINE)
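
To make the cause and the fixes listed above concrete, here is a minimal sketch. The sample text is a made-up stand-in for decoded ffprobe stderr with Windows line endings (the tag values are hypothetical), and the parse helper is not part of the original code. The replace call only approximates what universal_newlines=True does when the pipe is read in text mode.

```python
from re import findall, MULTILINE

# Hypothetical stand-in for decoded ffprobe stderr on Windows: lines end in "\r\n".
SAMPLE = "    title           : Lime\r\n    length          : 215\r\n"

PATTERN = r"(\w+)\s+:\s(.+)$"


def parse(text, pattern=PATTERN):
    """Collect lowercased 'key : value' pairs, as media_metadata() does."""
    return {key.lower(): value for key, value in findall(pattern, text, MULTILINE)}


# Original behaviour: "." matches "\r", so each value keeps a trailing "\r".
raw = parse(SAMPLE)

# Fix 1 (approximating universal_newlines=True): turn "\r\n" into "\n" first.
normalised = parse(SAMPLE.replace("\r\n", "\n"))

# Fix 2: strip each captured value rather than the whole output.
stripped = {key: value.rstrip() for key, value in raw.items()}

# Fix 3: keep "\r" out of the group. The lazy "+?" with an optional "\r?"
# also tolerates plain "\n" line endings.
swallowed = parse(SAMPLE, r"(\w+)\s+:\s(.+?)\r?$")

# repr() makes the otherwise invisible carriage returns visible.
print(repr(raw["title"] + " - " + raw["length"]))                # 'Lime\r - 215\r'
print(repr(normalised["title"] + " - " + normalised["length"]))  # 'Lime - 215'
```

Printing the first string without repr() is what scrambles the terminal line: each \r sends the cursor back to column zero before the rest is written.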

This was all very helpful, but I didn't use any of these suggestions.

I didn't know that FFPROBE offers JSON output to STDOUT, but it does. The code to do that is below.

#! /usr/bin/env python3
# -*- coding: utf-8 -*-
from json import loads
from subprocess import check_output, DEVNULL


def arg_builder(args, kwargs, defaults=None):
    """Build arguments from `args` and `kwargs` in a shell-lexical manner."""
    # Fill in defaults without overriding options the caller set explicitly.
    for key, val in (defaults or {}).items():
        kwargs.setdefault(key, val)

    args = list(args)

    for arg, val in kwargs.items():
        if isinstance(val, bool):
            # Boolean options are bare flags: emitted when True, skipped when False.
            if val:
                args.append("-" + arg)
        else:
            # Any other value becomes "-option value".
            args.extend(("-" + arg, str(val)))

    return args


def run_ffprobe(file_path, *args, **kwargs):
    """Use FFPROBE to get information about a media file."""
    # Unpack the built options so the command is one flat sequence of strings.
    command = ("ffprobe", *arg_builder(args, kwargs, defaults={"show_format": True}),
               "-of", "json", file_path)
    # An argument sequence needs no shell; ffprobe's banner on stderr is discarded.
    return loads(check_output(command, stderr=DEVNULL))

You might also get some use out of arg_builder(). It isn't perfect, but it's good enough for simple shell commands. It isn't made to be idiot-proof: it was written with a few holes, on the assumption that the programmer won't break anything.
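
As a rough illustration of how the parsed JSON might replace the regex approach, the sketch below builds the "Title - length" string from a document shaped like the output of ffprobe with -show_format -of json. The sample JSON is hypothetical and abridged, title_and_length is an illustrative helper rather than part of the answer, and it assumes that the length comes from the format's duration field and that tag names may vary in case.

```python
from json import loads

# Hypothetical, abridged sample of `ffprobe -show_format -of json` output.
SAMPLE_JSON = """
{
    "format": {
        "filename": "example.flac",
        "duration": "215.040000",
        "tags": {"TITLE": "Lime (Extended Mix)", "ARTIST": "Beazz"}
    }
}
"""


def title_and_length(probe, fallback="unknown"):
    """Return 'Title - seconds' from a parsed ffprobe JSON document."""
    fmt = probe.get("format", {})
    # Tag names may differ in case between files, so compare lowercased keys.
    tags = {key.lower(): value for key, value in fmt.get("tags", {}).items()}
    title = tags.get("title", fallback)
    # Assumption: duration is a string of seconds; a missing value counts as 0.
    seconds = round(float(fmt.get("duration", 0)))
    return "{} - {}".format(title, seconds)


print(title_and_length(loads(SAMPLE_JSON)))  # Lime (Extended Mix) - 215
```

Because the values come from a JSON parser rather than from line-based matching, no stray \r ends up in the strings. A file with no metadata simply yields the fallback instead of an empty match.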
