Bug writing audio using custom video writer library


Question

I'm trying to wrap a handy little piece of C++ code that generates video+audio on Windows using VFW. The C++ library lives here, and its description says:

Uses Video for Windows (so it's not portable). Handy if you want to quickly record a video somewhere and don't feel like wading through the VfW docs yourself.

I'd like to use that C++ library from Python, so I've decided to wrap it using SWIG.

The thing is, I'm having problems encoding the audio. The generated video is broken, and it seems the audio has not been written properly to the video file: if I try to open the video with VLC or any similar video player, I get a message saying the player can't identify the audio or video codec. The video images are fine, so it's definitely a problem with the way I'm writing the audio to the file.

I'm attaching both the SWIG interface and a little Python test that tries to be a port of the original C++ test.

aviwriter.i

%module aviwriter

%{
#include "aviwriter.h"
%}

%typemap(in) (const unsigned char* buffer) (char* buffer, Py_ssize_t length) %{
  if(PyBytes_AsStringAndSize($input,&buffer,&length) == -1)
    SWIG_fail;
  $1 = (unsigned char*)buffer;
%}

%typemap(in) (const void* buffer) (char* buffer, Py_ssize_t length) %{
  if(PyBytes_AsStringAndSize($input,&buffer,&length) == -1)
    SWIG_fail;
  $1 = (void*)buffer;
%}


%include "aviwriter.h"

test.py

import argparse
import sys
import struct
from distutils.util import strtobool

from aviwriter import AVIWriter


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-audio", action="store", default="1")
    parser.add_argument('-width', action="store",
                        dest="width", type=int, default=400)
    parser.add_argument('-height', action="store",
                        dest="height", type=int, default=300)
    parser.add_argument('-numframes', action="store",
                        dest="numframes", type=int, default=256)
    parser.add_argument('-framerate', action="store",
                        dest="framerate", type=int, default=60)
    parser.add_argument('-output', action="store",
                        dest="output", type=str, default="checker.avi")

    args = parser.parse_args()

    audio = strtobool(args.audio)
    framerate = args.framerate
    num_frames = args.numframes
    width = args.width
    height = args.height
    output = args.output

    writer = AVIWriter()

    if not writer.Init(output, framerate):
        print("Couldn't open video file!")
        sys.exit(1)

    writer.SetSize(width, height)

    data = [0]*width*height
    sampleRate = 44100
    samples_per_frame = 44100 / framerate
    samples = [0]*int(samples_per_frame)

    c1, s1, f1 = 24000.0, 0.0, 0.03
    c2, s2, f2 = 1.0, 0.0, 0.0013

    for frame in range(num_frames):
        print(f"frame {frame}")

        i = 0
        for y in range(height):
            for x in range(width):
                on = ((x + frame) & 32) ^ ((y+frame) & 32)
                data[i] = 0xffffffff if on else 0xff000000
                i += 1
        writer.WriteFrame(
            struct.pack(f'{len(data)}L', *data),
            width*4
        )

        if audio:
            for i in range(int(samples_per_frame)):
                c1 -= f1*s1
                s1 += f1*c1
                c2 += f2*s2
                s2 -= f2*c2

                val = s1 * (0.75 + 0.25 * c2)
                if(frame == num_frames - 1):
                    val *= 1.0 * (samples_per_frame - 1 - i) / \
                        samples_per_frame
                samples[i] = int(val)

                if frame==0:
                    print(f"i={i} val={int(val)}")

            writer.WriteAudioFrame(
                struct.pack(f'{len(samples)}i', *samples),
                int(samples_per_frame)
            )

    writer.Exit()

I don't think samples is being generated incorrectly, as I've already compared the values generated on the Python side with the values generated on the C++ side, although only for the packet written for frame 0.

One suspicion is the way I've created the typemap in SWIG; maybe that's not right. Or maybe the problem lives in the line writer.WriteAudioFrame(struct.pack(f'{len(samples)}i', *samples), int(samples_per_frame)). I don't know what it could be, but the way I'm sending the audio buffer from Python to the C++ wrapper is definitely not right.

So, would you know how to fix the attached code so that test.py generates a video with the right audio, like the C++ test does?

When generated correctly, the video will display a magic scrolling checkerboard with hypnotic sinewaves as audio backdrop :D

Additional notes:

1) It seems the above code is not calling writer.SetAudioFormat, which is needed for the functions AVIFileCreateStreamA and AVIStreamSetFormat. The problem is that I don't know how to export this structure with SWIG so that I can use it from Python the same way test.cpp does. From Mmreg.h, I've seen the structure looks like this:

typedef struct tWAVEFORMATEX
{
    WORD    wFormatTag;        /* format type */
    WORD    nChannels;         /* number of channels (i.e. mono, stereo...) */
    DWORD   nSamplesPerSec;    /* sample rate */
    DWORD   nAvgBytesPerSec;   /* for buffer estimation */
    WORD    nBlockAlign;       /* block size of data */
    WORD    wBitsPerSample;    /* Number of bits per sample of mono data */
    WORD    cbSize;            /* The count in bytes of the size of
                                    extra information (after cbSize) */

} WAVEFORMATEX;

Unfortunately, I don't know how to wrap that in aviwriter.i. I've tried using %include windows.i and including the definitions directly in a %{ ... %} block, but all I got was a bunch of errors :/

2) I'd prefer not to modify aviwriter.h or aviwriter.cpp at all, as that's basically external working code.

3) Assuming I'm able to wrap WAVEFORMATEX so I can use it from Python, how would you use memset the way test.cpp does? i.e. memset(&wfx,0,sizeof(wfx));

Recommended answer

Two suggestions:

  • First, pack the audio data as short instead of int, as the C++ test does. The audio data is 16-bit, not 32-bit, so use the 'h' format character when packing. For example, struct.pack(f'{len(samples)}h', *samples).

  • Second, see the code modification below. Expose WAVEFORMATEX via SWIG by editing aviwriter.i, then call writer.SetAudioFormat(wfx) from Python.
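To illustrate the first suggestion, here is a minimal sketch of the size difference between the two format characters. The pack_pcm16 helper and its clamping step are assumptions added for illustration; they are not part of test.py or of the library:

```python
import struct


def pack_pcm16(samples):
    """Pack integer samples as native-endian signed 16-bit PCM.

    Hypothetical helper: values are clamped to the int16 range first,
    because struct.pack raises struct.error for out-of-range 'h' values.
    """
    clamped = [max(-32768, min(32767, int(s))) for s in samples]
    return struct.pack(f"{len(clamped)}h", *clamped)


samples = [0, 1000, -1000, 40000]  # 40000 is deliberately out of range

as_int32 = struct.pack(f"{len(samples)}i", *samples)  # what test.py did
as_int16 = pack_pcm16(samples)                        # what 16-bit PCM needs

print(len(as_int32))  # 4 bytes per sample
print(len(as_int16))  # 2 bytes per sample
```

For the same sample count, the 'i' buffer is twice as long as a 16-bit format implies, so the bytes no longer line up with the samples the writer expects.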

In my tests, the memset() was not necessary. From Python you can manually set the field cbSize to zero, and that should be enough. The other six fields are mandatory, so you'll be setting them anyway. It looks like this struct isn't meant to be revised in the future: it does not have a struct size field, and the semantics of cbSize (appending arbitrary data to the end of the struct) would conflict with an extension anyway.
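For readers who want to see what the C memset amounts to, here is a standalone analogy using ctypes. The WaveFormatEx class below is a hypothetical mirror of the fields, written only for this illustration; it is not the SWIG-generated WAVEFORMATEX class, and its memory layout is not guaranteed to match the Windows header:

```python
import ctypes


class WaveFormatEx(ctypes.Structure):
    """Illustrative ctypes mirror of the WAVEFORMATEX fields."""
    _fields_ = [
        ("wFormatTag", ctypes.c_uint16),
        ("nChannels", ctypes.c_uint16),
        ("nSamplesPerSec", ctypes.c_uint32),
        ("nAvgBytesPerSec", ctypes.c_uint32),
        ("nBlockAlign", ctypes.c_uint16),
        ("wBitsPerSample", ctypes.c_uint16),
        ("cbSize", ctypes.c_uint16),
    ]


wfx = WaveFormatEx()   # ctypes zero-fills a newly created instance
print(wfx.cbSize)

wfx.cbSize = 22        # dirty one field, then clear the whole struct
ctypes.memset(ctypes.addressof(wfx), 0, ctypes.sizeof(wfx))
print(wfx.cbSize)
```

Both prints show zero: once by default initialization, once by the explicit memset. Setting cbSize = 0 by hand, as suggested above, reaches the same state for the only field that is not otherwise assigned.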

aviwriter.i:

%inline %{
typedef unsigned short WORD;
typedef unsigned long DWORD;
typedef struct tWAVEFORMATEX
{
    WORD    wFormatTag;        /* format type */
    WORD    nChannels;         /* number of channels (i.e. mono, stereo...) */
    DWORD   nSamplesPerSec;    /* sample rate */
    DWORD   nAvgBytesPerSec;   /* for buffer estimation */
    WORD    nBlockAlign;       /* block size of data */
    WORD    wBitsPerSample;    /* Number of bits per sample of mono data */    
    WORD    cbSize;            /* The count in bytes of the size of
                                extra information (after cbSize) */
} WAVEFORMATEX;
%}

test.py:

from aviwriter import WAVEFORMATEX

Later in test.py:

    wfx = WAVEFORMATEX()
    wfx.wFormatTag = 1 #WAVE_FORMAT_PCM
    wfx.nChannels = 1
    wfx.nSamplesPerSec = sampleRate
    wfx.nAvgBytesPerSec = sampleRate * 2
    wfx.nBlockAlign = 2
    wfx.wBitsPerSample = 16
    writer.SetAudioFormat(wfx)
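The two derived fields above follow from the others for uncompressed PCM: nBlockAlign is nChannels * wBitsPerSample / 8, and nAvgBytesPerSec is nSamplesPerSec * nBlockAlign. A small sketch of that arithmetic, where pcm_format_fields is a hypothetical helper that returns a plain dict and does not touch the SWIG-wrapped struct:

```python
def pcm_format_fields(sample_rate, channels=1, bits_per_sample=16):
    """Compute WAVEFORMATEX field values for uncompressed PCM."""
    block_align = channels * bits_per_sample // 8  # bytes per sample frame
    return {
        "wFormatTag": 1,                            # WAVE_FORMAT_PCM
        "nChannels": channels,
        "nSamplesPerSec": sample_rate,
        "nAvgBytesPerSec": sample_rate * block_align,
        "nBlockAlign": block_align,
        "wBitsPerSample": bits_per_sample,
        "cbSize": 0,                                # no extra data follows
    }


fields = pcm_format_fields(44100)
print(fields["nBlockAlign"], fields["nAvgBytesPerSec"])  # 2 88200
```

Assuming the wrapped struct exposes its fields as ordinary attributes, the values could be copied over with a loop such as: for name, value in fields.items(): setattr(wfx, name, value).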

Notes on SWIG: Since aviwriter.h only provides a forward declaration of tWAVEFORMATEX, SWIG has no other information about the struct, which prevents get/set wrappers from being generated. You could ask SWIG to wrap a Windows header that declares the struct, but that opens a can of worms: those headers are large and complex, and they expose further problems. Instead, you can define WAVEFORMATEX individually, as done above. The C++ types WORD and DWORD are still not declared, though. Including the SWIG file windows.i only creates wrappers which, for example, allow the string "WORD" in a Python script to be understood as indicating 16-bit data in memory; it doesn't declare the WORD type from a C++ perspective. To resolve this, add typedefs for WORD and DWORD in the %inline block in aviwriter.i. That forces SWIG to copy the code directly into the generated wrapper C++ file, making the declarations available, and it also triggers generation of the get/set wrappers. Alternatively, you could put that inlined code inside aviwriter.h if you're willing to edit it.

In short, the idea is to fully enclose all types in standalone headers or declaration blocks. Remember that .i and .h files serve separate purposes (wrappers and data conversion, versus the functionality being wrapped). Similarly, notice how aviwriter.h is included twice in aviwriter.i: once to trigger generation of the wrappers needed for Python, and once to declare the types needed by the generated C++ wrapper code.
