How to get rid of random discontinuities in raw audio signal in C++ using Win32?

Question

I want to continuously and seamlessly feed raw audio data into a cyclic buffer in small time intervals in C++ using Win32. The header.lpData buffer of WAVEHDR contains the raw audio data, and calling waveInAddBuffer(wi, &header, sizeof(WAVEHDR)) cyclically overwrites this buffer in small time intervals. The image below shows the problem:

While the buffer is repeatedly overwritten in small chunks (from left to right; the current offset is shown by the magenta line, where the wave has a visible discontinuity), there are additional discontinuities at random places in the wave (yellow lightning bolts). I wrote the same thing in Java a few years ago, and there it works flawlessly without discontinuities in the audio input.

Is there something I'm doing wrong or is this a bug in the Win32 audio library?

Here is the relevant part of my C++ code:

#define VC_EXTRALEAN
#pragma comment(lib,"winmm.lib")
#include <Windows.h>

const int sample_rate = 4*4096; // must be supported by microphone
const int sample_size = 4096; // must be a power of 2

const int buffer_size = 2*sample_size;
char* buffer = new char[buffer_size];
float* wave = new float[sample_size];
int offset = 0;

void convert(float* const wave, const char* const buffer, int offset) {
    const float scale = 4.0f/65536.0f;
    for(int i=0; i<sample_size; i++) {
        const uint p = (offset-1+sample_size-i)%(buffer_size/2);
        wave[i] = scale*(float)((buffer[2*p+1]<<8)|(buffer[2*p]&0xFF));
    }
}

int main() {
    for(uint i=0; i<buffer_size; i++) buffer[i] = 0;
    for(uint i=0; i<sample_size; i++) wave[i] = 0.0f;

    WAVEFORMATEX wfx = {};
    wfx.wFormatTag = WAVE_FORMAT_PCM;    // PCM is standard
    wfx.nChannels = 1;                   // 1 channel (mono)
    wfx.nSamplesPerSec = sample_rate;    // sample_rate
    wfx.wBitsPerSample = 16;             // 16 bit samples
    wfx.nBlockAlign = wfx.wBitsPerSample*wfx.nChannels/8;
    wfx.nAvgBytesPerSec = wfx.nBlockAlign*wfx.nSamplesPerSec*wfx.nChannels;
    wfx.cbSize = 0;
    HWAVEIN wi;                          // open recording device
    WAVEHDR header = {};                 // initialize header empty
    header.dwFlags = 0;                  // clear the 'done' flag
    header.dwBytesRecorded = 0;          // tell it no bytes have been recorded
    header.lpData = buffer;              // give it a pointer to our buffer
    header.dwBufferLength = buffer_size; // tell it the size of that buffer in bytes
    waveInOpen(&wi, WAVE_MAPPER, &wfx, NULL, NULL, CALLBACK_NULL|WAVE_FORMAT_DIRECT);
    waveInStart(wi); // start recording
    waveInPrepareHeader(wi, &header, sizeof(WAVEHDR)); // prepare header

    while(true) {
        waveInAddBuffer(wi, &header, sizeof(WAVEHDR)); // read in new audio data into buffer
        offset = header.dwBytesRecorded; // get offset of to which point the buffer is overwritten
    
        convert(wave, buffer, offset);
        // plot wave and offset

        sleep(1.0/120.0); // time in seconds
    }
    waveInUnprepareHeader(wi, &header, sizeof(WAVEHDR));
    waveInStop(wi); // once the user hits escape, stop recording, and clean up
    waveInClose(wi);
}

I tried the solution from @Adrian McCarthy and it does not work as pointed out in the comment. The modified code is:

#define VC_EXTRALEAN
#pragma comment(lib,"winmm.lib")
#include <Windows.h>

const int sample_rate = 4*4096; // must be supported by microphone
const int sample_size = 4096; // must be a power of 2

const uint buffer_size = 2*sample_size/8; // make buffers 1/8 the size of the total wave buffer
char* buffer1 = new char[buffer_size];
char* buffer2 = new char[buffer_size];
float* wave = new float[sample_size];
int offset = 0;

void convert(float* const wave, const char* const buffer, int offset) {
    const float scale = 4.0f/65536.0f;
    for(int i=sample_size-1; i>=offset/2; i--) {
        wave[i] = wave[i-offset/2];
    }
    for(int i=0; i<offset/2; i++) {
        const uint p = offset/2-1-i;
        wave[i] = scale*(float)((buffer[2*p+1]<<8)|(buffer[2*p]&0xFF));
    }
}

int main() {
    for(uint i=0; i<buffer_size; i++) buffer1[i] = 0;
    for(uint i=0; i<buffer_size; i++) buffer2[i] = 0;
    for(uint i=0; i<sample_size; i++) wave[i] = 0.0f;

    WAVEFORMATEX wfx = {};
    wfx.wFormatTag = WAVE_FORMAT_PCM;    // PCM is standard
    wfx.nChannels = 1;                   // 1 channel (mono)
    wfx.nSamplesPerSec = sample_rate;    // sample_rate
    wfx.wBitsPerSample = 16;             // 16 bit samples
    wfx.nBlockAlign = wfx.wBitsPerSample*wfx.nChannels/8;
    wfx.nAvgBytesPerSec = wfx.nBlockAlign*wfx.nSamplesPerSec*wfx.nChannels;
    wfx.cbSize = 0;
    HWAVEIN wi;                             // open recording device
    WAVEHDR* pCurrent = new WAVEHDR();      // initialize header empty
    pCurrent->dwFlags = 0;                  // clear the 'done' flag
    pCurrent->dwBytesRecorded = 0;          // tell it no bytes have been recorded
    pCurrent->lpData = buffer1;             // give it a pointer to our buffer
    pCurrent->dwBufferLength = buffer_size; // tell it the size of that buffer in bytes
    WAVEHDR* pNext = new WAVEHDR();         // initialize header empty
    pNext->dwFlags = 0;                     // clear the 'done' flag
    pNext->dwBytesRecorded = 0;             // tell it no bytes have been recorded
    pNext->lpData = buffer2;                // give it a pointer to our buffer
    pNext->dwBufferLength = buffer_size;    // tell it the size of that buffer in bytes
    waveInOpen(&wi, WAVE_MAPPER, &wfx, NULL, NULL, CALLBACK_NULL|WAVE_FORMAT_DIRECT);
    waveInStart(wi); // start recording
    waveInPrepareHeader(wi, pCurrent, sizeof(WAVEHDR)); // prepare header
    waveInPrepareHeader(wi, pNext   , sizeof(WAVEHDR)); // prepare header

    while(true) {
        do {
            waveInAddBuffer(wi, pCurrent, sizeof(WAVEHDR));
            sleep(0.001);
        } while((pCurrent->dwFlags&WHDR_DONE)==0);
        pCurrent->dwFlags &= ~WHDR_DONE;
        swap(pCurrent, pNext);

        offset = pCurrent->dwBytesRecorded; // get offset of to which point the buffer is overwritten
    
        convert(wave, buffer1, offset);
        // plot wave and offset

        sleep(1.0/120.0); // time in seconds
    }
    waveInUnprepareHeader(wi, pCurrent, sizeof(WAVEHDR));
    waveInUnprepareHeader(wi, pNext   , sizeof(WAVEHDR));
    waveInStop(wi); // once the user hits escape, stop recording, and clean up
    waveInClose(wi);
}

Result:

Recommended answer

Problems:

  • Your thread is racing with a system thread that's filling the buffer and updating the fields in the header. When you read the dwBytesRecorded field, you can get a value less than the number of bytes actually in the buffer. The thread filling the buffer will occasionally update dwBytesRecorded, but that number will be out-of-date a split second later, as the recording continues. And that's optimistically assuming that reading a DWORD while another thread may be writing to it is safe.

  • When you add the buffer again, the audio system believes this is a new buffer to switch to as soon as the current one is full. You're passing it the same buffer, hoping it will just start filling it from the beginning. But it might also be twiddling the Reserved fields in the header, creating an inconsistent state.

  • I'm not sure which sleep function you're using, but most of them can't/don't wait for a precise amount of time. The Win32 Sleep will wait at least the number of milliseconds specified and then mark the thread as ready-to-run, but it doesn't actually run until the scheduler gets around to it. In practice, this might not be a problem, since your buffer holds about 250 milliseconds of audio (8192 bytes of 16-bit mono at 16384 Hz), which is an order of magnitude larger than the uncertainty from the sleep.

The typical way to implement this is to ping-pong between two (or more) buffers. You add two very short buffers, and wait for the first one to get the WHDR_DONE flag set in its header [see Note]. You then process the entire first buffer at once while the system continues to record into the second buffer. Once you're done processing a buffer, you re-add it, and then wait for the other buffer to become ready.

// Given two buffers `ping` and `pong` with corresponding WAVEHDRs
// `ping_header` and `pong_header`...
WAVEHDR *pCurrent = ping_header;
WAVEHDR *pNext = pong_header;
waveInAddBuffer(wi, pCurrent, sizeof(WAVEHDR));
waveInAddBuffer(wi, pNext, sizeof(WAVEHDR));

for (;;) {
  // wait for the current buffer to fill
  while ((pCurrent->dwFlags & WHDR_DONE) == 0) {}  // SEE NOTE

  // As recording continues with *pNext, process and draw
  // the data from pCurrent->lpData.

  // Now that we're done processing pCurrent, we can re-add it so
  // the system has a place to record when pNext is full.
  waveInAddBuffer(wi, pCurrent, sizeof(WAVEHDR));
  // What was next becomes current, and the new next is the old current.
  swap(pCurrent, pNext);
}
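
The "process and draw the data from pCurrent->lpData" step in the outline above could be sketched roughly as follows, assuming 16-bit little-endian mono PCM as in the question. The names `decode_sample` and `append_chunk` are hypothetical helpers introduced only for illustration, and the rolling `std::deque` history is one possible design, not something taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical helper: decode one little-endian signed 16-bit PCM sample
// into a float in the range [-1, 1).
inline float decode_sample(const char* bytes) {
    const auto lo = static_cast<std::uint8_t>(bytes[0]);
    const auto hi = static_cast<std::uint8_t>(bytes[1]);
    const auto raw = static_cast<std::uint16_t>(lo | (hi << 8));
    return static_cast<float>(static_cast<std::int16_t>(raw)) / 32768.0f;
}

// Hypothetical helper: append one completed buffer to a rolling history,
// dropping the oldest samples so `history` never grows beyond `capacity`.
inline void append_chunk(std::deque<float>& history, std::size_t capacity,
                         const char* data, std::size_t bytes_recorded) {
    assert(bytes_recorded % 2 == 0);  // 16-bit samples arrive as byte pairs
    for (std::size_t i = 0; i + 1 < bytes_recorded; i += 2) {
        history.push_back(decode_sample(data + i));
        if (history.size() > capacity) history.pop_front();
    }
}
```

Inside the loop this might be called as `append_chunk(history, sample_size, pCurrent->lpData, pCurrent->dwBytesRecorded);` once the buffer is done. Note that it reads from the completed header's own `lpData`, whereas the modified code in the question always passes `buffer1`.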

Note that your two buffers can be pretty short. I'd recommend 16-20 ms: larger than the default 15.6 ms timer on Windows, but still in the ballpark of how much data you were trying to process in each loop iteration.
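
As a rough worked example of that sizing, the byte count for a buffer of a given duration follows from the sample rate and frame size. `buffer_bytes` is a hypothetical helper, not part of the wave API; the parameters below are the ones from the question (16384 Hz, mono, 16-bit).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed to hold `milliseconds` of PCM audio,
// rounded down to a whole number of sample frames.
inline std::size_t buffer_bytes(unsigned sample_rate, unsigned channels,
                                unsigned bits_per_sample, unsigned milliseconds) {
    assert(bits_per_sample % 8 == 0);  // whole bytes per sample only
    const std::size_t block_align = channels * bits_per_sample / 8;  // bytes per frame
    const std::size_t frames =
        static_cast<std::size_t>(sample_rate) * milliseconds / 1000;
    return frames * block_align;
}

// With the question's format, 20 ms is 327 whole frames:
//   buffer_bytes(16384, 1, 16, 20) == 654
```

Rounding down to whole frames keeps the buffer length a multiple of `nBlockAlign`, so no buffer ends in the middle of a sample.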

The busy wait loop here isn't great--it can drive a core to 100% without doing useful work. But if the processing time is close to the time it takes to record the next buffer, then it won't spin too much. (And technically, you still have the same data-race issue of reading a variable while another thread may be updating it, but we're just watching for the bit to go high, so it's probably OK in practice.)

The wave audio APIs weren't designed for extreme high-speed processing. They were intended for Windows programs. Instead of busy waiting for the flag, you were expected to process the MM_WIM_DATA message in a window's window procedure, which would avoid the busy waiting and the data races, but add a bit of message passing overhead as each buffer completes.

2020-07-19

Note: @ProjectPhysX pointed out that the busy wait loop for WHDR_DONE in my code outline doesn't work. The compiler is free to assume that the value never changes and likely optimizes the code to test the flag once and then spin forever. That's allowed because the data race between our waiting thread and the thread that sets the flag means the code has "undefined behavior". If we controlled both threads, we could use any sort of synchronization scheme to eliminate the data race, and this would work. But we don't have access to the thread(s) running in the audio system.
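
To illustrate the "if we controlled both threads" case only, here is a minimal sketch of a done flag that is safe to spin on, using `std::atomic` with release/acquire ordering. `Chunk`, `produce`, `consume`, and `handoff_demo` are hypothetical names. This does not apply to WAVEHDR directly, since `dwFlags` is a plain DWORD written by a thread we don't control.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a buffer header whose "done" flag is atomic.
struct Chunk {
    std::vector<short> samples;
    std::atomic<bool> done{false};
};

// Producer: fill the chunk, then publish it with a release store.
inline void produce(Chunk& chunk, short value) {
    chunk.samples.assign(4, value);
    chunk.done.store(true, std::memory_order_release);
}

// Consumer: spin on an acquire load. Because the flag is atomic, the
// compiler must re-read it on every iteration, and the samples written
// before the release store are visible once the load returns true.
inline short consume(Chunk& chunk) {
    while (!chunk.done.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    assert(!chunk.samples.empty());  // the producer always writes samples first
    return chunk.samples.front();
}

// Runs one producer thread and consumes its chunk on the calling thread.
inline short handoff_demo(short value) {
    Chunk chunk;
    std::thread producer([&chunk, value] { produce(chunk, value); });
    const short received = consume(chunk);
    producer.join();
    return received;
}
```

The loop still spins, but it no longer has undefined behavior: the atomic load cannot be hoisted out of the loop the way the plain read of `dwFlags` can.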

The wave audio APIs were designed to notify the client when a buffer is done by sending it a window message. That works fine for continuous recording, but it means taking an event-driven approach and the overhead of message passing may limit how fast the program can process the samples. Either of XAudio2 or Windows Core Audio would be more appropriate for high-speed audio work. The idea of using a pair (or chain) of small buffers is pretty universal, and is analogous to graphics programs using a back buffer or swap chain.
