How to remove pops from concatenated sound data in PyAudio


Question

How do you remove "popping" and "clicking" sounds from audio constructed by concatenating tonal sound clips?

I have this PyAudio code for generating a series of tones:

import time
import math
import pyaudio

class Beeper(object):

    def __init__(self, **kwargs):
        self.bitrate = kwargs.pop('bitrate', 16000)
        self.channels = kwargs.pop('channels', 1)
        self._p = pyaudio.PyAudio()
        self.stream = self._p.open(
            format = self._p.get_format_from_width(1), 
            channels = self.channels, 
            rate = self.bitrate, 
            output = True,
        )
        self._queue = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.stream.stop_stream()
        self.stream.close()

    def tone(self, frequency, length=1000, play=False, **kwargs):

        number_of_frames = int(self.bitrate * length/1000.)

        # TODO: fix pops?
        for x in xrange(number_of_frames):
            self._queue.append(chr(int(math.sin(x/((self.bitrate/float(frequency))/math.pi))*127+128)))

    def play(self):
        sound = ''.join(self._queue)
        self.stream.write(sound)
        time.sleep(0.1)

with Beeper(bitrate=88000, channels=2) as beeper:
    i = 0
    for f in xrange(1000, 800-1, int(round(-25/2.))):
        i += 1
        length = math.log(i+1) * 250/2./2.
        beeper.tone(frequency=f, length=length)
    beeper.play()

But when the tone changes, there's a distinctive "pop" in the audio, and I'm not sure how to remove it.

At first, I thought the pop occurred because I was playing each clip immediately, and the time spent generating the next clip between playbacks was long enough for the audio to flatline. However, when I concatenated all the clips into a single string and played that, the pop was still there.

Then I thought the sine waves weren't matching at the boundaries between clips, so I tried averaging the first N frames of the current clip with the last N frames of the previous clip, but that had no effect either.

What am I doing wrong? How do I fix this?

Recommended answer

The answer you wrote for yourself will do the trick, but it isn't really the correct way to do this kind of thing.

One of the problems is that you check for the "tip", or peak, of the sine wave by comparing against 1. Not every sine frequency will hit that value exactly on a sample, and those that do may need a large number of cycles to get there.

Mathematically speaking, sin(theta) reaches its peak of 1 at theta = pi/2 + 2pi*K for all integer values of K.

To compute a sine at a given frequency, use the formula y = sin(2pi * x * f0/fs), where x is the sample number, f0 is the sine frequency and fs is the sample rate.

For a nice number like 1kHz at a 48kHz sample rate, the peak lands exactly on a sample. When x=12:

sin(2pi * 12 * 1000/48000) = sin(2pi * 12/48) = sin(pi/2) = 1

However, at a frequency like 997Hz the true peak falls a fraction of a sample after sample 12, so samples 11, 12 and 13 all come out below 1:

sin(2pi * 11 * 997/48000) = 0.99087178042
sin(2pi * 12 * 997/48000) = 0.99998889671
sin(2pi * 13 * 997/48000) = 0.99209828673
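
The values above can be checked with a few lines of Python. This is only a sketch: the helper name `sine_sample` is made up for illustration and does not come from the question's code.

```python
import math

def sine_sample(x, f0, fs):
    """Value of a sine of frequency f0 (Hz) at sample index x, for sample rate fs (Hz)."""
    return math.sin(2 * math.pi * x * f0 / fs)

fs = 48000

# 1 kHz lands on the peak at sample 12.
print(sine_sample(12, 1000, fs))

# 997 Hz comes close at sample 12 but stays below 1 on the neighbouring samples too.
for x in (11, 12, 13):
    print(x, sine_sample(x, 997, fs))
```

Because the 997Hz samples never equal 1, a check such as `y == 1` cannot reliably find the peak.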

A better way to stitch the waveforms together is to keep track of the phase at the end of one tone and use it as the starting phase of the next.

First, for a given frequency you need to work out the phase increment per sample. Note that it is the argument of the sine formula above with the sample number x factored out:

phInc = 2*pi*f0/fs

Next, compute the sine and update a variable that holds the current phase.

for x in xrange(number_of_frames):
    y = math.sin(self._phase)
    self._phase += phInc
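
To see why carrying the phase matters, the following sketch compares the size of the jump at a tone boundary for the two approaches. It is written for Python 3 (`range` rather than `xrange`), works on plain floats instead of PyAudio byte strings, and the function names `naive_tone` and `continuous_tone` are hypothetical.

```python
import math

def naive_tone(freq, n, fs):
    """n samples of a sine that always restarts at phase 0."""
    return [math.sin(2 * math.pi * x * freq / fs) for x in range(n)]

def continuous_tone(freq, n, fs, phase):
    """n samples starting at `phase`; returns (samples, phase for the next tone)."""
    ph_inc = 2 * math.pi * freq / fs
    samples = []
    for _ in range(n):
        samples.append(math.sin(phase))
        phase += ph_inc
    return samples, phase

fs = 48000
n = 1000  # chosen so that the 997 Hz tone ends far from a zero crossing

# Naive: the second tone restarts at sin(0) = 0 wherever the first one ended.
a = naive_tone(997, n, fs)
b = naive_tone(975, n, fs)
naive_jump = abs(b[0] - a[-1])

# Phase-continuous: the second tone picks up the phase the first one left off at.
c, phase = continuous_tone(997, n, fs, 0.0)
d, phase = continuous_tone(975, n, fs, phase)
continuous_jump = abs(d[0] - c[-1])

print(naive_jump, continuous_jump)
```

With the carried phase, the step across the boundary is no larger than an ordinary sample-to-sample step within the first tone, whereas the naive version can jump by up to the full amplitude.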

Putting it all together:

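def tone(self, frequency, length=1000, play=False, **kwargs):

    number_of_frames = int(self.bitrate * length/1000.)
    # Phase advance per sample for this frequency.
    phInc = 2*math.pi*frequency/self.bitrate

    # self._phase must be initialised once (self._phase = 0.0 in __init__).
    # It is deliberately not reset here, so each tone starts where the last ended.
    for x in xrange(number_of_frames):
        y = math.sin(self._phase)
        self._phase += phInc
        # Scale from [-1, 1] to unsigned 8-bit, as in the question's code.
        self._queue.append(chr(int(y*127+128)))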
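
For reference, here is a self-contained Python 3 sketch of the same idea that renders a whole sequence of tones into one buffer. It is an illustration under assumptions rather than the code from the question: `render_tones` is a hypothetical helper, the output is mono unsigned 8-bit, and the phase is additionally wrapped modulo 2pi, which the code above does not do.

```python
import math

def render_tones(tones, fs=16000):
    """Render (frequency_hz, length_ms) pairs as unsigned 8-bit mono samples.

    The phase is carried from one tone into the next so that the waveform
    stays continuous at every boundary.
    """
    phase = 0.0
    out = bytearray()
    for frequency, length_ms in tones:
        ph_inc = 2 * math.pi * frequency / fs
        for _ in range(int(fs * length_ms / 1000.0)):
            out.append(int(math.sin(phase) * 127 + 128))
            # Wrap to keep the phase small over long runs; sin() is unaffected.
            phase = (phase + ph_inc) % (2 * math.pi)
    return bytes(out)

# Descending sweep similar to the one in the question: 1000 Hz down to 800 Hz.
data = render_tones([(f, 50) for f in range(1000, 799, -25)])
```

With PyAudio, `data` could then be passed to `stream.write()` on a stream opened with `format=p.get_format_from_width(1)`, `channels=1`, `rate=16000` and `output=True`, matching the parameters used in the question's `Beeper` class.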
