Python: Compare two audio files which may have noise

Question

For a project, I am recording audio clips (wave files) from different areas near a stage. Using the audio recorded at those nearby places, I need to check whether the source audio, i.e. the audio from the stage, is clearly audible there.

To put it more clearly: I have microphones at places near a stage, and I have audio clips from the stage and from these nearby places. How can I check whether the sound from the stage reaches a nearby location, or in other words, how can I tell whether the sound from the stage is disturbing the nearby places?

Recommended answer

Sounds like an interesting project. Here is a nuts-and-bolts approach, since your question could tap into vast fields like perception and convolutional neural networks. First make sure your audio files are aligned in time. Feed a window of audio samples (say 2^12 = 4096 samples, or more, such as 2^14 = 16384, but always a power of 2) into an FFT call (a fast algorithm for the Discrete Fourier Transform), which gives you an array of frequency bins, each with a magnitude (discard the phase). Then compare this FFT array between your stage mic and each of the surrounding mic files. Slide the window of samples forward in time and repeat until you have visited the full set of samples. You may also want to try this with various widths of the sampling window.
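
The windowing step could be sketched roughly as follows. This is only an illustration in plain Python, not the answerer's code: the names `fft` and `window_magnitudes`, the `hop` parameter, and the 50% overlap between windows are all assumptions made here. In practice a library routine such as `numpy.fft.rfft` would normally replace the hand-written `fft`.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT (hypothetical helper); len(samples) must be a power of 2."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def window_magnitudes(samples, window_size=4096, hop=2048):
    """Yield one list of bin magnitudes (phase discarded) per window position.

    The hop of half a window is an assumption; the answer only says to
    slide the window forward in time.
    """
    if window_size < 2 or window_size & (window_size - 1):
        raise ValueError("window_size must be a power of 2")
    for start in range(0, len(samples) - window_size + 1, hop):
        spectrum = fft(samples[start:start + window_size])
        # Keep only the non-negative frequencies: for a real-valued signal
        # the remaining bins mirror these.
        yield [abs(c) for c in spectrum[: window_size // 2 + 1]]
```

Calling `list(window_magnitudes(stage_samples))` and `list(window_magnitudes(nearby_samples))` on two time-aligned recordings (both variable names are placeholders) gives two sequences of spectra that can be compared window by window. Bin `k` corresponds to the frequency `k * sample_rate / window_size`.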

Also try various ways of comparing the FFT arrays of each pair of mic signals. The frequency bins with the greatest magnitudes should be given greater weight in this comparison, since you want to keep noise in low-magnitude bins from muddying the waters. Do this by squaring the bin magnitudes, which accentuates the dominant frequencies relative to the quieter ones. For simplicity, start with a sine wave as your audio signal (search for a mobile app called Frequency Sound Generator); you will get a much simpler FFT array. The goal at this stage is just to see that one frequency from your source audio show up in the FFT output.
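
One possible way to do such a weighted comparison is sketched below. The answer only prescribes squaring the magnitudes; using cosine similarity as the score, and the name `weighted_similarity`, are assumptions made for this illustration, and other measures may suit your recordings better.

```python
import math


def weighted_similarity(stage_mags, nearby_mags):
    """Cosine similarity of squared bin magnitudes, between 0 and 1.

    Squaring makes the loudest bins dominate the score, so low-level
    noise bins contribute very little. (The cosine measure itself is
    an assumption of this sketch, not part of the original answer.)
    """
    if len(stage_mags) != len(nearby_mags):
        raise ValueError("spectra must have the same number of bins")
    a = [m * m for m in stage_mags]
    b = [m * m for m in nearby_mags]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0
```

A score close to 1 for a given window suggests that the dominant stage frequencies are also dominant at the nearby microphone. Because the score ignores overall loudness, it would need to be combined with a level measurement to judge how strongly the stage sound is actually heard.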

To do the above, the only library call you really need is a DFT; however, if you do not have the luxury of time to roll your own implementation of this approach, these Python repos may speed up your project:

Librosa - Python library for audio and music analysis

https://librosa.github.io/
https://github.com/librosa/librosa

Madmom - Python audio and music signal processing library

https://madmom.readthedocs.io/en/latest/modules/audio/cepstrogram.html?highlight=mfcc
https://madmom.readthedocs.io
https://github.com/CPJKU/madmom

However, I suggest you avoid the libraries above and just roll your own. YMMV.
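
If you do roll your own, the wave files still have to be loaded into sample arrays first. A minimal sketch using only the Python standard library might look like this; it assumes 16-bit mono PCM recordings, and the function name `read_wav_mono16` is made up for illustration.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Return (sample_rate, samples) for a 16-bit mono PCM WAV file.

    Assumption of this sketch: other channel counts or sample widths
    are rejected rather than converted.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Scale to floats in the range [-1.0, 1.0).
    return rate, [s / 32768.0 for s in samples]
```

The returned list of floats can then be cut into power-of-2 windows and passed to whatever DFT routine you choose. Both recordings should have the same sample rate before their spectra are compared, since the frequency of each bin depends on it.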
