Compare sounds inside of the App


Question

Is it possible to compare two sounds? For example, if the app already has a sound file (mp3 or any other format), is it possible to compare that static sound file with a sound recorded inside the app?

Any comments are welcomed.

Regards

Answer

This forum thread has a good answer (about three posts down): http://www.dsprelated.com/showmessage/103820/1.php

The trick is to get the decoded audio from the mp3 - if they're just short 'hello' sounds, I'd store them inside the app as a wav instead of decoding them (though I've never used CoreAudio or any of the other frameworks before so mp3 decoding into memory might be easy).
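Once the sounds are stored as wav, getting at the raw samples is fairly mechanical. Here is a minimal sketch in Python, used purely for illustration. It assumes uncompressed 16-bit PCM data, and the helper name `read_wav_samples` is made up for this example:

```python
import struct
import wave


def read_wav_samples(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file.

    Samples are floats in [-1.0, 1.0); multi-channel audio is averaged to mono.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # WAV PCM data is little-endian; "h" is a signed 16-bit integer.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)

    # Average the interleaved channels into one mono sample per frame.
    samples = []
    for i in range(0, len(values) - channels + 1, channels):
        frame = values[i:i + channels]
        samples.append(sum(frame) / (channels * 32768.0))
    return rate, samples
```

Calling `read_wav_samples("hello.wav")` (the file name is a placeholder) gives the sample rate and a plain list of floats, which is the discrete-time signal the later steps work on.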

When you've got your reference wav and your recorded wav, follow the steps in the post above:

1. Do whatever is necessary to convert the .wav files to their discrete-time signals:

http://www.sonicspot.com/guide/wavefiles.html

2. Time warping might or might not be necessary, depending on the difference between the two sample rates:

http://en.wikipedia.org/wiki/Dynamic_time_warping

3. After time warping, truncate both signals so that their durations are equivalent.

4. Compute the normalized energy spectral density (ESD) from the DFTs of the two signals:

http://en.wikipedia.org/wiki/Power_spectrum

6. Compute the mean squared error (MSE) between the normalized ESDs of the two signals:

http://en.wikipedia.org/wiki/Mean_squared_error
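For the time-warping step, the textbook dynamic time warping recurrence is short enough to sketch. This is my own illustration, not code from the linked thread, and the function name `dtw_distance` is hypothetical. It uses the absolute difference between values as the local cost and only returns the total alignment cost:

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of numbers."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance in a, advance in b, or both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# The same shape played at half speed aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3]))
```

The table has one cell per pair of elements, so running this on raw audio samples gets expensive quickly. It is probably more practical to apply it to a much shorter sequence, such as per-frame energies.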

The MSE between the normalized ESDs of two signals is a good metric of closeness. If you have, say, 10 .wav files, and 2 of them are nearly the same but the others are not, the two that are close should have a relatively low MSE. Two perfectly identical signals will obviously have an MSE of zero. Ideally, two "equivalent" signals with different time scales (20-second human talking versus 5-second chipmunk), different energies (soft-spoken human versus yelling chipmunk), and different phases (sampling began at a slightly different instant of the continuous-time input) should still have an MSE of zero, but quantization errors inherent in DSP will yield an MSE slightly greater than zero.
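The truncate, ESD and MSE steps can be sketched as follows. This is a rough illustration under my own assumptions: it uses a naive O(n^2) DFT to stay dependency-free (a real app would use an FFT), it normalizes the ESD so the bins sum to 1, and all function names are hypothetical. It does no time warping, so it only shows the insensitivity to loudness and to a circular time shift, not to time scale:

```python
import cmath
import math


def energy_spectral_density(signal):
    """Return |X[k]|^2 for every bin of a naive DFT of a real signal."""
    n = len(signal)
    esd = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(signal):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        esd.append(abs(acc) ** 2)
    return esd


def normalize(esd):
    """Scale the ESD so its bins sum to 1 (removes overall loudness)."""
    total = sum(esd)
    if total == 0:
        return list(esd)
    return [v / total for v in esd]


def mean_squared_error(p, q):
    if len(p) != len(q):
        raise ValueError("sequences must have the same length")
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def spectral_mse(x, y):
    """MSE between the normalized ESDs of two signals."""
    n = min(len(x), len(y))  # truncate both to the same duration
    return mean_squared_error(normalize(energy_spectral_density(x[:n])),
                              normalize(energy_spectral_density(y[:n])))


# Synthetic test tones: same pitch at two volumes, plus a different pitch.
tone = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
loud = [4 * v for v in tone]
other = [math.sin(2 * math.pi * 7 * t / 32) for t in range(32)]
print(spectral_mse(tone, loud))   # close to zero
print(spectral_mse(tone, other))  # clearly larger
```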

http://en.wikipedia.org/wiki/Minimum_mean-square_error

You should get two different MSE values, one between your male reference and the recorded track and one between your female reference and the recorded track. The comparison with the lower MSE is probably the correct gender.
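Picking the closest reference then comes down to taking the minimum score. In this small sketch the labels, the toy signals and the `sample_mse` stand-in distance are all made up for illustration. In practice you would pass in whatever distance you settled on, such as the MSE between normalized ESDs:

```python
def closest_reference(recorded, references, distance):
    """Return (best_label, scores) for the reference with the lowest distance."""
    scores = {label: distance(recorded, ref) for label, ref in references.items()}
    best = min(scores, key=scores.get)
    return best, scores


def sample_mse(x, y):
    """Stand-in distance: plain MSE over the overlapping samples."""
    n = min(len(x), len(y))
    return sum((a - b) ** 2 for a, b in zip(x[:n], y[:n])) / n


references = {"male": [0.0, 1.0, 0.0], "female": [1.0, 0.0, 1.0]}
best, scores = closest_reference([0.1, 0.9, 0.0], references, sample_mse)
print(best, scores)
```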

I confess that I've never tried to do this and it looks very hard - good luck!
