Simplest algorithm for measuring how similar two short audio clips are
Question
I'm looking for an open-source or otherwise simple implementation that measures how similar two audio clips are, for use in an iOS application.
Simply speaking, audio can be represented by a 1-D vector, and similarity can then be computed as the distance between two such vectors. However, the clips will differ in length, so some pre-processing is needed.
Looking forward to any pointers, thanks.
Recommended answer
The similarity between two sequences of different lengths can be computed efficiently with dynamic time warping (DTW):
http://en.wikipedia.org/wiki/Dynamic_time_warping
The algorithm is simple to implement yourself, and the wiki page links to quite a few existing implementations.
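To give a feel for how small a do-it-yourself version can be, here is a minimal sketch of the classic DTW recurrence in Python. The names `dtw_distance` and `dist` are made up for illustration, the default local cost (absolute difference between scalars) is an assumption, and there is no window constraint or length normalization, so treat it as a starting point rather than a reference implementation.

```python
import math


def dtw_distance(a, b, dist=None):
    """Return the accumulated DTW cost between sequences a and b.

    dist is the local cost between one element of a and one of b;
    it defaults to the absolute difference, which assumes scalar elements.
    """
    if dist is None:
        dist = lambda x, y: abs(x - y)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j] holds the best cost of aligning a[:i-1] with b[:j].
    # Row 0 is all infinity except the empty-vs-empty cell.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0 because the repeated 2 is absorbed by the warping, even though the lengths differ. Running time and the number of local-cost evaluations grow with the product of the two lengths, which is usually acceptable for short clips.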
"Simply speaking, audio can be represented by a 1-D vector, ..."
It's more reasonable to split the audio into frames and turn it into a 2-D array of features, where each frame has an array of values (features) corresponding to different frequency bands. For music, computing an FFT of every frame is a good idea; for speech, it's better to compute the mel-frequency cepstrum.
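The framing step can be sketched as follows. This is only an illustration under several assumptions: the function names, the frame length of 64 samples, the hop of 32 samples, and the Hann window are all choices made here, not part of the answer, and a naive DFT from the standard library stands in for a real FFT routine to keep the example dependency-free.

```python
import cmath
import math


def split_into_frames(samples, frame_len, hop):
    """Cut samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a Hann-windowed frame.

    Uses a naive O(N^2) DFT; a real application would call an FFT library.
    """
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def spectral_features(samples, frame_len=64, hop=32):
    """Return one magnitude spectrum per frame: a 2-D list [frame][bin]."""
    return [magnitude_spectrum(f)
            for f in split_into_frames(samples, frame_len, hop)]
```

Each clip then becomes a sequence of per-frame vectors whose length depends on the clip duration. Two such sequences can be compared with DTW by using a vector distance, for instance the Euclidean distance between two frames, as the local cost.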
Again, many existing libraries can compute mel-frequency features; one of them is the speech recognition toolkit CMUSphinx.