Simplest algorithm for measuring how similar two short audio clips are

Question

I am looking for an open-source or otherwise simple implementation that measures how similar two audio clips are, for use in an iOS application.

Simply speaking, each audio clip can be represented as a 1-D vector, and similarity can then be computed as a distance between the two vectors. The clips will differ in length, however, so some pre-processing is needed.

Looking forward to getting some pointers here, thanks.

Recommended answer

The similarity between two sequences of different lengths can be calculated efficiently with dynamic time warping (DTW):

http://en.wikipedia.org/wiki/Dynamic_time_warping

The algorithm is simple to implement yourself, and the wiki page links to quite a few existing implementations.
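
As a rough illustration of how little code the basic algorithm needs, here is a minimal sketch of the textbook DTW recurrence in Python. The function name `dtw_distance` and the default absolute-difference `dist` are assumptions made for this example, not taken from any particular library, and no windowing or other speed-up is applied.

```python
import math

def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Textbook O(len(a) * len(b)) dynamic time warping distance.

    `a` and `b` are sequences of possibly different lengths; `dist`
    compares one element of `a` with one element of `b`.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A smaller value means the sequences are more alike. For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated `2` can be absorbed by the warping. To compare sequences of feature vectors instead of scalars, pass a `dist` such as the Euclidean distance between two frames.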

Regarding "Simply speaking, audio can be represented by a 1-D vector":

It is more reasonable to split the audio into frames and turn it into a 2-D array of features, where each frame has an array of values (features) corresponding to different frequency bands. For music, an FFT of every frame is a good idea; for speech, it is better to calculate the mel-frequency cepstrum.
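
The framing step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the frame length of 64 samples, the hop of 32 samples, and the Hann window are choices made for this example, and a naive DFT stands in for a real FFT routine so that the code has no dependencies.

```python
import cmath
import math

def frame_signal(samples, frame_len, hop):
    """Split a 1-D sample list into overlapping frames.

    A trailing partial frame is dropped.
    """
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]

def magnitude_spectrum(frame):
    """Hann-window one frame and return the magnitudes of DFT bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Naive O(N^2) DFT for clarity; a real application would call an FFT.
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]

def spectral_features(samples, frame_len=64, hop=32):
    """Turn a 1-D signal into a 2-D list: one magnitude spectrum per frame."""
    return [magnitude_spectrum(f)
            for f in frame_signal(samples, frame_len, hop)]
```

The result is a list of frames, each a list of per-band magnitudes. Two clips of different durations simply yield different numbers of frames, which DTW can then align using, for example, the Euclidean distance between frames.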

Again, many existing libraries can compute mel-frequency features; one of them is the speech recognition toolkit CMUSphinx.
