How to compare / match two non-identical sound clips

Question

I need to take short sound samples every 5 seconds and then upload them to our cloud server.

I then need a way to check whether each sample is part of a full, long audio file.

The samples will be recorded from a phone's microphone, so they will not be exact copies.

I know this topic can get quite technical and complex, but I am sure there must be some libraries or online services that can assist with this kind of audio matching / pairing.

One idea was to use an audio-to-text conversion service and then match on the actual dialog. However, this does not feel efficient to me, whereas matching on the actual sound frequencies or patterns would be a lot more efficient.

I know there are services out there, such as Shazam, that do this type of audio matching. However, I would imagine their services are all proprietary.

Some factors that could influence it:

  • Both audio samples are timestamped, so we do not have to search through the entire sound clip.

Recommended answer

To get traction on an answer, you need to focus on an answerable question where you have already done battle, and show your code.

Off the top of my head, I would walk across the audio and pluck out a bucket of several samples, then slide the bucket forward by several samples and pluck again. Let each bucket overlap with both the previous bucket and the next one. Fewer samples per bucket means quicker computation; more samples means greater accuracy, to an extent. YMMV.
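The bucket-plucking step could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `overlapping_buckets` and the parameters `bucket_size` and `hop` are hypothetical, and the audio is assumed to be already decoded into a flat sequence of sample values.

```python
from typing import Iterator, List, Sequence


def overlapping_buckets(
    samples: Sequence[float], bucket_size: int, hop: int
) -> Iterator[List[float]]:
    """Yield fixed-size buckets of samples.

    When hop < bucket_size, each bucket shares samples with the
    bucket before it and the bucket after it.
    """
    if bucket_size <= 0 or hop <= 0:
        raise ValueError("bucket_size and hop must be positive")
    # Stop early enough that the last bucket is still full-sized.
    for start in range(0, len(samples) - bucket_size + 1, hop):
        yield list(samples[start:start + bucket_size])


# Toy input: ten "samples", buckets of four, sliding forward two at a time.
buckets = list(overlapping_buckets(list(range(10)), bucket_size=4, hop=2))
for b in buckets:
    print(b)
```

With a hop of half the bucket size, every sample (apart from those at the very edges) lands in two buckets, which is one common way to keep a feature from being split across a bucket boundary.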

Feed each bucket into a Fourier transform to render the time-domain audio into its frequency-domain counterpart, then record in a database the salient attributes of each bucket's FFT, such as the X frequencies with the most energy (greatest magnitude in the FFT).
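As a rough sketch of picking the X strongest frequencies in one bucket, the example below uses a naive discrete Fourier transform written with the standard library, so it runs without dependencies. In practice an FFT routine such as `numpy.fft.rfft` would be the usual choice, and a window function is often applied first. The names `dft_magnitudes` and `top_frequencies` and the synthetic test tone are assumptions made for illustration.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def dft_magnitudes(bucket: Sequence[float]) -> List[float]:
    """Magnitude of each non-negative frequency bin (naive O(n^2) DFT)."""
    n = len(bucket)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(bucket)))
        for k in range(n // 2 + 1)
    ]


def top_frequencies(
    bucket: Sequence[float], sample_rate: int, top_x: int
) -> List[Tuple[float, float]]:
    """Return (frequency in Hz, magnitude) for the top_x strongest bins."""
    mags = dft_magnitudes(bucket)
    # Skip bin 0 (the DC offset) and rank the remaining bins by magnitude.
    strongest = sorted(range(1, len(mags)),
                       key=lambda k: mags[k], reverse=True)[:top_x]
    n = len(bucket)
    return [(k * sample_rate / n, mags[k]) for k in strongest]


# Synthetic bucket: 64 samples at 8 kHz holding a 1000 Hz tone
# plus a quieter 2000 Hz tone.
RATE, SIZE = 8000, 64
bucket = [
    math.sin(2 * math.pi * 1000 * i / RATE)
    + 0.5 * math.sin(2 * math.pi * 2000 * i / RATE)
    for i in range(SIZE)
]
peaks = top_frequencies(bucket, RATE, top_x=2)
print([round(f) for f, _ in peaks])
```

The per-bucket list of peak frequencies is the kind of salient attribute that could be stored in the database for later comparison.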

Perhaps also store the standard deviation of those top X frequencies with respect to their energy (how dispersed those frequencies are), and define additional attributes as needed. For such a frequency-domain approach to work you need relatively few samples in each bucket, since the FFT works on periodic time-series data: if you feed it 500 milliseconds of complex audio like speech or music, you no longer have periodic audio; instead you have mush.
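One possible reading of "standard deviation with respect to energy" is an energy-weighted spread of the peak frequencies around their energy-weighted mean. The sketch below takes that reading as an assumption. The function name `energy_weighted_spread` is hypothetical, and energy is taken to be the squared magnitude.

```python
import math
from typing import Sequence, Tuple


def energy_weighted_spread(peaks: Sequence[Tuple[float, float]]) -> float:
    """Energy-weighted standard deviation (in Hz) of the peak frequencies.

    peaks is a sequence of (frequency_hz, magnitude) pairs; energy is
    assumed here to be magnitude squared.
    """
    energies = [mag * mag for _, mag in peaks]
    total = sum(energies)
    if total == 0:
        return 0.0
    mean = sum(f * e for (f, _), e in zip(peaks, energies)) / total
    variance = sum(e * (f - mean) ** 2
                   for (f, _), e in zip(peaks, energies)) / total
    return math.sqrt(variance)


# Two equally strong peaks 1000 Hz apart spread 500 Hz around their mean.
print(energy_weighted_spread([(1000.0, 1.0), (2000.0, 1.0)]))
# A single dominant peak has no spread at all.
print(energy_weighted_spread([(440.0, 3.0)]))
```

A small spread suggests the bucket is dominated by closely grouped frequencies, while a large one suggests energy scattered across the spectrum.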

Then, once all existing audio has been sent through the above processing, do the same to your live new audio and identify which prior audio contains the sequence of buckets most similar to your current input. Use a Bayesian approach so your guesses carry probabilistic weights, which lend themselves to real-time updates.
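A very simplified sketch of that matching step is shown below. Everything in it is an assumption made for illustration: each bucket is reduced to a set of dominant FFT bin indices, two buckets "hit" when they share at least one bin, and the probabilities `p_hit_true` and `p_hit_false` are made-up values for how often a bucket hits in the correct recording versus an unrelated one. Priors are uniform, and each posterior is proportional to a per-bucket likelihood ratio.

```python
from typing import Dict, FrozenSet, Sequence, Tuple

Fingerprint = FrozenSet[int]  # hypothetical: dominant bin indices of one bucket


def best_alignment(
    query: Sequence[Fingerprint], reference: Sequence[Fingerprint]
) -> Tuple[int, int]:
    """Slide query along reference; return (offset, hits) for the best fit."""
    best = (0, 0)
    for offset in range(len(reference) - len(query) + 1):
        hits = sum(1 for q, r in zip(query, reference[offset:]) if q & r)
        if hits > best[1]:
            best = (offset, hits)
    return best


def posterior(
    query: Sequence[Fingerprint],
    candidates: Dict[str, Sequence[Fingerprint]],
    p_hit_true: float = 0.8,   # assumed hit rate in the correct recording
    p_hit_false: float = 0.2,  # assumed hit rate in an unrelated recording
) -> Dict[str, float]:
    """Posterior weight per candidate, starting from a uniform prior."""
    weights = {}
    for name, reference in candidates.items():
        _, hits = best_alignment(query, reference)
        misses = len(query) - hits
        # Likelihood ratio of "this is the source" vs "this is unrelated".
        weights[name] = ((p_hit_true / p_hit_false) ** hits
                         * ((1 - p_hit_true) / (1 - p_hit_false)) ** misses)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


candidates = {
    "A": [frozenset({1, 2}), frozenset({3, 4}),
          frozenset({5, 6}), frozenset({7, 8})],
    "B": [frozenset({9}), frozenset({10}),
          frozenset({11}), frozenset({12})],
}
query = [frozenset({3}), frozenset({6})]
print(best_alignment(query, candidates["A"]))
print(posterior(query, candidates))
```

As more buckets of live audio arrive, the query grows and the weights can be recomputed, which is one way to get the real-time updates the answer mentions. A production system would need a better likelihood model than this hit/miss one.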

Sounds like a very cool project, good luck. Here are some audio fingerprinting resources:

Does audio clip A appear in audio file B: Detecting audio inside audio [Audio Recognition]

Detecting a specific pattern from a FFT in Arduino

Audio Fingerprinting using the AudioContext API https://news.ycombinator.com/item?id=21436414 https://iq.opengenus.org/audio-fingerprinting/

Chromaprint is the core component of the AcoustID project. It's a client-side library that implements a custom algorithm for extracting fingerprints from any audio source. https://acoustid.org/chromaprint

Audio landmark fingerprinting as a Node Stream module: a Node.js module that converts a PCM audio signal into a series of audio fingerprints. https://github.com/adblockradio/stream-audio-fingerprint

SO followup: How to compare / match two non-identical sound clips

Audio fingerprinting and recognition in Python https://github.com/worldveil/dejavu

Audio Fingerprinting with Python and Numpy http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/

MusicBrainz: an open music encyclopedia (musicbrainz.org) https://news.ycombinator.com/item?id=14478515

https://acoustid.org/chromaprint How does Chromaprint work? https://oxygene.sk/2011/01/how-does-chromaprint-work/

https://acoustid.org/

MusicBrainz is an open music encyclopedia that collects music metadata and makes it available to the public. https://musicbrainz.org/

Audio matching (audio fingerprinting)

Is it possible to compare two similar songs given their wav files?

Audio hashing https://en.wikipedia.org/wiki/Hash_function#Finding_like_records

Audio fingerprinting https://encrypted.google.com/search?hl=zh-CN&pws=0&q=python+audio+fingerprinting

ACRCloud https://www.acrcloud.com/ How to recognize a music sample using Python and Gracenote?
