Pitch recognition of musical notes on a smart phone

Problem description

With limited resources such as slower CPUs, code size and RAM, how best to detect the pitch of a musical note, similar to what an electronic or software tuner would do?

Should I use:

  • Kiss FFT

  • FFTW

  • Discrete wavelet transform

  • Autocorrelation

  • Zero-crossing analysis

  • Octave-spaced filters

Something else?

In a nutshell, what I am trying to do is to recognize a single musical note, two octaves below middle-C to two octaves above, played on any (reasonable) instrument. I'd like to be within 20% of the semitone - in other words, if the user plays too flat or too sharp, I need to distinguish that. However, I will not need the accuracy required for tuning.

Recommended answer

If you don't need that much accuracy, an FFT could be sufficient. Window the chunk of audio first so that you get well-defined peaks, then find the first significant peak.
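The window-then-pick-a-peak idea can be sketched roughly as follows. This is a minimal illustration, not the answer author's code: the function names, the 0.3 threshold, and the "lowest local maximum above a fraction of the global maximum" rule are all assumptions, and the hand-written radix-2 FFT is only there so the sketch runs with the standard library (a library routine such as numpy.fft.rfft would normally take its place).

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def first_significant_peak(samples, fs, threshold=0.3):
    """Frequency (Hz) of the lowest local spectral maximum whose magnitude
    is at least `threshold` times the largest non-DC magnitude.

    `threshold` is a hypothetical tuning knob, not a value from the answer.
    """
    n = len(samples)
    # Hann window so that each tone shows up as a well-defined peak.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    mags = [abs(c) for c in fft(windowed)[: n // 2]]
    limit = threshold * max(mags[1:])
    for k in range(1, n // 2 - 1):
        if mags[k] >= limit and mags[k] >= mags[k - 1] and mags[k] > mags[k + 1]:
            return k * fs / n  # bin index times bin width
    return None


if __name__ == "__main__":
    fs, n = 8000, 2048
    # 220 Hz fundamental with a louder 440 Hz harmonic.
    tone = [0.5 * math.sin(2 * math.pi * 220 * i / fs)
            + 1.0 * math.sin(2 * math.pi * 440 * i / fs) for i in range(n)]
    print(first_significant_peak(tone, fs))
```

The result is quantised to the bin width (fs / n), so the estimate is only as fine as the FFT size allows unless the peak position is interpolated between bins.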

Bin width = sampling rate / FFT size.

Fundamentals range from 20 Hz to 7 kHz, so a sampling rate of 14 kHz would be enough. The next "standard" sampling rate is 22050 Hz.

The FFT size is then determined by the precision you want. FFT output is linear in frequency, while musical tones are logarithmic in frequency, so the worst-case precision will be at low frequencies. At 20 Hz a semitone is about 1.2 Hz wide, and a bin width of 1.2 Hz means an FFT length of 18545; resolving 20% of a semitone there by bin width alone would take bins of roughly 0.23 Hz and a correspondingly longer FFT. The next power of two above 18545 is 2^15 = 32768. At 22050 Hz this is 1.5 seconds of data, and takes my laptop's processor 3 ms to calculate.
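The sizing arithmetic can be checked with a few lines. This is a sketch under stated assumptions: both helper names are made up for illustration, equal temperament is assumed (one semitone is a frequency ratio of 2^(1/12)), and the 65.41 Hz figure for the C two octaves below middle C assumes A4 = 440 Hz tuning.

```python
import math


def semitone_fraction_hz(freq, fraction=1.0):
    """Width in Hz of `fraction` of an equal-tempered semitone above `freq`."""
    return freq * (2 ** (fraction / 12) - 1)


def fft_size_for(fs, bin_width):
    """Smallest power-of-two FFT length whose bins are no wider than `bin_width`."""
    needed = fs / bin_width  # bin width = fs / FFT size
    return 2 ** math.ceil(math.log2(needed))


if __name__ == "__main__":
    fs = 22050
    print(semitone_fraction_hz(20.0))        # one full semitone at 20 Hz
    print(semitone_fraction_hz(20.0, 0.2))   # 20% of a semitone at 20 Hz
    print(fft_size_for(fs, 1.2))             # size for 1.2 Hz bins
    # Lowest note in the question: C two octaves below middle C (assumed 65.41 Hz).
    print(fft_size_for(fs, semitone_fraction_hz(65.41, 0.2)))
```

By this calculation 1.2 Hz is roughly one whole semitone at 20 Hz, while 20% of a semitone there is closer to 0.23 Hz. For the range in the question, whose lowest note is around 65 Hz, 20% of a semitone is about 0.76 Hz, which a 32768-point FFT at 22050 Hz does cover.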

This won't work with signals that have a "missing fundamental", and finding the "first significant" peak is somewhat difficult (since harmonics are often higher than the fundamental), but you can figure out a way that suits your situation.

Autocorrelation and harmonic product spectrum are better at finding the true fundamental for a wave instead of one of the harmonics, but I don't think they deal as well with inharmonicity, and most instruments like piano or guitar are inharmonic (harmonics are slightly sharp from what they should be). It really depends on your circumstances, though.
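For comparison, a bare-bones autocorrelation estimator might look like the sketch below. It is an illustration only: the function name and the fmin/fmax search limits are assumptions, the lag search is brute force rather than FFT-based, and it does nothing about inharmonicity. The demo signal contains only the 2nd and 3rd harmonics of 200 Hz, to show the "missing fundamental" case.

```python
import math


def autocorr_pitch(samples, fs, fmin=60.0, fmax=1100.0):
    """Estimate pitch (Hz) from the lag with the largest autocorrelation.

    Only lags corresponding to fmin..fmax are searched. The sum is not
    normalised by the number of terms, which mildly favours shorter lags
    and so helps avoid picking a multiple of the true period.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]
    lo = max(1, int(fs / fmax))
    hi = min(n - 1, int(fs / fmin))
    best_lag, best_val = None, 0.0
    for lag in range(lo, hi + 1):
        val = sum(x[i] * x[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag if best_lag else None


if __name__ == "__main__":
    fs, n = 8000, 2000
    # Harmonics at 400 Hz and 600 Hz only; the 200 Hz fundamental is absent.
    tone = [math.sin(2 * math.pi * 400 * i / fs)
            + math.sin(2 * math.pi * 600 * i / fs) for i in range(n)]
    print(autocorr_pitch(tone, fs))
```

The resolution here is limited by the integer lag (fs / lag), so higher notes are estimated more coarsely than lower ones unless the peak is interpolated.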

Also, you can save even more processor cycles by computing only within a specific frequency band of interest, using the Chirp-Z transform.
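To show what "computing only within a band" buys, the sketch below evaluates the spectrum directly at finely spaced frequencies inside a chosen band. Note that this is a plain direct evaluation of the transform, not the Chirp-Z algorithm itself: a Chirp-Z transform produces the same values far more efficiently (recent SciPy versions provide scipy.signal.czt and scipy.signal.zoom_fft for this). The function name and parameters are assumptions made for illustration.

```python
import cmath
import math


def band_peak(samples, fs, f_lo, f_hi, points):
    """Return the frequency (Hz) of the largest spectral magnitude among
    `points` evenly spaced frequencies in [f_lo, f_hi).

    Direct O(len(samples) * points) evaluation; a Chirp-Z transform would
    compute the same samples of the spectrum with FFT-like cost.
    """
    n = len(samples)
    # Hann window to keep leakage from outside the band low.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    best_f, best_mag = None, -1.0
    for k in range(points):
        f = f_lo + (f_hi - f_lo) * k / points
        step = cmath.exp(-2j * math.pi * f / fs)  # per-sample phase rotation
        acc, phasor = 0j, 1 + 0j
        for s in windowed:
            acc += s * phasor
            phasor *= step
        if abs(acc) > best_mag:
            best_f, best_mag = f, abs(acc)
    return best_f


if __name__ == "__main__":
    fs, n = 8000, 1024
    tone = [math.sin(2 * math.pi * 220 * i / fs) for i in range(n)]
    # 0.5 Hz spacing between 200 Hz and 240 Hz, far finer than fs / n.
    print(band_peak(tone, fs, 200.0, 240.0, 80))
```

The frequency spacing inside the band is chosen independently of the block length, which is the appeal of the approach; the underlying resolution is still set by how much audio is analysed.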

I've written up a few different methods in Python (http://gist.github.com/255291) for comparison purposes.
