Extracting pitch features from an audio file
Question
I am trying to extract pitch features from an audio file to use in a classification problem. I am using Python (SciPy/NumPy) for the classification.
I think I can get frequency features using scipy.fft, but I don't know how to approximate musical notes from frequencies. I researched a bit and found that I need chroma features, which map frequencies to 12 bins for the notes of the chromatic scale.
I think there is a chroma toolbox for MATLAB, but I don't think there is anything similar for Python.
How should I go forward with this? Could anyone also suggest reading material I should look into?
Recommended answer
You can map frequencies to musical notes:
n = 69 + 12 * log2(f / f_ref)
with n being the MIDI note number to be calculated, f the frequency, and f_ref the chamber pitch (concert pitch; 440.0 Hz is common in modern music).
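As a minimal sketch of this mapping, the snippet below uses the standard MIDI convention n = 69 + 12 * log2(f / f_ref), where note 69 is the A above middle C. The function names `freq_to_midi` and `midi_to_name`, and the octave numbering (MIDI 60 = C4), are illustrative assumptions rather than part of any particular library.

```python
import math

# Pitch-class names, indexed by MIDI note number modulo 12 (0 = C).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def freq_to_midi(freq_hz, ref_hz=440.0):
    """Return the (fractional) MIDI note number for a frequency in Hz.

    ref_hz is the chamber pitch assigned to MIDI note 69 (assumed 440 Hz).
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 69.0 + 12.0 * math.log2(freq_hz / ref_hz)


def midi_to_name(midi_number):
    """Round to the nearest semitone and return a name such as 'A4'.

    Assumes the common convention in which MIDI note 60 is called C4.
    """
    nearest = int(round(midi_number))
    octave = nearest // 12 - 1
    return f"{NOTE_NAMES[nearest % 12]}{octave}"
```

For example, `midi_to_name(freq_to_midi(440.0))` gives `'A4'`. The fractional part of the MIDI number tells you how far the frequency is from the nearest equal-tempered note, in semitones.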
As you may know, a single frequency does not make a musical pitch. "Pitch" arises from the sensation of the fundamental of harmonic sounds, i.e. sounds that mainly consist of integer multiples of a single frequency (the fundamental).
If you want chroma features in Python, you can use the Bregman Audio-Visual Information Toolbox. Note that chroma features do not tell you the octave of a pitch, so you only get information about the pitch class.
from bregman.suite import Chromagram
audio_file = "mono_file.wav"
F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205)
F.X       # chroma feature matrix (all frames)
F.X[:,0]  # chroma vector of the first frame
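To illustrate what a chroma vector is without relying on a toolbox, here is a rough NumPy-only sketch that folds the FFT magnitude spectrum of a single frame into 12 pitch-class bins. It is not how the Bregman toolbox computes its chromagram; the function name `chroma_from_frame`, the Hann window, and the `fmin`/`fmax` limits are assumptions made for this example.

```python
import numpy as np


def chroma_from_frame(frame, sample_rate, ref_hz=440.0, fmin=50.0, fmax=5000.0):
    """Fold the magnitude spectrum of one audio frame into 12 pitch classes.

    Returns a length-12 vector (index 0 = C, ..., 9 = A) normalised to sum to 1.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))      # reduce spectral leakage
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Ignore DC, very low and very high bins (assumed limits).
    keep = (freqs >= fmin) & (freqs <= fmax)

    # Map each remaining FFT bin to its nearest MIDI note, then drop the octave.
    midi = 69.0 + 12.0 * np.log2(freqs[keep] / ref_hz)
    pitch_class = np.round(midi).astype(int) % 12

    # Accumulate magnitudes per pitch class (np.add.at handles repeated indices).
    chroma = np.zeros(12)
    np.add.at(chroma, pitch_class, magnitudes[keep])

    total = chroma.sum()
    return chroma / total if total > 0 else chroma
```

Computing this for successive overlapping frames and stacking the results column by column gives a simple chromagram that can be fed to a classifier. Because all octaves are folded together, 220 Hz and 440 Hz both land in the same bin (A).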
The general problem of extracting pitch information from audio is called pitch detection.
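One of the simplest pitch detection approaches is autocorrelation: a harmonic sound repeats with the period of its fundamental, so the autocorrelation of a frame peaks at that lag. The sketch below is only an illustration under assumed settings (the function name `estimate_f0_autocorr` and the 50 to 1000 Hz search range are made up for this example); it is not robust to noise, polyphony, or octave errors.

```python
import numpy as np


def estimate_f0_autocorr(frame, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency of one frame via autocorrelation.

    Returns the estimate in Hz, or None if the frame is silent or too short.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()                   # remove DC offset

    # Autocorrelation for non-negative lags only.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]

    # Restrict the search to lags corresponding to fmax .. fmin.
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(corr) - 1)
    if lag_max <= lag_min or corr[0] <= 0:
        return None

    # The strongest peak in that range is taken as the fundamental period.
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max + 1]))
    return sample_rate / best_lag
```

The resolution is limited to whole-sample lags, so the estimate is coarse at high frequencies. The resulting frequency can then be converted to a note with n = 69 + 12 * log2(f / 440).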