Why are frequency values rounded in signal using FFT?


Question

So, I am trying to figure out how to use the DFT in practice to detect prevalent frequencies in a signal. I have been trying to wrap my head around what Fourier transforms are and how DFT algorithms work, but apparently I still have a way to go. I have written some code to generate a signal (since the intent is to work with music, I generated a C major chord, hence the odd frequency values) and then tried to work back to the frequency numbers. Here is the code I have:

import numpy as np

sr = 44100 # sample rate
x = np.linspace(0, 1, sr) # one second of signal
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)
freqs = np.fft.fftfreq(sr)
fft = np.fft.fft(data)
idx = np.argsort(np.abs(fft))
fft = fft[idx]
freqs = freqs[idx]
print(freqs[-6:] * sr)

This gives me [-262. 262. -330. 330. -392. 392.], which is different from the frequencies I encoded (261.63, 329.63 and 392.0). What am I doing wrong and how do I fix it?

Recommended answer

Indeed, if the frame lasts T seconds, the frequencies of the DFT are k/T Hz, where k is an integer. Here the frame lasts one second, so the DFT frequencies are 1 Hz apart and the largest magnitudes land on the nearest integer frequencies. As a consequence, oversampling does not improve the accuracy of the estimated frequency, as long as the frequencies are identified as maxima of the magnitude of the DFT. By contrast, a longer frame lasting 100 s would give a spacing of 0.01 Hz between the DFT frequencies, which might be good enough to produce the expected frequencies. It is possible to do much better by estimating the frequency of a peak as its mean frequency with respect to power density.
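To make the k/T spacing concrete, here is a minimal sketch that reads the bin spacing off `np.fft.rfftfreq` for a few frame lengths and sample rates. The helper name `bin_spacing` is made up for this illustration and is not part of the original answer.

```python
import numpy as np

def bin_spacing(sr, duration):
    """Spacing in Hz between DFT bins for a frame of `duration` seconds.

    Hypothetical helper, for illustration only.
    """
    n = int(round(sr * duration))            # number of samples in the frame
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)   # bin frequencies in Hz, i.e. k / T
    return freqs[1] - freqs[0]

sr = 44100
one_second = bin_spacing(sr, 1.0)            # about 1 Hz
hundred_seconds = bin_spacing(sr, 100.0)     # about 0.01 Hz
doubled_rate = bin_spacing(2 * sr, 1.0)      # still about 1 Hz: oversampling does not help
print(one_second, hundred_seconds, doubled_rate)
```

Only the frame duration changes the spacing. Doubling the sample rate adds bins at higher frequencies but leaves the existing ones where they were.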

Figure 1: even after applying a Tukey window, the DFT of the windowed signal is not a sum of Dirac deltas: there is still some spectral leakage at the bottom of the peaks. This power must be accounted for when the frequencies are estimated.

Another issue is that the length of the frame is not a multiple of the period of the signal, which may not be periodic anyway. Nevertheless, the DFT is computed as if the signal were periodic, which makes it discontinuous at the edges of the frame. This induces spurious frequencies, an effect known as spectral leakage. Windowing is the standard method to mitigate the problems caused by this artificial discontinuity: the value of a window decreases continuously to zero near the edges of the frame. Many window functions are available in scipy.signal. A window is applied as:

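# window functions live in scipy.signal.windows; recent SciPy releases no longer expose signal.tukey
tukey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tukey_window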
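
As a rough illustration of what the window buys, the sketch below compares how much spectral power ends up far from the peak with and without a Tukey window, for a single tone that falls between two DFT bins. The helper `leakage_fraction` and the 20-bin cutoff are assumptions chosen for this example, not something taken from the answer.

```python
import numpy as np
from scipy import signal

sr = 44100
n = sr
t = np.arange(n) / sr                      # one second of samples, endpoint excluded
tone = np.sin(2 * np.pi * 261.63 * t)      # frequency falls between two DFT bins

def leakage_fraction(frame, halfwidth=20):
    """Fraction of spectral power lying more than `halfwidth` bins from the peak.

    Hypothetical helper; the cutoff of 20 bins is arbitrary.
    """
    power = np.abs(np.fft.rfft(frame)) ** 2
    peak = int(np.argmax(power))
    lo, hi = max(peak - halfwidth, 0), peak + halfwidth + 1
    return 1.0 - power[lo:hi].sum() / power.sum()

rect_leak = leakage_fraction(tone)                                   # no window
tukey_leak = leakage_fraction(tone * signal.windows.tukey(n, 0.5))   # tapered edges
print(rect_leak, tukey_leak)
```

The windowed frame should leave a far smaller share of its power away from the peak, which is why the peaks stand out more clearly after windowing.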

At that point, the frequencies exhibiting the largest magnitude are still 262, 330 and 392. Applying a window only makes the peaks more visible: the DFT of the windowed signal features three distinct peaks, each with a central lobe and side lobes whose shape depends on the DFT of the window. The lobes of these windows are symmetric, so the central frequency can be computed as the mean frequency of the peak with respect to power density.

import numpy as np
from scipy import signal

sr = 44100 # sample rate
x = np.linspace(0, 1, sr) # one second of signal (endpoint included, so the sample spacing is 1/(sr-1))
tpi = 2 * np.pi
data = np.sin(261.63 * tpi * x) + np.sin(329.63 * tpi * x) + np.sin(392.00 * tpi * x)

# apply a Tukey window to taper the edges of the frame
tukey_window = signal.windows.tukey(len(data), 0.5, True)
data = data * tukey_window

data -= np.mean(data) # remove the DC component
fft = np.fft.rfft(data, norm="ortho")

def abs2(x):
    # squared magnitude, i.e. power
    return x.real**2 + x.imag**2

fftmag = abs2(fft)[:1000] # keep bins 0..999; the bin index is treated as Hz since the frame lasts about one second
peaks, _ = signal.find_peaks(fftmag, height=np.max(fftmag)*0.1)
print("potential frequencies", peaks)

# compute the mean frequency of each peak with respect to power density
powerpeak = np.zeros(len(peaks))
powerpeaktimefrequency = np.zeros(len(peaks))
for i in range(1000):
    # find the peak nearest to bin i
    dist = 1000
    jnear = 0
    for j in range(len(peaks)):
        if dist > np.abs(i - peaks[j]):
            dist = np.abs(i - peaks[j])
            jnear = j
    # accumulate the power and the power-weighted bin index for that peak
    powerpeak[jnear] += fftmag[i]
    powerpeaktimefrequency[jnear] += fftmag[i] * i

powerpeaktimefrequency = np.divide(powerpeaktimefrequency, powerpeak)
print("corrected frequencies", powerpeaktimefrequency)

The resulting estimated frequencies are 261.6359 Hz, 329.637 Hz and 392.0088 Hz: much better than 262, 330 and 392 Hz, and within the required 0.01 Hz accuracy for such a pure, noiseless input signal.
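
The same power-weighted mean can be written without the explicit double loop. The version below is only a sketch under two assumptions that differ from the code above: the time axis is built with `np.arange(n) / sr` so that the frame lasts exactly one second, and the bin frequencies come from `np.fft.rfftfreq` rather than from the bin index. With `np.linspace(0, 1, sr)` the endpoint is included, so the sample spacing is 1/(sr-1) and each bin index corresponds to a frequency smaller by a factor (sr-1)/sr, which appears to account for most of the small residual offset in the figures above.

```python
import numpy as np
from scipy import signal

sr = 44100
n = sr
t = np.arange(n) / sr                        # assumption: endpoint excluded, frame is exactly 1 s
tones = [261.63, 329.63, 392.00]
data = sum(np.sin(2 * np.pi * f * t) for f in tones)
data = data * signal.windows.tukey(n, 0.5)   # taper the edges of the frame
data -= data.mean()                          # remove the DC component

power = np.abs(np.fft.rfft(data)) ** 2
freqs = np.fft.rfftfreq(n, d=1.0 / sr)       # bin frequencies in Hz
power, freqs = power[:1000], freqs[:1000]    # same low-frequency band as the loop version

peaks, _ = signal.find_peaks(power, height=0.1 * power.max())

# Assign every bin to its nearest peak, then take the power-weighted mean frequency
bins = np.arange(len(power))
nearest = np.argmin(np.abs(bins[:, None] - peaks[None, :]), axis=1)
total = np.bincount(nearest, weights=power, minlength=len(peaks))
weighted = np.bincount(nearest, weights=power * freqs, minlength=len(peaks))
estimates = weighted / total
print(estimates)
```

The estimates should land close to the three encoded frequencies. For noisy or closely spaced tones, the nearest-peak assignment and the 10% height threshold would likely need tuning.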
