Fast Fourier Transform in Python

Problem description

I am new to Fourier theory and I've seen very good tutorials on how to apply an FFT to a signal and plot it in order to see the frequencies it contains. Somehow, all of them create a mix of sines as their data, and I am having trouble adapting that to my real problem.

I have 242 hourly observations with a daily periodicity, meaning that my period is 24 hours. So I expect to have a peak around 24 on my FFT plot.

A sample of my data.csv is here: https://pastebin.com/1srKFpJQ

Data plotted:

My code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

data = pd.read_csv('data.csv',index_col=0)
data.index = pd.to_datetime(data.index)
data = data['max_open_files'].astype(float).values

N = data.shape[0] #number of elements
t = np.linspace(0, N * 3600, N) #converting hours to seconds
s = data

fft = np.fft.fft(s)
T = t[1] - t[0]

f = np.linspace(0, 1 / T, N)
plt.ylabel("Amplitude")
plt.xlabel("Frequency [Hz]")
plt.bar(f[:N // 2], np.abs(fft)[:N // 2] * 1 / N, width=1.5)  # 1 / N is a normalization factor
plt.show()

This outputs a very weird result where it seems I am getting the same value for every frequency.

I suppose that the problem comes from the definition of N, t and T, but I cannot find anything online that has helped me understand this clearly. Please help :)

EDIT1:

With the code from Charles's answer I get a spike around 0 that seems very weird. I used rfft and rfftfreq instead to avoid having too many frequencies.

I have read that this might be because of the DC component of the series, so after subtracting the mean I get:

I am having trouble interpreting this. The spikes seem to occur periodically, but the values in Hz don't let me recover my value of 24 (the overall frequency). Does anybody know how to interpret this? What am I missing?

Solution

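The problem you're seeing is because the bars are too wide, so you're only seeing one bar. Your frequency axis only runs from 0 to roughly 1.4e-4 Hz, and with width=1.5 every bar is thousands of times wider than that whole range, so they all overlap. You will have to change the width of the bars to 0.00001 or smaller to see them show up.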
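
To put numbers on the bar-width problem, here is a minimal sketch that rebuilds the frequency axis exactly as the question's code does and compares it with width=1.5. It only assumes N = 242 (the number of observations stated in the question), so no data file is needed. The names axis_span and bin_spacing are introduced here for illustration.

```python
import numpy as np

N = 242                           # number of hourly observations (from the question)
t = np.linspace(0, N * 3600, N)   # time axis as built in the question
T = t[1] - t[0]                   # sample spacing in seconds
f = np.linspace(0, 1 / T, N)      # frequency axis as built in the question

# The question plots f[:N // 2], i.e. indices 0 .. N // 2 - 1.
axis_span = f[N // 2 - 1] - f[0]  # range of x values actually plotted
bin_spacing = f[1] - f[0]         # distance between neighbouring bars

print(f"plotted axis span: {axis_span:.3e} Hz")
print(f"bar spacing:       {bin_spacing:.3e} Hz")
print(f"width=1.5 is about {1.5 / axis_span:.0f} times the whole axis span")
```

Because neighbouring bars are only about 1e-6 Hz apart, a width of that order (or a line plot) is what keeps individual bins distinguishable.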

Instead of using a bar chart, build your x axis with np.fft.fftfreq and use the plot function. Passing the sample spacing in seconds as d gives the axis in Hz, and plotting np.abs(fft) shows the magnitude, since fft itself is complex:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

data = pd.read_csv('data.csv', index_col=0)
data.index = pd.to_datetime(data.index)
s = data['max_open_files'].astype(float).values

N = s.shape[0]  # number of elements
T = 3600.0      # sample spacing in seconds (one observation per hour)

fft = np.fft.fft(s)
fftfreq = np.fft.fftfreq(N, d=T)  # frequency of each FFT bin, in Hz

plt.ylabel("Amplitude")
plt.xlabel("Frequency [Hz]")
plt.plot(fftfreq, np.abs(fft))  # plot the magnitude, not the complex values
plt.show()
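
On the follow-up about reading the peak in Hz: the sketch below shows one way to turn the strongest frequency bin into a period in hours. It uses a synthetic signal as a hypothetical stand-in for the max_open_files column (242 hourly samples, a 24-hour sine, a constant offset, and a little seeded noise), so the numbers are illustrative rather than taken from the real data. It subtracts the mean to remove the DC component and uses rfft/rfftfreq, as the question's edit describes.

```python
import numpy as np

# Hypothetical stand-in for the 'max_open_files' column: 242 hourly samples
# with a 24-hour cycle, a constant offset, and a little noise.
rng = np.random.default_rng(0)
N = 242
hours = np.arange(N)
s = 100 + 10 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, N)

T = 3600.0                                         # sample spacing in seconds
spectrum = np.abs(np.fft.rfft(s - s.mean())) / N   # subtract the mean to drop the DC term
freqs = np.fft.rfftfreq(N, d=T)                    # bin frequencies in Hz

peak = np.argmax(spectrum[1:]) + 1                 # skip the zero-frequency bin
peak_hz = freqs[peak]
period_hours = 1 / peak_hz / 3600                  # convert Hz to a period in hours

print(f"peak at {peak_hz:.3e} Hz, period of about {period_hours:.1f} hours")
```

A 24-hour cycle corresponds to 1 / (24 * 3600), roughly 1.16e-5 Hz, so on a Hz axis the peak sits near that value rather than near 24. The only periods the FFT can resolve are N / k hours for integer k, so with 242 samples the closest bin is 24.2 hours, not exactly 24.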
