How to smooth a timeseries of yearly data with lowess in Python
Question
I have some data that were recorded yearly, as follows.
mydata = [0.6619346141815186, 0.7170140147209167, 0.692265510559082, 0.6394098401069641, 0.6030995845794678, 0.6500746607780457, 0.6013327240943909, 0.6273292303085327, 0.5865356922149658, 0.6477396488189697, 0.5827181339263916, 0.6496025323867798, 0.6589270234107971, 0.5498126149177551, 0.48638370633125305, 0.5367399454116821, 0.517595648765564, 0.5171639919281006, 0.47503289580345154, 0.6081966757774353, 0.5808742046356201, 0.5856912136077881, 0.5608134269714355, 0.6400936841964722, 0.6766082644462585]
corresponding_year = [1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994]
I used the statsmodels Python package to calculate lowess as follows.
import statsmodels.api as sm
lowess = sm.nonparametric.lowess
z = lowess(x, y, frac= 1./3, it=3)
The output I get is as follows.
[[1.96000000e+03, 6.95703548e-01],
[1.96100000e+03, 6.81750671e-01],
[1.96200000e+03, 6.68002318e-01],
[1.96300000e+03, 6.55138324e-01],
[1.96400000e+03, 6.38960761e-01],
[1.96500000e+03, 6.25042177e-01],
[1.96600000e+03, 6.18586936e-01],
[1.96700000e+03, 6.17026334e-01],
[1.96800000e+03, 6.14565102e-01],
[1.96900000e+03, 6.17610340e-01],
[1.97000000e+03, 6.20404414e-01],
[1.97100000e+03, 6.10193222e-01],
[1.97200000e+03, 5.90100648e-01],
[1.97300000e+03, 5.70935248e-01],
[1.97400000e+03, 5.47818726e-01],
[1.97500000e+03, 5.25788570e-01],
[1.97600000e+03, 5.18661218e-01],
[1.97700000e+03, 5.28921300e-01],
[1.97800000e+03, 5.42783400e-01],
[1.97900000e+03, 5.55425915e-01],
[1.98000000e+03, 5.71486587e-01],
[1.98100000e+03, 5.91539778e-01],
[1.98200000e+03, 6.13021691e-01],
[1.98300000e+03, 6.34508409e-01],
[1.98400000e+03, 6.57703989e-01]]
However, I am not clear on what the two values I get from statsmodels are. Am I doing something wrong? I would also like to know what the two parameters frac and it do.
I would also like to plot the smoothed timeseries using seaborn. It seems that seaborn supports lowess, but it does not expose the frac and it parameters. See the code below.
import numpy as np
import seaborn as sns
x = np.arange(0, 10, 0.01)
ytrue = np.exp(-x / 5) + 2 * np.sin(x / 3)
y = ytrue + np.random.normal(size=len(x))
sns.regplot(x=x, y=y, lowess=True)  # recent seaborn versions require x and y as keywords
In that case, is it possible to draw a regplot in seaborn using the statsmodels output?
I am happy to provide more details if needed.
Recommended answer
The lowess result can be plotted as shown in the code below. Note that the first argument of lowess() is the y-values (endog) and the second is the x-values (exog), so the call in the question has them swapped. By default the result is sorted by x: z[:,0] holds the sorted x-values and z[:,1] the corresponding estimated y-values.
import matplotlib.pyplot as plt
import statsmodels.api as sm
import numpy as np
mydata = [0.6619346141815186, 0.7170140147209167, 0.692265510559082, 0.6394098401069641, 0.6030995845794678, 0.6500746607780457, 0.6013327240943909, 0.6273292303085327, 0.5865356922149658, 0.6477396488189697, 0.5827181339263916, 0.6496025323867798, 0.6589270234107971, 0.5498126149177551, 0.48638370633125305, 0.5367399454116821, 0.517595648765564, 0.5171639919281006, 0.47503289580345154, 0.6081966757774353, 0.5808742046356201, 0.5856912136077881, 0.5608134269714355, 0.6400936841964722, 0.6766082644462585]
corresponding_year = [1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994]
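x = np.array(corresponding_year)
y = np.array(mydata)
# lowess takes the y-values (endog) first, then the x-values (exog)
z = sm.nonparametric.lowess(y, x, frac= 1./3, it=3)
plt.plot(x, y, color='dodgerblue')  # raw yearly data
plt.plot(z[:,0], z[:,1], 'ro-')  # smoothed curve: sorted x vs. estimated y
plt.show()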
PS: To compare with seaborn's regplot on the same plot, call it as:
sns.regplot(x=x, y=y, lowess=True, ax=plt.gca())