Keras LSTM:时间序列多步骤多特征预测-效果不佳 [英] Keras LSTM: a time-series multi-step multi-features forecasting - poor results

查看:2554
本文介绍了Keras LSTM:时间序列多步骤多特征预测-效果不佳的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个时间序列数据集,其中包含全年的数据(日期是索引).数据每15分钟(全年)进行一次测量,结果每天需要96个时间步.数据已经标准化.变量是相关的.除了VAR以外的所有变量都是天气测量值.

I have a time series dataset containing data from a whole year (date is the index). The data was measured every 15 min (during whole year) which results in 96 timesteps a day. The data is already normalized. The variables are correlated. All the variables except the VAR are weather measures.

VAR在一天和一周的时间内是季节性的(因为周末看起来有些不同,但每个周末都更少). VAR值是固定的. 我想预测接下来两天(提前192步)和接下来7天(提前672步)的VAR值.

VAR is seasonal in a day period and in a week period (as it looks a bit different on weekend, but more less the same every weekend). VAR values are stationary. I would like to predict values of VAR for next two days (192 steps ahead) and for next seven days (672 steps ahead).

这是数据集的样本:

DateIdx               VAR       dewpt       hum         press       temp
2017-04-17 00:00:00   0.369397  0.155039    0.386792    0.196721    0.238889
2017-04-17 00:15:00   0.363214  0.147287    0.429245    0.196721    0.233333
2017-04-17 00:30:00   0.357032  0.139535    0.471698    0.196721    0.227778
2017-04-17 00:45:00   0.323029  0.127907    0.429245    0.204918    0.219444
2017-04-17 01:00:00   0.347759  0.116279    0.386792    0.213115    0.211111
2017-04-17 01:15:00   0.346213  0.127907    0.476415    0.204918    0.169444
2017-04-17 01:30:00   0.259660  0.139535    0.566038    0.196721    0.127778
2017-04-17 01:45:00   0.205564  0.073643    0.523585    0.172131    0.091667
2017-04-17 02:00:00   0.157650  0.007752    0.481132    0.147541    0.055556
2017-04-17 02:15:00   0.122101  0.003876    0.476415    0.122951    0.091667

输入数据集图

我决定在Keras中使用LSTM.有了全年的数据,我将过去329天的数据用作培训数据,其余的数据用于培训期间的验证. train_X->包含包括329天的VAR在内的整个度量 train_Y->仅包含329天的VAR.该值向前移动了一步. 其余的时间步分别进入test_X和test_Y.

I have decided to use LSTM in Keras. Having data from the whole year, I have used data from past 329 days as a training data and the rest for a validation during the training. train_X -> contains whole measures including VAR from 329 days train_Y -> contains only VAR from 329 days. The value is shifted one step ahead. The rest timesteps goes to test_X and test_Y.

这是我准备的代码train_X和train_Y:

Here is the code I prepare train_X and train_Y:

#X -> is the whole dataframe
#Y -> is a vector of VAR from whole dataframe, already shifted 1 step ahead

#329 * 96 = 31584
train_X = X[:31584]
train_X = train_X.reshape(train_X.shape[0],1,5)
train_Y = Y[:31584]  
train_Y = train_Y.reshape(train_Y.shape[0],1)

要预测下一个VAR值,我想使用过去的672个时间步长(整个星期的度量值).因此,我设置了batch_size=672,以使"fit"命令如下所示:

To predict next VAR value I would like to use past 672 timesteps (whole week measures). For this reason I have set batch_size=672, so that the ‘fit’ command look like this:

history = model.fit(train_X, train_Y, epochs=50, batch_size=672, validation_data=(test_X, test_Y), shuffle=False)  

这是我的网络体系结构:

Here is the architecture of my network:

model = models.Sequential()
model.add(layers.LSTM(672, input_shape=(None, 672), return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(336, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(168, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(84, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(21, return_sequences=False))
model.add(layers.Dense(1))
model.compile(loss='mae', optimizer='adam')
model.summary()

在下面的图中,我们可以看到网络在50个历元之后学会了某些东西":

From the plot below we can see that the network has learn ‘something’ after 50 epochs:

从学习过程中绘制图

出于预测目的,我准备了一组数据,其中包含最后的672个步长(所有值)和96个没有VAR值的数据-应该进行预测.我还使用了自回归,因此我在每次预测后都更新了VAR,并将其用于下一个预测.

For the prediction purpose I have prepared a set of data containing last 672 steps with all values and 96 without VAR value – which should be predicted. I also used autoregression, so I updated VAR after each prediction and used it for next prediction.

predX数据集(用于预测)如下:

The predX dataset (used for prediction) looks like this:

print(predX['VAR'][668:677])

DateIdx            VAR
2017-04-23 23:00:00    0.307573
2017-04-23 23:15:00    0.278207
2017-04-23 23:30:00    0.284390
2017-04-23 23:45:00    0.309118
2017-04-24 00:00:00         NaN
2017-04-24 00:15:00         NaN
2017-04-24 00:30:00         NaN
2017-04-24 00:45:00         NaN
2017-04-24 01:00:00         NaN
Name: VAR, dtype: float64

这是我用来预测接下来的96个步骤的代码(自回归):

Here is the code (autoregression) I have used to predict next 96 steps:

stepsAhead = 96
historySteps = 672

for i in range(0,stepsAhead):
    j = i + historySteps
    ypred = model.predict(predX.values[i:j].reshape(1,historySteps,5))
    predX['VAR'][j] = ypred

不幸的是,结果非常差,而且与预期相差甚远:

Unfortunately the results are very poor and very far from the expectations:

包含预测数据的图

结果与前一天相结合:

结合前一天的预测数据

除了我做错了什么"问题外,我想问几个问题:

Except from the ‘What have I done wrong‘ question, I would like to ask a few questions:

第一季度.在模型确定过程中,我将整个历史记录分为672个大小.这是正确的吗?如何组织用于模型拟合的数据集?我有什么选择?我是否应该使用滑动窗口"方法(例如此处的链接: https://machinelearningmastery.com/promise-recurrent-neural-networks-time-series-forecasting/)?

Q1. During model fifing, I have just put the whole history in batches of 672 size. Is it correct? How should I organize the dataset for the model fitting? What options do I have? Should I use the "sliding window" approach (like in the link here: https://machinelearningmastery.com/promise-recurrent-neural-networks-time-series-forecasting/ )?

第二季度. 50个纪元足够吗?这里的惯例是什么?可能是网络不完善,导致预测不佳?到目前为止,我尝试了200个纪元,但结果相同.

Q2. Is the 50 epochs enough? What is the common practice here? Maybe the network is underfitted resulting in poor prediction? So far I tried 200 epoch with the same result.

第三季度.我应该尝试其他架构吗?拟议的网络足够大"来处理此类数据吗?也许有状态的"网络在这里是正确的方法?

Q3. Should I try a different architecture? Is the proposed network ‘big enough’ to handle such a data? Maybe a "stateful" network is the right approach here?

第四季度.我是否正确实现了自回归?是否有其他方法可以预测未来的许多步骤,例如像我这样的192或672?

Q4. Did I implement the autoregression correctly? Is there any other approach to make a prediction for many steps ahead e.g. 192 or 672 like in my case?

推荐答案

对于如何组织数据来训练RNN似乎有些困惑.因此,让我们涵盖以下问题:

It looks like there is a confusion on how to organise the data to train a RNN. So let's cover the questions:

  1. 拥有2D数据集(total_samples, 5)后,您可以使用 TimeseriesGenerator 创建一个滑动窗口将为您生成(batch_size, past_timesteps, 5).在这种情况下,您将使用.fit_generator训练网络.
  2. 如果得到相同的结果,则应间隔50个纪元.通常,您会根据网络的性能进行调整.但是,如果要比较两种不同的网络体系结构,则应将其固定.
  3. 架构非常庞大,因为您希望一次预测所有672个未来值.您可以设计网络,使其学会一次预测一种测量.在预测时,您可以预测一个点,然后再次输入以预测下一个点,直到获得672.
  4. 这与答案3相关联,您可以一次学习预测一个,并在训练后将预测链接到n个预测.
  1. Once you have a 2D dataset (total_samples, 5) you can use the TimeseriesGenerator to create a sliding window what will generate (batch_size, past_timesteps, 5) for you. In this case, you will use .fit_generator to train the network.
  2. If you get the same result, 50 epochs should be fine. You usually adjust based on the performance of your network. But you should keep it fixed if you are comparing two different network architectures.
  3. Architecture is really large as you aim to predict all 672 future values at once. You can design the network so it learns to predict one measurement at a time. At prediction time you can predict one point and feed that again to predict the next until you get 672.
  4. This ties into answer 3, you can learn to predict one at a time and chain the predictions to n number of predictions after training.

单点预测模型可能类似于:

The single point prediction model could look like:

model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(past_timesteps, 5))
model.add(LSTM(64))
model.add(Dense(1))

这篇关于Keras LSTM:时间序列多步骤多特征预测-效果不佳的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆