How to deal with multi-step time series forecasting in multivariate LSTM in Keras

Problem description

I am trying to do multi-step time series forecasting using a multivariate LSTM in Keras. Specifically, I originally have two variables (var1 and var2) for each time step. Having followed the online tutorial here, I decided to use the data at times (t-2) and (t-1) to predict the value of var2 at time step t. As the sample data table shows, I am using the first 4 columns as input and Y as output. The code I have developed can be seen here, but I have three questions.

   var1(t-2)  var2(t-2)  var1(t-1)  var2(t-1)  var2(t)
2        1.5       -0.8        0.9       -0.5     -0.2
3        0.9       -0.5       -0.1       -0.2      0.2
4       -0.1       -0.2       -0.3        0.2      0.4
5       -0.3        0.2       -0.7        0.4      0.6
6       -0.7        0.4        0.2        0.6      0.7

  1. Q1: I have trained an LSTM model with the data above. This model does well in predicting the value of var2 at time step t. However, what if I want to predict var2 at time step t+1? I feel it is hard because the model cannot tell me the value of var1 at time step t. If I want to do it, how should I modify the code to build the model?
  2. Q2: I have seen this question asked a lot, but I am still confused. In my example, what should be the correct time step in [samples, time steps, features]: 1 or 2?
  3. Q3: I have just started studying LSTMs. I have read here that one of the biggest advantages of LSTM is that it learns the temporal dependence/sliding window size by itself; then why must we always convert time series data into a format like the table above?

Update: LSTM result (blue line is the training sequence, orange line is the ground truth, green is the prediction)

Answer

Question 1:

From your table, I see you have a sliding window over a single sequence, making many smaller sequences with 2 steps.

  • To predict t, you take the first line of your table as input.
  • To predict t+1, you take the second line as input.

If you're not using the table: see Question 3.

Assuming you're using that table as input, where it's clearly a sliding window case taking two time steps as input, your timeSteps is 2.

You should probably work as if var1 and var2 were features in the same sequence:

  • input_shape = (2,2) - Two time steps and two features/vars.
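
For instance, a minimal sketch of such a windowed model (the 32-unit layer size and the compile settings are my assumptions, not taken from the question):

from keras.models import Sequential
from keras.layers import LSTM, Dense

windowModel = Sequential()
windowModel.add(LSTM(32, input_shape=(2, 2)))  # 2 time steps, 2 features; 32 units is arbitrary
windowModel.add(Dense(1))                      # predicts var2(t)
windowModel.compile(loss='mse', optimizer='adam')
# here X would have shape (samples, 2, 2) and y shape (samples, 1)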

We do not need to make tables like that or build a sliding-window case; that is just one possible approach.

Your model is actually capable of learning things and deciding the size of this window itself.

If on one hand your model is capable of learning long time dependencies, allowing you not to use windows, on the other hand it may learn to identify different behaviors at the beginning and at the middle of a sequence. In this case, if you want to predict using sequences that start from the middle (not including the beginning), your model may work as if it were the beginning and predict a different behavior. Using windows eliminates this very long influence. Which is better may depend on testing, I guess.

Not using windows:

If your data has 800 steps, feed all the 800 steps at once for training.

Here, we will need to separate two models, one for training, another for predicting. In training, we will take advantage of the parameter return_sequences=True. This means that for each input step, we will get an output step.

For predicting later, we will want only one output, so we will use return_sequences=False. And in case we are going to use the predicted outputs as inputs for the following steps, we are going to use a stateful=True layer.

Training:

Have your input data shaped as (1, 799, 2), 1 sequence, taking the steps from 1 to 799. Both vars in the same sequence (2 features).

Have your target data (Y) shaped also as (1, 799, 2), taking the same steps shifted, from 2 to 800.
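
As a concrete sketch (assuming the whole series lives in a NumPy array called data with shape (800, 2); the random array is just a placeholder):

import numpy as np

data = np.random.rand(800, 2)     # placeholder for your real 800-step, 2-variable series
X = data[:-1].reshape(1, 799, 2)  # steps 1 to 799
Y = data[1:].reshape(1, 799, 2)   # the same steps shifted, 2 to 800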

Build a model with return_sequences=True. You may use timeSteps=799, but you may also use None (allowing variable amount of steps).

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(32, input_shape=(None, 2), return_sequences=True))  # 32 units is an arbitrary choice
model.add(LSTM(2, return_sequences=True))  # it could be a Dense(2) too
model.compile(loss='mse', optimizer='adam')
model.fit(X, Y, epochs=100)  # training settings are up to you

Predicting:

For predicting, create a similar model, now with return_sequences=False.
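
A minimal sketch of that prediction model (assuming the same 32-unit first layer as the training model above):

newModel = Sequential()
newModel.add(LSTM(32, input_shape=(None, 2), return_sequences=True))  # same sizes as the training model
newModel.add(LSTM(2, return_sequences=False))  # now only the last step is output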

Copy the weights:

newModel.set_weights(model.get_weights())

You can make an input with length 800, for instance (shape: (1,800,2)) and predict just the next step:

step801 = newModel.predict(X)

If you want to predict more, we are going to use stateful=True layers. Use the same model again, now with return_sequences=False (only in the last LSTM; the others keep True) and stateful=True (all of them). Replace input_shape with batch_input_shape=(1,None,2).
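
A sketch of how that stateful model might be built (layer sizes again assumed to match the training model):

statefulModel = Sequential()
statefulModel.add(LSTM(32, batch_input_shape=(1, None, 2), return_sequences=True, stateful=True))
statefulModel.add(LSTM(2, return_sequences=False, stateful=True))
statefulModel.set_weights(model.get_weights())  # reuse the trained weights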

#with stateful=True, your model will never think that the sequence ended
#each new batch will be seen as new steps instead of new sequences
#because of this, we need to call this when we want a sequence starting from zero:
statefulModel.reset_states()

#predicting
X = steps1to800 #input, shape (1, 800, 2)
step801 = statefulModel.predict(X).reshape(1,1,2)
step802 = statefulModel.predict(step801).reshape(1,1,2)
step803 = statefulModel.predict(step802).reshape(1,1,2)
#the reshape is because return_sequences=False eliminates the step dimension

Actually, you could do everything with a single stateful=True and return_sequences=True model, taking care of two things:

  • When training, call reset_states() for every epoch. (Train with a manual loop and epochs=1; see the sketch after this list.)
  • When predicting more than one step ahead, take only the last step of the output as the desired result.
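
A minimal sketch of that manual training loop, assuming a single stateful, return_sequences=True model (reusing the statefulModel name and the (1, 799, 2)-shaped X and Y from above; the epoch count is arbitrary):

for epoch in range(100):
    statefulModel.reset_states()  # start every epoch from a clean state
    statefulModel.fit(X, Y, epochs=1, batch_size=1, shuffle=False)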
