How to use deep learning models for time-series forecasting?

Question

I have signals recorded from machines (m1, m2, so on) for 28 days. (Note: each signal in each day is 360 length long).

machine_num, day1, day2, ..., day28
m1, [12, 10, 5, 6, ...], [78, 85, 32, 12, ...], ..., [12, 12, 12, 12, ...]
m2, [2, 0, 5, 6, ...], [8, 5, 32, 12, ...], ..., [1, 1, 12, 12, ...]
...
m2000, [1, 1, 5, 6, ...], [79, 86, 3, 1, ...], ..., [1, 1, 12, 12, ...]

I want to predict the signal sequence of each machine for the next 3 days, i.e. day 29, day 30, and day 31. However, I don't have values for days 29, 30, and 31. So my plan was to use an LSTM model as follows.

In the first step, the network gets the signals for day 1 and is asked to predict the signals for day 2; in the next step it gets the signals for days 1 and 2 and is asked to predict the signals for day 3, and so on, so that when I reach day 28, the network has all the signals up to day 28 and is asked to predict the signals for day 29, etc.

I tried to build a univariate LSTM model as follows.

# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# define dataset
X = array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = array([40, 50, 60, 70])
# reshape from [samples, timesteps] into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([50, 60, 70])
x_input = x_input.reshape((1, 3, 1))
yhat = model.predict(x_input, verbose=0)
print(yhat)

However, this example is very simple since it does not have long sequences like mine. For example, my data for m1 would look as follows.

m1 = [[12, 10, 5, 6, ...], [78, 85, 32, 12, ...], ..., [12, 12, 12, 12, ...]]
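For concreteness, the full dataset can be stacked into a single array; here is a minimal sketch, with random placeholders standing in for the real signals:

import numpy as np

# one (days, signal_length) block per machine: m1 .. m2000
machines = [np.random.rand(28, 360) for _ in range(2000)]  # placeholder signals

data = np.stack(machines)  # shape (2000, 28, 360) = (machines, days, 360)
print(data.shape)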

Moreover, I need predictions for days 29, 30, and 31. In that case, I am unsure how to change this example to cater to my needs. I specifically want to know whether the direction I have chosen is correct, and if so, how to do it.

I am happy to provide more details if needed.

EDIT:

I have mentioned model.summary().

Solution

Model and shapes

Since these are sequences within sequences, you need to use your data in a different format.

Although you could simply use (machines, days, 360) and treat the 360 values as features (that could work up to a point), for a robust model you'd need to treat both dimensions as sequences (at the cost of a possible speed problem).

So I'd go with data shaped (machines, days, 360, 1) and two levels of recurrency.

Our model's input_shape would then be (None, 360, 1).

Model case 1 - Day recurrency only

Data shape: (machines, days, 360)
Apply some normalization to the data.
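For instance, a per-machine standardization is one option; a minimal NumPy sketch (the exact scaling choice is an assumption, not part of this answer):

import numpy as np

# data: (machines, days, 360), e.g. (2000, 28, 360); placeholder values here
data = np.random.rand(2000, 28, 360)

# standardize each machine over all of its recorded values
mean = data.mean(axis=(1, 2), keepdims=True)  # (machines, 1, 1)
std = data.std(axis=(1, 2), keepdims=True)    # (machines, 1, 1)
normalized = (data - mean) / (std + 1e-8)     # guard against zero variance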

Here is an example, but the model can be flexible, since you can add more layers, try convolutions, etc.:

from keras.models import Model
from keras.layers import Input, LSTM, Dense, Reshape

inputs = Input((None, 360)) #(m, d, 360)
outs = LSTM(some_units, return_sequences=False, 
            stateful=depends_on_training_approach)(inputs)  #(m, some_units)
outs = Dense(360, activation=depends_on_your_normalization)(outs) #(m, 360)
outs = Reshape((1, 360))(outs) #(m, 1, 360)
    #this reshape is not necessary if using the "shifted" approach - see time windows below
    #it would then be (m, d, 360)

model = Model(inputs, outs)
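To actually run the snippet above, the placeholders need values. A hedged sketch filling them in (all choices are assumptions, not prescriptions) and training on the plain "predict the next day" target:

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Reshape

# illustrative placeholder values
some_units = 64
depends_on_training_approach = False       # stateful=False allows a plain fit()
depends_on_your_normalization = 'sigmoid'  # e.g. for data scaled to [0, 1]

inputs = Input((None, 360))
outs = LSTM(some_units, stateful=depends_on_training_approach)(inputs)
outs = Dense(360, activation=depends_on_your_normalization)(outs)
outs = Reshape((1, 360))(outs)
model = Model(inputs, outs)
model.compile(optimizer='adam', loss='mse')

# days 1..27 as input, day 28 as the single-step target (placeholder data)
data = np.random.rand(2000, 28, 360)
model.fit(data[:, :27, :], data[:, 27:28, :], epochs=10, batch_size=32)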

Depending on the complexity of the intra-day sequences, this could already predict them well; but if they evolve in a complex way, the next model would be a little better.

Always remember that you can create more layers and explore things to increase the capability of this model; this is only a tiny example.

Model case 2 - Two-level recurrency

Data shape: (machines, days, 360, 1)
Apply some normalization to the data.

There are many, many ways to experiment with this, but here is a simple one.

from keras.models import Model
from keras.layers import (Input, LSTM, Dense, Reshape, Lambda,
                          TimeDistributed, Bidirectional, Concatenate, Permute)
from keras import backend as K

inputs = Input((None, 360, 1)) #(m, d, 360, 1)

#branch 1
inner_average = TimeDistributed(
                    Bidirectional(
                        LSTM(units1, return_sequences=True, stateful=False),
                        merge_mode='ave'
                    )
                )(inputs) #(m, d, 360, units1)
inner_average = Lambda(lambda x: K.mean(x, axis=1))(inner_average) #(m, 360, units1)


#branch 2
inner_seq = TimeDistributed(
                LSTM(some_units, return_sequences=False, stateful=False)
            )(inputs) #may be Bidirectional too
            #shape (m, d, some_units)

outer_seq = LSTM(other_units, return_sequences = False, 
                 stateful=depends_on_training_approach)(inner_seq) #(m, other_units)

outer_seq = Dense(few_units * 360, activation = 'tanh')(outer_seq) #(m, few_units * 360)
    #activation = same as inner_average 


outer_seq = Reshape((360,few_units))(outer_seq) #(m, 360, few_units)


#join branches

outputs = Concatenate()([inner_average, outer_seq]) #(m, 360, units1+few_units)
outputs = LSTM(units, return_sequences=True, stateful= False)(outputs) #(m, 360,units)
outputs = Dense(1, activation=depends_on_your_normalization)(outputs) #(m, 360, 1)
outputs = Reshape((1,360))(outputs) #(m, 1, 360) for training purposes

model = Model(inputs, outputs)
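The names units1, some_units, other_units, few_units, units and the two depends_on_* flags above are placeholders. One hedged set of starting values (all assumptions) that lets the snippet above build, after which model.summary() shows the two levels of recurrency:

# illustrative placeholder values; define these before building the model above
units1 = 32        # width of the inner (within-day) bidirectional LSTM
some_units = 32    # inner LSTM that summarizes each day
other_units = 64   # outer LSTM over the day dimension
few_units = 4      # channels per 360-step after the Dense + Reshape
units = 32         # final LSTM over the 360 steps
depends_on_training_approach = False       # stateful flag, see below
depends_on_your_normalization = 'sigmoid'  # e.g. for data in [0, 1]

model.summary()  # run after building the model above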

This is one attempt; here I averaged over the days, but instead of inner_average I could have done something like:

#branch 1
daily_minutes = Permute((2,1,3))(inputs) #(m, 360, d, 1)
daily_minutes = TimeDistributed(
                    LSTM(units1, return_sequences=False, 
                         stateful=depends_on_training_approach)
                )(daily_minutes) #(m, 360, units1)

Many other ways of exploring the data are possible; this is a highly creative field. You could, for instance, use the daily_minutes approach right after inner_average while excluding the K.mean Lambda layer... you get the idea.

Time windows approach

Your approach sounds nice. Give one step to predict the next, give two steps to predict the third, give three steps to predict the fourth.

The models above are suited to this approach.

Keep in mind that very short inputs may be useless and may make your model worse. (Try to imagine how many steps would reasonably be enough for you to start predicting the next ones.)

Preprocess your data and divide it into groups:

  • group with length = 4 (for instance)
  • group with length = 5
  • ...
  • group with length = 28

You will need a manual training loop where in each epoch you feed each of these groups (you can't feed different lengths all together); a minimal sketch follows.
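A hedged sketch of such a loop for the Case 1 model (stateful=False assumed; data shapes and epoch count are illustrative):

import numpy as np

num_epochs = 10
data = np.random.rand(2000, 28, 360)  # placeholder for the real, normalized data

for epoch in range(num_epochs):
    for length in range(4, 28):            # input lengths 4 .. 27
        X = data[:, :length, :]            # first `length` days as input
        y = data[:, length:length + 1, :]  # the following day as target
        # `model` is the Case 1 model from above; in practice you would
        # also split the 2000 machines into smaller mini-batches
        loss = model.train_on_batch(X, y)
    print('epoch', epoch, 'loss', loss)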


Another approach is to give all the steps and make the model predict a shifted sequence, like the following (a training sketch comes after the list):

  • inputs = original_inputs[:, :-1] #exclude last training day
  • outputs = original_inputs[:, 1:] #exclude first training day
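In that case training can be a plain fit on shifted targets; a hedged sketch (the model adjustments it relies on are described right below):

import numpy as np

data = np.random.rand(2000, 28, 360)  # (machines, days, 360), placeholder

X = data[:, :-1, :]  # days 1..27
y = data[:, 1:, :]   # days 2..28, shifted by one day

# `model` must output (batch, days, 360) here: return_sequences=True in the
# day-level LSTM and no final Reshape((1, 360))
model.fit(X, y, epochs=10, batch_size=32)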

To make the models above suited to this approach, you need return_sequences=True in every LSTM that uses the day dimension as steps (not the inner_seq). (The inner_average method will fail, and you will have to resort to the daily_minutes approach with return_sequences=True and another Permute((2,1,3)) right after.)

Shapes would be:

  • branch1 : (m, d, 360, units1)
  • branch2 : (m, d, 360, few_units) - the Reshape needs adjusting for this
    • The reshapes using 1 timestep become unnecessary; the days dimension replaces the 1.
    • You may need to use Lambda layers to reshape considering the batch size and variable number of days (if details are needed, please tell me)

Training and predicting

(Sorry for not having the time to detail it now)

You can then follow the approaches mentioned here and here, which are more complete and include a few links. (Take care with the output shapes, though: in your question we always keep the time step dimension, even though it may be 1.)

The important points are:

  • If you opt for stateful=False:
    • this means easy training with fit (as long as you don't use the "different lengths" approach);
    • it also means you will need to build a new model with stateful=True and copy the weights of the trained model into it;
    • then you do the manual step-by-step prediction (see the sketch below)
  • If you opt for stateful=True from the beginning:
    • this necessarily means a manual training loop (using train_on_batch, for instance);
    • it necessarily means you will need model.reset_states() whenever you are going to present a batch whose sequences are not continuations of the last batch (every batch, if your batches contain whole sequences);
    • you don't need to build a new model to predict manually, but the manual prediction remains the same
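A hedged sketch of the prediction phase for days 29-31, assuming the Case 1 model was trained with stateful=False and is rebuilt here with stateful=True (unit counts, activations, and shapes are illustrative and must match the trained model):

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Reshape

# rebuild Case 1 as stateful; the batch size must now be fixed
pred_inputs = Input(batch_shape=(2000, None, 360))
x = LSTM(64, stateful=True)(pred_inputs)
x = Dense(360, activation='sigmoid')(x)
pred_outputs = Reshape((1, 360))(x)
pred_model = Model(pred_inputs, pred_outputs)
pred_model.set_weights(model.get_weights())  # copy the trained weights

data = np.random.rand(2000, 28, 360)  # placeholder for the real signals
pred_model.reset_states()
next_day = pred_model.predict(data, batch_size=2000)  # day 29; state is kept

predictions = [next_day]
for _ in range(2):  # days 30 and 31
    next_day = pred_model.predict(next_day, batch_size=2000)  # feed back
    predictions.append(next_day)

forecast = np.concatenate(predictions, axis=1)  # (2000, 3, 360)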
