序列到序列-用于时间序列预测 [英] Sequence to Sequence - for time series prediction

查看：103 发布时间：2020/6/21 19:41:07 tensorflow machine-learning keras attention-model sequence-to-sequence

本文介绍了序列到序列-用于时间序列预测的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我已尝试构建一个序列到序列模型，以基于其前几个输入来预测传感器信号随时间的推移(请参见下图)

I've tried to build a sequence to sequence model to predict a sensor signal over time based on its first few inputs (see figure below)

该模型可以正常工作，但是我想为事情加分"，并尝试在两个LSTM层之间添加一个关注层.

The model works OK, but I want to 'spice things up' and try to add an attention layer between the two LSTM layers.

型号代码:

def train_model(x_train, y_train, n_units=32, n_steps=20, epochs=200,
                n_steps_out=1):

    filters = 250
    kernel_size = 3

    logdir = os.path.join(logs_base_dir, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
    tensorboard_callback = TensorBoard(log_dir=logdir, update_freq=1)

    # get number of features from input data
    n_features = x_train.shape[2]
    # setup network
    # (feel free to use other combination of layers and parameters here)
    model = keras.models.Sequential()
    model.add(keras.layers.LSTM(n_units, activation='relu',
                                return_sequences=True,
                                input_shape=(n_steps, n_features)))
    model.add(keras.layers.LSTM(n_units, activation='relu'))
    model.add(keras.layers.Dense(64, activation='relu'))
    model.add(keras.layers.Dropout(0.5))
    model.add(keras.layers.Dense(n_steps_out))
    model.compile(optimizer='adam', loss='mse', metrics=['mse'])
    # train network
    history = model.fit(x_train, y_train, epochs=epochs,
                        validation_split=0.1, verbose=1, callbacks=[tensorboard_callback])
    return model, history

我查看了文档，但我有点迷失了.

I've looked at the documentation but I'm a bit lost. Any help adding the attention layer or comments on the current model would be appreciated

更新: 在Googeling之后，我开始认为我做错了所有事情，并重新编写了代码.

Update: After Googeling around, I'm starting to think I got it all wrong and I rewrote my code.

我正在尝试迁移在此 GitHub存储库中找到的seq2seq模型/a>.在存储库代码中，所展示的问题是预测基于某些早期样本的随机生成的正弦波.

I'm trying to migrate a seq2seq model that I've found in this GitHub repository. In the repository code the problem demonstrated is predicting a randomly generated sine wave baed on some early samples.

我有一个类似的问题，我正在尝试更改代码以适合我的需求.

I have a similar problem, and I'm trying to change the code to fit my needs.

差异:

我的训练数据形状为(439，5，20)439个不同的信号，5个时间步长，每个步长都具有20个特征
在拟合数据时我没有使用fit_generator

超级参数:

layers = [35, 35] # Number of hidden neuros in each layer of the encoder and decoder

learning_rate = 0.01
decay = 0 # Learning rate decay
optimiser = keras.optimizers.Adam(lr=learning_rate, decay=decay) # Other possible optimiser "sgd" (Stochastic Gradient Descent)

num_input_features = train_x.shape[2] # The dimensionality of the input at each time step. In this case a 1D signal.
num_output_features = 1 # The dimensionality of the output at each time step. In this case a 1D signal.
# There is no reason for the input sequence to be of same dimension as the ouput sequence.
# For instance, using 3 input signals: consumer confidence, inflation and house prices to predict the future house prices.

loss = "mse" # Other loss functions are possible, see Keras documentation.

# Regularisation isn't really needed for this application
lambda_regulariser = 0.000001 # Will not be used if regulariser is None
regulariser = None # Possible regulariser: keras.regularizers.l2(lambda_regulariser)

batch_size = 128
steps_per_epoch = 200 # batch_size * steps_per_epoch = total number of training examples
epochs = 100

input_sequence_length = n_steps # Length of the sequence used by the encoder
target_sequence_length = 31 - n_steps # Length of the sequence predicted by the decoder
num_steps_to_predict = 20 # Length to use when testing the model

编码器代码:

Encoder code:

# Define an input sequence.

encoder_inputs = keras.layers.Input(shape=(None, num_input_features), name='encoder_input')

# Create a list of RNN Cells, these are then concatenated into a single layer
# with the RNN layer.
encoder_cells = []
for hidden_neurons in layers:
    encoder_cells.append(keras.layers.GRUCell(hidden_neurons,
                                              kernel_regularizer=regulariser,
                                              recurrent_regularizer=regulariser,
                                              bias_regularizer=regulariser))

encoder = keras.layers.RNN(encoder_cells, return_state=True, name='encoder_layer')

encoder_outputs_and_states = encoder(encoder_inputs)

# Discard encoder outputs and only keep the states.
# The outputs are of no interest to us, the encoder's
# job is to create a state describing the input sequence.
encoder_states = encoder_outputs_and_states[1:]

解码器代码:

Decoder code:

# The decoder input will be set to zero (see random_sine function of the utils module).
# Do not worry about the input size being 1, I will explain that in the next cell.
decoder_inputs = keras.layers.Input(shape=(None, 20), name='decoder_input')

decoder_cells = []
for hidden_neurons in layers:
    decoder_cells.append(keras.layers.GRUCell(hidden_neurons,
                                              kernel_regularizer=regulariser,
                                              recurrent_regularizer=regulariser,
                                              bias_regularizer=regulariser))

decoder = keras.layers.RNN(decoder_cells, return_sequences=True, return_state=True, name='decoder_layer')

# Set the initial state of the decoder to be the ouput state of the encoder.
# This is the fundamental part of the encoder-decoder.
decoder_outputs_and_states = decoder(decoder_inputs, initial_state=encoder_states)

# Only select the output of the decoder (not the states)
decoder_outputs = decoder_outputs_and_states[0]

# Apply a dense layer with linear activation to set output to correct dimension
# and scale (tanh is default activation for GRU in Keras, our output sine function can be larger then 1)
decoder_dense = keras.layers.Dense(num_output_features,
                                   activation='linear',
                                   kernel_regularizer=regulariser,
                                   bias_regularizer=regulariser)

decoder_outputs = decoder_dense(decoder_outputs)

模型摘要:

Model Summary:

model = keras.models.Model(inputs=[encoder_inputs, decoder_inputs], 
outputs=decoder_outputs)
model.compile(optimizer=optimiser, loss=loss)
model.summary()

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
encoder_input (InputLayer)      (None, None, 20)     0                                            
__________________________________________________________________________________________________
decoder_input (InputLayer)      (None, None, 20)     0                                            
__________________________________________________________________________________________________
encoder_layer (RNN)             [(None, 35), (None,  13335       encoder_input[0][0]              
__________________________________________________________________________________________________
decoder_layer (RNN)             [(None, None, 35), ( 13335       decoder_input[0][0]              
                                                                 encoder_layer[0][1]              
                                                                 encoder_layer[0][2]              
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, None, 1)      36          decoder_layer[0][0]              
==================================================================================================
Total params: 26,706
Trainable params: 26,706
Non-trainable params: 0
__________________________________________________________________________________________________

尝试拟合模型时:

history = model.fit([train_x, decoder_inputs],train_y, epochs=epochs,
                        validation_split=0.3, verbose=1)

我收到以下错误:

When feeding symbolic tensors to a model, we expect the tensors to have a static batch size. Got tensor with shape: (None, None, 20)

我在做什么错了?

推荐答案

这是已编辑问题的答案

首先，当您调用fit时，decoder_inputs是张量，并且无法使用它来拟合模型.您引用的代码的作者使用零数组，因此您必须执行相同的操作(我在下面的虚拟示例中进行了此操作)

first of all, when you call fit, decoder_inputs is a tensor and you can't use it to fit your model. the author of the code you cited, use an array of zeros and so you have to do the same (I do it in the dummy example below)

其次，在模型摘要中查看输出层...它是3D的，因此您必须将目标作为3D阵列进行管理

secondly, look at your output layer in the model summary... it is 3D so you have to manage your target as 3D array

第三，解码器输入必须为1个特征尺寸，而不是您所报告的20个

thirdly, the decoder input must be 1 feature dimension and not 20 as you reported

设置初始参数

layers = [35, 35]
learning_rate = 0.01
decay = 0 
optimiser = keras.optimizers.Adam(lr=learning_rate, decay=decay)

num_input_features = 20
num_output_features = 1
loss = "mse"

lambda_regulariser = 0.000001
regulariser = None

batch_size = 128
steps_per_epoch = 200
epochs = 100

定义编码器

encoder_inputs = keras.layers.Input(shape=(None, num_input_features), name='encoder_input')

encoder_cells = []
for hidden_neurons in layers:
    encoder_cells.append(keras.layers.GRUCell(hidden_neurons,
                                              kernel_regularizer=regulariser,
                                              recurrent_regularizer=regulariser,
                                              bias_regularizer=regulariser))

encoder = keras.layers.RNN(encoder_cells, return_state=True, name='encoder_layer')
encoder_outputs_and_states = encoder(encoder_inputs)
encoder_states = encoder_outputs_and_states[1:] # only keep the states

定义解码器(输入1个特征维！)

define decoder (1 feature dimension input!)

decoder_inputs = keras.layers.Input(shape=(None, 1), name='decoder_input') #### <=== must be 1

decoder_cells = []
for hidden_neurons in layers:
    decoder_cells.append(keras.layers.GRUCell(hidden_neurons,
                                              kernel_regularizer=regulariser,
                                              recurrent_regularizer=regulariser,
                                              bias_regularizer=regulariser))

decoder = keras.layers.RNN(decoder_cells, return_sequences=True, return_state=True, name='decoder_layer')
decoder_outputs_and_states = decoder(decoder_inputs, initial_state=encoder_states)

decoder_outputs = decoder_outputs_and_states[0] # only keep the output sequence
decoder_dense = keras.layers.Dense(num_output_features,
                                   activation='linear',
                                   kernel_regularizer=regulariser,
                                   bias_regularizer=regulariser)

decoder_outputs = decoder_dense(decoder_outputs)

定义模型

model = keras.models.Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_outputs)
model.compile(optimizer=optimiser, loss=loss)
model.summary()

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
encoder_input (InputLayer)      (None, None, 20)     0                                            
__________________________________________________________________________________________________
decoder_input (InputLayer)      (None, None, 1)      0                                            
__________________________________________________________________________________________________
encoder_layer (RNN)             [(None, 35), (None,  13335       encoder_input[0][0]              
__________________________________________________________________________________________________
decoder_layer (RNN)             [(None, None, 35), ( 11340       decoder_input[0][0]              
                                                                 encoder_layer[0][1]              
                                                                 encoder_layer[0][2]              
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, None, 1)      36          decoder_layer[0][0]              
==================================================================================================

这是我的虚拟数据.与您的形状相同.请注意decoder_zero_inputs，它的尺寸与y相同，但它是零的数组

this is my dummy data. the same as yours in shapes. pay attention to decoder_zero_inputs it has the same dimension of your y but is an array of zeros

train_x = np.random.uniform(0,1, (439, 5, 20))
train_y = np.random.uniform(0,1, (439, 56, 1))
validation_x = np.random.uniform(0,1, (10, 5, 20))
validation_y = np.random.uniform(0,1, (10, 56, 1))
decoder_zero_inputs = np.zeros((439, 56, 1)) ### <=== attention

拟合

history = model.fit([train_x, decoder_zero_inputs],train_y, epochs=epochs,
                     validation_split=0.3, verbose=1)

Epoch 1/100
307/307 [==============================] - 2s 8ms/step - loss: 0.1038 - val_loss: 0.0845
Epoch 2/100
307/307 [==============================] - 1s 2ms/step - loss: 0.0851 - val_loss: 0.0832
Epoch 3/100
307/307 [==============================] - 1s 2ms/step - loss: 0.0842 - val_loss: 0.0828

验证预测

pred_validation = model.predict([validation_x, np.zeros((10,56,1))])

这篇关于序列到序列-用于时间序列预测的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

序列到序列-用于时间序列预测 [英] Sequence to Sequence - for time series prediction

问题描述

推荐答案

相关文章

AI人工智能最新文章

热门教程

热门工具

登录关闭

序列到序列-用于时间序列预测 [英] Sequence to Sequence - for time series prediction

问题描述

推荐答案

相关文章

AI人工智能最新文章

热门教程

热门工具

登录 关闭

登录关闭