Shaping data for LSTM, and feeding output of dense layers to LSTM


Problem Description

I'm trying to figure out the proper syntax for the model I'm trying to fit. It's a time-series prediction problem, and I want to use a few dense layers to improve the representation of the time series before I feed it to the LSTM.

Here's a dummy series that I'm working with:

import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
import numpy as np
import keras as K
import tensorflow as tf

# Build a toy series: x plus its first two lags, with a nonlinear target y.
d = pd.DataFrame(data={"x": np.linspace(0, 100, 1000)})
d['l1_x'] = d.x.shift(1)
d['l2_x'] = d.x.shift(2)
d.fillna(0, inplace=True)
d["y"] = np.sin(.1*d.x*np.sin(d.l1_x))*np.sin(d.l2_x)
plt.plot(d.x, d.y)

First, I'll fit an LSTM with no dense layers preceding it. This requires that I reshape the data:

X = d[["x", "l1_x", "l2_x"]].values.reshape(len(d), 3, 1)
y = d.y.values

Is this correct?

The tutorials make it seem like a single time series should have 1 in the first dimension, followed by the number of time steps (1000), followed by the number of covariates (3). But when I do that the model doesn't compile.
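
As a sanity check (a sketch I've added, not part of the original post), both conventions can be inspected directly; Keras LSTMs consume input shaped (batch, timesteps, features):

# Current reshape: each row of d is one sample with a 3-step lag window.
print(X.shape)  # (1000, 3, 1): 1000 samples, 3 time steps, 1 feature per step
# The tutorials' convention would treat the whole series as a single sample:
X_alt = d[["x", "l1_x", "l2_x"]].values.reshape(1, len(d), 3)
print(X_alt.shape)  # (1, 1000, 3): 1 sample, 1000 steps, 3 covariates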

Here I compile and train the model:

model = K.Sequential()
# Stateful LSTM: with batch_size=1, hidden state carries over between batches.
model.add(K.layers.LSTM(10, input_shape=(X.shape[1], X.shape[2]), batch_size=1, stateful=True))
model.add(K.layers.Dense(1))
callbacks = [K.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=5, verbose=1, mode='auto', baseline=None, restore_best_weights=True)]
model.compile(loss='mean_squared_error', optimizer='rmsprop')

model.fit(X, y, epochs=50, batch_size=1, verbose=1, shuffle=False, callbacks=callbacks)
model.reset_states()

yhat = model.predict(X, 1)
plt.clf()
plt.plot(d.x, d.y)
plt.plot(d.x, yhat)

How come I can't get the model to overfit? Is it because I've reshaped my data wrong? It doesn't overfit any more readily when I use more nodes in the LSTM.

(I'm also not clear on what it means to be "stateful". Neural networks are just nonlinear models. Which parameters are the "states" referring to, and why would one want to reset them?)
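
(For what it's worth, a sketch of the usual reading, my assumption rather than anything from the post: the "state" is not a trained parameter but the LSTM's hidden and cell activations, which stateful=True carries over from one batch to the next until you reset them.)

# Common stateful pattern (an assumption, not the poster's code): reset the
# carried-over activations between passes; the trained weights are untouched.
for epoch in range(50):
    model.fit(X, y, epochs=1, batch_size=1, verbose=0, shuffle=False)
    model.reset_states()  # begin the next pass over the series from a clean state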

How do I interpose dense layers between the input and the LSTM? Finally, I'd like to add a bunch of dense layers to do, in effect, a basis expansion on x before it gets to the LSTM. But an LSTM wants a 3D array, and a dense layer spits out a matrix. What do I do here? This doesn't work:

model = K.Sequential()
model.add(K.layers.Dense(10, activation = "relu", input_dim = 3))
model.add(K.layers.LSTM(3, input_shape=(10, X.shape[2]), batch_size = 1, stateful=True))
model.add(K.layers.Dense(1))

ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2
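
The shapes make the failure concrete; here is a quick probe I've added (same imports as above), showing that Dense on 2D input emits a 2D tensor while LSTM demands 3D:

probe = K.Sequential()
probe.add(K.layers.Dense(10, activation="relu", input_dim=3))
print(probe.output_shape)  # (None, 10): 2D, but LSTM expects (batch, timesteps, features)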

Recommended Answer

For the first question: I am doing the same thing and I didn't get any error. Please share your error.

Note: I'll give you an example using the functional API, which gives a little more freedom (personal opinion).

from keras.layers import Dense, Flatten, LSTM, Activation
from keras.layers import Dropout, RepeatVector, TimeDistributed
from keras import Input, Model

seq_length = 15
input_dims = 10
output_dims = 8
n_hidden = 10
model1_inputs = Input(shape=(seq_length, input_dims,))

# First LSTM returns the full sequence; second returns only the last step.
net1 = LSTM(n_hidden, return_sequences=True)(model1_inputs)
net1 = LSTM(n_hidden, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1

model1 = Model(inputs=model1_inputs, outputs=model1_outputs, name='model1')

## Fit the model
model1.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_11 (InputLayer)        (None, 15, 10)            0         
_________________________________________________________________
lstm_8 (LSTM)                (None, 15, 10)            840       
_________________________________________________________________
lstm_9 (LSTM)                (None, 10)                840       
_________________________________________________________________
dense_9 (Dense)              (None, 8)                 88        
_________________________________________________________________
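
The answer stops at summary(); one hedged way to complete the "## Fit the model" step, using random stand-in arrays of the right shapes:

model1.compile(loss='mean_squared_error', optimizer='rmsprop')
# Stand-in data: (n_samples, seq_length, input_dims) in, (n_samples, output_dims) out.
X_train = np.random.rand(32, seq_length, input_dims)
y_train = np.random.rand(32, output_dims)
model1.fit(X_train, y_train, epochs=2, batch_size=8)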

For your second problem, there are two methods:

  1. If you are sending data without a sequence dimension, i.e. with shape (batch, input_dims), you can use RepeatVector, which repeats the dense output n_steps times, supplying the rolling time steps the LSTM needs.


input_dims = 16
output_dims = 8
n_hidden = 20
lstm_dims = 10
model1_inputs = Input(shape=(input_dims,))

net1 = Dense(n_hidden)(model1_inputs)
net1 = Dense(n_hidden)(net1)

# RepeatVector copies the (batch, n_hidden) output 3 times -> (batch, 3, n_hidden),
# giving the LSTM the 3D input it requires.
net1 = RepeatVector(3)(net1)
net1 = LSTM(lstm_dims, return_sequences=True)(net1)
net1 = LSTM(lstm_dims, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1

model1 = Model(inputs=model1_inputs, outputs=model1_outputs, name='model1')

## Fit the model
model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_13 (InputLayer)        (None, 16)                0         
_________________________________________________________________
dense_13 (Dense)             (None, 20)                340       
_________________________________________________________________
dense_14 (Dense)             (None, 20)                420       
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 3, 20)             0         
_________________________________________________________________
lstm_14 (LSTM)               (None, 3, 10)             1240      
_________________________________________________________________
lstm_15 (LSTM)               (None, 10)                840       
_________________________________________________________________
dense_15 (Dense)             (None, 8)                 88        
=================================================================
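
To confirm the RepeatVector wiring end to end, a small smoke test (my addition; predict works even on an uncompiled model):

x_dummy = np.random.rand(4, input_dims)  # a batch of 4 flat feature vectors
print(model1.predict(x_dummy).shape)     # (4, 8): RepeatVector supplied the time axis internally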

  2. If you are sending a sequence of shape (seq_len, input_dims), you can use TimeDistributed, which applies a dense layer with the same weights to every time step of the sequence.


seq_length = 15
input_dims = 10
output_dims = 8
n_hidden = 10
model1_inputs = Input(shape=(seq_length, input_dims,))

# TimeDistributed applies the same Dense(n_hidden) to each of the 15 time steps.
net1 = TimeDistributed(Dense(n_hidden))(model1_inputs)
net1 = LSTM(output_dims, return_sequences=True)(net1)
net1 = LSTM(output_dims, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1

model1 = Model(inputs=model1_inputs, outputs=model1_outputs, name='model1')

## Fit the model
model1.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_17 (InputLayer)        (None, 15, 10)            0         
_________________________________________________________________
time_distributed_3 (TimeDist (None, 15, 10)            110       
_________________________________________________________________
lstm_18 (LSTM)               (None, 15, 8)             608       
_________________________________________________________________
lstm_19 (LSTM)               (None, 8)                 544       
_________________________________________________________________
dense_19 (Dense)             (None, 8)                 72        
=================================================================

Note: I stacked two LSTM layers. The first uses return_sequences=True, which returns an output at every time step; that sequence feeds the second layer, which returns output only at the last time step.
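
The shape contrast is easy to verify directly (a minimal sketch reusing the imports above):

x = Input(shape=(15, 10))
m_seq = Model(x, LSTM(8, return_sequences=True)(x))
m_last = Model(x, LSTM(8, return_sequences=False)(x))
print(m_seq.output_shape)   # (None, 15, 8): an output at every time step
print(m_last.output_shape)  # (None, 8): only the final step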
